“How can I integrate PHP and R?”
I know I’m not the only one who’s asked this question. After all, with great content management systems like Drupal, it would be very cool to be able to drop an R module into some PHP code and instantly have a web app popping out some snazzy-looking ggplot graphics. After spending some time searching Google for an easy implementation and finding very little in the PHP + R space, I was able to piece together a method for integrating the two. It uses no JavaScript… no AJAX… just plain old PHP.
Read the rest of this entry »

After a month of on-again, off-again coding, I’ve finally completed a web site geared towards calculating the Value at Risk of the average investor’s portfolio. The site is visualvar.com. The big idea was to combine the statistical and visualization tools of R (especially ggplot2) with the web interface of Drupal. While I’m happy with the results, I think this may only be the tip of the iceberg in mashing up these technologies. As a side note, I took a bit of a shortcut and I don’t actually have R running directly on the web server, which means I had to settle for ‘overnight’ calculations rather than ‘on-demand’ ones. But I still think it is a good proof of concept for combining Drupal with R.
Read the rest of this entry »
I recently posed a question on Stack Overflow asking whether anyone knew an efficient way to save an R plot to a MySQL database as a BLOB. My plan was to use my personal desktop to run R routines and save the results to a web server, where they could then be accessed and displayed on a web page using a little PHP magic. After getting numerous responses on what a terrible idea this was, I was able to piece my own solution together. The steps are fairly simple. First, save the plot as a temp file. Second, read it back into R as a binary string. Third, insert the binary text into the database using the RODBC library. The code snippet is below.
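The three steps can also be sketched outside of R. The following is a minimal Python illustration, not the R/RODBC snippet from the post: it assumes an in-memory SQLite database in place of MySQL, a hypothetical `plots` table, and a few placeholder bytes standing in for a real rendered plot.

```python
import os
import sqlite3
import tempfile


def save_file_as_blob(conn, name, path):
    """Steps 2 and 3: read the file back as raw bytes and insert it as a BLOB."""
    with open(path, "rb") as f:
        data = f.read()
    conn.execute("INSERT INTO plots (name, image) VALUES (?, ?)", (name, data))
    conn.commit()
    return len(data)


def load_blob(conn, name):
    """Fetch the stored bytes back, as a web page script would before serving them."""
    row = conn.execute(
        "SELECT image FROM plots WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else bytes(row[0])


def demo():
    # Placeholder bytes (PNG signature plus filler), NOT a real plot.
    fake_png = b"\x89PNG\r\n\x1a\n" + bytes(range(256))

    # Assumption: SQLite stands in for the MySQL server; the table name is hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE plots (name TEXT PRIMARY KEY, image BLOB)")

    # Step 1: save the "plot" as a temp file.
    fd, path = tempfile.mkstemp(suffix=".png")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(fake_png)
        save_file_as_blob(conn, "demo_plot", path)
    finally:
        os.remove(path)  # clean up the temp file either way

    return fake_png, load_blob(conn, "demo_plot")
```

Passing the bytes as a query parameter, rather than pasting them into the SQL string, avoids having to escape the binary data by hand.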
Read the rest of this entry »
Just for fun, let’s use a little web voodoo to combine Yahoo Finance data with Google Charts. Why? Because we can. All we need is a little PHP, a web server, and a URL. The goal is to download Yahoo Finance daily price data for two stocks, convert it to an array of daily returns, and plot the results using Google Charts simply by calling a URL. For instance, to produce this:

We just need to type this into our browser: http://stotastic.com/php/scatterPlot.php?tick1=BP&tick2=SPY&mm=01&dd=01&yyyy=2008
Here tick1 and tick2 are the stock tickers, and mm, dd, and yyyy give the starting date from which data is grabbed. So let’s dig in and find out how PHP is used to make all this web magic work.
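Two pieces of that pipeline are easy to sketch in isolation: reading the parameters out of the URL and turning a price series into daily returns. The snippet below is a rough Python illustration rather than the PHP behind scatterPlot.php; the function names are hypothetical, and it deliberately leaves out the Yahoo Finance download and the Google Charts call.

```python
from urllib.parse import urlparse, parse_qs


def parse_request(url):
    """Pull the two tickers and the start date out of the query string."""
    query = parse_qs(urlparse(url).query)
    return {
        "tick1": query["tick1"][0],
        "tick2": query["tick2"][0],
        # (year, month, day) as integers
        "start": (
            int(query["yyyy"][0]),
            int(query["mm"][0]),
            int(query["dd"][0]),
        ),
    }


def daily_returns(prices):
    """Simple returns: each price divided by the previous one, minus 1.

    `prices` must be in chronological order; the result has one fewer element.
    """
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]
```

For the scatter plot, the two return series would then be paired up date by date, so both stocks need prices on the same trading days.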
Read the rest of this entry »
I recently needed to generate multivariate normal values in a web browser. Since I was unable to find a simple implementation, I wrote the following JavaScript code. This code is not the fastest, and it also makes use of the Math.random() method, which I can’t vouch for, but it seems to do a decent job. The first two functions are needed to generate normal values via the Box-Muller method. This implementation uses the basic trig functions of the JavaScript Math object. The third function performs a Cholesky decomposition, which is needed to decompose a correlation matrix. Finally, the last function combines these functions to generate iid normals and then transform them into correlated normals.
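For readers who want to see the shape of the approach without the browser, here is a compact Python sketch of the same three ingredients (Box-Muller, Cholesky, and the final transform). It is not the JavaScript from the post; the function names are my own, and it assumes the input is a valid (symmetric, positive definite) correlation matrix.

```python
import math
import random


def box_muller(rng):
    """Return two independent standard normals from two uniforms."""
    u1 = 1.0 - rng.random()  # shift to (0, 1] so log(u1) is always defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def cholesky(matrix):
    """Lower-triangular L with L * L^T = matrix (no pivoting, no error checks)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def correlated_normals(lower, rng):
    """Draw iid normals z and return L * z, which has the target correlation."""
    n = len(lower)
    z = []
    while len(z) < n:
        z.extend(box_muller(rng))  # Box-Muller yields values in pairs
    z = z[:n]
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
```

Usage would look like `lower = cholesky([[1.0, 0.5], [0.5, 1.0]])` followed by repeated calls to `correlated_normals(lower, random.Random(42))`; the decomposition is done once and reused for every draw.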
Read the rest of this entry »
With the stock market freaking out and all, I figured I should take a look at how volatility was being priced in the option market. The CBOE generously provides snapshots of market data for anyone interested to download. By using this data, we can calculate the market’s ‘implied volatility’, or level of ‘freaking out’. For those not familiar with the concept of implied volatility, essentially we can take the prices of options in the market and back out the volatility implied by those prices using the Black-Scholes formula. It’s been shown over and over again that the assumptions of the Black-Scholes model don’t hold up to empirical data, but it’s an easy calculation to perform, and so implied volatility is a widely used metric. Anyway, below is my Black-Scholes option pricing function and the function used to back out implied volatility (written in R of course). Since implied volatility can only be found numerically, I used the Bisection Method, which is easy to implement, but there are faster methods out there.
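As a rough illustration of the idea (in Python, and not the R functions from the post), the sketch below prices a European call with the Black-Scholes formula and then backs out the volatility by bisection. It assumes no dividends and handles calls only; the search bounds and tolerance are arbitrary choices of mine.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, vol, t):
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    sqrt_t = math.sqrt(t)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * t) * norm_cdf(d2)


def implied_vol(price, spot, strike, rate, t,
                lo=1e-6, hi=5.0, tol=1e-8, max_iter=200):
    """Bisection on volatility; works because the call price increases with vol."""
    if not bs_call(spot, strike, rate, lo, t) <= price <= bs_call(spot, strike, rate, hi, t):
        raise ValueError("price is outside the range reachable with vol in [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, rate, mid, t) > price:
            hi = mid  # model price too high, so the true vol is lower
        else:
            lo = mid  # model price too low (or equal), so the true vol is higher
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Each iteration halves the bracket, so reaching a tolerance of 1e-8 from a starting width of 5 takes roughly 30 pricing calls. Newton's method using vega typically needs far fewer, which is the kind of faster method alluded to above.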
Read the rest of this entry »
If you go to Google and search for “Black Scholes” you are bound to come across a long list of articles that derive the Black-Scholes PDE and call price formula. Before I learned about the more technical issues of stochastic calculus and martingale measures, I would read these derivations and assume the authors were the experts. There always seemed to be some hand waving going on, but I figured it was just a complex subject and I didn’t fully understand all the details. Now, some years later and after having formally learned the material, I find myself in disbelief at how sloppy and often wrong some of these derivations are. As an example, take a look at Wikipedia’s article on the Black-Scholes model. Now forget everything you just read. This article is my attempt to straighten things out. I will try to be more rigorous than most, but I may skip over some of the regularity conditions which concern the pure math types.
Read the rest of this entry »
I find options fascinating because they deal with the abstract ideas of volatility and correlation, both of which are unobservable and can often seem like wild animal spirits (take the current stock market as an example). Understanding these subtle concepts is never easy, but it is essential in pricing some of the more exotic options which involve multiple underlying stocks. To set the scene, let’s pretend that your neighbor wants to make a bet with you where he will pay you $100 if Google (GOOG) and Apple (AAPL) are above 500 and 240 respectively after 1 year, but you have to pay him $25 today. How would we determine if $25 is a good deal or not?
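One possible way to attack the question is by simulation. The Python sketch below is only an illustration under assumptions of my own, and is not necessarily how the full post proceeds: it assumes both stocks follow correlated lognormal (Black-Scholes style) dynamics with no dividends, prices the bet as a discounted risk-neutral probability, and takes the spot prices, volatilities, correlation, and interest rate as inputs that the reader must supply.

```python
import math
import random


def price_joint_digital(spot1, spot2, strike1, strike2, vol1, vol2,
                        corr, rate, t, payout, n_paths, seed=0):
    """Monte Carlo value of a bet paying `payout` if BOTH stocks end above their strikes."""
    rng = random.Random(seed)
    sqrt_t = math.sqrt(t)
    hits = 0
    for _ in range(n_paths):
        # Two standard normals with correlation `corr` (2x2 Cholesky by hand).
        z1 = rng.gauss(0.0, 1.0)
        z2 = corr * z1 + math.sqrt(1.0 - corr * corr) * rng.gauss(0.0, 1.0)
        # Terminal prices under risk-neutral geometric Brownian motion.
        s1 = spot1 * math.exp((rate - 0.5 * vol1 ** 2) * t + vol1 * sqrt_t * z1)
        s2 = spot2 * math.exp((rate - 0.5 * vol2 ** 2) * t + vol2 * sqrt_t * z2)
        if s1 > strike1 and s2 > strike2:
            hits += 1
    # Discounted payout times the estimated probability of both events.
    return payout * math.exp(-rate * t) * hits / n_paths
```

Comparing the returned value with the $25 asking price gives a first answer, and rerunning with different `corr` values shows how much the result hinges on the correlation assumption.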
Read the rest of this entry »
The Case-Shiller Home Price Indices measure residential home values for 20 cities in the US, with some indices going all the way back to the 80s. With housing prices all the rage these days, we should perform a quick-and-dirty analysis using R to see what we can glean from this rich dataset. First things first, the data needs to be downloaded from S&P’s website, converted into a CSV format, and then imported into R.
Read the rest of this entry »
A common model used in the financial industry for modelling the short rate (think overnight rate, but actually over an infinitesimally short amount of time) is the Vasicek model. Although it is unlikely to perfectly fit the yield curve, it has some nice properties that make it a good model to work with. The dynamics of the Vasicek model are described below.
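In its standard textbook form, the Vasicek short rate follows a mean-reverting stochastic differential equation. The symbols here are my own choice and may differ from the notation used in the full post: $a$ is the speed of mean reversion, $b$ the long-run mean level, $\sigma$ the volatility, and $W_t$ a standard Brownian motion.

```latex
dr_t = a\,(b - r_t)\,dt + \sigma\,dW_t
```

The drift term pulls $r_t$ back toward $b$ whenever the rate wanders away from it, while the constant-volatility noise term means the rate is normally distributed and can therefore become negative.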

Read the rest of this entry »