Fun with color maps: visualizing financial time series
October 2, 2009
Here’s an interesting visualization of daily stock returns for 50 components of the S&P 500. I used the same kind of heat map plot as in my previous post.
Again, this plot conveys a lot about the multiple series, but you have to look for a minute to see why. Once you start to look, you see a surprising amount of information come out – to me, more information than from plotting all these series in the usual chart formats.
The plot shows the percent change in price for 50 random components of the S&P 500 (on the y-axis, one stock per row) for the 250 periods (time is the x-axis, left to right) prior to October 1, 2009. The 250 periods correspond to just short of one year of trading, so here we see summarized a year of trading in 50 stocks.
Note: The returns are capped at the 20% quantile on the low end and the 80% quantile on the high end. So while the legend shows -3% to 3%, any return beyond those limits is clamped to the nearest limit. This ensures that really high and low values don’t throw off the colors. A better approach would be a custom coloring scheme designed for this kind of data.
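The capping described in the note can be sketched in a few lines of base R. This is only an illustration on simulated returns, not the Heatplus internals: the helper `clamp_to_quantiles` and the data are made up for the example. The last line shows one possible diverging palette (red for losses, white near zero, blue for gains) as a hypothetical starting point for a custom coloring scheme.

```r
# Simulated daily returns (made-up data, for illustration only)
set.seed(42)
returns <- matrix(rnorm(50 * 250, mean = 0, sd = 0.03), nrow = 50)

# Hypothetical helper: clamp values to the [p, 1 - p] quantile range
clamp_to_quantiles <- function(x, p = 0.2) {
  lo <- quantile(x, p, na.rm = TRUE)
  hi <- quantile(x, 1 - p, na.rm = TRUE)
  pmin(pmax(x, lo), hi)
}

capped <- clamp_to_quantiles(returns, p = 0.2)

# The range of the capped values equals the 20% and 80% quantiles of the raw data
range(capped)

# One possible diverging palette: red (losses), white (flat), blue (gains)
diverging <- colorRampPalette(c("red", "white", "blue"))(101)
```

Because the clamped matrix keeps its dimensions, it could be passed to any heat map function in place of the raw returns.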
At first, yes, this looks like colorful noise.
But I see many patterns (admittedly, I have good eyesight):
- October 2008 was a bad period for these stocks, as shown by all the red to the left.
- In this bad period, we also see plenty of big price movements, as shown by rows having alternating red and blue. That represents sequences of plus or minus almost 3% on alternating days.
- Also during the bad period, some stocks fared better than others. We see several rows with red on the left, but becoming green much faster than the others.
- October 2009 is much calmer, as shown by all the green, representing small changes, on the right.
- Yet there are some stocks that see much more volatility than others. Just pick out rows of alternating reds and blues from the fields of green. These “colorful” rows are more volatile stocks.
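The "colorful rows are more volatile" observation can also be checked numerically. A minimal sketch, assuming the standard deviation of daily returns is an acceptable volatility measure; the tickers and returns below are simulated, not taken from the data set used in this post.

```r
# Simulated returns for five hypothetical tickers (made-up data)
set.seed(1)
sds <- c(AAA = 0.005, BBB = 0.03, CCC = 0.01, DDD = 0.02, EEE = 0.015)
returns <- t(sapply(sds, function(s) rnorm(250, mean = 0, sd = s)))

# Per-row standard deviation of daily returns as a simple volatility measure
volatility <- apply(returns, 1, sd)

# Rows sorted from most to least volatile
ranked <- sort(volatility, decreasing = TRUE)
ranked
```

The rows at the top of `ranked` are the ones that would stand out as alternating reds and blues in the heat map.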
I’m sure there are plenty of other meaningful patterns in here.
So overall, an interesting technique for visualizing many time series together.
The data comes from here. The R code to process it is below, and you’ll need to install the Heatplus library per my previous post (not from CRAN).
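library(Heatplus)  # provides heatmap_2; see the previous post for installation
# Read the raw price history; the file has no header row
stocks.raw=read.csv("sp500hst.txt", header=FALSE)
names(stocks.raw)=c("date", "symbol", "open", "high", "low", "close", "volume")
# Split closing prices by symbol and keep only symbols with all 251 observations
stock.close=tapply(stocks.raw$close, stocks.raw$symbol, function(x){x})
stock.close.cleaned=stock.close[sapply(stock.close, length)==251]
# Pick 50 symbols at random (seeded so the sample is reproducible)
set.seed(1234567)
stock.names=sample(names(stock.close.cleaned), 50)
stock.1=stock.close.cleaned[stock.names]
# Period-over-period returns: one row per stock, one column per period
stock.returns=t(sapply(stock.1, function(d) {(d[2:251]-d[1:250])/d[1:250]}, simplify=TRUE))
# Plot without clustering or scaling; trim=.8 caps the extreme values
heatmap_2(stock.returns, col=rainbow(ncol(stock.returns), end=4/6), Rowv=NA, Colv=NA,
do.dendro=c(FALSE,FALSE), scale="none", legend=2,
main="Stock returns", trim=.8)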
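To make the return calculation in the code above concrete, here is a small worked check on toy prices. The numbers are made up, and the sketch assumes prices are ordered oldest first, which the calculation above also relies on.

```r
# Toy closing prices (made-up numbers), oldest first
close <- c(100, 102, 99.96, 104.958)

# Simple return: (today's close - previous close) / previous close
n <- length(close)
simple.returns <- (close[2:n] - close[1:(n - 1)]) / close[1:(n - 1)]

# The same calculation written with diff()
via.diff <- diff(close) / head(close, -1)

round(simple.returns, 4)  # 0.02, -0.02, 0.05
```

A series of 251 closing prices therefore yields 250 returns, which is why the code keeps only symbols with exactly 251 observations.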
