Two-way Sorting in R

Sun, 10 Mar 2013 15:06:42 GMT

One popular way to motivate a two-dimensional model is to double sort your data on quantiles. This makes empirical asset pricing models exceedingly easy to understand. If I had already defined the breakpoints of the sort, I could just use a package like “plyr” to apply a function like “mean” to the data subsets. For my application, I want to create five portfolios based on one characteristic and then form five portfolios within each of those five based on a second characteristic. To generalize from a 5×5 sort, I include the parameter q, so q=4 gives quartiles and q=10 gives deciles. The x and y parameters are the column names of the characteristics you want to sort by, and each cell of the result is the mean of y for that portfolio. As you can see from the last line, I am sorting by CDS premium and liquidity premium. I use LaTeX for my publication needs, so I pass the resulting table to xtable. I also use the awesome foreach package to speed things up a tiny bit, but if you don’t use a parallel backend with foreach, then just replace “%dopar%” with “%do%”. Happy sorting!

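require(foreach); require(xtable)

# Dependent double sort: q quantile bins on x, then q bins on y inside each x bin.
# Returns a q-by-q matrix holding the mean of y in each cell.
doublesort = function(data, x, y, q) {
  breaks1 = quantile(data[, x], probs = seq(0, 1, 1/q))
  breaks1[q + 1] = Inf  # keep the maximum of x in the top bin
  tiles = foreach(i = 1:q, .combine = cbind) %dopar% {
    # y values of the observations that fall in the i-th x bin
    Q = data[data[, x] >= breaks1[i] & data[, x] < breaks1[i + 1], y]
    breaks2 = quantile(Q, probs = seq(0, 1, 1/q))
    breaks2[q + 1] = Inf  # keep the maximum of y in the top bin
    foreach(j = 1:q, .combine = rbind) %dopar% mean(Q[Q >= breaks2[j] & Q < breaks2[j + 1]])
  }
  colnames(tiles) = paste(x, 1:q)
  row.names(tiles) = paste(y, 1:q)
  tiles
}

xtable(doublesort(data, "CDS", "lp", 5), digits = 4)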
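If you want to check the logic without foreach, here is a minimal base R sketch of the same kind of dependent sort on simulated data. It is an alternative formulation, not the function from this post: the data frame `sim`, the helper `bin_within`, and the columns `x_bin` and `y_bin` are all made up for illustration, and the simulated CDS and lp values are just random normals. It also assumes the quantile breakpoints are unique, since `cut` stops with an error when there are ties among them.

```r
# Dependent q-by-q double sort in base R on simulated data.
# All names here (sim, bin_within, x_bin, y_bin) are hypothetical.
set.seed(1)
n <- 1000
q <- 5
sim <- data.frame(CDS = rnorm(n), lp = rnorm(n))

# Assign each value to one of q quantile bins (1 = lowest, q = highest).
# Intervals are right-closed, and include.lowest = TRUE keeps the minimum
# in bin 1, so no observation is dropped.
bin_within <- function(v, q) {
  breaks <- quantile(v, probs = seq(0, 1, 1 / q))
  as.integer(cut(v, breaks = breaks, include.lowest = TRUE))
}

# First sort: q bins on CDS over the whole sample.
sim$x_bin <- bin_within(sim$CDS, q)

# Second sort: q bins on lp, computed separately inside each CDS bin.
sim$y_bin <- ave(sim$lp, sim$x_bin, FUN = function(v) bin_within(v, q))

# Mean of lp per cell; rows are lp bins, columns are CDS bins.
tiles <- tapply(sim$lp, list(sim$y_bin, sim$x_bin), mean)
dimnames(tiles) <- list(paste("lp", 1:q), paste("CDS", 1:q))
print(round(tiles, 4))
```

Because the second set of breakpoints is computed inside each CDS bin, every cell ends up with roughly the same number of observations, which is the point of a dependent sort. The resulting matrix can be passed to xtable in the same way as the output of doublesort.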
