In a previous post, I examined a simple stock trading strategy: Find the high point over the last 200 days, and buy the stock if it's been less than 100 days since that high. Otherwise, have no position.
What if we use parameters other than a 200-day high and a 100-day hold? How will that affect our strategy? First of all, we have to reload the data for the S&P 500 index and redefine the functions used to implement our strategy.
Because my processing power is limited, I'm only going to look at every 5th value in this parameter space. The first order of business is to calculate a matrix containing each n-Day high series, where the first column is the number of days since the 5-day high, the second column is the number of days since the 10-day high, etc. This matrix has 100 columns:
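As a rough illustration of this step, here is a Python sketch (not the R code from the original post). The names `days_since_high` and `high_matrix`, the rule that ties go to the most recent high, and the toy prices are all assumptions made for the example.

```python
def days_since_high(prices, n):
    """Days elapsed since the highest price in the trailing n-day window.

    Entries are None for the first n - 1 days, where the window is incomplete.
    Ties are resolved in favor of the most recent high (an assumption).
    """
    result = []
    for t in range(len(prices)):
        if t < n - 1:
            result.append(None)
            continue
        window = prices[t - n + 1 : t + 1]
        # Index of the latest maximum in the window (ties -> largest index).
        latest_max = max(range(n), key=lambda i: (window[i], i))
        result.append(n - 1 - latest_max)
    return result


def high_matrix(prices, n_highs):
    """One column per n-day high, stored as a dict keyed by n."""
    return {n: days_since_high(prices, n) for n in n_highs}


# Every 5th value from 5 to 500 gives the 100 columns described above.
N_HIGHS = list(range(5, 505, 5))

# Toy prices, not real S&P 500 data.
toy_prices = [1, 3, 2, 5, 4, 4, 6]
toy_matrix = high_matrix(toy_prices, [2, 3])
```

With real data, `high_matrix(prices, N_HIGHS)` would give the full 100-column object; the toy call uses 2-day and 3-day highs only so the result is easy to check by hand.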
Next, I make a list with 100 elements. Each element represents a holding period, which I apply to a copy of the "n-Day high matrix" from the previous step. For example, the 1st element in the list is a matrix representing a 5-day holding period. The first column in this matrix represents buying at the 5-day high and holding for 5 days. This is equivalent to buy-and-hold, because the 5-day high is always less than 5 days old. The second column represents buying at the 10-day high and holding for 5 days, the third column represents buying at the 15-day high, and so on. I repeat this process for each element in the 100-matrix list, which gives us an object representing every combination of parameters on our grid.
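The holding-period step might look something like the following Python sketch (again, not the post's original R code). It assumes a position of 1 while the high is less than nHold days old and 0 otherwise; `positions`, `position_list`, and the toy columns are hypothetical names and data.

```python
def positions(days_since, n_hold):
    """1 (long) while the high is less than n_hold days old, else 0 (flat)."""
    return [1 if d is not None and d < n_hold else 0 for d in days_since]


def position_list(high_columns, n_holds):
    """One 'matrix' per holding period: {n_hold: {n_high: position column}}."""
    return {
        n_hold: {
            n_high: positions(col, n_hold)
            for n_high, col in high_columns.items()
        }
        for n_hold in n_holds
    }


# Hypothetical days-since-high columns for a 3-day and a 5-day high.
toy_columns = {
    3: [None, None, 1, 0, 1, 2, 0, 1],
    5: [None, None, None, None, 1, 2, 3, 4],
}
toy_positions = position_list(toy_columns, n_holds=[2, 5])
```

In the toy output, `toy_positions[5][5]` is long on every day once the 5-day window is full, which is the buy-and-hold case, while `toy_positions[2][5]` is long on only one day.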
It is then relatively easy to calculate the returns associated with each combination of parameters, by using the "sweep" function to multiply each column of each matrix by the daily returns for our stock.
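A Python stand-in for that column-wise multiplication could look like this. The one-day lag (today's position earns tomorrow's return) is an assumption of this sketch, added to avoid look-ahead; it is not confirmed by the post, and `lag=0` gives a plain same-day product. The function names are hypothetical.

```python
def strategy_returns(position, daily_returns, lag=1):
    """Multiply a position column by daily returns, element by element.

    lag=1 applies each day's position to the NEXT day's return, so the signal
    never trades on information from the same day. This lag is an assumption
    of the sketch; lag=0 gives a plain same-day product.
    """
    if len(position) != len(daily_returns):
        raise ValueError("position and daily_returns must have equal length")
    shifted = [0] * lag + position[: len(position) - lag]
    return [p * r for p, r in zip(shifted, daily_returns)]


def sweep_returns(position_matrix, daily_returns, lag=1):
    """Apply the same return series to every column of one position matrix."""
    return {
        n_high: strategy_returns(col, daily_returns, lag)
        for n_high, col in position_matrix.items()
    }


# Toy inputs, not real index returns.
toy_position = [1, 1, 0, 1]
toy_daily = [0.01, -0.02, 0.03, 0.04]
lagged = strategy_returns(toy_position, toy_daily)          # [0.0, -0.02, 0.03, 0.0]
same_day = strategy_returns(toy_position, toy_daily, lag=0)  # [0.01, -0.02, 0.0, 0.04]
```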
Now we have a list of matrices of returns. Each column of a matrix represents the returns of our strategy, using a different set of parameters. This allows us to calculate cumulative returns for each set of parameters, and make a nifty graph that shows the relationship between nHigh, nHold, and returns.
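To make the cumulative-return step concrete, here is a small Python sketch. It assumes simple daily returns that compound as prod(1 + r) - 1; the post does not say which convention its code used, and the names and toy numbers below are hypothetical.

```python
def cumulative_return(returns):
    """Compound simple daily returns: prod(1 + r) - 1."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def return_grid(return_list):
    """Flatten {n_hold: {n_high: daily returns}} into {(n_high, n_hold): total}."""
    return {
        (n_high, n_hold): cumulative_return(col)
        for n_hold, matrix in return_list.items()
        for n_high, col in matrix.items()
    }


def best_parameters(grid):
    """Parameter pair (n_high, n_hold) with the highest cumulative return."""
    return max(grid, key=grid.get)


# Toy two-day return columns for a 2 x 2 parameter grid.
toy_returns = {
    5: {5: [0.10, -0.05], 10: [0.00, -0.05]},
    10: {5: [0.10, -0.05], 10: [0.10, 0.00]},
}
toy_grid = return_grid(toy_returns)
toy_best = best_parameters(toy_grid)  # (10, 10) on this toy grid
```

The flattened grid is the kind of object a heat map needs: one (nHigh, nHold) point per entry, with the cumulative return deciding the color.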
This graph uses a custom color ramp function, which was created by Andrie on StackOverflow. The color of each point in the graph corresponds to how high the returns are at that point. The X axis is the number of days used for nHigh, and the Y axis is the number of days used for nHold. As you can see, 100 days seems to be a solid holding period across many values of nHigh, but by using a different value of nHigh, we could increase returns substantially.
Of course, just because these values worked in the past doesn't mean they will work in the future. Still, it's good to see that our arbitrary parameters (which performed well in the last post) fall inside a wide range of parameters that yield a positive return for our strategy. This brings up an interesting question: how DO we select parameters for our strategy? How can we tell how well our parameter selection strategy would have performed in the past, given that we've optimized our selection based on our knowledge of the past?
For homework, think about how overfitting and cross-validation apply to this problem...
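One possible starting point, sketched in Python under stated assumptions: choose the best parameters on an early slice of the history, then score that same choice on the later slice it never saw. `evaluate(params, start, end)` is a hypothetical callable returning the strategy's cumulative return over that slice, and the random split points are a simplification, not a formal cross-validation or bootstrap procedure.

```python
import random


def select_then_test(evaluate, param_grid, n_days, split):
    """Pick the best parameters on days [0, split), then score them on [split, n_days).

    `evaluate(params, start, end)` is a hypothetical callable returning the
    strategy's cumulative return for `params` over that slice of history.
    """
    best = max(param_grid, key=lambda p: evaluate(p, 0, split))
    return best, evaluate(best, 0, split), evaluate(best, split, n_days)


def random_split_study(evaluate, param_grid, n_days, n_trials=100, seed=0):
    """Repeat the selection with randomly placed split points."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        # Keep at least 30% of the history on each side of the split (arbitrary).
        split = rng.randint(int(0.3 * n_days), int(0.7 * n_days))
        results.append(select_then_test(evaluate, param_grid, n_days, split))
    return results


def toy_evaluate(params, start, end):
    """Toy scorer: looks great in-sample, reverses out-of-sample."""
    days = end - start
    return params * days if start == 0 else -params * days


toy_result = select_then_test(toy_evaluate, [1, 2, 3], n_days=100, split=60)
toy_study = random_split_study(toy_evaluate, [1, 2, 3], n_days=100, n_trials=10)
```

The toy scorer is rigged so that the parameter with the best in-sample score has the worst out-of-sample score, which is the overfitting pattern to look for. With real data, a large gap between the second and third elements of each result would be the warning sign.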
BONUS CODE: This creates some nifty 3D charts, using the rgl library.

Addressing your last questions, which are to me a major point in designing automated strategies, you should make the most of your historical data.
That is, start by splitting its history into a backtesting period, a paper trading period, and finally a live testing period (the latter not being part of your historical dataset, of course).
This can give you a first insight into your optimization. But you may easily argue that you rely too much on the backtesting-optimization period and the paper trading period. So what should you do?
You may bootstrap your historical data, randomly splitting it into backtesting and paper trading periods of arbitrarily varying duration, to further assess the robustness of the parameter optimization.
Now, with that in hand, you may be pretty happy about your level of insight. I always love to benchmark this against the traditional buy-and-hold (already done), but most importantly against a random choice of your parameters, that is: "is your optimized strategy better than luck?"
Thx for sharing.