Analysing Backtests - what to look for in a Simulation Result

Not all strategies are good. Profit alone does not indicate a stable strategy, only one that ended up in profit. Other factors matter more than total profit when judging a strategy, because the real return always depends on the capital required to earn it. Let us have a look at some of these factors when analyzing a backtest.

The Drawdown: less capital means higher real return

The absolute profit of a strategy is irrelevant unless it is seen in the context of the capital required. That capital is the amount needed for trading (the margin) plus the maximum loss. If I need 10,000 USD to open and hold my positions and incur a temporary loss of 20,000 USD, then I need a total of 30,000 USD, or I could not have traded the strategy like this. If another setup with the same profit and margin requirement only incurs a 10,000 USD drawdown, it is superior, because with the same 30,000 USD I can trade 1.5 times the size of the first strategy.
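The arithmetic above can be written down as a short Python sketch. The margin and drawdown figures are the ones from the example; the function names and the 15,000 USD profit are illustrative assumptions, not numbers from a real backtest.

```python
def required_capital(margin: float, max_drawdown: float) -> float:
    """Capital needed to hold the positions through the worst loss."""
    return margin + max_drawdown


def return_on_capital(profit: float, margin: float, max_drawdown: float) -> float:
    """Profit relative to the capital the strategy actually ties up."""
    return profit / required_capital(margin, max_drawdown)


# Margin and drawdown figures from the example; the profit is hypothetical.
profit = 15_000.0
capital_a = required_capital(margin=10_000.0, max_drawdown=20_000.0)
capital_b = required_capital(margin=10_000.0, max_drawdown=10_000.0)

# How much larger the second setup can be traded on the first setup's capital.
size_factor = capital_a / capital_b

print(capital_a, capital_b, size_factor)
print(return_on_capital(profit, 10_000.0, 20_000.0))
print(return_on_capital(profit, 10_000.0, 10_000.0))
```

With identical profit, the setup with the smaller drawdown shows the higher return on capital, which is the whole argument in one number.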

Obviously, trading at that size would be reckless. A good number is the capital needed for trading plus 3 times the statistically corrected drawdown (as an estimate), or an amount of risk capital that corresponds to only a 1% risk of disaster in statistical simulations. Still, the comparison proves the point.
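One way to arrive at a simulation-based number of this kind is to resample the weekly profits of a backtest many times and read off the drawdown that only 1% of the resampled histories exceed. The following is a minimal sketch of that idea and not a description of our actual procedure: the bootstrap approach, the function names, the number of paths, the margin and the sample data are all assumptions made for illustration.

```python
import random


def max_drawdown(profits):
    """Largest peak-to-valley loss of the cumulative profit curve."""
    equity = peak = worst = 0.0
    for p in profits:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def drawdown_at_risk(weekly_profits, risk=0.01, paths=2000, seed=42):
    """Drawdown exceeded by roughly `risk` of the resampled histories."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(weekly_profits)
    drawdowns = sorted(
        max_drawdown(rng.choices(weekly_profits, k=n)) for _ in range(paths)
    )
    index = min(paths - 1, int((1.0 - risk) * paths))
    return drawdowns[index]


# Hypothetical weekly profits in USD, not taken from a real strategy.
weekly = [1200, -800, 500, 900, -1500, 700, 1100, -400, 300, 1000, -600, 800]

margin = 10_000.0  # hypothetical margin requirement
risk_capital = margin + drawdown_at_risk(weekly, risk=0.01)
print(max_drawdown(weekly), risk_capital)
```

Resampling with replacement ignores any week-to-week dependence in the results, so the estimate is only as good as that assumption.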

A good way to get this number is the Calmar ratio, which divides the profit over a time span by the largest drawdown in that span. If the required margin is ignored (which is workable when day trading futures, because margin requirements are really minuscule), this number directly compares the profit of two strategies.
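As a sketch, the ratio as described here (total profit over the largest drawdown, margin ignored) takes only a few lines of Python. Published definitions of the Calmar ratio usually work with an annualised return instead of the raw profit; the function name and the two profit series below are assumptions, chosen to reproduce the 20,000 USD and 10,000 USD drawdowns from the earlier example.

```python
def calmar_ratio(profits):
    """Total profit divided by the largest peak-to-valley drawdown."""
    equity = peak = worst = 0.0
    for p in profits:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    if worst == 0:
        return float("inf")  # no drawdown at all in the sample
    return equity / worst


# Hypothetical per-period profits in USD; both series total 15,000 USD.
strategy_a = [5000, -20000, 15000, 10000, 5000]  # 20,000 USD drawdown
strategy_b = [5000, -10000, 5000, 10000, 5000]   # 10,000 USD drawdown

print(calmar_ratio(strategy_a))
print(calmar_ratio(strategy_b))
```

Both series end with the same profit, but the second one scores twice as high because its drawdown is half as deep.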

The Sharpe Ratio: More consistent profits mean less stress

The Sharpe ratio compares the return earned in a given interval, in excess of a zero-risk investment, with the variability of that return. This is hard to apply to day trading, where the zero-risk return over the holding period is effectively 0. That leads to a much simplified formula, which we use to compare weekly profits.

We use what we call the Willken Ratio: the average profit per week divided by the standard deviation of the weekly profits. There is no “zero risk return” because it has no meaning for day trading. What this ratio measures is the variability of profits. The more stable the profits are across the compared time intervals (and we use weeks as the basis, as with everything else we do), the less stress the trader (or risk manager) will have. The higher this ratio, the less likely a large drawdown is, and the less likely a long one.
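A minimal Python sketch of this calculation follows. The text does not say whether the sample or the population standard deviation is meant, so the use of the sample version (`statistics.stdev`) is an assumption, as are the function name and the two made-up series of weekly profits.

```python
from statistics import mean, stdev


def willken_ratio(weekly_profits):
    """Average weekly profit divided by the standard deviation of weekly profits."""
    if len(weekly_profits) < 2:
        raise ValueError("need at least two weeks of results")
    spread = stdev(weekly_profits)  # sample standard deviation (assumption)
    if spread == 0:
        raise ValueError("ratio is undefined when every week is identical")
    return mean(weekly_profits) / spread


# Hypothetical weekly profits in USD; both series average 1,000 USD per week.
steady = [900, 1100, 1000, 950, 1050]
erratic = [5000, -3000, 4000, -2500, 1500]

print(willken_ratio(steady))
print(willken_ratio(erratic))
```

Both series earn the same on average, yet the steady one gets a far higher ratio, because only the variability differs.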

The Profit to Cost Ratio: Some strategies are dependent on perfect execution

The ratio of gross profit to trading costs is a significant indicator of how sensitive the strategy is to variations in cost or execution. If you make an average of 12.50 USD profit per trade (1 tick in ES) and pay 5 USD per round trip, then the slightest variation in cost or average execution will take the strategy negative. If, on the other hand, the average profit is 125 USD, then losing an average of half a tick in execution will not take the strategy into dangerously thin territory.
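The two cases can be compared with a small Python sketch. It assumes that the 12.50 USD and 125 USD figures are gross profit per trade, before the 5 USD round-trip cost; the function names are made up for the illustration.

```python
ES_TICK_VALUE = 12.50  # USD per tick in ES, as given in the text


def profit_to_cost(gross_per_trade: float, cost_per_trade: float) -> float:
    """Ratio of average gross profit per trade to the round-trip cost."""
    return gross_per_trade / cost_per_trade


def net_per_trade(gross_per_trade: float, cost_per_trade: float,
                  slippage_ticks: float = 0.0,
                  tick_value: float = ES_TICK_VALUE) -> float:
    """Average profit per trade after costs and average slippage."""
    return gross_per_trade - cost_per_trade - slippage_ticks * tick_value


# Assumption: both profit figures are gross, before the 5 USD round trip.
for gross in (12.50, 125.00):
    ratio = profit_to_cost(gross, 5.00)
    clean = net_per_trade(gross, 5.00)
    half_tick = net_per_trade(gross, 5.00, slippage_ticks=0.5)
    full_tick = net_per_trade(gross, 5.00, slippage_ticks=1.0)
    print(gross, ratio, clean, half_tick, full_tick)
```

Under this reading, half a tick of slippage leaves the small strategy with barely more than a dollar per trade, and a full tick turns it negative, while the large one hardly notices.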

Many aspiring traders fall into the trap of creating strategies that take a lot of very small trades. While this can be viable (it is typical in HFT, High Frequency Trading), it requires permanent oversight of the price structure and perfect precision in executions, which also has to be measured, because a small change in either will take a viable strategy very close to the zero-profit line.

A good strategy will still make a profit with an average slippage of 1 to 3 ticks. Even if that is not a lot of profit, this stability means the strategy is more likely to make real and significant profits later, when trading real money.
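A quick way to check a strategy against this rule is to compute how many ticks of average slippage it can absorb before the average trade reaches zero. The sketch below is only an illustration: the function names are hypothetical, the average profit is assumed to be gross of costs, and the example figures reuse the ES numbers from above.

```python
def break_even_slippage_ticks(avg_gross: float, cost: float, tick_value: float) -> float:
    """Average slippage, in ticks, at which the average trade earns nothing."""
    if tick_value <= 0:
        raise ValueError("tick_value must be positive")
    return (avg_gross - cost) / tick_value


def survives_slippage(avg_gross: float, cost: float, tick_value: float,
                      slippage_ticks: float) -> bool:
    """True if the average trade is still profitable at the given slippage."""
    return break_even_slippage_ticks(avg_gross, cost, tick_value) > slippage_ticks


# ES figures from the text: 12.50 USD per tick, 5 USD per round trip.
for avg_gross in (12.50, 125.00):
    limit = break_even_slippage_ticks(avg_gross, 5.00, 12.50)
    verdicts = [survives_slippage(avg_gross, 5.00, 12.50, t) for t in (1, 2, 3)]
    print(avg_gross, limit, verdicts)
```

A strategy whose break-even slippage is below one tick fails the test before it has even started.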

If you look at those possible dangers then you can see how a profitable strategy can still be bad.

The Parameter Stability: A winning Plateau is a lot better than a parameter spike

The last sign to look for is a stable plateau of winning parameters. Every automated strategy has parameters. How sensitive is the strategy to small changes in these parameters? Some strategies show a single set of parameters that is highly profitable, but even a small variation in one parameter leads over the cliff and into losing territory. These spikes are similar to over-optimization, but they may simply be inherent in the strategy. Either way, such a set of parameters should be avoided. It is better to take another set, even a less profitable one, if it shows more stable behavior.
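For a single numeric parameter, the difference between a spike and a plateau can be sketched in Python by scoring each parameter value with the worst profit in its neighbourhood instead of its own profit. This is one possible approach under stated assumptions, not our rating method: the function names, the neighbourhood radius and the profit figures are hypothetical, and the grid is assumed to use steps that are already meaningful.

```python
def plateau_score(profits, index, radius=1):
    """Worst profit among a parameter value and its neighbours on the grid."""
    lo = max(0, index - radius)
    hi = min(len(profits), index + radius + 1)
    return min(profits[lo:hi])


def most_stable_index(profits, radius=1):
    """Grid position whose whole neighbourhood performs best in the worst case."""
    return max(range(len(profits)),
               key=lambda i: plateau_score(profits, i, radius))


# Hypothetical backtest profit in USD for nine consecutive parameter values.
profits = [-2000, -1500, 9000, -1000, 3000, 3500, 3200, 2800, -500]

best_raw = profits.index(max(profits))  # the isolated spike
best_stable = most_stable_index(profits)  # the middle of the plateau

print(best_raw, profits[best_raw])
print(best_stable, profits[best_stable])
```

The spike wins on raw profit, but the plateau value is picked because its neighbours are profitable too. Values at the edge of the grid have fewer neighbours, which makes their score slightly optimistic.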

This is tricky to automate because falling off the cliff only counts when the change in parameters is significant, and not all parameters are created equal. Changing a stop loss (let’s say in crude oil, which trades in USD with a granularity of cents) from 10 USD to 10.01 USD is not a significant change. If such a tiny change flips the result from profit to loss, that is useful to know, because it shows extreme sensitivity. But if both stop loss values are solidly profitable, the comparison is meaningless, because the change in stop loss is too small.

On the other hand, some parameters switch a fundamental element of the strategy, for example whether it runs on a time-based or a renko bar chart. Such a parameter is a simple enumeration (1, 2, 3), and a change of one step here is a significant change, but not a meaningful one for this test: I would never reject a strategy for being profitable on renko bars but losing on a time-based chart.

As such, the selection of parameter steps is a little more complicated: too long to be covered in this post, and definitely a topic for one or more future blog posts. For now, a short “the steps must be sensible and significant” is all there is room for.

A bad strategy – warning signs

There are a couple of bad strategies that are typical.

On one side, we have the unstable strategy: large profits and large losses, or extended losing periods. This leads to a low Sharpe or Willken ratio. Even if it is profitable over time, it is a challenge for risk management, and quite often the drawdown requires a significant investment. Since the relevant number is not profit but return on investment, the large investment leads to a lower ROI. And all of this comes with the risk that after a losing run the strategy finally fails and never recovers.

The second bad strategy trades a lot and makes a profit, but the margin is super tight. The average profit per trade is often very small, which is balanced by a great many trades. The risks lie on the backtesting side (is the result really realistic?) and in changes in the market.

A good strategy – what to look for

According to our criteria, a good strategy will have a high Sharpe or Willken ratio. This means it will be profitable most of the time, and the profit expectation is significantly higher than the losses it incurs in the preferably rare losing weeks. It will also have a high enough profit per trade and a high enough profit to cost ratio that it is not overly sensitive to small variations in execution or cost structure. This means that even a small error in the simulation will not turn it into a losing strategy.

Good strategies are hard to find, but they are there

We have hundreds of validated setups in various stages of optimization. It takes hard work, but it is doable. A good filter along the criteria shown here is critical in evaluating whether a strategy should move forward, from showing a profit in an initial optimization to being tracked as worth trading on a trial account. The more of this step can be automated, the less time is spent evaluating a strategy, which is something we are working on with an automated rating system.