High Speed C# Trading - Classes and the Garbage Collector

Automated trading, even outside high frequency trading, always benefits from fast program execution. In C# – as in any other high level language with garbage collection – the garbage collector can slow a program down. But there are alternatives that can make a program faster.

The garbage collector – good in principle, slow when it matters

Garbage collection has many advantages. Memory allocation errors – using an area of memory after it has been freed – are notoriously hard to debug, and the error symptoms occur at seemingly random intervals. Strategies for freeing memory manually can be complex and result in suboptimal algorithms, with programmers spending more time on "not losing memory" than on "using the fastest approach". You cannot always allocate (and free) an object within a single wrapping method, or rely on a clear source/sink flow for messages that would make the allocation and freeing strategy easy.

It is not only about the time in the GC – beware locality and CPU caches

.NET uses a generational garbage collector that splits memory into three areas (one per generation). An object is always allocated in generation 0 (except for very large objects, which get special treatment). Objects that survive a collection of their area are promoted into the next one.
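A minimal sketch of this promotion behavior, using the standard GC.GetGeneration and GC.Collect APIs – on a typical workstation GC configuration the object starts in generation 0 and is promoted after it survives a collection:

```csharp
using System;

class GcGenerationDemo
{
    static void Main()
    {
        var obj = new object();

        // Freshly allocated (small) objects start in generation 0.
        Console.WriteLine(GC.GetGeneration(obj));

        // An object that survives a collection is promoted to the next generation.
        GC.Collect();
        Console.WriteLine(GC.GetGeneration(obj));
    }
}
```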

Moving objects takes time, and the mark-and-sweep garbage collector also takes time to mark and then collect the objects. But the real impact is not – or rather, not only – the time the garbage collector spends. A major factor for processing performance is locality and cache coherence. This is 2014, and processors are fast – a lot faster than memory. As a result processors have added caches, and in the case of Intel processors (let's just ignore the slow and outdated AMD processors available at the moment) there are up to three levels of caching.

Cache is a lot faster than main memory – even with the new multi-channel DDR4 access paths in the newest (not yet really available) Xeon processors. As such, making good use of the cache is important. And the cache is not very big, which means data should sit close together and, if possible, be limited to what fits into the first- or second-level cache.
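To see why locality matters, compare iterating a contiguous array of structs with iterating an array of individually heap-allocated objects. This is an illustrative micro-benchmark (the Tick/TickObj names are made up for the example); absolute numbers depend heavily on hardware and on what the garbage collector has done to the heap:

```csharp
using System;
using System.Diagnostics;

struct Tick { public double Price; public int Volume; }
class TickObj { public double Price; public int Volume; }

class LocalityDemo
{
    const int N = 1_000_000;

    static void Main()
    {
        // Array of structs: one contiguous block, cache-friendly sequential access.
        var structs = new Tick[N];

        // Array of references: each element is a separate heap allocation,
        // potentially scattered in memory after collections and promotions.
        var objects = new TickObj[N];
        for (int i = 0; i < N; i++) objects[i] = new TickObj();

        var sw = Stopwatch.StartNew();
        double sum1 = 0;
        for (int i = 0; i < N; i++) sum1 += structs[i].Price;
        sw.Stop();
        Console.WriteLine($"structs: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        double sum2 = 0;
        for (int i = 0; i < N; i++) sum2 += objects[i].Price;
        sw.Stop();
        Console.WriteLine($"objects: {sw.ElapsedMilliseconds} ms");
    }
}
```

The struct loop typically wins because each cache line fetched from memory carries several ticks' worth of useful data, while the object loop chases a pointer per element.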

Sadly, the garbage collector moves objects around and – at least in .NET – cannot take access patterns into account, not to the degree a programmer can. The garbage collector will also, on the processor cores running it, touch a lot of memory, which means a lot of cache invalidations. There goes the performance.

Avoiding the garbage collector by pre-allocating and reusing objects

There are strategies to avoid the garbage collector. One of them is to simply never give it anything to collect, because objects are never released. Instead of, for example, getting a new object for every market update or tick, we allocate a set of 1000 objects at program start and reuse them. No garbage means no garbage collection.
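A minimal sketch of such a pool – the type names (MarketTick, TickPool) are illustrative, not part of any specific framework:

```csharp
using System;
using System.Collections.Concurrent;

// A reusable market update object; Reset() clears it before it goes back to the pool.
class MarketTick
{
    public double Price;
    public int Volume;
    public void Reset() { Price = 0; Volume = 0; }
}

// Pre-allocates all tick objects up front so steady-state operation
// produces no garbage at all.
class TickPool
{
    private readonly ConcurrentBag<MarketTick> _pool = new ConcurrentBag<MarketTick>();

    public TickPool(int size)
    {
        // Allocate everything at program start.
        for (int i = 0; i < size; i++) _pool.Add(new MarketTick());
    }

    public MarketTick Rent() =>
        _pool.TryTake(out var tick) ? tick : new MarketTick(); // fallback if exhausted

    public void Return(MarketTick tick)
    {
        tick.Reset();
        _pool.Add(tick);
    }
}
```

Usage is Rent() on each update, fill in the fields, and Return() once the update has been processed; as long as the pool is sized generously enough, the fallback allocation never fires.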

This is not as ridiculous as it may sound – this strategy has long been used in game development, and in small-system development where the load of the garbage collector was simply too much to bear at all. Embedded Java had, at least for a long time, no garbage collector at all (we are not sure about the current status).

Avoiding the garbage collector by using structures instead of classes

For .NET, though, there is another alternative approach: move things out of classes into structures. Structures are not garbage collected themselves – they are only garbage collected when they are boxed (stored, for example, in a variable of type object), or when they are part of a garbage-collected object (because they are a field of it). And only in the first of these two cases do they add overhead.
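A sketch of a tick as a structure (the Tick type is illustrative). Storing the structs in an array means a single allocation for the whole batch, with no per-tick object for the garbage collector to track – unless one of them is boxed:

```csharp
// A tick as a struct: its fields live inline wherever the struct is stored.
struct Tick
{
    public long Timestamp;
    public double Price;
    public int Volume;
}

class StructDemo
{
    static void Main()
    {
        // One allocation for the whole array; the million ticks live inline
        // in it and are never individually tracked by the garbage collector.
        var ticks = new Tick[1_000_000];
        ticks[0] = new Tick { Timestamp = 1, Price = 100.25, Volume = 10 };

        // Beware: assigning a struct to a variable of type object boxes it,
        // i.e. allocates a garbage-collected copy on the heap.
        object boxed = ticks[0]; // this defeats the purpose

        System.Console.WriteLine(((Tick)boxed).Price);
    }
}
```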

A structure is normally copied by value – an assignment copies the value into the new storage space. But .NET does have pointers when using unsafe code. And unsafe code is not automatically evil – there are places where it makes sense and brings a significant performance benefit. Deep in the core of a trading framework, well tested, there is little harm unsafe code can do.
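A sketch of passing such a structure by pointer in unsafe code (compiled with /unsafe; the Tick type is again illustrative). The fixed statement pins the array so the garbage collector cannot move it while the pointer is live; a safe alternative with a similar effect is a ref parameter:

```csharp
struct Tick
{
    public double Price;
    public int Volume;
}

class UnsafeDemo
{
    // Mutates the caller's tick in place; no by-value copy is made.
    static unsafe void ApplyUpdate(Tick* tick, double price, int volume)
    {
        tick->Price = price;
        tick->Volume = volume;
    }

    static unsafe void Main()
    {
        var ticks = new Tick[16];

        // Pin the array: the GC must not relocate it while we hold a raw pointer.
        fixed (Tick* p = ticks)
        {
            ApplyUpdate(p + 3, 101.5, 200);
        }

        System.Console.WriteLine(ticks[3].Price);
    }
}
```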

High Performance Programming for Trading is all about the market updates

When trading there is one main source of new objects: market updates. Not orders and executions – those are rare. But all the ticks and all the order book updates can add up to a lot of new objects. If one allocates a new class instance for each of them and puts a memory profiler to use, one will likely find 90% of all objects generated to be of this kind – or of an equivalent kind used for storing indicator values, should a programmer have decided to use objects for that.

This means that by replacing the indicator values and the ticks with structures, one can take a lot of load off the garbage collector – enough to make a difference. And with smart programming, possibly also improve the cache hit ratios.

For a high .NET performance – beware of your frameworks

An obvious source of new objects may be frameworks. Depending on how low-level one is programming, there may be little a programmer can do. When using a high-level framework such as our always buggy NinjaTrader (http://www.trade-robots.com/tags/ninjatrader), there really is nothing a programmer can do on the framework level.

But even when programming your own infrastructure – as we do with the Reflexo Trading Framework – trading means connecting. And connections are often made using libraries, either for open standards (like QuickFixN to talk the FIX protocol) or via proprietary software development kits, such as the Rithmic SDK that we use for our trading with Rithmic (http://www.rithmic.com). If your connectivity library generates a class instance per update, you can likely avoid generating one yourself (by using a structure), but you will not be able to avoid the instances generated by the connectivity library. This is one reason we deal with, for example, NxCore – our Nanex data feed – in C++: the (automatically generated) C# wrappers simply create too much overhead.

And if you use something like NinjaTrader or any of the other high-level frameworks, there is not a lot you can do. Sad as it is. Because by the time the updates hit your strategy, it is simply too late. You can only look at the extremely high amount of garbage being created in a memory profiler and then realize there is nothing you can do.

Our own Reflexo Trading Framework – Guilty as charged

An article about high performance programming in C# – and we maintain our own trading infrastructure, the Reflexo Trading Framework. That immediately raises a question: how do we fare? The answer is: not too well. We are guilty as charged at the moment, generating an object for every update. This is one of the parts we have identified for rework once the new trading server is finished. Optimization? We lose time in the garbage collector. Backtesting? Even more. Trading? Yes – but at the moment there are more important things to fix. Once those are done, we will take another look at our garbage collection situation.