
The Low Latency Canary

The one constant in the world of financial markets IT is that nothing stays the same for very long. The world of market data feeds and the applications that process them is no exception.

As data rates increase, and as processing systems (or any component of them) are upgraded and modified, performance can be hit. The trick is to constantly monitor the environment to pre-empt potential problems. But what should one monitor?

According to the boffins at 29West, the answer is straightforward. To quote from an article in their most recent newsletter: “Focus more on measuring application latency and less on measuring data rates”. In fact, 29West reckons that latency is the “canary in the coalmine” when it comes to early warning indicators.

The reasoning, they say, is that when one has a measure of latency, it is possible to work to improve it. That’s in contrast to measuring data rates, which are essentially out of one’s control. 29West suggests measuring latency message by message, day in, day out.
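To make "message by message" concrete, here is a minimal sketch of per-message latency measurement. It assumes each message carries a send timestamp and a receive timestamp in integer microseconds taken from synchronized clocks. The function names, the sample data and the 500-microsecond alert threshold are all made up for illustration and are not taken from 29West.

```python
import math

def per_message_latency_us(messages):
    """Latency of every message, given (send_us, recv_us) timestamp pairs."""
    return [recv - send for send, recv in messages]

def summarize(latencies_us):
    """Median, 99th percentile and maximum, using nearest-rank percentiles."""
    ordered = sorted(latencies_us)

    def pct(p):
        # Nearest-rank: smallest value with at least p% of samples at or below it.
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {"median": pct(50), "p99": pct(99), "max": ordered[-1]}

# Hypothetical sample: five messages, one of them delayed.
sample = [(0, 120), (1000, 1110), (2000, 2130), (3000, 3900), (4000, 4125)]
latencies = per_message_latency_us(sample)
summary = summarize(latencies)

# Assumed alert threshold of 500 microseconds, purely illustrative.
slow = [i for i, lat in enumerate(latencies) if lat > 500]
```

Because every message is measured, the single slow message stands out in the 99th percentile and the maximum even though the median barely moves.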

29West also points out that data rate measurements are really averages. Often the sampling period is not stated, or the rate is measured over an inappropriate period, one that will not reveal problems.
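A rough sketch of why the sampling period matters: the same arrival times give very different "data rates" depending on the window used. The arrival pattern below is invented for illustration (a short burst followed by steady traffic), and `peak_rate` is a hypothetical helper, not part of any product.

```python
from collections import Counter

def peak_rate(arrivals_us, window_us):
    """Highest messages-per-second figure seen in any fixed window."""
    # Bucket each arrival into a non-overlapping window of window_us microseconds.
    counts = Counter(t // window_us for t in arrivals_us)
    return max(counts.values()) * 1_000_000 / window_us

# Hypothetical one-second trace, timestamps in microseconds:
# a burst of 500 messages 10 us apart, then 500 messages 1 ms apart.
burst = list(range(0, 5_000, 10))
steady = list(range(100_000, 600_000, 1_000))
arrivals = burst + steady

rate_1s = peak_rate(arrivals, 1_000_000)   # averaged over the whole second
rate_10ms = peak_rate(arrivals, 10_000)    # averaged over 10 ms windows
rate_1ms = peak_rate(arrivals, 1_000)      # averaged over 1 ms windows
```

With these made-up numbers the one-second average reports 1,000 messages per second, while the 1 ms window shows the burst running at 100,000 messages per second. A rate quoted without its sampling period says little about what the application actually had to absorb.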

More on this subject from 29West here. Please get back to us with your own views on how best to monitor systems. Do you agree with 29West, or have you found some other metric to monitor?

Until next time … here’s some good music.

[tags]29West, data latency, low latency, latency measurement[/tags]


Comments

Measuring and interpreting latency is relatively easy. However, ignoring other metrics just because they are difficult to interpret is not a good strategy IMO. Actually, we can measure data rate at a specific point quite easily and precisely (it's not an average as suggested by 29West) - it's the inverse (1/x) of the time elapsed between a message and the subsequent message. Of course, interpreting this value is hard. A low data rate may simply reflect a low rate of messages being published, or it may be caused by various bottlenecks in the network, etc. Anyway, my point is that data rate is orthogonal to latency and can be used as another problem indicator, a.k.a. coalmine canary.
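The definition in the comment above, data rate as the inverse of the gap between consecutive messages, can be sketched as follows. This is only an illustration: the timestamps are invented, they are assumed to be arrival times in microseconds at a single measurement point, and `instantaneous_rates` is a hypothetical name.

```python
def instantaneous_rates(arrivals_us):
    """Messages per second implied by each gap between consecutive arrivals."""
    rates = []
    for prev, cur in zip(arrivals_us, arrivals_us[1:]):
        gap_us = cur - prev
        # Identical timestamps would mean an unbounded rate at this resolution.
        rates.append(1_000_000 / gap_us if gap_us > 0 else float("inf"))
    return rates

# Hypothetical arrival times: gaps of 1000 us, 500 us and 10 us.
rates = instantaneous_rates([0, 1000, 1500, 1510])
```

Each gap yields its own rate (here 1,000, 2,000 and 100,000 messages per second), so no averaging period is involved, though as the comment says, interpreting the values is the hard part.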
I had refrained from plugging my own firm's products. However, in response to Alex's comment, see http://www.low-latency.com/2007/05/11/ts-associates%e2%80%99-tipoff-moni...
Measuring latency in real time without affecting the performance of the underlying trading system is a core benefit of the TradePVR product from Helium Systems. Using a non-invasive, network monitoring approach, the product analyzes messages at the protocol level (e.g. FIX or MDP) and provides latency measurements on several different levels: 1. Round-trip through a system. 2. Single-ended line latency based on sending and receipt times (e.g. for market-data feeds). 3. Intra-hop latency when deployed at different points within an infrastructure. Considering how much firms pay for DMA connections to achieve low latency, as well as the IT costs of managing these connections, a cost-effective monitoring tool is a must. The TradePVR is also a recorder that provides an 'instant replay' of messages, should the IT staff need to investigate an incident. The TradePVR is installed at great companies like the Intercontinental Exchange (ICE), and we are aggressively pricing the product to meet the needs of firms of all sizes.
Bob @ 29West is right. You can monitor latency, or you can monitor all of the metrics that end up contributing to latency - bandwidth utilisation, memory queues, CPU utilisation, etc. A problem with any of these anywhere in the data path shows up as increased latency. So, yes, that is the canary in the coalmine. And when you see your canary starting to turn green, then you can drill down using your standard data centre monitoring tools to see what the cause is. Latency is the single best indicator of overall system health. To comment on Vikas' point, products are available today that use non-invasive passive monitoring techniques to give microsecond-accurate measurements of every message that flows through a market data system, and some firms do price solutions competitively. The critically important consideration is to measure the actual business-critical traffic flows, and not simply to fire tracer bullets through the system, as so many solutions do. Tracers tell you only the latency of your tracers, not that of your real traffic, which depends on factors such as non-linear update bursts, etc. That's the difference between active and passive latency monitoring.
Latency between which two points? There is a huge Heisenberg principle in play here. Your measuring instruments should be capable of measuring sub-millisecond latency. Needless to say, such instruments will be more expensive than the servers whose latency one wishes to measure. There is really no business case for creating such an instrumentation system.
29West definitely has the right approach. The goal, however, should be to measure end-to-end latency. See Nati Shalom's blog post on this issue: http://www.gigaspacesblog.com/2007/05/04/network-latency-vs-end-to-end-l... Geva Perry GigaSpaces http://www.gigaspaces.com
