Pete’s Blog: The Low Latency Canary

The one constant in the world of financial markets IT is that nothing stays the same for very long. The world of market data feeds and the applications that process them is no exception.

As data rates increase, and as processing systems (or any component of them) are upgraded and modified, performance can be hit. The trick is to constantly monitor the environment to pre-empt potential problems. But what should one monitor?

According to the boffins at 29West, the answer is straightforward. To quote from an article in their most recent newsletter: “Focus more on measuring application latency and less on measuring data rates”. In fact, 29West reckons that latency is the “canary in the coalmine” when it comes to early warning indicators.

The reasoning, they say, is that when one has a measure of latency, it is possible to work to improve it. That’s in contrast to measuring data rates, which are essentially out of one’s control. 29West suggests measuring latency message by message, day in, day out.
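
As a rough illustration of what message-by-message measurement might look like, here is a minimal Python sketch. It assumes each message carries a send timestamp and a receive timestamp in microseconds, taken from synchronised clocks; the function name `latency_summary` and the choice of mean, 99th percentile and maximum are illustrative assumptions, not anything 29West prescribes.

```python
import math
import statistics


def latency_summary(send_us, recv_us):
    """Summarise per-message latency in microseconds.

    send_us and recv_us are equal-length lists of integer timestamps
    (hypothetical inputs, assumed to come from synchronised clocks).
    """
    # One latency value per message, not one average per interval.
    latencies = [recv - send for send, recv in zip(send_us, recv_us)]
    ordered = sorted(latencies)
    # Nearest-rank 99th percentile: smallest value covering 99% of messages.
    rank = max(1, math.ceil(99 * len(ordered) / 100))
    return {
        "mean": statistics.fmean(latencies),
        "p99": ordered[rank - 1],
        "max": ordered[-1],
    }


# Three messages take 1 ms; one straggler takes 50 ms.
summary = latency_summary(
    [0, 1_000_000, 2_000_000, 3_000_000],
    [1_000, 1_001_000, 2_001_000, 3_050_000],
)
```

Here the mean (13,250 microseconds) understates the straggler, which is why the tail values are kept alongside it.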

29West also points out that data rate measurements are really averages, and that details such as the sample period are often not provided, or the rates are measured over inappropriate sample periods that will not show up problems.
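
To see how the sample period can hide a problem, consider this small Python sketch. It assumes arrival timestamps in integer microseconds and buckets them into fixed windows; the function name `peak_rate` and the burst pattern are made up for illustration.

```python
from collections import Counter


def peak_rate(arrival_us, window_us):
    """Highest message rate (messages per second) in any fixed window.

    arrival_us is a list of integer arrival timestamps in microseconds;
    window_us is the sample period in microseconds (both hypothetical inputs).
    """
    # Count the messages that fall into each window of length window_us.
    counts = Counter(t // window_us for t in arrival_us)
    # Scale the busiest window's count up to a per-second rate.
    return max(counts.values()) * 1_000_000 / window_us


# 1,000 messages arrive within the first 10 ms, then nothing for the rest
# of the second.
burst = [i * 10 for i in range(1000)]

per_second = peak_rate(burst, 1_000_000)  # 1-second sample period
per_10ms = peak_rate(burst, 10_000)       # 10-millisecond sample period
```

Sampled once a second, the feed looks like a modest 1,000 messages per second. Sampled every 10 ms, the same traffic peaks at 100,000 messages per second.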

More on this subject from 29West here. Please get back to us with your own views on how best to monitor systems. Do you agree with 29West, or have you found some other metric to monitor?

Until next time … here’s some good music.


6 Responses to “Pete’s Blog: The Low Latency Canary”

  1. Geva Perry Says:

    29West definitely has the right approach. The goal, however, should be to measure end-to-end latency.

    See Nati Shalom’s blog post on this issue:
    http://www.gigaspacesblog.com/2007/05/04/network-latency-vs-end-to-end-latency/

    Geva Perry
    GigaSpaces
    http://www.gigaspaces.com

  2. Vikas Deolaliker Says:

    Latency between which two points? There is a huge Heisenberg principle in play here. Your measuring instruments should be capable of measuring sub-millisecond latency. Needless to say, such instruments will be more expensive than the servers whose latency one wishes to measure. There is really no business case for creating such an instrumentation system.

  3. Henry Young Says:

    Bob @ 29West is right. You can monitor latency, or you can monitor all of the metrics that end up contributing to latency - bandwidth utilisation, memory queues, CPU utilisation, etc. A problem with any of these anywhere in the data path shows up as increased latency. So, yes, that is the canary in the coalmine. And when you see your canary starting to turn green, then you can drill down using your standard data centre monitoring tools to see what the cause is. Latency is the single best indicator of overall system health.

    To comment on Vikas’ point, products are available today that use non-invasive passive monitoring techniques to give microsecond-accurate measurements of every message that flows through a market data system, and some firms do price solutions competitively. The critically important consideration is to measure the actual business-critical traffic flows, and not simply to fire tracer bullets through the system, as so many solutions do. Firing tracers serves only to tell you the latency of your tracers rather than that of your real traffic, which depends on factors such as non-linear update bursts. That’s the difference between active and passive latency monitoring.

  4. Alex Malone Says:

    Measuring latency in real-time without affecting the performance
    of the underlying trading system is a core benefit of the
    TradePVR product from Helium Systems. Using a non-invasive,
    network monitoring approach, the product analyzes messages at the
    protocol level (e.g. FIX or MDP) and provides latency
    measurements on several different levels: 1. Round-trip through a system.
    2. Single-ended line-latency based on sending and receipt
    times (e.g. for market-data feeds). 3. Intra-hop latency when
    deployed at different points within an infrastructure.

    When considering how much firms pay for DMA connections to
    achieve low-latency, as well as IT costs of managing these
    connections, a cost-effective monitoring tool is a must. The
    TradePVR is also a recorder that provides an ‘instant-replay’ of
    messages, should the IT staff need to investigate an incident.

    The TradePVR is installed at great companies like the
    Intercontinental Exchange (ICE), and we are aggressively pricing
    the product to meet the needs of firms of all sizes.

  5. Henry Young Says:

    I had refrained from plugging my own firm’s products. However, in response to Alex’s comment, see http://www.low-latency.com/2007/05/11/ts-associates%e2%80%99-tipoff-monitors-middleware-for-latency-issues/

  6. Martin Sustrik Says:

    Measuring and interpreting latency is relatively easy.
    However, ignoring other metrics just because they are difficult
    to interpret is not a good strategy IMO.

    Actually, we can measure data rate at a specific point quite easily
    and precisely (it’s not an average as suggested by 29West): take
    the time elapsed between a message and the subsequent message, x;
    the data rate is its inverse, 1/x.

    Of course, interpreting this value is hard. A low data rate may simply
    mean that messages are being published at a low rate. It can also be
    caused by various bottlenecks in the network, etc.

    Anyway, my point is that data rate is orthogonal to latency and can be
    used as another problem indicator a.k.a. coalmine canary.
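
The per-message data rate described in Martin Sustrik's comment can be sketched in a few lines of Python. This assumes integer arrival timestamps in microseconds at a single measurement point; the function name `instantaneous_rates` and the handling of identical timestamps are illustrative assumptions.

```python
def instantaneous_rates(arrival_us):
    """Rate (messages per second) implied by each inter-arrival gap.

    arrival_us is a list of integer arrival timestamps in microseconds,
    in arrival order (a hypothetical input).
    """
    rates = []
    for prev, cur in zip(arrival_us, arrival_us[1:]):
        gap_us = cur - prev
        # The rate is the inverse of the gap; a zero gap (two messages with
        # the same timestamp) is reported as infinite rather than dividing
        # by zero.
        rates.append(1_000_000 / gap_us if gap_us > 0 else float("inf"))
    return rates


# Gaps of 1 ms, 0.5 ms and 1 s between four consecutive messages.
rates = instantaneous_rates([0, 1_000, 1_500, 1_001_500])
```

The three gaps give rates of 1,000, 2,000 and 1 message per second, so the final value flags the one-second silence that a per-second average would smooth over.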
