Intelligent Data Centres Issue 56 | Page 31

INDUSTRY INTELLIGENCE

How to tolerate the tail: Addressing long-tail latency in data centres

Digital Transformation has increased demand for data centres, but their response capabilities are the key driver of revenue. Rajkumar Vijayarangakannan, Lead of Network Design and DevOps, ManageEngine, discusses how to minimise long-tail latency and improve data centre operations.

The widespread adoption of data centres today has prompted numerous businesses to embrace and deliver advanced, highly interactive, real-time services via Edge networks distributed worldwide. The proliferation of users, mobile apps and the 5G revolution have contributed massively to the growing scale of, and demand for, these services.

Demand aside, it is the speed and responsiveness of these services that drive their revenue and reliability, as they rely heavily on instant response times. The strong need for quick and reliable delivery has driven businesses to adopt distributed platforms and microservices architectures.
To enhance responsiveness, these complex architectures slice and parallelise end-user requests into several sub-operations that are executed across a large number of shared, multitenant physical machines, either as virtual machines (VMs) or as containers. As a result, response times become less predictable. The larger the scale of operations in data centres, the greater the impact of latency variability.
Long-tail latency shows up as a stretched-out distribution of response times: most requests complete quickly, but a small fraction take far longer, and that fraction weighs on the overall performance of the data centre. The familiar adage ‘the tail wags the dog’ is relevant here: in modern data centres, subtle or infrequent events tend to dominate overall network performance.
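To make the idea of a tail concrete, here is a minimal Python sketch that compares the median, the 99th percentile and the mean of a set of response times. The latency values are invented for illustration and the `percentile` helper is a hypothetical name, not part of any tool mentioned in this article; it uses the simple nearest-rank method.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number percentiles avoid rounding surprises.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Invented latencies in milliseconds: 98 fast responses and 2 rare slow ones.
latencies_ms = [10] * 98 + [250, 900]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
mean = sum(latencies_ms) / len(latencies_ms)

print(f"p50={p50} ms, p99={p99} ms, mean={mean:.1f} ms")
```

In this made-up sample the median stays at 10 ms while the 99th percentile is 250 ms, which is why tail behaviour is usually tracked with high percentiles rather than averages.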
In such complex environments, every subprocess must respond with consistently low latency before a final response is delivered to the client; otherwise, the overall response time will be painfully slow. With thousands of microservices executing in parallel, the slowest subprocess determines the overall response time of user-facing, real-time web services.
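The effect of fan-out on response time can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a model of any real data centre: sub-operations are treated as independent, each is either fast or slow with a fixed probability, and the function names and latency figures are hypothetical.

```python
import random


def chance_of_slow_request(fanout, slow_fraction):
    """Probability that at least one of `fanout` independent sub-operations is slow."""
    return 1 - (1 - slow_fraction) ** fanout


def request_latency(sub_latencies_ms):
    """A fan-out request completes only when its slowest sub-operation completes."""
    return max(sub_latencies_ms)


def simulate(fanout, slow_fraction, trials=10_000, seed=0):
    """Fraction of simulated requests that contain at least one slow sub-operation."""
    rng = random.Random(seed)
    fast_ms, slow_ms = 10, 1000  # hypothetical latencies, chosen only for illustration
    slow_requests = 0
    for _ in range(trials):
        subs = [slow_ms if rng.random() < slow_fraction else fast_ms
                for _ in range(fanout)]
        if request_latency(subs) == slow_ms:
            slow_requests += 1
    return slow_requests / trials


for fanout in (1, 10, 100):
    predicted = chance_of_slow_request(fanout, 0.01)
    observed = simulate(fanout, 0.01)
    print(f"fanout={fanout:3d} predicted={predicted:.3f} simulated={observed:.3f}")
```

Under these assumptions, if only 1% of sub-operations are slow, a request that fans out to 100 of them has roughly a 63% chance of waiting on at least one slow response, since 1 - 0.99^100 is about 0.634.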
What causes long-tail latencies?
The causes of long-tail latency lie not just in the availability of resources but also in interactions at the data centre component level. The various factors causing long-tail latencies include:
1. Resource contention: This could arise from different concurrent workloads operating within the same shared environment or even resource contention within a single workload,
www.intelligentdatacentres.com