
INFOGRAPHIC

TRAINING A LARGE LANGUAGE MODEL: THE IMPACT OF GENERATIVE AI ON INFRASTRUCTURE AND OPERATIONS

Altman Solon, one of the largest global strategy consulting firms dedicated to the TMT sectors, examines how Generative AI impacts network infrastructure.

Altman Solon believes growth in enterprise-level Generative AI tools will lead to incremental demand for compute resources, benefiting both the core, centralised data centres where training occurs and the local data centres where inference occurs.

Network providers should see a moderate increase in demand for data transit networks and a boost for private networking solutions. Altman Solon used a four-step methodology to understand the infrastructure impact, accounting for the average compute time requirement per Generative AI task, the overall volume of Generative AI tasks, the incremental compute requirements needed and the quantifiable impact on the infrastructure value chain. To meet this demand, service providers will need to start planning for adequate compute resources and network capacity.
Using data from its survey of 292 senior business leaders, Altman Solon calculated the hourly volume of Generative AI tasks across four business functions: software development, marketing, customer service and product development/design.
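The arithmetic behind this four-step methodology is straightforward. The sketch below walks through it with placeholder figures; the business functions mirror those in the survey, but every number is an assumption used purely for illustration, not Altman Solon's data.

```python
# Illustrative sketch of the four-step sizing methodology.
# All figures below are placeholder assumptions, not survey results.

# Step 1: average compute time requirement per Generative AI task (GPU-seconds)
compute_time_per_task = {
    "software_development": 2.5,
    "marketing": 1.8,
    "customer_service": 0.9,
    "product_design": 3.0,
}

# Step 2: hourly volume of Generative AI tasks per business function
hourly_task_volume = {
    "software_development": 120_000,
    "marketing": 80_000,
    "customer_service": 450_000,
    "product_design": 30_000,
}

# Step 3: incremental compute requirement (GPU-hours per hour of usage)
incremental_gpu_hours = sum(
    compute_time_per_task[fn] * hourly_task_volume[fn] / 3600
    for fn in compute_time_per_task
)

# Step 4: translate compute demand into an infrastructure-level figure,
# e.g. the number of GPUs needed at an assumed utilisation rate
utilisation = 0.7
gpus_required = incremental_gpu_hours / utilisation

print(f"Incremental demand: {incremental_gpu_hours:,.0f} GPU-hours per hour")
print(f"GPUs required at {utilisation:.0%} utilisation: {gpus_required:,.0f}")
```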
Developing and using Generative AI solutions: Training and inference's infrastructure impact
When building and using Generative AI tools, compute resources are required during two distinct phases: training the model and then using the model to respond to queries (also known as 'inference').
Even though iterating over massive amounts of data to train a model is compute-heavy, training occurs only during model development and stops once the model is finalised. Over an LLM's lifespan, training will take up 10–20% of infrastructure resources.
By contrast, inference is where most workloads take place. When a user enters a query, the information is fed through the Generative AI application's cloud environment.
The majority (80–90%) of compute workloads occur during inference, and this share only grows with use. Take ChatGPT as an example: training the model reportedly cost tens of millions of dollars, but the cost of running it is reported to exceed that training cost every week.
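The scale of that imbalance is easy to illustrate. The short calculation below uses purely assumed figures, a one-off training cost and a weekly running cost of the order reported for ChatGPT, to show how quickly cumulative inference spend overtakes the training investment.

```python
# Illustrative only: assumed costs, not confirmed figures for any specific model.
training_cost = 30_000_000          # one-off model training cost (USD), assumed
weekly_inference_cost = 35_000_000  # ongoing weekly running cost (USD), assumed

cumulative_inference = 0
for week in range(1, 53):
    cumulative_inference += weekly_inference_cost
    if cumulative_inference > training_cost:
        print(f"Inference spend overtakes the training cost in week {week}")
        break

annual_inference = weekly_inference_cost * 52
print(f"Year-one inference spend is roughly {annual_inference / training_cost:.0f}x the training cost")
```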
Generative AI models tend to be trained and stored in centralised, core data centres because these facilities have the GPUs needed to process the large volumes of data that cycle through LLMs. Models are also housed and trained in the public cloud to eliminate ingress and egress costs, since training data already resides there.
As adoption grows, Altman Solon expects inference to be conducted more locally to alleviate congestion in the core, centralised data centres where models are trained. However, this has limits: generative models are too large to house in conventional Edge locations, which tend to have higher real estate and power costs and cannot easily be expanded to accommodate resource-hungry AI workloads.
Altman Solon believes the uptick in Generative AI tools will thus have the most significant impact on the public cloud, and providers should focus on regional and elastic capacity planning to support demand.
In the near future, as models mature and as critical compute resources become cheaper, private cloud environments might begin housing and training LLMs. This could be particularly useful for regulated industries like pharmaceuticals, financial services and healthcare, which tend to prefer developing and deploying AI tools on private infrastructure.