UNCOVERING THE LAYERS
the greater the accuracy of the prediction of the desired outcome. GPT-3 was built on 175 billion parameters, but across AI the number of parameters is rising rapidly. Wu Dao, a Chinese LLM, was built with 1.75 trillion parameters and, as well as generating text, also provides text-to-image and text-to-video capabilities. Expect the numbers to continue to grow.
With no hard data available, it is reasonable to surmise that the computational power required to run a model with 1.7 trillion parameters is significant. As we move into more AI video generation, the data volumes and the number of parameters used in models will surge.
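As a rough illustration only (the 6-FLOPs-per-parameter-per-token rule of thumb and the 13 trillion token count are assumptions drawn from published scaling-law discussions and the RISE figures quoted later, not from hard vendor data), a back-of-envelope sketch of the training compute involved:

```python
# Back-of-envelope training-compute estimate for a 1.7-trillion-parameter model.
# Assumption: ~6 floating-point operations per parameter per training token,
# a rule of thumb commonly used in scaling-law discussions.

parameters = 1.7e12                 # 1.7 trillion parameters
tokens = 13e12                      # 13 trillion training tokens (assumed)
flops_per_param_per_token = 6       # assumed rule of thumb

total_flops = flops_per_param_per_token * parameters * tokens
print(f"Estimated training compute: {total_flops:.1e} FLOPs")  # ~1.3e26 FLOPs
```

Even spread over months of training, that order of magnitude is why tens of thousands of accelerators, and the power to run and cool them, are needed.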
Transformers
Transformers are a type of neural network architecture developed to solve the problem of sequence transduction, or neural machine translation: any task that transforms an input sequence into an output sequence. Unlike earlier recurrent networks, which loop data from one layer back through the previous layer before passing it on, transformer layers use attention to weigh every part of the input sequence at once when predicting what comes next. This improves tasks such as speech recognition and text-to-speech transformation.
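For readers who want to see the mechanism rather than the metaphor, below is a minimal, illustrative sketch (not taken from the article) of the scaled dot-product attention at the heart of a transformer layer, written in plain NumPy with arbitrary toy dimensions:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core of a transformer layer: every position in the sequence attends to
    every other position, weighted by how similar its query is to each key."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over the sequence
    return weights @ V                                  # weighted mix of value vectors

# Toy self-attention example: a sequence of 4 tokens, each an 8-dimensional vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(x, x, x).shape)      # (4, 8)
```

In a full model, many such layers (plus learned projections and feed-forward blocks) are stacked, which is where the billions, and now trillions, of parameters accumulate.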
How much is enough power ? What researchers , analysts and the press are saying
A report by S&P Global, titled POWER OF AI: Wild predictions of power demand from AI put industry on edge, quotes several sources: "Regarding US power demand, it's really hard to quantify how much demand is needed for things like ChatGPT," said David Groarke, Managing Director at consultancy Indigo Advisory Group, in a recent phone interview. "In terms of macro numbers, by 2030 AI could account for 3% to 4% of global power demand. Google said right now AI is representing 10% to 15% of their power use, or 2.3 TWh annually."
A calculation of the actual power used to train AI models was offered by RISE – the Research Institutes of Sweden. It said: "Training a super-large language model like GPT-4, with 1.7 trillion parameters and using 13 trillion tokens (word snippets), is a substantial undertaking. OpenAI has revealed that it cost them US$100 million and took 100 days, utilising 25,000 NVIDIA A100 GPUs. Servers with these GPUs use about 6.5kW each, resulting in an estimated 50GWh of energy usage during training."
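The arithmetic behind that 50GWh figure can be reproduced with one extra assumption, namely eight GPUs per server (typical of NVIDIA DGX A100 systems, though not stated in the quote):

```python
# Reproducing the RISE training-energy estimate for a GPT-4-scale model.
gpus = 25_000                 # NVIDIA A100 GPUs, as quoted
gpus_per_server = 8           # assumption: DGX A100-style servers
power_per_server_kw = 6.5     # per-server draw quoted by RISE
training_days = 100           # training duration quoted by RISE

servers = gpus / gpus_per_server                        # 3,125 servers
total_power_mw = servers * power_per_server_kw / 1_000  # ~20.3 MW of IT load
energy_gwh = total_power_mw * training_days * 24 / 1_000
print(f"{energy_gwh:.0f} GWh")                          # ~49 GWh, close to the quoted 50GWh
```

Note that this counts only the IT load; cooling and other facility overheads would push the real figure higher.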
This is important because the energy used by AI is rapidly becoming a topic of public discussion .
Data centres are already on the map and ecologically focused organisations are taking note. According to the site 8billiontrees, there are as yet no published estimates of the AI industry's total footprint, and the field is growing so rapidly that an accurate number would be nearly impossible to obtain. Looking at the carbon emissions of individual AI models is the gold standard at this time. The majority of the energy is dedicated to powering and cooling the hyperscale data centres where all the computation occurs.
Conclusion
As we wait for the numbers to emerge on past and existing power use for ML and AI, what is clear is that once models move into production and widespread use, we will be at the exabyte and exaflop scale of computation. For data centre power and cooling, that is when things become really interesting, and more challenging.