A revolution in AI infrastructure: Graviton5's 3-nanometer process and fivefold larger cache promise unprecedented speed for future AI services.
Meta has entered into a large-scale agreement with Amazon, involving the deployment of tens of millions of AWS Graviton processor cores. These powerful chips will significantly strengthen the company's computing base needed to create a new generation of AI services. This encompasses a wide range of tasks: from advanced search and code generation to multi-step agents capable of quickly planning actions and executing complex tasks. This deal not only deepens Meta's long-standing partnership with Amazon Web Services but also signals significant changes in the approach to AI infrastructure.
Until recently, the AI industry's focus was on graphics processing units (GPUs), which are indispensable for training large-scale models. But once AI products ship to users, the load on computing systems changes: requests must be processed quickly, complex reasoning chains run, data handled efficiently, numerous operations coordinated, and results delivered to billions of users. These tasks require not only GPUs but also high-performance central processors (CPUs).
Amazon plans to begin with deliveries of tens of millions of Graviton cores, with the option to increase volumes further as Meta's computing needs grow. These processors will support a variety of workloads behind the company's AI services. Given that Meta operates platforms with a billion-user audience, even a slight gain in speed or reduction in energy consumption at one layer of the infrastructure yields an enormous effect at the scale of the entire system.
Graviton is Amazon's own line of server processors built on the Arm architecture. AWS positions these chips as faster, cheaper, and more energy-efficient for cloud workloads than many traditional server processors. In the context of AI infrastructure, cost savings are no longer a secondary factor but a key element of strategy: modern data centers demand ever more energy, more racks, and, crucially, more predictable performance.
The latest Graviton5 is manufactured on an advanced 3-nanometer process and carries 192 cores. Amazon claims up to 25% higher performance than the previous generation, and the new chip has five times the cache of its predecessor. A larger cache lets the processor reach data significantly faster, and low latency between cores is critical for systems that process many interrelated tasks in parallel.
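Why cache size matters can be illustrated with a generic memory-locality experiment (not Graviton-specific, and not from Amazon's materials): summing the same array sequentially, so each cache line is fetched once and fully used, versus with a large stride, which touches a new line on almost every access. A minimal Python sketch, with arbitrary sizes; in CPython the interpreter overhead masks much of the gap, and a C version on real hardware would show it far more starkly:

```python
import time

def sum_sequential(data):
    # Walks memory in order: each cache line is fetched once and fully used.
    total = 0
    for x in data:
        total += x
    return total

def sum_strided(data, stride):
    # Jumps through memory in big steps: poor spatial locality,
    # so the cache helps far less. Same additions, different order.
    total = 0
    n = len(data)
    for start in range(stride):
        for i in range(start, n, stride):
            total += data[i]
    return total

data = list(range(1_000_000))

t0 = time.perf_counter()
s_seq = sum_sequential(data)
t_seq = time.perf_counter() - t0

t0 = time.perf_counter()
s_str = sum_strided(data, 4096)
t_str = time.perf_counter() - t0

assert s_seq == s_str  # identical work, only the access pattern differs
print(f"sequential: {t_seq:.3f}s, strided: {t_str:.3f}s")
```

The larger and faster a processor's cache, the smaller the penalty for workloads whose access patterns fall between these two extremes, which is most real server code.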
Amazon also notes a reduction in latency for data exchange between cores by up to 33%. For AI services, this metric directly determines the response speed to the user. An AI agent performing information searches, code writing, step verification, and calling external tools constantly switches between different operations. The faster the infrastructure can link these processes, the shorter the pause becomes between the user request and the received result.
Graviton works in tandem with the AWS Nitro System, Amazon's hardware-software platform that provides security, networking functions, and high performance in the cloud. The processors also support Elastic Fabric Adapter (EFA), a technology for low-latency communication between servers in large clusters. Such clusters are needed when a single task is distributed across many machines, and it is critical that data exchange not slow down the entire process.
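For a concrete sense of how EFA is requested in practice, an EC2 launch-template fragment can declare an EFA-enabled network interface. This is a hedged sketch: the instance type, subnet, and security-group IDs below are placeholders, not details from the article or from the Meta deployment.

```json
{
  "LaunchTemplateData": {
    "InstanceType": "c7g.16xlarge",
    "NetworkInterfaces": [
      {
        "DeviceIndex": 0,
        "InterfaceType": "efa",
        "SubnetId": "subnet-0123456789abcdef0",
        "Groups": ["sg-0123456789abcdef0"]
      }
    ]
  }
}
```

Setting `"InterfaceType": "efa"` attaches the low-latency fabric adapter, so MPI- or NCCL-style collective communication between nodes bypasses much of the normal TCP networking stack.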
This deal clearly demonstrates that the race in the AI field is no longer limited to merely acquiring GPUs for model training. Modern agent systems require the ability to plan, reason, generate code, and perform complex sequences of actions. After the model training is completed, the main load shifts to inference – the process of the finished model working with real user requests. At this stage, central processors, memory capacity, network speed, and overall energy efficiency take on increasingly significant roles.
For Meta, diversifying its sources of computing power has become strategically important. The company wants to avoid dependence on a single type of chip or a single supplier, especially while demand for AI resources outpaces the construction of new data centers. Graviton gives Meta an additional way to scale the CPU workloads that underpin agentic AI.
For Amazon, this agreement with Meta serves as a powerful public confirmation of the success of their strategy for developing proprietary processors. Although Graviton has long been actively used within AWS, deploying tens of millions of cores for one of the largest consumers of AI infrastructure elevates this line to a whole new level. Amazon convincingly demonstrates that competition in the AI field is unfolding not only around expensive accelerators but also around the entire computing foundation: central processors, networking solutions, cache memory, energy consumption, and the overall cost of processing each request.