Meta builds the world’s most powerful “AI trainer”

As part of Meta’s preparations for “The Metaverse”, the company is now building what will be the world’s most powerful AI supercomputer.

The project has already been underway for a year and a half, and the first phase of the construction of the “AI Research SuperCluster (RSC)” has been completed. The machine is operational and is already the fifth most powerful of its kind, according to the Guardian. But when it is completed by the summer of 2022, it will have taken over the performance throne.

Will train artificial intelligence



RSC will mainly be used to “train” Meta’s AI systems to perform various tasks, such as automatic moderation and recognition of incitement and other unwanted statements on Facebook. But Meta also has other plans for the supermachine, write Kevin Lee and Shubho Sengupta in a blog post:

– RSC will help Meta’s AI researchers build new and better AI models that can learn from trillions of examples, work across hundreds of different languages, analyze text, images and video seamlessly together, develop new tools that can use augmented reality (“AR”) and much more. We hope RSC will help us build brand new AI systems that, for example, can translate speech in real time for groups of many people who each speak their own language, so that they can collaborate seamlessly on a research project or play an AR game together.


RSC is in operation today, but it is only the first phase. Photo: Screenshot / Meta

Hardware

At the time of writing, RSC has 760 of Nvidia’s DGX A100 modules, each equipped with eight A100 GPUs. That makes 6,080 GPUs in total, each of which has almost twice as many transistors as the GA102 chip that sits in Nvidia’s top consumer model, the RTX 3090.

When completed, RSC will have no fewer than 16,000 A100 GPUs and impressive processing power. By comparison, Sweden’s most powerful machine of this type has “only” 480 A100 GPUs, while the Perlmutter supercomputer has 6,159.
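The hardware figures above can be checked with a little arithmetic. The following is a minimal Python sketch that uses only the numbers quoted in the article; the function names are made up for illustration, and the count of modules needed for the full build-out assumes it keeps using eight-GPU DGX A100 modules, which the article does not state.

```python
# Figures quoted in the article; the names below are illustrative only.
DGX_A100_MODULES = 760   # DGX A100 modules installed in the first phase
GPUS_PER_MODULE = 8      # A100 GPUs in each DGX A100 module
PLANNED_GPUS = 16_000    # A100 GPUs planned for the completed RSC


def current_gpu_count(modules: int = DGX_A100_MODULES,
                      per_module: int = GPUS_PER_MODULE) -> int:
    """Total number of GPUs in the first phase."""
    return modules * per_module


def modules_for(gpus: int, per_module: int = GPUS_PER_MODULE) -> int:
    """Modules needed for a GPU count, assuming eight-GPU modules throughout."""
    return -(-gpus // per_module)  # ceiling division


def growth_factor(planned: int = PLANNED_GPUS) -> float:
    """How many times larger the planned GPU count is than the current one."""
    return planned / current_gpu_count()


print(current_gpu_count())        # 6080
print(modules_for(PLANNED_GPUS))  # 2000
print(round(growth_factor(), 2))  # 2.63
```

In other words, the finished machine would have roughly 2.6 times as many GPUs as the first phase.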

Chewing through as much data as RSC can handle also requires a dedicated storage system, and RSC will get more than 230 petabytes of storage capacity, all of it of the fast kind.

Performance



It is difficult to give comparable performance figures for AI supercomputers.

For example, RSC will not necessarily be the world’s most powerful supercomputer overall (although it will claim a place on that list too), but the most powerful AI supercomputer. Performance for the two types is measured differently, as AI workloads have lower requirements for numerical precision in their calculations.
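To give a feel for what lower precision means, here is a small Python sketch that stores the same number in 16-bit, 32-bit and 64-bit floating-point formats and measures the rounding error. The choice of formats and the test value 0.1 are assumptions for illustration; the article does not say which number formats RSC actually uses.

```python
import struct

# struct format codes for IEEE 754 floats (little-endian):
# "<e" = 16-bit half, "<f" = 32-bit single, "<d" = 64-bit double.
FORMATS = {"half": "<e", "single": "<f", "double": "<d"}


def round_trip(value: float, fmt: str) -> float:
    """Store a value in the given format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def rounding_error(value: float, fmt: str) -> float:
    """Absolute error introduced by storing the value in the given format."""
    return abs(round_trip(value, fmt) - value)


for name, fmt in FORMATS.items():
    print(name, rounding_error(0.1, fmt))
```

The half-precision error is several orders of magnitude larger than the single-precision one. Many AI training workloads tolerate that, whereas traditional scientific simulations are typically benchmarked in 64-bit arithmetic.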

In addition, machines in this class are often rated by their theoretical peak performance, which frequently bears only an academic relationship to what you actually get out of them in practical tasks:

– It is not uncommon for some supercomputers to achieve less than 25 percent of their so-called peak performance when running applications in the real world, Bob Sorensen of Hyperion Research tells The Verge.
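The gap between peak and real-world performance is easy to express as a calculation. Below is a minimal Python sketch; the peak figure of 1,000 units is a made-up placeholder rather than a number for RSC or any real machine, and the 25 percent efficiency is simply the upper bound from the quote.

```python
def sustained_performance(peak: float, efficiency: float) -> float:
    """Performance actually delivered, given peak performance and the
    fraction of it (0.0 to 1.0) that real applications achieve."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return peak * efficiency


# Hypothetical machine with a peak of 1000 units (any unit of throughput),
# running at the 25 percent efficiency mentioned in the quote.
print(sustained_performance(1000.0, 0.25))  # 250.0
```

A machine that tops a ranking based on peak figures can therefore still deliver far less than a lower-ranked machine that runs its applications more efficiently.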

Encrypted data

RSC uses real-world data drawn from Meta’s various services, but that data must be encrypted end to end:

– Before data is imported into RSC, it must go through a privacy review process to verify that it is properly anonymized. The data is then encrypted before it can be used to train AI models, Lee and Sengupta write in the blog post.

According to Tom’s Hardware, RSC is more or less isolated from the internet, and all incoming data must first pass through Meta’s own servers. The security system encrypts virtually all data, even on RSC’s own storage systems, and the data is only decrypted once it has been loaded into working memory to be used for AI training.


