Advanced Computing in the Age of AI | Friday, April 19, 2024

Facebook Upgrades Servers With AI Training 


Facebook unveiled new compute and storage servers this week along with a custom training server platform as part of a server "refresh."

Along with its Bryce Canyon storage server and Yosemite v2 and Tioga Pass compute servers, the social media giant announced Big Basin, a server designed to improve the training of neural networks. The next-generation GPU-based server would allow Facebook to train machine-learning models that are 30 percent larger than those supported by the platform's predecessor, Big Sur. The company said during this week's Open Compute Project summit in Silicon Valley that the performance boost includes a 25 percent increase in memory to 16 GB, enabling research tasks such as learning to identify images based on "enormous" searches.

The company (NASDAQ: FB) claimed Big Basin delivered a nearly 100-percent improvement over Big Sur in tests with an image classification model. Big Sur was announced in 2015.

Those image recognition features are growing in importance as more photos and videos are being shared on social media platforms, the company stressed. Along with photo and video classification, Facebook uses AI to deliver services such as speech and text translations.

Big Basin is based on eight Nvidia Tesla P100 accelerators ganged to form a "hybrid cube mesh," the company said.
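As a rough illustration of what a "hybrid cube mesh" of eight accelerators can look like, the sketch below models one commonly described layout: two groups of four GPUs, each group fully connected, with every GPU also linked to its counterpart in the other group. The wiring, the GPU numbering and the function names are assumptions made for illustration; Facebook's announcement as reported here does not spell out the exact link layout.

```python
from collections import deque

NUM_GPUS = 8   # eight accelerators, as reported for Big Basin
QUAD = 4       # assumed: two groups of four GPUs


def hybrid_cube_mesh():
    """Return an adjacency map for the assumed hybrid cube mesh layout."""
    links = {gpu: set() for gpu in range(NUM_GPUS)}
    for gpu in range(NUM_GPUS):
        quad_start = (gpu // QUAD) * QUAD
        # Assumed: every GPU links to the three other GPUs in its own group.
        for peer in range(quad_start, quad_start + QUAD):
            if peer != gpu:
                links[gpu].add(peer)
        # Assumed: one extra link to the matching GPU in the other group.
        links[gpu].add((gpu + QUAD) % NUM_GPUS)
    return links


def max_hops(links):
    """Breadth-first search from every GPU; return the longest shortest path."""
    worst = 0
    for src in links:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in links[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


mesh = hybrid_cube_mesh()
links_per_gpu = {gpu: len(peers) for gpu, peers in mesh.items()}
total_links = sum(links_per_gpu.values()) // 2  # each link is counted twice
```

Under these assumptions each GPU uses four links, and any GPU can reach any other in at most two hops, which is the property that makes this kind of mesh attractive for exchanging data between accelerators during training.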

"We designed the system to allow for the disaggregation of the CPU compute from the GPUs, which enables us to leverage and connect existing [Open Compute Project] components and integrate new technology when necessary," Kevin Lee, a Facebook software engineer, explained in a blog post.

"For the Big Basin deployment, we are connecting our Facebook-designed, third-generation compute server as a separate building block from the Big Basin unit, enabling us to scale each component independently," Lee added.

The new Bryce Canyon server also reflects the growing requirement for high-density storage of photos and video. The combination of more processing power and increased memory capacity along with a 20 percent boost in hard drive density yielded a four-fold boost in computing power over the Honey Badger storage server introduced in 2015.

Meanwhile, a pair of new compute servers stresses power efficiency in hyper-scale datacenters along with increased IO bandwidth. The new version of the Yosemite compute server supports "hot service," meaning it continues to operate when pulled from a rack for maintenance. Tioga Pass, for its part, incorporates dual-socket motherboards and increased bandwidth to GPUs, flash memory and network cards.

Facebook said it was contributing the server designs to the Open Compute Project. It is also contributing a new design for 100G optical connections used to link compute and storage servers in datacenters via the company's switching architecture. The goal is to upgrade datacenters to operate at 100 gigabits per second, boosting data rates while allowing for future upgrades, Facebook said.

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).

EnterpriseAI