Inside Extreme Scale Tech|Thursday, October 2, 2014

Facebook Unveils Homegrown Ethernet Switch 


Social media giant Facebook is one step closer to ousting Cisco Systems as its primary datacenter switching supplier with the launch of a new top-of-rack engineering project inside of the company code-named “Wedge.”

As in the front of the wedge that is going to split apart the $20 billion and fairly monolithic datacenter switching market.

Jay Parikh, vice president of infrastructure engineering at Facebook, divulged some of the details on the homemade Wedge top-of-rack switch at the GigaOm Structure conference in San Francisco. The Wedge switch is the next step in a journey that Facebook has been on for the past year to peel the skins off of Layer 2 switches, break them up into modular components, and open them up for innovation by hackers like those at Facebook, its hyperscale peers, and those in the supercomputing and high-end enterprise sectors who, no doubt, would like to have more control over their networking stacks and do a little hacking here and there to make things work better.

Last year, Facebook first started the open networking effort within the Open Compute Project that it founded nearly three years ago to open up designs for servers, storage, and datacenters. The idea with all of these open projects is to foster the development of machines that suit the needs of hyperscale companies like Facebook and to share innovation across members, much like the open source community has done for decades for software. Open Compute has gathered steam and garnered the support of component-level and system-level vendors who know that it is better to join the open hardware movement than to try to fight it. But as Najam Ahmad, director of technical operations at the social network, explained to EnterpriseTech last year, the goal is not just to have a commoditized switch that is analogous to an X86 server. The goal is to bust open switch appliances and modularize both the hardware and the software so building blocks can be changed independently and at a rate that customers – not vendors – dictate.

Last fall, prototype Open Compute switches were revealed by switch chip makers Broadcom, Intel, and Mellanox Technologies. This was the first step toward true open networking and the machines have been in testing at Facebook and other unnamed customers in the financial services and service provider communities that are also seeking a more hackable network stack. Intel had already lined up Quanta Computer and Accton to manufacture its “SeaCliff Trail” reference switch if any Open Compute members were eager to have the Intel variant of the open networking switch.

[Figure: Facebook Wedge switch block diagram]

This was phase one of the open networking development effort, explained Parikh, and it is in phase two that things get very interesting.

With phase one, Facebook wanted to pry the network operating system loose from the underlying switch hardware. It also wanted to encourage vendors to open up their hardware platforms in two ways: by adding X86 coprocessors, so that network functions formerly running on other appliances or on servers could be moved onto the switch, and (this is heresy for the main switch makers) by allowing multiple switch operating systems to run on the iron. Facebook got behind Cumulus Networks, a provider of a Linux-based switch operating system, and no doubt hopes for more to come.

With phase two of the open networking project, Facebook is once again getting out ahead of the vendors (as it did with its initial server and datacenter designs three years ago) and building its own switch hardware, the above-mentioned Wedge, and switch operating system, called FBOSS.

Wedge is supposed to look, feel, and operate like a server, and strictly speaking, FBOSS is not a traditional network operating system as we know it. As Jimmy Williams, a networking engineer at Facebook, explained it to EnterpriseTech, FBOSS is based on the company’s homegrown version of Linux, and its function is to run network applications on the compute element of the Wedge switch. One of those applications just so happens to be a switching system that tells the network ASIC inside of the Wedge switch what to do.

What FBOSS is ultimately doing, Williams explained, is hooking into Facebook’s Thrift networking service. Facebook created Thrift to automatically generate the client and server sides of remote procedure calls in distributed applications, allowing these to be spawned from the many different programming languages in use at Facebook. Facebook open sourced Thrift back in 2007, and earlier this year it open sourced an improved version called FBThrift to try to get some more community momentum behind it.
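To make the idea of generating both sides of a remote procedure call concrete, here is a minimal sketch in plain Python. It is not Thrift and does not use Thrift's API or interface language; it only imitates the pattern of deriving a client stub and a server dispatcher from one shared interface description. The `INTERFACE` table, the `RouteService` class, and the route-related method names are hypothetical and were invented for illustration, and the "network" is simply a function call passing JSON strings.

```python
import json
from typing import Callable, Dict, List, Optional

# Hypothetical interface description: method name -> ordered parameter names.
INTERFACE: Dict[str, List[str]] = {
    "add_route": ["prefix", "next_hop"],
    "get_route": ["prefix"],
}


class RouteService:
    """Hand-written server-side logic (hypothetical example service)."""

    def __init__(self) -> None:
        self.routes: Dict[str, str] = {}

    def add_route(self, prefix: str, next_hop: str) -> bool:
        self.routes[prefix] = next_hop
        return True

    def get_route(self, prefix: str) -> Optional[str]:
        return self.routes.get(prefix)


def make_server(handler: object, interface: Dict[str, List[str]]) -> Callable[[str], str]:
    """Build the server side: decode a request, call the handler, encode the reply."""

    def dispatch(request: str) -> str:
        msg = json.loads(request)
        method = msg["method"]
        if method not in interface:
            return json.dumps({"error": "unknown method " + method})
        result = getattr(handler, method)(**msg["args"])
        return json.dumps({"result": result})

    return dispatch


def make_client(transport: Callable[[str], str], interface: Dict[str, List[str]]) -> object:
    """Build the client side: one stub per interface method, created in a loop."""

    class Client:
        pass

    client = Client()
    for name, params in interface.items():
        # Default arguments pin the current name/params for each stub.
        def stub(*args, _name=name, _params=params):
            if len(args) != len(_params):
                raise TypeError(_name + " expects " + str(len(_params)) + " arguments")
            request = json.dumps({"method": _name, "args": dict(zip(_params, args))})
            reply = json.loads(transport(request))
            if "error" in reply:
                raise RuntimeError(reply["error"])
            return reply["result"]

        setattr(client, name, stub)
    return client


# In-process "wire": the client's transport is the server's dispatch function.
server = make_server(RouteService(), INTERFACE)
client = make_client(server, INTERFACE)
added = client.add_route("10.0.0.0/8", "192.168.1.1")
found = client.get_route("10.0.0.0/8")
missing = client.get_route("172.16.0.0/12")
```

Neither the caller nor the service contains any encoding or dispatch code; both halves come from the one interface table. A real RPC framework would additionally generate these stubs for several programming languages and carry the messages over an actual network connection.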


The Wedge switch was built using Broadcom’s Trident-II switch ASIC, which is one of the most popular switch chips on the market. (Intel’s Fulcrum Microsystems and Mellanox Technologies’ Switch-X chips are the other two that dominate the Ethernet switch market; Cisco Systems has its own ASICs, of course, as well as using merchant silicon.) The Broadcom Trident-II ASIC is set up to provide sixteen 40 Gb/sec ports, and Parikh said it could easily be expanded to 32 ports. The ports can also be equipped with splitter cables, breaking each one down into four 10 Gb/sec ports, which would boost the effective port count to 64 ports in a 1U enclosure, with the possibility of doubling it up again at the 10 Gb/sec speed.
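The port arithmetic above can be checked with a short sketch. The four-way split (one 40 Gb/sec port into four 10 Gb/sec ports) is inferred from the article's 16-to-64 figure, and the function names are illustrative only.

```python
# Port-count arithmetic for the figures quoted in the article.
# Assumption: each splitter cable turns one 40 Gb/sec port into 40 // 10 = 4 ports.
SPLIT_FACTOR = 40 // 10


def breakout_ports(ports_40g: int, split: int = SPLIT_FACTOR) -> int:
    """10 Gb/sec port count when every 40 Gb/sec port gets a splitter cable."""
    return ports_40g * split


def aggregate_gbps(ports_40g: int) -> int:
    """Total front-panel bandwidth in Gb/sec, which splitting does not change."""
    return ports_40g * 40


for ports in (16, 32):
    print(ports, "x 40G ->", breakout_ports(ports), "x 10G,",
          aggregate_gbps(ports), "Gb/sec aggregate")
```

Under these assumptions the 16-port configuration yields 64 ports at 10 Gb/sec, and a 32-port configuration would yield 128, which matches the "doubling it up again" remark.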

The Wedge switch has a compute element, which as it turns out is a microserver that is based on an unspecified Intel processor (most likely an eight-core “Avoton” C2000 processor) and adheres to Facebook’s “Group Hug” microserver specification. Parikh said that the Wedge system was designed to accept other kinds of processor modules, and he called out ARM chips in particular, as Facebook has been doing for a couple of years now while it awaits the maturation of an X86 alternative.


Last fall, when Facebook was showing off the open networking reference designs, it was careful to show systems using all three major merchant networking ASICs – those from Broadcom, Intel, and Mellanox. Thus far, Facebook has only built a Wedge switch using the Broadcom Trident-II chip, and both Parikh and Williams were mum on any plan to add other switch ASICs. But clearly the system is designed so a different ASIC can be plugged in as desired, and such choice and flexibility is, in fact, the whole point of the open networking effort at Facebook and at the Open Compute Project.

Williams tells EnterpriseTech that the Wedge switches are currently being tested in the company’s datacenter networks ahead of full-scale production. The deployment plan inside of Facebook’s three – soon to be four – datacenters was not revealed, and Williams said that it was “all about how the hardware and the software perform.”

Parikh said that Facebook would eventually contribute the design to the Open Compute Project, where others could modify it or, if they were happy with it, seek out an OCP-approved manufacturer to build it for production use. Facebook did not say if it would make the switches for production itself through an ODM partner or wait until one of the OCP manufacturing partners did it on its own. “If we have to build it, it is fine; if we can buy it, it is fine,” Parikh said.

It was not clear if Facebook would open source FBOSS, but sources at the company say that the intent is to open source key components of the software. Other Linux-derived network operating systems, such as Wind River and Cumulus Linux, just to name two, could no doubt be adapted to run on the device.

About the author: Timothy Prickett Morgan

Editor in Chief, EnterpriseTech. Prickett Morgan brings 25 years of experience as a publisher, IT industry analyst, editor, and journalist for some of the world’s most widely read high-tech and business publications, including The Register, BusinessWeek, Midrange Computing, IT Jungle, Unigram, The Four Hundred, ComputerWire, Computer Business Review, Computer System News, and IBM Systems User.

2 Responses to Facebook Unveils Homegrown Ethernet Switch

  1. dave ginsburg

    Tim –

    Great article, and we see ‘server-switch’ architectures such as that described by FB to be critical to evolved pod deployments. With the vast numbers of cores within a rack, and increasing focus on virtualization and visibility, an architecture that lends itself to capabilities beyond classical L2/L3 switching is the future.

    Now, some of this may be handled by operating systems such as FBOSS, but some capabilities may require a fabric-centric approach that leverages a network hypervisor as opposed to an OS view of the world that is limited to a single TOR switch. When you look at network hypervisors, they begin to enable the rich automation and virtualization promised by SDN as well as hosting functionalities such as OpenStack-based orchestration or NFV.

    Dave Ginsburg

     
  2. Bollocks187`

    There is absolutely nothing innovative in this design nor the concept of multiple parts that make up a switch. Also to correct your article – this is NOT a “chassis” this is a pizza-box design.

     
