Advanced Computing in the Age of AI | Thursday, March 28, 2024

Mesosphere To Bring Google-Style Cluster Management Mainstream 

Enterprises that want to run their clusters the way that Google does will soon have that ability now that Mesos, an open source cluster controller, is getting a commercial backer that will provide enterprise-grade support for the tool.

Mesos was inspired by the system management programs at Google, and Mesosphere, the company that will be standing behind Mesos, providing packaging and support, says that any hyperscale datacenter operator and any Global 2000 customer with multiple clusters and mixed workloads is going to be interested in giving Mesos a whirl.

Google creates, others emulate, everyone benefits. That seems to be one of the laws of this era of hyperscale computing. Google was first among its peers to design servers and datacenters tuned to its workloads and now most of the hyperscale and cloud players do this. Google created the MapReduce method of chewing on unstructured data and this eventually spawned Hadoop at Yahoo. The examples go on and on. Given its broad applicability to cluster and application management, the ideas in Google's Borg system management tool and its follow-on, called Omega, that are embodied in Mesos may have a more profound effect on enterprise datacenters than these other Google-inspired technologies.

Mesos is an open source project that has found a home at the Apache Foundation after getting its start at the University of California at Berkeley a few years back. One set of Berkeley researchers was trying to make software run better on multicore processors and other researchers were trying to make software run more efficiently across distributed systems. The basic idea that they came up with is to run multiple jobs at the same time across a cluster rather than dedicate a single cluster to each job – to, in effect, treat the datacenter like one big computer instead of a collection of systems. The idea is akin to server virtualization where you put many different jobs and sometimes whole copies of operating systems on a single physical machine. In both cases, you can drive up the utilization of the machinery – and reduce the number of server nodes – by sharing a cluster across workloads instead of partitioning clusters for specific workloads.

Mesos got its start back in 2009 at the AMPLab at Berkeley, which is partially funded by Google and which maintains close ties to the search engine giant. The Mesos software was open source and when Twitter was looking to do a better job of managing its infrastructure, the company, which had a bunch of ex-Googlers on board who had used Google's Borg system management tool, jumped into the Mesos project and started figuring out how to put Mesos into production. In 2012, Airbnb put Mesos underneath some of its analytics clusters and that was about the same time that Mesos became an Apache project. After a couple of years in development and adoption into production at eBay, PayPal, Groupon, Netflix, OpenTable, HubSpot, Salesforce.com, Vimeo, Conviva, and Best Buy, Mesosphere is putting together a war chest to take Mesos mainstream.

Mesosphere is located in San Francisco and was founded by Tobi Knaup, a software engineer at Airbnb who created its search and fraud detection systems and the Marathon framework for Mesos, and Florian Leibert, who built the search and analytics systems at Twitter and then moved to Airbnb where he, among other things, created the Chronos fault-tolerant job dependency scheduler that runs atop Mesos. Leibert is CEO. Matt Trifiro, who was most recently chief marketing officer at Heroku, has also joined the team as senior vice president in charge of marketing. Ben Whitehead, who built distributed systems at Goldman Sachs, has been tapped to work on backend tools for Mesos. Tim Chen, who is a committer on the Apache Drill ad-hoc query system that complements MapReduce batch processing on Hadoop, has also joined up.

Today, Mesosphere is announcing that it has secured a Series A funding round from Andreessen Horowitz for $10 million, with an additional $500,000 in funding from DataCollective and Fuel Capital. Mesosphere had previously raised $2.25 million in seed funding from Andreessen Horowitz, Kleiner Perkins, Foundation Capital, and SV Angel. (The SV Angel referred to here is none other than Brad Silverberg, a former Microsoft executive who was in charge of Windows and Office several years back.) Leibert tells EnterpriseTech that including the seed funding, the Series A funding, and investments by Twitter and Airbnb, more than $20 million has been invested in Mesos thus far.

The idea behind Mesos is to create an aggregation layer that spans server capacity in local and remote datacenters. This aggregation does not tightly bind the servers into a shared memory system, but rather treats them like a giant resource pool onto which applications can be deployed. It is a bit like the inverse of server virtualization, conceptually:

[Image: mesos-cloud-era]

Don't get the wrong idea from that image above. Mesos doesn't run one big distributed app across servers in a cluster, but rather interleaves the workloads on nodes in the cluster to maximize the utilization of compute, memory, and I/O resources on the cluster. Mesos can run on bare metal clusters or on machines that have some level of virtualization on them to provide some isolation between workloads running on the same machine. Google did a lot of the original work to create control groups, or cgroups, a kind of lightweight application container for Linux, which is the basis for the new LXC containers that are part of the latest updates to the commercial Linuxes from Red Hat, SUSE Linux, and Canonical. (You can read all about how Google containerizes all workloads on its vast fleet of servers here.) Mesos knows how to hook into these cgroups containers and it also has plug-ins for the up-and-coming Docker container and application packaging system that could become more popular than cgroups and LXC containers at a lot of enterprise shops. The tool also supports the KVM hypervisor championed by Red Hat, Canonical, OpenStack, IBM, and others.

Because applications are either containerized or, if they are running on bare metal, dedicated to specific nodes, multiple workloads – perhaps an application framework for a Web application, a memcached cache for that application, and Hadoop batch analytics – can all be run simultaneously on the same cluster, using what Mesos calls elastic sharing instead of static partitioning of nodes in the cluster:

[Image: mesos-elastic-sharing]
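A toy calculation makes the contrast between static partitioning and elastic sharing concrete. The sketch below is not Mesos code: the workload names, the four time slots, and the node counts are all invented for illustration. It only compares how many nodes are needed when each workload gets a partition sized for its own peak versus when all workloads draw from one pool sized for the peak of their combined demand.

```python
# Hypothetical node demand per time slot for three workloads.
# All figures are made up for illustration; none come from Mesos or its users.
demand = {
    "web_app":   [8, 8, 6, 2],
    "memcached": [2, 2, 2, 1],
    "hadoop":    [1, 1, 4, 9],
}

def static_partition_nodes(demand):
    """Static partitioning: each workload owns a partition sized for its own peak."""
    return sum(max(series) for series in demand.values())

def elastic_sharing_nodes(demand):
    """Elastic sharing: one pool sized for the peak of the combined demand."""
    combined = [sum(slot) for slot in zip(*demand.values())]
    return max(combined)

static_nodes = static_partition_nodes(demand)
shared_nodes = elastic_sharing_nodes(demand)
print(f"static partitioning: {static_nodes} nodes")
print(f"elastic sharing:     {shared_nodes} nodes")
```

With these invented numbers the partitioned layout needs 19 nodes and the shared pool needs 12, because the Web tier and the batch job peak at different times. Real savings depend entirely on how much the workloads' peaks overlap.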

Airbnb and HubSpot run their applications entirely on the Amazon Web Services public cloud, and Mesos knows how to manage the virtualized compute, storage, and I/O of the AWS cloud. To illustrate the difference between using Mesos and not using it, Airbnb has a cluster on AWS with about 4,000 cores that runs Hadoop, Storm, Cassandra, Kafka, Presto, and a number of other analytics tools. This cluster has one-quarter of a person's time allocated to administration. Another 4,000-core cluster on Amazon's cloud, which runs a different set of applications and does not use Mesos to provision and autoscale the applications atop the frameworks, has between seven and eight people running it.
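The staffing gap in that comparison is easier to appreciate as a ratio. The short calculation below uses only the figures quoted above (4,000 cores, one-quarter of a person, seven to eight people); the variable and function names are illustrative.

```python
CORES = 4000            # size of each cluster, as quoted above
MESOS_ADMINS = 0.25     # one-quarter of a person's time
MANUAL_ADMINS = (7, 8)  # "between seven and eight people"

def cores_per_admin(cores, admins):
    """Cores managed per full-time administrator."""
    return cores / admins

with_mesos = cores_per_admin(CORES, MESOS_ADMINS)
without_mesos = [cores_per_admin(CORES, n) for n in MANUAL_ADMINS]
staffing_ratio = [n / MESOS_ADMINS for n in MANUAL_ADMINS]

print(f"with Mesos:    {with_mesos:,.0f} cores per admin")
print(f"without Mesos: {without_mesos[1]:,.0f} to {without_mesos[0]:,.0f} cores per admin")
print(f"staffing gap:  {staffing_ratio[0]:.0f}x to {staffing_ratio[1]:.0f}x")
```

On those numbers, the Mesos-managed cluster works out to 16,000 cores per administrator against roughly 500 to 570 for the other cluster, a gap of about 28 to 32 times. The two clusters run different applications, so this is an indication rather than a controlled comparison.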

Before it went public in November 2013, Twitter talked a little more about its infrastructure and said that it had more than 50,000 cores in production atop Mesos. By shifting to Mesos to manage the provisioning of applications on the cluster, Twitter was able to free up about 30 percent of its servers so they could do other work (this at a time when its workloads were doubling each year). The time it took to deploy new applications at Twitter went from somewhere between eight and ten weeks (including all of the manual partitioning of clusters and provisioning of software stacks) to minutes.

Mesos is designed to scale up across tens of thousands of nodes and Leibert says it can do 40,000 to 50,000 nodes "easily." Speed matters, too. It can fire up from 5 to 1,000 instances almost instantly and it is very clever about how it provisions nodes. For instance, when you want to deploy Hadoop workloads across the cluster, Mesos only spins up TaskTracker nodes as the Hadoop job runs; when the batch job is finished, the TaskTrackers are nuked, putting the server capacity back in the pool for other uses.
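The TaskTracker behavior described above amounts to borrowing capacity from a shared pool for the life of a job and handing it back afterward. The following is a minimal sketch of that bookkeeping, not the Mesos API: the `ResourcePool` class, its methods, the job names, and the slot counts are all hypothetical.

```python
class ResourcePool:
    """Hypothetical pool of interchangeable node slots (not a Mesos API)."""

    def __init__(self, total_slots):
        self.total_slots = total_slots
        self.allocations = {}  # job name -> slots currently held

    @property
    def free_slots(self):
        return self.total_slots - sum(self.allocations.values())

    def launch(self, job, slots):
        """Grant slots to a job only if the pool can cover the request."""
        if slots > self.free_slots:
            raise RuntimeError(f"not enough free slots for {job}")
        self.allocations[job] = self.allocations.get(job, 0) + slots

    def finish(self, job):
        """Tear down a job's workers and return its slots to the pool."""
        return self.allocations.pop(job, 0)


pool = ResourcePool(total_slots=100)
pool.launch("web_app", 30)        # long-running service keeps its slots
pool.launch("hadoop_batch", 60)   # workers exist only while the job runs
free_during = pool.free_slots
pool.finish("hadoop_batch")       # batch job done, capacity is returned
free_after = pool.free_slots
print(free_during, free_after)
```

While the batch job runs, only 10 of the 100 slots are free; once it finishes, 70 are available again for whatever needs them next.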

Mesos has APIs for C++, Python, Go, and Java applications and allows for CPU, memory, and I/O isolation methods other than cgroups to be plugged into the architecture. The interesting bit is when an application framework is married to Mesos to allow for automatic scaling. Here are the frameworks currently supported by Mesos:

[Image: mesos-frameworks]

Mesosphere created Marathon, a framework scheduler (a wrapper if you will) for Linux applications that turns the cluster into a resilient platform cloud. This wrapper is what gives those Linux applications the resilience, scalability, and fault tolerance that Mesos delivers. Twitter created Aurora, which is a service scheduler for long-running jobs that makes use of the scalability and fault tolerance inherent in Mesos. Interestingly, Mesos supports Cray's Chapel parallel programming environment as well as the Message Passing Interface (MPI) protocol commonly used on parallel supercomputers so nodes in a cluster can share the results of their calculations during simulations. (While Torque is shown in the table above, it is currently being worked on and is not yet ready.)

Supporting a wide variety of platform, analytics, batch scheduling, and data storage layers is going to be key to the widespread adoption of Mesos. And so is the demand for a universal tool that can manage all of these disparate workloads side-by-side on a single cluster.

"All of the pressure is now on IT operations," explains Trifiro. "A lot of times datacenters are out of floor space or power, and even if they do not have a large datacenter, they have operational complexity. Sometimes, they have eight to ten full-time people managing a hundred nodes. And then there is developer pressure, as more and more companies are delivering competitive advantage through software, and they want to be able to deploy quickly and do continuous integration and delivery. All of these things are putting immense pressure on the IT operations team, all of this separate from an order of magnitude change in scale as the business changes."

Trifiro says that while Mesos is all about managing at scale, even if he had a baby cluster with only ten nodes, he would install Mesos on it to drive up utilization and to allow it to do a bunch of different kinds of work at once.

[Image: mesos-roadmap]

Leibert says that so-called "lighthouse customers," in addition to the Internet giants mentioned above, are putting the commercial-grade Mesosphere distribution of Mesos through its paces now. (These include one unnamed Fortune 10 customer, and no, Leibert would not tell EnterpriseTech the industry the player is in so we could identify it.) Mesosphere is now maintaining distributions of Mesos for all of the major Linux distributions. The company is not yet announcing the Mesosphere distribution, which will come later this year. There will be free editions of the product as well as commercial support and extended products. Pricing has not yet been set.
