Advanced Computing in the Age of AI | Thursday, March 28, 2024

ClearDB Accelerates Databases By 10X With Project Halo 

Like many companies, ClearDB has tried a bunch of things to get itself going. Having found some success as a provider of tools that make MySQL databases redundant, more secure, and ready for cloud deployment, the company is putting the finishing touches on a new set of tools, code-named Project Halo, that will be able to accelerate the performance of any modern database management system on any platform.

This could be a breakout product if it performs as Cashton Coleman, the founder and CEO of ClearDB, claims it will. Coleman is not giving out a lot of details, but he did tell EnterpriseTech a bit about Project Halo, which will be commercialized as ClearDB Iron.

ClearDB, formerly known as SuccessBricks, was founded in December 2010 with its own funding and has not, as yet, needed venture funding. Coleman has a broad career in IT, having built supercomputers for clients as well as working as a system guru for Feedster, Zend Technologies, and a number of other startups. He was also a software engineer and then chief architect at Salon.com, and has lots of experience hacking databases.

Coleman says that his company, which is now located outside of Dallas, tried a bunch of different products and services relating to the MySQL database before coming up with the ClearDB extensions that the company sells today and that it has taken as its corporate identity. These tools to create a database as a service launched in early 2011, and they come in two flavors. The dedicated clusters are designed to support one customer running a MySQL cluster on Amazon Web Services EC2, Windows Azure, Rackspace Cloud, Linode, or GoGrid, while a multitenant version for supplying a shared database service is available only on EC2 and Azure. One important and differentiated feature that ClearDB has is geographically distributed replication – something that Amazon is just rolling out on its cloud – and it can do it for both reads and writes across database clusters that are widely separated. Another is that it makes sure that MySQL master and replica nodes are physically right next to each other inside of the cloud databases, lowering latencies.

Customers can also license the ClearDB software and use it on-premises in a private cloud, but thus far very few customers have done that.

The first big customer that ClearDB lined up was none other than Microsoft, which uses ClearDB on its Azure cloud to support those applications that need MySQL instead of its own SQL Server database. Coleman can't say how many databases are installed on Azure using ClearDB. But what he can say is that across all of the database services that it underpins on AWS, Azure, and Salesforce.com's Heroku as well as through partnerships with AppFog (now part of telco CenturyLink), Pivotal (the big data spinout of VMware), and CloudBees, there are a total of 66,000 MySQL databases running inside of ClearDB and that 20,000 of them are deployed on AWS EC2. In aggregate, these databases process 514 billion transactions per day, says Coleman.

Project Halo is a different sort of animal, but is related. After dancing around with some vague descriptions of what ClearDB Iron might be, Coleman confirmed our suspicion that it is a software abstraction layer that will sit between databases and the operating systems and storage that they ride upon.

"ClearDB Iron takes what we have with our third generation clusters and lifts the I/O throughput watermark to the total I/O capability of the virtual server instance or the physical server to their limits," explains Coleman. "In environments like Amazon EC2, your best Elastic Block Storage implementation is going to get you around 32 MB/sec. That's not really that great. With SSDs, EC2 instances will get you maybe 450 MB/sec. Still not all that great. With our internal storage virtualization technology, we can raise that up to the literal throughput max on the system. On lower-grade systems on EC2, that is around 1.6 GB/sec and on higher-end systems in EC2, we can typically do 3.2 GB/sec. You can take this all the way up to an IBM System z mainframe and hit the 384 GB/sec throughput marker. The software that we have built is really extraordinary. Players are testing it in the market right now, including Microsoft and IBM."

Unlike the regular ClearDB product, the ClearDB Iron tool is not dependent on MySQL. In fact, any modern database, including relational and NoSQL databases, can be accelerated with ClearDB Iron, and it can run on any reasonable operating system. (It is not clear what Project Halo is coded in, but given that it is all about performance, it is very likely C or C++ and not Java.) In general, Coleman says that ClearDB Iron will increase the throughput of databases running on cloudy infrastructure by a factor of ten, provided that the system is I/O constrained and has enough CPU to push that. But he is not going to be precise about how.
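
For a rough sense of how the throughput figures Coleman quotes compare with the tenfold claim, the ratios can be worked out directly. The short Python sketch below is illustrative only: the dictionary and function names are made up for this example, and it assumes 1 GB/sec equals 1,000 MB/sec, a convention the article does not specify.

```python
# Throughput figures quoted by Coleman, in MB/sec.
# Assumption: 1 GB/sec is treated as 1,000 MB/sec.
BASELINES = {"EBS": 32, "SSD": 450}
IRON = {"lower-grade EC2": 1600, "higher-end EC2": 3200}


def speedups(baselines, accelerated):
    """Return {(accelerated_name, baseline_name): ratio} for every pairing."""
    return {
        (a_name, b_name): a_rate / b_rate
        for a_name, a_rate in accelerated.items()
        for b_name, b_rate in baselines.items()
    }


if __name__ == "__main__":
    for (a_name, b_name), ratio in speedups(BASELINES, IRON).items():
        print(f"{a_name} vs {b_name}: {ratio:.1f}x")
```

On the quoted numbers, the ratios run from roughly 3.6x (lower-grade EC2 against SSD-backed instances) to 100x (higher-end EC2 against Elastic Block Storage), so the factor of ten sits inside that range but is not something these figures establish on their own.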

"All I can tell you is that it is very quick and that it uses an interesting tiered storage approach," says Coleman with a laugh. "We use a very interesting methodology when it comes to storage semantics. It is not something that people have thought of, and it is definitely new. I have been dealing with high-speed storage for over 20 years, and this is definitely not something that folks try because they don't necessarily think it will work. But this definitely works, and our core systems run on this technology today and we have a couple of big customers using it in-house, too. Microsoft just completed its tests, and they concur with what we say about performance increases."

ClearDB has been a cloud play, and it looks like ClearDB Iron will get most of its early action there, too.

"Our initial idea was to go for the infrastructure cloud players because they are the ones looking for the database option and they are going to need to boost the performance the most. They can't seem to capture the customers that are looking for that 'real iron' experience. Customers who have databases on premises today, every time they go to the cloud, they are disappointed because they are not getting the price/performance they expected."

That said, ClearDB Iron will probably end up being deployed on premises in virtualized environments if it is as good as Coleman claims, and will likely also see action in hybrid public-private cloud setups where databases are involved.

ClearDB Iron is scheduled to launch sometime in the second quarter.
