Larry Smarr: Building the Big Data Superhighway
Internet pioneer Larry Smarr once had a vision of bringing connected computers out of academia and into the consumer world. Today, he envisions a second virtual highway, one capable of delivering on the promise of big data by leveraging fiber optic networks to transmit data at speeds of 10 gigabits to 100 gigabits per second. The idea is similar to NSFnet, which became the backbone of the Internet in the 1980s. Smarr hopes this new network – the Pacific Research Platform – will, like NSFnet, be a template others will adopt.
Smarr, who in 2000 became founding director of the California Institute for Telecommunications and Information Technology (Calit2), a University of California San Diego/UC Irvine partnership, sat down recently with EnterpriseTech to discuss the Pacific Research Platform. Following is the interview:
EnterpriseTech: Could you please give us an update on where the Pacific Research Platform stands today, about a month after the New York Times article appeared?
Larry Smarr: The National Science Foundation officially awarded the grant a few days before the New York Times article and it has a start date of October 1st, 2015. It is a five-year grant, providing $1 million a year, which will be used to coordinate the development of the Pacific Research Platform. By NSF rules you can't spend the money directly on scientific research; that has to be supported under other peer-reviewed grants. This is largely a socio-technological grant, in that you've got to coordinate 20 university campuses, and their chief information officers, and the different ways they've all implemented big data campus-scale networks. But then you've also got to work with a dozen or so major scientific research teams, each of which is cross-institutional, which is why they were selected. They're already working together across campuses or across labs, but they're doing so on the commodity shared Internet. And so the real question here is, if we can move them up to a dedicated optical network – roughly 1,000 times faster than the shared Internet – how will that change the way that they do data-driven science?
EnterpriseTech: How technologically feasible is that, to move these scientific teams up to an optical network?
Smarr: There are only a few places in the United States that could do this for $1 million a year. You've got to remember, the money is not going to buy any optical fiber. We have to build the Pacific Research Platform on prior investments. The West Coast has a great advantage because member universities have invested for two decades in their nonprofit research and education network – the Corporation for Education Network Initiatives in California (CENIC), now one of the most advanced regional optical networks in the world. CENIC extends up to the University of Washington over Pacific Wave (PW), and recently CENIC/PW upgraded to a 100-gigabit optical fiber backbone, connecting the campus gateways of the West Coast research universities. In the Pacific Research Platform we're extending that west to the University of Hawaii and east to Chicago, to demonstrate it can scale to a national footprint. The important thing is that the National Science Foundation has spent the last three years reviewing proposals from campuses to build what are called Science DMZs, which are dedicated on-campus networks for big data science research, separate from the commodity Internet. So the way to think about it is we're at a very special moment in history. A project like the Pacific Research Platform couldn't have been done before this moment in time because we had to build on both the National Science Foundation investing in the on-campus infrastructure and the West Coast universities investing collectively in the wide area CENIC/PW infrastructure. What our Pacific Research Platform does is, essentially, tie those all together as one uniform big data freeway system.
EnterpriseTech: What could the results be, beyond speed?
Smarr: A metaphor I like to think about is another form of infrastructure, that of automobile and truck transportation. In the United States we had spent 50 years building up city streets and two-lane highways – Route 66, Highway 40 – connecting those cities, and that was our automobile and truck transportation system in 1955. At that point, President Eisenhower called for building the interstate highway system because, in addition to the city streets, a number of the metropolitan areas had created freeway systems – Los Angeles obviously comes to mind, but think of many of the other cities that had created these separate infrastructures away from the city streets. And then the interstate highway system connected those cities' freeway systems together and, as a result, you could drive from New York to San Francisco at high speeds without ever seeing a stoplight. Now that formed the basis of what was the next half-century of economic growth for the United States, because an entire set of new industries started that were only possible because of that interstate highway system, connecting the cities' freeways.
I don't know what the future's going to bring in detail, but this provides a historic analogy for creating a separate Internet infrastructure for a specialized reason – high-speed interconnection from inside one campus to inside another campus.
EnterpriseTech: Has the East Coast caught up?
Smarr: There are a number of regional optical networks in the East that connect many universities. But I think in addition you have to build on that physical infrastructure a collective human effort, as we have here on the West Coast with the Pacific Research Platform, between a set of campuses and the wide-area provider that connects those campuses. That's why I say it's a socio-technological effort, not just an engineering effort.
EnterpriseTech: What brought that socio-technological effort together on the West Coast, when it hasn't happened elsewhere?
Smarr: I believe that you will see similar projects to this within the next year or two all over the country, so I don't think California is so far ahead of everybody else, but it is the only state that has a university system like the University of California – 10 campuses of incredible diversity, all within one system, with many of them among the best research universities in the country. So it's easier to do the socio-technological piece where the core is a set of campuses that have a very long history of working with each other. However, the real secret to our early success is the existence of CENIC/PW. CENIC is a membership organization that includes not only the 10 UC campuses, but all the Cal State campuses, the three private campuses of Caltech, Stanford and USC, all the community colleges, and the 9,000 K-12 schools plus the libraries, all in one integrated regional optical network.
The other thing I want to mention is that this could not have happened without the pioneering work of the Department of Energy’s Energy Sciences Network (ESnet), whose director is Greg Bell. They're based out of Lawrence Berkeley Laboratory, but they have in many ways the most advanced network in the country. They developed this idea of the Science DMZs. They demonstrated that multiple DMZs at various DOE national labs and supercomputer centers could be tied together by a 100 gigabit per second backbone. This Department of Energy work served as an example that enabled the NSF to clone the Science DMZ architecture and fund over 100 US campuses to install local Science DMZs. Essentially, this was a massive technology transfer from the Department of Energy to the academic sector through the facilitation of the NSF. So what I did was use this DOE success as a model, where the analog of ESnet was CENIC/PW, the analog of the DOE labs was the university campuses, and the Pacific Research Platform supercomputer centers were the analog of the DOE supercomputer centers.
EnterpriseTech: It's fascinating to see how this all came together…
Smarr: There's a historical precedent you may not be familiar with. A similar technology transfer happened from 1983 to 1985, in which the NSF supercomputer centers were created. The proposal I submitted in 1983 was to clone the supercomputing environment at Lawrence Livermore National Laboratory, including its mass store and visualization capabilities, into what became the National Center for Supercomputing Applications at the University of Illinois Urbana-Champaign. Similarly, Sid Karin, who proposed the San Diego Supercomputer Center, did it by essentially cloning MFEnet, which is the magnetic fusion energy network at the Department of Energy. Then NSF interconnected the five new NSF centers with the NSF Network – NSFnet – adopting DARPA's Internet protocols for ARPAnet, and that became the backbone of what became known as the Internet. And so, here we are essentially 30 years later, doing another massive technology transfer from the Department of Energy into the academic sector.
EnterpriseTech: Amazing how history repeats itself and quite thrilling to hear the story first-hand. What made you recognize today's Internet was inadequate for big data?
Smarr: The evidence is everywhere that big data can't be supported with the current commodity Internet. Let me give you an example: You know the Internet is engineered to efficiently and interactively deal with megabyte-size objects. So think about when you take a photograph, which is a few megabytes, with your smartphone. You attach it to an email, text message, or social network post, and it's just gone, right? Scientific research doesn't get warm until it's in gigabytes, which is a thousand times larger – or terabytes, which is a million times larger – and you can't stuff 1,000-times larger objects through the Internet. If you tried that on a campus you'd be shut down for a presumed denial of service attack. Instead, people ship disk drives by FedEx, or if they want to get something across a campus they use sneakernet, carrying a disk drive or a thumb drive from one place to another. This is in spite of the fact that there is optical fiber in the ground and each optical fiber can actually support 50 to 100 separate channels, each of which carries 10 gigabits per second. And so really it was clear that you just needed a separate high performance big data cyberinfrastructure.
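Smarr's figures here lend themselves to a quick back-of-the-envelope check. The Python sketch below is illustrative only: it assumes decimal units (1 terabyte = 10^12 bytes), an idealized link with no protocol overhead, and a shared-Internet rate of 10 megabits per second, which is simply the 10-gigabit dedicated rate divided by the factor of 1,000 cited in the interview. The function and constant names are hypothetical.

```python
# Back-of-the-envelope transfer times; all rates are idealized line rates.
GBPS = 1e9        # bits per second in one gigabit per second
TERABYTE = 1e12   # bytes, decimal convention (assumption)


def transfer_seconds(size_bytes, rate_bps):
    """Seconds to move size_bytes at rate_bps, ignoring protocol overhead."""
    return size_bytes * 8 / rate_bps


def fiber_capacity_bps(channels, per_channel_bps=10 * GBPS):
    """Aggregate capacity of one fiber carrying `channels` parallel channels."""
    return channels * per_channel_bps


dedicated_bps = 10 * GBPS          # one 10-gigabit channel, per the interview
shared_bps = dedicated_bps / 1000  # assumption: shared path is 1,000x slower

dedicated_minutes = transfer_seconds(TERABYTE, dedicated_bps) / 60
shared_days = transfer_seconds(TERABYTE, shared_bps) / 86400

print(f"1 TB on a dedicated 10 Gbps channel: {dedicated_minutes:.1f} minutes")
print(f"1 TB on the assumed shared path: {shared_days:.1f} days")
print(f"One fiber, 50 to 100 channels: "
      f"{fiber_capacity_bps(50) / GBPS:.0f} to "
      f"{fiber_capacity_bps(100) / GBPS:.0f} Gbps")
```

Under these assumptions a terabyte moves in roughly 13 minutes on the dedicated channel but takes more than nine days on the shared path, which is why shipping a disk drive by FedEx can still win.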
As I say, 30 years ago, this is what universities realized about computing. The biggest computer in those days, in 1985, that most academic scientists had access to was the DEC VAX 780, which was a fine computer. I had use of one at the University of Illinois, but then I took my research group’s code that ran eight hours overnight on the VAX – which was computing fluid flow accreting onto a black hole – to the Cray-1 at the Max Planck Institute for Physics and Astrophysics in Germany and it ran 400 times faster. Now, 400 times faster, what does that mean? Well, eight hours times 60 minutes per hour is 480 minutes, so it means the program I ran overnight now ran in 1.2 minutes. Well, I can run dozens of those per day. So I could do two weeks' worth of VAX progress in my computational science in one day on the Cray. That kind of reasoning is what led the National Science Foundation to set up its National Supercomputer Centers.
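The speedup arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch, not anything from the original project: it assumes one overnight VAX run per day, so that "two weeks' worth of VAX progress" means 14 runs, and the function name is hypothetical.

```python
def accelerated_minutes(baseline_hours, speedup):
    """Run time in minutes after applying a speedup factor to a baseline in hours."""
    return baseline_hours * 60 / speedup


# Eight-hour overnight VAX job, 400x faster on the Cray-1 (figures from the interview).
cray_minutes = accelerated_minutes(baseline_hours=8, speedup=400)

# Assumption: one overnight VAX run per day, so two weeks of progress is 14 runs.
vax_runs_in_two_weeks = 14
cray_minutes_for_same_work = vax_runs_in_two_weeks * cray_minutes

print(f"One run on the Cray-1: {cray_minutes:.1f} minutes")
print(f"Fourteen runs on the Cray-1: {cray_minutes_for_same_work:.1f} minutes")
```

Fourteen runs need under 17 minutes of machine time, so two weeks of VAX progress fits comfortably into a single day on the Cray, with room left for the "dozens" of runs Smarr mentions.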
Now just because there were supercomputers available, researchers didn't get rid of their local computers. In fact, the first personal computers had recently come out in the early 1980s. Everyone today still has personal computers. They even have clusters on the campuses, but if they want to run big jobs they go to a separate infrastructure. Analogously, you'd continue to use the Internet for what it's good for, but if you've got something that's a big data job you need a separate infrastructure to run that on. We've been doing this for a long time. I was a principal investigator for the OptIPuter Project, which started in 2003, and was a multi-institutional grant. We showed that you could build out just such an optical wide-area network in which the speed between the distant campuses was as fast as or faster than the backplane of a cluster. An important implication is that it changes how you co-locate big data with big compute. Normally, you think this means you need to locate the big data on the disk drive of a cluster, but the data can't get to the processors any faster than the backplane that connects the parallel processors, and if your wide area network is even faster than the backplane, well, the big data on the optical network is already co-located with remote computers on that network. So it allows big data analysis to happen in a distributed fashion.
EnterpriseTech: Have some commercial companies used optical networks?
Smarr: Large, global companies like banks have been doing this for years: think about the use of optical networks that connect datacenters to enable banks to do fast trading. They have a private optical network that they use to shave milliseconds off of trades and so their financial data is globally connected by optical fibers. Large energy companies also use optical fibers like this. I think, in fact, it was partly looking at the fact the private sector could already do this that gave me the confidence we could do this across the academic sector. And the fact that global banks – with datacenters around the world – or Google or Facebook – both of which have datacenters around the world – are connected by private fiber optics is why I knew this could be done globally. You'll notice Australia’s Academic and Research Network (AARNet), within literally a week of the PRP being announced, is becoming an international partner with it. As part of PRP, the Netherlands was already a partner to show we could handle international distances and recently Japan and Korea have also joined.
EnterpriseTech: Yes, the financial industry is well known for doing everything it can – including physically moving its datacenters – to shave fractions of seconds off its speeds…
Smarr: Think how incredibly expensive it is to either put a submarine cable across the ocean or to lay fiber across the country. The banks will do that if it gives them a more direct link and can shave hundreds of milliseconds off of a trade. So that just shows there's enormous return on investment for the use of optical fibers. It's already been demonstrated in the private sector.
EnterpriseTech: Do you see it becoming less restrictive in cost so it becomes more accessible to other enterprises and industries?
Smarr: Well, the good news is there's a vast amount of fiber in the ground in the United States already and that each fiber can be subdivided into channels, each of which can be leased. Most enterprises don't have to worry about having to put in optical fiber. They would work with a third-party company like Level 3 to lease the paths between the enterprise sites and link them that way. That's a pretty competitive market right now, for the fiber optics and leasing.
EnterpriseTech: What are the biggest challenges you face right now with Pacific Research Platform?
Smarr: The Pacific Research Platform is an ambitious, large-scale project, so I think project management is the most challenging part. We've got an excellent team between UC Berkeley’s Center for Information Technology Research in the Interest of Society (CITRIS), which is a co-organizer, and Calit2, which is the lead organization. The thing, again, that makes this all possible is, essentially, everything has been pre-organized. All of these universities meet together through CENIC, the regional optical network membership organization. The University of California, as a core set of ten campuses for the Pacific Research Platform, has all ten of its chief information officers plus the Office of the President as members of the UC Information Technology Leadership Council, and they meet regularly. The science teams, for instance the astronomers that are doing the telescope surveys, are already working together. Team members were selected because they were scientists already collaborating across campuses, so they were pre-organized. It's fortuitous there were so many components of this that were already in existence and people were used to working with each other, so it's more of a coordination effort, I guess, than anything else.
EnterpriseTech: How about technologically? Are there any technologies you need to create or seek out?
Smarr: One of the innovations we've created is called Flash I/O Network Appliances (FIONA) to answer the question: how do you effectively terminate a 10-gigabit/second optical fiber? Say I bring a 10-gigabit/sec optical fiber into your lab and say, 'Well, here you go.' The good news is this fiber will bring 1,000 times as much data per second into your lab as you were getting with the shared Internet. That's the good news. The bad news is you've got 1,000 times as much data coming into your lab; where are you going to put it? And what are we going to plug this optical fiber fire hose into? And so what we did is realize you could just take a normal PC and re-engineer it, optimized for big data, and then put 10 or 40 gigabit per second network interface cards on it and now you have an inexpensive termination device for the fiber optic, just the way your smartphone is designed to terminate the wireless Internet or your PC is engineered to terminate the shared Internet. We are now deploying those across all the Pacific Research Platform campuses. That makes the integration of the networking a lot easier because you've got a more uniform endpoint.
EnterpriseTech: Sounds like you got that one nicely figured out. Are there other technology issues to address?
Smarr: Going forward there are several. First and foremost, just because you can do a ping (meaning that you've got actual connectivity between, say, a point inside one campus, across the campus network, across the CENIC backbone, and then into another campus, across the campus, into their building) doesn't mean you get the full bandwidth. The first time you try to send a large file, you'll find that instead of 10 gigabits per second, maybe you're getting 100 megabits per second. There are all kinds of tuning required. The network engineers use perfSONAR to measure the flow of bits and then decode where the bottlenecks are – firewalls or other kinds of barriers that are in the way. We have tremendous network engineers on this project and they're all working together, which is the major reason the Pacific Research Platform is feasible – the collective wizardry of some of the best network engineers in the country. Second, in later stages of the Pacific Research Platform we will move to Software Defined Networks and innovative security technologies.
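The gap Smarr describes, 100 megabits per second on a path provisioned for 10 gigabits per second, can be expressed as a simple utilization check. The Python sketch below is only an illustration of that arithmetic: it does not use perfSONAR or any real measurement tool, the transfer size and duration are made-up sample values, and the function names and the 50 percent tuning threshold are hypothetical.

```python
def throughput_bps(bytes_moved, elapsed_seconds):
    """Average throughput in bits per second for a completed transfer."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_moved * 8 / elapsed_seconds


def utilization(measured_bps, provisioned_bps):
    """Fraction of the provisioned rate that the transfer actually achieved."""
    return measured_bps / provisioned_bps


def needs_tuning(measured_bps, provisioned_bps, threshold=0.5):
    """Flag a path whose utilization falls below an arbitrary threshold (assumption)."""
    return utilization(measured_bps, provisioned_bps) < threshold


provisioned = 10e9  # 10 gigabits per second, per the interview

# Made-up sample: 1.25 GB moved in 100 seconds works out to 100 megabits per second.
measured = throughput_bps(bytes_moved=1.25e9, elapsed_seconds=100)

print(f"Measured: {measured / 1e6:.0f} Mbps")
print(f"Utilization: {utilization(measured, provisioned):.0%}")
print(f"Needs tuning: {needs_tuning(measured, provisioned)}")
```

In this sample the path delivers about 1 percent of its provisioned rate, the kind of shortfall that sends engineers looking for firewalls or other bottlenecks along the route.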
EnterpriseTech: Are there other partners you hope to attract or regions of the world you hope to enter or expand your presence in?
Smarr: The NSF is funding us, not just for the West Coast, but to be a sub-national model. Hopefully, over the next few years there'll be other organizations like PRP and out of that – maybe five years from now – you'll have a national scale version of this, connecting together all the big data networks.
EnterpriseTech: That's a realistic timeframe?
Smarr: Yes. Remember, this is really a national competitiveness issue. President Obama issued an executive order creating the National Strategic Computing Initiative in July 2015, largely focused on accelerating the move toward the exascale supercomputer. But I think that you can see this kind of big data infrastructure connecting all of our big data researchers across the country with all the scientific instruments, computing, and storage as an equally important national competitiveness initiative. I think there'll be steady progress toward a ubiquitous version of the Pacific Research Platform across the United States – and connected globally, with our major university research partners and places like the Large Hadron Collider (LHC). They're a good example. They've got 20,000 or so physicists all over the world, getting data from the LHC in Geneva, and that community has essentially built such a global optical fiber network to support their data analysis. In fact, we're using that successful example as one of the main drivers for the PRP.
EnterpriseTech: So you're creating a best practices template yourself, too?
Smarr: That's exactly what the NSF had in mind. If you go back and look at their call for proposals you'll see they explicitly want one of the outputs of this funding to be a model for a national-scale version.