Advanced Computing in the Age of AI | Friday, March 29, 2024

Microsoft’s Own Five-Year Move To The Azure Cloud

Just because you have built one of the world's largest clouds does not necessarily mean that you can easily move your own applications to it. Such is the case with Microsoft's own IT department, which is taking measured steps in deploying the more than 1,300 line-of-business applications that run the $87 billion software and services behemoth onto its own private clouds or the Azure public cloud.

Microsoft is older than the Windows Server platform it has been selling for two decades, and because of this the company already had back-office systems for processing transactions, running its software packaging factories (remember when software came in boxes with floppy disks and then CDs?), and doing what has become known as analytics. For many years, Microsoft ran many of these back-office systems on IBM AS/400 minicomputers, which had a slew of applications appropriate for its manufacturing and distribution operations, and Windows Server had been on the market for many years before the company was able to unplug those AS/400s and move to its own Windows-based ERP systems.

It is the nature of businesses – particularly large, multinational corporations – to be careful about making transitions in their IT infrastructure, and that is why Microsoft is not trying to move all of its internal applications to the Azure cloud in one fell swoop. The easiest thing to do on the public cloud is create and run new applications; porting existing applications that run on bare metal servers to the cloud involves some thought, testing, and tuning. Such applications may not run well in the cloud because of inefficiencies in virtualization, differences in server and storage scalability and performance, and networking issues.

Microsoft's IT operations are nowhere near the scale of its Azure cloud, which has well over 1 million servers and quite possibly as many as 2 million. (Microsoft has hinted that it has more systems than Amazon Web Services or Google, but has not been precise on the number.) But MSIT, as the operation that runs Microsoft itself is known internally, has plenty of iron and systems. Here is what it looked like back in May when EnterpriseTech was at a pre-briefing on Microsoft's many Azure announcements:

microsoft-it-one

Jon Ormond, director of networks for MSIT, walked us through the feeds and speeds of the internal IT operations, which had over 40,000 operating system instances on an undisclosed number of servers. (This was the operating system count prior to the acquisition of Nokia, which happened at the end of April.) Back in May, when Microsoft was talking privately about its internal IT operations and explaining how it was experimenting with its ExpressRoute service to link its own datacenters into the Azure public cloud, the company had nine datacenters for all this gear, but it has subsequently reduced that to seven facilities.

About 65 percent of those server operating system instances were virtualized, according to Ormond, and they run more than 1,300 line-of-business applications, which support over 180,000 users (including Microsoft employees and some partners) linking in from 513 offices in 113 countries around the globe. Most of these, says Ormond, are homegrown applications for expense reporting, human resources management, supply chain management, and so forth. Microsoft is a SAP shop and has other third-party applications as well. (There is an application that tracks soda consumption in every Microsoft facility.) Microsoft's goal over the next five years is to get 80 percent of those 1,300 applications deployed into the Azure cloud and to get 95 percent of those server operating system images moved over to Azure or to private internal clouds running the Windows Azure Pack on top of Windows Server.
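To put those percentages into absolute terms, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above, and because Microsoft said "over 40,000" instances and "more than 1,300" applications, the results should be read as approximate lower bounds rather than as numbers Microsoft has published. The function and variable names are ours.

```python
# Figures quoted by MSIT; "over" and "more than" make them lower bounds.
OS_INSTANCES = 40_000   # server operating system instances
APPLICATIONS = 1_300    # line-of-business applications

def share(total: int, percent: int) -> int:
    """Return the given whole-number percentage of total, using integer math."""
    return total * percent // 100

virtualized_now = share(OS_INSTANCES, 65)    # instances already virtualized
apps_to_azure = share(APPLICATIONS, 80)      # applications targeted for Azure
instances_to_cloud = share(OS_INSTANCES, 95) # instances targeted for Azure or private cloud

print(f"Virtualized today:          {virtualized_now:,} instances")
print(f"Applications bound for Azure: {apps_to_azure:,} of {APPLICATIONS:,}")
print(f"Instances bound for cloud:  {instances_to_cloud:,} of {OS_INSTANCES:,}")
print(f"Instances left outside:     {OS_INSTANCES - instances_to_cloud:,}")
```

On these assumptions, roughly 26,000 instances are already virtualized, about 1,040 applications are headed for Azure, and around 38,000 instances are meant to land on Azure or a private cloud, leaving about 2,000 outside either.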

"As an enterprise, we are hoping for convergence between these two products," said Ormond, referring to Windows Azure Pack for on-premises clouds and the Azure public cloud. "And every month we see them coming closer and closer together. We are a highly virtualized environment now, but it doesn’t have some of the essential characteristics of a cloud, such as elastic scale and on-demand self-service. We do this manually."

Here is what the MSIT network looks like. And remember, this is the internal network for running Microsoft's business operations, not the networks for gluing together all the bits of Azure that support the public cloud, Office 365, Bing, Xbox 360, and other services that get lumped into the "cloud" category by Microsoft. Microsoft has 7,241 switches and 2,302 routers, which deliver 22 Gb/sec of sustained traffic coming in from and going out to the Internet. This is just to run Microsoft.

microsoft-it-two

Ormond said that there was a "bow wave" of servers that would be coming to the end of their lives in the next five years, and that replacing them as-is with their workloads left in place would cost on the order of $200 million and would fill the coffers of Dell, Hewlett-Packard, and the other companies that Microsoft buys servers from. Ormond estimated that if the targets to move to the cloud could be hit over the next five years, Microsoft could eliminate about $150 million in that capital expense for servers.

"We will still have the associated OpEx, I still have to pay Azure, but I don’t have to spend that CapEx, which is precious to me," Ormond explained. "The OpEx I can vary and I can budget for. I am not stuck with it. The reason we have this $200 million expense is because these are physical servers in datacenters that I have to care for and feed and patch. We are hoping that it will reduce my OpEx as well, but we have not been doing this long enough to see if that bears fruit."

Microsoft provided a little more insight into its plan to move its internal IT operations to the cloud in a post on its IT Pro blog this week and, unbeknownst to EnterpriseTech, actually put out a detailed case study back in May talking about the move and its economic parameters.

Microsoft knows that it wants to keep its Active Directory, Domain Name Server, Windows Server Update Services, and System Center Configuration Manager machines, which manage access to its applications, how portions of the Microsoft network link together, how infrastructure is updated, and how internal machines are updated. Microsoft's IT department also plans for any applications that contain financial information or personal data to be the last to move to the Azure cloud, giving the IT staff more time to assess how Azure stands up to the needs of Microsoft itself. Here is what the prospective transition from internal machines to a mix of public and private cloud looks like as it is currently scheduled:

microsoft-it-three

To make the move to the cloud, Microsoft has set up three distinct organizations. The Service and Deployment Operations (SDO) team builds the tools to help migrate MSIT workloads to Azure, makes sure the Azure infrastructure can handle the code and the load, and does the migration. The Project Stratus team works with business line managers to translate bizspeak into techspeak, and the First and Best Team identifies how to use internally the Microsoft technologies that are sold to other people. The company has developed a toolkit called Microsoft Assessment and Planning (MAP) to automate the inventory of application components and the sizing of virtual machines, and it is using FactFinder Enterprise by BlueStripe Software to see all of the system and network dependencies of an application so it can be moved. Microsoft expects to use a mix of lift-and-shift and new application development to get its applications onto Azure or internal Azure private cloud clones.

About 60 percent of the 37,000 legacy server and virtual machine environments in the Microsoft IT department are going to need to be upgraded before the end of fiscal 2018, four years from now, and the remaining 40 percent of the OS instances are running on gear or hypervisors that are current enough to remain in place.

Here is the interesting bit of math that Microsoft has done. If all it does with those 37,000 legacy instances is on-site optimization as it installs new hardware, it will save about $75 million in capital expenses. If it can move 6,000 of the roughly 20,000 end-of-life OS instances to Azure in four years, then it saves $35 million in capital expenses. It has to do better than that, then. With 14,000 OS instances moving to Azure, Microsoft reckons it can save $82 million in CapEx, and that is at least better than just building new infrastructure and compressing workloads on it in MSIT datacenters. The ideal Azure adoption scenario is to move all 20,000 OS instances to Azure in four years, in which case MSIT can avoid about $118 million in capital spending. And it will cut down on depreciation, too.
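The scenarios above can be checked with a few lines of Python. The sketch below takes the quoted instance counts and CapEx figures as given, works out the capital avoided per migrated OS instance, and then estimates how many instances would have to move before Azure beats the $75 million on-site optimization option. That break-even estimate rests on our own assumption that savings grow linearly between the quoted scenarios; Microsoft did not state this, and the function names are ours.

```python
# Quoted scenarios: (OS instances moved to Azure, CapEx avoided in $ millions).
AZURE_SCENARIOS = [(6_000, 35), (14_000, 82), (20_000, 118)]
ONSITE_OPTIMIZATION_SAVINGS = 75  # $ millions, with no move to Azure

def capex_per_instance(instances: int, savings_musd: float) -> float:
    """Return CapEx avoided per migrated OS instance, in dollars."""
    return savings_musd * 1_000_000 / instances

def breakeven_instances(scenarios, target_musd: float) -> float:
    """Interpolate the instance count at which savings reach target_musd.

    Assumes savings grow linearly between adjacent quoted scenarios,
    which is an illustrative assumption and not a Microsoft figure.
    """
    for (n0, s0), (n1, s1) in zip(scenarios, scenarios[1:]):
        if s0 <= target_musd <= s1:
            return n0 + (target_musd - s0) * (n1 - n0) / (s1 - s0)
    raise ValueError("target is outside the range of the quoted scenarios")

if __name__ == "__main__":
    for count, savings in AZURE_SCENARIOS:
        each = capex_per_instance(count, savings)
        print(f"{count:>6,} instances: ${savings}M avoided, about ${each:,.0f} each")
    needed = breakeven_instances(AZURE_SCENARIOS, ONSITE_OPTIMIZATION_SAVINGS)
    print(f"Break-even with on-site optimization: about {needed:,.0f} instances")
```

The per-instance figure comes out between roughly $5,800 and $5,900 in all three scenarios, so the quoted savings scale almost linearly with the number of instances moved. Under the linear assumption, Microsoft would need to move on the order of 12,800 instances to Azure before the CapEx avoided matches what on-site optimization alone delivers.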

The missing piece of the equation is whether Azure will cost less than MSIT to deliver the virtual machines in which applications run and the network connections that link applications and systems software together and to the end users. Microsoft did not share those projections, but you can bet that the goal here is for MSIT to eventually find itself largely out of the infrastructure and platform business and to focus on applications and service levels.

EnterpriseAI