Advanced Computing in the Age of AI | Friday, March 29, 2024

Critical Apps In The Cloud And High Availability 

Enterprise data centers have been using cloud computing for several years for their low-risk environments, such as test-dev and non-mission-critical applications. In this context, the benefits of cloud computing are well understood. IT managers can use the cloud to easily and cost-efficiently set up and provision an application environment without the limitations and hardware investment required in a physical or virtual server environment.

IT departments are looking to expand the use of cloud computing for more of their core business applications. Some are even looking to push their entire data center into the cloud, where they can gain configuration flexibility, improved IT resource allocation, and, in some cases, significant cost savings. Moving to the cloud is the only way some data centers can support rapid growth or rapidly changing requirements for storage, performance, and IT resources in today’s fast-paced economy.

Minimum Requirements for Business-Critical Applications

However, moving tier 1 business-critical applications such as SQL Server, Oracle, and SAP to the cloud presents several significant challenges. For these applications, companies have no tolerance for downtime, disruption of service, data loss, or slowed response times. To even consider moving a business-critical application to the cloud, data center managers have to ensure that the move will not add risk, complexity, or cost to their application environment. The cloud environment also has to match or exceed the service levels, recovery time objectives (RTOs), and recovery point objectives (RPOs) that they are meeting in their physical server environments (typically using shared storage clusters). These requirements raise several key questions:

  • How can you provide the high availability (HA) and disaster recovery (DR) protection needed for tier 1 applications in a cloud where shared storage clusters are not offered?
  • Will providing HA and DR protection impose limitations on cloud flexibility?
  • Will moving applications to the cloud add risk, complexity, or performance overhead?
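The "match or exceed" requirement for recovery time and recovery point objectives can be expressed as a simple comparison. The following Python sketch is illustrative only: the `ServiceLevel` class, the `meets_or_exceeds` function, and all of the numbers are assumptions made for this example, not figures from any real deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    """Recovery objectives in seconds (hypothetical values only)."""
    rto_seconds: float  # recovery time objective: maximum tolerable downtime
    rpo_seconds: float  # recovery point objective: maximum tolerable data-loss window


def meets_or_exceeds(candidate: ServiceLevel, baseline: ServiceLevel) -> bool:
    """True when the candidate recovers at least as fast and loses no more data."""
    return (candidate.rto_seconds <= baseline.rto_seconds
            and candidate.rpo_seconds <= baseline.rpo_seconds)


# Hypothetical targets for an existing shared-storage cluster and two cloud designs.
on_premises = ServiceLevel(rto_seconds=120, rpo_seconds=0)
cloud_with_data_loss_window = ServiceLevel(rto_seconds=90, rpo_seconds=30)
cloud_without_data_loss = ServiceLevel(rto_seconds=90, rpo_seconds=0)

print(meets_or_exceeds(cloud_with_data_loss_window, on_premises))
print(meets_or_exceeds(cloud_without_data_loss, on_premises))
```

In this sketch a cloud design qualifies only if it is no worse on both objectives. A faster recovery time does not compensate for a larger data-loss window.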

 

Options for Protecting Applications in the Cloud

Public cloud providers offer some redundancy in the form of separate and redundant data centers or computing resources, such as AWS EC2 Availability Zones and Microsoft Azure fault domains. However, this redundancy falls short of the high availability failover clustering protection that companies typically use to protect important applications in their physical or virtual server environments. While moving to the cloud means you no longer have physical servers in house to worry about, you still have to protect against failures in cloud instances and outages in the public cloud provider's service.

In a traditional failover cluster, two or more servers share the same physical storage, typically networked storage configured through a SAN. The critical application operates on one server and, in the event of a failure, clustering software moves the application operation to another server in the cluster. Since all of the cluster servers share the same storage, the application can continue to operate after a failover without a loss of data.
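The traditional failover behavior described above can be sketched in a few lines of Python. This is a toy model written for illustration, not how WSFC or any other clustering product is implemented. The class names (`SharedStorage`, `Node`, `FailoverCluster`) are hypothetical, and health detection is reduced to a boolean flag.

```python
class SharedStorage:
    """Stands in for the SAN volume that every cluster node can reach."""

    def __init__(self):
        self.records = []


class Node:
    """A cluster server; 'healthy' is a stand-in for real health monitoring."""

    def __init__(self, name, storage):
        self.name = name
        self.storage = storage
        self.healthy = True


class FailoverCluster:
    def __init__(self, nodes):
        self.nodes = nodes
        self.active = nodes[0]  # the application runs on one node at a time

    def fail_over(self):
        """Move the application to the first healthy node in the cluster."""
        for node in self.nodes:
            if node.healthy:
                self.active = node
                return
        raise RuntimeError("no healthy node left in the cluster")

    def write(self, record):
        if not self.active.healthy:
            self.fail_over()
        self.active.storage.records.append(record)


san = SharedStorage()
node_a, node_b = Node("node-a", san), Node("node-b", san)
cluster = FailoverCluster([node_a, node_b])

cluster.write("order-1")
node_a.healthy = False       # simulate a server failure
cluster.write("order-2")     # triggers failover to node-b

print(cluster.active.name, san.records)
```

Because both nodes reference the same `SharedStorage` object, the record written before the failure is still present after the application moves, which is the property the shared SAN provides.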

More and more companies are overcoming cloud providers' lack of shared storage clustering with a deceptively simple solution: SANless clusters. In a physical server deployment, you would typically create a cluster from two servers connected to a SAN, plus software that manages the application failover process, such as Windows Server Failover Clustering (WSFC). In a cloud environment, you can still use WSFC or other failover software by simply adding SANless clustering software. This software synchronizes the storage in the cloud cluster nodes using efficient replication (synchronous or asynchronous) to create virtualized storage that looks like a SAN to the failover software. It allows you to run your business-critical applications in a public, private, or hybrid cloud with the same level of protection as a physical server deployment. Some SANless software can be used to handle both failover and replication in Linux environments, in multinode deployments across subnets, or in other scenarios where WSFC may not be possible.
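The difference between synchronous and asynchronous replication can be illustrated with a small Python model. This is a conceptual sketch only: `ReplicatedVolume` and its methods are hypothetical names, the "disks" are plain lists, and no real SANless product is being described.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model of node-local storage mirrored to a second cluster node."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = []        # blocks on the active node's local disk
        self.replica = []        # blocks on the standby node's local disk
        self.pending = deque()   # acknowledged writes not yet on the replica

    def write(self, block):
        self.primary.append(block)
        if self.synchronous:
            # Synchronous: the write reaches the replica before it is acknowledged.
            self.replica.append(block)
        else:
            # Asynchronous: acknowledge immediately and replicate later.
            self.pending.append(block)

    def flush(self):
        """Ship queued writes to the replica (asynchronous mode only)."""
        while self.pending:
            self.replica.append(self.pending.popleft())

    def unreplicated(self):
        """Number of writes that would be lost if the primary failed now."""
        return len(self.pending)


sync_volume = ReplicatedVolume(synchronous=True)
sync_volume.write("block-1")

async_volume = ReplicatedVolume(synchronous=False)
async_volume.write("block-1")
async_volume.write("block-2")

print(sync_volume.unreplicated(), async_volume.unreplicated())
```

In this model the synchronous volume never has unreplicated writes, at the cost of waiting on the replica for every write. The asynchronous volume acknowledges writes sooner but carries a window of writes that a failover would lose until `flush` runs.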

By using SANless software to create clusters in the cloud, you get the same level of high availability and disaster protection that you get in a physical server deployment without adding complexity or changing existing operational procedures. The cluster is managed in the same way as a traditional cluster. You also eliminate the limitations and downsides of owning and operating a physical SAN. For example, SANs are expensive to buy and require specialized (sometimes costly) administration skills to manage. Because SANs are primarily optimized for large-scale storage rather than low-latency data access, they can slow performance in highly transactional database environments such as SQL Server, Oracle, and SAP.

Consider a large pre-owned car company in Japan that planned to expand its business very quickly, nearly doubling its locations worldwide over the next two years. The only way its IT infrastructure could accommodate that level of growth was by moving to the AWS cloud. However, the company would not move to the cloud unless it had easy-to-deploy HA protection for its business-critical SQL Server applications.

Configuration Flexibility

Cloud environments let you configure an application environment and allocate IT resources in an easy, dynamic, flexible way that a SAN-based cluster cannot approximate. A SANless cluster enables you to leverage the benefits of a cloud environment and gives you the added flexibility to mix physical, virtual, and cloud configurations to best meet your HA and DR requirements.

Cloud as Disaster Recovery Site

One of the biggest benefits of moving to the cloud is the ease of deploying a disaster recovery solution. For example, an online gaming company moved its operations to the AWS cloud and wanted to ensure it was protected from sitewide disasters. The easiest solution was to create a SANless cluster that spanned Availability Zones. If the company experiences an outage on its primary cloud instance, operation continues on servers in a geographically separated cloud instance.
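The idea of spanning zones can be checked with a short Python sketch: a cluster survives the loss of a zone only if at least one node sits somewhere else. The function names and the zone labels (`zone-a`, `zone-b`) are hypothetical and do not correspond to any provider's API or naming.

```python
def spans_multiple_zones(node_zones):
    """True when the cluster's nodes are placed in at least two distinct zones."""
    return len(set(node_zones.values())) >= 2


def survivors(node_zones, failed_zone):
    """Names of the nodes still available after an entire zone goes down."""
    return sorted(name for name, zone in node_zones.items() if zone != failed_zone)


# Hypothetical two-node cluster with one node per zone.
cluster = {"primary": "zone-a", "standby": "zone-b"}

print(spans_multiple_zones(cluster))
print(survivors(cluster, failed_zone="zone-a"))
```

A cluster with every node in the same zone would fail the first check and would have no survivors if that zone went down.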

Despite the advantages of cloud computing, many enterprises are too invested in their physical servers and on-premises data centers to move to an all-cloud environment. However, SANless clustering is proving to be an important solution for these organizations as well by providing a simple, highly cost-efficient disaster recovery solution. Until recently, managing site failures has been very complex and expensive, requiring large investments in specialized hardware and software as well as the availability of a second data center site. Using a SANless failover cluster, you can create scalable disaster recovery protection without the cost or complexity of building out or renting an off-site disaster recovery facility. You can use the cloud as a second site to locate a cluster member and use it to handle failover when the local site fails.

SANless clustering will become an essential requirement as companies look to fully integrate cloud computing into their infrastructure. By eliminating the limitations of SAN-based solutions, SANless clusters enable enterprises to fully leverage the benefits of the cloud.

 

Jerry Melnick is chief operating officer at SIOS Technology Corp, maker of SIOS SAN and SANless cluster software. He has more than 25 years of experience in the enterprise and high availability software industries. He holds a bachelor of science degree from Beloit College with graduate work in computer engineering and computer science at Boston University.

EnterpriseAI