Accelerating AI Workloads with Low Latency File Access Sponsored Content by Mellanox
Similar to the widespread adoption of high-performance computing (HPC) across industries, the use of Artificial Intelligence (AI) is becoming more prevalent in businesses today. By incorporating AI into their operations, businesses are using the technology to accelerate digital transformation, make real-time financial decisions, speed time to results, and optimize cost structures. In fact, a 2017 survey of global CIOs found that 90 percent of CIOs championing AI in their organizations expect improved decision support that drives greater top-line revenue growth.
Because most AI applications depend on massive amounts of data, they require HPC capabilities, especially in compute, storage, and networking. AI's requirements are very similar to those of other computationally intensive applications (e.g., Big Data analytics, forecasting, modeling, and simulation) that are also increasingly being introduced into the enterprise.
While the promise of AI includes faster and better insights into data and, in turn, reduced risk and increased efficiency, there are potential technical obstacles in the way. To remove those obstacles, Mellanox Technologies and WekaIO are delivering solutions with world-class data center performance. Their solutions fit seamlessly into existing data center infrastructures and allow any organization to realize the full benefits that AI and HPC applications can deliver.
Selecting the right technology solution
A key challenge for AI today is closing the gaps between accessing massive amounts of data, training a deep learning model, and deploying it into production. In many mainstream applications such as autonomous vehicles, medical research, security, or even customer service, the requirements go beyond powerful processing capabilities alone. Efficient AI systems also depend heavily on a superior network with advanced acceleration engines and RDMA (remote direct memory access) support in the hardware.
RDMA is an industry standard that supports what is known as zero-copy networking by enabling the network adapter to move data directly to or from application memory. This bypasses both the operating system and the CPU, making it significantly faster than traditional network stacks. Mellanox 100Gb/s solutions can move data between server hosts with latencies as low as 0.6 microseconds and enable much higher message rates, which is paramount for HPC and AI workloads.
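To make those figures concrete, here is a rough back-of-envelope sketch of a small RDMA transfer at 100Gb/s. The constants come from the figures above; the simple latency-plus-serialization model is an illustrative assumption, not a benchmark.

```python
# Back-of-envelope model of an RDMA transfer (illustrative only; real
# performance depends on adapter, fabric, and message size).

LINK_GBPS = 100          # Mellanox 100 Gb/s link, from the text above
BASE_LATENCY_US = 0.6    # host-to-host latency cited above, in microseconds

def transfer_time_us(message_bytes: int) -> float:
    """Estimate one-way transfer time: base latency + serialization time."""
    serialization_us = message_bytes * 8 / (LINK_GBPS * 1000)  # Gb/s -> bits/us
    return BASE_LATENCY_US + serialization_us

# A 4 KiB message is latency-dominated: well under a microsecond of
# serialization on top of the 0.6 us base latency.
print(round(transfer_time_us(4096), 3))  # -> 0.928
```

At this message size the fixed latency dominates, which is why cutting the OS and CPU out of the path matters more than raw bandwidth for small-message workloads.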
What's more, GPUDirect™ RDMA is another widely adopted capability used to accelerate data movement and increase performance and scalability for GPU-bound workloads. GPUDirect RDMA moves data directly to and from remote GPUs over the Mellanox network fabric, again removing the CPU and system memory from the data path. In benchmark tests, GPUDirect RDMA achieved a data transfer rate of 11GB/second per client using a single EDR link, a 10X improvement over NFS's per-client performance of 1GB/second. More importantly, this performance scales with each additional EDR link, making GPUDirect RDMA one of the most popular techniques in both HPC and AI when scaling beyond a single compute node. Leading deep learning frameworks such as Caffe2, Microsoft Cognitive Toolkit, MXNet, PyTorch, and TensorFlow already natively support Mellanox interconnects. Putting this into perspective, the NVIDIA DGX-2, the latest state-of-the-art platform for AI, includes eight ConnectX-5 100Gb/s adapters, delivering 1600Gb/s of bi-directional throughput and taking full advantage of Mellanox acceleration capabilities.
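The linear-scaling claim above can be sketched as simple arithmetic. This is only an illustration of the cited benchmark figures (11GB/s per EDR link, 1GB/s over NFS), not a new measurement, and real scaling depends on the workload.

```python
# Illustrative arithmetic using the benchmark figures cited above.
GPUDIRECT_GBPS_PER_LINK = 11   # GB/s per client over one EDR link
NFS_GBPS_PER_CLIENT = 1        # GB/s per client over NFS (baseline)

def gpudirect_aggregate_gbps(edr_links: int) -> int:
    """GPUDirect RDMA throughput scales roughly linearly with EDR links."""
    return edr_links * GPUDIRECT_GBPS_PER_LINK

# One link is ~11x the NFS baseline; four links reach ~44 GB/s per client.
print(gpudirect_aggregate_gbps(1), gpudirect_aggregate_gbps(4))  # -> 11 44
```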
WekaIO Solves the I/O Starvation Problem
Especially for AI applications, fast access to storage plays a significant role not only in reducing training time in deep learning scenarios but also in supporting fast decision making in production environments. To keep workloads running and make the most efficient use of expensive GPU arrays, AI-based systems need an innovative storage solution that aggregates data previously spread across different systems into a unified storage pool. WekaIO, a high-performance, scalable storage software company, has introduced a storage solution based on a parallel filesystem that offers the highest-throughput, lowest-latency data access for CPU- and GPU-intensive AI and HPC workloads.
The WekaIO Matrix™ software (coupled with ConnectX-series InfiniBand and Ethernet adapters) improves the performance of data-hungry workloads without requiring a business to modify existing networks or applications. The solution uses a WekaIO-developed, optimized zero-copy networking stack based on PCIe virtualization and Data Plane Development Kit (DPDK) technologies.
The combination of Mellanox interconnect solutions and WekaIO software allows AI and HPC workloads to achieve the highest performance, scalability, and efficiency. This provides HPC-class performance and scale without the complexity and expense of proprietary systems. And it future-proofs an organization's investment, because both the hardware and the software are built on standards-based interconnects as defined by the InfiniBand Trade Association.
Artificial Intelligence and HPC platforms alike depend on Mellanox and WekaIO to deliver solutions that are based on open industry standards, provide world-class capabilities and performance, and lower total cost of ownership. Maximizing the potential of modern GPU systems is essential for today's businesses. Companies, no matter their size or industry, are looking to optimize the use of data, transform their business, and maintain a competitive advantage by leveraging artificial intelligence. The leading AI companies already leverage state-of-the-art capabilities from WekaIO and Mellanox.
For more information about Mellanox and WekaIO solutions for AI, visit: