Advanced Computing in the Age of AI | Thursday, March 28, 2024

Yahoo! Japan: 50TB/Day Data Transfer in Trans-Pacific Search for Lower Energy Costs 

Transferring data from one data center to another in search of lower regional energy costs isn’t a new concept, but Yahoo! Japan is putting the idea into trans-Pacific effect with a system that transfers 50TB of data a day from Japan to the U.S., where electricity costs about a quarter of what it does in Japan.

DataDirect Networks and IBM Japan have worked for about a year on a new active archive solution that allows Yahoo! JAPAN to cache dozens of petabytes of data from its OpenStack Swift object storage system at a data center in Japan and then transfer and store it in a U.S.-based data center owned by YJ America, Inc., its American subsidiary.

According to Yahoo! Japan, an affiliate of the American internet giant, the company moved to an active archive system configured on both sides of the Pacific Ocean because of rising data volumes, multi-petabyte data backup requirements, and disaster recovery measures it implemented after the Great East Japan Earthquake of 2011, along with a desire to avoid Japanese energy costs: by the company’s figures, electricity for the U.S. archive costs 74 percent less than it would in Japan.

The active archive is built on DDN’s SFA7700X hybrid flash storage appliance and IBM Spectrum Scale Active File Management caching functionality for higher speed I/O and metadata handling, and it is able to handle dozens of petabytes of data within a single file system configuration, according to DDN. The system allows the data center in Japan to cache data from the operating object storage private cloud at a rate of 11TB/hour and back up the data in a data center archive in the United States, transferring data at what DDN called “a breakthrough transfer rate” of 50TB per day while allowing users in Japan to access data and conduct services.
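To put those two figures on a common footing, the short sketch below converts them into average sustained line rates. It is back-of-the-envelope arithmetic only, not anything published by DDN or Yahoo! Japan: it assumes decimal terabytes (10^12 bytes) and a perfectly even transfer with no protocol overhead, and the function and constant names are made up for illustration.

```python
# Rough conversion of the quoted data rates into sustained network throughput.
# Assumptions (not from the article): 1 TB = 10**12 bytes, transfers are spread
# evenly over the period, and protocol overhead is ignored.

SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR
BYTES_PER_TB = 10**12


def sustained_gbps(terabytes: float, seconds: float) -> float:
    """Average rate in gigabits per second needed to move `terabytes` in `seconds`."""
    bits = terabytes * BYTES_PER_TB * 8
    return bits / seconds / 1e9


# 50TB per day across the Pacific (figure quoted by DDN).
transfer_gbps = sustained_gbps(50, SECONDS_PER_DAY)

# 11TB per hour into the cache in Japan (figure quoted by DDN).
cache_gbps = sustained_gbps(11, SECONDS_PER_HOUR)

print(f"Trans-Pacific transfer: {transfer_gbps:.2f} Gbit/s sustained")
print(f"Cache ingest in Japan:  {cache_gbps:.2f} Gbit/s sustained")
```

Under these assumptions, 50TB a day works out to roughly 4.6 Gbit/s held around the clock, while the 11TB/hour cache rate corresponds to about 24 Gbit/s, so the local cache can absorb data several times faster than the trans-Pacific link drains it.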

Laura Shepard, DDN senior director of marketing, told EnterpriseTech that data transfers on this scale are seen at HPC research sites but are not common in business environments. “It’s the performance of the underlying storage system, to be able to actually service the requests at the rates required means very high bandwidths for a multi-region data transfer… There was quite a bit of work done to tune the infrastructure to achieve this level of performance for international data transfer.”

Daisuke Masaki, cloud innovations, site operations division, systems management group, Yahoo! Japan, said the company was “grappling with a number of challenges related to our large, fast-paced data growth and vital disaster recovery needs; however, installing a massive storage system in a data center in Japan raises additional issues, such as power consumption. We therefore opted for a bold technical solution in which our data center in Japan caches data from the existing object storage (OpenStack Swift) and saves the data to a data center archive in the United States, which can be operated with 26 percent of the electricity cost of a data center in Japan and at about one-third of the cost of competitive solutions. Moving forward, we plan to expand and save data from multiple websites in Japan to the active archive system in the United States.”

EnterpriseAI