Mind the Gap: Coping with Big Data’s Big Costs
In an increasingly data-driven world, with growing data volumes and improving technological infrastructure, big data promises to benefit businesses in nearly every industry. A study conducted by the Harvard Business Review found that companies leveraging advanced analytics will, on average, experience profit gains and operational improvements between 5 and 6 percent higher than their competitors. By 2020, the revenue generated by big data and business analytics is projected to surpass $200 billion. It would seem the question of whether big data can deliver business value has been firmly settled.
Now, businesses must answer two important follow-ups to capitalize on this opportunity:
- What can big data do for my company?
- How can I build the necessary IT infrastructure and internal operations framework to ensure my big data program’s sustainable success?
These questions help to identify what we’ve dubbed “the big data activation gap.” This is the space between having access to raw data and the point when your big data operation produces quantifiable value.
Many issues underlie the gap, and they can vary from company to company. In 2017, we conducted a survey on the State of DataOps, asking IT and business leaders about their internal big data successes and challenges as they bridge the gap. Only 8 percent of respondents reported a mature big data program. For the 92 percent struggling to bridge the gap, the most common issues were a lack of technical resources, difficulty finding skilled employees and cost containment.
The best first step in bridging the gap? Identify the true value of your data: what it can do for you. Along with this, pinpoint how and when your teams need access to data, and in what form. From there, your business can start to develop an effective strategy around IT infrastructure and hiring, which answers the second question above, to help your company fully activate its data.
Pinpoint the Destination — And Be Specific
Big data is everywhere, and it’s important for businesses to understand how to effectively analyze and draw insights from their data. Data activation allows companies to put data to work across the organization by making data of any volume and variety available to everyone in the business, for any use case.
Be it machine learning or ad hoc analysis, your desired areas of application will help inform your activation strategy and allow you to establish the right tools and teams. For example, data teams can run an analysis to determine customer buying trends, keywords, and purchasing pathways that may lead to either greater revenue from existing customers or the acquisition of new customers. Those results can be shared across marketing and sales teams to better inform their specific messaging and strategies for interacting with customers, improving outcomes and generating new revenue.
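To make the buying-trends example concrete, here is a minimal sketch of the kind of ad hoc analysis a data team might run before sharing results with marketing and sales. The purchase records, field layout and function names are all hypothetical and exist only for illustration; a real analysis would run against your own data and tools.

```python
from collections import Counter, defaultdict

# Hypothetical purchase records: (customer_id, product_category, amount).
# These values are invented for illustration only.
purchases = [
    ("c1", "shoes", 80.0),
    ("c2", "shoes", 120.0),
    ("c1", "socks", 10.0),
    ("c3", "hats", 25.0),
    ("c2", "socks", 15.0),
    ("c1", "shoes", 95.0),
]


def revenue_by_category(records):
    """Sum purchase amounts per product category."""
    totals = defaultdict(float)
    for _customer, category, amount in records:
        totals[category] += amount
    return dict(totals)


def repeat_customers(records):
    """Return the sorted ids of customers with more than one purchase."""
    counts = Counter(customer for customer, _category, _amount in records)
    return sorted(c for c, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(revenue_by_category(purchases))
    print(repeat_customers(purchases))
```

Even a summary this simple gives marketing and sales something specific to act on: which categories drive revenue, and which existing customers are already buying repeatedly.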
Getting the Right Tools in Place
Once you clearly outline your business needs, it’s crucial that the right data tools be in place to ensure your data is widely accessible. Many solutions are available for processing large amounts of data for ETL, machine learning, ad hoc analysis and beyond. Depending on your business needs, however, certain tools are better for streamlining specific use cases.
For example, Hadoop and Hive work well for complex ETL workloads, but may not provide optimal results on machine learning or artificial intelligence use cases. By identifying the strengths of popular big data engines or frameworks before incorporating them into a data strategy, businesses can better activate their data in a way that suits the specific needs of their teams.
Amid the complexity of today’s big data environment, no single tool can meet all of a company’s needs. Because of this, many of today’s most innovative organizations are leveraging multiple tools to process and analyze data for different workloads and projects. In fact, our State of DataOps survey found that roughly 80 percent of companies are transitioning to a multi-engine deployment strategy for their big data workloads, and that trend is expected to accelerate as businesses grapple with ever-larger data loads.
Bridge Your Talent Gap
Proper tools are crucial, as is a data team capable of using them. Companies are grappling with a shortage of skilled data experts, which increases the pressure on small data teams to handle ever-larger data requests. Implementing a big data activation strategy helps companies bridge this gap and make the most of their limited resources. By removing much of the grunt work around processing data, teams can instead focus on extracting and applying actionable insights across the business.
Interestingly, many companies are also trying to address the shortage of data team talent through their data strategies. With a sharp rise in the number of users running queries across different data engines and frameworks, companies are starting to invest in making employees more self-sufficient. A self-sufficient approach becomes increasingly critical as companies collect larger amounts of data.
While it may be overwhelming to think about how to deal with the rapid increase in data requests, understanding and prioritizing your use cases and acquiring the right tools and team will help set up your organization for success. As a result, your organization will be miles ahead of competitors in implementing a data activation strategy that empowers data users to process data effectively, quickly and affordably.
Ashish Thusoo is CEO and co-founder of Qubole.