The Unstructured Data Management Maturity Index
As data management matures, unstructured data evolves from a storage cost center to the epicenter of value creation.
Enterprise data is growing, which is no surprise. What is truly staggering is the current rate of growth. In 2010, the amount of data created, consumed, and stored was two zettabytes (ZB), according to Statista. Analyst firms such as IDC have predicted explosive global data growth over the next few years: from 64.2 ZB of data in 2020 to 175 ZB in 2025. That is nearly a threefold increase in five years. About 80% of all data is unstructured: file and object data, including documents, medical images, video and audio files, design data, research data, and sensor data.
By some estimates, less than 5% of this data is used for any purpose, and enterprise IT teams have minimal visibility into their data and its value. So they store it forever because it’s the safest thing to do. The end result: outsized storage expenditures and the inability to leverage data for new use cases and value. A recent study by Accenture revealed that 68% of companies are unable to derive tangible and valuable benefits from data.
Yet consider the opportunity: real-time analysis of adverse events to inform patient safety measures and new drug development; early identification of product defects in manufacturing; customer sentiment and chat analysis after a product launch to improve go-to-market strategies; or the application of machine learning (ML) algorithms to real-time seismic data and satellite imagery to forecast natural disasters. According to Forrester, organizations that adopt a data-driven decision-making approach are growing at more than 30% annually.
To take advantage of unstructured data for competitive purposes, it is important to develop a management strategy that meets the dual need for cost efficiency and monetization. Here is a five-stage maturity model for organizations looking to modernize unstructured data management practices.
Unmanaged unstructured data. At this stage, unstructured data volumes are large and spread across on-premises, edge, and cloud silos, resulting in minimal visibility and little, if any, insight across the entire enterprise data storage ecosystem. In many cases, all data is treated the same way: most or all of it resides on expensive primary storage and is not managed appropriately to save money or meet the needs of separate groups and workloads. Meanwhile, there is pressure from above to manage costs by moving from sunk data center costs for hardware and maintenance to more flexible, on-demand cloud storage. Yet without adequate visibility into data assets, requirements, and value, it is difficult for IT and storage professionals to plan and manage effective cloud data migrations. Many will opt for a basic lift-and-shift approach, which can actually increase costs.
- Disconnected storage silos limit visibility into data assets
- Storage, backup, and DR costs are high as a percentage of IT budget
- Tension between storage IT professionals and users/department managers regarding data management decisions
- Lack of expected return on investment from cloud storage migrations or tiering.
Storage-centric data management. This phase is characterized by a move toward better control of data storage costs by using the storage vendor’s own data management capabilities for migration, replication, and tiering of unstructured data. Storage-centric data management can be effective in environments with a single storage vendor, but most environments include multiple sites, additional vendors, and cloud deployments. Storage administrators must use disparate tools to migrate, replicate, and analyze data within these storage silos. This approach provides some cost savings, but it may not reduce complexity, it can limit flexibility, and it still leaves money on the table. If an organization wants to access data after it has been moved to the cloud through the storage vendor’s tools, IT must retain the storage capacity and pay egress charges.
- Unclear strategy for migrating to lower cost storage
- Several tools used for migration and other data management tasks
- Hidden costs of storage-vendor tiering to the cloud
- Planned migrations to new platforms are often delayed or stalled due to complexity.
Independent management of unstructured data. As an enterprise’s unstructured data grows to petabytes and beyond and hybrid cloud infrastructure dominates, the need to separate data management from storage management becomes apparent. Storage teams will seek to adopt a storage-independent approach to data management, sometimes referred to as a data fabric. Teams use analytics to examine storage silos and identify savings opportunities. For example, moving “cold” data that has not been accessed for a year or more to cheaper storage (such as in the cloud) frees up space on expensive, high-performance NAS storage.
- Consolidation of data management tools
- IT can manage data independent of storage technology or service
- The ability to reduce storage and backup costs by 70% or more by identifying and moving cold data to secondary storage
- The unstructured data management solution should not affect the end user’s data access performance.
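The cold-data analysis described in this stage can be illustrated with a minimal sketch. It assumes a locally mounted file share and treats a file as "cold" when its last-access timestamp is older than a threshold; the function names `find_cold_files` and `summarize` and the 365-day default are illustrative assumptions, not part of any particular product. Last-access times can be unreliable on file systems mounted with access-time updates disabled, so a real analysis would need to confirm how the storage records access.

```python
import os
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def find_cold_files(root, max_idle_days=365, now=None):
    """Yield (path, size_bytes) for files not accessed in max_idle_days.

    Hypothetical helper: "cold" is defined here purely by last-access time.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                stat = path.stat()
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                continue
            if stat.st_atime < cutoff:
                yield path, stat.st_size


def summarize(root, max_idle_days=365, now=None):
    """Return the count and total size of cold files under root."""
    cold = list(find_cold_files(root, max_idle_days, now))
    return {
        "cold_files": len(cold),
        "cold_bytes": sum(size for _path, size in cold),
    }
```

Calling `summarize("/mnt/share")` on a mounted share (the path is a placeholder) would give a first estimate of how much capacity is a candidate for cheaper secondary storage.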
Policy-based unstructured data management. In this phase, organizations move beyond cost savings to better meet security, compliance, and research requirements. Data policies and open data formats are essential. Organizations automatically and continuously move data to the right storage based on business priorities, costs, or monetization opportunities. For example, an electric car manufacturer wants to understand how its vehicles perform in different weather conditions and so creates a data management policy to continuously pull car trace files at regular intervals into data lakes and analyze them. Once the study is complete, this policy is removed and the moved data is either deleted or moved to deep archival storage.
- Storage teams have shifted from storage-centric operations to a focus on properly managing data throughout its lifecycle with self-service capabilities for users.
- Increased automation to move data to the right storage at the right time, expanding use cases for managing unstructured data.
- Data management policies run automatically until changed or deleted, eliminating manual, error-prone policy management.
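As a rough illustration of policy-based movement, the sketch below models a policy as a named rule that maps matching files to a target storage tier. Everything here is assumed for illustration: the `FileRecord` and `Policy` types, the tier labels `"data-lake"` and `"archive"`, and the first-match-wins ordering. It only plans moves; it does not move any data.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one file."""
    path: str
    size_bytes: int
    days_since_access: int
    tags: frozenset = frozenset()


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: a predicate plus a target tier label."""
    name: str
    matches: Callable[[FileRecord], bool]
    target: str


def plan_moves(files, policies):
    """Return (path, target, policy name) for each file; first match wins."""
    plan = []
    for record in files:
        for policy in policies:
            if policy.matches(record):
                plan.append((record.path, policy.target, policy.name))
                break
    return plan


# Example policies, listed in priority order (illustrative only).
policies = [
    Policy("weather-study", lambda f: "trace" in f.tags, "data-lake"),
    Policy("cold-to-archive", lambda f: f.days_since_access >= 365, "archive"),
]
```

Removing the `weather-study` entry from the list once the study ends mirrors the example in the text: the policy stops applying on the next run, while the remaining policies continue unchanged.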
Unstructured data management value. Some datasets contain value beyond the original application that created them. With advances in scalable and affordable services such as cloud-based data lakes and machine learning, business leaders are eager to see what their treasure troves of stored data might yield in terms of new insights benefiting R&D, operations, and customer relations. At this ultimate level of unstructured data management maturity, the new prize is data management for long-term value. Capabilities include the ability to search across storage and cloud silos to find precise datasets, then move the data into cloud analytics environments for analysts and data scientists to access. Mature organizations can tag files with additional metadata throughout the lifecycle, improving searchability and queryability. Storage teams work closely with business and departmental stakeholders to understand data requirements for proper planning and long-term goals.
- Unstructured data management tools enable seamless movement of data to external data analytics platforms and services.
- End-to-end workflow automation eliminates the manual steps of unstructured data discovery and delivery to the platforms of your choice.
- Storage administrators are elevating their role from configuring and managing storage technologies to managing data for market gains.
- Data management becomes a flexible framework that future-proofs data for new applications and business use cases as they evolve.
- IT can measure the increase in revenue generated by unstructured data insights.
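The tagging and cross-silo search capability described in this stage can be sketched with a small in-memory index. This is an illustrative assumption rather than a description of any specific tool: the `MetadataIndex` class, the silo names, and the tag keys are all hypothetical, and a production system would persist the index and populate it from storage scans.

```python
from collections import defaultdict


class MetadataIndex:
    """Hypothetical in-memory index of user-defined tags per file."""

    def __init__(self):
        # Maps (silo, path) -> {tag key: tag value}.
        self._tags = defaultdict(dict)

    def tag(self, silo, path, **metadata):
        """Add or update tags on a file identified by silo and path."""
        self._tags[(silo, path)].update(metadata)

    def search(self, **criteria):
        """Return sorted (silo, path) pairs whose tags match all criteria."""
        return sorted(
            key
            for key, meta in self._tags.items()
            if all(meta.get(k) == v for k, v in criteria.items())
        )
```

For example, after tagging files on a NAS and in a cloud bucket with `modality="MRI"`, a single `search(modality="MRI")` call returns matches from both silos, which is the list an analyst would then hand to a data mover.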
No matter where your organization is on the maturity curve, it’s time to stop buying more storage without seeing the big picture and stop treating all data the same. Instead, start analyzing and understanding data so you can manage it appropriately and by policy, take full advantage of cloud storage, and avoid waste. Start spending time on strategies to deliver greater data value, including connecting with data teams building new analytics infrastructure.