Written by: Sameer Dixit, SVP, Engineering, Data, Analytics, Integration and AI ML, Persistent Systems
As organizations capture more and more data, their ability to harness that data and surface the relevant insights to transform their business will set them apart from the competition. The pandemic has dramatically accelerated the pace of this transformation. Factors like extreme personalization, business model disruption, and the need to deliver experiences that are immediate, intelligent, and integrated are driving this urgency. Analytics and machine learning technologies are at the forefront of this transformation. Data-driven decision making is the goal for every organization, and much of it will be shaped by the organization's data and analytics roadmap.
Here is a practitioner’s take on the top trends we expect to drive the data world in 2022:
- Machine learning for better decisions
Let’s face it, making decisions is hard! As organizations make progress on managing their data more effectively, the next frontier is using that data to improve decision-making. This becomes especially complex when many varied, intricate dashboards represent every possible data element, which in turn causes fatigue. Multiple data points combined with human impulse and bias make the overall process even harder. Moving forward, these dashboards will be replaced with automated, conversational, mobile, dynamically generated actionable insights produced autonomously by AI technologies. These insights will be customized to a user’s needs and delivered to their point of consumption, shifting decision-making from a handful of data experts to a wider base in the organization.
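To make the idea concrete, here is a minimal sketch of what "autonomously generated insight" can mean in practice: a routine check that turns a raw metric series into a plain-language statement instead of another dashboard panel. The metric name, thresholds, and wording are illustrative assumptions, not a reference to any specific product.

```python
import statistics
from typing import List, Optional

def generate_insight(metric_name: str, history: List[float], latest: float) -> Optional[str]:
    """Flag the latest value if it deviates strongly from its recent history."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return None
    z_score = (latest - mean) / stdev
    if abs(z_score) < 2:  # within normal variation, nothing worth surfacing
        return None
    direction = "above" if z_score > 0 else "below"
    return (f"{metric_name} is {abs(latest - mean) / mean:.0%} {direction} its "
            f"recent average; review the underlying drivers.")

# Hypothetical KPI: the function returns a sentence ready to push to chat or mobile.
print(generate_insight("Weekly churn rate", [0.021, 0.019, 0.022, 0.020], 0.035))
```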
- Data as a Service
On one side, the investments organizations are making in big data and AI are accelerating rapidly; on the other, the expected time to insight is drastically shrinking. Business users expect their investments in data and analytics to start showing results in a matter of weeks. However, for organizations that are not data driven, it can take months to go from data to insights. IT dependency remains heavy, with self-service limited to traditional business intelligence reporting systems. Meanwhile, data custodians and executives continue to struggle with low data trust, long governance cycles, and bureaucratic onboarding and approval processes.
Custodians of data in a company are being compelled to respond to this demand by treating Data as a Service and making trusted, verifiable data available on demand. This enables organizations to become data-driven within weeks through a powerful combination: a data stewardship team that owns data quality and governance, manages data products, ensures decentralized and distributed ownership, and enables end-to-end self-service through infrastructure as a platform. The right infrastructure coupled with the right tooling ensures ease of access to data, significantly reducing the time needed to create new data products that are secure, well governed, and discoverable. Architecture patterns like Data Mesh will be leveraged to create a scalable data architecture with distributed governance, a data discovery framework, a data security framework, and SLA-driven platform components.
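One way to picture a data product in this model is as an explicit contract owned by a domain team: a published schema, an accountable owner, a freshness SLA the platform can monitor, and quality checks that run before data is served. The field names below (owner, freshness_sla_hours, quality_checks) are assumptions made for this sketch, not a standard schema.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class DataProduct:
    name: str
    owner: str                       # accountable domain team, not central IT
    schema: Dict[str, str]           # column name -> type, published for discovery
    freshness_sla_hours: int         # SLA the platform can monitor and alert on
    quality_checks: List[Callable[[dict], bool]] = field(default_factory=list)

    def validate(self, record: dict) -> bool:
        """Run the owning team's quality checks before the record is published."""
        return all(check(record) for check in self.quality_checks)

# Illustrative product registered by a hypothetical sales domain team.
orders = DataProduct(
    name="orders_curated",
    owner="sales-domain-team",
    schema={"order_id": "string", "amount": "decimal", "order_ts": "timestamp"},
    freshness_sla_hours=4,
    quality_checks=[lambda r: r.get("amount", 0) >= 0],
)
print(orders.validate({"order_id": "A-1", "amount": 120.5, "order_ts": "2022-01-05T10:00:00Z"}))
```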
- Focus on the nonfunctional aspects – data lineage, data catalog and data discovery
As organizations succeed in bringing all their data into central data lakes, those data lakes tend to become data swamps: hard to search, extract, and analyze data from. The growing demand for self-service analytics for non-technical business users, and the data democratization it enables, will see more organizations deploying data cataloging and discovery tools. Data lineage will become even more important for these users to gain confidence in the data. Data governance and security tools and processes will become a necessity as users demand visibility, compliance, and access and permission controls across all organizational data.
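The core mechanism behind catalog and lineage tooling is simple: record where each dataset came from at registration time, so anyone can trace a report back to its raw sources. The toy catalog below is a hypothetical sketch of that idea, not the API of any particular tool.

```python
from typing import Dict, List

class Catalog:
    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}

    def register(self, name: str, description: str, upstream: List[str]) -> None:
        """Store the dataset together with its upstream sources (its lineage)."""
        self._entries[name] = {"description": description, "upstream": upstream}

    def lineage(self, name: str) -> List[str]:
        """Walk upstream recursively to list every source feeding this dataset."""
        sources: List[str] = []
        for parent in self._entries.get(name, {}).get("upstream", []):
            sources.append(parent)
            sources.extend(self.lineage(parent))
        return sources

catalog = Catalog()
catalog.register("raw_clicks", "Clickstream events from the web app", upstream=[])
catalog.register("sessions", "Sessionized clicks", upstream=["raw_clicks"])
catalog.register("conversion_report", "Daily conversion KPIs", upstream=["sessions"])
print(catalog.lineage("conversion_report"))   # ['sessions', 'raw_clicks']
```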
- Composable data and analytics
Typically, there are three main tenets for a modern data stack:
- A lower total cost of ownership
- On-demand scalability
- Reduced IT overhead
The new normal is to leverage a broad selection of cloud-native products and services to build a composable data platform. The goal is to use best-of-breed components from multiple data, analytics, and AI providers to build a flexible, usable experience that enables leaders to draw insights from data. The stack must be resilient and future proof. One way to do that is to build it with loosely coupled components, so that individual components can be replaced with something newer and better when required. Special attention should be paid to nonfunctional components like lineage, cataloging, provisioning, security, governance, and health in order to build a modern data platform.
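A minimal sketch of the "loosely coupled components" idea: downstream workloads depend only on a small interface, so a query engine (or catalog, or storage layer) can be swapped for a better one without rewriting the reports built on top of it. The interface and class names here are illustrative, not real vendor SDKs.

```python
from typing import List, Protocol

class QueryEngine(Protocol):
    def run(self, sql: str) -> List[dict]: ...

class WarehouseEngine:
    """One interchangeable implementation; could be any cloud-native SQL service."""
    def run(self, sql: str) -> List[dict]:
        return [{"engine": "warehouse", "sql": sql}]

class LakeEngine:
    """A drop-in replacement that satisfies the same interface."""
    def run(self, sql: str) -> List[dict]:
        return [{"engine": "lake", "sql": sql}]

def daily_revenue_report(engine: QueryEngine) -> List[dict]:
    # The report depends on the interface, never on a specific vendor SDK,
    # which is what keeps the stack composable and replaceable.
    return engine.run("SELECT order_date, SUM(amount) FROM orders GROUP BY order_date")

print(daily_revenue_report(WarehouseEngine()))
print(daily_revenue_report(LakeEngine()))
```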
With the modern analytics cloud, organizations can close the loop from insight to action and push those insights back into operational applications. An analytics engineer can implement this workflow without a lengthy development process or custom coding, making insight-to-action a reality.
- Hybrid, multi cloud and edge
Migration to the cloud is a critical component of data and analytics modernization. Multi-cloud or hybrid-cloud deployment will be a key criterion in this cloud adoption. Organizations will decide their cloud strategy based on considerations including cost, value, lock-in, flexibility, security, and compliance. Edge computing will gain prominence due to factors like the rising number of use cases requiring low-latency processing, the increased availability of compute power at the edge, the adoption of AI and IoT, the prohibitive cost of migrating data to the cloud, and the growing number of smart devices. Processing and analyzing data where it is generated is faster and cheaper than moving it to a central location. Organizations will look not only to store and search data at the edge, but also to process a large portion of it locally before sending it to the cloud. We will see more edge clouds, where the compute comes to the edge of the datacenter instead of the data going to the cloud.
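The "process locally, ship a summary" pattern can be as simple as the sketch below: a window of raw sensor readings is reduced to a compact payload at the edge, and only that payload travels to the cloud. Field names, the device id, and the alert threshold are assumptions for the example.

```python
import json
import statistics
from typing import List

def summarize_at_edge(readings: List[float], device_id: str) -> str:
    """Reduce a window of raw readings to a small payload before upload."""
    summary = {
        "device_id": device_id,
        "count": len(readings),
        "mean": round(statistics.mean(readings), 2),
        "max": max(readings),
        "alerts": sum(1 for r in readings if r > 90.0),  # local threshold breaches
    }
    return json.dumps(summary)   # a few bytes instead of thousands of raw points

raw = [71.2, 70.8, 93.5, 72.1, 69.9]
print(summarize_at_edge(raw, device_id="sensor-42"))
```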
- From DevOps to DataOps to XOps
With the deluge of data, processing it manually at scale is going to be very challenging. Combined with the rate at which the business is evolving, this means organizations will have to move rapidly up the automated-operations curve and towards a more XOps way of working. XOps is an umbrella term for the generalized operations automation of all IT disciplines and responsibilities. It enables and accelerates an organization's ability to operationalize the insights from data and analytics. XOps encompasses DataOps, MLOps, ModelOps, AIOps, and PlatformOps, which boost efficiency, enable automation, and shorten development cycles across industries. MLOps and ModelOps are especially critical as machine learning becomes mainstream and managing models in production becomes even more critical than building the models themselves. The need for machine learning models to be explainable, reproducible, auditable, and free of bias will further accelerate this shift.
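To illustrate the kind of automated check an MLOps or ModelOps pipeline runs in production, here is a sketch of a simple input-drift test: the live feature distribution is compared with the training baseline, and a large shift triggers retraining or review. The mean-shift test and the threshold are simplifying assumptions; production systems typically use richer statistical tests.

```python
import statistics
from typing import List

def drift_detected(training: List[float], live: List[float], max_shift: float = 0.25) -> bool:
    """Flag drift when the live mean moves more than max_shift baseline standard deviations."""
    baseline_mean = statistics.mean(training)
    baseline_std = statistics.stdev(training) or 1.0
    shift = abs(statistics.mean(live) - baseline_mean) / baseline_std
    return shift > max_shift

# Hypothetical feature: customer age at prediction time vs. at training time.
training_ages = [34, 41, 29, 38, 45, 31, 36]
live_ages = [52, 58, 49, 61, 55]
if drift_detected(training_ages, live_ages):
    print("Input drift detected: trigger retraining or route the model for review")
```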
- Data Lakehouse
Organizations have developed two parallel data ecosystems over time. On one side is the data warehouse, forming the base of analytics use cases. The more recent data lake stores data in raw form and offers the flexibility, scale, and performance required for bespoke applications and more advanced data processing needs. Going forward we will see a convergence of these stacks, especially with the introduction of cloud data warehouses and the move of data lakes to the cloud. Modern cloud data warehouses and data lakes are starting to resemble one another: both offer commodity storage, native horizontal scaling, semi-structured data types, ACID transactions, interactive SQL queries, and so on. This will drive simplification of the technology and vendor landscape, and it will also drive partnerships amongst the giants in this space.
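The lakehouse idea in miniature: warehouse-style SQL run directly over open files sitting in the lake, rather than after loading them into a separate warehouse. The sketch below uses DuckDB only as one convenient, locally runnable example of such an engine; the file name and events table are illustrative.

```python
import duckdb

con = duckdb.connect()

# Write a small "lake" file in an open format (Parquet) ...
con.execute("""
    COPY (SELECT * FROM (VALUES
        ('2022-01-01', 'signup', 120.0),
        ('2022-01-01', 'purchase', 340.5),
        ('2022-01-02', 'purchase', 89.9)
    ) AS t(event_date, event_type, amount))
    TO 'events.parquet' (FORMAT PARQUET)
""")

# ... then run an interactive, warehouse-style SQL query straight against it.
result = con.execute("""
    SELECT event_date, SUM(amount) AS revenue
    FROM read_parquet('events.parquet')
    WHERE event_type = 'purchase'
    GROUP BY event_date
    ORDER BY event_date
""").fetchall()
print(result)
```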
In conclusion, organizations will have to surface insights in every enterprise application at every touchpoint for the end user. Enterprise users will demand not only insights but also recommendations for the next best action relevant to the specific activity or task they are performing. This will hold true for the simplest tasks, like applying for leave, through to complex ones, like pricing optimization.