The idea of integrating systems is nothing new. IT departments have constantly struggled to link enterprise applications together in a way that enables decision-makers to have the data they need at their fingertips. Ideally, this information should all be pulled into a single dashboard. But as enterprise IT is constantly evolving, each new application provides a new data source. There is often a lag between the time an application is deployed and the time IT can fully integrate it into an enterprise dashboard.
Today’s IT is exponentially more complex. There are the internal applications, which have remained a constant integration headache for CIOs. Then there are the cloud-based deployments and software as a service (SaaS), which means enterprise data now resides in two totally separate spheres. Add in connectivity to business partners and third-party sources, and it soon becomes clear that IT is facing an uphill struggle in dealing with numerous silos of enterprise data in a hybrid world.
Then there is the added complexity that arises with multicloud and hybrid cloud deployments. In fact, nearly half of the 1,700 UK IT decision-makers surveyed by Vanson Bourne in a study for Nutanix identified integrating data across different environments (49%) as their top challenge in multicloud deployment.
These disparate data sources need to be pulled into decision support systems. But moving beyond traditional business intelligence, data integration also has an essential role in advanced analytics, artificial intelligence (AI) and machine learning (ML).
“There is also the non-trivial matter of data governance in a hybrid world,” says Freeform Dynamics distinguished analyst Tony Lock, “especially one where cloud providers offer advanced machine learning and analysis tools that can operate on huge volumes of data coming from multiple sources. Any analysis that includes information from diverse data sources means you must have effective data governance in place.”
Humans require a lot of information to make sense of the world, so current more primitive computer algorithms need far more data, says computer expert Junade Ali.
While artificial intelligence (AI) and machine learning (ML) algorithms are getting ever better at doing more with less, we still often need to bring together data from multiple sources for them to produce results that make sense.
A challenge in ML is the ability of algorithms to understand causality. So far, much of what is done by AI algorithms is finding correlations between data points, as opposed to understanding causal relationships. Improving causal reasoning in AI offers the opportunity for us to do more with less when it comes to data. Microsoft Research is one organisation that has a group currently working on improving “causality in machine learning”, but there is still more work to be done.
Until such a time when we overcome these challenges in AI, data integration will remain an important part of ensuring we can give our constrained ML algorithms the data they need to provide meaningful outputs. It isn’t just about the volume of data, but also the dimensionality: ML algorithms need a full understanding of all data attributes to have a better chance of reaching the right conclusions. For this reason, before embarking on your AI revolution, you must ensure your ducks are in a row when it comes to your data.
Junade Ali is an experienced technologist with an interest in software engineering management, computer security research and distributed systems.
Beyond undermining data governance, poorly integrated data also leads to poor customer service. “In the digital economy, the customer expects you to know and have ready insight into every transaction and interaction they have had with the organisation,” says Tibco CIO Rani Johnson.