
Strengthening data infrastructure for a unified work process

The rush to remote work created multiple challenges for organizations around the world, and technology led the list. Business organizations were forced to scale IT infrastructure to support the sudden shift, resulting in a migration to cloud-based applications and solutions, a rush on hardware that could support a remote environment, and difficulties scaling VPNs to keep remote workers secure.

These technology concerns persist as the world gradually reopens during the “new normal” period, since the remote work setup is not going away. In fact, a Gartner survey of 317 chief financial officers (CFOs) and finance leaders found that 74 percent will move at least 5 percent of their previously on-site workforce to permanently remote positions post-COVID-19. Companies have the option to stay fully remote or to adopt a staggered workforce, in which part of the workforce continues to work from home.

As such, having the right technical infrastructure in place to support remote workers remains critical. Companies will need to keep upgrading their infrastructure to operate at scale while reducing expenses and minimizing performance and security issues. Cloud seems to be the best option, and for some organizations hybrid cloud is the answer: because the hybrid model supports both private and public clouds, they can continue to host data on-premises while gaining a combination of flexibility and security as they scale their IT infrastructure.

However, when it comes to data management in a remote or blended work setting, the diagnostics, resolution and optimization of data infrastructure have become challenging given the vast, dynamic and interconnected nature of the underlying resources. With larger and far more complex sources and datasets, how can the data infrastructure, especially at larger enterprises, cope with the challenges of employees working in remote or hybrid arrangements during these unprecedented times?

While every business organization wants to be more data-driven, the silos created by pandemic-driven work conditions have forced companies, particularly enterprises, to rethink their approach to data access. Previously, only a few elite data scientists were able to analyze complex data. Now, the goal is to empower anybody to use data at any time to make faster decisions, with no barriers to access or understanding and no gatekeepers creating a bottleneck in front of the data.

Data democratization seems to be the popular approach, since it is widely seen as the future of managing big data and realizing its value. It is crucial in allowing data to pass safely from the hands of a few analysts into the hands of the masses within a company. Businesses that arm all of their employees with the right tools, and with the knowledge needed to make smart decisions, are succeeding today and providing better customer experiences.

In essence, data democratization starts with breaking down information silos as the first step toward user empowerment. Ideally, the tools will filter the data and visualizations shared with each individual, allowing employees to visualize their data and align it with the organization's key performance indicators: the metrics, goals, targets, and objectives set from the top down that enable data-driven decisions.
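As a rough sketch of that idea, the short Python snippet below filters a shared dataset down to one team's view and checks it against a KPI target; the team names, fields and the 95 percent on-time target are illustrative assumptions, not a reference to any particular tool.

# Minimal sketch: filter a shared dataset per team and compare it to a KPI target.
# Team names, fields, and the 95 percent on-time KPI are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Order:
    region: str
    team: str
    revenue: float
    delivered_on_time: bool

ORDERS = [
    Order("APAC", "sales", 1200.0, True),
    Order("APAC", "sales", 800.0, False),
    Order("EMEA", "support", 450.0, True),
]

def team_view(orders, team):
    """Return only the records a given team should see."""
    return [o for o in orders if o.team == team]

def on_time_rate(orders):
    """KPI: share of orders delivered on time."""
    return sum(o.delivered_on_time for o in orders) / len(orders) if orders else 0.0

sales_orders = team_view(ORDERS, "sales")
kpi = on_time_rate(sales_orders)
print(f"Sales on-time rate: {kpi:.0%} (target: 95%)")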

With the right visualization and analytics tools in place, training the team becomes the next essential step. Since data democratization depends on self-service analytics, every team member must be trained to a minimum level of comfort with the tools, concepts, and processes involved in order to participate.

Lastly, you cannot have a democracy without checks and balances, so the final step in sharing data across your teams is data governance. Mismanagement or misinterpretation of data is a real concern, which is why a center of excellence is recommended to keep the use of data on the straight and narrow. Its goal should be to drive adoption of data by owning data accuracy, curation, sharing, and training. These teams are most successful when they have a budget, a cross-section of skill sets, and executive backing.

The remote or blended working setup has also left business organizations with multiple data sources that are often siloed in different tools owned by different teams. Like missing pieces of a jigsaw, siloed data makes it impossible to create a single source of truth (SSOT), a concept used to ensure that everyone in the organization bases business decisions on the same data. Sales tools don’t easily integrate web or product analytics data. Marketing tools don’t easily integrate subscription data. What you need is a complete, unified profile of every person and company that has ever interacted with your brand.
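To make the idea concrete, here is a minimal Python sketch of stitching siloed records into one profile per contact; the source names (CRM, web analytics, billing), the fields and the email join key are illustrative assumptions, not a prescription for any specific platform.

# Minimal sketch of building one unified profile per contact from siloed sources.
# Source names, fields, and the email join key are illustrative assumptions.

crm_records = [{"email": "ana@example.com", "company": "Acme", "owner": "sales"}]
web_analytics = [{"email": "ana@example.com", "pages_viewed": 14}]
billing = [{"email": "ana@example.com", "plan": "pro", "mrr": 99.0}]

def unify(*sources):
    """Merge records from every source into a single profile keyed by email."""
    profiles = {}
    for source in sources:
        for record in source:
            profile = profiles.setdefault(record["email"], {})
            profile.update(record)  # later sources add or overwrite fields
    return profiles

single_source_of_truth = unify(crm_records, web_analytics, billing)
print(single_source_of_truth["ana@example.com"])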

As a result, workflows slow down when a company does not invest in a single source of truth for its business intelligence: all too often, people's day-to-day work is held up because they lack confident access to business data. Decision making suffers as well, since an organization should be driving its decisions with as much data as is available. Where there is uncertainty about the validity of the data, organizations are often unable to answer questions that should be relatively trivial, though important.

That’s why deployment of an SSOT architecture is becoming increasingly important in enterprise settings, where incorrectly linked duplicate or denormalized data elements (a direct consequence of intentional or unintentional denormalization of the explicit data model) pose a risk of retrieving outdated, and therefore incorrect, information.
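A toy Python example shows why duplicated, denormalized fields drift out of date while a normalized reference to the single source of truth stays current; the table and field names are made up purely for illustration.

# Toy illustration of why duplicated (denormalized) fields go stale.
# Table and field names are invented for the example.

customers = {"c1": {"name": "Acme", "address": "1 Old Street"}}

# Denormalized copy: the address is duplicated onto the order at write time.
orders_denormalized = [{"order_id": "o1", "customer_id": "c1",
                        "ship_to": customers["c1"]["address"]}]

# Normalized reference: the order stores only the customer id.
orders_normalized = [{"order_id": "o1", "customer_id": "c1"}]

# The customer moves; only the single source of truth is updated.
customers["c1"]["address"] = "9 New Avenue"

print(orders_denormalized[0]["ship_to"])                          # stale: "1 Old Street"
print(customers[orders_normalized[0]["customer_id"]]["address"])  # current: "9 New Avenue"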

As companies rush headlong into data democratization as an answer to remote or hybrid arrangements, bias in data analytics becomes a major issue, and steps should be taken to reduce it, including by developing better algorithms.

A chief cause of wrong decisions, bias in data analytics can arise either because the humans collecting the data are biased or because the data collected is biased. When we say data is biased, we mean that the sample is not representative of the entire population. And the more people in a business organization who tap into data analysis, the more biases will creep in as they collect and analyze responses that favor their own research or project. In this regard, data scientists and citizen data scientists should be open to all kinds of viewpoints, which ultimately helps them make better decisions.
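A small Python sketch illustrates the point: an estimate computed from a sample drawn from only one customer segment misses the population average, while a representative sample comes close. The spending figures and the segment split are invented for the example.

# Minimal sketch of sampling bias: a sample drawn only from one segment
# misestimates a population average. All numbers are illustrative assumptions.

import random

random.seed(0)

# Population: 80 percent of customers spend around 50, 20 percent spend around 200.
population = [random.gauss(50, 5) for _ in range(8000)] + \
             [random.gauss(200, 20) for _ in range(2000)]

representative_sample = random.sample(population, 500)
biased_sample = random.sample(population[:8000], 500)  # drawn only from the low spenders

mean = lambda xs: sum(xs) / len(xs)
print(f"Population mean:       {mean(population):.1f}")
print(f"Representative sample: {mean(representative_sample):.1f}")
print(f"Biased sample:         {mean(biased_sample):.1f}")  # understates spending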

Using deep analysis of data to help with decision making is a good idea, but it can also backfire if the data is biased. Machine learning algorithms do precisely what they are taught to do and are only as good as their mathematical construction and the data they are trained on. Algorithms trained on biased data will end up doing things that reflect that bias.

But bias in data analytics can be avoided by framing the right questions, which allow respondents to answer without any external influences, and by constantly improving algorithms.

The writer is general manager of Asia Pacific and Japan at TIBCO Software Inc., an independent provider of infrastructure software creating event-enabled enterprises
