
The Data Daily

The Future of Business Intelligence is Open Source | 7wData


While “software is [still actively] eating the world”, it’s also clear that open source is taking over software.

Simply put, open source is a superior approach to building and distributing software because it provides important guarantees around how software can be discovered, tried, operated, collaborated on and packaged. For those reasons, it is not surprising that it has taken over most of the modern data stack: infrastructure, databases, orchestration, data processing, AI/ML and beyond.

Looking back, the main reason I originally created both Apache Airflow and Apache Superset while I was at Airbnb in 2014-17 is that the vendors in the data space were failing to:

As is often the case with open source, the capacity to integrate and to extend was always at the core of how we approached the architecture of those two projects.

More specifically for Superset, the main driver for starting the project was that Tableau (our main data visualization solution at the time) couldn’t connect natively to Apache Druid and Trino / Presto, our data engines of choice, which provided the properties and guarantees we needed to satisfy our data use cases.

With Tableau’s “Live Mode” misbehaving in intricate ways at the time (I won’t get into this!), we were steered towards using Tableau Extracts. Extracts crumbled under the data volumes we had at Airbnb, creating a whole lot of challenges around non-additive metrics (think distinct user counts) and forcing us to intricately pre-compute multiple “grouping sets”, which broke some of the Tableau paradigms and confused users. Secondarily, we had a limited number of Tableau licenses, and generally had an order of magnitude more employees who wanted or needed access than our contract allowed. That’s without mentioning that, for a cloud-native company, Tableau’s Windows-centric approach at the time didn’t work well for the team.
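To see why non-additive metrics such as distinct user counts resist pre-aggregation, here is a minimal Python sketch. The visitor names and days are made up purely for illustration and are not taken from any Airbnb dataset.

```python
# Hypothetical daily visitor sets; names and days are illustrative only.
daily_visitors = {
    "mon": {"alice", "bob"},
    "tue": {"bob", "carol"},
    "wed": {"alice", "carol", "dave"},
}

# Naive roll-up: add the pre-computed daily distinct counts together.
sum_of_daily_counts = sum(len(users) for users in daily_visitors.values())

# Correct roll-up: count distinct users across the whole period.
distinct_over_period = len(set().union(*daily_visitors.values()))

print(sum_of_daily_counts)   # 7, because returning users are counted once per day
print(distinct_over_period)  # 4, the actual number of distinct users
```

An extract that stores only the daily counts cannot recover the period-level figure, which is why each level of aggregation ends up being pre-computed as its own grouping set.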

Some of the premises mentioned above have since changed, but the power of open source and the core principles on which it’s built have only grown. In this blog post, I will explain why the future of business intelligence is open source.

If I could use only a single word to describe why the time is right for organizations to adopt open source BI, the word would be freedom. Flowing from the principle of freedom come a few more concrete superpowers for an organization:

Airbnb wanted to integrate in-house tools like Dataportal and Minerva with a dashboarding tool to enable democratization of data within their organization. Because Superset is open source and Airbnb actively contributes to the project, they were able to supercharge Superset with in-house components with relative ease.

On the visualization side, organizations like Nielsen are creating new visualizations and deploying them in their Superset environments. They’re going a step further by empowering their engineers to contribute to the customizability and extensibility of Superset. The Superset platform is now flexible enough that anyone can build their own custom visualization plugins, a benefit that is unmatched in the marketplace.

Within the wider community, many report using the rich REST API that ships with Superset, which gives them full programmatic control over all aspects of the platform. Given that pretty much everything a user can do in Superset can be done through the API, the sky is the limit when it comes to automating processes in and around Superset.
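As a rough illustration of that kind of automation, the sketch below builds two requests with Python's standard library: one to log in and one to list dashboards. The endpoint paths (`/api/v1/security/login`, `/api/v1/dashboard/`) and the login payload reflect my understanding of Superset's documented REST API and should be checked against the API docs served by your own instance. The base URL, credentials, and helper names are hypothetical.

```python
import json
import urllib.request


def build_login_request(base_url, username, password):
    """Build (but do not send) a login request for a Superset instance."""
    payload = {
        "username": username,
        "password": password,
        "provider": "db",   # assumption: database-backed authentication
        "refresh": True,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/security/login",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_list_dashboards_request(base_url, access_token):
    """Build (but do not send) a request that lists dashboards."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/dashboard/",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Against a running instance, one would call `send(build_login_request(...))`, read the access token from the response, and pass it to `build_list_dashboards_request`. Keeping request construction separate from sending makes the calls easy to inspect without a live server.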

On the topic of integration, members of the Superset community have added support for over 30 databases (and growing!) by submitting code and documentation contributions.
