Adventures from Kathmandu to San Francisco by Sandeep Giri: November 2005

BI software products, open source as well as commercial, have mainly focused on the developers/SIs building BI applications, as opposed to the end-user of those applications. The end-user's problem is pretty straightforward: "I am sitting on a mountain of data accumulated over various data silos in my organization, and I need help making sense of it; I need a tool that helps me derive actionable insights". The end-user doesn't want to write code or SQL/MDX queries, and doesn't want to mess with complex regression models or machine-learning data mining models. They want a tool that has some notion of the *business* questions they ask, and can help analyze their existing data in the context of those questions -- hence the term *business* intelligence.

The "I" in BI should be about intelligence, not infrastructure. Most current BI technology talk focuses solely on the "infrastructure" components - database, OLAP, workflow, data mining, reporting, etc. - but just having those components in one stack doesn't make you intelligent. You still need data models optimized for your specific types of analyses, you still need to find the appropriate data mining and statistical models for your problem space, and you still need the right visualizations, both to interactively analyze your data and to publish your analyses in such a way that mere mortals can understand and use those insights to make business decisions. *This* is the real "I" in BI -- Intelligence, not Infrastructure!

Open source BI platforms are going to be key in highlighting this differentiation because open source by its very nature commoditizes the infrastructure of BI -- pushing the proverbial "value up the stack" to components that actually produce intelligence.

At my work, we bet on this very phenomenon: we utilize an open source BI infrastructure, and build domain-specific BI applications on top of it. Since we started doing this a few years ago, we didn't have the advantage of choosing between Pentaho, JasperSoft, or BIRT -- so we did our best by building our own platform, integrating existing open source BI components, some of which, like Mondrian and JPivot, are also being utilized by Pentaho, et al. For us, this was a pure R&D initiative to leverage an open source development model, and in the spirit of collaboration, we even published it on SourceForge as OpenI (http://openi.sourceforge.net). We hope that as Pentaho, JasperSoft, etc. start releasing complete versions of their platforms, we will find a way to collaborate with their stacks. But for what it's worth -- you can download and use OpenI today as a BI application, just like we use it in production to serve our clients. In fact, I am keenly interested in community feedback on OpenI to understand the best ways to collaborate with the other open source BI projects.

The key is that open source BI needs to first define itself as a respectable, reliable alternative to the commercial BI vendors. Then we can bicker about what's free, and what's commercial. The infrastructure should be completely open, on top of which you can have commercial, domain-specific BI applications. But first, Pentaho, JasperSoft, BIRT, and OpenI -- all of us need to collaborate and put an open source BI stack out there that can hold its own in comparison to commercial BI vendors. Open source doesn't work like enterprise software, where one company/organization needs to own the entire space. And even if one wanted to, it couldn't, because successful open source is all about building a community that spreads far beyond the confines of your organization. We are fortunate to have several noteworthy projects and components in the open source BI space -- it is time for these projects to collaborate, present open source BI as a unified front, and start growing the community. That's how open source BI is going to become real.

Friday, November 04, 2005

The I in BI is about Intelligence, not Infrastructure