Once you've addressed storing and visualizing your data assets, the next step in harnessing the power of big data is blending operational data with data from other sources. Pentaho is seeking to smooth the path for organizations looking to do so with its new Business Analytics 5.0 platform.
When the big data rubber hits the road, it's about more than just storing massive amounts of data or even analyzing and visualizing a single stream. Gaining true insight from your data assets generally requires blending operational data and data from other big data sources together. Business analytics platform vendor Pentaho is striving to make that process easier than ever.
"True 'big picture' insights happen when operational data sources are blended with big data sources," says Quentin Gallivan, CEO of Pentaho. "Companies that compete largely on service, in industries like telecommunications and financial services, see big data blending's potential to help them gain market-share by providing the most personalized and interactive customer experience."
This week, Pentaho unveiled Pentaho Business Analytics 5.0, a complete redesign and overhaul of its data integration and analytics platform that addresses data blending from the ground up and offers a new interface intended to simplify the user experience.
"What we're seeing from our base is the need to make data more valuable by blending it with other data sources to provide insight," says Rosanne Saccone, CMO of Pentaho. "Customers want to blend data not just at the glass and the desktop, but at the source."
For instance, a telco might want to blend machine data about dropped calls with data from its data warehouse identifying its most valuable customers and their service level agreements (SLAs). The telco could then proactively target valuable customers who are not receiving agreed-upon service levels with promotions and discounts.
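The telco scenario can be made concrete with a small Python sketch. It is illustrative only and is not Pentaho code: the record layouts, the field names (`customer_id`, `dropped`, `tier`, `sla_max_drop_rate`) and the function `customers_below_sla` are all assumptions invented for this example.

```python
from collections import Counter

# Hypothetical machine data: one record per call attempt.
call_events = [
    {"customer_id": "c1", "dropped": True},
    {"customer_id": "c1", "dropped": False},
    {"customer_id": "c1", "dropped": True},
    {"customer_id": "c1", "dropped": False},
    {"customer_id": "c2", "dropped": False},
    {"customer_id": "c2", "dropped": False},
    {"customer_id": "c2", "dropped": False},
    {"customer_id": "c3", "dropped": True},
]

# Hypothetical warehouse data: customer value tier and SLA terms.
customers = {
    "c1": {"tier": "gold", "sla_max_drop_rate": 0.05},
    "c2": {"tier": "gold", "sla_max_drop_rate": 0.05},
    "c3": {"tier": "bronze", "sla_max_drop_rate": 0.20},
}


def customers_below_sla(call_events, customers, tiers=("gold",)):
    """Blend call events with warehouse rows and return the
    (customer_id, drop_rate) pairs that exceed the agreed drop rate."""
    total = Counter()
    dropped = Counter()
    for event in call_events:
        total[event["customer_id"]] += 1
        if event["dropped"]:
            dropped[event["customer_id"]] += 1

    breaches = []
    for customer_id, attempts in total.items():
        info = customers.get(customer_id)
        # Skip customers missing from the warehouse or outside the target tiers.
        if info is None or info["tier"] not in tiers:
            continue
        rate = dropped[customer_id] / attempts
        if rate > info["sla_max_drop_rate"]:
            breaches.append((customer_id, rate))
    return sorted(breaches)


targets = customers_below_sla(call_events, customers)
```

Here `targets` holds the high-value customers whose observed drop rate breaches their SLA, which is the group the telco would approach with promotions or discounts.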
Blending Can Be a Significant Data Integration Challenge
As Matt Casters, Pentaho's chief of data integration, notes, data blending allows a data integration user to create a transformation capable of delivering data directly to other business analytics tools. Traditionally, data is delivered to these tools via a relational database. But that becomes challenging when dealing with massive volumes of data or when you just don't have the time to wait until database tables are updated.
Addressing this issue often leads to hugely complex big data architectures with many moving parts: Hadoop clusters, NoSQL and traditional RDBMS technologies, ETL tools, data marts, traditional BI tools and more.
Bringing it all together and giving users the capability to blend data with varying levels of data quality and granularity can be a significant challenge.
"The main problem we faced early on was that the default language used under the covers, in just about any business intelligence user facing tool, is SQL," Casters explains. "At first glance, it seems that the worlds of data integration and SQL are not compatible."
Casters says that DI requires reading from a multitude of data sources, such as databases, spreadsheets, NoSQL and big data sources, XML and JSON files, web services and more.
"However, SQL itself is a mini-ETL environment on its own as it selects, filters, counts and aggregates data," he says. "So we figured that it might be easiest if we would translate the SQL used by the various BI tools into Pentaho Data Integration transformations. This way, Pentaho Data Integration is doing what it does best, not directed by manually designated transformations but by SQL."
"In other words: We made it possible for you to create a virtual 'database' with 'tables' where the data actually comes from a transformation step," he adds.
Pentaho Business Analytics 5.0 blends data "at the source," which Saccone says maintains the appropriate level of data governance and security necessary for accurate and reliable analysis. The more commonly used method of end user blending "away from the source" lacks the ability to audit and cannot ensure correct inferences from the data, she says.
Pentaho's platform also avoids the need to stage the data before blending, which often leads to out-of-date data sets.
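The virtual "table" idea Casters describes can be illustrated from the point of view of a BI tool with a short Python sketch. It is only an analogy and does not use Pentaho's API. It also swaps in a simpler technique: instead of translating SQL into transformation steps, it copies the output of a transformation step into an in-memory SQLite table and queries that. The names `blend_step`, `expose_as_table` and `blended_calls` are hypothetical.

```python
import sqlite3


def blend_step():
    """Hypothetical final transformation step: yields blended rows of
    (customer_id, tier, dropped_calls)."""
    dropped_calls = [("c1", 3), ("c2", 0), ("c3", 5)]        # machine data
    tiers = {"c1": "gold", "c2": "gold", "c3": "bronze"}     # warehouse data
    for customer_id, dropped in dropped_calls:
        yield customer_id, tiers[customer_id], dropped


def expose_as_table(conn, name, columns, step):
    """Make the rows produced by a step queryable as a SQL table.
    Assumption: table and column names are trusted identifiers."""
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE {name} ({column_list})")
    conn.executemany(f"INSERT INTO {name} VALUES ({placeholders})", step())


conn = sqlite3.connect(":memory:")
expose_as_table(conn, "blended_calls", ["customer_id", "tier", "dropped"], blend_step)

# A BI tool would issue ordinary SQL against the "table".
rows = conn.execute(
    "SELECT tier, SUM(dropped) FROM blended_calls GROUP BY tier ORDER BY tier"
).fetchall()
```

The query returns one row per tier with its total dropped calls. The point of the sketch is that the consumer sees only SQL, while the rows originate in the transformation rather than in a staged database table.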
New Features of Pentaho Business Analytics 5.0
Other features of the new platform include the following:
Pentaho has also added to the ease of data integration with a host of features, including these:
Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com.
Copyright 2009 IDG Magazines Norge AS. All rights reserved