Pentaho Data Integration

Pentaho was founded in 2004 as an open source business intelligence vendor and was acquired by Hitachi in 2015. Hitachi Vantara, the Hitachi subsidiary to which Pentaho now belongs, is headquartered in Santa Clara, CA and has 143 office locations across 47 countries.

Pentaho is best known for its business intelligence solution, Pentaho Business Analytics, and was one of the first open source software suppliers to become active in the business intelligence/data integration space. The software is still available in two versions: a community (open source) edition and an enterprise edition, which is licensed via a subscription model. Despite the acquisition by Hitachi, Pentaho has announced that it intends to continue with both models.

Pentaho’s data integration product was originally marketed under the name Kettle and is essentially an ETL (extract, transform and load) tool, although partners provide some other data integration functionality. The platform is quite open and can be extended for development and administration with third-party tools, existing tools or custom programming. It is also tightly integrated with the Pentaho BI platform, which provides analytical capabilities (e.g. OLAP modeling).

It is difficult to determine how widely the product is actually used, although there have been many thousands of downloads.

Since the acquisition, it appears that Pentaho is focusing more on the Big Data and Internet of Things areas of the integration business. While this is clearly important for the future, it would be useful if the product had more functionality in the areas of metadata and versioning, and some more native interfaces (not just JDBC) to legacy data.

The integration offering is known as Pentaho Data Integration (PDI) and various “partner” products are available for areas like data quality.

Users & Use Cases

Besides data integration (87 percent), customers mainly use Pentaho Data Integration for data warehouse automation (80 percent) and data marts (60 percent). A further 40 percent of respondents use it for data quality management, 27 percent for data preparation and 20 percent for enterprise data warehousing. The tool is used especially by mid-sized companies: 60 percent of our sample comes from mid-sized companies (101 to 2,500 employees).

Respondents report that only around 5 people, or 8 percent of the employees in the company, use the tool. This is quite a low number and shows that it is a specialized tool for data integration developers.

Figures: Current use; Percentage of employees using Pentaho Data Integration; Number of users using Pentaho Data Integration; Company size (number of employees)

Pentaho Data Integration

Peer groups: Data warehousing automation products, ETL products, Global vendors (data management)
Vendor: Pentaho
Number of responses: 30
Product: Pentaho Data Integration
Offices: 143 worldwide
Employees: 307,275 (Hitachi 2019)
Customers: n/a
Revenues (2018): n/a
Website: www.pentaho.com