Collibra Data Catalog
Collibra is a technology provider headquartered in Brussels and New York, with offices in seven countries. It is one of the big players in the data governance market. Founded in 2008 as a spin-off of the University of Brussels, the company currently has 800 employees and operates worldwide.
The technology is not focused on any particular industry, so customers come from a wide variety of sectors. Collibra offers a data governance platform enhanced by other solutions. It also cooperates with various technology and service partners and provides educational services. In 2019, Collibra became the first Belgian unicorn company, and it raised $112.5 million in its latest round of funding (series F) in April 2020. The company plans to invest this funding in enabling data intelligence and improving the quality of data-driven business decisions. Overall, Collibra has raised a total of $346 million since 2015.
In 2016, Collibra launched its first data catalog tool and has since branched out into data privacy, data lineage and data quality. In February 2021, Collibra acquired owlDQ, thus extending its portfolio with predictive data quality. Collibra is available as SaaS and is built through continuous delivery in monthly releases (quarterly for on-premises products), so version names and numbers are not used. Collibra is using special offers to encourage its on-premises customers to migrate to the cloud because maintenance for on-premises products is scheduled to end in 2022 and support in 2023. The software is aimed mainly at business users but also serves technical users. It runs on all three major cloud providers (AWS, Microsoft Azure and Google Cloud).
The Collibra platform provides integrations for more than 45 different data sources via certified JDBC drivers, direct custom integrations and API-driven solutions. It also supports the ingestion of metadata and lineage from SQL dialects, ETL tools and BI tools. While Collibra does not natively support the ingestion of lineage from languages such as Python and Java, customers can analyze their own code and load the resulting metadata or lineage into the data catalog via APIs. Customers and partners can use the REST APIs to build their own integrations and connectors. Collibra supports several interchange formats (JSON, XLS and CSV) for exporting asset pages, table views and various prebuilt reports.
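To illustrate the idea of analyzing code outside the catalog and handing the result to an API, the following is a minimal sketch that uses Python's standard `ast` module to find file reads and writes in a script and turn them into coarse, script-level lineage edges. The function names (`extract_lineage`, `to_payload`), the sets of recognized calls and the JSON layout are all assumptions made for this example. The payload is not Collibra's import schema, and the actual REST call is left out because its details are not described here.

```python
import ast
import json

# Hypothetical heuristics: call names treated as reads or writes in this sketch.
READ_CALLS = {"read_csv", "read_parquet"}
WRITE_CALLS = {"to_csv", "to_parquet"}

# Illustrative script to analyze; the file paths are made up for the example.
EXAMPLE = '''
import pandas as pd
orders = pd.read_csv("raw/orders.csv")
customers = pd.read_csv("raw/customers.csv")
orders.merge(customers, on="customer_id").to_csv("curated/orders_enriched.csv")
'''


def _call_name(node: ast.Call) -> str:
    """Return the trailing name of a call, e.g. 'read_csv' for pd.read_csv(...)."""
    func = node.func
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return ""


def _first_string_arg(node: ast.Call):
    """Return the first positional argument if it is a string literal, else None."""
    if node.args:
        first = node.args[0]
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            return first.value
    return None


def extract_lineage(source_code: str) -> dict:
    """Collect read and write paths and link every source to every target."""
    sources, targets = set(), set()
    for node in ast.walk(ast.parse(source_code)):
        if not isinstance(node, ast.Call):
            continue
        path = _first_string_arg(node)
        if path is None:
            continue
        name = _call_name(node)
        if name in READ_CALLS:
            sources.add(path)
        elif name in WRITE_CALLS:
            targets.add(path)
    # Coarse assumption: each input of the script feeds each of its outputs.
    edges = [
        {"source": s, "target": t}
        for s in sorted(sources)
        for t in sorted(targets)
    ]
    return {"edges": edges}


def to_payload(lineage: dict, script_name: str) -> str:
    """Serialize the lineage as JSON (hypothetical layout, not a vendor schema)."""
    return json.dumps({"process": script_name, **lineage}, indent=2)


if __name__ == "__main__":
    print(to_payload(extract_lineage(EXAMPLE), "enrich_orders.py"))
```

In a real integration, the resulting edges would be mapped to the catalog's own asset and relation types before being sent through the REST API. Only string-literal paths are detected in this sketch, so dynamically built paths would need a more thorough analysis.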
For data governance, the software allows for different roles and responsibilities with a well-defined and customizable role-based security concept. In general, the tool is customizable, and the metadata storage is open for changes to any element in the repository (e.g., data asset types, attributes and relationship types). These changes can be made in a business-friendly, no-code manner. Workflow design ranges from simple approvals to complex operations that include integrations with external systems. In addition, Collibra offers customizable stewardship workflows, validation rules, views, dashboards, notifications and alerts. For consumption, Collibra provides a business glossary, advanced search algorithms, full-text search functions and visual end-to-end data lineage (business and technical) with impact reports, among other features.
User & Use Cases
All of the companies surveyed use Collibra Data Catalog as a data governance solution. Only half use it for data stewardship / data quality management and for data discovery, but this share will probably increase in the future due to the acquisition of owlDQ. Thirteen percent use Collibra for data virtualization, data warehousing / BI and data storage / provisioning. This is surprising because Collibra Data Catalog interfaces with data repositories and does not directly store operational data. It is a purely metadata-based tool for cataloging, monitoring and data quality monitoring.
However, this may also mean that customers who report these use cases use Collibra to model and coordinate the data sources involved and their data. It is positive to note that 25 percent use Collibra as a holistic data platform, which means that users have access to, or at least get information about, all the company’s data. Collibra is used by a median of 60 users, which is three times the overall survey average. Since nearly two-thirds of its customers are large companies, this relatively small user base suggests that the tool is used predominantly by data experts. The mean of 306 users, well above the median, shows that some implementations serve much larger user groups. The average of 16 administrators per implementation is very high in relation to the mean of 306 users (roughly one administrator for every 19 users). However, it may be that some survey participants counted content supervisors as well as genuine technical administrators here.
[Figure: Total number of users per company]
[Figure: Total number of administrators per company]
[Figure: Company size (number of employees)]
BARC’s Vendor Performance Summary contains an overview of The Data Management Survey results based on feedback from Collibra Data Catalog users, accompanied by expert analyst commentary.