Data lakes are a highly debated concept, yet they still lack a clear definition. Many view them as places to store structured, semi-structured and unstructured data without a predefined schema and close to its raw format. The structure comes with usage, at the point when the data is needed.
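The idea that structure comes with usage is often called schema-on-read. The following is a minimal sketch of that idea in Python, not a reference implementation: the sample events, the field names and the helper `read_with_schema` are all hypothetical and chosen only for illustration. Raw records are stored unchanged, and each consumer applies its own schema when it reads them.

```python
import json

# Hypothetical raw events, stored as-is (one JSON document per line).
# Fields and types vary between records, as they would in a raw store.
RAW_EVENTS = [
    '{"user": "a", "amount": "12.50", "ts": "2017-03-01"}',
    '{"user": "b", "amount": 7, "note": "no timestamp"}',
    '{"user": "a", "amount": "3.25", "ts": "2017-03-02", "channel": "web"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time (hypothetical helper).

    `schema` maps field names to cast functions. Fields that are not in
    the schema are ignored; fields missing from a record become None.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: (cast(record[field]) if field in record else None)
            for field, cast in schema.items()
        }


# One consumer only needs revenue per user and reads amounts as floats.
revenue_schema = {"user": str, "amount": float}
totals = {}
for row in read_with_schema(RAW_EVENTS, revenue_schema):
    totals[row["user"]] = totals.get(row["user"], 0.0) + row["amount"]

# Another consumer reads the same raw data with a different schema.
activity_schema = {"user": str, "ts": str}
activity = list(read_with_schema(RAW_EVENTS, activity_schema))
```

Both consumers work from the same untouched records. Neither schema had to exist when the data was written, which is the trade-off this view of data lakes implies: flexibility at write time, interpretation effort at read time.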

This definition, however, still leaves many unanswered questions:

  • Is the term “data lake” a synonym for Hadoop and big data technologies or is it a collection of all data storage concepts available within a company?

  • Is a data lake physical storage or a logical concept?

  • Which governance requirements apply to data lakes?

Regardless of the answers to these questions, the concept is relevant: 47 percent of users worldwide confirm the benefits of data lakes. However, 35 percent of respondents still view the data lake as a new term for an old concept or as a pure marketing term, and 13 percent consider the data lake concept irrelevant.

User opinion about Data Lakes (n=384)

In a regional comparison, over 20 percent of respondents in North America view the concept as a prerequisite for a data-driven company (compared to 13 percent in Europe).

In both Europe (46 percent supporters, 37 percent opponents) and North America (42 percent supporters, 38 percent opponents), respondents are split into two factions. On the whole, there still appears to be uncertainty about the benefits of data lakes. The concept's unclear definition and lofty promises from vendors can make a realistic assessment even harder.

