Amazon Redshift

Amazon Web Services (AWS) is a subsidiary of Amazon. It offers cloud-based services across the world on a metered pay-as-you-go basis. The foundation of today’s AWS Data & Analytics Services was laid in 2006 with a relaunch of the AWS platform, initially featuring the Amazon S3 (Amazon Simple Storage Service) and EC2 (Elastic Compute Cloud) services, which are still essential components of the platform today.

Amazon now offers a set of 175 cloud-based products worldwide for data processing, storage, analysis and more. Its partner landscape is extensive, as is its customer base. Amazon currently operates 77 Availability Zones (each a logical network of data centers) in 24 regions around the world.

The services are closely integrated with each other. Amazon Redshift is a relational, massively parallel processing (MPP) data warehouse based on PostgreSQL, with columnar storage and OLAP functionality. It integrates seamlessly with data lakes built on Amazon S3 and thereby extends the S3 object storage. Because Amazon Web Services offers its various services as building blocks, Amazon Redshift can also integrate with other AWS services, for example the machine learning service Amazon SageMaker.

Data can be loaded from S3 into Amazon Redshift, where it is prepared, stored and queried in a form optimized for BI/analytics workloads. It can also be unloaded back to S3 to be consumed from there. Amazon Redshift Spectrum (a Redshift feature) enables queries that combine data in Redshift with data in S3 (the data lake), as well as direct queries on S3. Spectrum supports queries on file formats such as CSV, Parquet, Avro and JSON, which avoids unnecessary data copies and data movement.
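As a rough illustration of the Spectrum pattern described above, the following Python sketch only assembles the SQL text involved: an external schema that points at a data catalog database, and a query that joins an S3-backed table with a local Redshift table. The schema, database, table and column names and the IAM role ARN are hypothetical placeholders, and the statements are not executed here; they would be run through whatever SQL client is used against the cluster.

```python
def external_schema_sql(schema: str, catalog_db: str, iam_role: str) -> str:
    """Build a CREATE EXTERNAL SCHEMA statement for Redshift Spectrum.

    All argument values are supplied by the caller; nothing is validated
    against a real cluster or data catalog.
    """
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role}';"
    )


def combined_query_sql(external_schema: str) -> str:
    """Build a query joining S3-backed data with a local Redshift table.

    Assumes a hypothetical external table `sales` (files in S3) and a
    hypothetical local table `public.customers` (stored in Redshift).
    """
    return (
        "SELECT c.customer_name, SUM(s.amount) AS total_amount\n"
        f"FROM {external_schema}.sales AS s\n"
        "JOIN public.customers AS c ON c.customer_id = s.customer_id\n"
        "GROUP BY c.customer_name;"
    )


if __name__ == "__main__":
    # Placeholder role ARN and names, for illustration only.
    role = "arn:aws:iam::123456789012:role/ExampleSpectrumRole"
    print(external_schema_sql("spectrum_demo", "example_lake_db", role))
    print()
    print(combined_query_sql("spectrum_demo"))
```

The point of the sketch is that the S3-resident data is referenced through a schema name like any other table, so the join does not require copying the files into Redshift first.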

This is an essential feature in AWS’s quest to build a modern data & analytics architecture that is not designed for just one specific use case but is open, flexible and scalable. Amazon Redshift RA3 compute instances now also include a hardware-based Advanced Query Accelerator (AQUA), which is designed to provide a significant performance boost for queries. To load data into Amazon Redshift, users can employ the COPY command, use AWS’s own ETL services, or choose one of several third-party cloud ETL services from the broader AWS partner landscape.
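To make the load and unload path more concrete, here is a minimal Python sketch that builds the text of a COPY statement (S3 into Redshift) and an UNLOAD statement (Redshift back to S3). The table name, bucket paths and IAM role ARN are hypothetical, the CSV variant assumes each file has one header row, and the sketch only produces SQL strings; running them against a cluster is left to the SQL client of choice.

```python
# Format clauses for the file formats named in the text.
# The CSV entry assumes a single header row (IGNOREHEADER 1).
_FORMAT_CLAUSES = {
    "csv": "FORMAT AS CSV IGNOREHEADER 1",
    "parquet": "FORMAT AS PARQUET",
    "json": "FORMAT AS JSON 'auto'",
    "avro": "FORMAT AS AVRO 'auto'",
}


def copy_sql(table: str, s3_uri: str, iam_role: str,
             file_format: str = "parquet") -> str:
    """Build a COPY statement that loads files under s3_uri into table."""
    if not s3_uri.startswith("s3://"):
        raise ValueError(f"expected an s3:// URI, got {s3_uri!r}")
    try:
        format_clause = _FORMAT_CLAUSES[file_format.lower()]
    except KeyError:
        raise ValueError(f"unsupported file format: {file_format!r}") from None
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"{format_clause};"
    )


def unload_sql(query: str, s3_prefix: str, iam_role: str) -> str:
    """Build an UNLOAD statement that writes a query result to S3 as Parquet."""
    if not s3_prefix.startswith("s3://"):
        raise ValueError(f"expected an s3:// URI, got {s3_prefix!r}")
    # UNLOAD takes the query as a quoted string, so single quotes inside
    # the query are doubled.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS PARQUET;"
    )


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    role = "arn:aws:iam::123456789012:role/ExampleLoadRole"
    print(copy_sql("sales", "s3://example-bucket/sales/", role, "csv"))
    print()
    print(unload_sql("SELECT * FROM sales WHERE region = 'EU'",
                     "s3://example-bucket/exports/sales_eu_", role))
```

Keeping the statements as generated text makes it easy to review them before execution; a managed or third-party ETL service would typically handle the same step behind its own interface.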

AWS offers many data and analytics services alongside Amazon Redshift, so functional overlaps exist. However, each service is tailored to a particular use case. The company believes in mapping services closely to use cases and giving users the flexibility to choose the service that fits their needs best.

Amazon EMR is a flexible service that makes it simple and cost-effective to run Hadoop, Spark and Presto. Amazon Athena is a standalone service that provides data exploration and ad-hoc query capabilities on data lakes, geospatial data and service logs without the need to set up or manage any servers. Amazon Redshift, meanwhile, is built for complex SQL processing for reporting and business intelligence.

User & Use Cases

Amazon Redshift is predominantly used for data warehousing and BI, for data integration and as a data platform. In addition, 43 percent of respondents say they use the service for data pipeline creation, data preparation, data stewardship/data quality management or data discovery. This is a surprisingly high share for Amazon Redshift, which does not directly cover these tasks itself.

It is interesting to note that Redshift is mostly used by small companies (43 percent). This may be because many large and medium-sized companies already have strategic partnerships in place with software providers and established EDWH systems. In addition, migration to the cloud is an ongoing trend that many companies have yet to complete. Small (and above all, ‘new’) companies tend to avoid large infrastructure investments and use flexible, demand-oriented cloud offerings.

[Chart: Current use (n=21)]

[Chart: Number of users using Amazon Redshift (n=21)]

[Chart: Number of technical users using Amazon Redshift (n=18)]

[Chart: Company size (number of employees) (n=21)]

Want to see the whole picture?

BARC’s Vendor Performance Summary contains an overview of The Data Management Survey results based on feedback from Amazon Redshift users, accompanied by expert analyst commentary.


Amazon Redshift

Peer Groups: Data warehouse technologies, Global vendors (data management)
Vendor: Amazon Web Services
Number of responses: 21
Product: Amazon Redshift
Offices: Worldwide
Employees: 25,000
Customers: More than 1 million
Revenues (2019): $35.03 billion
Website: aws.amazon.com