By Brooks Patterson, Product Marketing Manager – RudderStack
By Kiran Singh, Sr. Partner Solutions Architect, Databases – AWS
By Sai Prakhya Tata, Solutions Architect, Customer Experience – AWS
By Harmeet Nandrey, Sr. Partner Solutions Architect, MarTech – AWS


Consumers today have higher expectations than ever and are also more informed. Winning new customers and keeping existing customers happy requires thoughtful, tailored engagement along a strong end-to-end journey.

Data provides the raw material required to deliver such journeys and is key for brands that want to stand out, but deriving value from data takes work. Companies have traditionally looked to legacy customer data platforms (CDPs) to help them get value from their customer data.

These CDPs created customer profiles with data from various sources and enabled activation for a limited set of marketing use cases. However, they were closed systems, owned and used almost exclusively by marketing departments, and they ended up creating yet another data silo.

Now, companies are turning to cloud data warehouses as the foundation of their CDPs. This approach shifts the center of gravity for customer data management back to the data team, and frees business teams to focus on driving growth. It delivers many benefits to the data team including flexibility, control, data privacy and governance, and cost savings. Most importantly, it makes it possible for organizations to eliminate data silos and establish a single source of truth for their customer data in the warehouse.

RudderStack is an AWS Specialization Partner and Warehouse Native CDP that was founded upon these convictions: that the data warehouse should be the foundation of the CDP, and that the CDP should be owned by the data team.

RudderStack is also an AWS Marketplace Seller that holds the Amazon Redshift service ready designation. In this post, we’ll look at how RudderStack and Redshift pair to create a robust, flexible customer data platform that enables data teams to partner with their businesses on projects that create efficiencies and drive revenue.

Amazon Redshift as Single Source of Truth for Customer Data

Amazon Redshift delivers the best price-performance for cloud data warehousing, with performance innovations available out of the box. It provides secure, reliable, easily scalable storage for all of your customer data, so you don’t have to worry about infrastructure management. That makes it a strong foundation for your CDP.

Traditional CDPs create data silos that result in burgeoning vendor costs and lack of visibility of the full customer journey. Plus, they’re built specifically for non-technical users, so they’re not conducive to advanced use cases that require collaboration across technical teams and tools.

In-house builds simply aren’t viable for most companies because of the immense lift required to build and maintain the systems.

Companies that want to create efficiencies and drive value across the entire data activation lifecycle require a warehouse-centric solution that provides a connected infrastructure capable of building a true customer 360. This is what RudderStack calls the Warehouse Native CDP and it’s where Amazon Redshift shines as a secure, scalable source of truth for every customer data point.

Customer Data Platform Approaches

Let’s compare the two prevailing customer data platform approaches, legacy software-as-a-service (SaaS) CDPs and in-house builds, with RudderStack’s Warehouse Native CDP.

| | Warehouse Native CDP | In-House Build | Legacy SaaS CDP |
|---|---|---|---|
| Data storage | Your infrastructure | Your infrastructure | Partner infrastructure |
| System flexibility | Flexible | Flexible | Rigid |
| Maintenance requirements | Low | High | Low |
| Data integration | Partner provided | Must be custom-built | Vendor provided |
| Vendor lock-in | No | No | Yes |
| Feature development | Vendor provided | Must be custom-built | Partner provided |
| Supported use cases | Marketing, product, sales, customer success, AI/ML, analytics | Marketing, product, sales, customer success, AI/ML, analytics | Marketing |

Legacy SaaS CDP

CDPs emerged in the early 2010s as a way to aggregate data from data silos into one place. While they succeeded in building a more comprehensive customer view, these systems were black boxes that ultimately created another data silo, and they were largely useful only for marketing use cases.

Today’s traditional CDPs are more flexible than their predecessors. However, they’re fundamentally limited by their closed architectures and are still made primarily for marketing users. They don’t expose data in a way that’s conducive to more sophisticated use cases like user journeys, attribution, machine learning (ML) models for churn prediction, and product recommendations.

It’s worth noting that even legacy CDP vendors are recognizing the paradigm shift towards the data warehouse and working, post factum, to add data warehouse support to their existing systems.

In-House Build

Faced with inflexible and limiting off-the-shelf solutions, many companies opt to build CDP capabilities in-house. While this option can seem like a good one, most companies get overwhelmed by the cost and magnitude of these projects, not to mention the ongoing maintenance.

As growth accelerates, data volume grows, integration requirements expand, privacy regulations become harder to meet, and error handling gets increasingly complex. Building an internal system at scale can take years, so the in-house build isn’t a viable option for most companies.

Warehouse Native CDP

RudderStack’s approach to the customer data platform is a packaged platform that runs directly on the data warehouse. It solves the data silo problem by building around the warehouse, and it deploys the integration, real-time transformation, unification, and activation layers as a connected, governable, and observable end-to-end system.

Because the Warehouse Native CDP is an end-to-end system, you don’t have to invest time and money building infrastructure or bridging the gaps created by siloed legacy CDPs. Moreover, you still get full control over both pipelines and the modeling of customer profiles in your own warehouse.

The combination of RudderStack and Amazon Redshift supports the entire data activation lifecycle:

  • Collect: Collect streaming behavioral data via RudderStack’s event stream pipeline and batch data from cloud tools, and send both directly to Amazon Redshift.
  • Unify: Automate identity stitching and build a configurable identity graph in Redshift with RudderStack Profiles.
  • Activate: Send complete customer profiles from Redshift to the tools your business teams use via RudderStack’s reverse extract, transform, load (ETL) pipeline.
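
To make the Unify step more concrete, here is a minimal sketch of identity stitching as a union-find over identifiers. It is an illustration only, not how RudderStack Profiles is implemented: the `IdentityGraph` class, its method names, and the `anon:`/`user:`/`email:` identifier prefixes are all hypothetical.

```python
from collections import defaultdict


class IdentityGraph:
    """Hypothetical union-find over identifiers (anonymous IDs, user IDs, emails)."""

    def __init__(self):
        self._parent = {}

    def _find(self, node):
        # Register unseen identifiers as their own single-member cluster.
        self._parent.setdefault(node, node)
        # Walk up to the root, shortening the path along the way.
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]
            node = self._parent[node]
        return node

    def link(self, a, b):
        """Record that two identifiers were observed together on one event."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            # Keep the smaller root so the result is deterministic.
            keep, merge = sorted((root_a, root_b))
            self._parent[merge] = keep

    def profiles(self):
        """Group every known identifier under its stitched profile."""
        clusters = defaultdict(set)
        for node in list(self._parent):
            clusters[self._find(node)].add(node)
        return sorted(sorted(members) for members in clusters.values())


graph = IdentityGraph()
graph.link("anon:a1", "user:42")                # login on device A
graph.link("user:42", "email:pat@example.com")  # email captured at signup
graph.link("anon:b7", "email:pat@example.com")  # newsletter click on device B
graph.link("anon:c9", "user:77")                # an unrelated visitor
for profile in graph.profiles():
    print(profile)
```

In this sketch, the two anonymous IDs, the user ID, and the email collapse into one profile because each pair shares an identifier with another pair, while the unrelated visitor stays separate. In a warehouse the same idea would typically be expressed over tables rather than in application memory.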


Figure 1 – Data activation lifecycle using Amazon Redshift.

Solution Overview

With a powerful platform like RudderStack enabling data collection, unification, and activation on top of Amazon Redshift, you can make it easy for every team to use customer data as they contribute to a cohesive, end-to-end customer experience.

You can also drive value from the get-go and work towards more sophisticated use cases as you grow and scale:

  • Simplify first-party data collection: RudderStack provides a single, high-performance software development kit (SDK) for data collection, highly scalable event stream pipelines, and hundreds of integrations.
  • Complete customer journey analytics: With data from every customer touchpoint collected and centralized in Amazon Redshift, data analysts can build rich end-to-end customer journeys.
  • Real-time personalization: Using RudderStack, Amazon Redshift, and Amazon ElastiCache, you can build a system to enable behavioral and profile-based personalization for in-app or on-site product and content recommendations.

For a detailed breakdown of the architecture you can use to build a real-time personalization engine with RudderStack, Amazon Redshift, and an in-memory store, check out this technical walkthrough.
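
As a rough illustration of the personalization idea, the sketch below keeps a per-user behavioral profile in a plain Python dictionary that stands in for an in-memory store such as Amazon ElastiCache. The event shape is a simplified assumption, and the catalog, function names, and "most-viewed category" rule are hypothetical, chosen only to show the flow from event to recommendation.

```python
from collections import Counter, defaultdict

# Stand-in for an in-memory store: user ID -> counts of viewed categories.
profile_cache = defaultdict(Counter)

# Hypothetical catalog; in practice this would come from your own product data.
CATALOG = {
    "running": ["trail shoes", "running socks"],
    "cycling": ["bike helmet", "cycling gloves"],
}


def record_event(user_id, event):
    """Update the cached behavioral profile when a tracked event arrives."""
    if event["event"] == "Product Viewed":
        profile_cache[user_id][event["properties"]["category"]] += 1


def recommend(user_id, limit=2):
    """Return items from the category the user has viewed most, if any."""
    views = profile_cache.get(user_id)
    if not views:
        return []
    top_category, _ = views.most_common(1)[0]
    return CATALOG.get(top_category, [])[:limit]


for category in ("running", "cycling", "running"):
    record_event("u1", {"event": "Product Viewed", "properties": {"category": category}})

print(recommend("u1"))
```

A production system would replace the dictionary with a networked cache and would likely blend warehouse-computed profile traits with these live counters; the sketch only shows the read and write paths.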

Customer Success Story: HealthMatch

HealthMatch is a digital health company that accelerates medical research by quickly matching patients with clinical trials. HealthMatch’s goal is to connect researchers with the largest possible pool of qualified patients for each trial as efficiently as possible. When they succeed, clinical trials happen faster, and speeding up trials saves lives.

Before implementing RudderStack, HealthMatch’s rudimentary data stack wasn’t up to the task of filtering event and user data from various sources and routing it to trigger-based marketing tools to match patients to trials. HealthMatch had an Amazon Redshift data warehouse in place, but it lacked the surrounding infrastructure to make the most of this investment.

To match patients faster, the company implemented RudderStack alongside a customer engagement tool. Event data from RudderStack’s event stream pipeline now triggers match notification messages in that tool, which go out at the optimal times based on time zone. While these messages wait in the queue, the tool’s SQL sync feature checks Redshift for the latest data every 15 minutes to ensure every email is personalized with up-to-date information.
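
The time-zone-based delivery described above can be sketched as follows. This is a simplified illustration under stated assumptions: the source does not say how the optimal time is chosen, so the 9:00 local send hour, the `next_send_time` function, and the use of a fixed UTC offset (which ignores daylight saving time) are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def next_send_time(now_utc, utc_offset_hours, send_hour=9):
    """Return the next occurrence of send_hour in the recipient's local time, as UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    local_now = now_utc.astimezone(local_tz)
    candidate = local_now.replace(hour=send_hour, minute=0, second=0, microsecond=0)
    # If today's send window has already passed locally, wait until tomorrow.
    if candidate <= local_now:
        candidate += timedelta(days=1)
    return candidate.astimezone(timezone.utc)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(next_send_time(now, utc_offset_hours=10))  # recipient at UTC+10
print(next_send_time(now, utc_offset_hours=-5))  # recipient at UTC-5
```

While a message waits for its send time, a periodic sync (every 15 minutes in the setup described above) has the chance to refresh the personalization data before delivery.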

HealthMatch also uses RudderStack to send event data from all of its sites and applications directly to Amazon Redshift, where it’s combined with data from the production database. This new data flow enables HealthMatch to stitch data from its production database together with event data to unlock full user journey analytics.
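
The kind of stitching described here amounts to joining event rows with production rows on a shared user key. The sketch below uses Python's built-in `sqlite3` as a stand-in for the warehouse; the table names, columns, and sample rows are hypothetical and are not HealthMatch's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the warehouse
conn.executescript("""
CREATE TABLE users (user_id TEXT PRIMARY KEY, signup_source TEXT);
CREATE TABLE events (user_id TEXT, event TEXT, sent_at TEXT);
INSERT INTO users VALUES ('u1', 'search'), ('u2', 'referral');
INSERT INTO events VALUES
  ('u1', 'Page Viewed',    '2024-01-01T09:00:00'),
  ('u1', 'Form Submitted', '2024-01-01T09:05:00'),
  ('u2', 'Page Viewed',    '2024-01-02T10:00:00');
""")

# Join production data (users) with behavioral data (events), ordered in time.
JOURNEY_SQL = """
SELECT u.user_id, u.signup_source, e.event, e.sent_at
FROM users AS u
JOIN events AS e ON e.user_id = u.user_id
ORDER BY u.user_id, e.sent_at
"""

# Collapse the joined rows into one ordered journey per user.
journeys = {}
for user_id, source, event, sent_at in conn.execute(JOURNEY_SQL):
    journeys.setdefault((user_id, source), []).append(event)

for key, steps in journeys.items():
    print(key, "->", steps)
```

Because both datasets live in the same warehouse, the join needs no export or copy step, which is the practical benefit of landing event data next to production data.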

HealthMatch also uses RudderStack’s reverse ETL pipelines to push relevant data from Redshift to its entire technology stack, including its customer engagement tool, Amplitude, and HubSpot.


Figure 2 – HealthMatch data flow using RudderStack and Amazon Redshift.

Because of RudderStack’s Warehouse Native CDP approach, which is HIPAA-compliant, HealthMatch’s data flows through RudderStack directly to Amazon Redshift’s secure warehouse service for storage. As a result, HealthMatch was able to solve these problems with a fully HIPAA-compliant stack.


Conclusion

When you build your customer data platform (CDP) on top of the warehouse with Amazon Redshift and RudderStack, you can derive full value from your customer data.

Amazon Redshift provides secure, scalable storage, and RudderStack delivers the tools you need to equip every team in your business with the customer data they need to drive engagement along a cohesive end-to-end customer journey.

Visit RudderStack’s website to explore the Warehouse Native CDP. You can also learn more about RudderStack on AWS Marketplace.


RudderStack – AWS Partner Spotlight

RudderStack is an AWS Specialization Partner and warehouse-native customer data platform that helps companies to get value from their warehouse or data lake.
