Loading Data

Amazon Redshift

Load OpenSnowcat event data into Amazon Redshift with high reliability.

The easiest way to get high-volume, schema-validated behavioral data into Redshift — without managing infrastructure or rigid schemas.

Unlike traditional loaders that rely on rigid column mappings, the SnowcatCloud Loader stores event payloads in a Redshift SUPER column (schema_data), preserving the full JSON structure. This enables powerful querying via PartiQL, while eliminating the pain of schema migrations.
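For example, a hedged sketch of what querying a SUPER payload can look like — the table name `atomic.com_acme_checkout_1` and the fields inside `schema_data` are hypothetical, for illustration only:

```sql
-- Navigate into the JSON payload stored in the schema_data SUPER column
-- using Redshift's PartiQL dot/bracket notation.
SELECT
  schema_data.user_id,          -- top-level JSON field
  schema_data.items[0].sku      -- first element of a nested array
FROM atomic.com_acme_checkout_1
WHERE schema_data.total > 100;
```

Because the payload is navigated at query time, adding a new field to the event schema requires no ALTER TABLE — the new field is simply available under `schema_data`.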

Designed for scale, built for simplicity — it’s the fastest way to make event data Redshift-ready without managing column-level transformations.

Features

  • Per-Schema Table Isolation
    Each Snowplow schema gets its own table for clean data separation and simplified querying.
  • Micro-Batch Loading
    Efficient and cost-aware — events are loaded in small batches, reducing pressure on Redshift.
  • Schema-Less Evolution
    JSON payloads are stored in SUPER columns; no need to manage or migrate table columns.
  • Buffer Control
    Fine-tune how often data is flushed into Redshift, optimizing for latency vs. cost.
  • Reliable Error Handling
    Automatic retries and easy debugging on failed loads — no manual cleanup required.
  • Native Monitoring
    Integrated with CloudWatch Metrics & Logs for full observability.

Loader Configuration

Define how and where data is buffered and loaded using a simple configuration format:

output {
  service: "s3",
  endpoint: "s3://snowplow-data/",
  compression: "GZIP",
  format: "OPENSNOWCAT_TSV"
}
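Once events are loaded, nested arrays inside the SUPER column can be unnested with PartiQL's extended FROM clause. The table and field names below are hypothetical placeholders, not part of the loader's output contract:

```sql
-- Unnest an array stored inside schema_data: each element of
-- schema_data.items becomes its own row, aliased as item.
SELECT
  e.schema_data.event_id,
  item.sku,
  item.quantity
FROM atomic.com_acme_checkout_1 AS e,
     e.schema_data.items AS item;
```

This pattern is the usual way to flatten SUPER arrays in Redshift without materializing separate child tables.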