Loading Data
Amazon Redshift
Load OpenSnowcat event data into Amazon Redshift with high reliability.
The easiest way to get high-volume, schema-validated behavioral data into Redshift — without managing infrastructure or rigid schemas.
Unlike traditional loaders that rely on rigid column mappings, the SnowcatCloud Loader stores event payloads in a Redshift SUPER column (schema_data), preserving the full JSON structure. This enables powerful querying via PartiQL while eliminating the pain of schema migrations.
Designed for scale, built for simplicity — it’s the fastest way to make event data Redshift-ready without managing column-level transformations.
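As a rough illustration of the kind of PartiQL query this enables, the sketch below reads nested fields straight out of the schema_data SUPER column. Only the schema_data column comes from the description above; the table name (atomic.com_acme_checkout_1) and the field names are hypothetical.

-- Hypothetical per-schema table and field names, for illustration only.
-- Dot notation navigates the JSON stored in the SUPER column directly,
-- so no staging columns or JSON parsing functions are needed.
SELECT
    e.schema_data.cart_id     AS cart_id,
    e.schema_data.total_value AS total_value
FROM atomic.com_acme_checkout_1 AS e
WHERE e.schema_data.currency = 'USD'
LIMIT 100;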
Features
- Per-Schema Table Isolation: Each Snowplow schema gets its own table for clean data separation and simplified querying.
- Micro-Batch Loading: Efficient and cost-aware — events are loaded in small batches, reducing pressure on Redshift.
- Schema-Less Evolution: JSON payloads are stored in SUPER columns; no need to manage or migrate table columns (see the query sketch after this list).
- Buffer Control: Fine-tune how often data is flushed into Redshift, optimizing for latency vs. cost.
- Reliable Error Handling: Automatic retries and easy debugging on failed loads — no manual cleanup required.
- Native Monitoring: Integrated with CloudWatch Metrics & Logs for full observability.
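To show what schema-less evolution can look like in practice, the sketch below queries a nested array and a field that could have been added to the event payload after the table was created; because everything lives in the schema_data SUPER column, no ALTER TABLE is needed. As in the previous sketch, the table and field names are hypothetical.

-- Hypothetical per-schema table and field names, for illustration only.
-- The products array is unnested in the FROM clause via PartiQL navigation,
-- and promo_code could be a field added to the payload later, with no
-- column migration on the Redshift side.
SELECT
    e.schema_data.order_id AS order_id,
    p.sku                  AS sku,
    p.quantity             AS quantity
FROM atomic.com_acme_purchase_1 AS e,
     e.schema_data.products AS p
WHERE e.schema_data.promo_code IS NOT NULL;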
Loader Configuration
Define how and where data is buffered and loaded using a simple configuration format:
output {
  service: "s3",
  endpoint: "s3://snowplow-data/",
  compression: "GZIP",
  format: "OPENSNOWCAT_TSV"
}