Loading Data

Amazon Redshift

Load OpenSnowcat event data into Amazon Redshift with high reliability.

The easiest way to get high-volume, schema-validated behavioral data into Redshift — without managing infrastructure or rigid schemas.

Unlike traditional loaders that rely on rigid column mappings, the SnowcatCloud Loader stores event payloads in a Redshift SUPER column (schema_data), preserving the full JSON structure. This enables powerful querying via PartiQL, while eliminating the pain of schema migrations.
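For example, a hedged sketch of what querying a SUPER payload can look like — the table name `atomic.com_acme_checkout_1` and the fields inside `schema_data` are hypothetical, for illustration only:

```sql
-- Navigate into the JSON payload stored in the schema_data SUPER column
-- using Redshift's PartiQL dot/bracket notation.
SELECT
  schema_data.user_id,          -- top-level JSON field
  schema_data.items[0].sku      -- first element of a nested array
FROM atomic.com_acme_checkout_1
WHERE schema_data.total > 100;
```

Because the payload is navigated at query time, adding a new field to the event schema requires no ALTER TABLE — the new field is simply available under `schema_data`.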

Designed for scale, built for simplicity — it’s the fastest way to make event data Redshift-ready without managing column-level transformations.

Features

  • Per-Schema Table Isolation
    Each Snowplow schema gets its own table for clean data separation and simplified querying.
  • Micro-Batch Loading
    Efficient and cost-aware — events are loaded in small batches, reducing pressure on Redshift.
  • Schema-Less Evolution
    JSON payloads are stored in SUPER columns; no need to manage or migrate table columns.
  • Buffer Control
    Fine-tune how often data is flushed into Redshift, optimizing for latency vs. cost.
  • Reliable Error Handling
    Automatic retries and easy debugging on failed loads — no manual cleanup required.
  • Native Monitoring
    Integrated with CloudWatch Metrics & Logs for full observability.

Loader Configuration

Define how and where data is buffered and loaded using a simple configuration format:

output {
  service: "s3",
  endpoint: "s3://snowplow-data/",
  compression: "GZIP",
  format: "OPENSNOWCAT_TSV"
}
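Once events are loaded, nested arrays inside the SUPER column can be unnested with PartiQL's extended FROM clause. The table and field names below are hypothetical placeholders, not part of the loader's output contract:

```sql
-- Unnest an array stored inside schema_data: each element of
-- schema_data.items becomes its own row, aliased as item.
SELECT
  e.schema_data.event_id,
  item.sku,
  item.quantity
FROM atomic.com_acme_checkout_1 AS e,
     e.schema_data.items AS item;
```

This pattern is the usual way to flatten SUPER arrays in Redshift without materializing separate child tables.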