Loading Data

Kafka

Stream OpenSnowcat event data into Kafka.

A high-performance, schema-validated event pipeline into Apache Kafka — built for scale, speed, and resilience.

The SnowcatCloud Loader delivers enriched behavioral data to Kafka topics in real time, with minimal latency and high throughput. Events are schema-validated and published in enriched JSON or TSV format, ready for processing by any downstream Kafka consumer. With robust error handling and configurable batching, it’s ideal for integrating behavioral data into modern streaming architectures.

Compatible with self-hosted Kafka, ❤️ WarpStream, Confluent Cloud, or any managed Kafka provider — all with no ingestion logic or infrastructure to maintain.
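
Once events land on the topic, any standard Kafka consumer can read them. Below is a minimal sketch, assuming the JSON output format, the confluent-kafka Python client, and the broker and topic names from the configuration example further down; the flattened JSON key names (event_id, app_id) are assumptions based on Snowplow's canonical event fields, so adjust them to the payload you actually receive.

import json

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "broker1:9092,broker2:9092",  # brokers from the loader config
    "group.id": "enriched-readers",                     # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["enriched-good"])  # topic the loader writes to

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        # Each record value is one enriched event, serialized as flattened JSON.
        event = json.loads(msg.value())
        print(event.get("event_id"), event.get("app_id"))
finally:
    consumer.close()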

Features

  • Real-Time Streaming
    Stream enriched behavioral data to your Kafka cluster in real time with minimal latency.
  • Flexible Output Format
    Choose between Snowplow’s standard TSV and flattened enriched JSON output for downstream compatibility (a parsing sketch follows this list).
  • Batch Control
    Configure flush intervals to balance throughput and delivery guarantees.
  • Reliable Error Handling
    Automatic retries and intelligent failure recovery ensure no data is lost.
  • Native Monitoring
    Fully integrated with CloudWatch Metrics and Logs for observability and alerting.
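
As a companion to the output-format feature above, here is a minimal Python parsing sketch for the TSV option. The field positions are assumed from the leading columns of Snowplow's enriched event TSV layout and should be verified against the version your pipeline emits.

# Leading columns of the Snowplow enriched event TSV layout
# (assumed positions, verify against your pipeline version).
ENRICHED_TSV_FIELDS = {
    "app_id": 0,
    "collector_tstamp": 3,
    "event": 5,
    "event_id": 6,
}

def parse_enriched_tsv(line: str) -> dict:
    """Extract a handful of named fields from one enriched TSV record."""
    columns = line.rstrip("\n").split("\t")
    return {
        name: columns[index]
        for name, index in ENRICHED_TSV_FIELDS.items()
        if index < len(columns)
    }

# Typical usage: parse_enriched_tsv(msg.value().decode("utf-8")) on a record
# consumed from the enriched-good topic when the loader format is "TSV".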

Loader Configuration

Define how and where data is streamed using a simple, declarative configuration block:

output {
  service: "kafka", 
  brokers: ["broker1:9092", "broker2:9092"],
  topic: "enriched-good"
  format: "JSON" // OR TSV
},
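
Before enabling the loader, it can help to confirm that the configured brokers are reachable and the target topic exists. Here is a small sketch using the confluent-kafka Python AdminClient, reusing the broker and topic values from the block above:

from confluent_kafka.admin import AdminClient

# Brokers and topic mirror the loader configuration above.
admin = AdminClient({"bootstrap.servers": "broker1:9092,broker2:9092"})
metadata = admin.list_topics(timeout=10)

if "enriched-good" in metadata.topics:
    print("Topic 'enriched-good' is visible from the configured brokers.")
else:
    print("Topic 'enriched-good' was not found; create it or check ACLs before enabling the loader.")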