Bento

Simplify OpenSnowcat / Snowplow event routing with Bento.

SnowcatCloud, Inc. developed a Bento processor that simplifies processing and routing OpenSnowcat / Snowplow enriched TSV events. We ❤️ Bento!

You can test Bento using the OpenSnowcat DevKit.

What is Bento?

Bento is a declarative data streaming service that solves a wide range of data engineering problems with simple, chained, stateless processing steps. It is designed to be easy to use, easy to deploy, and easy to scale.
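
To make "declarative" and "chained" concrete, here is a minimal sketch of a Bento config that is not specific to OpenSnowcat. It reads lines from stdin, passes each one through two stateless mapping steps in order, and prints the result. The file name and the transformations are illustrative assumptions only.

```yaml
# Minimal illustrative pipeline: stdin -> two chained processors -> stdout
input:
  stdin: {}

pipeline:
  processors:
    # Step 1: uppercase the raw message content.
    - mapping: root = content().uppercase()
    # Step 2: prefix the output of step 1; each step only sees the current message.
    - mapping: root = "processed: " + content().string()

output:
  stdout: {}
```

Saved as, say, config.yaml, this should run with `bento -c config.yaml`. Each processor receives the output of the one before it, and no state is carried between messages.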

What can you do with Bento & OpenSnowcat / Snowplow?

Bento can process and route events from 60+ inputs to 38+ outputs, with built-in retry mechanisms, dead-letter queues, and backoff strategies. For example, you can:

  • Move data from one stream to another
  • Convert Enrich TSV to JSON
  • Filter out unwanted events (e.g. bots, internal traffic)
  • Anonymize PII data (e.g. IP address, user ID)
  • Hash sensitive data (e.g. email, user ID)
  • Redact sensitive data (e.g. phone number, name)
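
As a rough sketch of the filtering item in the list above, the fragment below converts enriched TSV to JSON with the opensnowcat processor and then drops bot traffic with a generic Bloblang mapping. It assumes the JSON output exposes the canonical `useragent` field at the top level. The "bot" substring check is a deliberately naive placeholder, not a recommended detection rule.

```yaml
pipeline:
  processors:
    # Convert enriched TSV to JSON first, as in the cookbooks.
    - opensnowcat:
        output_format: json
    # Drop the event when the user agent mentions "bot" (case-insensitive).
    # Assumption: `useragent` is a top-level key; a missing value is treated as "".
    - mapping: |
        root = if this.useragent.or("").lowercase().contains("bot") { deleted() } else { this }
```

Assigning `deleted()` to `root` removes the message from the stream, so it never reaches the output. The processor's own `filters` option, shown in the anonymization cookbook, is an alternative when the drop rule can be expressed as a field match.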

Bento OpenSnowcat / Snowplow Cookbooks

Use these cookbooks to deploy Bento with OpenSnowcat / Snowplow. Try them out using the OpenSnowcat DevKit.

# Bento pipeline: enriched-good  ->  enriched-good-json
input:
  kafka_franz:
    seed_brokers: ["warp:9092"]
    topics: ["enriched-good"]
    consumer_group: "pipeline:enriched-good-json"

pipeline:
  processors:
    - opensnowcat:
        output_format: json
                         
output:
  kafka_franz:
    seed_brokers: ["warp:9092"]
    topic: "enriched-good-json"
    batching:
      count: 10

# Bento pipeline: enriched-good  ->  elasticsearch
input:
  kafka_franz:
    seed_brokers: ["warp:9092"]
    topics: ["enriched-good"]
    consumer_group: "pipeline:enriched-good:elastic"

pipeline:
  processors:
    - opensnowcat:
        output_format: json
      
output:
  # See docs for config options: https://warpstreamlabs.github.io/bento/docs/components/outputs/elasticsearch_v2
  elasticsearch_v2:
    urls: [""] # No default (required)
    basic_auth:
      enabled: true
      username: "" # No default (required)
      password: "" # No default (required)
    index: "good" # No default (required)
    id: ${!count("elastic_ids")}-${!timestamp_unix()}
    discover_nodes_on_start: false
    discover_nodes_interval: 0s
    max_in_flight: 64
    batching:
      count: 5
    retry_on_status:
      - 502
      - 503
      - 504

# Bento pipeline: PII & IP Anonymization & Bot Filtering
# Bento pipeline: enriched-good  ->  enriched-good-anonymized
input:
  kafka_franz:
    seed_brokers: ["warp:9092"]
    topics: ["enriched-good"]
    consumer_group: "pipeline:enriched-good-anonymized"

pipeline:
  processors:
    - opensnowcat:
        output_format: json
        filters:
          drop:
            nl.basjes.yauaa_context.agentClass: 
              contains: ["Bot"]
          transform:
            salt: "your-secret-salt-here"
            hash_algo: SHA-256
            fields:
              user_id:
                strategy: hash
              user_ipaddress:
                strategy: anonymize_ip
                anon_octets: 2
                anon_segments: 3
              network_userid:
                strategy: redact
                redact_value: "[REDACTED]"   
                         
output:
  kafka_franz:
    seed_brokers: ["warp:9092"]
    topic: "enriched-good-anonymized"
    batching:
      count: 10
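
The cookbook outputs above write straight to their destination. One way to get the retry, backoff, and dead-letter behaviour mentioned earlier is to wrap an output in Bento's `retry` and `fallback` outputs. This is a sketch under stated assumptions: the dead-letter topic name and the retry limits are hypothetical values, to be tuned for your setup.

```yaml
output:
  fallback:
    # First choice: the normal destination, retried with exponential backoff.
    - retry:
        max_retries: 5
        backoff:
          initial_interval: 500ms
          max_interval: 10s
        output:
          kafka_franz:
            seed_brokers: ["warp:9092"]
            topic: "enriched-good-anonymized"
            batching:
              count: 10
    # Last resort: park the event on a dead-letter topic (hypothetical name).
    - kafka_franz:
        seed_brokers: ["warp:9092"]
        topic: "enriched-good-anonymized-dlq"
```

With this layout, an event reaches the dead-letter topic only after the primary output has exhausted its retries. This keeps failed events available for inspection and replay without blocking the rest of the stream.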

How does it compare?

| Feature | Bento | Snowplow's Snowbridge |
| --- | --- | --- |
| Open Source | ✅ MIT | ❌ Proprietary SLULA |
| Retry Mechanism | ✅ "At-least-once" Delivery, DLQ, Backoff Retries | ⚠️ Beta, just retries |
| Stateless Architecture | ✅ Stateless | ✅ Stateless |
| Pricing | Free - Infrastructure cost only | Not Free - Infrastructure cost + license |
| Vendor Lock-in | None - Deploy anywhere | YES - SLULA 1.1 |
| Built-in Inputs | 60+ - Stdin, Kafka, S3, Kinesis, Azure, PubSub, etc. | 5 - Stdin, Kafka, Kinesis, PubSub, SQS |
| Built-in Outputs | 38+ - HTTP, Kafka, S3, PostgreSQL, etc. | 6 - EventHub, HTTP, Kafka, Kinesis, PubSub, SQS |
| Data Transformations | ✅ Native processors, Bloblang & Enrich TSV processor | ✅ Built-in & JavaScript |
| Configuration | YAML - Declarative | Config File - HCL format |
| Deployment Options | ✅ Any infrastructure | ✅ Any infrastructure |
| Community & Support | Active - Large community | Active - Snowplow community + Commercial |