AWS Firehose
Using AWS Firehose to load data to data warehouses, data lakes, and analytics services.
Amazon Data Firehose offers a straightforward way to acquire, transform, and deliver data streams to data lakes, data warehouses, and analytics services without running and managing a Snowplow RDB Loader.
Destination S3
Choose source and destination
- Source: Amazon Kinesis Data Streams
- Destination: Amazon S3
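If you prefer to script this setup rather than click through the console, the same choices map onto the Firehose CreateDeliveryStream API. The snippets below are a minimal sketch using boto3, assuming the us-west-2 region from the stream ARN; they illustrate the API shape rather than any Snowplow-provided tooling.

```python
import boto3

# The console choices map to CreateDeliveryStream fields:
#   Source: Amazon Kinesis Data Streams -> DeliveryStreamType="KinesisStreamAsSource"
#   Destination: Amazon S3              -> ExtendedS3DestinationConfiguration
firehose = boto3.client("firehose", region_name="us-west-2")
```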
Source settings
- Kinesis data stream: arn:aws:kinesis:us-west-2:xxxxxxxxxxxxx:stream/enriched-good-events
- Firehose stream name: KDS-S3-TngAd
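In the API, these source settings become a KinesisStreamSourceConfiguration. The account ID and IAM role below are hypothetical placeholders; the role must allow Firehose to read from the stream.

```python
# Hypothetical account ID and role name; substitute your own values.
KINESIS_STREAM_ARN = "arn:aws:kinesis:us-west-2:111111111111:stream/enriched-good-events"
FIREHOSE_ROLE_ARN = "arn:aws:iam::111111111111:role/firehose-delivery-role"

kinesis_source_config = {
    "KinesisStreamARN": KINESIS_STREAM_ARN,
    "RoleARN": FIREHOSE_ROLE_ARN,  # role Firehose assumes to read the stream
}
```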
Destination settings
- S3 bucket: s3://enriched-json
- New line delimiter: Enabled
- Dynamic partitioning (optional): Enabled. Dynamic partitioning enables you to create targeted data sets by partitioning streaming S3 data based on partitioning keys.
- Multi record deaggregation: Enabled
- Multi record deaggregation type: JSON
- Inline parsing for JSON: Enabled, with JQ expression {app_id: .app_id} (required for the partitionKeyFromQuery key used in the prefix below)
- S3 bucket prefix: good/!{partitionKeyFromQuery:app_id}/ (this partitions the data by app_id under the good prefix)
- S3 bucket error output prefix: error= (optional)
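These destination settings correspond to an ExtendedS3DestinationConfiguration. The deaggregation, inline JQ parsing, and new line delimiter options are each expressed as a processor; the bucket ARN is derived from the bucket name above, and the role ARN reuses the placeholder defined earlier (it must allow writes to this bucket).

```python
s3_destination_config = {
    "RoleARN": FIREHOSE_ROLE_ARN,
    "BucketARN": "arn:aws:s3:::enriched-json",
    # Partition objects by app_id under good/; failed records land under error=.
    "Prefix": "good/!{partitionKeyFromQuery:app_id}/",
    "ErrorOutputPrefix": "error=",
    "DynamicPartitioningConfiguration": {"Enabled": True},
    "ProcessingConfiguration": {
        "Enabled": True,
        "Processors": [
            # Multi record deaggregation, type JSON
            {
                "Type": "RecordDeAggregation",
                "Parameters": [
                    {"ParameterName": "SubRecordType", "ParameterValue": "JSON"},
                ],
            },
            # Inline parsing for JSON: extract app_id for partitionKeyFromQuery
            {
                "Type": "MetadataExtraction",
                "Parameters": [
                    {"ParameterName": "MetadataExtractionQuery",
                     "ParameterValue": "{app_id: .app_id}"},
                    {"ParameterName": "JsonParsingEngine",
                     "ParameterValue": "JQ-1.6"},
                ],
            },
            # New line delimiter: append "\n" after each delivered record
            {
                "Type": "AppendDelimiterToRecord",
                "Parameters": [
                    {"ParameterName": "Delimiter", "ParameterValue": "\\n"},
                ],
            },
        ],
    },
}
```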
S3 buffer hints
- Buffer size: 128 MiB
- Buffer interval: 300 seconds
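In the API these hints are a BufferingHints block; Firehose flushes buffered data to S3 when either threshold is reached first.

```python
# Flush to S3 after 128 MiB of buffered data or 300 seconds, whichever comes first.
buffering_hints = {"SizeInMBs": 128, "IntervalInSeconds": 300}
```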
- Compression for data records: GZIP
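Putting the pieces together, a single create_delivery_stream call (continuing the snippets above) creates the stream with GZIP compression enabled:

```python
s3_destination_config["BufferingHints"] = buffering_hints
s3_destination_config["CompressionFormat"] = "GZIP"  # Compression for data records

firehose.create_delivery_stream(
    DeliveryStreamName="KDS-S3-TngAd",
    DeliveryStreamType="KinesisStreamAsSource",
    KinesisStreamSourceConfiguration=kinesis_source_config,
    ExtendedS3DestinationConfiguration=s3_destination_config,
)
```

With the new line delimiter and GZIP both enabled, the delivered objects are gzipped newline-delimited JSON, so configure downstream consumers (for example, Athena or Spark) accordingly.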