This repository demonstrates how to load a sample dataset into WarpStream for integration with ClickHouse's ClickPipes product.
Sample datasets of various sizes are available, each covering a period of 30 days. Each entry lists rows, uncompressed size, and compressed size:
- 66m, 20GB, 1.9GB
- 133m, 38GB, 4.2GB
- 267m, 76GB, 8.2GB
- 534m, 152GB, 16.1GB
- 1064m, 304GB, 31.6GB
- 52804m, 304GB, 260.7GB
An ordered sample dataset containing 534m rows can be downloaded here.
Once it's downloaded, you can follow these instructions to deploy a WarpStream cluster with self-hosted Agents to Fly.io, or these instructions to deploy on Railway. Alternatively, you can use a WarpStream Serverless cluster and leave it to WarpStream to manage everything for you.
Once you have a functioning WarpStream cluster with SASL credentials, you can run this script:
go run ./main.go -broker $BOOTSTRAP_URL:9092 -username $SASL_USERNAME -password $SASL_PASSWORD
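If any of the environment variables above is unset, the shell passes an empty or shifted value to the script (for example, `-broker :9092`). The contents of main.go are not reproduced here, so the following is only a minimal sketch of how a Go program might accept and validate these three flags using the standard library. The `config` type and the `parseArgs` helper are hypothetical names chosen for illustration and are not taken from the repository.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"net"
	"os"
)

// config holds the connection settings passed on the command line.
// Hypothetical type, not taken from the repository's main.go.
type config struct {
	Broker   string
	Username string
	Password string
}

// parseArgs reads -broker, -username and -password from args and rejects
// values that usually indicate an unset environment variable.
func parseArgs(args []string) (config, error) {
	var cfg config
	fs := flag.NewFlagSet("loader", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the way; errors are returned
	fs.StringVar(&cfg.Broker, "broker", "", "bootstrap address as host:port")
	fs.StringVar(&cfg.Username, "username", "", "SASL username")
	fs.StringVar(&cfg.Password, "password", "", "SASL password")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}

	// The broker must look like host:port with a non-empty host.
	host, port, err := net.SplitHostPort(cfg.Broker)
	if err != nil {
		return config{}, fmt.Errorf("invalid -broker %q: %w", cfg.Broker, err)
	}
	if host == "" || port == "" {
		return config{}, fmt.Errorf("invalid -broker %q: host and port are required", cfg.Broker)
	}
	if cfg.Username == "" || cfg.Password == "" {
		return config{}, errors.New("-username and -password are required")
	}
	return cfg, nil
}

func main() {
	cfg, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	// A real loader would open a SASL-authenticated producer connection here.
	fmt.Printf("broker=%s username=%s\n", cfg.Broker, cfg.Username)
}
```

Validating up front means a missing `BOOTSTRAP_URL` or credential fails immediately with a readable message rather than as a connection or authentication error later.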