Automatically merges the packages and objects views from every bucket into a single queryable Iceberg catalog while maintaining data consistency and avoiding duplicates.
🚀 Quick Start: Download the latest release and follow the deployment steps in the deployment README.
After deployment:
# Process all buckets manually
npm run deploy:event
# Monitor logs
npm run deploy:logs recent 5 # Last 5 minutes
The system creates three normalized Iceberg tables:
- Package Revisions (`package_revision`): Specific versions of packages
- Package Tags (`package_tag`): Named versions (like `latest`)
- Package Entries (`package_entry`): Individual files within packages
See doc/schema.md for detailed schema design.
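As a quick sanity check after the first merge, a row count per table is often enough to confirm that data arrived. The sketch below only generates the queries; the helper name `row_count_queries` is hypothetical, and it assumes the three tables are queryable by the names listed above in the database referenced by `$ATHENA_DATABASE_NAME`.

```shell
# Sketch: print one row-count query for each of the three tables named above.
# row_count_queries is an illustrative helper, not part of this package.
row_count_queries() {
  local table
  for table in package_revision package_tag package_entry; do
    printf 'SELECT COUNT(*) AS row_count FROM %s\n' "$table"
  done
}

# Possible usage with the AWS CLI (the results bucket is a placeholder you must supply):
#   aws athena start-query-execution \
#     --query-string "$(row_count_queries | head -n 1)" \
#     --query-execution-context "Database=$ATHENA_DATABASE_NAME" \
#     --result-configuration "OutputLocation=s3://<your-results-bucket>/"
```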
The Titanic Stack creates a data lake table merger that:
- Listens for package revision events via EventBridge
- Merges Quilt package metadata into consolidated Athena tables
- Supports both Glue and S3 Tables formats
- Provides unified views of package revisions, tags, and entries
- Lambda Function: Processes events and manages table operations
- S3 Buckets: Store Glue tables and S3 Tables data
- EventBridge Rule: Routes package events to the Lambda function
- IAM Roles: Provide necessary permissions for cross-service access
- Cause: The CDK stack didn't deploy properly, or the S3 bucket is missing
- Solution: Check the deployment with `npm run deploy:outputs` and redeploy if needed: `npm run cdk`
- Cause: Wrong policy ARN or insufficient permissions
- Solution: Verify `QUILT_READ_POLICY_ARN` is correct and check AWS credentials: `aws sts get-caller-identity`
- Cause: Source Quilt views don't exist
- Solution: Verify views exist: `aws glue get-tables --database-name $ATHENA_DATABASE_NAME`
- Cause: `.env` file missing or incomplete
- Solution: Copy `env.example` to `.env` and edit with your values
- Cause: Running deployment from wrong directory
- Solution: Ensure you're running `./deploy.sh` from the package directory
- Cause: Insufficient IAM permissions
- Solution: Verify AWS credentials have necessary permissions to create CloudFormation stacks
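Several of the causes above come down to a missing or incomplete `.env`. A minimal sketch of a pre-deployment check follows. The helper name `check_env_file` is hypothetical, and the variable names in the usage comment are only those mentioned in this README; `env.example` remains the authoritative list.

```shell
# Sketch: report which expected variables are missing or empty in an env file.
# check_env_file is an illustrative helper, not part of this package.
check_env_file() {
  local env_file="$1"; shift
  local missing=0 name value
  if [ ! -f "$env_file" ]; then
    echo "missing file: $env_file"
    return 1
  fi
  for name in "$@"; do
    # Use the last NAME=value line; commented-out lines do not match.
    # Limitation: a quoted empty value (NAME="") is treated as set.
    value="$(grep -E "^(export[[:space:]]+)?${name}=" "$env_file" | tail -n 1 | cut -d= -f2-)"
    if [ -z "$value" ]; then
      echo "missing or empty: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage from the package directory:
#   check_env_file .env QUILT_READ_POLICY_ARN ATHENA_DATABASE_NAME USE_S3_TABLE
```

The function returns a non-zero status when anything is missing, so it can gate a deployment script.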
# Check stack status and resources
npm run deploy:outputs
aws cloudformation describe-stacks --stack-name TitanicStack
aws s3 ls | grep titanic
aws glue get-tables --database-name $ATHENA_DATABASE_NAME
# Monitor logs
npm run deploy:logs recent 30 # Last 30 minutes
npm run deploy:logs errors # Only errors
# View deployment events (if stack fails)
aws cloudformation describe-stack-events --stack-name TitanicStack
Full redeploy needed: First deployment failed, changing USE_S3_TABLE setting, missing AWS resources
Simple restart sufficient: Lambda code changes only, temporary AWS API issues
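One way to apply this rule of thumb is to look at the CloudFormation stack status. The sketch below is an assumption about how statuses map to the two cases, not the package's own logic, and `recommend_action` is a hypothetical helper. It treats a missing stack or a failed initial creation as needing a full redeploy.

```shell
# Sketch: map a CloudFormation stack status to a suggested action.
# The mapping is an illustrative assumption, not behavior defined by this package.
recommend_action() {
  case "$1" in
    ROLLBACK_COMPLETE|CREATE_FAILED|DELETE_FAILED|"")
      # Failed first deployment, failed delete, or no stack at all.
      echo "full redeploy" ;;
    *_IN_PROGRESS)
      # An operation is still running; check again later.
      echo "wait" ;;
    *)
      echo "restart may suffice" ;;
  esac
}

# Possible usage:
#   status="$(aws cloudformation describe-stacks --stack-name TitanicStack \
#     --query 'Stacks[0].StackStatus' --output text 2>/dev/null)"
#   recommend_action "$status"
```

Stack status alone does not capture every case: changing the USE_S3_TABLE setting still calls for a full redeploy even when the stack reports a healthy status.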
npm run destroy # Delete everything
npm run destroy:buckets:contents # Delete data only
- doc/DEVELOP.md - Building directly from CDK
- doc/SCHEMA.md - Table schema design and decisions
This package uses pre-built Lambda assets from the public S3 bucket:
- Assets bucket: Generated deterministically as `titanic-assets-{account}-{region}`
- Lambda code: `lambda/merge-tables.zip`
- Strategy: Always uses the latest available version
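Because the bucket name follows a fixed pattern, it can be derived from the account ID and region. The sketch below shows that derivation; `assets_bucket_name` is a hypothetical helper, and the usage comment assumes the Lambda code path above is the object key within that bucket.

```shell
# Sketch: build the assets bucket name from the titanic-assets-{account}-{region} pattern.
# assets_bucket_name is an illustrative helper, not part of this package.
assets_bucket_name() {
  local account="$1" region="$2"
  printf 'titanic-assets-%s-%s\n' "$account" "$region"
}

# Possible usage with the active AWS credentials (requires the AWS CLI):
#   account="$(aws sts get-caller-identity --query Account --output text)"
#   region="$(aws configure get region)"
#   aws s3 ls "s3://$(assets_bucket_name "$account" "$region")/lambda/merge-tables.zip"
```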