-<!-- markdownlint-disable -->
-[](https://docs.quiltdata.com/)
-[](https://slack.quiltdata.com/)
-
-# Quilt is a data mesh for connecting people with actionable data
-
-## Python Quick start, tutorials
-If you have Python and an S3 bucket, you're ready to create versioned datasets with Quilt.
-Visit the [Quilt docs](https://docs.quiltdata.com/installation) for installation instructions,
-a quick start, and more.
-
-## Quilt in action
-* [open.quiltdata.com](https://open.quiltdata.com/) is a petabyte-scale open
-data portal that runs on Quilt
-* [quiltdata.com](https://quiltdata.com) includes case studies, use cases, videos,
-and instructions on how to run a private Quilt instance
-* [Versioning data and models for rapid experimentation in machine learning](https://medium.com/pytorch/how-to-iterate-faster-in-machine-learning-by-versioning-data-and-models-featuring-detectron2-4fd2f9338df5)
-shows how to use Quilt for real world projects
-
-## Who is Quilt for?
-Quilt is for data-driven teams and offers features for coders (data scientists,
-data engineers, developers) and business users alike.
-
-## What does Quilt do?
-Quilt manages data like code so that teams in machine learning, biotech,
-and analytics can experiment faster, build smarter models, and recover from errors.
-
-## How does Quilt work?
-Quilt consists of a Python client, web catalog, lambda
-functions—all of which are open source—plus
-a suite of backend services and Docker containers
-orchestrated by CloudFormation.
-
-The backend services are available under a paid license
-on [quiltdata.com](https://quiltdata.com).
-
-## Use cases
-* **Share** data at scale. Quilt wraps AWS S3 to add simple URLs, web preview for large files, and sharing via email address (no need to create an IAM role).
-* **Understand** data better through inline documentation (Jupyter notebooks, markdown) and visualizations (Vega, Vega Lite)
-* **Discover** related data by indexing objects in ElasticSearch
-* **Model** data by providing a home for large data and models that don't fit in git, and by providing immutable versions for objects and data sets (a.k.a. "Quilt Packages")
-* **Decide** by broadening data access within the organization and supporting the documentation of decision processes through audit-able versioning and inline documentation
+# Quilt: A Data Lakehouse for Actionable Data
+
+Quilt connects teams to actionable data by simplifying data discovery, sharing,
+and analysis. It’s designed to serve data-driven organizations with powerful
+tools for managing data as code, enabling rapid experimentation, and ensuring
+data integrity at scale.
+
+---
+
+## Navigating the Documentation
+
+The Quilt documentation is structured to guide users through different layers of
+the platform, from basic concepts to advanced integrations. Whether you're a
+business user, developer, or platform administrator, the docs will help you
+quickly find the information you need.
+
+### Quilt Platform Overview
+
+The **Quilt Platform** powers the core features of the Quilt data catalog,
+providing tools for browsing, searching, and visualizing data stored in AWS S3.
+The platform is ideal for teams needing to collaborate on data, with
+capabilities like embeddable previews and metadata collection.
+
+**Core Sections:**
+
+- [Architecture](Architecture.md) - Learn how Quilt is architected.
+- [Mental Model](MentalModel.md) - Understand the guiding principles behind
+ Quilt.
+- [Metadata Management](Catalog/Metadata.md) - Manage metadata at scale.
+
+For users of the Quilt Platform (often referred to as the Catalog):
+
+- [Bucket Browsing](Catalog/FileBrowser.md) - Navigate through S3 buckets.
+- [Document Previews](Catalog/Preview.md) - Visualize documents and datasets
+ directly in the web interface.
+- [Search & Query](Catalog/SearchQuery.md) - Leverage Quilt’s powerful search
+ and querying capabilities.
+- [Visualization & Dashboards](Catalog/VisualizationDashboards.md) - Create
+ visual dashboards for data insights.
+
+For administrators managing Quilt deployments:
+
+- [Admin Settings UI](Catalog/Admin.md) - Control platform settings and user
+ access.
+- [Catalog Configuration](Catalog/Preferences.md) - Set platform preferences.
+- [Cross-Account Access](CrossAccount.md) - Manage multi-account access to S3
+ data.
+
+### Quilt Python SDK
+
+The **Quilt Python SDK** allows users to programmatically manage data packages,
+version datasets, and automate data workflows. Whether you're uploading a
+package, fetching data, or scripting custom workflows, the SDK provides the
+flexibility needed for deeper integrations.
+
+- [Installation](Installation.md) - Get started with the Quilt SDK.
+- [Quick Start](Quickstart.md) - Follow a step-by-step guide to building and
+ managing data packages.
+- [Editing and Uploading Packages](walkthrough/editing-a-package.md) - Learn how
+ to version, edit, and share data.
+- [API Reference](api-reference/api.md) - Detailed API documentation for
+ developers.
+
+### Quilt Ecosystem and Integrations
+
+The **Quilt Ecosystem** extends the platform with integrations and plugins to
+fit your workflow. Whether you're managing scientific data or automating
+packaging tasks, Quilt can be tailored to your needs with these tools:
+
+- [Benchling
+ Packager](https://open.quiltdata.com/b/quilt-example/packages/examples/benchling-packager)
+ - Package biological data from Benchling.
+- [Nextflow Plugin](examples/nextflow.md) - Integrate with Nextflow pipelines
+ for bioinformatics.
+
+---
+
+## Who Should Use Quilt?
+
+Quilt is for teams across industries like machine learning, biotech, and
+analytics who need to manage large datasets, collaborate seamlessly, and track
+the lifecycle of their data. Whether you're a data scientist, engineer, or
+administrator, Quilt helps streamline your data management workflows.
+
+## What Can You Do with Quilt?
+
+- **Share**: Easily share versioned data using simple URLs and email invites.
+- **Understand**: Enrich data with inline documentation and visualizations for
+ better insights.
+- **Discover**: Use metadata and search tools to explore data relationships
+ across projects.
+- **Model**: Version and manage large data sets that don't fit traditional git
+ repositories.
+- **Decide**: Empower your team with auditable data for better decision-making.
+
+---
+
+## How to Get Started
+
+To dive deeper into the capabilities of Quilt, start with our [Quick Start
+Guide](Quickstart.md) or explore the [Installation
+Instructions](Installation.md) for setting up your environment.
+
+If you have any questions or need help, join our [Slack
+community](https://slack.quiltdata.com/) or visit our full [documentation
+site](https://docs.quiltdata.com/).