omop-lite

A small container to get an OMOP CDM database running quickly, with support for both PostgreSQL and SQL Server.

Drop your data into data/, and run the container.

Configuration

You can configure the container or CLI using the following environment variables:

DB_HOST: The hostname of the database. Default is db.
DB_PORT: The port number of the database. Default is 5432.
DB_USER: The username for the database. Default is postgres.
DB_PASSWORD: The password for the database. Default is password.
DB_NAME: The name of the database. Default is omop.
DIALECT: The type of database to use. Default is postgresql, but can also be mssql.
SCHEMA_NAME: The name of the schema to be created/used in the database. Default is public.
DATA_DIR: The directory containing the data CSV files. Default is data.
SYNTHETIC: Whether to load synthetic data (boolean). Default is false.
SYNTHETIC_NUMBER: Size of the synthetic dataset, 100 or 1000. Default is 100.
DELIMITER: The delimiter used to separate fields in the data files. Default is tab; can also be a comma (,).
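
As an illustration of how these variables and their defaults fit together, here is a minimal Python sketch that reads them from the environment. The `Settings` class and `load_settings` function are hypothetical names for this example, not part of omop-lite. The boolean parsing and the mapping of the "tab" default to a literal tab character are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables documented above."""
    db_host: str
    db_port: int
    db_user: str
    db_password: str
    db_name: str
    dialect: str
    schema_name: str
    data_dir: str
    synthetic: bool
    synthetic_number: int
    delimiter: str


def load_settings(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env

    dialect = env.get("DIALECT", "postgresql")
    if dialect not in ("postgresql", "mssql"):
        raise ValueError(f"unsupported DIALECT: {dialect!r}")

    synthetic_number = int(env.get("SYNTHETIC_NUMBER", "100"))
    if synthetic_number not in (100, 1000):
        raise ValueError("SYNTHETIC_NUMBER must be 100 or 1000")

    return Settings(
        db_host=env.get("DB_HOST", "db"),
        db_port=int(env.get("DB_PORT", "5432")),
        db_user=env.get("DB_USER", "postgres"),
        db_password=env.get("DB_PASSWORD", "password"),
        db_name=env.get("DB_NAME", "omop"),
        dialect=dialect,
        schema_name=env.get("SCHEMA_NAME", "public"),
        data_dir=env.get("DATA_DIR", "data"),
        # Assumption: only the string "true" (any case) enables synthetic data.
        synthetic=env.get("SYNTHETIC", "false").strip().lower() == "true",
        synthetic_number=synthetic_number,
        # Assumption: the "tab" default means a literal tab character.
        delimiter=env.get("DELIMITER", "\t"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the sketch easy to try without touching the real environment, for example `load_settings({"DIALECT": "mssql"})`.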

Usage

CLI

pip install omop-lite
python omop-lite --help

Docker

docker run -v ./data:/data ghcr.io/health-informatics-uon/omop-lite

# docker-compose.yml
services:
  omop-lite:
    image: ghcr.io/health-informatics-uon/omop-lite
    volumes:
      - ./data:/data
    depends_on:
      - db

  db:
    image: postgres:latest
    environment:
      - POSTGRES_DB=omop
      - POSTGRES_PASSWORD=password
    ports:
      - "5432:5432"

Helm

To install using Helm:

# Install the chart from the OCI registry
helm install omop-lite oci://ghcr.io/health-informatics-uon/charts/omop-lite --version 0.2.2

The Helm chart deploys OMOP Lite as a Kubernetes Job that creates an OMOP CDM in a database. You can customise the installation using a values file:

# values.yaml
env:
  dbHost: postgres
  dbPort: "5432"
  dbUser: postgres
  dbPassword: postgres
  dbName: omop_helm
  dialect: postgresql
  schemaName: public
  synthetic: "false"

Install with custom values:

helm install omop-lite oci://ghcr.io/health-informatics-uon/charts/omop-lite --version 0.2.2 -f values.yaml

Synthetic Data

If you need synthetic data, some is provided in the synthetic directory. It provides a small amount of data to load quickly. To load the synthetic data, run the container with the SYNTHETIC environment variable set to true.

SYNTHETIC_NUMBER=100 loads the fake data.
SYNTHETIC_NUMBER=1000 loads the Synthea 1k data.
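
To make the two variables concrete, the following Python sketch assembles a `docker run` command line that sets them. The helper name `synthetic_run_command` is hypothetical. The sketch only builds the argument list and does not start a container, and it assumes the remaining settings (such as DB_HOST) keep their defaults and that a database is reachable.

```python
import shlex

# Image name taken from the Docker usage section of this README.
IMAGE = "ghcr.io/health-informatics-uon/omop-lite"


def synthetic_run_command(number=100):
    """Build a `docker run` argument list that enables synthetic data loading."""
    if number not in (100, 1000):
        raise ValueError("SYNTHETIC_NUMBER must be 100 or 1000")
    return [
        "docker", "run",
        "-e", "SYNTHETIC=true",
        "-e", f"SYNTHETIC_NUMBER={number}",
        IMAGE,
    ]


# Print the command so it can be copied into a shell.
print(shlex.join(synthetic_run_command(1000)))
```

The list returned by `synthetic_run_command` could be passed to `subprocess.run` if you prefer to launch the container from Python.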

Bring Your Own Data

You can provide your own data for loading into the tables by placing your files in the data/ directory. This should contain .csv files matching the data tables (DRUG_STRENGTH.csv, CONCEPT.csv, etc.).

To match the vocabulary files from Athena, the data should be tab-separated while keeping the .csv file extension. You can override the delimiter with the DELIMITER environment variable.
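
If you are generating your own files, a sketch like the one below writes a tab-separated file with a .csv extension using Python's standard `csv` module. The helper names `write_table` and `read_table` are hypothetical, and the two-column CONCEPT example is truncated for illustration; a real file needs every column of the corresponding OMOP table.

```python
import csv
import tempfile
from pathlib import Path


def write_table(path, header, rows, delimiter="\t"):
    """Write rows to `path` as delimited text, tab-separated by default."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=delimiter)
        writer.writerow(header)
        writer.writerows(rows)


def read_table(path, delimiter="\t"):
    """Read a delimited file back as a list of dictionaries keyed by header."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


# Demo in a temporary directory so an existing data/ folder is left untouched.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "CONCEPT.csv"
    # Illustrative rows only: columns truncated, values made up.
    write_table(target, ["concept_id", "concept_name"], [[1, "Example concept"]])
    print(read_table(target))
```

To produce comma-separated files instead, pass `delimiter=","` to both helpers and set DELIMITER accordingly when running omop-lite.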

Text search OMOP

Full-text search

Adding a tsvector column to the concept table and an index on that column makes full-text search queries on the concept table run much faster. This can be configured by setting FTS_CREATE to be non-empty in the environment.
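
For readers unfamiliar with the technique, the sketch below generates the kind of PostgreSQL statements involved. It is not necessarily what omop-lite runs when FTS_CREATE is set: the column name `concept_name_tsv`, the index name, the `english` text search configuration, and the use of a generated column with a GIN index are all assumptions. Generated columns require PostgreSQL 12 or later.

```python
def fts_statements(schema="public", column="concept_name_tsv"):
    """Return SQL that adds a tsvector column to concept and indexes it.

    Identifiers are interpolated directly, so pass only trusted values.
    """
    table = f"{schema}.concept"
    return [
        # Keep the tsvector in sync with concept_name automatically.
        f"ALTER TABLE {table} ADD COLUMN {column} tsvector "
        f"GENERATED ALWAYS AS (to_tsvector('english', concept_name)) STORED;",
        # A GIN index lets the @@ match operator avoid a full table scan.
        f"CREATE INDEX idx_concept_{column} ON {table} USING GIN ({column});",
    ]


def fts_query(schema="public", column="concept_name_tsv"):
    """Return a parameterised search query; %s is a driver-side placeholder."""
    return (
        f"SELECT concept_id, concept_name FROM {schema}.concept "
        f"WHERE {column} @@ plainto_tsquery('english', %s);"
    )


for statement in fts_statements():
    print(statement)
print(fts_query())
```

The query string uses the `%s` placeholder style of drivers such as psycopg, so the search term is passed as a parameter rather than concatenated into the SQL.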

Vector search

Postgres supports vector search too. To enable it for omop-lite, bring up the services defined in compose-omop-ts.yml with

docker compose -f compose-omop-ts.yml up

This requires the file embeddings/embeddings.parquet, containing concept_ids and their embeddings. The setup uses pgvector to create an embeddings table.
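
As a rough picture of what a vector search over such a table computes, the pure-Python sketch below ranks concepts by cosine distance to a query vector. pgvector exposes cosine distance through its `<=>` operator; whether omop-lite's setup uses that metric is an assumption here. The concept IDs and two-dimensional vectors are made up for illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def nearest(query, embeddings, k=3):
    """Return the k concept IDs whose embeddings are closest to `query`."""
    ranked = sorted(
        embeddings.items(),
        key=lambda item: cosine_distance(query, item[1]),
    )
    return [concept_id for concept_id, _ in ranked[:k]]


# Made-up concept IDs and embeddings, for illustration only.
toy_embeddings = {
    1: [1.0, 0.0],
    2: [0.0, 1.0],
    3: [1.0, 1.0],
}
print(nearest([1.0, 0.1], toy_embeddings, k=2))
```

In the database the same ranking would be expressed as an ORDER BY on the distance operator, with pgvector handling the computation.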

Testing

If you're a developer and want to iterate on omop-lite quickly, synthetic/ contains a small subset of the vocabularies that is sufficient for a build. To test the vector search, matching embeddings are provided in embeddings/embeddings.parquet.
