What :
- This project is a template to create reproducible notebooks and publications for your research projects mostly related to data analysis or ML.
- It makes use of modern tools and possible best practices known at this time. E.g. GitHub codespaces for replication and reproducibility. GitHub Codespaces is a cloud based development enviroment with builin VS Code Support.
- Acessible from wherever GitHib Website with Internet is accessible to you.
- Hassle free way to develop your simple codes .
- Learn to make a dynamic publication using MyST
- Learn to make a release
- Learn to archive your software/code on Zenodo / Software Heritage.
Why : You can use this setup for building your proof of concept very fast and efficiently.
Pre-requiste : Knowledge of GIT and Basic Python programming is essential to take this course.
It makes use of following python libraries :
- UV : Package Manager (UV is a lightning-fast package manager written in Rust. It offers significant performance improvements over traditional tools like pip and pipenv)
- Ruff : Python Linter (Ruff is a powerful linter that identifies and corrects common Python code style and quality issues. It's designed to be highly efficient and customizable.)
- Groq AI API : A sample notebook demonstrating the usage of Groq AI. a prior account is needed before running this notebook to provide the API key.
- MyST Markdown for publication.
- Empty files for CITATION and codemeta.json, to be filled in by the user as part of the exercise.
-
Create a groq account and keep the API keys handy for the exercies. Refer to for more details : https://console.groq.com/keys
-
Make a copy of this repository by forking it.
-
In the forked repository, run the codespace
-
Please refer to Setup.md for more details on follow-up instructions.
- https://github.com/github/codespaces-jupyter/tree/main
- Research folder structure standard
- https://github.com/pyOpenSci/pyos-package-template/tree/main Citation : https://zenodo.org/records/14052274
- https://curvenote.com/blog/-markdown-pyopensci-2024
- https://github.com/astral-sh/uv?tab=readme-ov-file#script-support
- https://docs.astral.sh/ruff/linter/#fixes
- https://docs.astral.sh/ruff/installation/
- https://github.com/UtrechtUniversity/generative-ai/tree/main?tab=readme-ov-file, https://github.com/UtrechtUniversity/generative-ai/blob/main/kickstarter/notebooks/claude_opus.ipynb
- https://myst-parser.readthedocs.io/en/latest/ Citation : https://zenodo.org/records/14805658
- https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giad113/7516267
- https://blog.reviewnb.com/jupyter-notebook-reproducibility-managing-dependencies-data-secrets/
- Ten Simple Rules for Reproducible Research in Jupyter Notebooks
- Similar project with a Python package : https://github.com/manzt/juv
- A Jupyter AI Package to interact with multiple AI Api's : https://github.com/jupyterlab/jupyter-ai
- JupyterLab Magic Wand to in cell AI assistence : https://github.com/Zsailer/jupyterlab-magic-wand
- Reproducible Notebooks with Pixi : https://prefix.dev/blog/pixi_jupyter_notebooks
- Teaching Python with GitHub Codespaces
- API (Aplication Programming Interface) : APIs are mechanisms that enable two software components to communicate with each other using a set of definitions and protocols ref : https://aws.amazon.com/what-is/api/
- ML : Machine Learning
- AI : Artifical Intelligence
I thank the PyOpenSci Fall Festival 2024 (Leah W and Team) and MyST(Rowan and Team) and Modhurita (UU) for providing me lots of useful tips and inspiration to create this.
Thanks Joyce, Heidi & DRA Team.