This script reads a list of DOIs from list.txt, fetches their metadata from the CrossRef API, and checks whether those papers exist in the Poseidon archives (community-archive, aadr-archive, minotaur-archive). It then generates an HTML table (index.html) displaying:
✔ Paper title
✔ Publication year & exact date
✔ First author’s name
✔ Journal name
✔ Availability in Poseidon archives (✔ or ✘)
✔ A search bar for filtering by title
✔ Dropdown filters for the archives
Every time list.txt is updated and a commit is pushed, this script runs and updates index.html on GitHub Pages.
- Python Version: Python 3.x
- Required Libraries:
pip install requests Jinja2
- Files:
list.txt → List of DOIs (one per line)
base_script.py → The main script
index.html → The generated output file
Fetches metadata from CrossRef API.
Extracts title, year, journal, date, first author’s name.
Formats publication date into YYYY-MM-DD.
Prints progress updates like:
(1 / 100) Querying metadata for 10.1002/ajpa.23312
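The extraction and date-formatting steps above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function names `parse_crossref_message` and `progress_line` are hypothetical, and the input is assumed to be the `message` object of a CrossRef work record that has already been fetched (for example with `requests.get(f"https://api.crossref.org/works/{doi}").json()["message"]`). Padding a missing month or day with `01` is also an assumption.

```python
from typing import Any


def parse_crossref_message(message: dict[str, Any]) -> dict[str, str]:
    """Extract the table fields from the 'message' object of a CrossRef record."""
    # CrossRef returns titles and journal names as lists; take the first entry.
    title = (message.get("title") or ["N/A"])[0]
    journal = (message.get("container-title") or ["N/A"])[0]

    # 'issued' -> 'date-parts' looks like [[year, month, day]];
    # month and day may be missing for some records.
    parts = (message.get("issued", {}).get("date-parts") or [[]])[0]
    parts = [p for p in parts if p is not None]
    if parts:
        # Assumption: pad a missing month or day with 1 to get YYYY-MM-DD.
        year, month, day = (parts + [1, 1])[:3]
        date = f"{year:04d}-{month:02d}-{day:02d}"
        year_text = str(year)
    else:
        date = year_text = "N/A"

    # Authors are a list of dicts with optional 'given' and 'family' keys.
    authors = message.get("author") or []
    first = authors[0] if authors else {}
    name_parts = [first.get("given"), first.get("family")]
    first_author = " ".join(p for p in name_parts if p) or "N/A"

    return {
        "title": title,
        "journal": journal,
        "year": year_text,
        "date": date,
        "first_author": first_author,
    }


def progress_line(index: int, total: int, doi: str) -> str:
    """Build a progress message in the format shown above."""
    return f"({index} / {total}) Querying metadata for {doi}"
```

Keeping the parsing separate from the HTTP request makes it possible to test the field extraction without network access.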
Calls Poseidon API to check available DOIs for a given archive.
Extracts DOI list from community-archive, aadr-archive, and minotaur-archive.
Prints status messages while fetching:
Fetching DOI data from community-archive...
Collects all available DOIs from all Poseidon archives.
Stores data in a dictionary mapping DOIs → available archives.
Cleans up DOI format by removing extra spaces & "https://doi.org/".
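A rough sketch of the cleanup and mapping steps described above. The names `normalize_doi` and `build_doi_index` are hypothetical, and the input is assumed to be a dictionary of archive name → raw DOI strings that has already been collected from the Poseidon API; the real script may structure this differently.

```python
def normalize_doi(raw: str) -> str:
    """Strip surrounding whitespace and a leading 'https://doi.org/' prefix."""
    doi = raw.strip()
    prefix = "https://doi.org/"
    if doi.lower().startswith(prefix):
        doi = doi[len(prefix):]
    return doi.strip()


def build_doi_index(archive_dois: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each cleaned DOI to the list of archives that contain it."""
    index: dict[str, list[str]] = {}
    for archive, dois in archive_dois.items():
        for raw in dois:
            doi = normalize_doi(raw)
            if not doi:
                continue  # skip empty entries
            archives = index.setdefault(doi, [])
            if archive not in archives:
                archives.append(archive)
    return index
```

With this shape, checking availability for a paper is a single lookup such as `"aadr-archive" in index.get(doi, [])`.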
Checks list.txt for duplicate DOIs.
If duplicates are found, it prints a warning:
WARNING: Duplicate DOIs found:
- 10.1002/ajpa.23312
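One simple way to implement the duplicate check is with `collections.Counter`. The function names below are hypothetical, and the sketch assumes the DOIs have already been read from list.txt and cleaned.

```python
from collections import Counter


def find_duplicates(dois: list[str]) -> list[str]:
    """Return each DOI that occurs more than once, in first-seen order."""
    counts = Counter(dois)
    return [doi for doi, n in counts.items() if n > 1]


def warn_about_duplicates(dois: list[str]) -> list[str]:
    """Print a warning in the format shown above and return the duplicates."""
    duplicates = find_duplicates(dois)
    if duplicates:
        print("WARNING: Duplicate DOIs found:")
        for doi in duplicates:
            print(f"- {doi}")
    return duplicates
```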
Creates index.html using a Jinja2 template.
Adds search bar to filter by title.
Adds dropdown filters to show/hide papers based on Poseidon archive availability.
Formats clickable DOI links like this:
<a href="https://doi.org/10.1002/ajpa.23312">10.1002/ajpa.23312</a>
Prints progress while updating:
Updating index.html...
index.html successfully updated!
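The rendering step might look roughly like the sketch below. The template is a heavily reduced stand-in for the real one (no search bar or dropdown filters), and `TABLE_TEMPLATE`, `render_table`, and the `title`/`doi`/`archives` keys are assumptions made for illustration.

```python
from jinja2 import Template

# Reduced stand-in template: one row per paper with a clickable DOI link
# and a single availability column. autoescape=True escapes HTML in titles.
TABLE_TEMPLATE = Template(
    """<table>
{% for paper in papers %}  <tr>
    <td>{{ paper.title }}</td>
    <td><a href="https://doi.org/{{ paper.doi }}">{{ paper.doi }}</a></td>
    <td>{{ "✔" if "community-archive" in paper.archives else "✘" }}</td>
  </tr>
{% endfor %}</table>""",
    autoescape=True,
)


def render_table(papers: list[dict]) -> str:
    """Render the list of paper records into an HTML table string."""
    return TABLE_TEMPLATE.render(papers=papers)
```

The resulting string can then be written to index.html, for example with `pathlib.Path("index.html").write_text(html, encoding="utf-8")`.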
- Add DOIs to list.txt (one per line).
- Run the script:
python base_script.py
- Open index.html to see the results!
This is a fully automated workflow that updates the table and deploys it to GitHub Pages whenever list.txt changes.
A GitHub Actions workflow runs everything behind the scenes. No manual updates needed!