aurelian --help
Most commands will start up a different AI agent.
Extracts structured metadata from dataset documentation following the Datasheets for Datasets framework.
Supported File Types: PDF, HTML, JSON, text/markdown (both URLs and local files)
Library Usage:
import asyncio

from aurelian.agents.d4d.d4d_agent import d4d_agent
from aurelian.agents.d4d.d4d_config import D4DConfig

# Process multiple sources (URLs and local paths can be mixed)
sources = [
    "https://example.com/dataset",
    "/path/to/metadata.json",
    "/path/to/documentation.html",
]

async def main():
    config = D4DConfig()
    # The agent call is awaited, so it has to run inside a coroutine
    result = await d4d_agent.run(
        f"Extract metadata from: {', '.join(sources)}",
        deps=config,
    )
    print(result.data)  # D4D YAML output

asyncio.run(main())

Test Script:
cd aurelian
python test_d4d.py

Features:
- Automatic file type detection (PDF, HTML, JSON, text)
- Both URLs and local file paths supported
- Content truncation at 50,000 characters for token management
- Structured YAML output following D4D schema
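
The file type detection and truncation features listed above can be pictured with a small standalone sketch. This is not the agent's actual implementation: the names `EXTENSION_TYPES`, `detect_file_type`, and `truncate` are hypothetical, and the real agent may also inspect content or HTTP headers rather than relying on extensions alone. Only the 50,000-character limit comes from the feature list.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

MAX_CHARS = 50_000  # truncation limit stated in the feature list

# Hypothetical extension-to-type mapping, for illustration only.
EXTENSION_TYPES = {
    ".pdf": "pdf",
    ".html": "html",
    ".htm": "html",
    ".json": "json",
    ".txt": "text",
    ".md": "text",
}


def is_url(source: str) -> bool:
    """Treat anything with an http(s) scheme as a URL, everything else as a local path."""
    return urlparse(source).scheme in ("http", "https")


def detect_file_type(source: str) -> str:
    """Guess the file type from the extension of a URL path or local path."""
    path = urlparse(source).path if is_url(source) else source
    suffix = PurePosixPath(path).suffix.lower()
    # Assumption: a URL without a recognizable extension is a web page.
    default = "html" if is_url(source) else "text"
    return EXTENSION_TYPES.get(suffix, default)


def truncate(content: str, limit: int = MAX_CHARS) -> str:
    """Cut content to at most `limit` characters to keep the prompt within token limits."""
    return content[:limit]
```

A character-based cut is a cheap stand-in for a token budget; it trades precision for not needing a tokenizer.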
Documentation: D4D Agent Guide
gene set enrichment
checking papers against checklists
This agent is for exploring, chatting with, and reviewing GO-CAMs.
Docs: gocam_agent
It can be used to generate reviews according to guidelines for GO-CAMs.
It can also generate SVGs, demonstrating innate knowledge of both the visual grammar of pathway diagrams and the semantics of the underlying biology.
This agent is for exploring, chatting with, and reviewing standard annotations.
Docs: go_ann_agent
Example review using TF guidelines:
geneontology/go-annotation#5743
Some agents require a pre-indexed linkml-store database, for example a MongoDB instance loaded with GO-CAMs for the GO-CAM agent. Consult the linkml-store documentation for more information.
If an agent requires ontology search, it uses the semsql/OAK SQLite database. On the first query, it uses linkml-store to create an LLM index, which requires an OpenAI API key. This first query may be slow. The index is then cached until your pystow cache is regenerated.
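
The "slow first query, fast afterwards" behaviour follows a build-once, cache-on-disk pattern. The sketch below illustrates that pattern with the standard library only. It is not how linkml-store or pystow implement it: `load_or_build_index`, `slow_build`, and the JSON file layout are hypothetical, and a temporary directory stands in for the pystow cache.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_build_index(cache_dir: Path, name: str, build: Callable[[], dict]) -> dict:
    """Return the cached index if present; otherwise build it once and store it."""
    cache_file = cache_dir / f"{name}.json"
    if cache_file.exists():
        # Fast path: every query after the first reads the cached file.
        return json.loads(cache_file.read_text())
    # Slow path: the first query pays the cost of building the index.
    index = build()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(index))
    return index


build_calls = []


def slow_build() -> dict:
    """Stand-in for the expensive indexing step (hypothetical)."""
    build_calls.append(1)
    return {"GO:0008150": "biological_process"}


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / "cache"
    first = load_or_build_index(cache_dir, "go", slow_build)   # builds the index
    second = load_or_build_index(cache_dir, "go", slow_build)  # served from cache
```

Deleting the cache directory forces a rebuild on the next call, which mirrors the index being recreated after the cache is regenerated.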