danielrosehill/Experiments-And-Evaluations-Index

Experiments & Evaluations Index

Repository Index

This repository serves as an index of experimental projects, evaluations, proof-of-concepts, templates, patterns, and exploratory ideas related to AI/LLM development and workflows.

About this Index

This collection brings together various experimental repositories exploring AI agent workflows, LLM capabilities, evaluation frameworks, and development patterns. These repositories represent hands-on experiments, proof-of-concepts, benchmarking efforts, and reusable templates for AI-driven development.


AI Agent Development

Agent Workflows & Patterns

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| Agent Handover Demo | View Repo | Demonstration of agent handover patterns | 2025 |
| Agent Network Expander Template | View Repo | Template for expanding agent networks | 2025 |
| Agent Task Repo Pattern With MCP | View Repo | Repository pattern for agent tasks using MCP | 2025 |
| AI Agent UN | View Repo | AI agent unified namespace | 2025 |
| AI Agent Workspace Spec | View Repo | Agent workspace specification | Mar 2025 |

Development Templates

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| AI Assistant Template | View Repo | Template for AI assistant development | 2025 |
| AI Dev Prompts Example | View Repo | Example development prompts for AI | 2025 |
| AI Development Template | View Repo | General AI development template | 2025 |

LLM Evaluation & Benchmarking

LLM Capabilities & Testing

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| Bias Censorship Eval Tests | View Repo | Evaluation tests for bias and censorship | 2025 |
| LLM Evaluation Prompts | View Repo | Prompts for evaluating LLMs | 2025 |
| Assistant Self Ideation | View Repo | Demo of AI "self ideation" in practice | Feb 2025 |

LLM Experiments

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| LLM Experiment Notebook | View Repo | Notebook of LLM experiments | 2025 |
| LLM Long Codegen Test | View Repo | Testing long-form code generation | 2025 |
| LLM Max Token Length | View Repo | Maximum token length exploration | Feb 2025 |
| Single Shot Brevity Training | View Repo | Training for concise single-shot responses | 2025 |
| One Prompt AI Book | View Repo | Experiment in generating content from single prompts | 2025 |
| Long AI Prompting Experiment | View Repo | Testing experiments with extended prompts | 2025 |

Hugging Face Spaces

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| Single Shot Brevity Training | View Space | Brevity training interface | 2025 |
| LLM Long Code Generation Experiment | View Space | Long-form code generation experiment | 2025 |
| Max Output Tokens Analysis | View Space | Maximum output tokens analysis | Feb 2025 |

Speech-to-Text & Audio Processing

STT Benchmarks & Evaluation

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| Local ASR STT Benchmark | View Repo | Local ASR and STT benchmarking | 2025 |
| Long Form Audio Eval | View Repo | Evaluation of long-form audio transcription | 2025 |
| Personal STT Benchmarking | View Repo | Personal speech-to-text benchmarking | 2025 |
| STT Voice Note Evaluation | View Repo | Evaluation of STT for voice notes | 2025 |

Hugging Face Spaces

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| Single Podcast ASR Eval | View Space | Single podcast ASR evaluation | 2025 |
| STT Comparison | View Space | Speech-to-text comparison tool | 2025 |
| Local STT Eval One Sample | View Space | Local STT evaluation with single samples | 2025 |
| Whisper Fine-Tune Eval | View Space | Interactive evaluation of fine-tuned Whisper models | 2025 |

Hugging Face Datasets

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| Podcast ASR Evaluation | View Dataset | Dataset for podcast ASR evaluation | 2025 |
| Whisper Fine-Tune One Shot Eval | View Dataset | WER and accuracy evaluation comparing fine-tuned Whisper (Tiny, Base, Small, Medium) vs stock models on 1 hour of audio, inference on Modal A100 | 2025 |
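
For readers unfamiliar with the WER metric named in the Whisper Fine-Tune One Shot Eval entry, the sketch below shows one common way to compute word error rate: the word-level edit distance between a reference transcript and a model transcript, divided by the number of reference words. It is an illustration only and is not taken from the linked dataset. The function name `word_error_rate` and the lowercase-and-split-on-whitespace normalization are assumptions made for this example, and real evaluations often apply heavier text normalization first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; normalization here is only
    lowercasing and whitespace splitting (an assumption).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with zero hypothesis words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.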

Audio Processing Experiments

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| Crying Baby Audio Scrub | View Repo | Audio processing for baby noise removal | 2025 |
| Audio Context Pipeline Model | View Repo | Notes and model for audio context pipeline | Apr 2025 |

Specialized Applications

OSINT & Intelligence

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| OSINT Missile Intelligence Agent | View Repo | OSINT-focused intelligence agent | 2025 |

Data Analysis

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| GHG EBITDA Correlations | View Repo | Analysis of greenhouse gas and EBITDA correlations | 2025 |

Testing & Documentation

Test Repositories

| Repository | Link | Description | Date |
| --- | --- | --- | --- |
| Test Markdown Docs | View Repo | Test repository for markdown documentation | 2025 |
| Test System Prompts | View Repo | Test repository for system prompts | 2025 |

Note: This is a focused index covering experimental AI/LLM development projects. For a higher-level collection of all repository indexes and other projects, see the GitHub Master Index.

Author

Daniel Rosehill
Contact: [email protected]
Website: danielrosehill.com
