Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
Updated Apr 30, 2025 - Python
An agent benchmark with tasks in a simulated software company.
A Python toolkit for chain-of-thought prompting 🐍
Automated Deep Research with LLMs, web search, paper parsing, and didactic summarization.
Generate full-fledged PDF reports using LLMs such as GPT, Claude, and Llama.
Deploy your autonomous agents to production-grade environments with a 99% uptime guarantee, infinite scalability, and self-healing.
🌪️ AI research assistant that generates Wikipedia-quality articles through multi-perspective analysis. Based on Stanford's STORM methodology.
AIRAS - an open-source project for research automation
Interactive tool for analyzing attention patterns in transformer models with layer-wise visualizations, token importance scoring, and attention flow diagrams
Template repository for the Werewolf hackathon
Transformers + Mambas + LSTMs, all in one model
Multimodal generative AI resources: talking heads, STT, TTS, image & video generation, and more.
Official implementation of "Automated Algorithmic Discovery for Gravitational-Wave Detection Guided by LLM-Informed Evolutionary Monte Carlo Tree Search" (arXiv:2508.03661).
An open-source implementation of Mamba 2 in a single PyTorch file
A comprehensive collection of PyTorch implementations for the VGG (Visual Geometry Group) models
Running GenAI models at every scale, on every modality
This project explores the integration of an episodic memory module into PyACTR. The goal is to enhance the agent's performance in language processing tasks that require context and pattern recognition.
Veloclade is a research prototype of a neuro-symbolic knowledge graph system. It uses clade-inspired hierarchy + embedding clustering (sentence-transformers) to control ontology growth and mitigate subclassing explosion. Designed for experimentation in hybrid reasoning and AI knowledge representation.
A framework for resonance-based decision-making in artificial agents, combining celestial influences, memory feedback, and symbolic emergence.
Supervisor-worker multi-agent framework for AI system development. Orchestrates specialized agents through complete lifecycle: requirements → architecture → engineering → deployment. Supports Cursor IDE, GitHub Copilot, Claude Projects, AWS Bedrock.