gqgs/llm100kbench

LLM Investment Benchmark

A tool for benchmarking and tracking Large Language Model (LLM) investment decisions.

Overview

This project provides a framework to create, manage, and track investment portfolios generated by LLMs. It allows you to:

  • Create new portfolios
  • List current holdings and recent context
  • Update portfolios based on model decisions

The model executions and their current context can be seen here.

Why?

The primary objective given to each LLM is to optimize its portfolio. Doing that well requires evaluating the risk-reward ratio, forming sound assumptions about future market conditions, and making use of tools and an understanding of human psychology and financial market dynamics.

This benchmark may be a good proxy for how well LLMs can coordinate these efforts.

Notes

  • Removed Gemini for now because the available free chat UI can neither search for updated prices nor accept CSV or JSON uploads 😬.
  • Removed Claude for now because the available free chat UI can't search for updated prices and its context window is too small for uploaded files 😬.
  • Removed ChatGPT for now because the available free chat UI can no longer do complex data analysis 😬.

Project Structure

  • cmd: Contains the main command implementations
    • create: Initialize new portfolios
    • list: Display current holdings and context
    • update: Process investment orders and update holdings
    • stocks: Fetch most recent stock prices
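
As an illustration of what the update step involves, here is a minimal Python sketch of applying a single order to a set of holdings. It is not the repository's implementation: the `apply_order` function, the order fields (`side`, `ticker`, `quantity`, `price`), and the example price are all hypothetical. The only idea borrowed from this README is that cash appears in the portfolio table as a `USD` holding.

```python
def apply_order(holdings, side, ticker, quantity, price):
    """Return new holdings after one order (hypothetical order format).

    `holdings` maps ticker -> quantity; cash is stored under "USD",
    mirroring the USD rows in the portfolio table.
    """
    updated = dict(holdings)  # leave the caller's dict untouched
    cost = quantity * price
    if side == "buy":
        if updated.get("USD", 0) < cost:
            raise ValueError("not enough cash for this order")
        updated["USD"] = updated.get("USD", 0) - cost
        updated[ticker] = updated.get(ticker, 0) + quantity
    elif side == "sell":
        if updated.get(ticker, 0) < quantity:
            raise ValueError("not enough shares to sell")
        updated[ticker] -= quantity
        updated["USD"] = updated.get("USD", 0) + cost
        if updated[ticker] == 0:
            del updated[ticker]  # drop fully sold positions
    else:
        raise ValueError(f"unknown side: {side!r}")
    return updated


if __name__ == "__main__":
    # Illustrative price only, not real market data.
    portfolio = apply_order({"USD": 100_000}, "buy", "AAPL", 10, 200)
    print(portfolio)
```

Rejecting orders that exceed the available cash or shares keeps a model from silently going into debt or short, which would otherwise distort the comparison between models.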

Prompt

The most recent prompt, with clear guidelines, can be seen here and here.

Current Portfolio (2025-04-07)

Model       Ticker     Sum  Quantity
chatgpt     USD         69        69
chatgpt     AAPL     99931       418
deepseek    AMD        925         3
deepseek    MSFT      2611         7
deepseek    SNPS     20888        50
deepseek    ASML     62322       100
grok        MSFT      9700        26
grok        AAPL      9753        48
grok        GOOGL     9646        64
grok        AMZN     17047        81
grok        NVDA     48748       383
perplexity  USD         10        10
perplexity  AXON      9492        17
perplexity  CRWD      9963        27
perplexity  AMGN      9766        31
perplexity  COST     47565        50
perplexity  CTAS      9917        51
perplexity  DUOL     20470        66

Model       Total Sum  Change
perplexity     107183
chatgpt        100000
grok            94894
deepseek        86746
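
The per-model totals can be reproduced from the holdings with a few lines of Python. This is a sketch for checking the arithmetic, not code from the repository. The holdings are copied from the table above. The starting capital of 100,000 is an assumption inferred from the "100k" in the repository description, so the percentages it produces are illustrative.

```python
from collections import defaultdict

# Holdings from the 2025-04-07 table: (model, ticker, sum, quantity).
HOLDINGS = [
    ("chatgpt", "USD", 69, 69),
    ("chatgpt", "AAPL", 99931, 418),
    ("deepseek", "AMD", 925, 3),
    ("deepseek", "MSFT", 2611, 7),
    ("deepseek", "SNPS", 20888, 50),
    ("deepseek", "ASML", 62322, 100),
    ("grok", "MSFT", 9700, 26),
    ("grok", "AAPL", 9753, 48),
    ("grok", "GOOGL", 9646, 64),
    ("grok", "AMZN", 17047, 81),
    ("grok", "NVDA", 48748, 383),
    ("perplexity", "USD", 10, 10),
    ("perplexity", "AXON", 9492, 17),
    ("perplexity", "CRWD", 9963, 27),
    ("perplexity", "AMGN", 9766, 31),
    ("perplexity", "COST", 47565, 50),
    ("perplexity", "CTAS", 9917, 51),
    ("perplexity", "DUOL", 20470, 66),
]

# Assumption: each model started with 100,000 (inferred from the "100k"
# in the repository description, not stated in the table itself).
STARTING_CAPITAL = 100_000


def totals_by_model(holdings):
    """Sum the value of every holding, grouped by model."""
    totals = defaultdict(int)
    for model, _ticker, value, _quantity in holdings:
        totals[model] += value
    return dict(totals)


def change_pct(total, start=STARTING_CAPITAL):
    """Percentage change of a portfolio total relative to the starting capital."""
    return (total - start) / start * 100


if __name__ == "__main__":
    ranked = sorted(totals_by_model(HOLDINGS).items(), key=lambda kv: -kv[1])
    for model, total in ranked:
        print(f"{model:<12}{total:>8}{change_pct(total):>+9.2f}%")
```

The computed totals match the summary table: 107183 for perplexity, 100000 for chatgpt, 94894 for grok, and 86746 for deepseek.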

About

LLM 100k portfolio management benchmark
