This project implements a chatbot with human-like memory characteristics, using Pinecone as a vector store and Ollama with the `llama3.2:latest` model for natural language processing. The system emulates memory degradation over time (based on an exponential forgetting curve) and assigns stronger imprints to important memories, mimicking how humans prioritize significant events.
Current Version: 1.1.0 (Semantic Versioning: MAJOR.MINOR.PATCH)
- Memory Degradation: Older memories fade via an exponential decay function with a configurable half-life (default: 90 days); see the sketch after this list.
- Importance Weighting: Memories deemed important (via sentiment analysis) receive a stronger imprint, making them more retrievable. (Done ✅)
- Persistent Storage: Conversation history is stored in Pinecone, allowing scalable, long-term memory.
- Contextual Responses: The chatbot retrieves relevant past interactions to inform its responses implicitly, avoiding explicit mentions of memory loss or repetition. (Done ✅)
- Dynamic User Identity Detection: The user's identity is detected explicitly by the LLM from the user's input and stored as metadata in Pinecone. (Done ✅)
- Debug Mode: Optional debug output for monitoring importance scores and memory operations. (New ✨)
- PEP8 Compliant: Code follows Python's PEP8 style guide for readability and maintainability. (New ✨)
- Versioning System: Version information is maintained in the source code and displayed at startup.
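For illustration, here is a minimal sketch of the half-life decay, in the spirit of the `calculate_decay_factor` helper referenced under Customization below; the actual implementation in `llm-memory.py` may differ in detail:

```python
import math
import time

def calculate_decay_factor(stored_at: float, half_life_days: float = 90.0) -> float:
    """Return a weight in (0, 1]: 1.0 for a brand-new memory,
    0.5 after one half-life, 0.25 after two, and so on."""
    elapsed_days = (time.time() - stored_at) / 86400  # seconds -> days
    return math.exp(-math.log(2) * elapsed_days / half_life_days)

# A memory stored 90 days ago decays to ~0.5 of its original weight.
print(calculate_decay_factor(time.time() - 90 * 86400))  # ~0.5
```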
- Python 3.8+
- Pinecone Account: Sign up at Pinecone and obtain an API key.
- Ollama: Installed locally with the `llama3.2:latest` model pulled (`ollama pull llama3.2:latest`).
- Git: To clone this repository.
- Clone the Repository:

```bash
git clone https://github.com/MLidstrom/llm-memory.git
cd llm-memory
```

- Install Dependencies:

```bash
pip install pinecone-client ollama langchain langchain_pinecone langchain-huggingface sentence-transformers numpy python-dotenv nltk
```

- Set Up Pinecone:
  - Create a `.env` file in the project root with your Pinecone API key (the sketch after these steps shows how it is loaded at runtime):

```
PINECONE_API_KEY=your-actual-api-key-here
```
- Run Ollama Locally:
  - Start the Ollama server with the `llama3.2:latest` model:

```bash
ollama run llama3.2:latest
```

- Run the Chatbot:

```bash
python llm-memory.py
```

  - Type your messages in the prompt (`You:`).
  - Type `exit` to stop the chatbot.
- Debug Mode:
  - To enable debug mode, open `llm-memory.py` and set `DEBUG = True` near the top of the file.
  - With debug mode enabled, you'll see additional output such as importance scores for each conversation.
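To sanity-check the setup end to end, a short script along these lines should work. It is not part of the chatbot itself; it assumes the v3+ `pinecone` client API and only verifies that your key and the local model respond:

```python
import os

import ollama
from dotenv import load_dotenv
from pinecone import Pinecone

load_dotenv()  # reads PINECONE_API_KEY from the .env file

# Verify the Pinecone credentials work (v3+ client API).
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
print(pc.list_indexes().names())  # the chatbot's index should appear here

# Verify Ollama is serving the model with a one-line round trip.
reply = ollama.chat(
    model="llama3.2:latest",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(reply["message"]["content"])
```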
- Pinecone Vector Store: Stores conversation embeddings with metadata (text, timestamp, importance, user identity); a storage sketch follows this list.
- Embedding Model: Uses `sentence-transformers/all-MiniLM-L6-v2` to generate 384-dimensional vectors.
- Memory Degradation: Applies an exponential decay factor based on the time elapsed since the memory was stored.
- Importance Scoring: Uses sentiment analysis (VADER) to assign an importance score (−1.0 to 1.0) to each memory, amplifying its embedding and retrieval weight.
- User Identity Detection: The user's identity is explicitly identified by the LLM based on user disclosures.
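Putting these pieces together, storing a single memory could look roughly like the sketch below. This is an illustration, not the project's actual code: the index name and `user` value are placeholders, the importance-based amplification of the embedding itself is omitted, and VADER needs a one-time `nltk.download("vader_lexicon")`.

```python
import os
import time
import uuid

from dotenv import load_dotenv
from langchain_huggingface import HuggingFaceEmbeddings
from nltk.sentiment import SentimentIntensityAnalyzer
from pinecone import Pinecone

load_dotenv()

embedder = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
analyzer = SentimentIntensityAnalyzer()  # requires: nltk.download("vader_lexicon")

def store_memory(index, text: str, user: str) -> None:
    """Embed the text, score its importance, and upsert it with metadata."""
    vector = embedder.embed_query(text)                      # 384-dimensional embedding
    importance = analyzer.polarity_scores(text)["compound"]  # -1.0 .. 1.0
    index.upsert(vectors=[{
        "id": str(uuid.uuid4()),
        "values": vector,
        "metadata": {"text": text, "timestamp": time.time(),
                     "importance": importance, "user": user},
    }])

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
store_memory(pc.Index("llm-memory"), "I just got married!", user="alice")
```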
This project uses Semantic Versioning:
- MAJOR version for incompatible API changes
- MINOR version for functionality added in a backward-compatible manner
- PATCH version for backward-compatible bug fixes
Version information is displayed at startup showing the current version, Pinecone index, model information, and debug status.
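One simple way to wire this up (the variable and function names here are illustrative, not necessarily those used in `llm-memory.py`):

```python
__version__ = "1.1.0"  # MAJOR.MINOR.PATCH

def print_banner(index_name: str, model: str, debug: bool) -> None:
    """Print the startup banner with version, index, model, and debug status."""
    print(f"llm-memory v{__version__}")
    print(f"Pinecone index: {index_name} | Model: {model} | Debug: {'on' if debug else 'off'}")

print_banner("llm-memory", "llama3.2:latest", debug=False)
```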
- Half-Life: Adjust `half_life_days` in `calculate_decay_factor` (default: 90 days) to control the memory fade rate.
- Importance Logic: Modify `calculate_importance` to use custom rules.
- Decay Threshold: Change `decay_threshold` in `retrieve_context` (default: 0.05) to filter out faded memories; see the retrieval sketch below.
- Debug Mode: Set `DEBUG = True` in the source code to enable diagnostic output.
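To make these knobs concrete, here is one plausible sketch of decay- and importance-weighted retrieval; treat it as a reading of what `retrieve_context` might do, not its actual code:

```python
import math
import time

def retrieve_context(index, query_vector, decay_threshold: float = 0.05,
                     half_life_days: float = 90.0, top_k: int = 10):
    """Query Pinecone, re-weight matches by decay and importance,
    and drop memories whose decay factor fell below the threshold."""
    results = index.query(vector=query_vector, top_k=top_k, include_metadata=True)
    weighted = []
    for match in results.matches:
        meta = match.metadata
        elapsed_days = (time.time() - meta["timestamp"]) / 86400
        decay = math.exp(-math.log(2) * elapsed_days / half_life_days)
        if decay < decay_threshold:
            continue  # this memory has effectively faded away
        weight = match.score * decay * (1.0 + meta["importance"])
        weighted.append((weight, meta["text"]))
    weighted.sort(reverse=True)
    return [text for _, text in weighted]
```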
- Implement a repetition boost for frequently mentioned topics.
- Periodically prune highly decayed memories from Pinecone (sketched below).
- Support multi-user memory with user-specific filtering.
- Version-based memory migrations for handling breaking changes.
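None of these exist yet; for the pruning idea specifically, one possible shape (assuming a serverless index, whose `list()` method pages through vector IDs) might be:

```python
import math
import time

def prune_faded_memories(index, decay_threshold: float = 0.05,
                         half_life_days: float = 90.0) -> int:
    """Delete vectors whose decay factor has dropped below the threshold."""
    deleted = 0
    for id_batch in index.list():  # pages of vector IDs (serverless indexes)
        fetched = index.fetch(ids=list(id_batch))
        stale = []
        for vec_id, vec in fetched.vectors.items():
            elapsed_days = (time.time() - vec.metadata["timestamp"]) / 86400
            if math.exp(-math.log(2) * elapsed_days / half_life_days) < decay_threshold:
                stale.append(vec_id)
        if stale:
            index.delete(ids=stale)
            deleted += len(stale)
    return deleted
```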
- `pinecone-client`: Pinecone vector store integration.
- `ollama`: Local LLM inference with `llama3.2:latest`.
- `langchain` & `langchain_pinecone`: Framework for embeddings and vector store management.
- `langchain-huggingface`: HuggingFace embeddings integration for LangChain.
- `sentence-transformers`: Embedding generation.
- `numpy`: Numerical operations for decay calculations.
- `python-dotenv`: Load environment variables from the `.env` file.
- `nltk`: Sentiment analysis for importance scoring.
This project is licensed under the MIT License. See LICENSE for details.
Contributions are welcome! Please submit a pull request or open an issue for suggestions, bug reports, or enhancements.
This project follows PEP8 style guidelines. When contributing, please ensure your code passes PEP8 validation.
- Built with Pinecone for vector storage.
- Powered by Ollama and the `llama3.2:latest` model.
- Inspired by human memory research and the forgetting curve.
Last Updated: March 12, 2025