
Conversation

@danny-avila commented Sep 20, 2023

Closes #741

Summary

  • Adds ConversationSummaryBufferMemory to OpenAI/Plugins endpoints
    • Enabled by setting OPENAI_SUMMARIZE=true (see the example config after this list)
    • (Optional) the summary model defaults to gpt-3.5-turbo but can be overridden with the env var OPENAI_SUMMARY_MODEL
    • Passing a prompt larger than the summary model's context with summarization enabled will trigger the model to attempt a summary of as much of the prompt as possible (starting from the end)
      • TODO: In the future, the entire prompt exceeding the summary context, and any prior messages up to the summary message, can be summarized recursively
      • TODO: In the future, I will add a setting for the user/admin to determine the summary context size
  • Refactors the tiktoken encoder used in the tokenizer route into its own function
  • Adds a split-by-tokens helper function
  • Adds a getModelMaxTokens helper function, which extends OpenRouter support to handle larger contexts for OpenAI models
  • Updates the token count when a message is edited
  • Updates/adds JSDocs for the majority of methods edited and created
  • Adds the env var DEBUG_OPENAI to debug the OpenAI endpoint
  • Adds ConversationSummaryBufferMemory to plugins agents to mitigate large token spend when iterating heavily
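
A minimal example configuration (the variable names are the ones introduced by this PR; the values shown are illustrative):

```bash
# Enable summarization for the OpenAI/Plugins endpoints
OPENAI_SUMMARIZE=true

# Optional: override the default summary model (gpt-3.5-turbo)
OPENAI_SUMMARY_MODEL=gpt-3.5-turbo

# Optional: verbose debug logging for the OpenAI endpoint
DEBUG_OPENAI=true
```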

Change Type

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update
  • Documentation update

Testing

  • Edited existing tests that were affected by the changes, mainly for BaseClient
  • Added tests for summary handling logic
  • Added tests for new formatMessage helpers

Checklist

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have commented in any complex areas of my code
  • I have made pertinent documentation changes
  • My changes do not introduce new warnings
  • I have written tests demonstrating that my changes are effective or that my feature works
  • Local unit tests pass with my changes
  • Any changes dependent on mine have been merged and published in downstream modules.

@danny-avila marked this pull request as draft September 20, 2023 13:58
@danny-avila force-pushed the convo-summary branch 2 times, most recently from fd6e8ac to f296b6c on September 24, 2023 at 23:50
refactor: optimize getMessagesForConversation and also break on summary, feat(ci): getMessagesForConversation tests
refactor(BaseClient): use object for refineMessages param, rename 'summary' to 'summaryMessage', add previous_summary

refactor(getMessagesWithinTokenLimit): replace text and tokenCount if shouldSummarize, summary, and summaryTokenCount are present
fix/refactor(handleContextStrategy): use the right comparison length for context diff, and replace payload first message when a summary is present
refactor(handleContextStrategy): add usePrevSummary logic in case only summary was pruned
refactor(loadHistory): initial message query will return all ordered messages but keep track of the latest summary
refactor(getMessagesForConversation): use object for single param, edit jsdoc, edit all files using the method
refactor(ChatGPTClient): order messages before buildPrompt is called, TODO: add convoSumBuffMemory logic
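
For orientation, a simplified sketch of the pruning flow these commits describe: messages are kept newest-first until the token budget runs out, and everything older becomes the context handed to the summarizer. The type, signature, and return shape below are illustrative assumptions, not the PR's actual code:

```ts
type TMessage = { role: string; content: string; tokenCount: number };

// Illustrative only: walk messages newest-first until the token budget runs out.
// Older messages that no longer fit are returned as `context` for summarization;
// the resulting summaryMessage would then replace the first message of the payload.
function getMessagesWithinTokenLimit(messages: TMessage[], maxContextTokens: number) {
  const included: TMessage[] = [];
  let context: TMessage[] = [];
  let remaining = maxContextTokens;

  for (let i = messages.length - 1; i >= 0; i--) {
    const message = messages[i];
    if (message.tokenCount > remaining) {
      // everything from the oldest message up to this one is summarization context
      context = messages.slice(0, i + 1);
      break;
    }
    included.unshift(message);
    remaining -= message.tokenCount;
  }

  return { included, context, remaining };
}
```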
@danny-avila danny-avila marked this pull request as ready for review September 26, 2023 00:14
refactor(summaryBuffer): use custom summary prompt, allow prompt to be passed, pass humanPrefix and aiPrefix to memory, along with any future variables, rename messagesToRefine to context
fix(summaryBuffer): handle edge case where messagesToRefine exceeds summary context,

refactor(BaseClient): allow custom maxContextTokens to be passed to getMessagesWithinTokenLimit, add defined check before unshifting summaryMessage, update shouldSummarize based on this
refactor(OpenAIClient): use getModelMaxTokens, use cut-off message method for summary if no messages were left after pruning
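
A rough sketch of how a summary buffer along these lines can be wired up with langchain's ConversationSummaryBufferMemory (the helper name, prefixes, and the prune/summary access are assumptions for illustration; the PR's actual summaryBuffer helper may differ):

```ts
import { ChatOpenAI } from 'langchain/chat_models/openai';
import { ConversationSummaryBufferMemory, ChatMessageHistory } from 'langchain/memory';
import { HumanMessage, AIMessage } from 'langchain/schema';

// Hypothetical helper: seed the memory with the pruned `context` messages
// and let langchain produce the running summary.
async function summarizeContext(context: { role: string; content: string }[]) {
  const pastMessages = context.map((m) =>
    m.role === 'user' ? new HumanMessage(m.content) : new AIMessage(m.content),
  );

  const memory = new ConversationSummaryBufferMemory({
    llm: new ChatOpenAI({
      modelName: process.env.OPENAI_SUMMARY_MODEL ?? 'gpt-3.5-turbo',
      temperature: 0,
    }),
    chatHistory: new ChatMessageHistory(pastMessages),
    maxTokenLimit: 0, // force all seeded messages into the summary
    humanPrefix: 'User', // example values; the PR passes these through to memory
    aiPrefix: 'Assistant',
  });

  await memory.prune(); // summarizes messages exceeding the token limit
  return memory.movingSummaryBuffer; // the generated summary text
}
```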
@danny-avila merged commit 317a1bd into main Sep 27, 2023
@danny-avila deleted the convo-summary branch September 27, 2023 01:02
cnkang pushed a commit to cnkang/LibreChat that referenced this pull request Feb 6, 2024
* refactor: pass model in message edit payload, use encoder in standalone util function

* feat: add summaryBuffer helper

* refactor(api/messages): use new countTokens helper and add auth middleware at top

* wip: ConversationSummaryBufferMemory

* refactor: move pre-generation helpers to prompts dir

* chore: remove console log

* chore: remove test as payload will no longer carry tokenCount

* chore: update getMessagesWithinTokenLimit JSDoc

* refactor: optimize getMessagesForConversation and also break on summary, feat(ci): getMessagesForConversation tests

* refactor(getMessagesForConvo): count '00000000-0000-0000-0000-000000000000' as root message

* chore: add newer model to token map

* fix: condition was pointing to prop of array instead of message prop

* refactor(BaseClient): use object for refineMessages param, rename 'summary' to 'summaryMessage', add previous_summary
refactor(getMessagesWithinTokenLimit): replace text and tokenCount if shouldSummarize, summary, and summaryTokenCount are present
fix/refactor(handleContextStrategy): use the right comparison length for context diff, and replace payload first message when a summary is present

* chore: log previous_summary if debugging

* refactor(formatMessage): assume if role is defined that it's a valid value

* refactor(getMessagesWithinTokenLimit): remove summary logic
refactor(handleContextStrategy): add usePrevSummary logic in case only summary was pruned
refactor(loadHistory): initial message query will return all ordered messages but keep track of the latest summary
refactor(getMessagesForConversation): use object for single param, edit jsdoc, edit all files using the method
refactor(ChatGPTClient): order messages before buildPrompt is called, TODO: add convoSumBuffMemory logic

* fix: undefined handling and summarizing only when shouldRefineContext is true

* chore(BaseClient): fix test results omitting system role for summaries and test edge case

* chore: export summaryBuffer from index file

* refactor(OpenAIClient/BaseClient): move refineMessages to subclass, implement LLM initialization for summaryBuffer

* feat: add OPENAI_SUMMARIZE to enable summarizing, refactor: rename client prop 'shouldRefineContext' to 'shouldSummarize', change contextStrategy value to 'summarize' from 'refine'

* refactor: rename refineMessages method to summarizeMessages for clarity

* chore: clarify summary future intent in .env.example

* refactor(initializeLLM): handle case for either 'model' or 'modelName' being passed

* feat(gptPlugins): enable summarization for plugins

* refactor(gptPlugins): utilize new initializeLLM method and formatting methods for messages, use payload array for currentMessages and assign pastMessages sooner

* refactor(agents): use ConversationSummaryBufferMemory for both agent types

* refactor(formatMessage): optimize original method for langchain, add helper function for langchain messages, add JSDocs and tests
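
As an illustration of the langchain formatting idea in the commit above (the helper name and role mapping are assumptions, not the PR's exact code):

```ts
import { HumanMessage, AIMessage, SystemMessage } from 'langchain/schema';

// Hypothetical formatLangChainMessage: map a stored { role, content } record
// onto langchain's message classes. Per the earlier commit, a defined role is
// assumed to be a valid value; anything else defaults to a human message.
function formatLangChainMessage({ role, content }: { role?: string; content: string }) {
  switch (role) {
    case 'assistant':
      return new AIMessage(content);
    case 'system':
      return new SystemMessage(content);
    default:
      return new HumanMessage(content);
  }
}
```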

* refactor(summaryBuffer): add helper to createSummaryBufferMemory, and use new formatting helpers

* fix: forgot to spread formatMessages; also took the opportunity to pluralize the filename

* refactor: pass memory to tools, namely openapi specs. not used and may never be used by new method but added for testing

* ci(formatMessages): add more exhaustive checks for langchain messages

* feat: add debug env var for OpenAI

* chore: delete unnecessary comments

* chore: add extra note about summary feature

* fix: remove tokenCount from payload instructions

* fix: test fail

* fix: only pass instructions to payload when defined or not empty object

* refactor: fromPromptMessages is deprecated, use renamed method fromMessages

* refactor: use 'includes' instead of 'startsWith' for extended OpenRouter compatibility

* fix(PluginsClient.buildPromptBody): handle undefined message strings

* chore: log langchain titling error

* feat: getModelMaxTokens helper
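
A sketch of how such a lookup might work (the keys and token values here are illustrative, not the project's actual map). Matching by substring, per the 'includes' commit above, lets prefixed OpenRouter ids like 'openai/gpt-4-32k' resolve to the right context window:

```ts
// Illustrative token map: longer, more specific keys come first so substring
// matching resolves 'gpt-4-32k' before 'gpt-4'.
const maxTokensMap: Record<string, number> = {
  'gpt-4-32k': 32767,
  'gpt-4': 8191,
  'gpt-3.5-turbo-16k': 15999,
  'gpt-3.5-turbo': 4095,
};

function getModelMaxTokens(modelName: string): number | undefined {
  if (maxTokensMap[modelName]) {
    return maxTokensMap[modelName];
  }
  const key = Object.keys(maxTokensMap).find((k) => modelName.includes(k));
  return key ? maxTokensMap[key] : undefined;
}
```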

* feat: tokenSplit helper
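
A sketch of a token-based split (using js-tiktoken as a stand-in for whatever encoder the util actually wraps; the signature is assumed). This supports the behavior described in the Summary: when a prompt exceeds the summary model's context, keep as much of it as possible, starting from the end:

```ts
import { getEncoding } from 'js-tiktoken';

// Hypothetical tokenSplit: keep only the last `numTokens` tokens of `text`.
function tokenSplit(text: string, numTokens: number): string {
  const enc = getEncoding('cl100k_base');
  const tokens = enc.encode(text);
  return enc.decode(tokens.slice(-numTokens));
}
```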

* feat: summary prompts updated

* fix: optimize _CUT_OFF_SUMMARIZER prompt

* refactor(summaryBuffer): use custom summary prompt, allow prompt to be passed, pass humanPrefix and aiPrefix to memory, along with any future variables, rename messagesToRefine to context

* fix(summaryBuffer): handle edge case where messagesToRefine exceeds summary context,
refactor(BaseClient): allow custom maxContextTokens to be passed to getMessagesWithinTokenLimit, add defined check before unshifting summaryMessage, update shouldSummarize based on this
refactor(OpenAIClient): use getModelMaxTokens, use cut-off message method for summary if no messages were left after pruning

* fix(handleContextStrategy): handle case where incoming prompt is bigger than model context

* chore: rename refinedContent to splitText

* chore: remove unnecessary debug log
jinzishuai pushed a commit to aitok-ai/LibreChat that referenced this pull request May 20, 2024
BertKiv pushed a commit to BertKiv/LibreChat that referenced this pull request Dec 10, 2024

Successfully merging this pull request may close these issues:

  • [Enhancement] Add buffer to increase possible response tokens
  • Enhancement: Adopt ConversationSummaryBufferMemory