Refactor logs #1068
Conversation
Codecov Report

Attention: Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #1068      +/-   ##
==========================================
+ Coverage   57.51%   58.75%   +1.23%
==========================================
  Files         109      110       +1
  Lines       14291    15021     +730
==========================================
+ Hits         8220     8825     +605
- Misses       6071     6196     +125
```

☔ View full report in Codecov by Sentry.
Tempted to rename ChatLogs to ChatTelemetry to prevent it from being confused with the logger.
My first question is high-level. While I realize this is for logging specifically, it seems to me that this could be generalized into a way to persist sessions. If we serialized messages and outputs (as specs) and stored those in the DB, we could eventually allow users to restore old sessions or even share a link to a session, which could then be restored on load. I'm not saying we have to handle that in this PR, but I do want to anticipate that possibility. To that end, I'd also suggest that we version the schema so that, as we modify it, we can implement schema migrations.
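Something as lightweight as SQLite's built-in `user_version` pragma would probably be enough to start with, assuming SQLite ends up as the backend (just a sketch of the idea):

```python
import sqlite3

SCHEMA_VERSION = 1  # bump whenever the logging schema changes

conn = sqlite3.connect("chat_logs.db")
current = conn.execute("PRAGMA user_version").fetchone()[0]
if current < SCHEMA_VERSION:
    # run incremental migrations from `current` up to SCHEMA_VERSION here,
    # then record the new version
    conn.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")
    conn.commit()
```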
On a more concrete note, I'd suggest that just like all components in Lumen (AI), the logging implementation should be swappable, i.e. can we make sure I can implement a different ChatLogs class and pass that to the UI constructor?
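Concretely, something like this is what I have in mind; the `ChatLogs` interface and the `logs` keyword below are only illustrative, not the current API:

```python
class ChatLogs:
    """Base interface: subclasses decide where and how messages are persisted."""

    def upsert(self, session_id: str, message_id: str, message_json: dict) -> None:
        raise NotImplementedError


class InMemoryChatLogs(ChatLogs):
    """Trivial implementation, e.g. for tests: keeps everything in a dict."""

    def __init__(self):
        self.messages = {}

    def upsert(self, session_id, message_id, message_json):
        self.messages[(session_id, message_id)] = message_json


# The UI would then accept any ChatLogs implementation, e.g. (assuming a
# hypothetical `logs` keyword on the constructor):
# ChatUI(..., logs=InMemoryChatLogs())
```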
Good ideas. I also think we could potentially deprecate/remove Interceptor and include it here somehow. I just discovered there's a way to capture Instructor's input with logging: https://python.useinstructor.com/concepts/logging/
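If I'm reading that page right, Instructor emits its requests/responses through the standard logging module, so something like the following should surface the raw inputs (needs verifying; the logger name is an assumption):

```python
import logging

# Raise the global log level so Instructor's request/response logs show up.
logging.basicConfig(level=logging.DEBUG)

# Or scope it to Instructor only (logger name assumed to be "instructor"):
logging.getLogger("instructor").setLevel(logging.DEBUG)
```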
For this, I refactored message_content -> message_json and began storing memory_json. In a separate PR, I imagine restoring a session is just a matter of looping through the rows of the messages table and hydrating the components (though I'm somewhat confident I missed capturing a piece of state).

This is what memory_json looks like:

```json
{"outputs": [], "plan": {"title": "Available Datasets", "steps": [{"expert_or_tool": "TableListAgent", "instruction": "Render a list of all available tables to provide the user with information about the datasets they can work with.", "title": "List Available Datasets", "render_output": true}]}, "reasoning": "The user is asking for a list of available datasets. The best approach is to use the TableListAgent, which is specifically designed to render a list of all available tables to the user. This will provide the user with the necessary information about the datasets they can work with.", "source": {"tables": {"windturbines.parquet": "read_parquet('/Users/ahuang/repos/lumen/windturbines.parquet')"}, "uri": ":memory:", "type": "duckdb"}, "sources": [{"tables": {"windturbines.parquet": "read_parquet('/Users/ahuang/repos/lumen/windturbines.parquet')"}, "uri": ":memory:", "type": "duckdb"}], "tool_context": ""}
```

This is what message_json looks like for a ChatStep:

```python
{'type': 'card',
'steps': [{'title': 'Plan with 2 steps created',
'content': "The user has requested to see the contents of the 'windturbines.parquet' table. The SQLAgent is the appropriate expert to use for this task as it can generate and execute a query to display the table's data. After retrieving the data, the AnalystAgent can be used to help interpret the results if needed.\n\nHere are the steps:\n\n1. SQLAgent: Generate and execute a query to display the contents of the 'windturbines.parquet' table.\n2. AnalystAgent: Analyze the results from the 'windturbines.parquet' table to provide insights and interpretations.\n",
'status': 'success'},
{'title': 'Obtained necessary context',
'content': '',
'status': 'success'},
{'title': 'Querying SQL agent...',
'content': "`SQL` agent is working on the following task:\n\nGenerate and execute a query to display the contents of the 'windturbines.parquet' table.",
'status': 'running'},
{'title': 'Retrieve wind turbines data',
'content': "The user requested to display the contents of the 'windturbines.parquet' table. I will write a SQL query to select all columns from this table.\n```sql\nSELECT * FROM read_parquet('windturbines.parquet') as windturbines\n```",
'status': 'success'}],
}
```

And for a typical message
Implemented SQLiteChatLogs.
I would strongly suggest not calling this feature Telemetry, as it is not doing what I usually understand as telemetry, i.e. a tool phoning home, often by default, with some data (e.g. VSCode making HTTP requests to Microsoft when I interact with it). What is added in this PR is about Logging or Monitoring.
Overall this looks great! Added some minor comments but the schema itself makes sense to me.
```python
from lumen.ai.agents import ChatAgent, SQLAgent
from lumen.ai.ui import ChatUI
from lumen.ai.coordinator import Planner
from lumen.ai.llm import OpenAI
from lumen.ai.logs import ChatLogs
from lumen.ai.tools import DocumentLookup
from lumen.ai.utils import deserialize_from_spec
Planner(llm=OpenAI(), agents=[ChatAgent(), SQLAgent()], tools=[DocumentLookup()]).to_spec()
```

Which outputs:

```python
{'type': 'lumen.ai.coordinator.Planner',
'agents': [{'type': 'lumen.ai.agents.SQLAgent',
'debug': False,
'llm': {'type': 'lumen.ai.llm.OpenAI',
'api_key': '',
'endpoint': '',
'interceptor': None,
'mode': {'type': 'instructor.mode.Mode', 'value': 'tool_call'},
'model_kwargs': {'default': {'model': 'gpt-4o-mini'},
'reasoning': {'model': 'gpt-4o'}},
'organization': '',
'temperature': 0.25,
'use_logfire': False},
'memory': None,
'prompts': {'find_tables': {'response_model': 'lumen.ai.models.make_tables_model',
'template': "{% extends 'Actor/main.jinja2' %}\n\n{% block instructions %}\nDetermine which tables are necessary to answer the user query, paying special attention to the available columns.\n{% if separator %}\nUse table names verbatim, and be sure to include the delimiters {{ separator }}, like '<source>{{ separator }}<table>'\n{% endif %}\n{% endblock %}\n\n{% block context %}\nAvailable tables and schemas:\n{{ tables_schema_str }}\n{% endblock %}\n"},
'main': {'response_model': 'lumen.ai.models.Sql',
'template': '{% extends \'Actor/main.jinja2\' %}\n\n{%- block instructions %}\nYou are an agent responsible for writing a SQL query that will perform the data transformations the user requested.\nTry not to take the query too literally, but instead focus on the user\'s intent and the data transformations required.\nUse `SELECT * FROM table` if there is no specific column selection mentioned in the query; no alias required.\n{%- endblock -%}\n\n{% block context -%}\n{%- if tables_sql_schemas -%}\nHere are YAML schemas for currently relevant tables:\n{% for table, details in tables_sql_schemas.items() %}\n- `{{ table }}`:\n```yaml\n{{ details.schema }}\n```\n{% endfor -%}\n{%- endif -%}\n\nChecklist:\n- Use only `{{ dialect }}` SQL syntax.\n- Do NOT include inlined comments in the SQL code, e.g. `-- comment`\n- Quote column names to ensure they do not clash with valid identifiers.\n- Pretty print the SQL output with newlines and indentation.\n{%- if join_required -%}\n- Please perform a join between the necessary tables.\n- If the join\'s values do not align based on the min/max lowest common denominator, then perform a join based on the closest match, or resample and aggregate the data to align the values.\n- Very important to transform the values to ensure they align correctly, especially for acronyms and dates.\n{%- endif -%}\n{%- if dialect == \'duckdb\' %}\n- If the table name originally did not have `read_*` prefix, use the original table name\n- Use table names verbatim; e.g. if table is read_csv(\'table.csv\') then use read_csv(\'table.csv\') and not \'table\' or \'table.csv\'\n- If `read_*` is used, use with alias, e.g. read_parquet(\'table.parq\') as table_parq\n- String literals are delimited using single quotes (\', apostrophe) and result in STRING_LITERAL values. Note that\ndouble quotes (") cannot be used as string delimiter character: instead, double quotes are used to delimit quoted\nidentifiers.\n{% endif %}\n{%- if dialect == \'snowflake\' %}\n- Do not under any circumstances add quotes around the database, schema or table name.\n{% endif -%}\n\nAdditionally, only if applicable:\n- Specify data types explicitly to avoid type mismatches.\n- Be sure to remove suspiciously large or small values that may be invalid, like -9999.\n- Use Common Table Expressions (CTEs) and subqueries to break down into manageable parts, only if the query requires more than one transformation.\n- Filter and sort data efficiently (e.g., ORDER BY key metrics) and use LIMIT (greater than 1) to focus on the most relevant results.\n- If the date columns are separated, e.g. 
year, month, day, then join them into a single date column.\n\n{%- if has_errors %}\nIf there are issues with the query, here are some common fixes:\n- Handle NULL values using functions like COALESCE or IS NULL.\n- If it\'s a date column (excluding individual year/month/day integers) date, cast to date using appropriate syntax, e.g.\nCAST or TO_DATE\n- Capture only the required numeric values while removing all whitespace, like `(\\d+)`, or remove characters like `$`, `%`, `,`, etc, only if needed.\n- Ensure robust type conversion using functions like TRY_CAST to avoid query failures due to invalid data.\n{% endif %}\n{%- endblock -%}\n\n{% if comments is defined -%}\nHere\'s additional guidance:\n{{ comments }}\n{%- endif -%}\n\n{%- block examples %}\n{%- if has_errors -%}\nCasting Examples:\n\nIf the query is "Which five regions have the highest total sales from 2022-02-22?"...\n\n- GOOD:\n```sql\nWITH sales_summary AS (\n SELECT\n "region",\n SUM(\n TRY_CAST(\n REPLACE(\n REPLACE("amount", \'$\', \'\'),\n \',\', \'\'\n ) AS DECIMAL(10,2)\n )\n ) AS total_sales\n FROM read_csv(\'sales.csv\')\n WHERE "sale_date" >= DATE \'2022-02-22\'\n AND TRY_CAST(REPLACE(REPLACE("amount", \'$\', \'\'), \',\', \'\') AS DECIMAL(10,2)) IS NOT NULL\n AND "region" IS NOT NULL\n GROUP BY "region"\n)\nSELECT\n "region",\n total_sales\nFROM sales_summary\nWHERE total_sales > 0\nORDER BY total_sales DESC\nLIMIT 5;\n```\n\n- BAD:\n```sql\nSELECT region, SUM(amount) AS total_sales\nFROM sales\nWHERE sale_date >= \'2022-02-22\'\nGROUP BY region\nORDER BY total_sales DESC;\n```\n{%- endif -%}\n{% endblock -%}\n'}},
'provides': ['data', 'pipeline', 'sql', 'table', 'tables_sql_schemas'],
'purpose': '\n Responsible for displaying tables, generating, modifying and\n executing SQL queries to answer user queries about the data,\n such querying subsets of the data, aggregating the data and\n calculating results. If the current table does not contain all\n the available data the SQL agent is also capable of joining it\n with other tables. Will generate and execute a query in a single\n step.',
'requires': ['source'],
'steps_layout': None,
'template_overrides': {},
'user': 'Lumen'},
{'type': 'lumen.ai.agents.ChatAgent',
'debug': False,
'llm': {'type': 'lumen.ai.llm.OpenAI',
'api_key': '',
'endpoint': '',
'interceptor': None,
'mode': {'type': 'instructor.mode.Mode', 'value': 'tool_call'},
'model_kwargs': {'default': {'model': 'gpt-4o-mini'},
'reasoning': {'model': 'gpt-4o'}},
'organization': '',
'temperature': 0.25,
'use_logfire': False},
'memory': None,
'prompts': {'main': {'template': '{% extends \'Actor/main.jinja2\' %}\n\n{%- block instructions %}\nAct as a helpful assistant for high-level data exploration, focusing on available datasets and, only if data is\navailable, explaining the purpose of each column. Offer suggestions for getting started if needed, remaining factual and\navoiding speculation. Do not write code or give code related suggestions.\n{%- endblock %}\n\n{% block context %}\n{%- if \'data\' in memory %}\nHere\'s a summary of the dataset the user just asked about:\n```\n{{ memory[\'data\'] }}\n```\n{%- endif %}\n{% if tables_schemas is defined %}\nAvailable table schemas:\n{% for table, schema in tables_schemas %}\n- `{{ table }}`: {{ schema }}\n{% endfor %}\n{% elif table is defined and schema is defined %}\n- `{{ table }}`: {{ schema }}\n{% endif -%}\nHere was the plan that was executed:\n"""\n{{ memory.reasoning }}\n"""\n{% endblock -%}\n',
'tools': ['lumen.ai.tools.DocumentLookup',
'lumen.ai.tools.TableLookup']}},
'provides': [],
'purpose': '\n Chats and provides info about high level data related topics,\n e.g. the columns of the data or statistics about the data,\n and continuing the conversation.\n\n Is capable of providing suggestions to get started or comment on interesting tidbits.\n It can talk about the data, if available. Or, it can also solely talk about documents.\n\n Usually not used concurrently with SQLAgent, unlike AnalystAgent.\n Can be used concurrently with TableListAgent to describe available tables\n and potential ideas for analysis, but if only documents are available,\n then it can be used alone.',
'requires': [],
'steps_layout': None,
'template_overrides': {},
'user': 'Agent'}],
'demo_inputs': ['Perform an analysis using the first suggestion.',
'Show me a plot of these results.',
'Show me the the first dataset and its columns.',
'What are some interesting analyses I can do?',
'What datasets are available?'],
'history': 3,
'llm': {'type': 'lumen.ai.llm.OpenAI',
'api_key': '',
'endpoint': '',
'interceptor': None,
'mode': {'type': 'instructor.mode.Mode', 'value': 'tool_call'},
'model_kwargs': {'default': {'model': 'gpt-4o-mini'},
'reasoning': {'model': 'gpt-4o'}},
'organization': '',
'temperature': 0.25,
'use_logfire': False},
'memory': None,
'prompts': {'context': {'response_model': 'lumen.ai.models.make_context_model',
'template': "{% extends 'Actor/main.jinja2' %}\n\n{%- block instructions %}\nYou are team lead and have to make a plan to solve the user's query but before you start you can look up some context to make informed decisions.\n\n- Only use tables to request info about tables that might contain relevant data.\n- You have access to the tables so never invoke a tool with the goal of getting data.\n{% endblock -%}\n\n{% block context -%}\n{%- if 'table' in memory %}- The result of the previous step was the `{{ memory['table'] }}` table. Consider carefully if it contains all the information you need and only request more tables if absolutely necessary.{% endif -%}\n\n{%- if table_info %}\nHere are tables and schemas that are already available to you:\n{{ table_info }}\n{%- endif %}\n\nHere's the choice of tools and their uses:\n{% if tools %}\nHere's a list of tools:\n{%- for tool in tools %}\n- `{{ tool.name }}`\n Requires: {{ tool.requires }}\n Provides: {{ tool.provides }}\n Description: {{ tool.purpose.strip().split() | join(' ') }}\n{%- endfor -%}\n{%- endif %}\n\n{%- endblock -%}\n"},
'main': {'response_model': 'lumen.ai.models.make_plan_models',
'template': "{% extends 'Actor/main.jinja2' %}\n\n{%- block instructions %}\nYou are team lead and have to make a plan to solve how to address the user query step-by-step by assigning subtasks to a set of experts and tools.\n\nGround Rules:\n- Each of these experts requires certain information and has the ability to provide certain information.\n- Do not perform tasks the user didn't ask for, e.g. do not plot the data unless requested or compute things if the user asked you to summarize the results in words.\n- Ensure that you provide each expert the context they need to ensure they do not repeat previous steps.\n- Do not go in depth to try to solve the problem, just make a plan and let the experts do the work.\n{%- if tools %}\n- Tools do not interact with a user, assign an expert to report, summarize or use the results.\n- When looking up information with a tool ensure the expert comes AFTER the tool.\n{%- endif %}\n\nImportant Agent Rules:\n- The SQLAgent can generate and execute multiple queries in a single step with joins. DO NOT create two separate steps for generating the query and then executing it.\n- The SQLAgent is a better candidate than TableListAgent if asked to show the table, and usually is followed by the AnalystAgent, which can help the user understand the results of the query.\n- Only use SQLAgent when working with new data or different analysis requirements. If the user wants to visualize, discuss, or further analyze data that's already been retrieved, use the current SQL rather than executing the query again.\n- The ChatAgent usually can be used alone, but if the query is related to the data tables, please use AnalystAgent instead.\n{% endblock -%}\n\n{% block context -%}\n{%- if tool_context %}\nNote that you previously decided that you required additional context and got the following responses DO NOT invoke the same tools again: {{ tool_context }}\n{% endif %}\n\n{%- if table_info %}\nHere are tables and schemas that are available to you:\n{{ table_info }}\n{%- endif %}\n{%- if tables_schema_str %}\n{{ tables_schema_str }}\n{%- endif -%}\n{% if memory.get('document_sources') %}\nHere are the documents you have access to:\n{%- for document_source in memory['document_sources'] %}\n- '''{{ document_source['text'][:80].replace('\\n', ' ') | default('<No text available></No>') }}...''' ({{ document_source['metadata'] | default('Unknown Filename') }})\n\n{%- endfor %}\n{% endif %}\nHere's the choice of experts and their uses:\n{%- for agent in agents %}\n- `{{ agent.name[:-5] }}`\n Requires: {{ agent.requires }}\n Provides: {{ agent.provides }}\n Description: {{ agent.purpose.strip().split() | join(' ') }}\n{%- endfor -%}\n{% if tools %}\nHere's a list of tools:\n{%- for tool in tools %}\n- `{{ tool.name }}`\n Requires: {{ tool.requires }}\n Provides: {{ tool.provides }}\n Description: {{ tool.purpose.strip().split() | join(' ') }}\n{%- endfor -%}\n{%- endif %}\n\n{%- if 'sql' in memory %}\nThe following is the current SQL.\n```sql\n{{ memory['sql'] }}\n```\n{%- endif %}\n\n{%- if 'table' in memory %}\n- The result of the previous step was the `{{ memory['table'] }}` table. If the user is referencing a previous result this is probably what they're referring to. 
Consider carefully if it contains all the information you need and only invoke the SQL agent if some other calculation needs to be performed.\n- However, if the user requests to see all the columns, they might be referring to the table that `{{ memory['table'] }} was derived from.\n- If you are invoking a SQL agent and reusing the table, tell it to reference that table by name rather than re-stating the query.\n{%- endif %}\n\n{%- if unmet_dependencies %}\nHere were your failed previous plans:\n{%- for previous_plan in previous_plans %}\n- {{ previous_plan }}\n{%- endfor %}\nThese previous plans failed because it did not satisfy all requirements; the last plan failed to provide for: `{{ unmet_dependencies }}`\n\nPlease include some of these these experts to provide for the missing requirements:\n{%- for candidate in candidates %}\n- `{{ candidate.name[:-5] }}`\n{%- endfor %}\n{% endif %}\n{%- endblock -%}\n"}},
'render_output': True,
'suggestions': ['Can you visualize the data?',
'What datasets are available?',
"What's interesting to analyze?"],
'template_overrides': {},
'tools': [{'type': 'lumen.ai.tools.DocumentLookup',
'llm': {'type': 'lumen.ai.llm.OpenAI',
'api_key': '',
'endpoint': '',
'interceptor': None,
'mode': {'type': 'instructor.mode.Mode', 'value': 'tool_call'},
'model_kwargs': {'default': {'model': 'gpt-4o-mini'},
'reasoning': {'model': 'gpt-4o'}},
'organization': '',
'temperature': 0.25,
'use_logfire': False},
'memory': None,
'min_similarity': 0.1,
'n': 3,
'prompts': {},
'provides': [],
'purpose': '',
'requires': ['document_sources'],
'template_overrides': {},
'vector_store': {'type': 'lumen.ai.vector_store.NumpyVectorStore',
'chunk_size': 1024,
'embeddings': {'type': 'lumen.ai.embeddings.NumpyEmbeddings',
'vocab_size': 1536},
   'vocab_size': 1536}}]}
```

Just need to patch up logs.py and re-arrange the tables now.

This PR introduces a database schema to track user interactions with, and feedback on, Lumen AI.
The primary goal is to gather data on what users like, dislike, and retry, helping us iteratively improve our prompts and agent configurations.
The schema centers on message and interaction tracking: the messages table stores the latest/current version of each message, while the retries table maintains a history of all retry attempts. Both tables track likes/dislikes separately, since users may dislike an initial response but like the retry.
The system uses multiple states for message modifications, distinguishing between when a user clicks the rerun button to reprompt the LLM with the same prompt (reran), manually edits the message (edited), or uses a larger LLM model to rewrite the response (retried). Other states include active, undone, and cleared.
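For illustration, the two core tables look roughly like this (simplified sketch of the idea, not the exact DDL in this PR):

```python
import sqlite3

conn = sqlite3.connect("chat_logs.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS messages (
    message_id   TEXT PRIMARY KEY,               -- stable id of the chat message
    session_id   TEXT NOT NULL,
    message_json TEXT NOT NULL,                  -- latest/current version of the message
    memory_json  TEXT,                           -- serialized memory snapshot
    state        TEXT NOT NULL DEFAULT 'active', -- active | reran | edited | retried | undone | cleared
    liked        INTEGER NOT NULL DEFAULT 0,
    disliked     INTEGER NOT NULL DEFAULT 0,
    timestamp    TEXT DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS retries (
    retry_id     INTEGER PRIMARY KEY AUTOINCREMENT,
    message_id   TEXT NOT NULL REFERENCES messages(message_id),
    attempt      INTEGER NOT NULL,               -- 1, 2, ... per message
    message_json TEXT NOT NULL,                  -- the retried/edited version
    state        TEXT NOT NULL,
    liked        INTEGER NOT NULL DEFAULT 0,     -- tracked separately from messages.liked
    disliked     INTEGER NOT NULL DEFAULT 0,
    timestamp    TEXT DEFAULT CURRENT_TIMESTAMP
);
""")
```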
For efficient storage, the schema uses hash-based IDs to deduplicate identical configurations, allowing agents and coordinators with identical configs to share the same database entry.
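The hash-based IDs can be computed from the canonical serialized spec, roughly like this (illustrative sketch):

```python
import hashlib
import json

def config_hash(spec: dict) -> str:
    """Stable ID for a serialized agent/coordinator config: identical
    specs hash to the same value and therefore share one database row."""
    canonical = json.dumps(spec, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
```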
It also offers convenient methods to view telemetry:
Debating whether we should rename ChatLogs to ChatTelemetry.
Depends on holoviz/panel#7722
WIP: