[docs-infra] Create llms.txt #46308
llms.txt and docs markdown.
Netlify deploy preview: https://deploy-preview-46308--material-ui.netlify.app/
Bundle size report:
@mui/material parsed: 0B (0.00%) gzip: 0B (0.00%)
@mui/lab/AdapterDateFns parsed: 0B (0.00%) gzip: 0B (0.00%)
The extractMarkdownInfo function was incorrectly capturing "## Usage" as the description for template README files. Updated the regex to use negative lookahead to exclude headers and added an additional check to ensure captured text doesn't start with '#'. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
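The fix described in this commit message could look roughly like the sketch below. Only the function name `extractMarkdownInfo`, the negative lookahead, and the extra `#` check come from the commit message; the return shape, the exact regular expressions, and the choice to return `null` when no description is found are assumptions made for illustration.

```typescript
// Hypothetical return shape; the real function may return more fields.
interface MarkdownInfo {
  title: string | null;
  description: string | null;
}

function extractMarkdownInfo(markdown: string): MarkdownInfo {
  // First H1 line is treated as the title.
  const titleMatch = markdown.match(/^# (.+)$/m);

  // After the H1 and any blank lines, capture one line of text,
  // but only if it does not start with '#' (negative lookahead).
  const descriptionMatch = markdown.match(/^# .+\n+(?!#)(.+)$/m);

  let description = descriptionMatch?.[1]?.trim() ?? null;
  // Additional guard mentioned in the commit: never accept a header as the description.
  if (description !== null && description.startsWith('#')) {
    description = null;
  }

  return {
    title: titleMatch?.[1]?.trim() ?? null,
    description,
  };
}
```

With this sketch, a README whose H1 is followed directly by `## Usage` yields no description instead of the header text.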
Looks largely good to me! Great effort. Added a couple of comments.
package.json
@@ -22,7 +22,8 @@
"release:pack": "tsx scripts/releasePack.mts",
"docs:api": "rimraf --glob ./docs/pages/**/api-docs ./docs/pages/**/api && pnpm docs:api:build",
"docs:api:build": "tsx ./scripts/buidApiDocs/index.ts",
"docs:build": "pnpm --filter docs build",
"docs:llms:build": "rimraf --glob ./docs/public/material-ui/ && tsx ./scripts/buildLlmsDocs/index.ts --projectSettings ./packages/api-docs-builder-core/materialUi/projectSettings.ts --nonComponentFolders material/getting-started material/customization material/experimental-api material/guides material/integrations material/migration",
Could we add nonComponentFolders as a list on the projectSettings to avoid having to pass a long list of folders in the command?
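A minimal sketch of what this suggestion might look like, assuming a simplified settings object. The `LlmsProjectSettings` interface and the `getNonComponentFolders` helper are hypothetical and are not the real `ProjectSettings` type from the API docs builder; the folder names are copied from the command above.

```typescript
// Hypothetical, trimmed-down settings shape for illustration only.
interface LlmsProjectSettings {
  nonComponentFolders?: string[];
}

// The folder list lives next to the rest of the project configuration
// instead of being passed as a long CLI argument.
const projectSettings: LlmsProjectSettings = {
  nonComponentFolders: [
    'material/getting-started',
    'material/customization',
    'material/experimental-api',
    'material/guides',
    'material/integrations',
    'material/migration',
  ],
};

// The build script can read the list from settings and fall back to an empty list.
function getNonComponentFolders(settings: LlmsProjectSettings): string[] {
  return settings.nonComponentFolders ?? [];
}
```

The command would then only need the `--projectSettings` argument.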
for (const file of files) {
  // Calculate relative path from the baseDir to the file
  const relativePath = file.outputPath.startsWith(`${baseDir}/`)
We should add absolute paths in the llms.txt (see: https://llms.mui.com/material-ui/7.1.0/llms.txt) so that an AI client is able to directly traverse/fetch content present in the links without needing to understand what the base URL should be
Good point, updated. domain can be specified to the CLI (for older versions), the default is mui.com.
hard disagree. these docs are not built for a specific domain, e.g. they are hosted on preview domains. we can make them root-relative if you want. any client consuming these links will have to resolve them, as is quite standard.
Given this is built for LLMs, relying on LLMs to be able to reliably resolve domains is a mistake since we will invite consistent erroneous resolution with no deterministic way of correcting it. Benchmarked against many other llms.txt, all of them contain fully resolved URLs to aid AI, not make it harder for it:
an mcp will have to resolve them before feeding to the llm. you'll want to be able to test tooling against preview builds of our docs.
My opposition is more because I think relative URLs here would be a mistake, not because it would be hard to do. We could very well do some sort of system prompt engineering in the MCP tools or some programmatic URL resolution in the tool itself, but I don't think we should.
Prompting it to resolve relative links would be unnecessary, the body of llms.txt passes through the mcp, it's trivial to parse and resolve these links and pass the document with resolved links. Relative linking is how the web works. When you follow the links through all your examples, e.g. https://vercel.com/docs/frameworks.md, you'll see that their inner links are relative. And that makes 100% sense, this is how the web works, so any tool that pretends to be able to consume it will need to be able to handle relative linking. i.e. the tool that fetches https://vercel.com/docs/frameworks.md will have to resolve these inner links for the LLM.
again, we have no reliable measure of determining which docs
Sure we do, if we try the chat 5 times with the same prompt and it answers wrong 4 times, we can check that ratio against another version of the docs. If it now answers wrong 2 out of 5 times we have successfully improved the docs.
Anyway, as I said before, if we want to absolute link, what's stopping us from generating correct links for the netlify preview? The url under which it will be deployed should be in the environment. You just have to pass it to the CLI. (also should make the whole url configurable, not just the domain; protocol and subdomain matter). I could live with that.
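The programmatic resolution discussed in this thread can be sketched with the standard `URL` constructor, which resolves a relative reference against a base. This is an illustration only: `resolveMarkdownLinks` is a hypothetical helper, not part of the PR or of any MCP tooling, and the simple link regex ignores edge cases such as link titles or parentheses inside URLs.

```typescript
// Rewrites inline markdown links so that every href is absolute.
// `baseUrl` is the full URL the document was fetched from (protocol and host included),
// so the same input works for production and for a preview deployment.
function resolveMarkdownLinks(markdown: string, baseUrl: string): string {
  return markdown.replace(
    /\[([^\]]*)\]\(([^)\s]+)\)/g,
    (match: string, text: string, href: string) => {
      try {
        // Absolute hrefs pass through unchanged; relative and
        // root-relative hrefs are resolved against baseUrl.
        const absolute = new URL(href, baseUrl).toString();
        return `[${text}](${absolute})`;
      } catch {
        // Leave anything that cannot be parsed as a URL untouched.
        return match;
      }
    },
  );
}
```

For example, with the base `https://deploy-preview-46308--material-ui.netlify.app/material-ui/llms.txt`, an illustrative link such as `[Button](/material-ui/react-button.md)` would resolve to the preview host, while the same document fetched from another host would resolve there instead.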
@bharatkashyap What do you think about testing with relative links first since it's the default behavior? If we can prove that it does not work well or is wrong with the MCP you work on, I can add the absolute links in a separate PR.
Anyway, as I said before, if you're really set on absolute linking, what's stopping you from generating correct links for the netlify preview? The url under which it will be deployed should be in the environment. You just have to pass it to the CLI. (also should make the whole url configurable, not just the domain; protocol and subdomain matter). I could live with that.
@siriwatknp This is the best option IMO, but I'll try out the llms.txt with relative links in the MCP and report
Ok, great. I'll merge this with the relative links and I'll help you test it out.
…s.txt - Add --domain CLI option with default value 'mui.com' - Update generateLlmsTxt to use absolute URLs instead of relative paths - Remove unused baseDir parameter from generateLlmsTxt function 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
…watknp/material-ui into docs/gen-llms
…projectSettings Move the nonComponentFolders configuration from command-line argument to projectSettings for better maintainability and consistency with other project configurations. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
Restore the --domain CLI option that was removed in the previous commit. This allows configuring the domain for absolute URLs in llms.txt files. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
This reverts commit 9523d2d.
A bit late to the party here, but any plans on supporting an llms.txt for MUI X? I've been wanting to feed some better documentation to an LLM for complex things like DataGrid and Charts.
Result:
Summary
This PR introduces a new script to generate LLM-optimized documentation by processing MUI component
markdown files and creating standalone documentation with embedded code examples and API references.
What's included
New build script: buildLlmsDocs
- /scripts/buildLlmsDocs/index.ts
- {{"demo": "filename.js"}} syntax with actual code snippets
- llms.txt based on the project settings
- --projectSettings parameter
Core processing utilities
- processComponent.ts: Handles demo replacement in markdown files
- processApi.ts: Converts API JSON to markdown tables with proper formatting
Key improvements
Usage
The script has been tested with Material UI components and documentation. Unit tests are included for the
core processing functions.
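To make the two processing steps named in the summary concrete, here is a rough sketch of demo replacement and of API-JSON-to-table conversion. It is not the code from `processComponent.ts` or `processApi.ts`: the function names, the in-memory `demoSources` map (standing in for reading demo files from disk), and the simplified `ComponentApi` shape are all assumptions.

```typescript
// Replace {{"demo": "filename.js"}} markers with the demo's source in a fenced block.
// `demoSources` maps file names to file contents (assumed; a real script would read files).
function replaceDemos(markdown: string, demoSources: Record<string, string>): string {
  const fence = '`'.repeat(3);
  return markdown.replace(
    /\{\{\s*"demo":\s*"([^"]+)"[^}]*\}\}/g,
    (marker: string, fileName: string) => {
      const source = demoSources[fileName];
      if (source === undefined) {
        return marker; // unknown demo: leave the marker untouched
      }
      const language = fileName.split('.').pop() ?? '';
      return [fence + language, source.trimEnd(), fence].join('\n');
    },
  );
}

// Assumed, simplified shape of a component's API JSON.
interface ApiProp {
  type: { name: string };
  default?: string;
  required?: boolean;
}
interface ComponentApi {
  props: Record<string, ApiProp>;
}

// Convert the props of one component into a markdown table.
function propsToMarkdownTable(api: ComponentApi): string {
  // Pipes inside a cell would break the table, so escape them.
  const escapeCell = (value: string) => value.replace(/\|/g, '\\|');
  const rows = Object.entries(api.props).map(
    ([name, prop]) =>
      `| ${name} | ${escapeCell(prop.type.name)} | ${escapeCell(prop.default ?? '-')} | ${prop.required ? 'Yes' : 'No'} |`,
  );
  return ['| Name | Type | Default | Required |', '| --- | --- | --- | --- |', ...rows].join('\n');
}
```

Keeping both steps as pure string-to-string functions is one way to make them easy to unit test, in line with the tests mentioned above.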
Part of mui/mui-public#423