Enhance Qurator context with automatic README.md loading #4561
- Add ContextFiles.ts module with README.md loading utilities
- Create BucketContext provider to load bucket root README
- Integrate bucket context into Bucket.tsx
- Add markers for context readiness

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>

- Fix buildPathChain to properly exclude bucket root when stopAt=''
- Add DirContextFiles provider to load README hierarchy
- Integrate context provider into Dir component
- Prevent duplicate loading of bucket root README

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>

- Add FileContextFiles provider to load parent directory READMEs
- Integrate context provider into File component
- Load README hierarchy from file's parent directory up to bucket root
- Add marker fileContextFilesReady

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #4561 +/- ##
==========================================
- Coverage 39.44% 39.32% -0.12%
==========================================
Files 852 855 +3
Lines 36860 36975 +115
Branches 5764 6037 +273
==========================================
+ Hits 14538 14542 +4
+ Misses 21804 21206 -598
- Partials 518 1227 +709
Greptile Overview
Summary
This PR enhances Qurator's context system by automatically loading README.md files from directory hierarchies to provide better domain awareness.
Key Changes
- New ContextFiles module: Core functionality for loading README.md files from S3 with proper error handling and 100KB truncation
- Hierarchical context loading: README files are loaded from current location up to bucket root, providing contextual information at each level
- Integration across views: Added README context to bucket root, directory browsing, and file viewing
- Graceful error handling: Missing README files are handled silently without breaking functionality
Technical Implementation
The implementation follows existing patterns using LazyContext providers and the Effect library. README files are loaded in parallel using Promise.all() and formatted as XML messages for the assistant. The system includes appropriate loading states and markers for context readiness tracking.
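The hierarchy walk described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `buildPathChain`, `loadContextFile`, and `loadContextFileHierarchy` are named in this review, but their bodies and signatures here are guesses, the `FetchText` callback is a hypothetical stand-in for the S3 `getObject` call, and the 100KB limit is assumed to be counted in characters.

```typescript
// Hypothetical stand-in for the S3 read: resolves to the file text, or null on a 404.
type FetchText = (bucket: string, key: string) => Promise<string | null>

interface ContextFileContent {
  path: string
  content: string
  truncated: boolean
}

// 100KB limit from the review summary (assumed to be measured in characters).
const MAX_SIZE = 100 * 1024

// Lists directories from `path` upward, nearest first, stopping before `stopAt`.
// With stopAt = '' the bucket root itself is never included.
function buildPathChain(path: string, stopAt: string = ''): string[] {
  const stop = stopAt.replace(/\/+$/, '')
  const segments = path.split('/').filter(Boolean)
  const chain: string[] = []
  for (let i = segments.length; i > 0; i--) {
    const dir = segments.slice(0, i).join('/')
    if (dir === stop) break
    chain.push(dir)
  }
  return chain
}

// Loads one README.md; a missing file yields null instead of an error.
async function loadContextFile(
  fetchText: FetchText,
  bucket: string,
  dir: string,
): Promise<ContextFileContent | null> {
  const key = dir ? `${dir}/README.md` : 'README.md'
  const text = await fetchText(bucket, key)
  if (text == null) return null
  const truncated = text.length > MAX_SIZE
  return { path: key, content: truncated ? text.slice(0, MAX_SIZE) : text, truncated }
}

// Loads every README in the chain in parallel and drops the missing ones.
async function loadContextFileHierarchy(
  fetchText: FetchText,
  bucket: string,
  path: string,
  stopAt: string = '',
): Promise<ContextFileContent[]> {
  const loaded = await Promise.all(
    buildPathChain(path, stopAt).map((dir) => loadContextFile(fetchText, bucket, dir)),
  )
  return loaded.filter((f): f is ContextFileContent => f !== null)
}
```

Passing the reader as a callback keeps the sketch independent of any particular S3 client, and returning null for a missing file mirrors the "handled silently" behavior the summary describes.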
Confidence Score: 4/5
- This PR is largely safe to merge with some minor concerns
- The score reflects a well-structured implementation that follows existing patterns, with proper error handling and thorough integration. A minor deduction applies for a potentially unchecked undefined body and for CI deployment concerns
- Pay attention to ContextFiles.ts for the undefined body handling and verify CI deployment intentions
Important Files Changed
File Analysis
Filename | Score | Overview |
---|---|---|
catalog/app/components/Assistant/Model/ContextFiles.ts | 4/5 | New core module for README.md loading with proper error handling and truncation |
catalog/app/containers/Bucket/AssistantContext.tsx | 5/5 | New bucket context provider that loads README.md from bucket root |
catalog/app/containers/Bucket/DirAssistantContext.ts | 5/5 | Enhanced directory context with README hierarchy loading from current path upward |
catalog/app/containers/Bucket/File/AssistantContext.ts | 5/5 | Enhanced file context with parent directory README loading |
.github/workflows/deploy-catalog.yaml | 3/5 | Added qctx branch trigger and commented out GovCloud/MP deployments |
Sequence Diagram
sequenceDiagram
participant User
participant BucketComponent as Bucket.tsx
participant DirComponent as Dir.tsx
participant FileComponent as File.js
participant BucketContext as BucketContext
participant DirContext as DirContextFiles
participant FileContext as FileContextFiles
participant ContextFiles as ContextFiles.ts
participant S3
participant Assistant
User->>BucketComponent: Navigate to bucket
BucketComponent->>BucketContext: Initialize with bucket name
BucketContext->>ContextFiles: loadContextFile(s3, bucket, '')
ContextFiles->>S3: getObject(bucket, 'README.md')
S3-->>ContextFiles: README.md content or 404
ContextFiles-->>BucketContext: ContextFileContent or null
BucketContext->>Assistant: Format as XML message
User->>DirComponent: Navigate to directory
DirComponent->>DirContext: Initialize with bucket, path
DirContext->>ContextFiles: loadContextFileHierarchy(s3, bucket, path, '')
ContextFiles->>ContextFiles: buildPathChain(path, stopAt='')
loop For each path in chain
ContextFiles->>S3: getObject(bucket, path + '/README.md')
S3-->>ContextFiles: README.md content or 404
end
ContextFiles-->>DirContext: Array of ContextFileContent
DirContext->>Assistant: Format as XML messages
User->>FileComponent: View file
FileComponent->>FileContext: Initialize with bucket, path
FileContext->>ContextFiles: Get parent directory path
FileContext->>ContextFiles: loadContextFileHierarchy(s3, bucket, parentPath, '')
ContextFiles->>ContextFiles: buildPathChain(parentPath, stopAt='')
loop For each parent path
ContextFiles->>S3: getObject(bucket, path + '/README.md')
S3-->>ContextFiles: README.md content or 404
end
ContextFiles-->>FileContext: Array of ContextFileContent
FileContext->>Assistant: Format as XML messages
Note over Assistant: Now has context from README hierarchy for better assistance
11 files reviewed, 2 comments
- Create PackageMetadataContext to provide package metadata
- Create PackageRootContext to load README.md from package root
- Create PackageDirContext to load README hierarchy in packages
- Use LogicalKeyResolver to resolve virtual package paths to physical S3 keys
- Exclude root README from PackageDirContext to avoid duplication
- Integrate context providers into PackageTree, DirDisplay, and FileDisplay
- Update tasks.md to reflect completion of Task 5

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

- Add scope="bucket" or scope="package" attribute to distinguish context sources
- Add bucket="$bucket" attribute to all context files
- Add package-name="$name" attribute for package context files
- Update formatContextFileAsXML to accept optional attributes
- Update all context providers to pass appropriate attributes
- Update documentation (spec, plan, tasks) to reflect these changes

This helps the assistant understand the origin and scope of each context file.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
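A minimal sketch of how optional attributes might be rendered onto a context file. Only the function name `formatContextFileAsXML` and the attribute names `scope`, `bucket`, and `package-name` come from the commit message; the signature, the `context-file` element name, the `path` attribute, and the escaping helper are assumptions made for illustration.

```typescript
// Optional attributes; undefined values are skipped when rendering.
type XmlAttrs = Record<string, string | undefined>

// Escapes characters that would break a double-quoted XML attribute value.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

// Hypothetical signature: the element name `context-file` is an assumption.
function formatContextFileAsXML(path: string, content: string, attrs: XmlAttrs = {}): string {
  const all: XmlAttrs = { path, ...attrs }
  const rendered = Object.entries(all)
    .filter((entry): entry is [string, string] => entry[1] !== undefined)
    .map(([name, value]) => `${name}="${escapeAttr(value)}"`)
    .join(' ')
  return `<context-file ${rendered}>\n${content}\n</context-file>`
}

// Usage: a bucket-scoped file and a package-scoped file.
const bucketFile = formatContextFileAsXML('docs/README.md', '# Docs', {
  scope: 'bucket',
  bucket: 'my-bucket',
})
const packageFile = formatContextFileAsXML('README.md', '# Pkg', {
  scope: 'package',
  bucket: 'my-bucket',
  'package-name': 'team/data',
})
```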
- Extended PackageTree Revision query to include userMeta, workflow, totalEntries, message
- Added modified field to all package-related GraphQL queries to fix cache key consistency
- Updated PackageMetadataContext to receive revision data as props (avoids infinite loop)
- Pass revision data from PackageTreeQueries through PackageTree to context components

The infinite loop issue was caused by urql's cache key for PackageRevision, which uses both the hash and modified fields. When queries had inconsistent fields, the cache key would change between renders, causing re-fetches. Adding modified to all queries ensures stable cache keys.

Package metadata now exposed to assistant includes:
- userMeta: User-defined package metadata
- workflow: Package workflow information
- totalEntries: Number of entries in package
- totalBytes: Total size of package
- modified: Last modification timestamp
- message: Package revision message

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

- Added Task 5.4 documenting userMeta exposure implementation
- Added implementation notes about GraphQL cache key consistency
- Documented the root cause of infinite loop (cache key instability)
- Explained the solution (consistent modified field across queries)

Key learnings:
- urql cache keys for PackageRevision use hash:modified format
- All queries fetching PackageRevision must include modified field
- Data fetching inside LazyContext can cause infinite loops
- Pass data as props from parent components instead

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
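The cache-key instability described in these notes can be illustrated with a small sketch. The `hash:modified` format comes from the commit message; the function name, the input type, and the naive string interpolation are assumptions, not the catalog's actual key resolver.

```typescript
interface PackageRevisionFields {
  hash: string
  // Absent when a query does not select the field.
  modified?: string
}

// Hypothetical key function producing the `hash:modified` format.
function packageRevisionKey(rev: PackageRevisionFields): string {
  return `${rev.hash}:${rev.modified}`
}

// One query selects `modified`, another omits it.
const fromFullQuery = packageRevisionKey({ hash: 'abc123', modified: '2025-01-01T00:00:00Z' })
const fromPartialQuery = packageRevisionKey({ hash: 'abc123' })

// The same revision ends up under two different keys, so results from one
// query do not satisfy the other and the data is fetched again.
const keysMatch = fromFullQuery === fromPartialQuery
```

Under this assumption, selecting `modified` in every query that fetches PackageRevision makes both calls produce the same key, which is the fix the commit describes.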
Package metadata is now split into two separate XML tags:
- package-info tag for system metadata (bucket, name, hash, modified, message, workflow, totalEntries, totalBytes)
- package-metadata tag for user metadata (userMeta only, when present)

This provides better semantic separation between system-level and user-defined package information, making it easier for the Qurator assistant to understand and process different types of package data.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

- Reduced visual prominence with faint grey background and hover effect
- Replaced text labels with icons (Build, CheckCircleOutline, ErrorOutline)
- Made tool details collapsible with Material-UI Collapse component
- Extracted reusable ToolMessage component for DRY code
- Changed color scheme: intense/normal/faint instead of intense/bright/faint

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>
- Load both README.md and AGENTS.md files from directory hierarchies
- Reduce file size limit from 100KB to 10KB
- Limit to 10 non-root context files (root files always included)
- Prioritize closer directories over distant ones
- Fix performance: use Promise.all() for parallel loading instead of sequential
- Replace hardcoded values with imported constants

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

Changed loading order so AGENTS.md files are loaded before README.md at each directory level. This ensures agent-specific instructions have priority when the 10-file limit is reached.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
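The ordering and limits from the two commits above could look roughly like this. The numbers (10KB, 10 non-root files) and the AGENTS.md-before-README.md order come from the commit messages; the constant and function names are hypothetical, the size limit is assumed to be counted in characters, and the position of root files in the final list is a guess.

```typescript
// AGENTS.md is listed first so it is tried before README.md at each level.
const CONTEXT_FILE_NAMES = ['AGENTS.md', 'README.md']
const MAX_NON_ROOT_FILES = 10
// 10KB limit (assumed to be measured in characters).
const MAX_FILE_SIZE = 10 * 1024

// `dirs` is expected nearest-first and without the root, so closer directories
// come earlier in the candidate list.
function candidateKeys(dirs: string[]): string[] {
  const keys: string[] = []
  for (const dir of dirs) {
    for (const name of CONTEXT_FILE_NAMES) keys.push(`${dir}/${name}`)
  }
  return keys
}

// Keeps at most MAX_NON_ROOT_FILES non-root files; root files are always kept.
function applyLimits<T>(nonRootFiles: T[], rootFiles: T[]): T[] {
  return [...nonRootFiles.slice(0, MAX_NON_ROOT_FILES), ...rootFiles]
}

// Cuts oversized content down to the size limit.
function truncate(text: string): string {
  return text.length > MAX_FILE_SIZE ? text.slice(0, MAX_FILE_SIZE) : text
}
```

Because the candidate list is built nearest-first with AGENTS.md ahead of README.md, cutting it at ten entries drops the most distant files first and never drops an AGENTS.md in favor of the README.md beside it.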
- Add code review findings to plan.md
- Add Task 10: Code Quality Refactoring to tasks.md
  - Extract common loading pattern
  - Refactor context providers
  - Consolidate package context logic
  - Standardize error handling

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>

- Add useContextFileLoader hook for consistent loading pattern
- Extract loadPackageContextFile helper for package contexts
- Remove unnecessary error logging for expected 404s
- Reduce code duplication by ~50% across all context providers

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>

- Mark Task 10 as completed in tasks.md
- Add refactoring results to plan.md

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
* master:
  Qurator: adjust styling for tool messages (#4572)
  Qurator: Limit total search results contents context to 100k characters to avoid context window overflow (#4573)
  Bump amazonlinux from 2023.8.20250908.0 to 2023.8.20250915.0 in /catalog (#4558)
  Bump debian from bullseye-20250908-slim to bullseye-20250929-slim in /lambdas/thumbnail (#4568)
  Bump amazonlinux from 2023.8.20250908.0 to 2023.8.20250915.0 in /s3-proxy (#4559)
Summary
Key Features
Context File Loading
Package Context
- System metadata (<package-info> tag): bucket, name, hash, modified, message, workflow, stats
- User metadata (<package-metadata> tag): custom userMeta when present
Technical Improvements
Context Markers
- bucketContextFilesReady - Bucket root context loaded
- dirContextFilesReady - Directory hierarchy loaded
- fileContextFilesReady - File parent context loaded
- packageMetadataReady - Package metadata available
- packageContextFilesReady - Package root context loaded
- packageDirContextFilesReady - Package directory context loaded
Testing
Verify context loading by checking Qurator DevTools for the above markers and inspecting assistant context messages.