nl0 commented Sep 25, 2025

Summary

  • Enhance Qurator assistant with automatic context file loading from directory hierarchy
  • Support both README.md and AGENTS.md files throughout bucket/package structure
  • Enforce limits: 10KB per file and at most 10 non-root files
  • Add package metadata context with system/user separation

Key Features

Context File Loading

  • Loads both README.md and AGENTS.md files from directory hierarchy
  • AGENTS.md takes priority over README.md when the file limit is reached
  • Works for bucket browsing, file viewing, and package navigation
  • 10KB size limit per file (reduced from 100KB for better performance)
  • 10 file limit for non-root files (root files always included)
  • Files loaded from most specific (current) to least specific (root)
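
The limits and ordering above can be sketched as a small selection function. This is only an illustration under stated assumptions: the names `limitContextFiles`, `ContextFile`, `MAX_FILE_SIZE`, and `MAX_NON_ROOT_FILES` are hypothetical, the size limit is counted in characters rather than bytes, and the PR's actual code in ContextFiles.ts may be structured differently.

```typescript
// Hypothetical names and shapes; only the limits and ordering rules come from the description.
const MAX_FILE_SIZE = 10 * 1024 // 10KB per file (counted in characters in this sketch)
const MAX_NON_ROOT_FILES = 10
// AGENTS.md is listed first so it is kept in preference to README.md when the limit is hit.
const FILE_NAMES = ['AGENTS.md', 'README.md']

interface ContextFile {
  dir: string // '' denotes the root
  name: string
  content: string
}

// `dirs` must be ordered from most specific (current) to least specific (root).
function limitContextFiles(dirs: string[], found: ContextFile[]): ContextFile[] {
  const nonRoot: ContextFile[] = []
  const root: ContextFile[] = []
  for (const dir of dirs) {
    for (const name of FILE_NAMES) {
      const file = found.filter((f) => f.dir === dir && f.name === name)[0]
      if (!file) continue
      const truncated = { ...file, content: file.content.slice(0, MAX_FILE_SIZE) }
      if (dir === '') {
        root.push(truncated) // root files are always included
      } else if (nonRoot.length < MAX_NON_ROOT_FILES) {
        nonRoot.push(truncated) // closer directories fill the quota first
      }
    }
  }
  return [...nonRoot, ...root]
}
```

Walking the directories from the current one upward means that distant ancestors are the first to be dropped once the quota is used up, while root files bypass the quota entirely.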

Package Context

  • System Metadata (<package-info> tag): bucket, name, hash, modified, message, workflow, stats
  • User Metadata (<package-metadata> tag): custom userMeta when present
  • Separate contexts for package root and package directories
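
A minimal sketch of the system/user separation described above. The tag names come from this description; the function name `formatPackageContext`, the `PackageInfo` shape, and the choice of an escaped JSON body are assumptions for illustration only and may not match the PR's actual output.

```typescript
// Hypothetical shape: field names follow the list above, types are assumed.
interface PackageInfo {
  bucket: string
  name: string
  hash: string
  modified: string
  message: string | null
  workflow: unknown
  totalEntries: number | null
  totalBytes: number | null
}

// Escape characters that could otherwise close the wrapping tag early.
function escapeText(s: string): string {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
}

// Returns one message for system metadata and, only when userMeta is present,
// a second message for user metadata.
function formatPackageContext(info: PackageInfo, userMeta?: object | null): string[] {
  const messages = [
    `<package-info>\n${escapeText(JSON.stringify(info, null, 2))}\n</package-info>`,
  ]
  if (userMeta != null) {
    messages.push(
      `<package-metadata>\n${escapeText(JSON.stringify(userMeta, null, 2))}\n</package-metadata>`,
    )
  }
  return messages
}
```

Keeping user metadata in its own tag lets the assistant tell catalog-generated fields apart from whatever a package author chose to attach.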

Technical Improvements

  • De-emphasized tool UI: collapsible tool messages with icons instead of text labels

Context Markers

  • bucketContextFilesReady - Bucket root context loaded
  • dirContextFilesReady - Directory hierarchy loaded
  • fileContextFilesReady - File parent context loaded
  • packageMetadataReady - Package metadata available
  • packageContextFilesReady - Package root context loaded
  • packageDirContextFilesReady - Package directory context loaded

Testing

Verify context loading by checking Qurator DevTools for the above markers and inspecting assistant context messages.

nl0 and others added 7 commits September 25, 2025 15:10
- Add ContextFiles.ts module with README.md loading utilities
- Create BucketContext provider to load bucket root README
- Integrate bucket context into Bucket.tsx
- Add markers for context readiness

🤖 Generated with Claude Code

Co-Authored-By: Claude <[email protected]>
- Fix buildPathChain to properly exclude bucket root when stopAt=''
- Add DirContextFiles provider to load README hierarchy
- Integrate context provider into Dir component
- Prevent duplicate loading of bucket root README

🤖 Generated with Claude Code

Co-Authored-By: Claude <[email protected]>
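
For illustration, the path-chain behavior this commit refers to might look like the sketch below. The name `buildPathChain` and the `stopAt` parameter appear in the commit message; the body, the slash normalization, and the exact return shape are assumptions rather than the PR's code.

```typescript
// Assumed semantics: return the directories from `path` upward, most specific first,
// excluding `stopAt` itself. With stopAt = '' the bucket root is excluded, so a
// separate provider can load the root README without duplication.
function buildPathChain(path: string, stopAt: string = ''): string[] {
  const normalize = (p: string): string => p.replace(/^\/+|\/+$/g, '')
  const stop = normalize(stopAt)
  const chain: string[] = []
  let current = normalize(path)
  while (current !== stop && current !== '') {
    chain.push(current)
    const idx = current.lastIndexOf('/')
    current = idx === -1 ? '' : current.slice(0, idx)
  }
  return chain
}
```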
- Add FileContextFiles provider to load parent directory READMEs
- Integrate context provider into File component
- Load README hierarchy from file's parent directory up to bucket root
- Add marker fileContextFilesReady

🤖 Generated with Claude Code

Co-Authored-By: Claude <[email protected]>

codecov bot commented Sep 25, 2025

Codecov Report

❌ Patch coverage is 3.38983% with 114 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.32%. Comparing base (6c50227) to head (ddd4b59).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...log/app/components/Assistant/Model/ContextFiles.ts | 0.00% | 67 Missing ⚠️ |
| ...containers/Bucket/PackageTree/AssistantContext.tsx | 0.00% | 20 Missing and 2 partials ⚠️ |
| ...log/app/containers/Bucket/File/AssistantContext.ts | 0.00% | 5 Missing ⚠️ |
| catalog/app/utils/XML.ts | 16.66% | 4 Missing and 1 partial ⚠️ |
| catalog/app/containers/Bucket/AssistantContext.tsx | 0.00% | 4 Missing ⚠️ |
| .../app/containers/Bucket/PackageTree/PackageTree.tsx | 0.00% | 4 Missing ⚠️ |
| ...talog/app/containers/Bucket/DirAssistantContext.ts | 0.00% | 3 Missing ⚠️ |
| catalog/app/utils/LogicalKeyResolver.tsx | 40.00% | 3 Missing ⚠️ |
| catalog/app/containers/Bucket/Bucket.tsx | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4561      +/-   ##
==========================================
- Coverage   39.44%   39.32%   -0.12%     
==========================================
  Files         852      855       +3     
  Lines       36860    36975     +115     
  Branches     5764     6037     +273     
==========================================
+ Hits        14538    14542       +4     
+ Misses      21804    21206     -598     
- Partials      518     1227     +709     
Flag Coverage Δ
api-python 91.67% <ø> (ø)
catalog 20.11% <3.38%> (-0.08%) ⬇️
lambda 92.60% <ø> (ø)

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Summary

This PR enhances Qurator's context system by automatically loading README.md files from directory hierarchies to provide better domain awareness.

Key Changes

  • New ContextFiles module: Core functionality for loading README.md files from S3 with proper error handling and 100KB truncation
  • Hierarchical context loading: README files are loaded from current location up to bucket root, providing contextual information at each level
  • Integration across views: Added README context to bucket root, directory browsing, and file viewing
  • Graceful error handling: Missing README files are handled silently without breaking functionality

Technical Implementation

The implementation follows existing patterns, using LazyContext providers and the Effect library. README files are loaded in parallel using Promise.all() and formatted as XML messages for the assistant. The system includes loading states and markers for tracking context readiness.
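
The parallel loading with silent handling of missing files could be sketched roughly as follows. This is not the PR's code: `loadReadmes`, `FetchText`, and `isNotFound` are hypothetical names, the S3 client is replaced by an injected fetch function, and treating a `NoSuchKey` code or a 404 status as "missing" is an assumption about how the error surfaces.

```typescript
interface LoadedFile {
  key: string
  content: string
}

// Hypothetical abstraction over the S3 call: resolves with the object body as text.
type FetchText = (key: string) => Promise<string>

// Assumption: a missing object surfaces as an error with code 'NoSuchKey' or status 404.
function isNotFound(e: unknown): boolean {
  const err = e as { code?: string; statusCode?: number } | null
  return !!err && (err.code === 'NoSuchKey' || err.statusCode === 404)
}

// Requests every candidate README at once; missing files are dropped silently,
// any other error still propagates to the caller.
async function loadReadmes(fetchText: FetchText, dirs: string[]): Promise<LoadedFile[]> {
  const results = await Promise.all(
    dirs.map(async (dir): Promise<LoadedFile | null> => {
      const key = dir ? `${dir}/README.md` : 'README.md'
      try {
        return { key, content: await fetchText(key) }
      } catch (e) {
        if (isNotFound(e)) return null
        throw e
      }
    }),
  )
  const loaded: LoadedFile[] = []
  for (const r of results) if (r) loaded.push(r)
  return loaded
}
```

Promise.all() preserves input order, so the result stays ordered from the most specific directory to the least specific one even though the requests run concurrently.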

Confidence Score: 4/5

  • This PR is largely safe to merge with some minor concerns
  • The score reflects a well-structured implementation that follows existing patterns, with proper error handling and thorough integration. A minor deduction applies for a potentially missing check for an undefined response body and for CI deployment concerns
  • Pay attention to ContextFiles.ts for the undefined body handling and verify CI deployment intentions

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| catalog/app/components/Assistant/Model/ContextFiles.ts | 4/5 | New core module for README.md loading with proper error handling and truncation |
| catalog/app/containers/Bucket/AssistantContext.tsx | 5/5 | New bucket context provider that loads README.md from bucket root |
| catalog/app/containers/Bucket/DirAssistantContext.ts | 5/5 | Enhanced directory context with README hierarchy loading from current path upward |
| catalog/app/containers/Bucket/File/AssistantContext.ts | 5/5 | Enhanced file context with parent directory README loading |
| .github/workflows/deploy-catalog.yaml | 3/5 | Added qctx branch trigger and commented out GovCloud/MP deployments |

Sequence Diagram

sequenceDiagram
    participant User
    participant BucketComponent as Bucket.tsx
    participant DirComponent as Dir.tsx
    participant FileComponent as File.js
    participant BucketContext as BucketContext
    participant DirContext as DirContextFiles
    participant FileContext as FileContextFiles
    participant ContextFiles as ContextFiles.ts
    participant S3
    participant Assistant

    User->>BucketComponent: Navigate to bucket
    BucketComponent->>BucketContext: Initialize with bucket name
    BucketContext->>ContextFiles: loadContextFile(s3, bucket, '')
    ContextFiles->>S3: getObject(bucket, 'README.md')
    S3-->>ContextFiles: README.md content or 404
    ContextFiles-->>BucketContext: ContextFileContent or null
    BucketContext->>Assistant: Format as XML message
    
    User->>DirComponent: Navigate to directory
    DirComponent->>DirContext: Initialize with bucket, path
    DirContext->>ContextFiles: loadContextFileHierarchy(s3, bucket, path, '')
    ContextFiles->>ContextFiles: buildPathChain(path, stopAt='')
    loop For each path in chain
        ContextFiles->>S3: getObject(bucket, path + '/README.md')
        S3-->>ContextFiles: README.md content or 404
    end
    ContextFiles-->>DirContext: Array of ContextFileContent
    DirContext->>Assistant: Format as XML messages
    
    User->>FileComponent: View file
    FileComponent->>FileContext: Initialize with bucket, path
    FileContext->>ContextFiles: Get parent directory path
    FileContext->>ContextFiles: loadContextFileHierarchy(s3, bucket, parentPath, '')
    ContextFiles->>ContextFiles: buildPathChain(parentPath, stopAt='')
    loop For each parent path
        ContextFiles->>S3: getObject(bucket, path + '/README.md')
        S3-->>ContextFiles: README.md content or 404
    end
    ContextFiles-->>FileContext: Array of ContextFileContent
    FileContext->>Assistant: Format as XML messages

    Note over Assistant: Now has context from README hierarchy for better assistance

11 files reviewed, 2 comments

nl0 and others added 21 commits September 25, 2025 20:20
- Create PackageMetadataContext to provide package metadata
- Create PackageRootContext to load README.md from package root
- Create PackageDirContext to load README hierarchy in packages
- Use LogicalKeyResolver to resolve virtual package paths to physical S3 keys
- Exclude root README from PackageDirContext to avoid duplication
- Integrate context providers into PackageTree, DirDisplay, and FileDisplay
- Update tasks.md to reflect completion of Task 5

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Add scope="bucket" or scope="package" attribute to distinguish context sources
- Add bucket="$bucket" attribute to all context files
- Add package-name="$name" attribute for package context files
- Update formatContextFileAsXML to accept optional attributes
- Update all context providers to pass appropriate attributes
- Update documentation (spec, plan, tasks) to reflect these changes

This helps the assistant understand the origin and scope of each context file.
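
As a rough sketch of what passing optional attributes might look like: the function name `formatContextFileAsXML` and the attribute names `scope`, `bucket`, and `package-name` come from this commit message, while the `context-file` tag name, the `path` attribute, the signature, and the escaping rules below are assumptions made for the example.

```typescript
type Attrs = Record<string, string | undefined>

function escapeXml(s: string): string {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// Hypothetical signature: the real helper may take different arguments.
function formatContextFileAsXML(path: string, content: string, attrs: Attrs = {}): string {
  const all: Attrs = { path, ...attrs }
  const rendered = Object.keys(all)
    .filter((k) => all[k] !== undefined) // skip attributes that were not provided
    .map((k) => `${k}="${escapeXml(all[k] as string)}"`)
    .join(' ')
  return `<context-file ${rendered}>\n${escapeXml(content)}\n</context-file>`
}

// Example: a context file found inside a package.
const example = formatContextFileAsXML('docs/README.md', 'Hello', {
  scope: 'package',
  bucket: 'my-bucket',
  'package-name': 'team/data',
})
```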

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Extended PackageTree Revision query to include userMeta, workflow, totalEntries, message
- Added modified field to all package-related GraphQL queries to fix cache key consistency
- Updated PackageMetadataContext to receive revision data as props (avoids infinite loop)
- Pass revision data from PackageTreeQueries through PackageTree to context components

The infinite loop was caused by urql's cache key for PackageRevision, which uses
both the hash and modified fields. When queries requested inconsistent sets of
fields, the cache key changed between renders and triggered re-fetches. Adding
modified to all queries ensures stable cache keys.
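
A small sketch of why a missing field destabilizes the key. The `hash:modified` format is taken from this commit message; the function name `packageRevisionKey` and the data shape are hypothetical. In urql's Graphcache, a function of this kind would typically be registered under the `keys` option of `cacheExchange`, though the catalog's actual cache configuration is not shown here.

```typescript
// Hypothetical shape of the data a query returns for a PackageRevision.
interface PackageRevisionData {
  hash?: string
  modified?: string
}

// Mirrors the hash:modified key format described in this commit message.
function packageRevisionKey(data: PackageRevisionData): string {
  return `${data.hash}:${data.modified}`
}

// A query that selects both fields and one that omits `modified`
// produce different keys for the same revision.
const fromFullQuery = packageRevisionKey({ hash: 'abc123', modified: '2025-09-25T15:10:00Z' })
const fromPartialQuery = packageRevisionKey({ hash: 'abc123' })
console.log(fromFullQuery === fromPartialQuery) // false
```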

Package metadata now exposed to assistant includes:
- userMeta: User-defined package metadata
- workflow: Package workflow information
- totalEntries: Number of entries in package
- totalBytes: Total size of package
- modified: Last modification timestamp
- message: Package revision message

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Added Task 5.4 documenting userMeta exposure implementation
- Added implementation notes about GraphQL cache key consistency
- Documented the root cause of infinite loop (cache key instability)
- Explained the solution (consistent modified field across queries)

Key learnings:
- urql cache keys for PackageRevision use hash:modified format
- All queries fetching PackageRevision must include modified field
- Data fetching inside LazyContext can cause infinite loops
- Pass data as props from parent components instead

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Package metadata is now split into two separate XML tags:
- package-info tag for system metadata (bucket, name, hash, modified, message, workflow, totalEntries, totalBytes)
- package-metadata tag for user metadata (userMeta only, when present)

This provides better semantic separation between system-level and user-defined package information, making it easier for the Qurator assistant to understand and process different types of package data.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Reduced visual prominence with faint grey background and hover effect
- Replaced text labels with icons (Build, CheckCircleOutline, ErrorOutline)
- Made tool details collapsible with Material-UI Collapse component
- Extracted reusable ToolMessage component for DRY code
- Changed color scheme: intense/normal/faint instead of intense/bright/faint

🤖 Generated with Claude Code

Co-Authored-By: Claude <[email protected]>
- Load both README.md and AGENTS.md files from directory hierarchies
- Reduce file size limit from 100KB to 10KB
- Limit to 10 non-root context files (root files always included)
- Prioritize closer directories over distant ones
- Fix performance: use Promise.all() for parallel loading instead of sequential loading
- Replace hardcoded values with imported constants

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Changed loading order so AGENTS.md files are loaded before README.md at each
directory level. This ensures agent-specific instructions have priority when
the 10-file limit is reached.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Add code review findings to plan.md
- Add Task 10: Code Quality Refactoring to tasks.md
  - Extract common loading pattern
  - Refactor context providers
  - Consolidate package context logic
  - Standardize error handling

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>
- Add useContextFileLoader hook for consistent loading pattern
- Extract loadPackageContextFile helper for package contexts
- Remove unnecessary error logging for expected 404s
- Reduce code duplication by ~50% across all context providers

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>
- Mark Task 10 as completed in tasks.md
- Add refactoring results to plan.md

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
* master:
  Qurator: adjust styling for tool messages (#4572)
  Qurator: Limit total search results contents context to 100k characters to avoid context window overflow (#4573)
  Bump amazonlinux from 2023.8.20250908.0 to 2023.8.20250915.0 in /catalog (#4558)
  Bump debian from bullseye-20250908-slim to bullseye-20250929-slim in /lambdas/thumbnail (#4568)
  Bump amazonlinux from 2023.8.20250908.0 to 2023.8.20250915.0 in /s3-proxy (#4559)