-
Notifications
You must be signed in to change notification settings - Fork 234
Adds the ability to include vision based context to chat via --rag #1661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer's GuideThis PR refactors the file ingestion and chat context pipeline by introducing a new file loader abstraction with text and image managers, integrates vision-based context via a RAG flag into the chat shell, enhances configuration layering and environment variable parsing with type coercion and nested keys, updates the CLI to support listing models and specifying a model, and aligns tests and CI/tooling (Makefile, pre-commit, ruff, Tekton pipelines) with these changes. Sequence diagram for chat context ingestion with vision-based RAGsequenceDiagram
actor User
participant CLI as CLI
participant RamaLamaShell
participant OpanAIChatAPIMessageBuilder
participant TextFileManager
participant ImageFileManager
User->>CLI: ramalama chat --rag <context>
CLI->>RamaLamaShell: Initialize shell with args
RamaLamaShell->>OpanAIChatAPIMessageBuilder: builder.load(context)
OpanAIChatAPIMessageBuilder->>TextFileManager: load(text files)
OpanAIChatAPIMessageBuilder->>ImageFileManager: load(image files)
OpanAIChatAPIMessageBuilder-->>RamaLamaShell: messages (text and image context)
RamaLamaShell->>RamaLamaShell: Append messages to conversation_history
RamaLamaShell->>CLI: Continue chat loop
Class diagram for new file loader abstraction (Text and Image Managers)classDiagram
class BaseFileLoader {
<<abstract>>
+static load(file: str) str
+static file_extensions() set[str]
}
class TXTFileLoader {
+static load(file: str) str
+static file_extensions() set[str]
}
class BasicImageFileLoader {
+static load(file: str) str
+static file_extensions() set[str]
}
class PDFFileLoader {
+static load(file: str) str
+static file_extensions() set[str]
}
class BaseFileManager {
<<abstract>>
-loaders: dict
+_get_loader(file: str) BaseFileLoader
+abstract load()
+abstract get_loaders() List[Type[BaseFileLoader]]
}
class TextFileManager {
-document_delimiter: Template
+load(files: list[str]) str
+get_loaders() List[Type[BaseFileLoader]]
}
class ImageFileManager {
+load(files: list[str]) list[str]
+get_loaders() List[Type[BaseFileLoader]]
}
class OpanAIChatAPIMessageBuilder {
-text_manager: TextFileManager
-image_manager: ImageFileManager
+partition_files(file_path: str) -> tuple
+supported_extensions() set[str]
+load(file_path: str) list[dict]
}
BaseFileLoader <|-- TXTFileLoader
BaseFileLoader <|-- BasicImageFileLoader
BaseFileLoader <|-- PDFFileLoader
BaseFileManager <|-- TextFileManager
BaseFileManager <|-- ImageFileManager
OpanAIChatAPIMessageBuilder --> TextFileManager
OpanAIChatAPIMessageBuilder --> ImageFileManager
Class diagram for updated configuration layering and type coercionclassDiagram
class UserConfig {
no_missing_gpu_prompt: bool = False
__post_init__()
}
class RamalamaSettings {
config_file: str | None = None
}
class BaseConfig {
container: bool
engine: str
image: str
store: str
api: str
transport: str
ocr: bool
default_image: str
user: UserConfig
settings: RamalamaSettings
__post_init__()
}
class LayeredMixin {
_fields: set
_layers: list
__init__(*layers)
is_set(name: str) bool
}
LayeredMixin <|-- BaseConfig
BaseConfig o-- UserConfig
BaseConfig o-- RamalamaSettings
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Signed-off-by: Ian Eaves <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @ieaves, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a significant new feature to the ramalama
chat application: the ability to incorporate vision-based context (images) into chat conversations through Retrieval Augmented Generation (RAG). This involved a substantial refactoring of the underlying file loading and processing infrastructure to handle diverse document and image formats. Additionally, the configuration system has been upgraded for greater flexibility and nested settings, and the continuous integration/continuous delivery (CI/CD) pipelines have been expanded to support the new RAG-enabled container images.
Highlights
- Vision-based RAG for Chat: Implemented the capability to augment chat conversations with context from image files (e.g., JPG, PNG) by converting them to base64 data URIs, alongside existing text-based RAG.
- Overhauled File Loading System: Replaced the previous
file_upload
module with a more modularfile_loaders
system, enabling robust processing of various text (e.g.,.txt
,.md
,.json
,.yaml
) and image file types for RAG. - Advanced Configuration Management: Introduced a new layered configuration system that supports nested environment variables (e.g.,
RAMALAMA_USER__SETTING
) and recursive merging of configuration sources, enhancing flexibility and maintainability. - Expanded CI/CD for RAG Images: Added new Tekton pipelines for building and pushing RAG-specific container images (e.g.,
cuda-rag
,rocm-rag
) and updated existing pipelines to ensure parent images are available before dependent builds. - Improved Chat Command Line Interface: Enhanced the
chat
command with new options:--list
to discover available models from an endpoint and--model
to explicitly select a model for inferencing.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @ieaves - I've reviewed your changes - here's some feedback:
- In load_file_config,
config_path
is never set before you use it to populatesettings.config_file
, so that value will always be None—capture the actual config file path when you parse and only addsettings
after a successful load. - The simplified get_default_engine drops the macOS apple_vm check and will always pick 'podman' if available; if you still need to respect Apple VM provisioning, reintroduce that conditional or clearly document the behavior change.
- BasicImageFileLoader.load emits a MIME type of None when guess_type fails (resulting in
data:None;base64,…
); consider defaulting to a safe type like 'application/octet-stream' or raising an error instead.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In load_file_config, `config_path` is never set before you use it to populate `settings.config_file`, so that value will always be None—capture the actual config file path when you parse and only add `settings` after a successful load.
- The simplified get_default_engine drops the macOS apple_vm check and will always pick 'podman' if available; if you still need to respect Apple VM provisioning, reintroduce that conditional or clearly document the behavior change.
- BasicImageFileLoader.load emits a MIME type of None when guess_type fails (resulting in `data:None;base64,…`); consider defaulting to a safe type like 'application/octet-stream' or raising an error instead.
## Individual Comments
### Comment 1
<location> `ramalama/file_loaders/file_manager.py:20` </location>
<code_context>
+ self.loaders = {ext.lower(): loader() for loader in self.get_loaders() for ext in loader.file_extensions()}
+
+ def _get_loader(self, file: str) -> base.BaseFileLoader:
+ return self.loaders[os.path.splitext(file)[1].lower()]
+
+ @abstractmethod
</code_context>
<issue_to_address>
No error handling for unsupported file extensions in '_get_loader'.
Accessing 'self.loaders' without checking for the extension may cause a KeyError. Handle missing extensions by raising a custom exception or returning None for clearer errors.
</issue_to_address>
<suggested_fix>
<<<<<<< SEARCH
class BaseFileManager(ABC):
"""
Base class for file upload handlers.
This class should be extended by specific file type handlers.
"""
def __init__(self):
self.loaders = {ext.lower(): loader() for loader in self.get_loaders() for ext in loader.file_extensions()}
def _get_loader(self, file: str) -> base.BaseFileLoader:
return self.loaders[os.path.splitext(file)[1].lower()]
@abstractmethod
def load(self):
pass
@classmethod
@abstractmethod
def get_loaders(cls) -> List[Type[base.BaseFileLoader]]:
pass
=======
class UnsupportedFileExtensionError(Exception):
"""Raised when a file extension is not supported by any loader."""
pass
class BaseFileManager(ABC):
"""
Base class for file upload handlers.
This class should be extended by specific file type handlers.
"""
def __init__(self):
self.loaders = {ext.lower(): loader() for loader in self.get_loaders() for ext in loader.file_extensions()}
def _get_loader(self, file: str) -> base.BaseFileLoader:
ext = os.path.splitext(file)[1].lower()
if ext not in self.loaders:
raise UnsupportedFileExtensionError(f"Unsupported file extension: {ext}")
return self.loaders[ext]
@abstractmethod
def load(self):
pass
@classmethod
@abstractmethod
def get_loaders(cls) -> List[Type[base.BaseFileLoader]]:
pass
>>>>>>> REPLACE
</suggested_fix>
### Comment 2
<location> `ramalama/file_loaders/file_types/txt.py:25` </location>
<code_context>
+ """
+
+ # TODO: Support for non-default encodings?
+ with open(file, "r") as f:
+ return f.read()
</code_context>
<issue_to_address>
No explicit encoding specified when opening text files.
Please specify 'encoding="utf-8"' when opening files to ensure consistent behavior across different systems.
</issue_to_address>
### Comment 3
<location> `test/unit/test_file_upload.py:141` </location>
<code_context>
+class TestImageFileManager:
+ """Test the image file manager class."""
+
+ def test_image_file_manager_load_single_image_file(self):
+ """Test loading a single image file."""
+ with patch.object(BasicImageFileLoader, 'load', return_value="data:image/jpeg;base64,test"):
+ manager = ImageFileManager()
+ result = manager.load(["test.jpg"])
+
+ assert len(result) == 1
+ assert result[0] == "data:image/jpeg;base64,test"
+
+ def test_image_file_manager_load_multiple_images(self):
</code_context>
<issue_to_address>
No test for unsupported file extensions in ImageFileManager.
Add a test to confirm that ImageFileManager.load properly handles unsupported file extensions, either by ignoring them or raising an error.
Suggested implementation:
```python
def test_image_file_manager_load_multiple_images(self):
"""Test loading multiple image files."""
with patch.object(
BasicImageFileLoader, 'load', side_effect=["data:image/jpeg;base64,test1", "data:image/png;base64,test2"]
):
manager = ImageFileManager()
result = manager.load(["test1.jpg", "test2.png"])
assert len(result) == 2
assert result[0] == "data:image/jpeg;base64,test1"
assert result[1] == "data:image/png;base64,test2"
def test_image_file_manager_load_unsupported_extension(self):
"""Test that unsupported file extensions are handled properly."""
# Patch the loader so it would raise an error if called with an unsupported extension
with patch.object(BasicImageFileLoader, 'load') as mock_load:
manager = ImageFileManager()
# Assume 'txt' is not a supported image extension
result = manager.load(["test.txt", "test.jpg"])
# Only the supported file should be loaded
assert len(result) == 1
# The loader should only be called for the supported file
mock_load.assert_called_once_with("test.jpg")
```
If `ImageFileManager.load` is supposed to raise an error for unsupported extensions instead of ignoring them, change the test to use `pytest.raises` and assert the exception is raised. Adjust the test according to the actual behavior of your implementation.
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
test/unit/test_file_upload.py
Outdated
def test_image_file_manager_load_single_image_file(self): | ||
"""Test loading a single image file.""" | ||
with patch.object(BasicImageFileLoader, 'load', return_value="data:image/jpeg;base64,test"): | ||
manager = ImageFileManager() | ||
result = manager.load(["test.jpg"]) | ||
|
||
assert len(result) == 1 | ||
assert result[0] == "data:image/jpeg;base64,test" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): No test for unsupported file extensions in ImageFileManager.
Add a test to confirm that ImageFileManager.load properly handles unsupported file extensions, either by ignoring them or raising an error.
Suggested implementation:
def test_image_file_manager_load_multiple_images(self):
"""Test loading multiple image files."""
with patch.object(
BasicImageFileLoader, 'load', side_effect=["data:image/jpeg;base64,test1", "data:image/png;base64,test2"]
):
manager = ImageFileManager()
result = manager.load(["test1.jpg", "test2.png"])
assert len(result) == 2
assert result[0] == "data:image/jpeg;base64,test1"
assert result[1] == "data:image/png;base64,test2"
def test_image_file_manager_load_unsupported_extension(self):
"""Test that unsupported file extensions are handled properly."""
# Patch the loader so it would raise an error if called with an unsupported extension
with patch.object(BasicImageFileLoader, 'load') as mock_load:
manager = ImageFileManager()
# Assume 'txt' is not a supported image extension
result = manager.load(["test.txt", "test.jpg"])
# Only the supported file should be loaded
assert len(result) == 1
# The loader should only be called for the supported file
mock_load.assert_called_once_with("test.jpg")
If ImageFileManager.load
is supposed to raise an error for unsupported extensions instead of ignoring them, change the test to use pytest.raises
and assert the exception is raised. Adjust the test according to the actual behavior of your implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces vision-based RAG support, enhances environment/config parsing, refactors file processing, and extends the CLI with model listing and selection flags. A critical bug in the configuration loading logic needs to be addressed, along with a couple of medium-severity suggestions to improve code clarity. After these fixes, this will be a fantastic contribution.
Signed-off-by: Ian Eaves <[email protected]>
@engelmi PTAL |
@@ -17,15 +17,18 @@ def __init__(self): | |||
self.loaders = {ext.lower(): loader() for loader in self.get_loaders() for ext in loader.file_extensions()} | |||
|
|||
def _get_loader(self, file: str) -> base.BaseFileLoader: | |||
return self.loaders[os.path.splitext(file)[1].lower()] | |||
loader = self.loaders.get(os.path.splitext(file)[1].lower(), None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets create a file_type local variable for this to make it clearer.
file_type = os.path.splitext(file)[1].lower(
LGTM |
This PR address issue #150 allowing users to pass image based content through to an LLM over the chat interface.
Summary by Sourcery
Implement context-aware chat enhancements by adding vision-based RAG support, revamp environment/config parsing with nested and legacy variable handling, refactor file processing into dedicated loader managers, extend CLI with model listing and selection flags, and shore up build, CI, and test coverage around these changes.
New Features:
Enhancements:
Build:
CI:
Documentation:
Tests: