feat: Create a robust Sandbox interface that can be backed by Daytona or Modal #215
Merged
…ndbox.py, and clean up uv.lock
… structure with Daytona and Modal support
…c method to return exit code and output
…c method to use /bin/sh and return exit code
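The two commits above describe a command-execution method that runs through /bin/sh and returns the exit code together with the output. A minimal local sketch of that contract follows. It assumes a plain `subprocess` call in place of the Daytona or Modal backend, and the name `exec_sh` and its `timeout` parameter are hypothetical, not taken from the PR.

```python
import subprocess


def exec_sh(command: str, timeout: float = 30.0) -> tuple[int, str]:
    """Run `command` under /bin/sh and return (exit_code, combined_output).

    Local stand-in only: a real sandbox would forward the command to its
    provider instead of calling subprocess.
    """
    proc = subprocess.run(
        ["/bin/sh", "-c", command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    code, output = exec_sh("echo hello; exit 3")
    print(code, output.strip())
```

Returning the exit code instead of raising lets the caller decide which non-zero codes are real failures, which matters for test runners that exit non-zero when tests fail.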
…and update sandbox eval method to raise NotImplementedError
…_tests methods in Sandbox class, and update test cases to reflect new logic
- Implement apply_patch method using base64 encoding to handle special characters and large patches
- Implement run_tests method with comprehensive test execution logic
- Handle both regular pytest tests and doctest-style paths that appear in some instances
- Support automatic dependency installation with retry logic
- Use chunked file writing to avoid command length limits
- Parse test results including regular failures, errors, collection errors, and doctest failures
- Return correct (failed_count, passed_count) tuple matching expected test behavior

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

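The base64 step in this commit can be sketched as follows. Encoding the patch first means quotes, `$`, and backticks in the diff never reach the shell as syntax. The function name, the default patch path, and the use of `git apply` are assumptions for illustration; the source does not show the actual command.

```python
import base64
import shlex


def build_apply_patch_command(patch: str, patch_path: str = "/tmp/patch.diff") -> str:
    """Return a shell command that recreates `patch` inside the sandbox and applies it.

    Assumption: the sandbox image provides `base64` and `git`.
    """
    # The base64 alphabet (A-Z, a-z, 0-9, +, /, =) has no shell metacharacters.
    encoded = base64.b64encode(patch.encode("utf-8")).decode("ascii")
    quoted_path = shlex.quote(patch_path)
    return (
        f"printf %s {encoded} | base64 -d > {quoted_path} "
        f"&& git apply {quoted_path}"
    )


if __name__ == "__main__":
    print(build_apply_patch_command("--- a/x.py\n+++ b/x.py\n"))
```

A single command built this way can still exceed the command length limit for very large patches, which is the problem the chunked writing in the same commit addresses.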
- Update comments for clarity on pytest installation and script functionality
- Refactor test categorization to better separate regular tests from doctest paths
- Introduce debug statements to log the number of tests and their types
- Modify exit code handling to ensure accurate reporting of test results
- Clean up code by removing obsolete doctest failure handling

…hods
- Clean up whitespace and formatting for better readability
- Refactor chunked writing logic for patch and test files
- Enhance comments for clarity on functionality and logic flow
- Consolidate test execution logic to streamline pytest invocation
- Ensure consistent handling of exit codes and output parsing

- Add json import for enhanced functionality
- Modify test categorization to ensure individual doctests are run correctly
- Always include --doctest-modules flag for handling doctests
- Simplify test argument assembly for pytest execution

- Update CLAUDE.md to clarify test investigation steps and implementation evaluation criteria
- Improve error handling in new_sandbox function for Daytona provider to manage event loop issues
- Expand test_run_tests to increase instance index range and assert test results for better validation

This commit aims to streamline the testing process and ensure robust sandbox functionality.

…r management
- Introduce safe_exec method to streamline command execution with custom error messages
- Refactor apply_patch method to utilize write_file for patch handling
- Simplify test writing logic by removing chunked writing and directly using write_file
- Update test_run_tests to reduce instance index range for better test coverage

This commit aims to improve the robustness and clarity of the sandbox functionality.

- Standardize error messages in safe_exec method for consistency
- Streamline test list writing by directly joining tests instead of using chunked writing
- Enhance clarity in apply_patch method with improved command formatting

This commit aims to improve the maintainability and readability of the Sandbox class methods.

- Replace safe_exec and write_file methods with direct command execution for improved clarity and performance
- Streamline apply_patch and run_tests methods to enhance error handling and reduce complexity
- Update test writing logic to utilize heredoc for better handling of special characters

This commit aims to enhance the maintainability and efficiency of the Sandbox class methods.

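The heredoc approach mentioned in this commit might look like the sketch below. Quoting the delimiter (`<<'SANDBOX_EOF'`) tells the shell not to expand `$`, backticks, or backslashes in the body, so test IDs pass through unchanged. The function name and the delimiter are hypothetical.

```python
import shlex


def build_heredoc_write(path: str, content: str, delimiter: str = "SANDBOX_EOF") -> str:
    """Return a shell command that writes `content` to `path` via a quoted heredoc."""
    # A line equal to the delimiter would end the heredoc early.
    if any(line == delimiter for line in content.splitlines()):
        raise ValueError("content contains the heredoc delimiter on its own line")
    # The delimiter must start on its own line, so make sure the body ends with a newline.
    body = content if content.endswith("\n") else content + "\n"
    return f"cat > {shlex.quote(path)} <<'{delimiter}'\n{body}{delimiter}\n"


if __name__ == "__main__":
    tests = "tests/test_a.py::test_x[$HOME]\ntests/test_b.py::test_y\n"
    print(build_heredoc_write("/tmp/tests.txt", tests))
```

The whole file still travels in one command, so this form does not help with command length limits.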
- Remove unnecessary blank lines to enhance code clarity
- Adjust formatting for better consistency in the apply_patch and run_tests methods
- Update regex handling for missing modules to improve readability

This commit aims to streamline the code structure and enhance maintainability of the Sandbox class.

- Implement chunked writing of test lists to avoid command length limits
- Create a dedicated Python script for running pytest with enhanced handling of special characters
- Streamline test execution logic to improve error management and maintainability

This commit aims to enhance the robustness and efficiency of the test execution process within the Sandbox class.

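Chunked writing, as reintroduced in this commit, could be sketched like this: the content is split into fixed-size pieces and each piece is appended to the target file by its own command, so no single command grows with the size of the test list. The function name, the chunk size, and the base64 transport are assumptions; the PR's actual mechanism is not shown in the source.

```python
import base64
import shlex
from typing import Iterator


def chunked_write_commands(path: str, content: str, chunk_size: int = 32_000) -> Iterator[str]:
    """Yield shell commands that recreate `content` at `path`, one chunk per command."""
    quoted = shlex.quote(path)
    data = content.encode("utf-8")
    # Start from an empty file so repeated runs do not append to stale data.
    yield f": > {quoted}"
    for start in range(0, len(data), chunk_size):
        # Each chunk is encoded on its own, so each command decodes independently.
        encoded = base64.b64encode(data[start:start + chunk_size]).decode("ascii")
        yield f"printf %s {encoded} | base64 -d >> {quoted}"


if __name__ == "__main__":
    for command in chunked_write_commands("/tmp/tests.txt", "test_a\ntest_b\n", chunk_size=8):
        print(command)
```

Splitting the encoded bytes rather than the text means a multi-byte character may straddle two chunks, which is harmless because the file is reassembled byte for byte.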
- Add sampling to instance DataFrame for improved randomness in tests
- Introduce pytest-timeout dependency to manage test execution duration
- Update new_sandbox function to accept a timeout parameter for sandbox creation
- Improve error handling in test results parsing to account for collection errors

This commit aims to enhance the robustness and flexibility of the sandbox testing process.

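Result parsing that accounts for errors can be sketched by reading pytest's final summary line, which has the form `2 failed, 5 passed, 1 error in 0.12s`. The sketch below returns a `(failed_count, passed_count)` tuple and counts errors, including collection errors, as failures. That counting rule and the function name are assumptions, not confirmed by the source.

```python
import re

# Matches "5 passed", "2 failed", "1 error", "3 errors"; skips "xfailed"/"xpassed".
_COUNT_RE = re.compile(r"(\d+) (passed|failed|errors?)\b")


def parse_pytest_summary(output: str) -> tuple[int, int]:
    """Return (failed_count, passed_count) from the last pytest summary line."""
    for line in reversed(output.splitlines()):
        found = _COUNT_RE.findall(line)
        # The summary line always ends with "in <duration>s".
        if found and " in " in line:
            failed = passed = 0
            for number, kind in found:
                if kind == "passed":
                    passed += int(number)
                else:
                    failed += int(number)  # assumption: errors count as failures
            return failed, passed
    return 0, 0


if __name__ == "__main__":
    print(parse_pytest_summary("==== 2 failed, 5 passed, 1 error in 0.12s ===="))
```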
- Add functionality to install project dependencies if setup.py or pyproject.toml exists
- Increase maximum retries for test execution from 5 to 20 to improve reliability
- Implement handling for special cases in package installation to address discrepancies between import names and package names
- Introduce dynamic timeout calculation for test execution based on the number of tests to optimize performance

This commit aims to improve the sandbox's installation process and enhance the robustness of test execution.

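Two ideas from this commit lend themselves to a short sketch: mapping a missing import name to the package that provides it, and scaling the timeout with the number of tests. The mapping table lists a few well-known mismatches (for example, `import yaml` is provided by `PyYAML`). The table's contents, the function names, and every timeout constant are hypothetical; the PR's actual values are not given in the source.

```python
import re

# Import names whose PyPI package name differs (illustrative subset).
IMPORT_TO_PACKAGE = {
    "yaml": "PyYAML",
    "PIL": "Pillow",
    "cv2": "opencv-python",
    "sklearn": "scikit-learn",
    "bs4": "beautifulsoup4",
}

_MISSING_MODULE_RE = re.compile(r"No module named '([^']+)'")


def missing_packages(output: str) -> list[str]:
    """Return pip package names for modules reported missing in `output`."""
    packages: list[str] = []
    for module in _MISSING_MODULE_RE.findall(output):
        top_level = module.split(".")[0]  # "foo.bar" is installed via "foo"
        package = IMPORT_TO_PACKAGE.get(top_level, top_level)
        if package not in packages:
            packages.append(package)
    return packages


def compute_timeout(num_tests: int, base: int = 60, per_test: int = 5, ceiling: int = 1800) -> int:
    """Return a timeout in seconds that grows with the test count, up to `ceiling`."""
    return min(ceiling, base + per_test * max(num_tests, 0))


if __name__ == "__main__":
    print(missing_packages("ModuleNotFoundError: No module named 'yaml'"))
    print(compute_timeout(10))
```

A retry loop would install whatever `missing_packages` returns and rerun the tests until the list comes back empty or the retry limit is reached.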
- Introduce the `edit` method to integrate the `edit_anthropic` tool for file operations such as create, view, string replacement, insertion, and undo functionality.
- Enhance error handling with specific exit codes and corresponding RuntimeError messages for better debugging.
- Add comprehensive tests to validate the new functionality, covering various scenarios including file creation, viewing, editing, and error cases.

This commit aims to expand the capabilities of the Sandbox class, allowing for more dynamic file manipulation within the sandbox environment.

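The exit-code handling described here could follow a pattern like the one below: a table maps each known exit code to a message, and any non-zero code becomes a `RuntimeError`. The specific codes and messages are hypothetical placeholders, since the source does not list the values the tool returns.

```python
# Hypothetical exit codes; the real values are not given in the source.
EDIT_ERRORS = {
    1: "file not found",
    2: "invalid arguments",
    3: "text to replace was not found or is not unique",
}


def check_edit_result(exit_code: int, output: str) -> str:
    """Return `output` on success, otherwise raise RuntimeError with a specific message."""
    if exit_code == 0:
        return output
    reason = EDIT_ERRORS.get(exit_code, "unknown error")
    raise RuntimeError(f"edit failed (exit code {exit_code}: {reason}): {output.strip()}")


if __name__ == "__main__":
    print(check_edit_result(0, "File created"))
```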
…Sandbox.edit() method. This file included details on method signature, command construction, error handling, and testing considerations. Its removal streamlines the documentation as the implementation is now integrated into the codebase.
- Refactor the `sandbox.py` file to enhance readability by standardizing formatting and reducing unnecessary whitespace.
- Update the `test.py` file to improve consistency in test case formatting and structure.
- Implement clearer separation of logic in the `edit` method and related test cases for better maintainability.

This commit aims to streamline the codebase, making it easier to navigate and understand while maintaining existing functionality.

- Eliminate unnecessary imports from the `test.py` file, specifically `tempfile` and `os`, to streamline the code and improve readability.
- This change contributes to a cleaner codebase by removing elements that are not utilized in the current implementation.

No description provided.