Add implementation of streaming abstractions #535
Codecov Report
❌ Patch coverage is
Additional details and impacted files

Coverage Diff (master vs. #535)

| Metric | master | #535 | +/- |
|---|---|---|---|
| Coverage | 94.0% | 88.5% | -5.5% |
| Files | 28 | 33 | +5 |
| Lines | 2203 | 2833 | +630 |
| Hits | 2071 | 2510 | +439 |
| Misses | 132 | 323 | +191 |
Pull Request Overview
This PR adds streaming I/O support to cloudpathlib, enabling direct file streaming from/to cloud storage (S3, Azure Blob Storage, GCS, and HTTP/HTTPS) without local caching. Key changes include:
- New FileCacheMode.streaming enum value for direct streaming without caching
- CloudBufferedIO and CloudTextIO classes implementing standard Python I/O interfaces
- Provider-specific raw I/O adapters for efficient range requests and multipart uploads
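To make the raw I/O adapter idea concrete, here is a minimal sketch of a read-only raw stream that serves reads through ranged fetches and is wrapped by the standard library's buffered and text layers. It is not the PR's code: `RangeReadRaw`, `fetch_range`, and `fake_fetch` are hypothetical names, and an in-memory bytes object stands in for a remote object.

```python
import io


class RangeReadRaw(io.RawIOBase):
    """Hypothetical read-only raw stream that fetches byte ranges on demand."""

    def __init__(self, fetch_range, size):
        # fetch_range(start, end) returns bytes for the half-open range
        # [start, end); a real adapter would issue a ranged GET here.
        self._fetch_range = fetch_range
        self._size = size
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = self._size + offset
        else:
            raise ValueError(f"invalid whence: {whence}")
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos
        return self._pos

    def readinto(self, b):
        # Request only as many bytes as the caller's buffer can hold.
        end = min(self._pos + len(b), self._size)
        if end <= self._pos:
            return 0  # EOF
        chunk = self._fetch_range(self._pos, end)
        b[: len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


# In-memory stand-in for a remote object; records each requested range.
blob = b"line one\nline two\nline three\n"
requests = []


def fake_fetch(start, end):
    requests.append((start, end))
    return blob[start:end]


# Binary access: seek into the object and read a slice.
buffered = io.BufferedReader(RangeReadRaw(fake_fetch, len(blob)))
buffered.seek(9)
middle = buffered.read(8)

# Text access: the same raw class under a TextIOWrapper.
text = io.TextIOWrapper(
    io.BufferedReader(RangeReadRaw(fake_fetch, len(blob))), encoding="utf-8"
)
first_line = text.readline()
```

Because the raw class only implements `readinto`, `seek`, and `tell`, buffering and text decoding come from the standard `io` wrappers rather than from provider-specific code.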
Reviewed Changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated 28 comments.
| File | Description |
|---|---|
| cloudpathlib/enums.py | Added streaming enum value to FileCacheMode |
| cloudpathlib/cloud_io.py | New module with CloudBufferedIO, CloudTextIO, and _CloudStorageRaw base classes |
| cloudpathlib/cloudpath.py | Updated CloudImplementation to track raw I/O classes; modified open() and __fspath__() for streaming mode |
| cloudpathlib/client.py | Added abstract methods for streaming I/O operations |
| cloudpathlib/s3/s3_io.py | S3-specific streaming I/O implementation |
| cloudpathlib/azure/azure_io.py | Azure-specific streaming I/O implementation |
| cloudpathlib/gs/gs_io.py | GCS-specific streaming I/O implementation |
| cloudpathlib/http/http_io.py | HTTP-specific streaming I/O implementation |
| tests/test_cloud_io.py | Comprehensive test suite for streaming I/O functionality |
| tests/conftest.py | Updated test fixtures to use CloudImplementation with raw I/O classes |
| tests/test_cloudpath_instantiation.py | Updated test to access dependencies_loaded via cloud_implementation |
| tests/mock_clients/mock_s3.py | Added mock methods for range requests and multipart uploads |
| tests/mock_clients/mock_gs.py | Added mock methods for range downloads and uploads |
| tests/mock_clients/mock_azureblob.py | Added mock methods for range downloads and block uploads |
| tests/http_fixtures.py | Added Range request handling to HTTP test server |
| docs/docs/streaming_io.md | Comprehensive documentation for streaming I/O feature |
| docs/docs/caching.ipynb | Updated caching documentation to include streaming mode |
    rig = CloudProviderTestRig(
        path_class=LocalS3Path,
        client_class=LocalS3Client,
        cloud_implementation=local_s3_implementation,

Copilot AI · Oct 28, 2025
Incorrect variable used: local_s3_implementation should be local_s3_cloud_implementation. This passes the wrong CloudImplementation object (the one from the local.implementations.s3 module rather than the test-specific one created on line 559).
Suggested change:

    - cloud_implementation=local_s3_implementation,
    + cloud_implementation=local_s3_cloud_implementation,
      def validate_completeness(self) -> None:
    -     expected = ["client_class", "path_class"]
    +     expected = ["client_class", "path_class", "raw_io_class"]

Copilot AI · Oct 28, 2025
The validation now requires raw_io_class to be present, but this is a breaking change. Existing custom implementations won't have this attribute set, causing them to fail validation. Consider making raw_io_class optional for backward compatibility, or only validate it when streaming mode is enabled.
Suggested change:

    - expected = ["client_class", "path_class", "raw_io_class"]
    + expected = ["client_class", "path_class"]
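The alternative the comment mentions, validating raw_io_class only when streaming mode is enabled, could look roughly like the sketch below. `ImplementationSketch` is a hypothetical stand-in rather than the library's CloudImplementation class, and both the `streaming` parameter and the use of `ValueError` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImplementationSketch:
    """Hypothetical stand-in for CloudImplementation; not the library's class."""

    name: str
    client_class: Optional[type] = None
    path_class: Optional[type] = None
    raw_io_class: Optional[type] = None  # only needed for streaming

    def validate_completeness(self, streaming: bool = False) -> None:
        # Attributes that were already required before this PR.
        expected = ["client_class", "path_class"]
        # Assumed opt-in: require the raw I/O class only for streaming.
        if streaming:
            expected.append("raw_io_class")
        missing = [attr for attr in expected if getattr(self, attr) is None]
        if missing:
            raise ValueError(f"Implementation '{self.name}' is missing: {missing}")


# A custom implementation written before raw_io_class existed.
legacy = ImplementationSketch("custom", client_class=object, path_class=object)
legacy.validate_completeness()  # still passes without raw_io_class

try:
    legacy.validate_completeness(streaming=True)
    streaming_error = None
except ValueError as exc:
    streaming_error = str(exc)
```

With this shape, existing implementations keep validating, and the missing attribute is reported only at the point where streaming is actually requested.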
    # Flush the buffer first (which will call write() on the raw stream)
    try:
        self._buffer.flush()
    except Exception:

Copilot AI · Oct 28, 2025
'except' clause does nothing but pass and there is no explanatory comment.
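One way to address this, if the error really should not propagate, is to record the failure instead of dropping it. The sketch below is illustrative only: `flush_before_close`, `FailingBuffer`, and the logger name are hypothetical and do not come from the PR.

```python
import io
import logging

logger = logging.getLogger("streaming_sketch")  # hypothetical logger name


def flush_before_close(buffer) -> bool:
    """Flush the buffer; return False instead of raising if the flush fails."""
    try:
        buffer.flush()
    except Exception:
        # Log rather than silently pass: a failed flush usually means
        # buffered data never reached the underlying stream.
        logger.warning("flush failed during close", exc_info=True)
        return False
    return True


class FailingBuffer:
    """Stand-in for a buffer whose underlying upload has broken."""

    def flush(self):
        raise OSError("upload interrupted")


ok_result = flush_before_close(io.BytesIO())
failed_result = flush_before_close(FailingBuffer())
```

Whether swallowing the exception is appropriate at all is a separate design question; for a write stream, re-raising after cleanup may be the safer default.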
        # Try to access the internal _closed attribute
        if hasattr(self._buffer, "_closed"):
            object.__setattr__(self._buffer, "_closed", True)
    except Exception:

Copilot AI · Oct 28, 2025
'except' clause does nothing but pass and there is no explanatory comment.
Suggested change:

    - except Exception:
    + except Exception:
    +     # Ignore errors when setting _closed; not critical for buffer cleanup.
    path.client.file_cache_mode = original_mode
    try:
        path.unlink()
    except Exception:

Copilot AI · Oct 28, 2025
'except' clause does nothing but pass and there is no explanatory comment.
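For test cleanup like this, a narrower alternative to a bare `except Exception: pass` is to suppress only the "file already gone" case. The sketch below uses a local `pathlib.Path` in a temporary directory as a stand-in for the cloud path in the tests; `cleanup` is a hypothetical helper name, and it assumes the path's `unlink()` raises `FileNotFoundError` when the target is missing.

```python
import contextlib
import tempfile
from pathlib import Path


def cleanup(path) -> None:
    # Suppress only the "already deleted" case; any other failure
    # (permissions, network errors) still surfaces in the test run.
    with contextlib.suppress(FileNotFoundError):
        path.unlink()


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "scratch.txt"
    target.write_text("temporary")
    cleanup(target)  # removes the file
    cleanup(target)  # second call is a no-op instead of an error
    still_exists = target.exists()
```

The same helper could replace each of the repeated `try`/`except` cleanup blocks flagged in the comments that follow.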
    finally:
        try:
            path.unlink()
        except Exception:

Copilot AI · Oct 28, 2025
'except' clause does nothing but pass and there is no explanatory comment.
    finally:
        try:
            path.unlink()
        except Exception:

Copilot AI · Oct 28, 2025
'except' clause does nothing but pass and there is no explanatory comment.
    finally:
        try:
            path.unlink()
        except Exception:

Copilot AI · Oct 28, 2025
'except' clause does nothing but pass and there is no explanatory comment.
    finally:
        try:
            path.unlink()
        except Exception:

Copilot AI · Oct 28, 2025
'except' clause does nothing but pass and there is no explanatory comment.
    finally:
        try:
            path.unlink()
        except Exception:

Copilot AI · Oct 28, 2025
'except' clause does nothing but pass and there is no explanatory comment.
No description provided.