-
Notifications
You must be signed in to change notification settings - Fork 37
py_pkg_treemap() add heatmap mode to color by module stats such as coverage
#317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…coverage - new SVG demo assets for `py_pkg_treemap` in `readme.md` to demonstrate coverage heatmaps for pymatgen+pymatviz - bump `kaleido` dependency from `==0.2.1` to `>=1.0.0` in `pyproject.toml` - refactor `py_pkg_treemap.py` to support module objects - add unit tests for on-the-fly coverage data collection and treemap coverage heatmap
|
Caution Review failedThe pull request is closed. WalkthroughThis update introduces heatmap coloring and code coverage visualization to Python package treemaps, primarily by extending the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant py_pkg_treemap
participant CoverageHelper
participant Plotly
User->>py_pkg_treemap: Call with packages, color_by="coverage"
py_pkg_treemap->>CoverageHelper: Load/collect coverage data
CoverageHelper-->>py_pkg_treemap: Return coverage dict
py_pkg_treemap->>py_pkg_treemap: Map coverage to modules, compute averages
py_pkg_treemap->>Plotly: Create treemap with heatmap coloring
Plotly-->>User: Rendered treemap with coverage heatmap
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40 minutes Possibly related PRs
Poem
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
pymatviz/treemap/py_pkg.py (2)
225-231: Consider documenting the coverage tool requirement.The subprocess call to
coverageassumes the tool is installed and available in PATH. Consider adding a note in the docstring that dynamic coverage collection requires thecoveragepackage to be installed.def collect_coverage_data(coverage_data_file: str | None = None) -> dict[str, float]: """Collect coverage data for packages. Args: coverage_data_file: Path to coverage JSON file. If provided and not found, raises FileNotFoundError instead of auto-generating. + If None, attempts to collect coverage data dynamically using the + 'coverage' command-line tool (requires 'coverage' package installed). Returns: dict[str, float]: Mapping from absolute file paths to coverage percentages.
885-888: Fix coverage percentage formatting.The format string
:,.1fincludes comma separators which are unusual for percentages (e.g., "1,234.5% cov" instead of "1234.5% cov").- coverage_text = f"<br>%{{customdata[{color_value_index}]:,.1f}}% cov" + coverage_text = f"<br>%{{customdata[{color_value_index}]:.1f}}% cov"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (6)
assets/svg/py-pkg-treemap-multiple.svgis excluded by!**/*.svgassets/svg/py-pkg-treemap-numpy.svgis excluded by!**/*.svgassets/svg/py-pkg-treemap-pymatgen-coverage.svgis excluded by!**/*.svgassets/svg/py-pkg-treemap-pymatgen.svgis excluded by!**/*.svgassets/svg/py-pkg-treemap-pymatviz-coverage.svgis excluded by!**/*.svgassets/svg/py-pkg-treemap-pymatviz.svgis excluded by!**/*.svg
📒 Files selected for processing (6)
assets/scripts/treemap/py_pkg_treemap.py(4 hunks)pymatviz/structure/helpers.py(0 hunks)pymatviz/treemap/py_pkg.py(17 hunks)pyproject.toml(1 hunks)readme.md(1 hunks)tests/treemap/test_py_pkg_treemap.py(8 hunks)
💤 Files with no reviewable changes (1)
- pymatviz/structure/helpers.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: test-example-scripts (assets/scripts/cluster/composition/cluster_compositions_matbench.py)
- GitHub Check: test-example-scripts (assets/scripts/sunburst/chem_env_sunburst.py)
- GitHub Check: test-example-scripts (assets/scripts/ptable/ptable_heatmap_splits_plotly.py)
- GitHub Check: build / build
- GitHub Check: tests (ubuntu-latest, 3) / tests
- GitHub Check: tests (ubuntu-latest, 2) / tests
- GitHub Check: tests (ubuntu-latest, 4) / tests
- GitHub Check: tests (ubuntu-latest, 1) / tests
🔇 Additional comments (4)
readme.md (1)
274-278: LGTM! Clear documentation for the new coverage heatmap feature.The examples effectively demonstrate the new
color_by="coverage"functionality, and the note provides helpful guidance on generating the required coverage data.assets/scripts/treemap/py_pkg_treemap.py (1)
10-154: LGTM! Comprehensive examples demonstrating the new heatmap features.The script effectively showcases:
- Module object support alongside string package names
- Coverage heatmap visualization with appropriate color scales
- Custom hover templates displaying coverage percentages
- Manual color range specification for consistent visualization
tests/treemap/test_py_pkg_treemap.py (1)
701-1012: Excellent test coverage for the new heatmap and module object features!The tests comprehensively cover:
- Module object normalization with various input types
- Coverage data collection from files and subprocess
- Coverage visualization with different color modes and ranges
- Edge cases like missing coverage data and path matching
- Weighted average calculations for parent nodes
pymatviz/treemap/py_pkg.py (1)
364-950: Excellent implementation of the heatmap coloring feature!The enhancements to
py_pkg_treemapare well-designed and comprehensive:
- Flexible input handling for module objects and strings
- Robust coverage data collection and path matching
- Proper weighted averaging for parent nodes in coverage mode
- Clear documentation with helpful examples
The code handles edge cases well and maintains backward compatibility.
…mprove coverage file matching in `py_pkg_treemap` - Configure Plotly to suppress output in CI environments to avoid log clutter. - Enhance coverage matching logic in `py_pkg_treemap` to handle ambiguous filename cases, logging warnings when multiple matches are found. - Update unit tests to cover new scenarios for ambiguous filenames and validate warning messages.
- refactor `collect_coverage_data` and `_match_coverage_path` in `py_pkg.py` to simplify path handling - new unit tests for coverage path matching and ensure weighted averages for submodule coverage are correctly calculated - dynamic hover templates and cell border styling in in py_pkg_treemap based on color_by param - add new `cell_border` parameter to `py_pkg_treemap` to control cell borders - rename deprecated `ruff` hook to `ruff-check` in `.pre-commit-config.yaml`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
pymatviz/treemap/py_pkg.py (1)
363-425: Well-implemented path matching with multiple strategies!This implementation effectively addresses the previous concern about false positive matches. The function now:
- Tries exact path matching first
- Falls back to repo path segment matching
- Finally tries filename matching with ambiguity handling
- When multiple files match by filename, it selects the best match based on path similarity
The path similarity calculation using
os.path.commonpathis a good approach.
🧹 Nitpick comments (2)
pymatviz/treemap/py_pkg.py (2)
195-253: Consider subprocess security implications.While the subprocess call to
coverageis reasonable for this use case, be aware that:
- The command is hardcoded which mitigates injection risks
- This assumes the
coveragecommand is trusted and available in PATH- Consider documenting that this feature requires the coverage tool to be installed
The implementation is otherwise well-structured with proper error handling.
939-1010: Consider extracting coverage text formatting.The
_build_text_templatefunction handles many cases well, but could benefit from extracting the coverage text formatting logic into a separate helper for better readability.def _build_text_template() -> tuple[str, str]: """Build text template and textinfo based on configuration.""" + def _get_coverage_text() -> str: + """Get coverage text for display.""" + if color_by == "coverage" and has_color_mode: + color_value_index = len(custom_data_cols) - 1 + return f"<br>%{{customdata[{color_value_index}]:,.1f}}% cov" + return "" + # Validate show_counts first valid_show_counts = ("percent", "value", "value+percent", False) if show_counts not in valid_show_counts: raise ValueError( f"Invalid {show_counts=}, must be one of {valid_show_counts}" ) # Add coverage information if in coverage heatmap mode - coverage_text = "" - if color_by == "coverage" and has_color_mode: - color_value_index = ( - len(custom_data_cols) - 1 - ) # Last column when in color mode - coverage_text = f"<br>%{{customdata[{color_value_index}]:,.1f}}% cov" + coverage_text = _get_coverage_text()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
assets/svg/py-pkg-treemap-pymatgen-coverage.svgis excluded by!**/*.svgassets/svg/py-pkg-treemap-pymatviz-coverage.svgis excluded by!**/*.svg
📒 Files selected for processing (5)
.pre-commit-config.yaml(1 hunks)assets/scripts/treemap/py_pkg_treemap.py(4 hunks)pymatviz/__init__.py(2 hunks)pymatviz/treemap/py_pkg.py(18 hunks)tests/treemap/test_py_pkg_treemap.py(8 hunks)
✅ Files skipped from review due to trivial changes (2)
- .pre-commit-config.yaml
- pymatviz/init.py
🚧 Files skipped from review as they are similar to previous changes (1)
- assets/scripts/treemap/py_pkg_treemap.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: test-example-scripts (assets/scripts/cluster/composition/cluster_compositions_matbench.py)
- GitHub Check: test-example-scripts (assets/scripts/ptable/ptable_heatmap_splits_plotly.py)
- GitHub Check: test-example-scripts (assets/scripts/sunburst/chem_env_sunburst.py)
- GitHub Check: build / build
- GitHub Check: tests (ubuntu-latest, 3) / tests
- GitHub Check: tests (ubuntu-latest, 1) / tests
- GitHub Check: tests (ubuntu-latest, 4) / tests
- GitHub Check: tests (ubuntu-latest, 2) / tests
🔇 Additional comments (34)
tests/treemap/test_py_pkg_treemap.py (17)
5-8: LGTM!The new imports are appropriate for the added test functionality -
jsonfor coverage data handling andAnyfor type annotations.
146-147: LGTM!Good use of the
pytest.mark.slowmarker for the potentially slow test case involving an installed package.
293-295: LGTM!The test correctly reflects the new behavior where custom templates are used for value formatting even without base_url.
419-424: LGTM!Good expansion of test cases to cover various metadata URL formats including non-GitHub and GitHub project URLs.
489-505: LGTM!The test correctly validates the new template behavior for different
show_countsvalues.
651-698: LGTM!Improved mock verification ensures the test properly validates the cell_size_fn functionality and data passing to Plotly.
701-722: LGTM!Comprehensive test coverage for the
_normalize_package_inputfunction, including string inputs, module objects, submodules, and edge cases.
724-737: LGTM!Good test coverage for the new module object input feature, testing both single modules and mixed inputs.
739-752: LGTM!Well-structured fixture providing realistic mock coverage data for testing.
754-778: LGTM!Thorough test coverage for
collect_coverage_dataincluding valid files, missing files, and invalid JSON handling.
780-883: LGTM!Comprehensive test scenarios for coverage heatmap functionality including color scales, ranges, custom dicts, and ambiguous filename handling.
884-906: LGTM!Good test for coverage path matching with src/ prefix, ensuring coverage data is correctly mapped even with different path structures.
907-979: LGTM!Excellent test coverage for subprocess-based coverage collection, including file existence checks, subprocess failures, and success scenarios.
980-1039: LGTM!Thorough edge case testing including coverage with/without data, custom dicts, and colorbar title validation.
1040-1106: LGTM!Excellent regression test ensuring submodule coverage correctly shows weighted averages rather than all nodes having the same value.
1107-1207: Critical regression prevention test!This test specifically prevents the bug where all submodules would show the same coverage value. The test creates a realistic package structure and verifies that parent nodes have different weighted average coverage values.
1208-1248: LGTM!Comprehensive test for the cell_border parameter, covering all scenarios including default behavior, custom borders, and coverage mode.
pymatviz/treemap/py_pkg.py (17)
11-15: LGTM!The new imports are appropriate for the added functionality -
jsonfor coverage data,subprocessfor dynamic coverage collection,sysfor path operations, andtempfilefor temporary file handling.
25-25: LGTM!Appropriate addition of
ModuleTypefor type checking module objects.
68-75: LGTM!Clean implementation for normalizing package inputs, correctly handling strings, module objects, and edge cases.
97-111: LGTM!Good modernization using
importlib.resources.filesand helpful addition ofsys.pathsearch for packages added to path in tests.
256-256: LGTM!Appropriate type update to support module objects alongside strings.
271-272: LGTM!Correct usage of the new
_normalize_package_inputhelper.
324-331: LGTM!Good detection of src-layout repositories for proper path construction.
428-439: LGTM!Well-designed function signature supporting the new features while maintaining backward compatibility.
447-490: LGTM!Comprehensive parameter documentation clearly explaining the new features and their usage.
524-537: LGTM!Excellent examples demonstrating various usage patterns including coverage, custom values, and module objects.
539-544: LGTM!Clean normalization of package inputs to handle both strings and module objects.
595-601: LGTM!Proper handling of different color_by parameter types.
630-656: LGTM!Well-structured logic for adding color values based on different color_by scenarios.
699-731: LGTM!Correct implementation of weighted average coverage calculation for package-level display.
808-869: Critical fix for hierarchical coverage display!This implementation correctly calculates weighted averages for parent nodes in the treemap, preventing the regression where all nodes would show the same coverage value. The algorithm properly:
- Maps labels to DataFrame rows
- Collects coverage values for leaf nodes
- Calculates weighted averages for parent nodes based on their children's values and weights
1012-1015: LGTM!Good default behavior for cell borders - white borders in coverage mode for better visibility, no borders otherwise.
1021-1023: LGTM!Important fix ensuring the figure auto-sizes to its parent container by setting width/height to None.
- Update docstrings to clarify the handling of local files and URLs. - new test for collecting coverage data from GitHub URL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
pymatviz/treemap/py_pkg.py (1)
453-1057: Comprehensive enhancement of py_pkg_treemap with heatmap visualization.The implementation successfully adds flexible heatmap coloring capabilities while maintaining backward compatibility. Key strengths:
- Flexible input handling for both strings and module objects
- Multiple coverage data sources (files, URLs, subprocess)
- Accurate weighted average calculations for hierarchical coloring
- Well-structured code despite the complexity
Consider extracting some of the nested logic (e.g., weighted average calculation) into separate helper functions in future refactoring to improve maintainability.
tests/treemap/test_py_pkg_treemap.py (1)
1250-1264: Test depends on external URL availability.This test relies on a GitHub user attachment URL that may not be permanently available. Consider mocking the URL request or using a more stable test data source to prevent future test failures.
-def test_collect_coverage_data_from_url() -> None: - """Test collecting coverage data from the provided GitHub URL.""" - url = "https://github.com/user-attachments/files/21545088/2025-07-31-pymatviz-coverage.json" - - coverage_map = pmv.treemap.py_pkg.collect_coverage_data(url) +@patch("urllib.request.urlopen") +def test_collect_coverage_data_from_url(mock_urlopen) -> None: + """Test collecting coverage data from URL with mocked response.""" + mock_response = MagicMock() + mock_response.__enter__.return_value = mock_response + mock_response.read.return_value = json.dumps({ + "files": { + "module1.py": {"summary": {"percent_covered": 85.5}}, + "module2.py": {"summary": {"percent_covered": 92.1}} + } + }).encode() + mock_urlopen.return_value = mock_response + + url = "https://example.com/coverage.json" + coverage_map = pmv.treemap.py_pkg.collect_coverage_data(url)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
assets/svg/py-pkg-treemap-pymatgen-coverage.svgis excluded by!**/*.svgassets/svg/py-pkg-treemap-pymatviz-coverage.svgis excluded by!**/*.svg
📒 Files selected for processing (3)
assets/scripts/treemap/py_pkg_treemap.py(4 hunks)pymatviz/treemap/py_pkg.py(18 hunks)tests/treemap/test_py_pkg_treemap.py(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- assets/scripts/treemap/py_pkg_treemap.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: test-example-scripts (assets/scripts/cluster/composition/cluster_compositions_matbench.py)
- GitHub Check: test-example-scripts (assets/scripts/sunburst/chem_env_sunburst.py)
- GitHub Check: test-example-scripts (assets/scripts/ptable/ptable_heatmap_splits_plotly.py)
- GitHub Check: tests (ubuntu-latest, 4) / tests
- GitHub Check: tests (ubuntu-latest, 1) / tests
- GitHub Check: tests (ubuntu-latest, 3) / tests
- GitHub Check: tests (ubuntu-latest, 2) / tests
- GitHub Check: build / build
🔇 Additional comments (13)
pymatviz/treemap/py_pkg.py (7)
69-76: Clean implementation of package input normalization.The function properly handles both string and module object inputs, with appropriate fallback for edge cases.
95-133: Enhanced package path resolution with comprehensive fallback strategies.Good addition of sys.path search and multiple fallback mechanisms to handle various package installation scenarios.
196-279: Well-implemented coverage data collection with multiple sources.Good security practices with HTTPS-only URLs and hardcoded subprocess commands. Proper error handling and temporary file cleanup.
281-299: Updated to support module objects with proper normalization.Good integration with the new
_normalize_package_inputhelper function.
389-451: Coverage path matching with proper disambiguation for filename conflicts.Good implementation that addresses the previous concern about false positives by using path similarity to select the best match when multiple files share the same basename.
836-896: Excellent implementation of weighted coverage averages for parent nodes.The logic correctly calculates coverage percentages for parent nodes as line-weighted averages of their children, ensuring accurate hierarchical visualization.
968-1042: Well-organized text template building with comprehensive handling of display modes.Good use of nested functions to encapsulate logic and proper validation of input parameters.
tests/treemap/test_py_pkg_treemap.py (6)
141-169: Comprehensive package path testing with appropriate slow test marking.Good coverage of different package scenarios including local, installed, and non-existent packages.
701-722: Excellent test coverage for package input normalization.Covers all input scenarios including strings, module objects, submodules, and edge cases.
754-778: Thorough testing of coverage data collection with proper error handling.Good coverage of success and error scenarios including file operations and JSON parsing.
907-978: Well-designed subprocess testing with comprehensive mocking.Excellent use of mocks to test subprocess scenarios without actual subprocess execution.
1107-1206: Excellent regression test for weighted coverage averages.This test effectively prevents the regression where all submodules showed the same coverage value. Good verification of both parent node averages and leaf node values.
1208-1248: Comprehensive cell border parameter testing.Excellent coverage of all border scenarios including defaults, custom values, and mode-specific behavior.
py_pkg_treemapinreadme.mdto demonstrate coverage heatmaps for pymatgen+pymatvizkaleidodependency from==0.2.1to>=1.0.0inpyproject.tomlpy_pkg_treemap.pyto support module objectsSummary by CodeRabbit
New Features
Documentation
Bug Fixes
Tests
Chores