Skip to content

Conversation

j-t-1
Copy link
Contributor

@j-t-1 j-t-1 commented Aug 5, 2025

Also correct types in CCITTParameters.

@j-t-1 j-t-1 changed the title ENH: Add BlackIs1t functionality to tiff_header ENH: Add BlackIs1 functionality to tiff_header Aug 5, 2025
Copy link

codecov bot commented Aug 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.97%. Comparing base (0b58493) to head (4555d87).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3421   +/-   ##
=======================================
  Coverage   96.97%   96.97%           
=======================================
  Files          54       54           
  Lines        9323     9324    +1     
  Branches     1707     1708    +1     
=======================================
+ Hits         9041     9042    +1     
  Misses        168      168           
  Partials      114      114           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@j-t-1
Copy link
Contributor Author

j-t-1 commented Aug 6, 2025

It would be easier to have defaults in CCITTParameters for columns. Then we do not have to set them in _get_parameters.

pypdf/pypdf/filters.py

Lines 589 to 596 in 56f0eaa

@staticmethod
def _get_parameters(
parameters: Union[None, ArrayObject, DictionaryObject, IndirectObject],
rows: Union[int, IndirectObject],
) -> CCITTParameters:
# §7.4.6, optional parameters for the CCITTFaxDecode filter
k = 0
columns = 1728

But then if they are not retrieved they would be None. The design of this could be better.

@stefan6419846
Copy link
Collaborator

But then if they are not retrieved they would be None. The design of this could be better.

This is wrong. The default value currently seems to be 0 instead of 1728.

The main change we would have to do here would be creating the dataclass instance at the beginning and writing its attributes directly.

@j-t-1
Copy link
Contributor Author

j-t-1 commented Aug 6, 2025

But then if they are not retrieved they would be None. The design of this could be better.

This is wrong. The default value currently seems to be 0 instead of 1728.

The main change we would have to do here would be creating the dataclass instance at the beginning and writing its attributes directly.

So we need to change the default to be 1728. Then initiate the dataclass, obtaining the defaults from CCITTParameters? If all the defaults (from the specification) are there then it is easier to understand.

@j-t-1
Copy link
Contributor Author

j-t-1 commented Aug 6, 2025

FAILED tests/test_filters.py::test_ccitt_get_parameters[None-0] - UnboundLocalError: cannot access local variable 'black_is_1' where it is not associated with a value

The test does not use the new variable. Is this error saying although it is unused in the test it needs to be used, kinda like coverage tests?

@stefan6419846
Copy link
Collaborator

The test does not use the new variable. Is this error saying although it is unused in the test it needs to be used, kinda like coverage tests?

You never initialize this variable in the _get_parameters method. If the BlackIs1 parameter is omitted in the PDF, this will fail.

This should be solved by implementing the proposal from #3421 (comment) as well.

@j-t-1
Copy link
Contributor Author

j-t-1 commented Aug 6, 2025

@stefan6419846 do you know how to fix the new set of errors?

@stefan6419846
Copy link
Collaborator

Not directly. You might want to try setting int(params.BlackIs1) to fixed 0 again. If this works, this most likely indicates that Pillow somehow refuses to load such TIFF images directly and we would most likely have to revert/ignore most of the changes of this PR - or maybe we find a way to further discuss this with upstream if I find the time to have a deeper look into this myself.

@j-t-1
Copy link
Contributor Author

j-t-1 commented Aug 6, 2025

Agree to fix int(params.BlackIs1) to 0 (it worked before). The changes we have made in this PR make it more understandable than before. We may be able to get this over the line.

rows in _get_parameters is unused, have put that back.

@stefan6419846 stefan6419846 changed the title ENH: Add BlackIs1 functionality to tiff_header ENH: Move BlackIs1 functionality to tiff_header Aug 8, 2025
@stefan6419846 stefan6419846 merged commit b622a2f into py-pdf:main Aug 8, 2025
17 checks passed
@j-t-1 j-t-1 deleted the CCITTParameters branch August 8, 2025 08:30
stefan6419846 added a commit that referenced this pull request Aug 11, 2025
## What's new

### Security (SEC)
- Limit decompressed size for FlateDecode filter (#3430) by @stefan6419846

### Deprecations (DEP)
- Drop Python 3.8 support (#3412) by @stefan6419846

### New Features (ENH)
- Move BlackIs1 functionality to tiff_header (#3421) by @j-t-1

### Robustness (ROB)
- Skip Go-To actions without a destination (#3420) by @badGarnet

### Developer Experience (DEV)
- Update code style related libraries (#3414) by @stefan6419846
- Update mypy to 1.17.0 (#3413) by @stefan6419846
- Stop testing on Python 3.8 and start testing on Python 3.14 (#3411) by @stefan6419846

### Maintenance (MAINT)
- Cleanup deprecations (#3424) by @stefan6419846

[Full Changelog](5.9.0...6.0.0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants