Add logic to track rendering area of various PDF ops #19043
2475b16 to 5a6a877
49c4689 to eac70e4
master...nicolo-ribaudo:pdf.js:draw-page-portion-optimized is a branch merging this PR together with #19128. In the video below you can see that it first renders a low-resolution image in the background "the old way", taking 12 seconds, and then renders the "detail view" on top, taking only 1.4 seconds and running only one fifth of the PDF operations :)
Screen.Recording.2024-12-17.at.18.10.30.mp4
I'm still keeping this as a draft because there are significant bugs (in the PDF I'm using for testing, it often skips rendering some pieces of text even if they are visible on screen, or it renders some paths with the wrong color), but it's nice to see some progress.
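To illustrate why the detail view can run only a fraction of the operations: once each group of ops has a bounding box, only the groups that overlap the visible area (plus the ops they depend on) need to be executed. The following is a minimal sketch of that selection, not the code from this PR. The `opGroups` shape and the names `bboxesIntersect` and `opsForViewport` are assumptions made up for the example.

```javascript
// Hypothetical data: each group covers an inclusive range of op indexes,
// has a bounding box [minX, minY, maxX, maxY] and a list of dependencies.
const opGroups = [
  { start: 1, end: 3, bbox: [10, 10, 60, 20], dependencies: [0] }, // text
  { start: 4, end: 5, bbox: [100, 100, 150, 150], dependencies: [0] }, // path
];

// True when two boxes overlap on both axes.
function bboxesIntersect(a, b) {
  return a[0] < b[2] && b[0] < a[2] && a[1] < b[3] && b[1] < a[3];
}

// Returns the sorted op indexes that have to run to draw `viewport`.
function opsForViewport(groups, viewport) {
  const ops = new Set();
  for (const group of groups) {
    if (!bboxesIntersect(group.bbox, viewport)) {
      continue; // Nothing from this group is visible: skip its ops.
    }
    for (let i = group.start; i <= group.end; i++) {
      ops.add(i);
    }
    for (const dep of group.dependencies) {
      ops.add(dep);
    }
  }
  return [...ops].sort((a, b) => a - b);
}

// Only the text is inside this viewport, so ops 0, 1, 2 and 3 are selected.
console.log(opsForViewport(opGroups, [0, 0, 80, 40]));
```

In this sketch a dependency is always a single op index; transitive dependencies would have to be resolved when the groups are recorded.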
Very good progress on this! This is a feature the community has been waiting for for a long time. Can't wait to see more progress on this.
4cd3d42 to e24c57c
e24c57c to 8184a06
8184a06 to cad8d31
Update!
This video shows how we are skipping some ops while rendering the detail view as we scroll around the page :)
Screen.Recording.2025-06-02.at.16.00.49.mov
The main missing task is properly hooking this logic up to the reftests, maybe by rendering a fraction of the page with the new logic and checking that it matches the same fraction of the page rendered without the optimization. Once this is done, I can go through the failing tests one by one and add the missing tracking.
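The reftest idea described above could look roughly like this: render the page twice and compare only the pixels inside the fraction that the optimized path was asked to draw. This is a sketch under simplifying assumptions: both renderings are page-sized RGBA buffers of the same width, and `paintRect` and `regionMatches` are hypothetical helpers, not part of the pdf.js test runner.

```javascript
// Fill a rectangle [x0, y0, x1, y1] of an RGBA buffer with one byte value.
function paintRect(pixels, width, [x0, y0, x1, y1], value) {
  for (let y = y0; y < y1; y++) {
    for (let x = x0; x < x1; x++) {
      const i = (y * width + x) * 4;
      pixels.fill(value, i, i + 4);
    }
  }
}

// True when both buffers hold identical pixels inside `rect`.
function regionMatches(full, partial, width, [x0, y0, x1, y1]) {
  for (let y = y0; y < y1; y++) {
    for (let x = x0; x < x1; x++) {
      const i = (y * width + x) * 4;
      for (let c = 0; c < 4; c++) {
        if (full[i + c] !== partial[i + c]) {
          return false;
        }
      }
    }
  }
  return true;
}

// Stand-ins for the two renderings of a 4x4 page.
const width = 4;
const fullRender = new Uint8ClampedArray(width * width * 4);
const partialRender = new Uint8ClampedArray(width * width * 4);
paintRect(fullRender, width, [0, 0, 4, 4], 200); // unoptimized: whole page
paintRect(partialRender, width, [0, 0, 2, 2], 200); // optimized: one corner

// The renderings must agree inside the requested fraction only.
const cornerMatches = regionMatches(fullRender, partialRender, width, [0, 0, 2, 2]);
const wholePageMatches = regionMatches(fullRender, partialRender, width, [0, 0, 4, 4]);
```

An exact comparison is the strictest possible check; a real reftest might need a tolerance for anti-aliasing differences.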
c294316 to 740b221
From: Bot.io (Linux m4)
Received
Command cmd_test from @nicolo-ribaudo received. Current queue size: 0
Live output at: http://54.241.84.105:8877/65835d880b1e3fa/output.txt

From: Bot.io (Windows)
Received
Command cmd_test from @nicolo-ribaudo received. Current queue size: 0
Live output at: http://54.193.163.58:8877/3d1a5e199c07639/output.txt
d676b96 to 9b0b5b9
From: Bot.io (Linux m4)
Failed
Full output at http://54.241.84.105:8877/65835d880b1e3fa/output.txt
Total script time: 60.00 mins

/botio test

From: Bot.io (Linux m4)
Received
Command cmd_test from @nicolo-ribaudo received. Current queue size: 0
Live output at: http://54.241.84.105:8877/5c0824b19d5019f/output.txt

From: Bot.io (Windows)
Received
Command cmd_test from @nicolo-ribaudo received. Current queue size: 1
Live output at: http://54.193.163.58:8877/9087a0a216afcf8/output.txt
There are two failures in the new tests:
They only happen in headless Firefox, not in "full" Firefox or in Chrome, and the diff is that black lines are very slightly thicker. Any idea what it could be?
From: Bot.io (Linux m4)
Failed
Full output at http://54.241.84.105:8877/5c0824b19d5019f/output.txt
Total script time: 33.99 mins
Image differences available at: http://54.241.84.105:8877/5c0824b19d5019f/reftest-analyzer.html#web=eq.log

From: Bot.io (Windows)
Failed
Full output at http://54.193.163.58:8877/3d1a5e199c07639/output.txt
Total script time: 150.75 mins

From: Bot.io (Windows)
Failed
Full output at http://54.193.163.58:8877/9087a0a216afcf8/output.txt
Total script time: 62.82 mins
This commit is a first step towards mozilla#6419, and it can also help when rendering only part of a page, by first computing which ops can affect what is visible in that part of the page.

This commit adds logic to track "groups of ops" with their respective bounding boxes. Each group either corresponds to a single op or to a range, and it can have dependencies earlier in the ops list that are not contiguous to the range.

Consider the following example:
```
0. setFillRGBColor
1. beginText
2. showText "Hello"
3. endText
4. constructPath [...]
5. eoFill
```
Here we have two groups: the text (range 1-3) and the path (range 4-5). Each of them has a corresponding bounding box, and a dependency on the op at index 0.

This tracking happens when first rendering a PDF: we wrap the canvas with a "canvas recorder" that has the same API, but with additional methods to mark the start/end of a group.
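As a rough illustration of the "canvas recorder" idea, the wrapper below forwards a drawing call to the wrapped context and accumulates a bounding box between the start and the end of a group. It is a simplified sketch, not the recorder from this PR: the class name `RecordingContext`, the methods `startGroup`/`endGroup`, and the stub context are all assumptions, only `fillRect` is wrapped, and transforms are ignored.

```javascript
// Hypothetical recorder: same drawing API as the wrapped context, plus
// startGroup()/endGroup() to delimit a group of ops.
class RecordingContext {
  #ctx;
  #current = null;
  groups = [];

  constructor(ctx) {
    this.#ctx = ctx;
  }

  startGroup(opIdx) {
    this.#current = {
      start: opIdx,
      end: opIdx,
      bbox: [Infinity, Infinity, -Infinity, -Infinity],
    };
  }

  endGroup(opIdx) {
    this.#current.end = opIdx;
    this.groups.push(this.#current);
    this.#current = null;
  }

  fillRect(x, y, w, h) {
    if (this.#current) {
      // Grow the bbox of the open group (w and h may be negative).
      const b = this.#current.bbox;
      b[0] = Math.min(b[0], x, x + w);
      b[1] = Math.min(b[1], y, y + h);
      b[2] = Math.max(b[2], x, x + w);
      b[3] = Math.max(b[3], y, y + h);
    }
    this.#ctx.fillRect(x, y, w, h); // Always forward to the real context.
  }
}

// Stub standing in for a CanvasRenderingContext2D, so the sketch runs
// outside a browser.
const forwardedCalls = [];
const stubCtx = { fillRect: (...args) => forwardedCalls.push(args) };

const recorder = new RecordingContext(stubCtx);
recorder.startGroup(4);
recorder.fillRect(10, 20, 30, 40);
recorder.fillRect(0, 0, 5, 5);
recorder.endGroup(5);
// recorder.groups[0] now covers ops 4-5 with bbox [0, 0, 40, 60].
```

A wrapper like this adds work to every drawing call, so it should only be used while recording.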
When using the pdf debugger, when hovering over a step now:
- it highlights the steps in the same groups
- it highlights the steps that they depend on
- it highlights on the PDF itself the bounding box
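The hover behaviour listed above boils down to a lookup from an op index to its group. A minimal sketch, assuming the same hypothetical group shape as in the commit message example (`debugGroups` and `describeHoveredOp` are illustrative names, not debugger APIs):

```javascript
// Groups for the six-op example: text (ops 1-3) and path (ops 4-5).
const debugGroups = [
  { start: 1, end: 3, bbox: [10, 10, 60, 20], dependencies: [0] },
  { start: 4, end: 5, bbox: [100, 100, 150, 150], dependencies: [0] },
];

// Returns what the debugger would need to highlight for a hovered op,
// or null when the op does not belong to any group.
function describeHoveredOp(groups, opIdx) {
  const group = groups.find(g => opIdx >= g.start && opIdx <= g.end);
  if (!group) {
    return null;
  }
  const sameGroup = [];
  for (let i = group.start; i <= group.end; i++) {
    sameGroup.push(i);
  }
  return {
    sameGroup, // steps to highlight as part of the same group
    dependencies: [...group.dependencies], // steps the group depends on
    bbox: group.bbox, // rectangle to draw on top of the page
  };
}

const hovered = describeHoveredOp(debugGroups, 2); // hovering `showText`
```

In this sketch, hovering an op that is only a dependency (index 0 here) returns `null`; how the real debugger treats that case is not stated in the PR.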
Account for line width when stroking
Workaround for paintImageMaskXObject
Fix transform tracking when using a temporary canvas
Track more text dependencies
Track text movement properly
Forward GState dep for transparency groups
showText affects positioning of next text on the same line
Mark `bug1734802-partial` as known mismatch
Track text font/color across `beginText` calls
Fix tracking of transitive dependencies
Reset sameLineText in beginText
Mark `bug1443140-partial` as a known mismatch
Minor cleanup
Do not allocate throwaway arrays
Fix tracking of leading for moveText
Mark issue13130-partial as known partial mismatch
Fix tracking of some text drawn by paintChar
Account for unbalanced save/restore
Mark artofwar-partial as known mismatch
Track bbox of type 3 glyphs
Track bbox of paintSolidColorImageMask
Temporarily skip issue8078-partial
Fix .transform call in TilingPattern helper
Fix tracking of dependencies of TilingPattern
Account for PDFs with no drawings
PDFs with more endText than beginText
Track marked content blocks
Add missing recordFullPageBBox to CanvasNestedDependencyTracker
Mark pr8808-partial as known partial mismatch
Fix _createMaskCanvas tracking
Mark issue12295-partial as known partial mismatch
Use full page bbox for type 3 fonts with no bbox
Track bbox of paintImageMaskXObjectGroup
Mark bug1365930-partial as known partial mismatch
Use CanvasNestedDependencyTracker for type3 fonts
Cache variables in hot loops
Track paintInlineImageXObjectGroup bbox before restoring ctx
Mark `issue1905-partial` as a known mismatch (it's not visible, at the edge)
Track bbox in paintChar when patternFill/patternStroke
Todo
Fix tracking of smask group transform
Mark issue1466-partial as known partial mismatch
Avoid multiple CanvasNestedDependencyTracker
Mark bug1898802-partial as known mismatch
Fix bbox computing with rotations
Fix bbox computing of text drawn by clipping
Mark bug887152-partial as known partial mismatch
Mark issue4926-partial as known mismatch
Track text-based clip as dependencies
Handle unbalanced save/restore in type3 fonts
Ensure that there is a bbox in type3 font operations
Fallback for fonts without bbox
Track fill dependencies for shadingFill
Track filters
Mark issue17779-partial as known mismatch
Mark a couple more known partial mismatches
Fix bbox tracking for invalidPDFjsFont
Avoid double fontMatrix transform
this.ctx -> ctx
Use a float32array for pendingBBox
Use existing axialAlignedBoundingBox for bbox computation
Use bbox stored in font when possible
Add test case for untrustworthy font bbox
Cache ctx stack transform multiplication
Fix recordBBox in CanvasNestedDependencyTracker
Remove unnecessary sorting
Track rectangular clip boxes
Intersect bbox with bbox of clip path
data.idx -> idx
Fix TS types
Fix knownPartialMismatch markings
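Several of the commits listed above deal with bounding box geometry: rotations, stroke line width, and clipping. The sketch below shows the general technique only and is not the code from these commits. The helper names are hypothetical, the transform uses the canvas 2D `[a, b, c, d, e, f]` convention, and the line width is assumed to be already expressed in device units.

```javascript
// Axis-aligned bounding box of a rectangle after applying a transform:
// all four corners must be transformed, otherwise rotations give a
// wrong result.
function transformedBBox([x0, y0, x1, y1], [a, b, c, d, e, f]) {
  const xs = [];
  const ys = [];
  for (const [x, y] of [[x0, y0], [x1, y0], [x0, y1], [x1, y1]]) {
    xs.push(a * x + c * y + e);
    ys.push(b * x + d * y + f);
  }
  return [Math.min(...xs), Math.min(...ys), Math.max(...xs), Math.max(...ys)];
}

// A stroke extends half of the line width on each side of the path.
function expandForStroke([x0, y0, x1, y1], lineWidth) {
  const half = lineWidth / 2;
  return [x0 - half, y0 - half, x1 + half, y1 + half];
}

// Intersection of two boxes, or null when they do not overlap.
function intersectBBox(a, b) {
  const r = [
    Math.max(a[0], b[0]),
    Math.max(a[1], b[1]),
    Math.min(a[2], b[2]),
    Math.min(a[3], b[3]),
  ];
  return r[0] < r[2] && r[1] < r[3] ? r : null;
}

// A 10x20 rectangle rotated by 90 degrees, stroked with a width of 2,
// then limited to a clip box.
const rotated = transformedBBox([0, 0, 10, 20], [0, 1, -1, 0, 0, 0]);
const stroked = expandForStroke(rotated, 2);
const visible = intersectBBox(stroked, [-5, -5, 100, 100]);
```

Intersecting with the clip box keeps a group's bbox from covering areas where nothing can actually be painted, which makes it possible to skip more ops.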
c19bd43 to 1d5754d
/botio test

From: Bot.io (Windows)
Received
Command cmd_test from @nicolo-ribaudo received. Current queue size: 0
Live output at: http://54.193.163.58:8877/e977ca91524b114/output.txt

From: Bot.io (Linux m4)
Received
Command cmd_test from @nicolo-ribaudo received. Current queue size: 0
Live output at: http://54.241.84.105:8877/44ba5a9fb922d6c/output.txt

From: Bot.io (Linux m4)
Failed
Full output at http://54.241.84.105:8877/44ba5a9fb922d6c/output.txt
Total script time: 17.67 mins
Image differences available at: http://54.241.84.105:8877/44ba5a9fb922d6c/reftest-analyzer.html#web=eq.log

Fix rebasing mistake

From: Bot.io (Windows)
Failed
Full output at http://54.193.163.58:8877/e977ca91524b114/output.txt
Total script time: 37.64 mins
Image differences available at: http://54.193.163.58:8877/e977ca91524b114/reftest-analyzer.html#web=eq.log

/botio test

From: Bot.io (Linux m4)
Received
Command cmd_test from @nicolo-ribaudo received. Current queue size: 0
Live output at: http://54.241.84.105:8877/3edb10cbbfc3cf2/output.txt

From: Bot.io (Windows)
Received
Command cmd_test from @nicolo-ribaudo received. Current queue size: 0
Live output at: http://54.193.163.58:8877/54120fbb86bd19d/output.txt

From: Bot.io (Linux m4)
Failed
Full output at http://54.241.84.105:8877/3edb10cbbfc3cf2/output.txt
Total script time: 17.47 mins
Image differences available at: http://54.241.84.105:8877/3edb10cbbfc3cf2/reftest-analyzer.html#web=eq.log

From: Bot.io (Windows)
Failed
Full output at http://54.193.163.58:8877/54120fbb86bd19d/output.txt
Total script time: 33.97 mins
Image differences available at: http://54.193.163.58:8877/54120fbb86bd19d/reftest-analyzer.html#web=eq.log
I started working towards #6419. This PR introduces the logic to track where different elements of the PDF are rendered, and hooks it up to the debugger since @calixteman mentioned that it would be useful.
I'm marking this as draft because there are a few changes I need to make:
- [...] `canvas.js` to receive the index as a param, rather than returning a function that takes the index
- [...] `CanvasRecorder`, so that when not recording it doesn't have a performance impact.

However, I'd love to receive feedback on the direction.
Commit 1:
Commit 2:
This is an example of what the debugger integration looks like (note: I couldn't figure out how to make my cursor show up in the recording 😅 I'm moving it over the steps list):
Screen.Recording.2024-11-14.at.16.35.58.mov
By default it doesn't show all the bounding boxes because on some PDFs it's too much noise, but if you click on the checkbox then it shows the boxes and you can click on a box to scroll into view the corresponding ops.