enhancement: use pdfium for rendering pdf #455
    try:
        images: dict[int, Image.Image] = {}
        if dpi is None:
            dpi = int(os.environ.get("PDF_RENDER_DPI", 400))
Default DPI changed from 200 to 400
Medium Severity
The default DPI of `convert_pdf_to_image` changed from 200 to 400. The old signature had `dpi: int = 200`, but the new signature uses `dpi: Optional[int] = None` with a fallback to `int(os.environ.get("PDF_RENDER_DPI", 400))`. External callers relying on the previous default will now receive images with 4x as many pixels (2x the resolution in each dimension), which means roughly 4x the memory per rendered page and different output quality. Internal callers are unaffected since they explicitly pass `dpi=pdf_image_dpi`, which defaults to 200.
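To make the size of the change concrete, here is a minimal sketch of one way the fallback could keep the old default while still honoring `PDF_RENDER_DPI`. The names `resolve_dpi`, `pixel_size`, and `LEGACY_DEFAULT_DPI` are hypothetical and do not appear in the PR; only the environment variable and the 200/400 values come from the comment above.

```python
import os
from typing import Optional

# Hypothetical constant: the default of the old signature (dpi: int = 200).
LEGACY_DEFAULT_DPI = 200


def resolve_dpi(dpi: Optional[int] = None, fallback: int = LEGACY_DEFAULT_DPI) -> int:
    """Explicit argument wins, then PDF_RENDER_DPI, then the fallback."""
    if dpi is not None:
        return dpi
    return int(os.environ.get("PDF_RENDER_DPI", fallback))


def pixel_size(width_pt: float, height_pt: float, dpi: int) -> tuple[int, int]:
    """Convert a page size in PDF points (1/72 inch) to pixels at the given DPI."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


# US Letter is 612 x 792 points.
w200, h200 = pixel_size(612, 792, 200)  # (1700, 2200)
w400, h400 = pixel_size(612, 792, 400)  # (3400, 4400)
pixel_ratio = (w400 * h400) / (w200 * h200)  # 4.0
```

With a fallback of 200, callers that never passed `dpi` would see the same output as before, and anyone who wants 400 can opt in through the environment variable. Whether that trade-off is preferable to the new default is a call for the maintainers.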
@claude, review
Claude finished @cragwolfe's task.
Code Review for pypdfium2 Migration
Reviewing the switch from pdf2image to pypdfium2 for PDF rendering.
Todo List:
High-Priority Issues That Need Fixing
🔴 Critical Runtime Error -
cragwolfe left a comment
(though did Claude get confused by resolved cursor comments?)
Switch to pypdfium2 for rendering.
Note
Switches PDF rendering backend from `pdf2image` to `pypdfium2` and updates APIs/call sites accordingly.
- Replaces `pdf2image` with `pypdfium2` in `inference/layout.py`; imports and calls updated
- `convert_pdf_to_image` supports `filename` or file-like input, page ranges (`first_page`/`last_page`), optional `password`, and DPI defaulting to `PDF_RENDER_DPI` (fallback 400); saves to `output_folder` when provided and returns paths when `path_only`
- Updates `DocumentLayout.from_file` and `PageLayout._get_image` to use the new function
- Bumps version to `1.1.5` and updates `CHANGELOG.md` noting performance improvements
Written by Cursor Bugbot for commit a9e68f4. This will update automatically on new commits.
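For readers unfamiliar with `pypdfium2`, the following is a rough sketch of what page-range rendering with that library can look like. It is not the PR's implementation: `resolve_page_range` and `render_pages_sketch` are hypothetical names, the 1-based inclusive page numbering and the 1-based dictionary keys are assumptions borrowed from the `pdf2image` convention, and `output_folder`/`path_only` handling is left out.

```python
from __future__ import annotations

import os
from typing import Any, BinaryIO, Optional, Union


def resolve_page_range(
    n_pages: int, first_page: Optional[int] = None, last_page: Optional[int] = None
) -> range:
    """Map 1-based inclusive page bounds (assumed convention) to 0-based indices."""
    first = 1 if first_page is None else max(1, first_page)
    last = n_pages if last_page is None else min(n_pages, last_page)
    return range(first - 1, last)


def render_pages_sketch(
    source: Union[str, BinaryIO],
    dpi: Optional[int] = None,
    first_page: Optional[int] = None,
    last_page: Optional[int] = None,
    password: Optional[str] = None,
) -> dict[int, Any]:
    """Render the selected pages to PIL images, keyed by 1-based page number."""
    # Imported lazily so the range helper above has no third-party dependency.
    import pypdfium2 as pdfium

    if dpi is None:
        dpi = int(os.environ.get("PDF_RENDER_DPI", 400))

    pdf = pdfium.PdfDocument(source, password=password)
    try:
        images: dict[int, Any] = {}
        for index in resolve_page_range(len(pdf), first_page, last_page):
            page = pdf[index]
            # PDF user space is 72 units per inch, so scale = dpi / 72.
            images[index + 1] = page.render(scale=dpi / 72).to_pil()
        return images
    finally:
        pdf.close()
```

Calling `render_pages_sketch("doc.pdf", dpi=200, first_page=2, last_page=4)` would, under these assumptions, return images for pages 2 through 4 of a hypothetical `doc.pdf`.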