Apache PDFBox 2.0.22 released: open-source Java tool
Apache PDFBox
The Apache PDFBox library is an open-source Java tool for working with PDF documents. It supports creating new PDF documents, manipulating existing documents, and extracting content from documents. PDFBox also includes several command-line utilities. PDFBox is published under the Apache License, Version 2.0.
Features
- Extract Text: Extract Unicode text from PDF files.
- Split & Merge: Split a single PDF into many files or merge multiple PDF files.
- Fill Forms: Extract data from PDF forms or fill a PDF form.
- Preflight: Validate PDF files against the PDF/A-1b standard.
- Print: Print a PDF file using the standard Java printing API.
- Save as Image: Save PDFs as image files, such as PNG or JPEG.
- Create PDFs: Create a PDF from scratch, with embedded fonts and images.
- Signing: Digitally sign PDF files.
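As a rough illustration of the "Create PDFs" and "Extract Text" features listed above, the following sketch builds a one-page document in memory and reads its text back. It assumes a PDFBox 2.0.x jar (for example 2.0.22) is on the classpath. The class name `PdfBoxRoundTrip` and the helper names `createPdf` and `extractText` are made up for this example and are not part of PDFBox.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical example class; assumes PDFBox 2.0.x on the classpath.
public class PdfBoxRoundTrip {

    // Create a single-page PDF containing one line of text and return its bytes.
    static byte[] createPdf(String text) throws IOException {
        try (PDDocument doc = new PDDocument()) {
            PDPage page = new PDPage();
            doc.addPage(page);
            // The content stream must be closed before the document is saved.
            try (PDPageContentStream cs = new PDPageContentStream(doc, page)) {
                cs.beginText();
                cs.setFont(PDType1Font.HELVETICA, 12);
                cs.newLineAtOffset(72, 700);
                cs.showText(text);
                cs.endText();
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            doc.save(out);
            return out.toByteArray();
        }
    }

    // Load a PDF from memory and extract its text with PDFTextStripper.
    static String extractText(byte[] pdf) throws IOException {
        try (PDDocument doc = PDDocument.load(pdf)) {
            return new PDFTextStripper().getText(doc);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] pdf = createPdf("Hello PDFBox");
        System.out.println(extractText(pdf).trim());
    }
}
```

The sketch keeps everything in a byte array so that it has no file-system dependencies. To work with a file instead, `PDDocument.load(File)` and `PDDocument.save(File)` could be used in the same way.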
Apache PDFBox 2.0.22 has been released.
Bug
[PDFBOX-1532] – extra space added to rotated text
[PDFBOX-1752] – Rendering PDF containing Jpeg2000 fails
[PDFBOX-2633] – saveIncremental java.lang.NullPointerException
[PDFBOX-3683] – Unexpected behavior when setting value for radio button with /Opts entry
[PDFBOX-3891] – Missing data if document is merged with itself
[PDFBOX-3953] – StackOverflowError in org.apache.pdfbox.pdmodel.PDPageTree.getKids
[PDFBOX-4270] – Image in field disappears after flattening
[PDFBOX-4421] – Add support for AES128 encryption for public key
[PDFBOX-4430] – Missing transformation in flattening when an XObject is used as appearance of a form field
[PDFBOX-4617] – PDButton.setValue and PDButton.getOnValueForWidget cannot handle radios with duplicate names and choices
[PDFBOX-4761] – Alignment Issue in textfield
[PDFBOX-4934] – Could not find referenced cmap stream Adobe-Japan1-XXXX
[PDFBOX-4941] – PDRadioButton.getSelectedExportValues() always returns the first entry
[PDFBOX-4944] – Built-in fonts are reporting nbsp char as having zero width.
[PDFBOX-4946] – ArrayIndexOutOfBoundsException while trying to get text from a page
[PDFBOX-4947] – UnsupportedOperationException when using FontMapperImpl.addSubstitute()
[PDFBOX-4949] – “W n” applied to non existent path produces empty clipping result
[PDFBOX-4955] – Flattened form-fields are rendered at the bottom of the page
[PDFBOX-4956] – COSName.hashCode initialized after put to cache, instead before
[PDFBOX-4958] – AcroForm flatten – correct calculation of appearance position
[PDFBOX-4959] – ClassCastException: org.apache.pdfbox.cos.COSStream cannot be cast to org.apache.pdfbox.cos.COSNumber
[PDFBOX-4964] – PDFDebugger Text View for Streams hides errors
[PDFBOX-4969] – java.lang.IndexOutOfBoundsException
[PDFBOX-4980] – Java 6 compile error
[PDFBOX-4984] – Widget Quadding ignored
[PDFBOX-4988] – Space rendered as missing glyph (2)
[PDFBOX-4997] – Incremental update adds certain objects not marked as needing update
[PDFBOX-4999] – Dangerous COSDictionary.addAll(COSDictionary) method
[PDFBOX-5002] – PDFTextStripper sometimes fuses two words on different lines
[PDFBOX-5005] – Resource missing at https://ipafont.ipa.go.jp/
[PDFBOX-5016] – PDButton set subtype methods don’t reset toggled subtype
[PDFBOX-5019] – IllegalArgumentException: miter limit < 1
[PDFBOX-5028] – Partial field names must not contain period characters
[PDFBOX-5033] – CFF FontParser exits with illegal offset in font
[PDFBOX-5040] – Typo in NameRecord table LANGUGAE -> LANGUAGE
[PDFBOX-5041] – NullPointerException in AppearanceGeneratorHelper.insertGeneratedAppearance
[PDFBOX-5042] – IllegalArgumentException when generation of appearances fails
[PDFBOX-5043] – StringIndexOutOfBoundsException in refreshAppearances()
[PDFBOX-5044] – Stack overflow in PDFieldTree.enqueueKids()
[PDFBOX-5046] – StringIndexOutOfBoundsException when doing DateConverter.parseDate()
[PDFBOX-5048] – NullPointerException in PDType1CFont.getStringWidth() and PDType1CFont.getHeight()
New Feature
[PDFBOX-45] – Support incremental save
[PDFBOX-2626] – Regenerate field appearances if NeedAppearances is set prior to rendering
[PDFBOX-2857] – Saving XFA document caused prompt saying Extended features has been disabled
[PDFBOX-2858] – Saving document caused prompt saying Extended features has been disabled
[PDFBOX-4847] – [PATCH] Allow to access raw image data and fix ICC profile embedding in PNGConverter
Improvement
[PDFBOX-3393] – Javascript actions on form fields cause data to become hidden
[PDFBOX-3667] – Handle Widget Annotations as Fields even if there is no AcroForm Fields entry
[PDFBOX-4948] – Add substitute font “ZapfDingbatsITCbyBT-Regular” for ZapfDingbats
[PDFBOX-4971] – Show “raw” pane for content streams in PDFDebugger
[PDFBOX-4977] – Provide format action support capability for AcroForm field
[PDFBOX-4985] – Render orphan annotation widgets
[PDFBOX-4990] – say which resource not found when a font is missing
[PDFBOX-4991] – say when GlyphList is not found what was sought for
[PDFBOX-4993] – if infile is missing, say which one
[PDFBOX-5000] – Allow PDFDebugger and PDF/A validation to skip AcroForm fix ups
[PDFBOX-5004] – Repair AcroForm in PDFDebugger
[PDFBOX-5027] – Protect/Encrypt PDF with multiple certificates on command line
Wish
[PDFBOX-4928] – Could the new rendering method of PageDrawer be optional?
Task
[PDFBOX-4939] – Increase test coverage for AcroForm examples
[PDFBOX-4940] – Increase code coverage for PDImageXObject tests
[PDFBOX-5009] – Corrupt PDF can lead to a StackOverflow
Sub-task
[PDFBOX-2859] – Support Incremental Update for forms