The Apache PDFBox library is an open-source Java tool for working with PDF documents. It supports creating new PDF documents, manipulating existing ones, and extracting content from them. PDFBox also includes several command-line utilities. PDFBox is published under the Apache License, Version 2.0.
Extract Text: Extract Unicode text from PDF files.
Split & Merge: Split a single PDF into many files or merge multiple PDF files.
Fill Forms: Extract data from PDF forms or fill a PDF form.
Preflight: Validate PDF files against the PDF/A-1b standard.
Print: Print a PDF file using the standard Java printing API.
Save as Image: Save PDFs as image files, such as PNG or JPEG.
Create PDFs: Create a PDF from scratch, with embedded fonts and images.
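
To make two of the features above concrete (Create PDFs and Extract Text), here is a minimal sketch that writes a one-page PDF in memory and then reads its text back. It assumes the PDFBox 3.x API on the classpath (`Loader.loadPDF` and `Standard14Fonts`); in PDFBox 2.x the equivalent calls are `PDDocument.load` and `PDType1Font.HELVETICA`. The class name `PdfRoundTrip` and its helper methods are hypothetical names chosen for this illustration, not part of the library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical example class; assumes PDFBox 3.x is available.
class PdfRoundTrip {

    // Builds a one-page PDF containing `message` and returns it as bytes.
    static byte[] createPdf(String message) throws IOException {
        try (PDDocument doc = new PDDocument();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            PDPage page = new PDPage(); // default page size is US Letter
            doc.addPage(page);

            // The content stream must be closed before the document is saved.
            try (PDPageContentStream content = new PDPageContentStream(doc, page)) {
                content.beginText();
                content.setFont(new PDType1Font(Standard14Fonts.FontName.HELVETICA), 12);
                content.newLineAtOffset(72, 700); // one inch from the left, near the top
                content.showText(message);
                content.endText();
            }

            doc.save(out);
            return out.toByteArray();
        }
    }

    // Loads a PDF from bytes and returns its text content.
    static String extractText(byte[] pdfBytes) throws IOException {
        try (PDDocument doc = Loader.loadPDF(pdfBytes)) {
            return new PDFTextStripper().getText(doc);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] pdf = createPdf("Hello PDFBox");
        System.out.println(extractText(pdf).trim());
    }
}
```

The sketch uses Helvetica, one of the standard 14 PDF fonts, to stay short; such fonts are referenced rather than embedded. Embedding a font means loading a font file instead, which is left out here.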