PDFKit.GetPDFPageText
Queries the text of a page in a PDF document.
Component | Version | macOS | Windows | Linux | Server | iOS SDK |
PDFKit | 2.1 | ✅ Yes | ❌ No | ❌ No | ✅ Yes, on macOS | ✅ Yes |
MBS( "PDFKit.GetPDFPageText"; PDF; index )
Parameters
Parameter | Description | Example |
---|---|---|
PDF | A container value with the PDF content from a media field, a text with a URL, or a PDF reference from PDFKit.Open. | |
index | The index of the page. From zero to PDFKit.GetPDFPageCount-1. | 5 |
Result
The text of the PDF page as far as PDFKit knows it.
Description
Queries the text of a page in a PDF document. If you need the text from all pages, please use PDFKit.GetPDFText.
For solutions on Windows, Linux or iOS, please use DynaPDF.ExtractPageText.
You may need to use Text.ConvertUnicodeToCharacterComposition if the text comes back with decomposed Unicode characters.
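As a minimal sketch, the two calls can be nested so that the page text is recomposed as soon as it is read. This assumes $ref already holds a PDF reference (or container) as in the example on this page, and that Text.ConvertUnicodeToCharacterComposition takes the text as its only parameter; please check that function's own page before relying on it.

```
// Assumption: $ref was set earlier, e.g. from PDFKit.Open or a container field.
// Assumption: Text.ConvertUnicodeToCharacterComposition accepts the text as its single parameter.
MBS( "Text.ConvertUnicodeToCharacterComposition";
  MBS( "PDFKit.GetPDFPageText"; $ref; 5 )
)
```

This is mainly useful when the extracted text is compared or searched against text typed in FileMaker, where a decomposed "é" (e plus a combining accent) would otherwise not match the composed character.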
DynaPDF.ExtractText sorts text blocks by position, but the PDFKit functions can only return the text in the order it is stored in the PDF, independent of its position on the page. The text is suitable for indexing or search, but it may not be in the order you would read it.
Examples
Extract the text of page 6 (index 5) of a PDF
MBS( "PDFKit.GetPDFPageText"; $ref; 5 )
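Collect the text of every page with a page label in front of each one

The following is a sketch only, not a tested solution. It assumes $ref holds a PDF reference or container as above, that PDFKit.GetPDFPageCount takes the same PDF parameter, and that your FileMaker version provides the While function. The names pageCount, pageIndex and allText are made up for illustration. If you just need the plain text of all pages, PDFKit.GetPDFText is the simpler choice.

```
While (
  [
    // Assumption: PDFKit.GetPDFPageCount accepts the same PDF parameter as PDFKit.GetPDFPageText.
    pageCount = MBS( "PDFKit.GetPDFPageCount"; $ref ) ;
    pageIndex = 0 ;
    allText = ""
  ] ;
  // Indexes run from zero to page count - 1.
  pageIndex < pageCount ;
  [
    // The label shows the page number starting at 1, while the index starts at 0.
    allText = allText & "--- Page " & ( pageIndex + 1 ) & " ---" & ¶
      & MBS( "PDFKit.GetPDFPageText"; $ref; pageIndex ) & ¶ ;
    pageIndex = pageIndex + 1
  ] ;
  allText
)
```

The sketch does no error checking. If the page count call fails, the loop will most likely not run and the result stays empty, so you may want to inspect the result of PDFKit.GetPDFPageCount first.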
See also
- DynaPDF.ExtractPageText
- DynaPDF.ExtractText
- OCR.SetImage
- PDFKit.GetPDFDocument
- PDFKit.GetPDFPageCount
- PDFKit.GetPDFPagePDF
- PDFKit.GetPDFPagePDFRef
- PDFKit.GetPDFPageValue
- PDFKit.GetPDFText
- Text.ConvertUnicodeToCharacterComposition
This function checks for a license.
Created 18th August 2014, last changed 11th April 2023