DynaPDF.ExtractText
Extracts the text of the page PageNum.
Component | Version | macOS | Windows | Linux | Server | iOS SDK |
---|---|---|---|---|---|---|
DynaPDF | 8.0 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
Parameters
Parameter | Description | Example | Flags |
---|---|---|---|
PDF | The PDF reference returned from DynaPDF.New. | | |
PageNum | The page number. | 1 | |
Flags | The flags for text extraction. Can include SortTextX, SortTextY, SortTextXY and/or DeleteOverlappingText. | "SortTextXY" | Optional |
AreaLeft | The left coordinate of the area. | | Optional |
AreaTop | The top coordinate of the area. | | Optional |
AreaRight | The right coordinate of the area. | | Optional |
AreaBottom | The bottom coordinate of the area. | | Optional |
Result
Returns text or error.
Description
Extracts the text of the page PageNum. The first page is denoted by 1.
Text lines can be sorted in the x- and y-direction. The flag DeleteOverlappingText deletes identical text records that are placed at the same position (with a tolerance of 2 units). Such records must occur directly one after the other to be detected.
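As a minimal sketch of passing a flag, the call below requests sorting in both directions for page 1. It assumes $pdf already holds a PDF reference with imported pages; only the single flag value shown in the parameter table is used, since the way several flags are combined is not described here.

```
# Sketch: extract page 1 with text sorted in x- and y-direction
# Assumes $pdf was created with DynaPDF.New and pages were imported
Set Variable [ $text ; Value: MBS("DynaPDF.ExtractText"; $pdf; 1; "SortTextXY") ]
```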
The optional area parameters (AreaLeft, AreaTop, AreaRight, AreaBottom) restrict the text extraction to that rectangle. The rectangle must be defined according to the current coordinate system, that is, in either bottom-up or top-down coordinates; see SetPageCoords() for further information. Note also that the function takes the orientation of the page into account: the width and height of the rectangle must be exchanged if the orientation is 90, -90, 270, or -270 degrees.
If the function succeeds the return value is the text. If the function fails the return value is an error.
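Because the result is either the text or an error message, a script may want to check before storing it. The following is a sketch only: it assumes the plugin's general MBS("IsError") function, which is not documented on this page, reports whether the most recent MBS call failed, and it reuses the Test::PageText field from the examples.

```
# Sketch: store the text only if extraction succeeded
Set Variable [ $text ; Value: MBS("DynaPDF.ExtractText"; $pdf; 1; "SortTextXY") ]
# Assumption: MBS("IsError") returns 1 if the last plugin call returned an error
If [ MBS("IsError") ]
    Show Custom Dialog [ "Text extraction failed" ; $text ]
Else
    Set Field [ Test::PageText ; $text ]
End If
```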
Compatibility note: If this function is called with only two parameters, it redirects to the older function DynaPDF.ExtractDocumentText to keep existing scripts working. If the area parameters are omitted or are all zero, no area restriction is applied.
Needs DynaPDF Lite license.
Please use DynaPDF.SetCMapDir to define the CMap folder to handle encodings better.
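A hedged sketch of setting the CMap folder follows. It assumes DynaPDF.SetCMapDir takes the PDF reference and a folder path, and that it is called before the PDF is imported; the path in $cmapPath is a hypothetical placeholder for wherever the CMap files are installed on your machine.

```
# Sketch: point DynaPDF at the CMap folder before importing the PDF
Set Variable [ $pdf ; Value: MBS("DynaPDF.New") ]
# Hypothetical path; replace with the actual location of your CMap folder
Set Variable [ $cmapPath ; Value: "/path/to/DynaPDF/Resource/CMap" ]
# Assumed signature: DynaPDF.SetCMapDir( PDF; Path )
Set Variable [ $r ; Value: MBS("DynaPDF.SetCMapDir"; $pdf; $cmapPath) ]
```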
See also ExtractText function in DynaPDF manual.
Examples
Extract some text:
Set Variable [ $pdf ; Value: MBS("DynaPDF.New") ]
Set Variable [ $r ; Value: MBS("DynaPDF.OpenPDFFromContainer";$pdf; Test::data) ]
Set Variable [ $r ; Value: MBS("DynaPDF.ImportPDFFile";$pdf) ]
Set Field [ Test::PageText ; MBS("DynaPDF.ExtractText"; $pdf; 1; 0) ]
# Cleanup
Set Variable [ $r ; Value: MBS("DynaPDF.Release"; $pdf) ]
Extract text in area:
Set Field [ Test::PageText ; MBS("DynaPDF.ExtractText"; $pdf; 1; "Default"; 200; 200; 400; 400) ]
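Extract text from all pages (a sketch, not part of the original examples). Since the function works on one page at a time, a loop can collect the text page by page. This assumes $pdf is still open with pages imported (before DynaPDF.Release is called) and that DynaPDF.GetPageCount, which is not described on this page, returns the number of pages in the working PDF.

```
# Sketch: collect the text of every page into one variable
# Assumption: DynaPDF.GetPageCount returns the page count of $pdf
Set Variable [ $count ; Value: MBS("DynaPDF.GetPageCount"; $pdf) ]
Set Variable [ $page ; Value: 1 ]
Set Variable [ $all ; Value: "" ]
Loop
    Exit Loop If [ $page > $count ]
    # Pages are numbered starting at 1
    Set Variable [ $all ; Value: $all & MBS("DynaPDF.ExtractText"; $pdf; $page; "SortTextXY") & ¶ ]
    Set Variable [ $page ; Value: $page + 1 ]
End Loop
Set Field [ Test::PageText ; $all ]
```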
See also
- DynaPDF.ExtractDocumentText
- DynaPDF.ExtractPageText
- DynaPDF.ImportPDFFile
- DynaPDF.New
- DynaPDF.OpenPDFFromContainer
- DynaPDF.Release
- DynaPDF.SetCMapDir
- DynaPDF.SetSpaceWidthFactor
- OCR.SetImage
- PDFKit.GetPDFPageText
Release notes
- Version 13.0
  - Deprecated DynaPDF.ExtractPageRectText and DynaPDF.ExtractPageText functions in favor of DynaPDF.ExtractText function.
- Version 8.3
  - Fixed bug in DynaPDF.ExtractText.
- Version 8.0
  - Added DynaPDF.RenderPDFFileEx, DynaPDF.FileAttachAnnotEx and DynaPDF.ExtractText.
  - Renamed existing DynaPDF.ExtractText to DynaPDF.ExtractDocumentText. The new function does the same, but is now part of DynaPDF itself and not a helper function from our plugin.
Blog Entries
- Top 10 from the MBS Plugin in 2022
- MBS FileMaker Plugin, version 12.6pr1
- Add page links for FileMaker
- New in MBS FileMaker Plugin 12.1
- Things you can do with DynaPDF
- MBS FileMaker Plugin, version 8.3pr3
- MBS FileMaker Plugin 8.0 - More than 5000 Functions In One Plugin
- MBS FileMaker Plugin, version 7.6pr5
- MBS FileMaker Plugin 6.0 for OS X/Windows
- MBS FileMaker Plugin, version 5.5pr1
This function checks for a license.
Created 21st December 2017, last changed 25th June 2020
