DynaPDF.ExtractText

Extracts the text of the page PageNum.

Component | Version | macOS | Windows | Server | FileMaker Cloud | FileMaker iOS SDK
DynaPDF | 8.0 | Yes | Yes | Yes | Yes | Yes

MBS( "DynaPDF.ExtractText"; PDF; PageNum { ; Flags; AreaLeft; AreaTop; AreaRight; AreaBottom } )

Parameters

Parameter | Description | Example value
PDF | The PDF reference returned from DynaPDF.New. | $pdf
PageNum | The page number. | 1
Flags | Optional. The flags for text extraction. Can include SortTextX, SortTextY, SortTextXY and/or DeleteOverlappingText. | "SortTextXY"
AreaLeft | Optional. The left coordinate of the area. |
AreaTop | Optional. The top coordinate of the area. |
AreaRight | Optional. The right coordinate of the area. |
AreaBottom | Optional. The bottom coordinate of the area. |

Result

Returns text or error.

Description

Extracts the text of the page PageNum.
The first page is denoted by 1.

Text lines can be sorted in x- and y-direction. The flag DeleteOverlappingText deletes identical text records that are placed at the same position (with a tolerance of 2 units). The records must occur one after the other to be detected.
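
A minimal sketch of how these flags might be passed, one flag per call. It assumes $pdf already holds a PDF that was opened and imported as in the Examples section; $sorted and $deduplicated are hypothetical variable names used only for illustration:

# Sort the text lines of page 1 in x- and y-direction
Set Variable [ $sorted ; Value: MBS("DynaPDF.ExtractText"; $pdf; 1; "SortTextXY") ]
# Drop identical text records placed at the same position
Set Variable [ $deduplicated ; Value: MBS("DynaPDF.ExtractText"; $pdf; 1; "DeleteOverlappingText") ]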

The optional area parameters (AreaLeft, AreaTop, AreaRight, AreaBottom) restrict the text extraction to that rectangle. The rectangle must be defined in the current coordinate system, that is, in either bottom-up or top-down coordinates; see SetPageCoords() for further information. Note also that the function considers the orientation of the page: the width and height of the rectangle must be exchanged if the orientation is 90, -90, 270, or -270 degrees.
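
The width and height exchange could be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's documented procedure: $orientation is a hypothetical variable holding the page orientation in degrees (obtained elsewhere), the sample numbers are made up, and top-down coordinates are assumed so that bottom = top + height. Depending on the page, the corner position may need adjusting as well.

# Hypothetical area: corner position plus width and height
Set Variable [ $left ; Value: 200 ]
Set Variable [ $top ; Value: 200 ]
Set Variable [ $width ; Value: 300 ]
Set Variable [ $height ; Value: 100 ]
# Exchange width and height for pages rotated by 90 or 270 degrees (either sign)
Set Variable [ $rotated ; Value: Abs($orientation) = 90 or Abs($orientation) = 270 ]
Set Variable [ $w ; Value: If($rotated; $height; $width) ]
Set Variable [ $h ; Value: If($rotated; $width; $height) ]
# Assumes top-down coordinates and $pdf prepared as in the Examples section
Set Field [ Test::PageText ; MBS("DynaPDF.ExtractText"; $pdf; 1; "SortTextXY"; $left; $top; $left + $w; $top + $h) ]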

If the function succeeds the return value is the text. If the function fails the return value is an error.
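
One way to tell the two outcomes apart is sketched below. It assumes the plugin's general MBS("IsError") function, which is not described on this page, reports whether the most recent MBS call failed; $text is a hypothetical variable name and $pdf is prepared as in the Examples section.

Set Variable [ $text ; Value: MBS("DynaPDF.ExtractText"; $pdf; 1; "SortTextXY") ]
# Assumption: MBS("IsError") returns 1 if the previous MBS call returned an error
If [ MBS("IsError") ]
    # On failure, $text holds the error message instead of the page text
    Show Custom Dialog [ "Text extraction failed" ; $text ]
Else
    Set Field [ Test::PageText ; $text ]
End If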

Special case: If this function is called with only two parameters, it redirects to the old function DynaPDF.ExtractDocumentText to keep compatibility with existing scripts. If the area parameters are omitted or are all zero, the area is not used.
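
The difference between the call forms might look like this. The sketch assumes "two parameters" means PDF and PageNum only; $legacy and $whole are hypothetical variable names, and $pdf is prepared as in the Examples section.

# Only PDF and PageNum: redirected to the older function for compatibility
Set Variable [ $legacy ; Value: MBS("DynaPDF.ExtractText"; $pdf; 1) ]
# Flags given, area all zero: handled by this function, no area restriction
Set Variable [ $whole ; Value: MBS("DynaPDF.ExtractText"; $pdf; 1; "SortTextXY"; 0; 0; 0; 0) ]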

Needs DynaPDF Lite license.

Examples

Extract some text:

# Create a new DynaPDF instance
Set Variable [ $pdf ; Value: MBS("DynaPDF.New") ]
# Open the PDF stored in the container field and import its pages
Set Variable [ $r ; Value: MBS("DynaPDF.OpenPDFFromContainer"; $pdf; Test::data) ]
Set Variable [ $r ; Value: MBS("DynaPDF.ImportPDFFile"; $pdf) ]
# Extract the text of page 1
Set Field [ Test::PageText ; MBS("DynaPDF.ExtractText"; $pdf; 1; 0) ]
# Cleanup
Set Variable [ $r ; Value: MBS("DynaPDF.Release"; $pdf) ]

Extract text in an area (assumes $pdf is set up as in the previous example):

Set Field [ Test::PageText ; MBS("DynaPDF.ExtractText"; $pdf; 1; "Default"; 200; 200; 400; 400) ]
