gmft package

Top Level Aliases

Currently, the top-level gmft module contains aliases for key classes and functions.

Importing from the top-level module previously resulted in long load times. As of v0.5, lazy loading greatly reduces this cost.

Now, classes may be imported from their original locations, from gmft.auto, or from here, where they are lazy loaded.

class gmft.TATRFormatConfig(*args, **kwargs)

Bases: object

This import is deprecated.

Please use:

- Reformat API (v0.5)
- gmft.formatters.tatr.TATRFormatConfig

classmethod get_mirrored_class()

class gmft.TATRFormattedTable(*args, **kwargs)

Bases: object

This import is deprecated.

Please use:

- Reformat API (v0.5)
- gmft.formatters.tatr.TATRFormattedTable

classmethod get_mirrored_class()

class gmft.TATRTableDetector(*args, **kwargs)

Bases: object

This import is deprecated.

Please use:

- gmft.AutoTableDetector
- gmft.detectors.tatr.TATRDetector

classmethod get_mirrored_class()

class gmft.TATRTableFormatter(*args, **kwargs)

Bases: object

This import is deprecated.

Please use:

- gmft.auto.AutoTableFormatter
- gmft.formatters.tatr.TATRFormatter

classmethod get_mirrored_class()

class gmft.TableDetector(*args, **kwargs)

Bases: object

This import is deprecated.

Please use:

- gmft.AutoTableDetector
- gmft.detectors.tatr.TATRDetector

classmethod get_mirrored_class()

class gmft.TableDetectorConfig(*args, **kwargs)

Bases: object

This import is deprecated.

Please use:

- Reformat API (v0.5)
- gmft.detectors.tatr.TATRDetectorConfig

classmethod get_mirrored_class()

PDF providers

In gmft, multiple documents and PDF providers are supported through a common interface. PyPDFium2 is the default PDF reader. PyMuPDF offers more accurate results but is distributed under the more restrictive AGPL license.

Detectors

In gmft, detectors locate the positions and bounding boxes (bbox) of tables on a page.

Formatters

In gmft, formatters take a located table (CroppedTable) and produce machine-readable output (e.g. a pandas DataFrame). This task is known in the literature as table structure recognition and functional analysis.
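The three pieces described above (PDF provider, detector, formatter) are usually chained together. The following is a minimal sketch of that flow, not a definitive recipe. It assumes that `gmft.pdf_bindings.PyPDFium2Document` is the import path of the default provider, that both auto classes expose an `extract` method, and that the formatted table exposes `df()`. None of these names are confirmed by this page, so check them against the installed version. The helper name `extract_dataframes` is hypothetical.

```python
def extract_dataframes(pdf_path):
    """Detect tables on every page of a PDF and return them as DataFrames."""
    # Imported inside the function so that defining it does not load gmft.
    # Assumption: these import paths match the installed gmft version.
    from gmft.auto import AutoTableDetector, AutoTableFormatter
    from gmft.pdf_bindings import PyPDFium2Document

    detector = AutoTableDetector()
    formatter = AutoTableFormatter()

    doc = PyPDFium2Document(pdf_path)  # PyPDFium2 is the default PDF reader
    try:
        dataframes = []
        for page in doc:
            # Assumption: extract(page) yields the CroppedTable objects on the page.
            for cropped in detector.extract(page):
                # Assumption: extract(table) returns a formatted table with df().
                dataframes.append(formatter.extract(cropped).df())
        return dataframes
    finally:
        # Release the underlying PDF handle even if detection fails.
        doc.close()
```

Everything is converted to DataFrames before the document is closed, so no table object needs to read from the PDF afterwards.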

Modules