gmft.detectors.tatr module
- class gmft.detectors.tatr.TATRDetector(config: TATRDetectorConfig = None, default_implementation=True)
Bases: BaseDetector[TATRDetectorConfig]
Uses TableTransformerForObjectDetection for small/medium tables, and a custom algorithm for large tables.
Using extract() produces a FormattedTable, which can be exported to csv, df, etc.
Detects tables in a PDF page. Default implementation uses TableTransformerForObjectDetection.
Initialize the TableDetector.
- Parameters:
config – TATRDetectorConfig
default_implementation – Should be True, unless you are writing a custom subclass for TableDetector.
- extract(page: BasePage, config_overrides: TATRDetectorConfig = None, rect: Rect = None) → list[gmft.detectors.base.CroppedTable]
Detect tables in a page.
- Parameters:
page – BasePage
config_overrides – Optional config overrides for this extraction
rect – Optional Rect to constrain detection within given dimensions
- Returns:
list of CroppedTable objects
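The extract() call described above can be sketched as a loop over pages. This is a minimal sketch, not the library's own recipe: detect_tables and run_on_pdf are hypothetical helper names, and the use of PyPDFium2Document from gmft.pdf_bindings as an iterable of pages with a close() method is an assumption not stated in this section. Only the extract(page, config_overrides=..., rect=...) signature comes from the reference above.

```python
def detect_tables(detector, pages, config_overrides=None, rect=None):
    """Run detector.extract() on every page and collect (page_index, table) pairs.

    `detector` is anything exposing the extract(page, config_overrides, rect)
    signature documented above, e.g. a TATRDetector instance.
    """
    found = []
    for index, page in enumerate(pages):
        # extract() returns a list of CroppedTable objects for one page.
        tables = detector.extract(page, config_overrides=config_overrides, rect=rect)
        for table in tables:
            found.append((index, table))
    return found


def run_on_pdf(pdf_path):
    """Hypothetical end-to-end helper; requires gmft and its model weights."""
    # Imports are local so detect_tables() stays usable without gmft installed.
    from gmft.detectors.tatr import TATRDetector
    # Assumption: PyPDFium2Document yields BasePage objects when iterated.
    from gmft.pdf_bindings import PyPDFium2Document

    doc = PyPDFium2Document(pdf_path)
    try:
        return detect_tables(TATRDetector(), doc)
    finally:
        doc.close()
```

Keeping the page index next to each table makes it easier to trace a detection back to its source page. Passing rect restricts detection to a region of every page, so it is mainly useful when all pages share a layout.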
- gmft.detectors.tatr.TATRTableDetector
alias of TATRDetector
- gmft.detectors.tatr.TableDetector
alias of TATRDetector