gmft.formatters.base module
A collection of common objects used by formatters.
Type Hierarchy:
- CroppedTable
  - RotatedCroppedTable
    - FormattedTable
- BaseFormatter
  - TATRFormatter (alias: TableFormatter)
BaseFormatter is the base class for all formatters.
- class gmft.formatters.base.BaseFormatter
Bases: ABC
Abstract class for converting a CroppedTable to a FormattedTable. Allows export to csv, df, etc.
- abstract extract(table: CroppedTable) → FormattedTable
Extract the data from the table. Produces a FormattedTable instance, from which data can be exported in csv, html, etc.
- format(table: CroppedTable, **kwargs) → FormattedTable
Alias for extract().
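The relationship between extract() and format() can be illustrated with a minimal, self-contained sketch. It does not import gmft: CroppedTable and FormattedTable below are simplified stand-ins, FirstRowHeaderFormatter is a hypothetical subclass, and forwarding **kwargs from format() to extract() is an assumption rather than documented behavior.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class CroppedTable:
    """Stand-in for gmft's CroppedTable; here it only holds raw cell text."""
    rows: list = field(default_factory=list)


@dataclass
class FormattedTable:
    """Stand-in for gmft's FormattedTable: a header plus data rows."""
    header: list
    data: list


class BaseFormatter(ABC):
    """Mirrors the documented interface: extract() is abstract, format() is an alias."""

    @abstractmethod
    def extract(self, table: CroppedTable) -> FormattedTable:
        ...

    def format(self, table: CroppedTable, **kwargs) -> FormattedTable:
        # Assumption: keyword arguments are passed through to extract() unchanged.
        return self.extract(table, **kwargs)


class FirstRowHeaderFormatter(BaseFormatter):
    """Hypothetical formatter that treats the first row as the header."""

    def extract(self, table: CroppedTable) -> FormattedTable:
        header, *data = table.rows
        return FormattedTable(header=header, data=data)


cropped = CroppedTable(rows=[["name", "qty"], ["apple", "3"], ["pear", "5"]])
formatted = FirstRowHeaderFormatter().format(cropped)
print(formatted.header)
print(formatted.data)
```

Because extract() is abstract, instantiating BaseFormatter directly raises TypeError; only subclasses that implement extract() can be constructed.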
- class gmft.formatters.base.FormattedTable(cropped_table: CroppedTable, df: pandas.DataFrame = None)
Bases: RotatedCroppedTable
This is a table that is “formatted”, which is to say it is functionalized with header and data information through structural analysis. Therefore, it can be converted into df, csv, etc.
Warning: This class is not meant to be instantiated directly. Use a TableFormatter to convert a CroppedTable to a FormattedTable.
Construct a CroppedTable object.
- Parameters:
page – BasePage
bbox – tuple of (xmin, ymin, xmax, ymax) or Rect object
confidence_score – confidence score of the table detection
label – label of the table detection: 0 means table, 1 means rotated table
- df(recalculate=False, config_overrides=None) → pandas.DataFrame
Return the table as a pandas DataFrame.
- Parameters:
recalculate – By default, a cached dataframe is returned. Note that it is preferred to explicitly call recompute().
- predictions: TablePredictions
- recompute(config=None) → pandas.DataFrame
Recompute the internal dataframe.
- abstract to_dict()
Serialize self into a dict.
- abstract visualize()
Visualize the table.
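The caching behavior described for df() and recompute() can be sketched as follows. This is not gmft's implementation: CachedTable is a hypothetical stand-in, a list of dicts replaces the pandas DataFrame so the example needs no third-party packages, the "strip" config key is invented for illustration, and treating a non-None config_overrides as a trigger for recomputation is an assumption.

```python
class CachedTable:
    """Hypothetical stand-in for FormattedTable; a list of dicts replaces the DataFrame."""

    def __init__(self, raw_rows):
        self._raw_rows = raw_rows
        self._df = None         # cached result, filled in lazily
        self.compute_count = 0  # how many times the expensive step has run

    def recompute(self, config=None):
        """Rebuild the cached result unconditionally and return it."""
        # "strip" is an invented option used only to show config having an effect.
        strip = bool(config and config.get("strip"))
        self._df = [
            {key: (value.strip() if strip else value) for key, value in row.items()}
            for row in self._raw_rows
        ]
        self.compute_count += 1
        return self._df

    def df(self, recalculate=False, config_overrides=None):
        """Return the cached result, rebuilding only when needed or requested."""
        # Assumption: passing overrides forces a rebuild so they are not silently ignored.
        if recalculate or self._df is None or config_overrides is not None:
            return self.recompute(config=config_overrides)
        return self._df


table = CachedTable([{"name": " apple ", "qty": "3"}])
first = table.df()                       # first call computes and caches
second = table.df()                      # second call reuses the cache
count_after_reads = table.compute_count
fresh = table.recompute(config={"strip": True})  # explicit, preferred way to rebuild
print(count_after_reads, table.compute_count)
print(fresh[0]["name"])
```

Calling recompute() explicitly makes the rebuild visible at the call site, which is the reason a plain df() call is kept cheap and side-effect free in this sketch.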
- class gmft.formatters.base.TableFormatter
Bases: BaseFormatter