gmft.detectors.base module
A collection of common objects used by detectors.
Type Hierarchy:
- CroppedTable
  - RotatedCroppedTable
  - FormattedTable
- BaseDetector
  - TATRDetector
  - Img2TableDetector
BaseDetector is the base class for all detectors.
Module containing methods of detecting tables from whole pdf pages.
- Example:
>>> from gmft.auto import AutoTableDetector
- class gmft.detectors.base.BaseDetector
Abstract base class for table detectors.
- detect(page: BasePage, config_overrides: ConfigT = None, **kwargs) list[gmft.detectors.base.CroppedTable]
Alias for extract().
- abstract extract(page: BasePage, config_overrides: ConfigT = None) list[gmft.detectors.base.CroppedTable]
Extract tables from a page.
- Parameters:
page – BasePage
config_overrides – override the config for this call only
- Returns:
list of CroppedTable objects
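The relationship between detect() and extract() can be sketched with stand-in classes. This is a minimal illustration and not gmft's source: SketchDetector and FixedDetector are hypothetical names, and the page and the returned tables are plain Python values rather than BasePage and CroppedTable objects.

```python
from abc import ABC, abstractmethod


class SketchDetector(ABC):
    """Hypothetical stand-in for BaseDetector; not gmft's implementation."""

    @abstractmethod
    def extract(self, page, config_overrides=None):
        """Return a list of detected tables for one page."""

    def detect(self, page, config_overrides=None, **kwargs):
        # detect() is documented as an alias, so it forwards to extract().
        # Extra keyword arguments are ignored in this sketch.
        return self.extract(page, config_overrides)


class FixedDetector(SketchDetector):
    """Hypothetical subclass that reports one fixed bounding box per page."""

    def extract(self, page, config_overrides=None):
        return [(0, 0, 100, 50)]


detector = FixedDetector()
tables = detector.detect("page-1")
```

A concrete detector only needs to implement extract(); callers can use either name.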
- class gmft.detectors.base.CroppedTable(page: BasePage, bbox: tuple[int, int, int, int] | Rect, confidence_score: float = 1.0, label=0, *, angle: Literal[0, 90, 180, 270] = 0)
Bases: object
A pdf selection, cropped to include just a table. Created by BaseDetector.
Construct a CroppedTable object.
- Parameters:
page – BasePage
bbox – tuple of (xmin, ymin, xmax, ymax) or Rect object
confidence_score – confidence score of the table detection
label – label of the table detection. 0 means table; 1 means rotated table.
- property bbox
- captions(margin=None, line_spacing=2.5, **kwargs) tuple[str, str]
Look for a caption in the table.
Since this method is somewhat slow, the result is cached if captions() is called with default arguments.
- Parameters:
margin – margin around the table to search for captions. Positive margin = expands the table.
line_spacing – minimum line spacing to consider two lines as separate.
- Returns:
tuple[str, str]: [caption_above, caption_below]
- static from_dict(d: dict, page: BasePage) CroppedTable | RotatedCroppedTable
Deserialize a CroppedTable object from dict.
Because file locations may change, the user must provide the original page. For helper methods, see PyPDFium2Utils.load_page_from_dict and PyPDFium2Utils.reload.
These are required entries of the dict:
- filename (str)
- page_no (int)
- bbox (list of x0, y0, x1, y1)
These entries were formerly required:
- confidence_score (float)
- label (int)
These entries are optional:
- angle (one of 0, 90, 180, 270)
- Parameters:
d – dict
page – BasePage
- Returns:
CroppedTable object
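The dict layout described above can be illustrated with a small check. This is a sketch under stated assumptions: missing_required_keys is a hypothetical helper, not part of gmft, and the file name and coordinates are placeholder values.

```python
# Keys that the documentation lists as required for from_dict().
REQUIRED_KEYS = ("filename", "page_no", "bbox")


def missing_required_keys(d):
    """Return the required keys absent from d (hypothetical helper, not gmft API)."""
    return [key for key in REQUIRED_KEYS if key not in d]


table_dict = {
    "filename": "example.pdf",   # placeholder file name
    "page_no": 0,
    "bbox": [10, 20, 300, 400],  # x0, y0, x1, y1
    "angle": 90,                 # optional; one of 0, 90, 180, 270
}

missing = missing_required_keys(table_dict)
```

Running a check like this before deserializing makes it easier to report which entry is absent from a stored dict.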
- static from_image_only(img: Image) CroppedTable
Create a CroppedTable object from an image only.
- Parameters:
img – PIL image
- Returns:
CroppedTable object
- property height
- image(dpi: int = None, padding: tuple[int, int, int, int] | Literal['auto', None] = None, margin: tuple[int, int, int, int] | Literal['auto', None] = None) Image
Return the image of the cropped table.
Following pypdfium2, scaling_factor = (dpi / 72). Therefore, dpi=72 is the default, and dpi=144 is x2 zoom.
- Parameters:
dpi – dots per inch. If not None, the scaling_factor parameter is ignored.
padding – padding (blank pixels) to add to the image, as a tuple of (left, top, right, bottom). Padding is added after the crop and rotation. It is important for subsequent row/column detection; see https://github.com/microsoft/table-transformer/issues/68 for discussion. If padding = 'auto', the padding is automatically set to 10% of the larger of {width, height}. Default is no padding.
margin – add content (in pdf units) from the original pdf beyond the detected table bbox boundary.
- Returns:
image of the cropped table
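The dpi and 'auto' padding rules above reduce to simple arithmetic, sketched here. The function names scaling_factor and auto_padding are hypothetical and are not gmft functions. The sketch also assumes that the same padding is applied on all four sides and that no rounding takes place, neither of which is stated in the documentation.

```python
def scaling_factor(dpi):
    """Zoom relative to the 72 dpi baseline (scaling_factor = dpi / 72)."""
    return dpi / 72


def auto_padding(width, height):
    """10% of the larger dimension, as (left, top, right, bottom).

    Assumption: equal padding on every side, with no rounding.
    """
    pad = max(width, height) / 10
    return (pad, pad, pad, pad)


zoom_default = scaling_factor(72)   # 1.0, no zoom
zoom_double = scaling_factor(144)   # 2.0, x2 zoom
padding = auto_padding(200, 100)    # (20.0, 20.0, 20.0, 20.0)
```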
- predicted_word_height(smallest_supported_text_height=0.1)
Get the predicted height of standard text in the table. If there are no words, np.nan is returned.
- text()
Return the text of the cropped table.
Any words that intersect the table are captured, even if they are not fully contained.
- Returns:
text of the cropped table
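The "intersects" rule can be illustrated with an axis-aligned rectangle overlap test. This is a sketch only: intersects is a hypothetical helper, and whether gmft counts boxes that merely touch at an edge is an assumption here (this version does not).

```python
def intersects(word_box, table_box):
    """True if two (x0, y0, x1, y1) boxes overlap with nonzero area."""
    wx0, wy0, wx1, wy1 = word_box
    tx0, ty0, tx1, ty1 = table_box
    return wx0 < tx1 and wx1 > tx0 and wy0 < ty1 and wy1 > ty0


table_box = (50, 100, 250, 200)
straddling_word = (240, 150, 260, 160)  # crosses the right edge of the table
outside_word = (300, 150, 320, 160)     # entirely to the right of the table

captured = intersects(straddling_word, table_box)
skipped = intersects(outside_word, table_box)
```

Under this rule, the word that crosses the table's right edge is still captured, matching the behaviour described above.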
- text_positions(remove_table_offset: bool = False, outside: bool = False) Generator[tuple[int, int, int, int, str], None, None]
Return the text positions of the cropped table.
Any words that intersect the table are captured, even if they are not fully contained.
- Parameters:
remove_table_offset – if True, the coordinates are transformed (rotated and translated) so that the top-left corner of the table is (0, 0) and the bottom-right corner is (width, height). If False, transforms (including rotation) are ignored and original coordinates are returned.
outside – if True, returns the complement of the table: all the text positions outside the table. (default: False)
- Returns:
generator of text positions, each of which is a tuple (x0, y0, x1, y1, "string")
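For an unrotated table, the effect of remove_table_offset=True can be sketched as a plain translation. The helper remove_offset is hypothetical and is not gmft's implementation; it covers only the angle = 0 case and leaves out the rotation step that the real method applies to rotated tables.

```python
def remove_offset(word, table_bbox):
    """Shift a word tuple so the table's top-left corner becomes (0, 0).

    Assumption: the table is unrotated, so only a translation is needed.
    """
    x0, y0, x1, y1, text = word
    tx, ty = table_bbox[0], table_bbox[1]
    return (x0 - tx, y0 - ty, x1 - tx, y1 - ty, text)


table_bbox = (50, 100, 250, 200)     # xmin, ymin, xmax, ymax in page coordinates
word = (60, 110, 90, 122, "Total")   # placeholder word position
shifted = remove_offset(word, table_bbox)
```

Here the word's coordinates become (10, 10, 40, 22), measured from the table's top-left corner.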
- to_dict()
- visualize(show_text=False, **kwargs)
Visualize the cropped table.
- property width
- class gmft.detectors.base.RotatedCroppedTable(page: BasePage, bbox: tuple[int, int, int, int], confidence_score: float, angle: float, label=0)
Bases: CroppedTable
Table that has been rotated.
Note:
self.bbox and self.rect are in coordinates of the original pdf, but text_positions() may give transformed coordinates.
Currently, only 0, 90, 180, and 270 degree rotations are supported. An angle of 90 would mean that a 90 degree cc rotation has been applied to a level image.
In practice, most rotated tables are rotated by 90 degrees.
Note: after v0.5, this class is nearly identical to CroppedTable. angle is now directly available in CroppedTable.
Construct a CroppedTable object.
- Parameters:
page – BasePage
bbox – tuple of (xmin, ymin, xmax, ymax) or Rect object
confidence_score – confidence score of the table detection
label – label of the table detection. 0 means table; 1 means rotated table.
- static from_dict(d: dict, page: BasePage) CroppedTable | RotatedCroppedTable
Create a RotatedCroppedTable object from dict.