Welcome to gmft’s documentation!

gmft is a lightweight, performant, configurable, deep-learning-based library for converting PDF tables to many formats, including cropped images, text + positions, plaintext, CSV, and pandas dataframes.

For a sense of gmft’s output quality, see the eval notebook (colab) (github), which shows the output of gmft on a variety of PDFs.

To see how gmft stacks up against the many alternatives, see this comparison, which may help you decide which library is best for your use case.

Check out the Usage section, including Installation instructions.

Check out the Config Guide section for a description of gmft’s settings.

Note

This project is under active development.

Table of Contents