benkpress

A graphical user interface for creating and evaluating classifiers that target PDF files, pages, or sentences, using sklearn-compatible pipelines.

The pitch

Imagine having to classify PDFs as either this or that. The key information needed for the classification may be on a specific page or in a specific sentence. Now imagine wanting to train a supervised machine learning model to perform this classification automatically. How would you tag the training data by hand? Would you convert every PDF to plain text and read it that way, with potential optical character recognition (OCR) errors?

Usage at a glance

1. Use benkpress-plugin-api to create your custom sklearn-compatible pipeline, as well as any preprocessing routines you need.
2. Install your pipeline and preprocessor as plugins using the setuptools entry points specified by benkpress-plugin-api.
3. Load your preprocessor and pipeline in benkpress and use the GUI to tag training data.
4. Save your dataset for later use.
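
As a rough illustration of the first step, the sketch below builds an sklearn-compatible pipeline for sentence classification. The choice of TfidfVectorizer and LogisticRegression, the function name build_pipeline, and the sample sentences are all assumptions made for illustration. How such a pipeline is registered through benkpress-plugin-api is not described in this README, so that part is not shown.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_pipeline() -> Pipeline:
    """Return an unfitted sklearn pipeline that classifies sentences.

    Hypothetical helper: the name and the chosen steps are illustrative,
    not something benkpress or benkpress-plugin-api prescribes.
    """
    return Pipeline(
        steps=[
            # Turn raw sentences into TF-IDF weighted uni- and bigram counts.
            ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
            # A simple linear classifier on top of the sparse features.
            ("clf", LogisticRegression(max_iter=1000)),
        ]
    )


# Made-up, hand-tagged sentences (1 = relevant, 0 = not relevant).
sentences = [
    "The total amount due is 4200 SEK.",
    "Payment must be received within 30 days.",
    "Please pay the invoice amount to the account below.",
    "Thank you for visiting our office last week.",
    "The weather was pleasant during the conference.",
    "Our team enjoyed the lunch very much.",
]
labels = [1, 1, 1, 0, 0, 0]

pipeline = build_pipeline()
pipeline.fit(sentences, labels)

# Predict the tag of a sentence the model has not seen before.
predictions = pipeline.predict(["The amount due must be paid within 30 days."])
print(predictions)
```

Any object that follows the same fit/predict convention could stand in for the two steps used here; the pipeline above is only one plausible starting point.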

The benkpress plugin API

Not yet described.

Examples

Not yet described.

benkpress?

"Benkpress" is Norwegian for "bench press"¹. The name is relevant in two ways. First, benkpress is a tool that simplifies tagging PDF-sourced training data for text classifiers. Second, benkpress can continuously benchmark the classifier that the tagged data is meant to train. You know when to stop tagging once the classifier has good enough predictive power on previously unseen samples.

¹ But really, it's a way to write the Swedish word bänkpress as a valid Python package name.
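
The benchmarking idea in this section, to stop tagging once the classifier does well enough on unseen samples, can be sketched in plain Python. This is not how benkpress implements it; the function names accuracy and good_enough, the patience parameter, and the score history are hypothetical and exist only to illustrate one possible stopping rule.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of held-out samples whose predicted tag matches the true tag."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def good_enough(history: Sequence[float], target: float, patience: int = 3) -> bool:
    """Return True once the last `patience` held-out scores all reach `target`.

    Requiring several consecutive good scores guards against stopping
    on a single lucky evaluation round.
    """
    if len(history) < patience:
        return False
    return all(score >= target for score in history[-patience:])


# Made-up held-out accuracies, one per tagging round (illustrative only).
history = [0.61, 0.74, 0.83, 0.91, 0.92, 0.93]

print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))   # 0.75
print(good_enough(history[:5], target=0.9))   # False
print(good_enough(history, target=0.9))       # True
```

In practice the scores would come from evaluating the current pipeline on samples it was not trained on after each round of tagging.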
