This Python library parses an OpenITI mARkdown document and returns a Python class representation of the document structures.
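import oimdp

# Read the mARkdown file; the with-block closes it even if reading fails
with open("mARkdownfile", "r", encoding="utf-8") as md_file:
    text = md_file.read()

# Parse the text into a Python representation of the document
parsed = oimdp.parse(text)

Please see the docs, but here are some highlights: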
content: a list of content structures
get_clean_text(): get the text stripped of markup
Content classes contain an original value from the document and some extracted content such as a text string or a specific value.
Most other structures are listed in sequence (e.g. a Paragraph is followed by a Line).
Line objects and other line-level structures are divided into PhrasePart objects.
PhrasePart objects represent phrase-level tags.
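As a minimal sketch of how the `content` list might be inspected, the helper below tallies the structures in a document by class name. It relies only on the `content` attribute described above. `count_structures`, `FakeDocument`, and the empty `Paragraph` and `Line` stand-ins are hypothetical names used so the example runs without the library or a mARkdown file; they are not the library's own classes.

```python
from collections import Counter


def count_structures(document):
    """Count the items in document.content by their class name."""
    return Counter(type(item).__name__ for item in document.content)


# Hypothetical stand-ins (assumption): empty classes that only mimic
# the names of two structures mentioned in this README.
class Paragraph:
    pass


class Line:
    pass


class FakeDocument:
    """Hypothetical stand-in exposing only a `content` list."""

    def __init__(self, content):
        self.content = content


demo = FakeDocument([Paragraph(), Line(), Line()])
counts = count_structures(demo)
print(counts["Paragraph"], counts["Line"])
```

With a real file, the object returned by `oimdp.parse(text)` would presumably be passed to `count_structures` in place of `demo`.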
Set up a virtual environment with venv:
python3 -m venv .env
Activate the virtual environment:
source .env/bin/activate
Install:
python setup.py install
With the environment activated:
python tests/test.py