catalog_tools

Work with PG catalog files

Project Gutenberg makes several catalog files available on its website. They are free for all to use.

All Project Gutenberg metadata are available digitally in the MARC, MARCXML, and XML/RDF formats. The MARC record extract is generated weekly from the Project Gutenberg Postgres database by a Python script that recreates the MARC record for every title in the collection, excluding non-textual titles such as maps, audio files, and data sets. Further information on the MARC extract is available at https://github.com/gutenbergtools/ebookconverter/blob/v10/MARC.md. RDF files are updated daily. You may find code in the "Gitenberg-dev" repo useful for reading these.
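As a rough illustration of reading the XML/RDF format, the sketch below pulls the ebook identifier and title out of an RDF document using only the Python standard library. It is not taken from this repo or from the "Gitenberg-dev" code. The function name `read_titles` is hypothetical, and the namespace URIs and element names (`pgterms:ebook`, `dcterms:title`) are assumptions about how the RDF files are laid out; check them against a real file before relying on them.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces for the Project Gutenberg RDF files (verify against a real file).
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "pgterms": "http://www.gutenberg.org/2009/pgterms/",
}


def read_titles(rdf_xml):
    """Return a list of (ebook_id, title) pairs found in an RDF/XML string.

    Hypothetical helper: assumes each book is a pgterms:ebook element
    with an rdf:about attribute and a dcterms:title child.
    """
    root = ET.fromstring(rdf_xml)
    about_attr = "{%s}about" % NS["rdf"]
    ebook_tag = "{%s}ebook" % NS["pgterms"]
    results = []
    for ebook in root.iter(ebook_tag):
        # findtext returns the default when the title element is missing
        title = ebook.findtext("dcterms:title", default="", namespaces=NS)
        results.append((ebook.get(about_attr), title))
    return results
```

To use it on a file, read the file's contents into a string and pass that string to `read_titles`.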

A zipped CSV file is provided that is compatible with Microsoft Excel. Note that titles with subtitles may span multiple lines in the CSV file. The process_csv.py script in this repo can be used to turn these into single-line titles. You can modify the script if you want to put subtitles into a separate column.
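The idea of collapsing multi-line titles can be sketched as follows. This is not the contents of process_csv.py, only a minimal illustration of the technique. The function name `collapse_titles`, the column name `Title`, and the `": "` separator are all assumptions made for the example.

```python
import csv


def collapse_titles(src, dst, title_column="Title", separator=": "):
    """Copy CSV rows from src to dst, joining multi-line titles onto one line.

    Hypothetical helper: src and dst are open text files (or file-like
    objects); title_column is an assumed header name.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)
    title_index = header.index(title_column)
    writer.writerow(header)
    for row in reader:
        if title_index < len(row):
            # A quoted CSV field may contain newlines; split on them,
            # drop empty pieces, and rejoin with the chosen separator.
            parts = [part.strip() for part in row[title_index].splitlines()]
            row[title_index] = separator.join(part for part in parts if part)
        writer.writerow(row)
```

When calling this on real files, open both with `newline=""`, as the `csv` module documentation recommends, so that embedded line breaks are handled correctly. To put subtitles into a separate column instead, the split `parts` could be written to two cells rather than rejoined.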
