Releases: WormBase/ACKnowledge
v5.0
Introduction
ACKnowledge 5.0 is a landmark update that brings advanced functionality and significant improvements to our system.
This release aligns our data access infrastructure with the Alliance of Genome Resources (Alliance) central repository.
By leveraging the Alliance, we can now seamlessly fetch PDFs and related bibliographic data, ensuring our users have access to
the most comprehensive and up-to-date resources.
ACKnowledge 5.0 also introduces an enhanced backend architecture capable of extracting entities for multiple organisms.
This new modularity allows us to fetch entity lists directly from the Alliance, broadening our scope and making our
platform more versatile for diverse research needs.
The release also brings new features, performance improvements, and crucial bug fixes, all designed to provide a more
robust and efficient user experience.
Main Updates
- Multiple authors can now submit data for the same paper. Each author is recognized as a contributor on the WormBase paper page
- The author curation interface now also works for authors behind strict firewalls and proxies
- PDFs and related bibliographic data are now accessed from the Alliance of Genome Resources (Alliance) central repository
- PDF-to-text conversion is now performed through GROBID, a machine learning library that converts PDFs into structured TEI XML, with a focus on scientific articles
- Curators can now identify sentences of interest for curation (gene expression and protein kinase activity) in articles using machine learning models integrated into the curator dashboard
- Reminder emails are now sent every two weeks to increase the response rate
- ACKnowledge can now extract data for multiple organisms
- ACKnowledge is now integrated into the Caltech AWS infrastructure, with improved security and performance
- ACKnowledge is now listed on the International Society for Biocuration (ISB) curate page
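As a rough illustration of what working with GROBID's TEI output can look like, the sketch below extracts paragraph text from a TEI document using only Python's standard library. The sample XML and the `tei_paragraphs` helper are hypothetical and are not taken from the ACKnowledge codebase; only the TEI namespace URI is standard.

```python
import xml.etree.ElementTree as ET

# Standard TEI namespace; GROBID output uses it for all elements.
TEI_URI = "http://www.tei-c.org/ns/1.0"
TEI_NS = {"tei": TEI_URI}

# Hypothetical, hand-written sample standing in for a GROBID result.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc><titleStmt><title>Example paper</title></titleStmt></fileDesc>
  </teiHeader>
  <text>
    <body>
      <div>
        <head>Results</head>
        <p>daf-16 is expressed in the <hi>intestine</hi>.</p>
        <p>unc-22 mutants twitch.</p>
      </div>
    </body>
  </text>
</TEI>"""


def tei_paragraphs(tei_xml: str) -> list[str]:
    """Return the plain text of every <p> inside the TEI <body>."""
    root = ET.fromstring(tei_xml)
    body = root.find(".//tei:body", TEI_NS)
    if body is None:
        return []
    paragraphs = []
    for p in body.iter(f"{{{TEI_URI}}}p"):
        # itertext() also collects text nested in inline markup such as <hi>;
        # split/join normalizes whitespace.
        text = " ".join("".join(p.itertext()).split())
        if text:
            paragraphs.append(text)
    return paragraphs


if __name__ == "__main__":
    for paragraph in tei_paragraphs(SAMPLE_TEI):
        print(paragraph)
```

Working from the TEI structure rather than raw text keeps section headings, paragraphs, and inline markup separable, which is convenient for sentence-level processing.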
User Updates
- Authors can now indicate whether they have contributed data for the paper through other community curation initiatives
- The email exclusion list now takes WormBase user preferences into account
- Removed images from all emails to improve accessibility and compatibility with email clients
- Added FAQ section for the 'new species' field
Bug Fixes
- Fixed links to PubMed to use the new URL format
- The author portal login is now case-insensitive
- Fixed other UI issues with the author curation form
Curator Dashboard Updates
- Added a new page to view sentences of interest identified by machine learning
- Added a new button to download sentences of interest in CSV format
Release 4.0
- Changed project name to ACKnowledge and added new design
- Improved submission form with additional fields and better descriptions
Version 3.0
This version introduces several improvements, including:
- Improved accuracy of entity extraction
- Simplified data entry and updated instructions for data submission
- Created a link to the 'WormBase AFP Webinar' video
- Updated the FAQs
- Completely switched to wbtools for text mining and DB-related functions
Neural Network document classification
We replaced the SVM document classifiers with the new neural network (NN) classifiers developed at WormBase. The new classifiers are more accurate than the old SVMs. This change is transparent to authors and affects only the backend and the APIs that the feedback form uses to read information about a specific paper from Caltech's servers (tazendra and mangolassi).
Version 2.0
We improved the AFP system based on feedback received from authors and statistics
collected during one year of AFP v1.0 author submissions. We improved the definition of
data types in the form and we refined our text mining algorithms to obtain better results.
Main Features:
- Implemented TF-IDF in place of simple thresholds to improve precision of gene and allele recognition
- Improved datatype descriptions in "?" mouse-overs
- Added FAQs and Release Notes
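To illustrate why TF-IDF can be more precise than a raw count threshold, here is a minimal sketch. The toy corpus, the `tfidf` and `select_entities` functions, and the cutoff of 0.1 are made up for illustration; the actual weighting scheme and thresholds used by AFP are not described in these notes.

```python
import math
from collections import Counter


def tfidf(term, doc_tokens, corpus):
    """Plain TF-IDF: term frequency in the doc times log(N / document frequency)."""
    if not doc_tokens:
        return 0.0
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    df = sum(1 for doc in corpus if term in doc)
    if df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)


def select_entities(doc_tokens, corpus, candidates, cutoff=0.1):
    """Keep candidates whose TF-IDF score in this document exceeds the cutoff."""
    return sorted(c for c in candidates if tfidf(c, doc_tokens, corpus) > cutoff)


# Hypothetical tokenized papers: unc-119 appears in every paper (for example
# as a transgene marker), while daf-16 is specific to the first one.
CORPUS = [
    ["daf-16", "daf-16", "unc-119", "lifespan"],
    ["unc-119", "rescue", "marker"],
    ["unc-119", "transgene", "marker"],
    ["unc-119", "lin-12", "vulva"],
]

if __name__ == "__main__":
    print(select_entities(CORPUS[0], CORPUS, ["daf-16", "unc-119"]))
```

A count threshold of "mentioned at least once" would report both genes for the first paper. TF-IDF scores `unc-119` at zero because it occurs in every document, so only `daf-16` passes the cutoff.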
WormBase AFP system version 1.0
This is the first release of the new Author First Pass system. It includes:
- backend software to process papers from WormBase, extract biological entities, and email authors asking them to validate the extracted data
- automated reminder emails for authors and alerts for curators on new author submissions for their datatypes
- feedback form for authors through which they can send feedback on extracted data and send final submission to WB
- dashboard for curators to monitor AFP submissions and to 'diff' extracted vs. submitted data
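The curator 'diff' in the last item can be pictured as a set comparison between what the pipeline extracted and what the author submitted. The sketch below shows only that general idea and is not the dashboard's actual code; `diff_entities`, its output keys, and the sample gene names are hypothetical.

```python
def diff_entities(extracted, submitted):
    """Compare extracted entities with the author's final submission."""
    extracted, submitted = set(extracted), set(submitted)
    return {
        # Entities the author added that text mining missed.
        "added_by_author": sorted(submitted - extracted),
        # Entities the pipeline found but the author removed.
        "removed_by_author": sorted(extracted - submitted),
        # Entities both agree on.
        "confirmed": sorted(extracted & submitted),
    }


if __name__ == "__main__":
    result = diff_entities(
        extracted=["daf-16", "unc-119", "lin-12"],
        submitted=["daf-16", "lin-12", "age-1"],
    )
    for label, genes in result.items():
        print(label, genes)
```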