Investigation of scientific fraud across research topics

This project investigates, at large scale, patterns of potential scientific fraud at the level of individual research topics.

Our starting point is the retrieval of bibliometric information on specific research topics from PubMed. Each topic is associated with a data folder in the repo.
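As an illustration of the retrieval step, the sketch below builds a query against the public NCBI E-utilities ESearch endpoint and extracts the PMIDs from its JSON response. This is not taken from the project notebooks: the function names and the example search term are assumptions, and the actual queries used for each topic may differ.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=100, retstart=0):
    """Build an ESearch URL that asks PubMed for PMIDs matching `term`."""
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": retmax,
        "retstart": retstart,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_pmids(payload):
    """Extract the list of PMIDs from a decoded ESearch JSON response."""
    return payload["esearchresult"]["idlist"]


def search_pubmed(term, retmax=100):
    """Run the query over the network (hypothetical helper, not called on import)."""
    with urlopen(build_esearch_url(term, retmax=retmax)) as response:
        return parse_pmids(json.load(response))
```

Calling search_pubmed("circular RNA") would return a list of PMID strings; the full records (title, abstract, journal, year) would then have to be fetched in a second step.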

For each of those research topics, we downloaded bibliometric information for the publications listed in PubMed. Of particular interest are the title, abstract, journal, and year of publication. This analysis is conducted using the notebook titled process_data.ipynb and the functions for manual correction developed for each topic. One of the steps in the analysis aims to determine whether the article:

is a review,
is a comment,
was commented on by another article,
had an erratum published,
has been retracted.
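The five checks listed above can be sketched as a single function. This is a minimal sketch, not the code in process_data.ipynb: it assumes each record exposes its PubMed publication types (for example "Review", "Comment", "Retracted Publication") and the RefType values of its comments/corrections entries (for example "CommentIn", "ErratumIn", "RetractionIn"). The function name and the output keys are hypothetical.

```python
def classify_article(publication_types, correction_ref_types):
    """Return boolean flags for one article.

    publication_types: PubMed publication type strings for the record.
    correction_ref_types: RefType values from the record's
        comments/corrections entries.
    """
    types = {t.lower() for t in publication_types}
    refs = {r.lower() for r in correction_ref_types}
    return {
        # Exact match only; variants such as "Systematic Review" are not counted here.
        "is_review": "review" in types,
        "is_comment": "comment" in types,
        # Another article commented on this one.
        "has_comment": "commentin" in refs,
        # An erratum was published for this article.
        "has_erratum": "erratumin" in refs,
        # Either the record is typed as retracted or it links to a retraction notice.
        "is_retracted": "retracted publication" in types or "retractionin" in refs,
    }
```

Records whose metadata is incomplete would still need the per-topic manual correction functions mentioned above.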

The pre-processing step produces a file titled articles_clean.json that is available for further analysis.

The notebook titled compile_publisher_info.ipynb uses a search of the NLM Catalog and/or a manual determination to assign a publisher to each journal.
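One way to combine the two sources is sketched below. It is an assumption, not the notebook's logic: manual determinations are given precedence over the catalog search, the catalog lookup is passed in as a callable so that no particular NLM Catalog query is implied, and all names are hypothetical.

```python
def assign_publisher(journal, manual_overrides, catalog_lookup):
    """Return (publisher, source) for a journal name.

    manual_overrides: dict mapping lower-cased journal names to publishers.
    catalog_lookup: callable taking a journal name and returning a
        publisher string, or None when the search finds nothing.
    """
    key = journal.strip().lower()
    # Assumption: a manual determination wins over the catalog result.
    if key in manual_overrides:
        return manual_overrides[key], "manual"
    publisher = catalog_lookup(journal)
    if publisher:
        return publisher.strip(), "catalog"
    # Left unresolved so it can be reviewed by hand later.
    return None, "unresolved"
```

Returning the source alongside the publisher makes it easy to list the journals that still need manual review.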

Finally, the notebook publication_patterns_by_topic.ipynb conducts some simple analyses of the relative occurrence of certain characteristics (retracted, review, etc.) for all articles retrieved in a topic, for all articles published in a given journal, and for all articles published by a given publisher.
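The relative occurrence of a characteristic within a group can be computed as the fraction of flagged articles in that group. The sketch below assumes each article is a dict with a boolean flag and a grouping field; the field names ("journal", "is_retracted") are hypothetical and may not match the schema of articles_clean.json.

```python
from collections import defaultdict


def flag_rates(articles, group_key, flag):
    """Fraction of articles in each group for which `flag` is truthy."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for article in articles:
        group = article.get(group_key)
        if group is None:
            # Skip records that lack the grouping field.
            continue
        totals[group] += 1
        if article.get(flag):
            flagged[group] += 1
    return {group: flagged[group] / totals[group] for group in totals}
```

The same function covers the per-journal and per-publisher views by changing group_key, and a per-year view by grouping on the publication year.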

The results of the analysis are saved in a file named time_series.png (for all publications) or in folders named Journals and Publishers within each of the topic folders.

The results for some of those topics are summarized in files named Report_{topic name}.ipynb.

Folders and files

Case_brain_cancer_stem_cells
Case_brca2
Case_chest_imaging_pneumonia
Case_circular_rna
Case_crispr_cas9
Case_deep_learning_tumor
Case_graphene_sensors
Case_green_synthesis_np
Case_green_synthesis_silver_np
Case_lncrna
Case_mirna_cancer
Case_mirna_development
Case_prions
Case_rnai_cancer
Case_skin_wound_healing
Case_statins_cancer
Manual_verification_articles
Project_libraries
.gitignore
LICENSE
README.md
compile_publisher_info.ipynb
fig-retractions_errata-publishers.pdf
fig-retractions_errata.pdf
integrate_retraction_watch_db.ipynb
process_data.ipynb
publication_patterns_by_topic.ipynb

amarallab/Science_fraud_topic_analysis
