Smlewis/update commons master oct20 #8

smlewis · 2020-10-20T18:41:51Z

Updates master with a few years of changes from Commons. I'm not sure why it won't merge with --ff-only, but that is unlikely to be a concern; it is probably the result of a past repository-hygiene mistake.

These tools are for generating the SQL databases required to run MHCEpitopePredictorExternal predictors with the mhc_energy scoreterm.

Added the capability to look directly for the Rosetta database, assuming that this is being run from Rosetta/tools/mhc_energy_tools and there also exists Rosetta/main/database. Also changed os.environ to os.getenv, which returns None instead of crashing if $ROSETTA hasn't been set, and tweaked the error handling. The logic of how to find the file could maybe be cleaner, but this works for now.
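The lookup described above could be sketched roughly as follows. This is an illustration, not the code in the commit: the function name `find_rosetta_database` is hypothetical, and it assumes that `$ROSETTA`, when set, points at the directory containing `main/database`.

```python
import os
from pathlib import Path
from typing import Optional


def find_rosetta_database(script_path: str) -> Optional[Path]:
    """Return a plausible Rosetta database directory, or None if none is found."""
    candidates = []

    # os.getenv returns None when $ROSETTA is unset, whereas
    # os.environ["ROSETTA"] would raise KeyError.
    rosetta_root = os.getenv("ROSETTA")
    if rosetta_root:
        # Assumption: $ROSETTA is the top-level Rosetta directory.
        candidates.append(Path(rosetta_root) / "main" / "database")

    # Fallback: assume the script lives in Rosetta/tools/mhc_energy_tools,
    # so the Rosetta root is two levels above the script's directory.
    script_dir = Path(script_path).resolve().parent
    candidates.append(script_dir.parent.parent / "main" / "database")

    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None
```

A caller would typically pass `__file__` and report a clear error if the result is `None`.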

The script would search for all alleles, but only the alleles that met the threshold were stored in details. This also disrupted the summary table. Now, all peptide/allele combos have their details stored in details, but we count if the score meets the threshold using meet-thresh.
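A minimal sketch of the bookkeeping described above, under stated assumptions: the function name, the `meets_thresh` key, and the convention that a score meets the threshold when it is less than or equal to it are all hypothetical and may not match the script.

```python
def collect_details(scores, threshold):
    """Store details for every peptide/allele pair and count threshold hits.

    scores maps (peptide, allele) -> numeric score.
    Assumption: a score "meets" the threshold when score <= threshold.
    """
    details = {}
    hits_per_peptide = {}
    for (peptide, allele), score in scores.items():
        meets = score <= threshold
        # Every combination is recorded, not only the ones that pass.
        details[(peptide, allele)] = {"score": score, "meets_thresh": meets}
        # The summary counts hits separately from what is stored.
        hits_per_peptide[peptide] = hits_per_peptide.get(peptide, 0) + int(meets)
    return details, hits_per_peptide
```

Keeping the pass/fail flag next to each stored score lets the summary table be rebuilt from `details` alone.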

Error checking is now a bit more stringent when we keep track of the positions in this way. Also changed the absolute scoring from the raw affinity to the log-transformed format 1-log50k(aff), where log50k is the base-50,000 logarithm. Also added a .gitignore that ignores the output files produced by running the demo script.
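For reference, the 1-log50k(aff) transform can be written as below. The function name is hypothetical, and treating the affinity as a positive value in nM is an assumption rather than something stated in the commit.

```python
import math


def log_transform_affinity(affinity: float) -> float:
    """Return 1 - log_50000(affinity).

    Assumption: affinity is a positive binding affinity (typically in nM).
    Stronger binders (smaller affinities) map to larger scores.
    """
    if affinity <= 0:
        raise ValueError("affinity must be positive")
    return 1.0 - math.log(affinity) / math.log(50000.0)
```

With this transform an affinity of 50000 maps to 0 and an affinity of 1 maps to 1.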

Previously, if a constructor's initializer list used the C++11 braced-initializer ({}) syntax, the beautifier would encounter the left curly brace ("{") and interpret it as the start of the constructor's body. It now handles this case correctly.

This also fixes a related problem with explicit template instantiations, where the word "template" shows up but isn't immediately followed by a left angle bracket ("<"), e.g. template void somefunc<int>(int a, int b); Previously the beautifier would process everything from "template" to the right angle bracket as the template part, and then hand the left paren off to be interpreted as a function. This doesn't work in the new scheme, because the left paren needs to be handled in process_function_preamble_or_variable. Now, if something other than "<" is hit after the word "template", the process_template function returns.
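The early-return check on "template" can be illustrated with a toy tokenizer. This is only a sketch of the idea, not the beautifier's code: `tokenize` and `starts_template_parameter_list` are hypothetical helpers, and the real process_template operates on the beautifier's own token stream.

```python
import re


def tokenize(code):
    """Split C++ text into identifier and single-character punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", code)


def starts_template_parameter_list(tokens, i=0):
    """Return True only if tokens[i] is 'template' and the next token is '<'.

    When this returns False, a caller in the spirit of process_template
    would return immediately and let the rest of the statement be handled
    as an ordinary function preamble or variable.
    """
    return (
        tokens[i] == "template"
        and i + 1 < len(tokens)
        and tokens[i + 1] == "<"
    )
```

For example, `template <typename T> void f(T a);` passes the check, while `template void somefunc<int>(int a, int b);` does not.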

This PR massively reshapes the python_cc_reader module as this module is converted to python3. The directory structure is as follows:

```
tools/
  python_cc_reader/
    python_cc_reader/
      beauty/
      code_improvement/
      cpp_parser/
      external/
      inclusion_removal/
      library_splitting/
      tests/
      utility/
```

The rationale for this dirname-within-dirname structure was given on this page: https://docs.python-guide.org/writing/structure/

At the top level `python_cc_reader` directory live the user-level scripts such as `library_levels.py` and `beautify_changed_files_in_branch.py`. Within the lower level `python_cc_reader/python_cc_reader` directories live the modules that actually do all the heavy lifting. These scripts are imported by a number of other scripts in the `tools` repository, and I have updated all of these scripts.

My intention is to create this PR as a permanent record of the merge to master which I am going to make immediately after opening this PR. I will merge this to master in the `tools` repository, and then I will be merging a PR in the `main` repository (PR 4590) that updates the `tools` submodule immediately afterwards.

BYachnin and others added 30 commits September 5, 2018 12:26

Added mhc_energy_tools, for generating mhc databases

fe5698a

These tools are for generating the SQL databases required to run MHCEpitopePredictorExternal predictors with the mhc_energy scoreterm.

made epi_thresh actually be used

8859de6

flag to restrict mutable positions

ced6c26

Added exec permissions to mhc database python scripts

c5ae1ca

Added demo files and example invocations; fixed up scripts to support…

6e5909c

…. Added rudimentary plotting capabilities.

demo files

bd45bb4

Merging Chris' demo files

178a664

Commits 6e5909c and bd45bb4)

Made matplotlib optional, and fixed help of score.py

e4f67bd

I moved matplotlib so that it is only imported if plotting is used, to remove this dependency unless needed. Also fixed the help text in score.py.
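The deferred-import pattern described above looks roughly like this. The function `report` and its arguments are hypothetical; only the placement of the import reflects the commit message.

```python
def report(scores, plot_path=None):
    """Print name/score pairs; draw a bar plot only when plot_path is given."""
    for name, score in scores.items():
        print(f"{name}\t{score}")

    if plot_path is not None:
        # Imported here so that matplotlib is required only when plotting
        # is actually requested.
        import matplotlib
        matplotlib.use("Agg")  # non-interactive backend; writes files without a display
        import matplotlib.pyplot as plt

        fig, ax = plt.subplots()
        ax.bar(list(scores.keys()), list(scores.values()))
        fig.savefig(plot_path)
        plt.close(fig)

    return len(scores)
```

Users who never ask for a plot can run the script without matplotlib installed.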

Fixed a bug in db.py help text

a2d1efa

Fixed netmhcii.py to output peptides in order

c516b01

Peptides in reports were being output in an arbitrary order with incorrect position numbers. This has now been fixed, and should work with repetitive sequences as well.
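One way to keep peptides in order with correct positions, even when a sequence repeats, is to carry the start index along with each window instead of searching for the peptide in the sequence afterwards. The sketch below is an assumption about the approach, with a hypothetical function name and 1-based positions.

```python
def enumerate_peptides(sequence, length):
    """Return (position, peptide) pairs in sequence order.

    Positions are 1-based (an assumption). Because each position comes from
    the window's own start index, identical peptides at different positions
    stay distinct and correctly numbered.
    """
    return [
        (start + 1, sequence[start:start + length])
        for start in range(len(sequence) - length + 1)
    ]
```

By contrast, keying a dictionary on the peptide string, or recovering positions with `str.find`, would collapse repeated peptides onto their first occurrence.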

Make PSSM parsing more robust in db.py

05066bf

db.py now supports two formats of PSSMs: the one from command line psiblast, and the one from NCBI's PSSM viewer output. If the PSSM is not in that format, it will try to process it, outputting a warning along the way.

multi-chain pdb files in db.py; chain breaks in pdb files causing epi…

8ad0ace

…tope breaks; forced 'X' to be epitope break in NetMHC

resfile generation

0e31871

multiprocessing for scoring peptides for db; don't rescore peptides a…

0037baf

…lready in db

made db optional; provided argument to estimate number of peptides; c…

990fb4e

…hanged flow accordingly

Merging in master

e1640fd

single chain pdb file for db.py

e38f9fc

renamed main python scripts to be more specific

01c9266

Add submodule updates to the release script.

2d3732f

Need to remove the .git directories from submodules, too.

a3e541e

Flagged overwriting output CSV files as a TODO

2987c67

Allow resfiles to have custom global specification

9f266ef

When outputting resfiles with mhc_gen_db.py, you can now specify which resfile command is set as the default, global setting. If none is specified, it will be left blank.
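As an illustration of what an optional global default means for the output, a resfile puts default commands before the `start` line and per-residue commands after it. The helper below is a hypothetical sketch, not the mhc_gen_db.py implementation, and the per-residue line layout (residue number, chain, command) is assumed.

```python
def format_resfile(per_residue, default_command=None):
    """Build resfile text.

    per_residue maps (residue_number, chain) -> command string.
    default_command, if given, is written above 'start' as the global
    setting; if it is None, the header is left blank.
    """
    lines = []
    if default_command:
        lines.append(default_command)
    lines.append("start")
    for (resnum, chain), command in sorted(per_residue.items()):
        lines.append(f"{resnum} {chain} {command}")
    return "\n".join(lines) + "\n"
```

For example, `format_resfile({(10, "A"): "PIKAA AL"}, "NATAA")` places `NATAA` above `start`.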

Updated mhc_energy_tools README.txt

353231d

Tweaked help text for mhc_gen_db.py --batch flag

bf404f9

Allow 'all' as an allele_set, and make paul15 the default NetMHCII

dfb48b6

allele-set

Make mhc_score.py csv reports append instead of overwrite

5565e6c

Added the sequence name to the top of the CSV report, and open the file for appending instead of overwriting. If the user forgets to use a '$' over a multi-sequence scoring run, each report will be appended in the same file.
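The append-instead-of-overwrite behaviour can be sketched with the standard `csv` module. The function name and the row layout are hypothetical; the point is the `"a"` file mode and the sequence name written ahead of each report.

```python
import csv


def append_report(path, sequence_name, rows):
    """Append one report to a CSV file, headed by the sequence name."""
    # Mode "a" adds to the end of an existing file instead of truncating it,
    # so several sequences written to the same path all survive.
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([sequence_name])
        writer.writerows(rows)
```

Calling this twice with the same path leaves both reports in the file, each introduced by its sequence name.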

aleaverfay and others added 30 commits March 26, 2020 11:54

Update library_levels.py to import correctly

19b762b

Temporarily print the PYTHONPATH from beautification script

4a2e303

Improve handling of extern in beautifier

968c8bf

extern "C" {...} is treated like a namespace.
extern int foo() { ... } is treated like a function.

ok, let's print out the syspath on the testing server

708ae1b

Adding dummy fork_manager to test an import idea

fd33e72

trying to print a little more information

cdf3c8b

print path to python_cc_reader module

5a12033

Adding __init__.py files to python_cc_reader modules

c5f9a43

what version of python is running on the testing server?

0a7484d

Remove debugging code

b972fc5

Potentially fix Rocco's problem with [[ attribute ]] beautification

8ebe4a1

Fix python3 popen output processing in beautifier

f65e50d

Update import statements to python_cc_reader module

c9bccc6

Convert clang ast tools to python3

db996bc

Update serialization validator Popen call for python3

e599eca

Push serialization debugging modifications to the testing server

3d76cf0

Remove debugging output from serialization validator

e06bcf4

The extra debugging output that I had commented back in was leading the serialization test pipeline to conclude that all of the files were failing.

Update the in-code paths for includes

3d5564e

Fix missing addition operator.

ccac4f1

Updated reccea scripts for python3

162b3f2

Fix the convert_pdb_to_antibody_numbering_scheme.py script

70dc5c7

The URL for the antibody numbering converter changed slightly. Update accordingly.

Merge pull request RosettaCommons#92 from RosettaCommons/roccomoretti…

053a198

…/fix_antibody_renumber

Fix the convert_pdb_to_antibody_numbering_scheme.py script

The URL for the antibody numbering converter changed slightly. Update accordingly.

Merge remote-tracking branch 'origin/master' into roccomoretti/rdkit_…

e158a52

…submodule2

Add the score_to_b_factor.py application from the Meiler Lab scripts …

7a66d53

…repo. This may or may not work for the general public, as I think (but am not sure) I was able to yank out the Meiler-lab specific things.

Merge remote-tracking branch 'origin/master' into roccomoretti/rdkit_…

c74efab

…submodule2

Fix spacing issue in header compile

13c836f

Fix typo on external library path.

ed38bc1

Fix commenting mistake.

a52143c

Merge remote-tracking branch 'upstream/master' into smlewis/update_co…

6dd1050

…mmons_master_oct20

14 participants
