How to featurize docking poses from already docked result files to use Open-ComBind? #38

@Sowmya-R-Krishnan

Dear team,

Thank you very much for providing Open-ComBind as a command-line tool for docking pose selection. I have results from a previous docking job run with GNINA using CNN scoring: 10 proteins (PDB files already prepared) and 10 docking poses for each ligand. I would like to use Open-ComBind to select the final docking pose for further analysis.
From the help options of each module, I gather that the tool expects a fixed file path layout such as structures/proteins, structures/ligands, etc. I tried running the featurize module on my docking result (an SDF file) and it confirmed my fears: I cannot work out how to name and place my files so that they match the layout Open-ComBind expects. Given that I have the following data in hand, could you kindly help me with the path and filename settings needed to run featurization and pose selection?

  1. PDB files of 10 proteins (already prepared for docking with GNINA).
  2. PDB files of crystal ligands separated from the co-crystal structures for grid box setting.
  3. Multi-SDF files for several ligands with 10 poses per file.

Also, while trying to fix the error in the featurization step, I saw that in one of the source files (features/ifp.py) the protein filename is built as shown below:

prot_bname = input_file.split('-to-')[-1]
prot_fname = re.sub('-docked.*\.sdf(\.gz)?','_prot.pdb',prot_bname)
prot_file = f"structures/proteins/{prot_fname}"

Here, the input filename is expected to contain a -to- phrase, the docking result file is not expected to have any preceding file path (since the next line uses structures/proteins/ as the hard-coded path to access the protein file), and the docking output file itself should end with the suffix -docked.sdf or -docked.sdf.gz.

Would it be possible to provide a detailed README or usage manual that explains these requirements up front, so that Open-ComBind can be used effectively? Re-running all of my docking jobs through this pipeline will not be possible for me, so it would be great if there were a way to use the existing results directly. Thank you for taking the time to read this; I hope to hear from the team soon.
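To make the naming convention inferred above concrete, here is a minimal Python sketch. `expected_protein_path` simply mirrors the three lines quoted from features/ifp.py (with the pattern written as a raw string), so it shows which protein file a given docking filename would resolve to. `stage_for_featurize` is a hypothetical helper, not part of Open-ComBind: it copies an existing docked SDF and protein PDB to names that the quoted snippet would accept. It assumes that the `<ligand>-to-<protein>-docked.sdf` pattern and the `_prot.pdb` suffix are all that this step needs, and that the docked file may sit in the working directory. Other modules may impose further requirements (for example on structures/ligands) that are not covered here.

```python
import re
import shutil
from pathlib import Path


def expected_protein_path(input_file: str) -> str:
    """Reproduce the filename logic quoted from features/ifp.py."""
    prot_bname = input_file.split("-to-")[-1]
    prot_fname = re.sub(r"-docked.*\.sdf(\.gz)?", "_prot.pdb", prot_bname)
    return f"structures/proteins/{prot_fname}"


def stage_for_featurize(docked_sdf, protein_pdb, ligand, protein, workdir="."):
    """Hypothetical helper: copy existing files to names the snippet resolves.

    `ligand` and `protein` are labels chosen by the user; they should not
    themselves contain '-to-' or '-docked', or the parsing above breaks.
    """
    workdir = Path(workdir)
    prot_dir = workdir / "structures" / "proteins"
    prot_dir.mkdir(parents=True, exist_ok=True)

    staged_sdf = workdir / f"{ligand}-to-{protein}-docked.sdf"
    staged_pdb = prot_dir / f"{protein}_prot.pdb"
    shutil.copyfile(docked_sdf, staged_sdf)
    shutil.copyfile(protein_pdb, staged_pdb)
    return staged_sdf, staged_pdb


if __name__ == "__main__":
    # Pure string checks; no files are touched here.
    for name in ("LIG1-to-1ABC-docked.sdf",
                 "LIG1-to-1ABC-docked.sdf.gz",
                 "results/lig1-docked.sdf"):
        print(name, "->", expected_protein_path(name))
```

With this logic, `LIG1-to-1ABC-docked.sdf` (or its `.gz` variant) resolves to `structures/proteins/1ABC_prot.pdb`, whereas a name without `-to-` such as `results/lig1-docked.sdf` resolves to `structures/proteins/results/lig1_prot.pdb`, which is unlikely to exist. Whether this is the cause of the error reported below is an assumption, not something the snippet alone confirms.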

Error from the featurizer when a path was prefixed to the docking output filename:
[Screenshot from 2024-04-04 14-59-18]

With regards,
Sowmya
