Planned Features #1

@atc3

Will update this as things change

  • test separate variance modeling in a structured way
  • fix experiment alignment figures image path - needs to be relative to the HTML.
  • iteratively run the filters, and check if there are experiments where all observations are filtered out, and then remove them and run the filters again
  • fix "sort" issue with pd.concat
q:\anaconda\lib\site-packages\pandas\core\frame.py:6201: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.

To accept the future behavior, pass 'sort=True'.

To retain the current behavior and silence the warning, pass sort=False

  sort=sort)
  • fix RI columns being wiped when concatenating
  • quick and dirty pairwise correlation between experiments - to see outliers and warn the user that they should be filtered out
  • check the max(PEP) of each raw file, and warn the user if they input a raw file with PEPs that are too low (nothing to boost)
  • retention length filtering - raw file specific
  • rename output columns
  • migrate to config file instead of command-line options
  • improve input file-type converting
    • file-type determines column names
    • move filtering blocks into separate functions. file-type determines which functions are run
  • pip installable
  • violin plot of residual density by RT (RT on x-axis)
  • pairwise correlation of RTs - heatmap
  • diagnostic figures for the update portion
    • PEP vs PEP.new scatterplot
    • fold change increase in IDs as function of PEP threshold
  • validation figures
    • multiple peptides of the same protein - should have the same intensity (measure the CV)
  • generate HTML file to view figures
  • add and start throwing exceptions
  • create entire output directory including all subfolders
  • parameter for defining column headers - additional option instead of specifying the file type
  • fix experiment exclusion
  • optional save alignment parameters
  • split up outputs in same way the inputs are split up
    • then remove input_id column
  • remove id column as well?
  • verbose levels and actually enforce them in code
  • additional parameters to select which columns to have
    • default should just be pep_new. maybe have a "diagnostic" flag that includes the other columns?
  • logging -> logger
  • default retention length filter - (max_retention_time) / 60
  • optimize experiment updating
  • filter_decoys/contaminants -> include_decoys/contaminants
  • add PEP_updated column
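
Rough sketch of the "iteratively run the filters" item above, not the actual implementation. The column names (`experiment`, `sequence`, `PEP`), the two example filters, and the 0.5 threshold are all assumptions for illustration. The idea: run every filter, look for experiments that lost all of their observations, drop those experiments from the input, and run the filters again until nothing else empties out.

```python
import pandas as pd


def drop_low_confidence(df, max_pep=0.5):
    # Hypothetical filter: keep observations at or below a PEP cutoff.
    return df[df["PEP"] <= max_pep]


def drop_rare_peptides(df, min_experiments=2):
    # Hypothetical filter: keep peptides seen in at least `min_experiments`
    # experiments. Its result depends on which experiments are present.
    n_exp = df.groupby("sequence")["experiment"].transform("nunique")
    return df[n_exp >= min_experiments]


def filter_iteratively(df, filters, max_rounds=10):
    """Run the filters, remove experiments left with no observations, repeat."""
    kept = df
    dropped = []
    for _ in range(max_rounds):
        filtered = kept
        for f in filters:
            filtered = f(filtered)
        # Experiments present before filtering but absent afterwards
        empty = set(kept["experiment"]) - set(filtered["experiment"])
        if not empty:
            return filtered, dropped
        dropped.extend(sorted(empty))
        kept = kept[~kept["experiment"].isin(empty)]
    raise RuntimeError("filters did not stabilize within max_rounds")


data = pd.DataFrame({
    "experiment": ["A", "A", "B", "B", "C"],
    "sequence": ["PEPA", "PEPB", "PEPA", "PEPB", "PEPC"],
    "PEP": [0.01, 0.02, 0.03, 0.01, 0.90],
})
filtered, dropped = filter_iteratively(
    data, [drop_low_confidence, drop_rare_peptides]
)
print(dropped)        # experiments removed because every observation was filtered
print(len(filtered))  # observations that survive
```

Returning the list of dropped experiments makes it easy to report them to the user instead of failing later on an empty experiment.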
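
For the `pd.concat` "sort" item above: the warning shows up when the frames being stacked do not have identically ordered columns, and it goes away once `sort` is passed explicitly. A minimal sketch with made-up frames (the column names are assumptions, not the tool's real schema):

```python
import pandas as pd

# Two hypothetical per-file tables with the same columns in a different order
a = pd.DataFrame({"sequence": ["PEPA"], "PEP": [0.01], "RT": [12.3]})
b = pd.DataFrame({"RT": [45.6], "sequence": ["PEPB"], "PEP": [0.20]})

# Passing sort explicitly silences the FutureWarning.
# sort=False leaves the columns in order of first appearance
# instead of sorting them alphabetically.
combined = pd.concat([a, b], axis=0, ignore_index=True, sort=False)
print(list(combined.columns))
```

Whether `sort=False` or `sort=True` is the right choice depends on whether downstream code relies on the alphabetical column order.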
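
For the max(PEP) check above, something along these lines might work. The `raw_file` and `PEP` column names and the 0.05 cutoff are placeholders, not values taken from the code base:

```python
import warnings

import pandas as pd


def warn_if_nothing_to_boost(df, min_max_pep=0.05):
    """Warn about raw files whose highest PEP is already below the cutoff."""
    max_pep = df.groupby("raw_file")["PEP"].max()
    flagged = sorted(max_pep[max_pep < min_max_pep].index)
    for raw_file in flagged:
        warnings.warn(
            f"{raw_file}: max(PEP) = {max_pep[raw_file]:.3g} is below "
            f"{min_max_pep}; there may be nothing to boost"
        )
    return flagged


obs = pd.DataFrame({
    "raw_file": ["run1.raw", "run1.raw", "run2.raw", "run2.raw"],
    "PEP": [0.001, 0.4, 0.001, 0.002],
})
flagged = warn_if_nothing_to_boost(obs)
print(flagged)
```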

FUTURE VERSION

  • move off of STAN
  • optimize data selection by RT bin, experiment, and peptide. remove as much as possible but retain the same amount of coverage.
