Feature anomaly model #111
@@ -2,9 +2,11 @@

 S3method(as.data.frame,MSstatsValidated)
 S3method(as.data.table,MSstatsValidated)
+export(CheckDataHealth)
 export(DIANNtoMSstatsFormat)
 export(DIAUmpiretoMSstatsFormat)
 export(FragPipetoMSstatsFormat)
+export(MSstatsAnomalyScores)
 export(MSstatsBalancedDesign)
 export(MSstatsClean)
 export(MSstatsImport)
@@ -25,7 +27,9 @@ export(getDataType)
 export(getInputFile)
 exportMethods(getDataType)
 exportMethods(getInputFile)
+import(Rcpp)
 import(data.table)
+import(parallel)
Contributor
Verification agent
parallel import: either use it or drop it; also list it in DESCRIPTION
If you're not calling parallel::* explicitly, remove this import to avoid check notes. If you do use it, add parallel to DESCRIPTION Imports.
Script executed:
#!/bin/bash
rg -nP '\bparallel::|\blibrary\s*\(\s*parallel\s*\)' -g 'R/**'
Length of output: 197
Add parallel to DESCRIPTION Imports
DESCRIPTION:
 Imports:
     data.table,
     log4r,
     methods,
     checkmate,
     utils,
     stringi,
     Rcpp,
+    parallel,
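The search in the comment above can also be reproduced from within R. The following is a minimal sketch only: the helper name `uses_parallel` and the toy input lines are hypothetical, and a real check would read the lines from the files under `R/`.

```r
# Hypothetical helper: flag lines of R source that reference the
# parallel package explicitly, using the same pattern as the rg call.
uses_parallel <- function(lines) {
  pattern <- "\\bparallel::|\\blibrary\\s*\\(\\s*parallel\\s*\\)"
  grepl(pattern, lines, perl = TRUE)
}

# Toy input standing in for lines read from the package sources.
src <- c(
  "cl <- parallel::makeCluster(2)",
  "x <- data.table::fread(path)",
  "library(parallel)"
)
hits <- uses_parallel(src)
print(src[hits])
```

If no line matches, the `import(parallel)` directive is a candidate for removal; if any line matches, the package belongs in the DESCRIPTION `Imports` field.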
 importFrom(data.table,as.data.table)
 importFrom(data.table,fread)
 importFrom(data.table,melt)
@@ -37,3 +41,4 @@ importFrom(log4r,file_appender)
 importFrom(methods,new)
 importFrom(stats,na.omit)
 importFrom(utils,sessionInfo)
+useDynLib(MSstatsConvert, .registration = TRUE)

@@ -0,0 +1,7 @@
+# Generated by using Rcpp::compileAttributes() -> do not edit by hand
+# Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393
+
+calculate_anomaly_score <- function(df, n_trees, max_depth) {
+    .Call(`_MSstatsConvert_calculate_anomaly_score`, df, n_trees, max_depth)
+}
+
@@ -1,32 +1,39 @@
 #' Clean raw Spectronaut output.
 #' @param msstats_object an object of class `MSstatsSpectronautFiles`.
 #' @param intensity chr, specifies which column will be used for Intensity.
+#' @param calculateAnomalyScores logical, whether to calculate anomaly scores
+#' @param anomalyModelFeatures character vector, specifies which columns will be used for anomaly detection model. Can be NULL if calculateAnomalyScores=FALSE.
 #' @return `data.table`
Comment on lines +4 to 6
Contributor
Refactor suggestion
Add defaults to new parameters for backward compatibility
Set calculateAnomalyScores = FALSE and anomalyModelFeatures = NULL in the signature, and update the roxygen block to document the defaults. This prevents breakage at indirect call sites that still pass only the original two arguments.
-#' @param calculateAnomalyScores logical, whether to calculate anomaly scores
-#' @param anomalyModelFeatures character vector, specifies which columns will be used for anomaly detection model. Can be NULL if calculateAnomalyScores=FALSE.
+#' @param calculateAnomalyScores logical (default: FALSE), whether to calculate anomaly scores
+#' @param anomalyModelFeatures character vector or NULL (default: NULL); columns used for anomaly detection model. Ignored when calculateAnomalyScores=FALSE.
@@
-.cleanRawSpectronaut = function(msstats_object, intensity,
-                                calculateAnomalyScores,
-                                anomalyModelFeatures) {
+.cleanRawSpectronaut = function(msstats_object, intensity,
+                                calculateAnomalyScores = FALSE,
+                                anomalyModelFeatures = NULL) {
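To illustrate why the defaults matter, here is a minimal sketch with two hypothetical stand-in functions (`clean_without_defaults` and `clean_with_defaults`). Their bodies are placeholders and do not reproduce the package's cleaning logic; only the signatures mirror the two variants discussed in the comment.

```r
# Hypothetical stand-ins for the signature in the PR and the suggested one.
clean_without_defaults <- function(input, intensity,
                                   calculateAnomalyScores,
                                   anomalyModelFeatures) {
  if (calculateAnomalyScores) "scored" else "plain"
}

clean_with_defaults <- function(input, intensity,
                                calculateAnomalyScores = FALSE,
                                anomalyModelFeatures = NULL) {
  if (calculateAnomalyScores) "scored" else "plain"
}

# A call site written before the new parameters existed passes two arguments.
# Without defaults, evaluating the missing argument raises an error.
old_call_fails <- inherits(
  try(clean_without_defaults("data", "PeakArea"), silent = TRUE),
  "try-error"
)

# With defaults, the same call keeps working and skips anomaly scoring.
old_call_result <- clean_with_defaults("data", "PeakArea")
```

Because R evaluates arguments lazily, the version without defaults fails only when `calculateAnomalyScores` is first used, so the error can surface far from the call site that omitted it.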
 #' @keywords internal
-.cleanRawSpectronaut = function(msstats_object, intensity) {
+.cleanRawSpectronaut = function(msstats_object, intensity,
+                                calculateAnomalyScores,
+                                anomalyModelFeatures) {
     FFrgLossType = FExcludedFromQuantification = NULL

     spec_input = getInputFile(msstats_object, "input")
     .validateSpectronautInput(spec_input)
     spec_input = spec_input[FFrgLossType == "noloss", ]

     if (is.character(spec_input$FExcludedFromQuantification)) {
         spec_input = spec_input[FExcludedFromQuantification == "False", ]
     } else {
         spec_input = spec_input[!(as.logical(FExcludedFromQuantification)), ]
     }

     f_charge_col = .findAvailable(c("FCharge", "FFrgZ"), colnames(spec_input))
     pg_qval_col = .findAvailable(c("PGQvalue"), colnames(spec_input))
     interference_col = .findAvailable(c("FPossibleInterference"),
                                       colnames(spec_input))
     exclude_col = .findAvailable(c("FExcludedFromQuantification"),
                                  colnames(spec_input))
     cols = c("PGProteinGroups", "EGModifiedSequence", "FGCharge", "FFrgIon",
              f_charge_col, "RFileName", "RCondition", "RReplicate",
-             "EGQvalue", pg_qval_col, paste0("F", intensity))
+             "EGQvalue", pg_qval_col, interference_col, exclude_col,
+             paste0("F", intensity))
+    if (calculateAnomalyScores){
+        cols = c(cols, anomalyModelFeatures)
+    }
Comment on lines +27 to +29
Contributor
Refactor suggestion
Fail fast if requested anomalyModelFeatures are missing
Currently, missing features are silently dropped by the intersect() call that follows, which can mask configuration errors. Validate the requested features when calculateAnomalyScores is TRUE.
-    if (calculateAnomalyScores){
-        cols = c(cols, anomalyModelFeatures)
-    }
+    if (isTRUE(calculateAnomalyScores)) {
+        if (is.null(anomalyModelFeatures) || length(anomalyModelFeatures) == 0L) {
+            stop("calculateAnomalyScores=TRUE requires non-empty anomalyModelFeatures.")
+        }
+        missing_feats = setdiff(anomalyModelFeatures, colnames(spec_input))
+        if (length(missing_feats)) {
+            stop(sprintf("Requested anomalyModelFeatures not found in Spectronaut input: %s",
+                         paste(missing_feats, collapse = ", ")))
+        }
+        cols = c(cols, anomalyModelFeatures)
+    }
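The difference between the two behaviours can be seen on toy data. This is a sketch under stated assumptions: the column names `FeatureA` and `FeatureB` are hypothetical placeholders, not real Spectronaut columns, and the character vector stands in for `colnames(spec_input)`.

```r
# Toy column names standing in for colnames(spec_input).
available_cols <- c("PGProteinGroups", "EGQvalue", "FeatureA")

# Hypothetical feature request; FeatureB is absent from the input.
requested <- c("FeatureA", "FeatureB")

# Behaviour in the PR: intersect() keeps what exists and drops the rest
# without any warning.
cols <- intersect(c("PGProteinGroups", requested), available_cols)

# Suggested behaviour: compute the missing set first and stop on it.
missing_feats <- setdiff(requested, available_cols)
msg <- tryCatch({
  if (length(missing_feats)) {
    stop("Requested anomalyModelFeatures not found in Spectronaut input: ",
         paste(missing_feats, collapse = ", "))
  }
  "ok"
}, error = function(e) conditionMessage(e))
print(msg)
```

With `intersect()` alone, the model would be fitted on `FeatureA` only and the user would get no signal that `FeatureB` was ignored.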
     cols = intersect(cols, colnames(spec_input))
     spec_input = spec_input[, cols, with = FALSE]
     data.table::setnames(
         spec_input,
         c("PGProteinGroups", "EGModifiedSequence", "FGCharge", "FFrgIon",
-          f_charge_col, "RFileName", paste0("F", intensity), "RCondition", "RReplicate"),
+          f_charge_col, "RFileName", paste0("F", intensity),
+          "RCondition", "RReplicate"),
         c("ProteinName", "PeptideSequence", "PrecursorCharge", "FragmentIon",
           "ProductCharge", "Run", "Intensity", "Condition", "BioReplicate"),
         skip_absent = TRUE)

Contributor
Verification agent
New export CheckDataHealth looks good; verify that the man page and implementation exist
The export aligns with the PR goals. Please confirm the function is defined and documented so that R CMD check doesn't warn about missing objects.
Script executed:
Length of output: 72
Add documentation for exported function CheckDataHealth
The implementation was found at R/MSstatsConvert_core_functions.R:579, but no corresponding man page was detected. Please add a man/CheckDataHealth.Rd file to prevent R CMD check warnings.