Skip to content

Get topSS from input GWAS using import_topSNPs #7

@AMCalejandro

Description

@AMCalejandro

I find a bit painful having to generate the Gene Locus columns from the input GWAS every time.
I would like to see import_topSNPs being able to generate that for me rather than passing Gene Locus, or a function called make_topSNPs that does it for you.

I came up with a function to do this, basically wrapping the function here https://github.com/oyhel/vautils

This is the approach I used

make_topSNPs = function(data,
                        build = "hg19",
                        write.out = T,
                        custom_gene = NULL,
                        .metadata_file = metadata) {
  # I need to do this in advance to enter vautils function
  data = data %>%
    dplyr::rename(rsid = `#SNP`,
                  chromosome = CHR,
                  position = POS)

  snps_mapped = vautils::find_nearest_gene(as.data.frame(data),
                                           build=build,
                                           collapse=F,
                                           snp = "rsid",
                                           flanking=1000)


  if(any(snps_mapped$distance == "intergenic")) {
    snps_mapped = snps_mapped %>%
      mutate(distance = recode(distance, "intergenic" = "0"))
  }

  snps_mapped_filt = snps_mapped %>%
    mutate(distance = abs(as.numeric(distance))) %>%
    arrange(distance) %>%
    group_by(rsid) %>%
    filter(row_number() == 1) %>%
    ungroup()

  if (!is.null(custom_gene)) {
     custom_gene_filt = snps_mapped %>%
       dplyr::filter(GENE %in% custom_gene)
     # Remove the picked gene from vautils
     snps_mapped_filt = snps_mapped_filt %>%
       dplyr::filter(!((chromosome == custom_gene_filt$chromosome) & (position == custom_gene_filt$position)))

     snps_mapped_filt = rbind(snps_mapped_filt, custom_gene_filt)
  }

  snps_mapped_filt = data.frame(Locus = snps_mapped_filt$GENE,
                     Gene = snps_mapped_filt$GENE,
                     CHR = snps_mapped_filt$chromosome,
                     POS = snps_mapped_filt$position,
                     SNP = snps_mapped_filt$rsid)


  tmp_2 = data %>%
    dplyr::filter(rsid %in% snps_mapped_filt$SNP) %>%
    dplyr::select(SNP = rsid, Effect, P = Pval)
  mydf = snps_mapped_filt %>% inner_join(tmp_2)


  if (write.out) {
    fwrite(mydf, paste(.metadata_file[1],"top_SNPs.txt", sep = "/"),
           col.names = T,
           row.names = F,
           sep ="\t",
           quote = F)
  }
  return(mydf)
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    Status

    Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions