Installation references:
https://geekdaxue.co/read/biotrainee@wes/rrwrli#fghc8w
https://github.com/SiYangming/HAPSEG
ABSOLUTE download: https://github.com/Yunuuuu/ABSOLUTE/blob/main/ABSOLUTE_1.0.6.tar.gz
HAPSEG download: https://github.com/SiYangming/HAPSEG/releases
## First, install the dependencies
install.packages(c("numDeriv","Cairo","RGtk2","cairoDevice"))
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("DNAcopy", "geneplotter", "Rcpp", "foreach"))
## Then install the two R packages from the downloaded tarballs
install.packages("HAPSEG_1.1.1.tar.gz",repos=NULL)
install.packages("ABSOLUTE_1.0.6.tar.gz",repos=NULL)
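Before moving on, it can help to confirm that everything actually installed. The following is a minimal sketch, assuming the package names used above; `missing_packages` is a hypothetical helper written for this note, not part of ABSOLUTE or HAPSEG.

```r
# Hypothetical helper: return the names of packages that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Package names taken from the install commands above.
needed <- c("numDeriv", "DNAcopy", "geneplotter", "Rcpp", "foreach",
            "HAPSEG", "ABSOLUTE")
missing <- missing_packages(needed)

if (length(missing) > 0) {
  message("Still missing: ", paste(missing, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```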
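Usage:
## Test run. The default parameters are usually fine; the values here are the ones recommended in the reference linked above.
library(ABSOLUTE)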
RunAbsolute(
seg.dat.fn = "SNP6_solid_tumor.seg.txt",
sigma.p = 0,
max.sigma.h = 0.2,
min.ploidy = 0.5,
max.ploidy = 8,
primary.disease = "BLCA",
platform = "SNP_6.0",
results.dir = "solid",
sample.name = "solid",
max.as.seg.count = 1500,
max.neg.genome = 0.005,
max.non.clonal = 1,
copy_num_type = "total",
maf.fn = "solid_tumor.maf.txt",
min.mut.af = 0.1
)
Input file format requirements:
Segments file:
A file in either of the following two formats:
- For ALLELIC copy number type analysis, supply an RData file produced by HAPSEG or AllelicCapseg. These datasets allow incorporation of copy neutral LOH events. Segmentation data produced by any other means must conform to the output formats of HAPSEG/AllelicCapseg for ABSOLUTE to consider copy neutral LOH events.
- For TOTAL copy number type analysis, supply a tab-delimited segmentation file in plain-text format. File extension does not matter. ABSOLUTE algorithm v1.0.6 requires the following five columns. Additional columns are ignored.
- Chromosome
  In either chr# or # format.
- Start
- End
- Num_Probes
- Segment_Mean
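The five-column requirement for TOTAL copy number analysis can be checked before running ABSOLUTE. This is a rough sketch under stated assumptions: `missing_seg_columns` is a hypothetical helper, the values are toy data for illustration only, and the output file name is arbitrary.

```r
# Columns required for TOTAL copy number analysis, as listed above.
required_seg_cols <- c("Chromosome", "Start", "End", "Num_Probes", "Segment_Mean")

# Hypothetical helper: return the required columns absent from a data frame.
missing_seg_columns <- function(seg) {
  setdiff(required_seg_cols, colnames(seg))
}

# Toy data for illustration only; real values come from your segmentation tool.
seg <- data.frame(
  Chromosome   = c("1", "1", "2"),
  Start        = c(100000, 5000000, 200000),
  End          = c(4999999, 9000000, 8000000),
  Num_Probes   = c(1200, 950, 2100),
  Segment_Mean = c(0.02, -0.45, 0.31)
)

# Stop early if any required column is absent.
stopifnot(length(missing_seg_columns(seg)) == 0)

# Write a tab-delimited plain-text file; the extension does not matter.
seg_file <- file.path(tempdir(), "example.seg.txt")
write.table(seg, seg_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

The resulting path could then be passed as `seg.dat.fn` with `copy_num_type = "total"`.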
MAF file:
(Optional) Somatic mutation data in Mutation Annotation Format (MAF), supplied as a plain-text file. File extension does not matter and hashtagged header rows (#) may be present. ABSOLUTE algorithm v1.0.6 requires the following seven columns. Additional columns are ignored.
- t_ref_count OR i_t_ref_count
  Count of reference alleles in tumor.
- t_alt_count OR i_t_alt_count
  Count of alternate alleles in tumor. Together with t_ref_count adds up to the depth of reads in the tumor BAM alignment. You can calculate a missing value if two of these three values are known or with read depth and the frequency of the alternate allele within the sample. These and other MuTect output columns are described further in the GATK forum.
- dbSNP_Val_Status
  Fields may be blank and multiple values are separated with nonspaced semicolon. Example values include bySubmitter, by1000genomes, by2Hit2Allele, and byHapMap.
- Start_position
  Note the lowercase “p”. Also, note that the End_position column is not required. This implies that ABSOLUTE algorithm v1.0.6 treats all mutation data equally as point mutations, the expected type of mutation data.
- Tumor_Sample_Barcode
  Fields may be blank.
- Hugo_Symbol
  Fields may be blank or “unknown”.
- Chromosome
  Must be in # format and not chr# format. The # value must correspond to that in the segmented copy ratios data file identically. For example, ABSOLUTE does not equate X with 23 and will exclude these mutations as unmapped mutations. Note ABSOLUTE algorithm v1.0.6 excludes X chromosome data but not numbered chromosome, e.g. chr23, data.
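Two of the points above lend themselves to a small preprocessing step: deriving a missing reference count from read depth and the alternate count, and stripping a `chr` prefix so that Chromosome is in # format. The following is only a sketch: `prepare_maf` is a hypothetical helper, the depth column name `t_depth` is an assumption about your input, and the rows are toy data.

```r
# Columns required in the MAF file, as listed above. The i_ prefixed
# alternatives for the two count columns are not handled in this sketch.
required_maf_cols <- c("t_ref_count", "t_alt_count", "dbSNP_Val_Status",
                       "Start_position", "Tumor_Sample_Barcode",
                       "Hugo_Symbol", "Chromosome")

# Hypothetical helper: fill in t_ref_count as depth minus t_alt_count when it
# is absent, and remove a leading "chr" from the Chromosome column.
# depth_col is an assumed column name, not something ABSOLUTE defines.
prepare_maf <- function(maf, depth_col = "t_depth") {
  if (!"t_ref_count" %in% colnames(maf) && depth_col %in% colnames(maf)) {
    maf$t_ref_count <- maf[[depth_col]] - maf$t_alt_count
  }
  maf$Chromosome <- sub("^chr", "", as.character(maf$Chromosome))
  maf
}

# Toy rows for illustration only.
maf <- data.frame(
  t_depth              = c(100, 80),
  t_alt_count          = c(25, 8),
  dbSNP_Val_Status     = c("", "byHapMap"),
  Start_position       = c(12345, 67890),
  Tumor_Sample_Barcode = c("solid", "solid"),
  Hugo_Symbol          = c("", "unknown"),
  Chromosome           = c("chr17", "chr2"),
  stringsAsFactors     = FALSE
)

maf <- prepare_maf(maf)

# Stop early if any required column is still missing.
stopifnot(all(required_maf_cols %in% colnames(maf)))
```

Extra columns such as `t_depth` can stay in the file, since additional columns are ignored.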