Getting Started

Welcome Installing Bioconductor Command line tools (Docker) Getting the data

Quality Control

Overview Depth of Sequencing Coverage Cross Sample Contamination Tumor-in-normal Tumor Purity

Mutational Signatures

Overview SomaticSignatures (NMF) deconstructSigs (NNLS) SigMa (Likelihood) scarHRD (Homologous Recombination Deficiency) MSIsensor (Microsatellite Instability)

Mutational Significance Analysis

Overview MutSigCV OncodriveFML MutPanning Brown's Method (for combining p-values)

Tumor Phylogenetics

Overview Calculate Mutation Cancer Cell Fractions (CCFs) PhylogicNDT PyClone

Gene Set Analysis

Overview GSEA ssGSEA Pathway Overrepresentation

Differential Expression

Overview edgeR DESeq2 Independent Hypothesis Weighting (IHW)

FACETS

Generating pileup file

library(facets)

# to get the same results every time set the seed
set.seed(42)

# read in and format the snp-pileup file
rcmat <- readSnpMatrix()

# restrict SNP sites to those within sequencing depth thresholds
# also restrict SNP sites to those that pass het thresholds (being a true het site in sample)
# also make sure there's, on average, 1 snp every snp.nbhd for segmentation (if not add psuedo snp)
xx <- preProcSample(rcmat)

# perform segmentation
oo <- procSample(xx, cval=150)

# determine allelic copy numbers, tumor ploidy, and tumor purity from segmentation info
fit <- emcncf(oo)

# plot segmentation info, allelic copy numbers, and cellular fraction of copy number events
plotSample(x = oo, emfit = fit)

# QC plot (the closer the points are to the lines, the better calibrated the results)
logRlogORspider(oo$out, oo$dipLogR)

Calculate copy number CCF

When analyzing the cncf output from FACETS, most scientists assume that the cf.em (cellular fraction) column is the cancer cell fraction (CCF) of the copy number event. However, cf.em is the cellular fraction of the entire sample, which includes both tumor and normal components (unless the tumor sample is 100% pure). You should notice that the only segments with a cf.em value of 1.0 (100%) are normal ploidy segments. To calculate the CCF of each copy number event, simply divide the cf.em column by the purity of the sample. In some cases the cf.em value may be slightly higher than the purity value due to noise, which in that case the CCF estimates can be rounded to 1.0 (100%).

# grab all segments with copy number event
cncf.ccf <- fit$cncf[which(fit$cncf$tcn.em != 2 & fit$cncf$lcn.em != 1), ]

# calculate CCF
cncf.ccf$ccf <- sapply(cncf.ccf, function(x) {
  x <- x / fit$purity
  if(x > 1) {
    x <- 1
  }
  return(x)
})

Improve this page