findSpliceOverlaps {GenomicRanges} | R Documentation |
The findSpliceOverlaps
function identifies ranges (reads) that are
compatible with a specific transcript isoform. The non-compatible ranges are
analyzed for the presence of novel splice events.
findSpliceOverlaps(query, subject, ignore.strand=FALSE, ...) ## S4 method for signature 'GappedAlignments,GRangesList' findSpliceOverlaps(query, subject, ignore.strand=FALSE, ..., cds=NULL) ## S4 method for signature 'GappedAlignmentPairs,GRangesList' findSpliceOverlaps(query, subject, ignore.strand=FALSE, ..., cds=NULL) ## S4 method for signature 'GRangesList,GRangesList' findSpliceOverlaps(query, subject, ignore.strand=FALSE, ..., cds=NULL) ## Low-level utils: ## High-level convenience wrappers (coming soon): #summarizeSpliceOverlaps(query, subject, ignore.strand=FALSE, ...)
query |
Single or paired-end reads are specified with the |
subject |
A GRangesList containing the annotations. This list is expected to be exons by transcripts. |
ignore.strand |
When set to |
cds |
Optional GRangesList of coding regions for each transcript
in the |
... |
When a read maps compatibly and uniquely to a transcript isoform we can quantify the expression and look for shifts in the balance of isoform expression. If a read does not map in compatible way, novel splice events such as splice junctions, novel exons or retentions can be quantified and compared across samples.
findSpliceOverlaps
detects which reads (query) match to
transcripts (subject) in a compatible fashion. Compatibility is based
on both the transcript bounds and splicing pattern. Assessing the
splicing pattern involves comparision of the read splices (i.e., the
"N" gaps in the cigar) with the transcript introns. For paired-end
reads, the inter-read gap is not considered a splice. The analysis
of non-compatible reads for novel splice events is under construction.
The output is a Hits object with
the metadata columns defined below. Each column is a logical
indicating if the read (query) met the criteria.
compatible: Every splice (N) in a read alignment matches an intron in an annotated transcript. The read does not extend into an intron or outside the transcript bounds.
unique: The read is compatible with only one annotated transcript.
strandSpecific: The query (read) was stranded.
## Methods in Rsamtools :
findSpliceOverlaps,character,ANY(query, subject, ignore.strand=FALSE, ..., param=ScanBamParam(), singleEnd=TRUE, cds=NULL)
findSpliceOverlaps,BamFile,ANY(query, subject, ignore.strand=FALSE, ..., param=ScanBamParam(), singleEnd=TRUE, cds=NULL)
Michael Lawrence and Valerie Obenchain <vobencha@fhcrc.org>
The GRangesList, GappedAlignments, and GappedAlignmentPairs classes.
## ----------------------------------------------------------------------- ## Isoform expression : ## ----------------------------------------------------------------------- ## findSpliceOverlaps() can assist in quantifying isoform expression ## by identifying reads that map compatibly and uniquely to a ## transcript isoform. library(Rsamtools) library(TxDb.Dmelanogaster.UCSC.dm3.ensGene) library(pasillaBamSubset) se <- untreated1_chr4() ## single-end reads txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene exbytx <- exonsBy(txdb, "tx") cdsbytx <- cdsBy(txdb, "tx") param <- ScanBamParam(which=GRanges("chr4", IRanges(1e5,3e5))) sehits <- findSpliceOverlaps(se, exbytx, cds=cdsbytx, param=param) ## Tally the reads by category to get an idea of read distribution. lst <- lapply(mcols(sehits), table) nms <- names(lst) tbl <- do.call(rbind, lst[nms]) tbl ## Reads compatible with one or more transcript isoforms. rnms <- rownames(tbl) tbl[rnms == "compatible","TRUE"]/sum(tbl[rnms == "compatible",]) ## Reads compatible with a single isoform. tbl[rnms == "unique","TRUE"]/sum(tbl[rnms == "unique",]) ## All reads fall in a coding region as defined by ## the txdb annotation. lst[["coding"]] ## Check : Total number of reads should be the same across categories. lapply(lst, sum) ## ----------------------------------------------------------------------- ## Paired-end reads : ## ----------------------------------------------------------------------- ## 'singleEnd' is set to FALSE for a Bam file with paired-end reads. pe <- untreated3_chr4() hits2 <- findSpliceOverlaps(pe, exbytx, singleEnd=FALSE, param=param) ## In addition to Bam files, paired-end reads can be supplied in a ## GappedAlignmentPairs object. genes <- GRangesList( GRanges("chr1", IRanges(c(5, 20), c(10, 25)), "+"), GRanges("chr1", IRanges(c(5, 22), c(15, 25)), "+")) galp <- GappedAlignmentPairs( GappedAlignments("chr1", 5L, "11M4N6M", strand("+")), GappedAlignments("chr1", 50L, "6M", strand("-")), isProperPair=TRUE) findSpliceOverlaps(galp, genes)