encodeOverlaps-methods {GenomicRanges}R Documentation

encodeOverlaps method for GRangesList objects, and related utilities

Description

encodeOverlaps method for GRangesList, and related utilities.

Usage

## S4 method for signature 'GRangesList,GRangesList'
encodeOverlaps(query, subject, hits=NULL,
               flip.query.if.wrong.strand=FALSE)

## Low-level utils:

flipQuery(x, i)

selectEncodingWithCompatibleStrand(ovencA, ovencB,
                                   query.strand, subject.strand, hits=NULL)

isCompatibleWithSplicing(x)
isCompatibleWithSkippedExons(x, max.skipped.exons=NA)

extractSteppedExonRanks(x, for.query.right.end=FALSE)
extractSpannedExonRanks(x, for.query.right.end=FALSE)
extractSkippedExonRanks(x, for.query.right.end=FALSE)

extractQueryStartInTranscript(query, subject, hits=NULL, ovenc=NULL,
                              flip.query.if.wrong.strand=FALSE,
                              for.query.right.end=FALSE)

## High-level convenience wrappers:

findCompatibleOverlaps(query, subject)
countCompatibleOverlaps(query, subject)

Arguments

x

For flipQuery: a GRangesList object.

For isCompatibleWithSplicing, isCompatibleWithSkippedExons, extractSteppedExonRanks, extractSpannedExonRanks, and extractSkippedExonRanks: an OverlapEncodings object, a factor or a character vector.

i

Subscript specifying the elements in x to flip. If missing, all the elements are flipped.

ovencA, ovencB, ovenc

OverlapEncodings objects.

query, subject

GRangesList objects except for findCompatibleOverlaps and countCompatibleOverlaps where query must be a GappedAlignments or GappedAlignmentPairs object.

hits

A Hits object. See ?`encodeOverlaps` for a description of how a supplied Hits object is handled.

flip.query.if.wrong.strand

See the "Overlap encodings" vignette in the GenomicRanges package.

query.strand, subject.strand

Vector-like objects containing the strand of the query and subject, respectively.

max.skipped.exons

Not supported yet. If NA (the default), the number of skipped exons must be 1 or more (there is no max).

for.query.right.end

If TRUE, then the information reported in the output is for the right ends of the paired-end reads. Using for.query.right.end=TRUE with single-end reads is an error.

Details

In the context of an RNA-seq experiment, encoding the overlaps between 2 GRangesList objects, one containing the reads (the query), and one containing the transcripts (the subject), can be used for detecting hits between reads and transcripts that are compatible with the splicing of the transcript.

The topic of working with overlap encodings is covered in details in the "Overlap encodings" vignette in the GenomicRanges package.

Author(s)

H. Pages

See Also

Examples

## Here we only show a simple example illustrating the use of
## countCompatibleOverlaps() on a very small data set. Please
## refer to the "Overlap encodings" vignette in the GenomicRanges
## package for a more comprehensive presentation of "overlap
## encodings" and related tools/concepts (e.g. "compatible"
## overlaps, "almost compatible" overlaps etc...), and for more
## examples.

## sm_treated1.bam contains a small subset of treated1.bam, a BAM
## file containing single-end reads from the "Pasilla" experiment
## (RNA-seq, Fly, see the pasilla data package for the details)
## and aligned to reference genome BDGP Release 5 (aka dm3 genome on
## the UCSC Genome Browser):
sm_treated1 <- system.file("extdata", "sm_treated1.bam",
                           package="GenomicRanges", mustWork=TRUE)

## Load the alignments:
library(Rsamtools)
flag0 <- scanBamFlag(isDuplicate=FALSE, isNotPassingQualityControls=FALSE)
param0 <- ScanBamParam(flag=flag0)
gal <- readGappedAlignments(sm_treated1, use.names=TRUE, param=param0)

## Load the transcripts (IMPORTANT: Like always, the reference genome
## of the transcripts must be *exactly* the same as the reference
## genome used to align the reads):
library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
exbytx <- exonsBy(TxDb.Dmelanogaster.UCSC.dm3.ensGene, by="tx", use.names=TRUE)

## Number of "compatible" transcripts per alignment in 'gal':
gal_ncomptx <- countCompatibleOverlaps(gal, exbytx)
mcols(gal)$ncomptx <- gal_ncomptx
table(gal_ncomptx)
mean(gal_ncomptx >= 1)
## --> 33% of the alignments in 'gal' are "compatible" with at least
## 1 transcript in 'exbytx'.

## Keep only alignments compatible with at least 1 transcript in
## 'exbytx':
compgal <- gal[gal_ncomptx >= 1]
head(compgal)

[Package GenomicRanges version 1.12.1 Index]