GAlignmentsList-class {GenomicRanges} | R Documentation |
The GAlignmentsList class is a container for storing a collection of GappedAlignments objects.
A GAlignmentsList object contains a list of GappedAlignments. The
majority of operations on this page are described in more detail
on the GappedAlignments man page, see ?GappedAlignments
.
GAlignmentsList(...)
:
Creates a GAlignmentsList from a list of GappedAlignments objects.
readGAlignmentsList(file, format="BAM", use.names=FALSE, ...)
:
Read a file containing aligned reads as a GAlignmentsList object.
Note that this function is just a front-end that delegates to the
readBamGAlignmentsList
function defined
in the Rsamtools package.
See ?readBamGAlignmentsList
for
more information.
The param
and asProperPairs
arguments described on the
readBamGAlignmentsList
man page fine tune the
records are returned. param
is specifed by the standard
ScanBamParam()
options. When asProperPairs=TRUE
(default),
only the proper pairs are returned. This is the same QC filtering that
is done by readBamGappedAlignmentPairs
but the records are
returned in a GAlignmentsList
instead of
GappedAlignmentsPairs
object. If asProperPairs=FALSE
, all
records in the file are retained (i.e., no QC) and the list structure is
split by read id (QNAME in SAM/BAM). This format allows the flexibility
to retain singletons, pairs, and multi-fragment reads in a single class.
makeGAlignmentsListFromFeatureFragments(seqnames=Rle(factor()),
fragmentPos=list(),
fragmentCigar=list(),
strand=character(0),
sep=",")
:
Constructs a GAlignmentsList from a list of fragmented features.
In the code snippets below, x
is a GAlignmentsList object.
length(x)
:
Return the number of elements in x
.
names(x)
, names(x) <- value
:
Get or set the names of the elements of x
.
seqnames(x)
, seqnames(x) <- value
:
Get or set the name of the reference sequences of the
alignments in each element of x
.
rname(x)
, rname(x) <- value
:
Same as seqnames(x)
and seqnames(x) <- value
.
strand(x)
, strand(x) <- value
:
Get or set the strand of the alignments in each element
of x
.
cigar(x)
:
Returns a character list of length length(x)
containing the CIGAR string for the alignments in
each element of x
.
qwidth(x)
:
Returns an integer list of length length(x)
containing the length of the alignments in each element of
x
*after* hard clipping (i.e. the length of the
query sequence that is stored in the corresponding SAM/BAM record).
start(x)
, end(x)
:
Returns an integer list of length length(x)
containing the "start" and "end" (respectively) of the
alignments in each element of x
.
width(x)
:
Returns an integer list of length length(x)
containing
the "width" of the alignments in each element of x
.
ngap(x)
:
Returns an integer list of length x
containing the number
of gaps (i.e. N operations in the CIGAR) for the alignments
in each element of x
.
seqinfo(x)
, seqinfo(x) <- value
:
Get or set the information about the underlying sequences in each
element of x
. value
must be a list of Seqinfo
objects.
seqlevels(x)
, seqlevels(x) <- value
:
Get or set the sequence levels of the alignments in each element
of x
.
seqlengths(x)
, seqlengths(x) <- value
:
Get or set the sequence lengths for each element of x
.
seqlengths(x)
is equivalent to seqlengths(seqinfo(x))
.
value
can be a named non-negative integer or numeric vector
eventually with NAs.
isCircular(x)
, isCircular(x) <- value
:
Get or set the circularity flags for the alignments in each
element in x
. value
must be a named logical list
eventually with NAs.
genome(x)
, genome(x) <- value
:
Get or set the genome identifier or assembly name for the alignments
in each element of x
. value
must be a named character
list eventually with NAs.
seqnameStyle(x)
:
Get or set the seqname style for alignments in each element of x
.
In the code snippets below, x
is a GAlignmentsList object.
granges(x, ignore.strand=FALSE)
,
ranges(x)
:
Return either a GRanges or a IRanges
object of length length(x)
. Note this coercion IGNORES
the cigar information. The resulting ranges span the entire
range, including any gaps or spaces between paired-end reads.
The granges
coercion supports ignore.strand
to
allow ranges of opposite strand to be combined (see examples).
grglist(x)
,
rglist(x)
:
Return either a GRangesList or a IRangesList
object of length length(x)
. This coercion RESPECTS the cigar
information. The resulting ranges are fragments of the original ranges
that do not include gaps or spaces between paired-end reads.
as(x, "GRangesList")
, as(x, "GRanges")
,
as(x, "RangesList")
, as(x, "Ranges")
:
Alternate ways of doing grglist(x)
, granges(x)
,
rglist(x)
, ranges(x)
, respectively.
In the code snippets below, x
is a GAlignmentsList object.
x[i]
, x[i] <- value
:
Get or set list elements i
. i
can be a numeric
or logical vector. value
must be a GappedAlignments.
x[[i]]
, x[[i]] <- value
:
Same as x[i]
, x[i] <- value
.
x[i, j]
, x[i, j] <- value
:
Get or set list elements i
with optional metadata columns
j
. i
can be a numeric, logical or missing.
value
must be a GappedAlignments.
c(...)
:
Concatenates the GAlignmentsList objects in ...
.
In the code snippets below, x
is a GAlignmentsList object.
qnarrow(x, start=NA, end=NA, width=NA)
:
Return a new GAlignmentsList object of the same length as x
describing how the narrowed query sequences align to the reference.
The start
/end
/width
arguments describe how
to narrow the query sequences. They must be vectors of integers.
NAs and negative values are accepted and "solved" according to the
rules of the SEW (Start/End/Width) interface (see
?solveUserSEW
for the details).
Valerie Obenchain <vobencha@fhcrc.org
http://samtools.sourceforge.net/
gal1 <- GappedAlignments( seqnames=Rle(factor(c("chr1", "chr2", "chr1", "chr3")), c(1, 3, 2, 4)), pos=1:10, cigar=paste0(10:1, "M"), strand=Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)), names=head(letters, 10), score=1:10) gal2 <- GappedAlignments( seqnames=Rle(factor(c("chr2", "chr4")), c(3, 4)), pos=1:7, cigar=c("5M", "3M2N3M2N3M", "5M", "10M", "5M1N4M", "8M2N1M", "5M"), strand=Rle(strand(c("-", "+")), c(4, 3)), names=tail(letters, 7), score=1:7) galist <- GAlignmentsList(noGaps=gal1, Gaps=gal2) ## --------------------------------------------------------------------- ## A. BASIC MANIPULATION ## --------------------------------------------------------------------- length(galist) names(galist) seqnames(galist) strand(galist) head(cigar(galist)) head(qwidth(galist)) head(start(galist)) head(end(galist)) head(width(galist)) head(ngap(galist)) seqlevels(galist) ## Rename the reference sequences: seqlevels(galist) <- sub("chr", "seq", seqlevels(galist)) seqlevels(galist) grglist(galist) # a GRangesList object rglist(galist) # an IRangesList object ## --------------------------------------------------------------------- ## B. SUBSETTING ## --------------------------------------------------------------------- galist[strand(galist) == "-"] gaps <- sapply(galist, function(x) any(grepl("N", cigar(x), fixed=TRUE))) galist[gaps] ## Different ways to subset: galist[2] # a GappedAlignments object of length 1 galist[[2]] # a GappedAlignments object of length 1 grglist(galist[2]) # a GRangesList object of length 1 rglist(galist[2]) # a NormalIRangesList object of length 1 ## --------------------------------------------------------------------- ## C. mcols()/elementMetadata() ## --------------------------------------------------------------------- ## Metadata can be defined on the individual GappedAlignment elements ## and the overall GAlignmentsList object. By default, 'level=between' ## extracts the GALignmentsList metadata. Using 'level=within' ## will extract the metadata on the individual GappedAliginments. mcols(galist) ## no metadata on the GAlignmentsList object mcols(galist, level="within") ## --------------------------------------------------------------------- ## D. readBamGAlignmentsList() ## --------------------------------------------------------------------- library(pasillaBamSubset) library(Rsamtools) fl <- sortBam(untreated3_chr4(), tempfile(), byQname=TRUE) bf <- BamFile(fl, index=character(0), yieldSize=50, obeyQname=TRUE) galist <- readGAlignmentsList(bf, asProperPairs=FALSE) galist ## --------------------------------------------------------------------- ## E. COERCION ## --------------------------------------------------------------------- ## The granges() coercion supports 'ignore.strand' to allow ranges ## from different strand to be combined. In this example paired-end ## reads aligned to opposite strands were read into a GAlignmentsList. ## If the desired operation is to combine these ranges, reguardless of ## gaps or the space between pairs, 'ignore.strand' must be TRUE. granges(galist[1], ignore.strand=TRUE) granges(galist[1], ignore.strand=FALSE) ## grglist() splits ranges by gap and the space bein each list element galist <- GAlignmentsList(noGaps=gal1, Gaps=gal2) grglist(galist)