nearest-methods {GenomicRanges} | R Documentation |
The nearest
, precede
, follow
, distance
and distanceToNearest
methods for GenomicRanges
objects and subclasses.
## S4 method for signature 'GenomicRanges,GenomicRanges' precede(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE, ...) ## S4 method for signature 'GenomicRanges,missing' precede(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE, ...) ## S4 method for signature 'GenomicRanges,GenomicRanges' follow(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE, ...) ## S4 method for signature 'GenomicRanges,missing' follow(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE, ...) ## S4 method for signature 'GenomicRanges,GenomicRanges' nearest(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE, ...) ## S4 method for signature 'GenomicRanges,missing' nearest(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE, ...) ## S4 method for signature 'GenomicRanges,GenomicRanges' distanceToNearest(x, subject, ignore.strand=FALSE, ...) ## S4 method for signature 'GenomicRanges,missing' distanceToNearest(x, subject, ignore.strand=FALSE, ...) ## S4 method for signature 'GenomicRanges,GenomicRanges' distance(x, y, ignore.strand=FALSE, ...)
x |
The query GenomicRanges instance. |
subject |
The subject GenomicRanges instance
within which the nearest neighbors are found. Can be missing,
in which case |
y |
For the |
select |
Logic for handling ties. By default, all methods
select a single interval (arbitrary for When |
ignore.strand |
A |
... |
Additional arguments for methods. |
nearest:
Performs conventional nearest neighbor finding.
Returns an integer vector containing the index of the nearest neighbor
range in subject
for each range in x
. If there is no
nearest neighbor NA
is returned. For details of the algorithm
see the man page in IRanges, ?nearest
.
precede:
For each range in x
, precede
returns
the index of the range in subject
that is directly
preceded by the range in x
. Overlapping ranges are excluded.
NA
is returned when there are no qualifying ranges in
subject
.
follow:
The opposite of precede
, follow
returns
the index of the range in subject
that is directly followed by the
range in x
. Overlapping ranges are excluded. NA
is returned
when there are no qualifying ranges in subject
.
Orientation and Strand:
The relevant orientation for precede
and follow
is 5' to 3', consistent with the direction of translation.
Because positional numbering along a chromosome is from left to
right and transcription takes place from 5' to 3', precede
and
follow
can appear to have ‘opposite’ behavior on the +
and -
strand. Using positions 5 and 6 as an example, 5 precedes
6 on the +
strand but follows 6 on the -
strand.
A range with strand *
can be compared to ranges on either the
+
or -
strand. Below we outline the priority when ranges
on multiple strands are compared. When ignore.strand=TRUE
all
ranges are treated as if on the +
strand.
x on + strand can match to ranges on both + and * strands. In the case of a tie the first range by order is chosen.
x on - strand can match to ranges on both - and * strands. In the case of a tie the first range by order is chosen.
x on * strand can match to ranges on any of +, - or * strands. In the case of a tie the first range by order is chosen.
distanceToNearest: Returns the distance for each range in x
to its nearest neighbor in the subject
.
distance:
Returns the distance for each range in x
to the range in y
.
The behavior of distance
has changed in Bioconductor 2.12.
See the man page ?distance
in IRanges for details.
For nearest
, precede
and follow
, an integer
vector of indices in subject
, or a Hits
if
select="all"
.
For distanceToNearest
, a Hits
object with a column for
the query
index (queryHits), subject
index (subjectHits)
and the distance
between the pair.
For distance
, an integer vector of distances between the ranges
in x
and y
.
P. Aboyoun and V. Obenchain <vobencha@fhcrc.org>
The GenomicRanges and GRanges classes.
The nearest-methods man page in the IRanges package.
findOverlaps-methods for finding just the overlapping ranges.
## ----------------------------------------------------------- ## precede() and follow() ## ----------------------------------------------------------- query <- GRanges("A", IRanges(c(5, 20), width=1), strand="+") subject <- GRanges("A", IRanges(rep(c(10, 15), 2), width=1), strand=c("+", "+", "-", "-")) precede(query, subject) follow(query, subject) strand(query) <- "-" precede(query, subject) follow(query, subject) ## ties choose first in order query <- GRanges("A", IRanges(10, width=1), c("+", "-", "*")) subject <- GRanges("A", IRanges(c(5, 5, 5, 15, 15, 15), width=1), rep(c("+", "-", "*"), 2)) precede(query, subject) precede(query, rev(subject)) ## ignore.strand=TRUE treats all ranges as '+' precede(query[1], subject[4:6], select="all", ignore.strand=FALSE) precede(query[1], subject[4:6], select="all", ignore.strand=TRUE) ## ----------------------------------------------------------- ## nearest() ## ----------------------------------------------------------- ## When multiple ranges overlap an "arbitrary" range is chosen query <- GRanges("A", IRanges(5, 15)) subject <- GRanges("A", IRanges(c(1, 15), c(5, 19))) nearest(query, subject) ## select="all" returns all hits nearest(query, subject, select="all") ## Ranges in 'x' will self-select when 'subject' is present query <- GRanges("A", IRanges(c(1, 10), width=5)) nearest(query, query) ## Ranges in 'x' will not self-select when 'subject' is missing nearest(query) ## ----------------------------------------------------------- ## distance(), distanceToNearest() ## ----------------------------------------------------------- ## Adjacent, overlap, separated by 1 query <- GRanges("A", IRanges(c(1, 2, 10), c(5, 8, 11))) subject <- GRanges("A", IRanges(c(6, 5, 13), c(10, 10, 15))) distance(query, subject) ## recycling distance(query[1], subject) ## zero-width ranges zw <- GRanges("A", IRanges(4,3)) stopifnot(distance(zw, GRanges("A", IRanges(3,4))) == 0L) sapply(-3:3, function(i) distance(shift(zw, i), GRanges("A", IRanges(4,3)))) query <- GRanges(c("A", "B"), IRanges(c(1, 5), width=1)) distanceToNearest(query, subject)