GenomicRanges-comparison {GenomicRanges}R Documentation

Comparing and ordering genomic ranges

Description

Methods for comparing and ordering the elements in one or more GenomicRanges objects.

Usage

## Element-wise (aka "parallel") comparison of 2 GenomicRanges objects
## -------------------------------------------------------------------

## S4 method for signature 'GenomicRanges,GenomicRanges'
e1 == e2

## S4 method for signature 'GenomicRanges,GenomicRanges'
e1 <= e2

## duplicated()
## ------------

## S4 method for signature 'GenomicRanges'
duplicated(x, incomparables=FALSE, fromLast=FALSE,
           method=c("auto", "quick", "hash"))

## match()
## -------

## S4 method for signature 'GenomicRanges,GenomicRanges'
match(x, table, nomatch=NA_integer_, incomparables=NULL,
      method=c("auto", "quick", "hash"), ignore.strand=FALSE, match.if.overlap=FALSE)

## order() and related methods
## ----------------------------

## S4 method for signature 'GenomicRanges'
order(..., na.last=TRUE, decreasing=FALSE)

## S4 method for signature 'GenomicRanges'
rank(x, na.last=TRUE,
     ties.method=c("average", "first", "random", "max", "min"))

## Generalized element-wise (aka "parallel") comparison of 2 GenomicRanges
## objects
## ------------------------------------------------------------------------

## S4 method for signature 'GenomicRanges,GenomicRanges'
compare(x, y)

Arguments

e1, e2, x, table, y

GenomicRanges objects.

incomparables

Not supported.

fromLast, method, nomatch

See ?`Ranges-comparison` in the IRanges package for a description of these arguments.

ignore.strand

Whether or not the strand should be ignored when comparing 2 genomic ranges.

match.if.overlap

You temporarily need to explicitly provide this argument otherwise match() will issue a warning.

Starting with BioC 2.12, the default behavior of match() on GenomicRanges objects has changed to use equality instead of overlap for comparing elements between GenomicRanges objects x and table. Now x[i] and table[j] are considered to match when they are equal (i.e. x[i] == table[j]), instead of when they overlap. This new behavior is consistent with base::match().

If you need the new behavior, use match.if.overlap=FALSE.

If you need the old behavior, you can either do: findOverlaps(x, table, select="first") which is the recommended way. Alternatively, you can call match() with match.if.overlap=TRUE.

...

Additional GenomicRanges objects used for breaking ties.

na.last

Ignored.

decreasing

TRUE or FALSE.

ties.method

A character string specifying how ties are treated. Only "first" is supported for now.

Details

Two elements of a GenomicRanges object (i.e. two genomic ranges) are considered equal iff they are on the same underlying sequence and strand, and have the same start and width. duplicated() and unique() on a GenomicRanges object are conforming to this.

The "natural order" for the elements of a GenomicRanges object is to order them (a) first by sequence level, (b) then by strand, (c) then by start, (d) and finally by width. This way, the space of genomic ranges is totally ordered. Note that the reduce method for GenomicRanges uses this "natural order" implicitly. Also, note that, because we already do (c) and (d) for regular ranges (see ?`Ranges-comparison`), genomic ranges that belong to the same underlying sequence and strand are ordered like regular ranges. order(), sort(), and rank() on a GenomicRanges object are using this "natural order".

Also the ==, !=, <=, >=, < and > operators between 2 GenomicRanges objects are using this "natural order".

Author(s)

H. Pages

See Also

Examples

gr0 <- GRanges(
        seqnames=Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
        ranges=IRanges(c(1:9,7L), end=10),
        strand=Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),
        seqlengths=c(chr1=11, chr2=12, chr3=13))
gr <- c(gr0, gr0[7:3])

## ---------------------------------------------------------------------
## A. ELEMENT-WISE (AKA "PARALLEL") COMPARISON OF 2 GenomicRanges OBJECTS
## ---------------------------------------------------------------------
gr[2] == gr[2]  # TRUE
gr[2] == gr[5]  # FALSE
gr == gr[4]
gr >= gr[3]

## ---------------------------------------------------------------------
## B. duplicated(), unique()
## ---------------------------------------------------------------------
duplicated(gr)
unique(gr)

## ---------------------------------------------------------------------
## C. match(), %in%
## ---------------------------------------------------------------------
table <- gr[1:7]
match(gr, table)  # Warning! The warning will be removed in BioC 2.13.

## In the meantime, specify 'match.if.overlap=FALSE' to suppress the warning:
match(gr, table, match.if.overlap=FALSE)
match(gr, table, ignore.strand=TRUE, match.if.overlap=FALSE)

gr %in% table  # Warning! The warning will be removed in BioC 2.13.
## In the meantime, use suppressWarnings() to suppress the warning:
suppressWarnings(gr %in% table)

## ---------------------------------------------------------------------
## D. findMatches(), countMatches()
## ---------------------------------------------------------------------
findMatches(gr, table)
countMatches(gr, table)

findMatches(gr, table, ignore.strand=TRUE)
countMatches(gr, table, ignore.strand=TRUE)

gr_levels <- unique(gr)
countMatches(gr_levels, gr)

## ---------------------------------------------------------------------
## E. order() AND RELATED METHODS
## ---------------------------------------------------------------------
order(gr)
sort(gr)
rank(gr)

## ---------------------------------------------------------------------
## F. GENERALIZED ELEMENT-WISE COMPARISON OF 2 GenomicRanges OBJECTS
## ---------------------------------------------------------------------
gr2 <- GRanges(c(rep("chr1", 12), "chr2"), IRanges(c(1:11, 6:7), width=3))
strand(gr2)[12] <- "+"
gr3 <- GRanges("chr1", IRanges(5, 9))

compare(gr2, gr3)
rangeComparisonCodeToLetter(compare(gr2, gr3))

[Package GenomicRanges version 1.12.1 Index]