findNeighbors {BiocNeighbors}    R Documentation

Find all neighbors

Description

Find all neighboring data points within a certain distance with the KMKNN algorithm.

Usage

findNeighbors(X, threshold, get.index=TRUE, get.distance=TRUE, 
    BPPARAM=SerialParam(), precomputed=NULL, subset=NULL, 
    raw.index=FALSE, ...)

Arguments

X

A numeric matrix where rows correspond to data points and columns correspond to variables (i.e., dimensions).

threshold

A positive numeric scalar specifying the maximum distance at which a point is considered a neighbor.

get.index

A logical scalar indicating whether the indices of the neighbors should be recorded.

get.distance

A logical scalar indicating whether distances to the neighbors should be recorded.

BPPARAM

A BiocParallelParam object indicating how the search should be parallelized.

precomputed

A KmknnIndex object from running buildKmknn on X.

subset

A vector indicating the rows of X for which the neighbors should be identified.

raw.index

A logical scalar indicating whether raw column indices to precomputed$data should be returned.

...

Further arguments to pass to buildKmknn if precomputed=NULL.

Details

This function uses the same algorithm described in findKmknn to identify all points in X that lie within threshold of each point in X. For Euclidean distances, this is equivalent to identifying all points in a hypersphere centered around the point of interest.
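As a point of reference, the hypersphere definition can be written as a brute-force search in base R. This is only a sketch of what a threshold search means: it does not use the KMKNN algorithm, the variable names are illustrative, and treating a distance exactly equal to the threshold as "within" is an assumption.

```r
# Brute-force threshold search with Euclidean distances (base R only).
set.seed(100)
X <- matrix(runif(200), ncol=2)   # 100 points in 2 dimensions
threshold <- 0.1

# All pairwise Euclidean distances as a 100 x 100 matrix.
D <- as.matrix(dist(X))

# For each point i, the row indices of all points inside the
# hypersphere of radius 'threshold' centered on i.
brute <- lapply(seq_len(nrow(X)), function(i) {
    unname(which(D[i,] <= threshold))
})

# Number of points found around the first few query points.
lengths(brute)[1:5]
```

Each point is at distance zero from itself, so in this sketch it always appears in its own neighbor set.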

By default, a search is performed for each data point in X, but it can be limited to a specified subset of points with subset. This yields the same result as (but is more efficient than) subsetting the output matrices after running findNeighbors with subset=NULL.
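The equivalence described above can be checked with a short sketch. The rows in chosen are arbitrary, and the comparison assumes that the index element of the output holds one entry per query point that can be extracted by position.

```r
library(BiocNeighbors)

set.seed(1)
Y <- matrix(runif(10000), ncol=20)   # 500 points in 20 dimensions
chosen <- c(10, 1, 250)              # arbitrary rows of interest

# Search only for the chosen points ...
sub <- findNeighbors(Y, threshold=1, subset=chosen)

# ... versus searching for every point and subsetting afterwards.
full <- findNeighbors(Y, threshold=1)

# Compare the neighbor sets; they are sorted first in case the
# order of neighbors within a set differs between runs.
same <- mapply(function(a, b) identical(sort(a), sort(b)),
    sub$index, full$index[chosen])
all(same)
```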

Setting get.index=FALSE or get.distance=FALSE may provide a slight speed boost when the corresponding values are not of interest. Supplying a parallel BPPARAM splits the search across query points, which usually yields a roughly linear speed-up in the number of workers.
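A minimal sketch of both options follows. The choice of SnowParam with two workers is an assumption made for illustration; any BiocParallelParam object should be usable in its place.

```r
library(BiocNeighbors)
library(BiocParallel)

set.seed(2)
Y <- matrix(runif(10000), ncol=20)   # 500 points in 20 dimensions

# Skip the distances when only the neighbor identities are needed.
idx.only <- findNeighbors(Y, threshold=1, get.distance=FALSE)
names(idx.only)

# Split the query points across two workers; the worker count is
# arbitrary and should be tuned to the available hardware.
par.out <- findNeighbors(Y, threshold=1, BPPARAM=SnowParam(workers=2))
```

For small inputs such as this one, the overhead of starting the workers may outweigh any gain from parallelization.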

If multiple queries are to be performed on the same X, it may be beneficial to use buildKmknn directly and pass the result to precomputed. In such cases, it is also possible to set raw.index=TRUE to obtain indices of neighbors in the reordered data set in precomputed, though this will change both the nature of the output index and the interpretation of subset; see ?findKmknn for details.
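Reusing a precomputed index might look like the sketch below, which assumes the interface documented on this page. The two thresholds are arbitrary, and raw.index is left at its default so that the reported indices still refer to rows of the original matrix.

```r
library(BiocNeighbors)

set.seed(3)
Y <- matrix(runif(10000), ncol=20)   # 500 points in 20 dimensions

# Build the index once ...
pre <- buildKmknn(Y)

# ... then reuse it for searches at several thresholds, avoiding
# the cost of rebuilding the index on every call.
out.small <- findNeighbors(Y, threshold=0.8, precomputed=pre)
out.large <- findNeighbors(Y, threshold=1.2, precomputed=pre)

# A larger threshold can only keep or add neighbors for each point.
summary(lengths(out.large$index) - lengths(out.small$index))
```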

Value

A list is returned containing:

If subset is not NULL, each row of the above matrices refers to a point in the subset, in the same order as supplied in subset.

If raw.index=TRUE, the values in index refer to columns of KmknnIndex_clustered_data(precomputed).

Author(s)

Aaron Lun

See Also

buildKmknn to build an index ahead of time.

Examples

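# 5000 points in 20 dimensions, drawn uniformly at random.
Y <- matrix(runif(100000), ncol=20)
# Find all neighbors within a distance of 1 of each point.
out <- findNeighbors(Y, threshold=1)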

[Package BiocNeighbors version 1.0.0 Index]