queryKNN {BiocNeighbors} | R Documentation
Find the k-nearest neighbors in one data set for each point in another query data set, using exact or approximate algorithms.
queryKNN(X, query, k, subset=NULL, get.index=TRUE, get.distance=TRUE, BPPARAM=SerialParam(), transposed=FALSE, ..., BNINDEX, BNPARAM)
X
A numeric data matrix where rows are points and columns are dimensions.
query
A numeric query matrix where rows are points and columns are dimensions.
k
An integer scalar for the number of nearest neighbors.
subset
A vector specifying the subset of points in query for which to find nearest neighbors.
get.index
A logical scalar indicating whether to return row indices of the neighbors.
get.distance
A logical scalar indicating whether to return distances to neighbors.
BPPARAM
A BiocParallelParam object for parallelization.
transposed
A logical scalar indicating whether the input is transposed, i.e., supplied with dimensions in rows and points in columns.
...
Further arguments to pass to specific methods.
BNINDEX
A BiocNeighborIndex object containing precomputed index information. This can be missing if BNPARAM is supplied.
BNPARAM
A BiocNeighborParam object specifying the algorithm to use. This can be missing if BNINDEX is supplied.
The classes of BNINDEX and BNPARAM determine dispatch to specific methods. Only one of these arguments needs to be defined to resolve dispatch. However, if both are defined, they cannot specify different algorithms.
If BNINDEX is supplied, X does not need to be specified. In fact, any value of X will be ignored, as all necessary information for the search is already present in BNINDEX. Similarly, any parameters in BNPARAM will be ignored.
If both BNINDEX and BNPARAM are missing, the function will default to the KMKNN algorithm by setting BNPARAM=KmknnParam().
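The dispatch rules above can be sketched as follows. This is an illustrative sketch only: the object names (by.default, by.param, by.index, pre, mismatch) and the simulated data are made up here, and the last call is merely expected to fail because the index and the parameter object name different algorithms.

```r
library(BiocNeighbors)

set.seed(100)
X <- matrix(rnorm(2000), ncol = 10)     # 200 reference points, 10 dimensions
query <- matrix(rnorm(500), ncol = 10)  # 50 query points

# Neither BNINDEX nor BNPARAM: falls back to BNPARAM=KmknnParam().
by.default <- queryKNN(X, query, k = 5)

# Dispatch resolved by BNPARAM alone.
by.param <- queryKNN(X, query, k = 5, BNPARAM = KmknnParam())

# Dispatch resolved by a prebuilt index; X is omitted because it would be ignored.
pre <- buildKmknn(X)
by.index <- queryKNN(query = query, k = 5, BNINDEX = pre)

# A KMKNN index paired with an Annoy parameter object: expected to raise an error.
mismatch <- try(
    queryKNN(query = query, k = 5, BNINDEX = pre, BNPARAM = AnnoyParam()),
    silent = TRUE
)
```

Building the index once and passing it as BNINDEX is the natural choice when the same X is queried repeatedly, since the index construction is not repeated for each call.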
A list is returned containing:
index, if get.index=TRUE. This is an integer matrix where each row corresponds to a point (denoted here as i) in query. The row for i contains the row indices of X that are the nearest neighbors to point i, sorted by increasing distance from i.
distance, if get.distance=TRUE. This is a numeric matrix where each row corresponds to a point (as above) and contains the sorted distances of the neighbors from i.
If subset is not NULL, each row of the above matrices refers to a point in the subset, in the same order as supplied in subset.
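To make the layout of the returned matrices concrete, here is a brute-force reference in base R. It is a sketch, not how BiocNeighbors performs the search: brute_knn is a hypothetical helper written for this illustration, and it assumes Euclidean distance.

```r
# Hypothetical brute-force helper: ranks every row of X by its
# Euclidean distance to each query point and keeps the closest k.
brute_knn <- function(X, query, k, subset = NULL) {
    if (!is.null(subset)) {
        # Rows of the output follow the order given in 'subset'.
        query <- query[subset, , drop = FALSE]
    }
    index <- matrix(0L, nrow(query), k)
    distance <- matrix(0, nrow(query), k)
    for (i in seq_len(nrow(query))) {
        # Distance from query point i to every row of X.
        d <- sqrt(colSums((t(X) - query[i, ])^2))
        o <- order(d)[seq_len(k)]
        index[i, ] <- o        # row indices of X, nearest first
        distance[i, ] <- d[o]  # matching distances, in increasing order
    }
    list(index = index, distance = distance)
}

set.seed(1)
X <- matrix(rnorm(200), ncol = 4)     # 50 points, 4 dimensions
query <- matrix(rnorm(40), ncol = 4)  # 10 query points

out <- brute_knn(X, query, k = 3)
sub <- brute_knn(X, query, k = 3, subset = c(7, 2))
```

Here the first row of sub$index describes query point 7 and the second row describes query point 2, mirroring the ordering rule for subset described above.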
Aaron Lun
queryKmknn, queryVptree, queryAnnoy and queryHnsw for specific methods.
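library(BiocNeighbors)

Y <- matrix(rnorm(100000), ncol=20)
Z <- matrix(rnorm(10000), ncol=20)

# Default (KMKNN) search, then an approximate search with Annoy.
str(k.out <- queryKNN(Y, Z, k=10))
str(a.out <- queryKNN(Y, Z, k=10, BNPARAM=AnnoyParam()))

# Reuse a prebuilt KMKNN index, alone and with a matching BNPARAM.
k.dex <- buildKmknn(Y)
str(k.out2 <- queryKNN(Y, Z, k=10, BNINDEX=k.dex))
str(k.out3 <- queryKNN(Y, Z, k=10, BNINDEX=k.dex, BNPARAM=KmknnParam()))

# Reuse a prebuilt Annoy index, alone and with a matching BNPARAM.
a.dex <- buildAnnoy(Y)
str(a.out2 <- queryKNN(Y, Z, k=10, BNINDEX=a.dex))
str(a.out3 <- queryKNN(Y, Z, k=10, BNINDEX=a.dex, BNPARAM=AnnoyParam()))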