kmerSplit {FindMyFriends} | R Documentation |
This function splits gene groups based on the cosine similarity of kmer
feature vectors. It uses hard splitting with a similarity cutoff:
similarities below the cutoff are discarded, and each resulting connected
component constitutes a new group. Unlike neighborhoodSplit, it cannot
force paralogues into separate groups, as the information needed for this
is not present.
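The splitting idea described above can be sketched in base R. This is a minimal illustration and not the FindMyFriends implementation: the helper names (kmerCounts, cosineSim, components, splitByKmer) are hypothetical, and how the package actually counts kmers, builds its feature vectors, or finds components is an assumption here.

```r
# Hypothetical helpers for illustration only; not part of FindMyFriends.

# Count overlapping kmers of length k in a single sequence.
kmerCounts <- function(seq, k) {
  stopifnot(nchar(seq) >= k)
  starts <- seq_len(nchar(seq) - k + 1)
  table(substring(seq, starts, starts + k - 1))
}

# Pairwise cosine similarity between kmer count vectors.
cosineSim <- function(counts) {
  kmers <- sort(unique(unlist(lapply(counts, names))))
  m <- matrix(0, nrow = length(counts), ncol = length(kmers),
              dimnames = list(names(counts), kmers))
  for (i in seq_along(counts)) {
    m[i, names(counts[[i]])] <- as.numeric(counts[[i]])
  }
  norms <- sqrt(rowSums(m^2))
  (m %*% t(m)) / (norms %o% norms)
}

# Label the connected components of a logical adjacency matrix
# using a breadth-first search.
components <- function(adj) {
  label <- integer(nrow(adj))
  current <- 0L
  for (start in seq_len(nrow(adj))) {
    if (label[start] != 0L) next
    current <- current + 1L
    label[start] <- current
    queue <- start
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      nb <- which(adj[node, ] & label == 0L)
      label[nb] <- current
      queue <- c(queue, nb)
    }
  }
  label
}

# Hard split: zero out similarities below lowerLimit, then treat each
# connected component of what remains as a new group.
splitByKmer <- function(seqs, k = 2, lowerLimit = 0.5) {
  sim <- cosineSim(lapply(seqs, kmerCounts, k = k))
  sim[sim < lowerLimit] <- 0
  setNames(components(sim > 0), names(seqs))
}

seqs <- c(a = "MKTAYIAKQR", b = "MKTAYIAKQK", c = "GGSGGSGGSG")
groups <- splitByKmer(seqs, k = 2, lowerLimit = 0.5)
```

In this toy input, a and b share eight of their nine 2-mers (cosine similarity 8/9) and therefore stay together, while c shares none with either and ends up in its own group.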
kmerSplit(object, ...)

## S4 method for signature 'pgVirtual'
kmerSplit(object, kmerSize, lowerLimit, maxLengthDif, pParam)
object |
A pgVirtual subclass |
... |
Arguments passed on |
kmerSize |
The length of kmers used for sequence similarity |
lowerLimit |
The lower limit of sequence similarity below which it will be set to 0 |
maxLengthDif |
The maximum deviation in sequence length to allow. A value between 0 and 1 is interpreted as a fraction (relative difference); a value above 1 is interpreted as a fixed length difference |
pParam |
An optional BiocParallelParam object that defines the workers used for parallelisation. |
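The two interpretations of maxLengthDif can be illustrated with a small helper. This is a sketch under stated assumptions, not code from the package: lengthCompatible is a hypothetical name, the fractional limit is assumed to be measured against the longer sequence, and a value of exactly 1 is assumed to count as a fixed length.

```r
# Hypothetical helper for illustration only; not part of FindMyFriends.
# Assumption: a fractional limit is relative to the longer sequence.
lengthCompatible <- function(len1, len2, maxLengthDif) {
  dif <- abs(len1 - len2)
  if (maxLengthDif < 1) {
    # Fractional limit: compare the relative length difference
    dif / pmax(len1, len2) <= maxLengthDif
  } else {
    # Fixed limit: compare the absolute length difference
    dif <= maxLengthDif
  }
}

relOk   <- lengthCompatible(100, 95, 0.1)  # relative difference of 0.05
relBad  <- lengthCompatible(100, 80, 0.1)  # relative difference of 0.20
fixedOk <- lengthCompatible(100, 80, 25)   # absolute difference of 20
```

Under these assumptions, a limit of 0.1 accepts the pair of lengths 100 and 95 but rejects 100 and 80, whereas a fixed limit of 25 accepts 100 and 80.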
A new pgVirtual subclass object of the same class as 'object'
pgVirtual: Kmer similarity based group splitting for pgVirtual subclasses
Other group-splitting: neighborhoodSplit
# Get a grouped pangenome
pg <- .loadPgExample(withGroups = TRUE)

## Not run: 
# Split groups by similarity (Too heavy to include)
pg <- kmerSplit(pg, lowerLimit = 0.8)

## End(Not run)