R/kmers.R
split_to_kmers.Rd
Analyze k-mer statistics of immune repertoires, such as per-position sequence profiles.
split_to_kmers(.data, .k)
kmer_profile(.data, .method = c("freq", "prob", "wei", "self"), .remove.stop = TRUE)
.data
Character vector or the output from getKmers.
.k
Integer. Size of k-mers.
.method
Character vector of length one. If "freq", returns a position frequency matrix (PFM): a matrix with the number of occurrences of each amino acid at each position.
If "prob", returns a position probability matrix (PPM): a matrix with the probability of each amino acid at each position. This is the traditional representation of sequence motifs.
If "wei", returns a position weight matrix (PWM): a matrix with log likelihoods of the PPM elements.
If "self", returns a matrix with the self-information of the elements in the PWM.
For more information, see https://en.wikipedia.org/wiki/Position_weight_matrix.
.remove.stop
Logical. If TRUE (the default), stop codons are removed.
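The relationship between the profile types can be sketched in base R. This is a minimal illustration on toy data, not the package implementation: it assumes a uniform background distribution and log base 2 for the PWM, and it uses the textbook definition of self-information, -log2(p). The exact conventions used by kmer_profile may differ.

```r
# Toy k-mers (k = 3); illustrative data, not taken from immdata.
kmers <- c("CAS", "CAS", "CSS", "AAS")
alphabet <- sort(unique(unlist(strsplit(kmers, ""))))

# Character matrix: one row per k-mer, one column per position.
chars <- do.call(rbind, strsplit(kmers, ""))

# PFM: rows are symbols, columns are positions, entries are counts.
pfm <- sapply(seq_len(ncol(chars)), function(j) {
  as.vector(table(factor(chars[, j], levels = alphabet)))
})
rownames(pfm) <- alphabet

# PPM: divide each column by its total so that every column sums to 1.
ppm <- sweep(pfm, 2, colSums(pfm), "/")

# PWM (assumption: uniform background, log base 2).
# Zero probabilities give -Inf; pseudocounts are a common remedy.
background <- 1 / length(alphabet)
pwm <- log2(ppm / background)

# Self-information (assumption: textbook definition, -log2(p)).
self_info <- -log2(ppm)

pfm
ppm
```

In this sketch a symbol that never occurs at a position has probability 0, so its PWM entry is -Inf and its self-information is Inf.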
split_to_kmers
- Data frame with two columns (k-mers and their counts).
kmer_profile
- Matrix with per-position amino acid statistics.
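To illustrate what a table of k-mers and their counts looks like, the following base R sketch splits sequences into overlapping k-mers and tabulates them. The helper count_kmers and the column names Kmer and Count are hypothetical and chosen for illustration only; the sketch counts every occurrence once and is not the package implementation.

```r
# Hypothetical helper for illustration; not the package implementation.
count_kmers <- function(seqs, k) {
  # Keep only sequences long enough to contain at least one k-mer.
  seqs <- seqs[!is.na(seqs) & nchar(seqs) >= k]

  # For each sequence, take every overlapping substring of length k.
  kmers <- unlist(lapply(seqs, function(s) {
    starts <- seq_len(nchar(s) - k + 1)
    substring(s, starts, starts + k - 1)
  }))

  # Tabulate the k-mers and return a two-column data frame.
  counts <- table(kmers)
  data.frame(
    Kmer = names(counts),
    Count = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

res <- count_kmers(c("CASSL", "CASSF"), 3)
res
```

Here both input sequences contribute the 3-mers CAS and ASS, so each of those has a count of 2, while SSL and SSF each occur once.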
data(immdata)
kmers <- getKmers(immdata$data[[1]], 5)
kmer_profile(kmers) %>% vis()
#> Warning: Warning: removed 5 non-amino acid symbol(s): A
#> Please make sure your data doesn't have them in the future.
#> Warning: Removed 5 rows containing missing values (`geom_point()`).
#> Warning: Removed 5 rows containing missing values (`geom_label_repel()`).
#> Warning: ggrepel: 20 unlabeled data points (too many overlaps). Consider increasing max.overlaps