`R/kmers.R`

`split_to_kmers.Rd`

Analysis of immune repertoire k-mer statistics: sequence profiles, etc.

split_to_kmers(.data, .k)

kmer_profile(.data, .method = c("freq", "prob", "wei", "self"), .remove.stop = TRUE)

Argument | Description |
---|---|
.data | Character vector or the output from |
.k | Integer. Size of k-mers. |
.method | Character vector of length one. If "freq", return a position frequency matrix (PFM) - a matrix with the occurrences of each amino acid at each position. If "prob", return a position probability matrix (PPM) - a matrix with the probabilities of occurrence of each amino acid at each position. This is the traditional representation of sequence motifs. If "wei", return a position weight matrix (PWM) - a matrix with log likelihoods of PPM elements. If "self", return a matrix with the self-information of elements in the PWM. For more information see https://en.wikipedia.org/wiki/Position_weight_matrix. |
.remove.stop | Logical. If TRUE (the default), remove stop codons. |
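The four `.method` options can be illustrated with a small base R sketch. This is not the package's implementation: the toy k-mers, the reduced three-letter alphabet, the uniform background used for the weights, and the choice of `-log2(p)` for self-information are all assumptions made for the example.

```r
# Toy 3-mers over a reduced alphabet (both are assumptions for illustration).
kmers <- c("CAS", "CAS", "CSS", "AAS")
alphabet <- c("A", "C", "S")

# One row per k-mer, one column per position.
chars <- do.call(rbind, strsplit(kmers, ""))

# PFM ("freq"): how often each letter occurs at each position.
pfm <- vapply(seq_len(ncol(chars)), function(j) {
  as.integer(table(factor(chars[, j], levels = alphabet)))
}, integer(length(alphabet)))
dimnames(pfm) <- list(alphabet, paste0("pos", seq_len(ncol(pfm))))

# PPM ("prob"): divide every column by its total so it sums to 1.
ppm <- sweep(pfm, 2, colSums(pfm), "/")

# PWM ("wei"): log2 ratio of each probability to a background probability.
# A uniform background is assumed here; zero probabilities give -Inf.
background <- 1 / length(alphabet)
pwm <- log2(ppm / background)

# Self-information ("self"), assumed here to be -log2(p) of each PPM
# element; zero probabilities give Inf.
self_info <- -log2(ppm)

print(pfm)
print(round(ppm, 2))
```

In practice a pseudocount is often added to the PFM before normalising so that the weights stay finite; whether the package does this is not stated here.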

`split_to_kmers`

- Data frame with two columns (kmers and their counts).
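As a rough picture of what such a two-column result looks like, the base R sketch below splits sequences into overlapping windows of width `k` and counts them. `count_kmers_sketch` and the column names `Kmer` and `Count` are hypothetical and chosen for illustration; they are not taken from the package.

```r
# Hypothetical helper, not part of the package: count overlapping k-mers.
count_kmers_sketch <- function(seqs, k) {
  # Sequences shorter than k contain no k-mer, so drop them (and NAs).
  seqs <- seqs[!is.na(seqs) & nchar(seqs) >= k]

  # Slide a window of width k along every remaining sequence.
  kmers <- unlist(lapply(seqs, function(s) {
    starts <- seq_len(nchar(s) - k + 1)
    substring(s, starts, starts + k - 1)
  }))

  # Tabulate the k-mers; table() orders them alphabetically.
  counts <- table(kmers)
  data.frame(
    Kmer = as.character(names(counts)),
    Count = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

res <- count_kmers_sketch(c("CASSL", "CASS", "CA"), 3)
print(res)
```

Here "CA" is skipped because it is shorter than three characters, while "CAS" and "ASS" each occur in both of the longer sequences.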

`kmer_profile`

- Matrix with per-position amino acid statistics.

data(immdata)
kmers <- getKmers(immdata$data[[1]], 5)
kmer_profile(kmers) %>% vis()
#> Warning: ggrepel: 20 unlabeled data points (too many overlaps). Consider increasing max.overlaps