Function for assigning clusters based on sequence similarity
Source: R/v0_seqCluster.R
Arguments
- .data
The data that was used to calculate the .dist object. Can be a data.frame, a data.table::data.table, or a list of these objects.
Every object must have columns in the immunarch-compatible format; see immunarch_data_format.
- .dist
List of distance objects produced by the seqDist function.
- .perc_similarity
Numeric value between 0 and 1 specifying the maximum acceptable weight of an edge in a graph. This threshold depends on the length of sequences.
- .nt_similarity
Numeric value between 0 and the sequence length, specifying a threshold that allows one mismatch in every n nucleotides of a sequence.
- .fixed_threshold
Numeric specifying the threshold on the maximum weight of an edge in a graph.
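To make the idea of a maximum edge weight concrete, here is a minimal base R sketch. It is only an illustration and is not necessarily how seqCluster works internally. It makes three assumptions: the sequences are made-up toy data, base R's adist() (Levenshtein distance) stands in for the output of seqDist, and clusters are taken to be groups of sequences linked by a chain of edges whose weight does not exceed the threshold. Under that last assumption, the groups are the same as single-linkage clusters cut at the threshold height.

```r
# Toy CDR3-like amino acid sequences (made up for illustration only)
seqs <- c("CASSLGTDTQYF", "CASSLGTDTQYW", "CASSLGADTQYW",
          "CASSPRGNEQFF", "CAWSVGQETQYF")

# Pairwise Levenshtein distances; a stand-in for a real distance object
d <- adist(seqs)
dimnames(d) <- list(seqs, seqs)

# Assumed rule: keep only edges with weight <= threshold and treat each
# connected group as a cluster. Single-linkage clustering cut at height
# `threshold` yields exactly those connected groups.
threshold <- 1
tree <- hclust(as.dist(d), method = "single")
clusters <- cutree(tree, h = threshold)

# Attach the labels as an extra 'Cluster' column, mirroring the return value
toy <- data.frame(CDR3.aa = seqs, Cluster = unname(clusters))
print(toy)
```

In this toy data the first and third sequences are two edits apart, yet they land in the same cluster because each is one edit away from the second sequence. Raising the threshold merges more sequences; lowering it to 0 groups only identical ones.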
Value
An object in the immdata data format: the same as .data, but with an extra 'Cluster' column containing the assigned clusters.
Examples
# \dontrun{
data(immdata)
# In this example, we use only 2 samples with 500 clonotypes each to save time
input_data <- lapply(immdata$data[1:2], head, 500)
dist_result <- seqDist(input_data)
cluster_result <- seqCluster(input_data, dist_result, .fixed_threshold = 1)
#> Error in mutate(.data, ..., .by = { { .by }}, .keep = .keep, .before = { { .before }}, .after = { { .after }}):
#> ℹ In argument: `length_value = map_chr(.y, ~ifelse(all(.x == .x[1]), yes = .x[1], no = glue("range_{min(.x)}:{max(.x)}")))`.
#> Caused by error in `vapply()`:
#> ! values must be type 'character',
#> but FUN(X[[1]]) result is type 'integer'
# }