Sample epitopes based on epitope probabilities. Note that the positions returned assume that the start of the amino acid sequence is also the start of the founder sequence in the simulation. We also assume that there are no frameshift mutations in the founder sequence.
Usage
sample_epitopes(
epitope_probabilities,
start_aa_pos = 0,
end_aa_pos = NULL,
num_epitopes = 10,
aa_epitope_length = 10,
max_fit_cost = 0.3,
max_resamples = 100,
ref_founder_map = NULL
)
Arguments
- epitope_probabilities
Epitope probability tibble as output by get_epitope_frequencies(), including columns aa_position and epitope_probability. aa_position should be indexed at 0
- start_aa_pos
Starting amino acid position to consider for epitopes, indexed at 0 (default: 0, i.e. the first position)
- end_aa_pos
Ending amino acid position to consider for epitopes, indexed at 0 (default: NULL, i.e. through the final position in epitope_probabilities$aa_position)
- num_epitopes
Number of epitopes to sample (default: 10)
- aa_epitope_length
Epitope length in amino acids (default: 10)
- max_fit_cost
Maximum fitness cost of an epitope. Must be in the range [0, 1), where 0 indicates no cost; 1, which indicates no ability to survive, is not allowed (default: 0.3). Note that the model output is very sensitive to this parameter.
- max_resamples
Maximum number of resampling events to attempt; this is to prevent an infinite loop (default: 100)
- ref_founder_map
Output from map_ref_founder(), including nucleotide reference and founder positions (ref_pos and founder_pos). NOTE: The reference positions here, if converted to amino acid positions, are expected to match the reference positions in epitope_probabilities. Further, we assume that the founder indices align with the founder sequence positions to be used in the simulation (default: NULL)
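The interaction between the probability-weighted draw and max_resamples can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name sketch_sample_starts is hypothetical, and the acceptance rule (redraw whenever two epitopes overlap) is an assumed reading of what a "resampling event" means.

```r
# Hypothetical helper, not part of the package: draw epitope start positions
# with probability weights and retry a bounded number of times.
sketch_sample_starts <- function(aa_position, epitope_probability,
                                 num_epitopes = 10, aa_epitope_length = 10,
                                 max_resamples = 100) {
  stopifnot(length(aa_position) == length(epitope_probability),
            num_epitopes <= length(aa_position))
  for (attempt in 0:max_resamples) {
    # Weighted draw without replacement over candidate start positions
    idx <- sample.int(length(aa_position), size = num_epitopes,
                      prob = epitope_probability)
    starts <- sort(aa_position[idx])
    # Assumed acceptance rule: no two epitopes may overlap
    if (all(diff(starts) >= aa_epitope_length)) {
      message(attempt, " resamples required")
      return(starts)
    }
  }
  # The bound on attempts is what prevents an infinite loop
  stop("No valid epitope set found within max_resamples attempts")
}

set.seed(1)
starts <- sketch_sample_starts(
  aa_position = 0:499,                      # 0-indexed amino acid positions
  epitope_probability = rep(1 / 500, 500),  # uniform weights for illustration
  num_epitopes = 5
)
```

With a small window or a large num_epitopes, a valid set may not exist at all, which is why a hard cap on attempts is useful in a loop of this kind.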
Value
A tibble with num_epitopes rows and the following columns:
- epi_start_nt: nucleotide epitope start position
- epi_end_nt: nucleotide epitope end position
- max_fitness_cost: maximum fitness cost for that epitope
Examples
sample_epitopes(get_epitope_frequencies(env_features$Position - 1))
#> 5 resamples required
#> # A tibble: 10 × 3
#> epi_start_nt epi_end_nt max_fitness_cost
#> <dbl> <dbl> <dbl>
#> 1 882 912 0.03
#> 2 819 849 0.06
#> 3 489 519 0.09
#> 4 576 606 0.12
#> 5 522 552 0.15
#> 6 459 489 0.18
#> 7 357 387 0.21
#> 8 951 981 0.24
#> 9 1080 1110 0.27
#> 10 1272 1302 0.3
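The example output above shows two regularities: every epitope spans 30 nucleotides (aa_epitope_length = 10 codons of 3 nucleotides each, with each start a multiple of 3), and the max_fitness_cost values rise in equal steps up to max_fit_cost. The sketch below reproduces that pattern in base R. It is inferred from the output only; the helper name aa_to_nt_epitopes and the exact conversion (0-indexed amino acid position p mapping to nucleotide 3 * p) are assumptions, not the package's documented behavior.

```r
# Hypothetical illustration, inferred from the example output only
aa_to_nt_epitopes <- function(aa_start, aa_epitope_length = 10,
                              max_fit_cost = 0.3) {
  # max_fit_cost must lie in [0, 1), as stated in the argument description
  stopifnot(max_fit_cost >= 0, max_fit_cost < 1)
  n <- length(aa_start)
  data.frame(
    # Assumed mapping: 3 nucleotides per amino acid position
    epi_start_nt = aa_start * 3,
    epi_end_nt = (aa_start + aa_epitope_length) * 3,
    # Evenly spaced costs, with the last epitope at the maximum
    max_fitness_cost = max_fit_cost * seq_len(n) / n
  )
}

# Amino acid starts 294, 273 and 163 correspond to the first three rows above
res <- aa_to_nt_epitopes(c(294, 273, 163))
```

Under these assumptions the first row is again 882 to 912; the costs differ from the example because only three epitopes are used here, so they are spaced in steps of 0.1 rather than 0.03.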
