Table 1. Statistical test for clonal proliferation
IDᵃ | Actual isolate set configurationᵇ | Modified isolate set configurationᶜ | χ² statistic (df = 1) | P-valueᵈ
Size of isolate set/no. of sets of this sizeᵉ | θ̂ᶠ | LLᵍ | Size of isolate set/no. of sets of this sizeʰ | θ̂ᶠ | LLᵍ
  • a Subjects with a single large set of identical isolates were included in the analysis.

  • b The nested model (i.e., model without clonal proliferation) uses the isolate set configuration observed for each subject.

  • c Clonal proliferation alters the configuration of sets of identical isolates by taking one sequence and increasing its multiplicity by a mechanism not present in the standard coalescent model. To model this behavior, we suppose that the proliferating sequence is the most frequent sequence observed and that it effectively replaces several singleton sequences that would otherwise have been observed in the sample. The likelihood is now maximized over two parameters: the mutation parameter (as in the nested model) and the number of sequences that the largest clone effectively replaced. We again use Ewens’ sampling formula, but on an isolate set configuration that is modified to reverse the effect of this supposed replacement.

  • d P-value obtained using the likelihood ratio test with one degree of freedom. A value <0.05 (italicized) indicates that mutation-free viral replication is an insufficient explanation for the data.

  • e The largest set of isolates from each subject is shown in bold.

  • f For each subject, the mutation parameter estimate θ̂ is chosen to maximize the log-likelihood of observing the isolate set configuration according to Ewens’ sampling formula for the coalescent (Ewens, 1972). The maximum permitted value for θ̂ is 100.

  • g Log-likelihood.

  • h The size of the reduced isolate set that is created to reverse the effect of clonal proliferation is shown in bold.
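
The procedure described in footnotes b, c, d, and f can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. In particular, the function names, the lower bound on θ, and the exact form of the modified configuration (the largest set shrunk by r isolates, with r singletons added so that the sample size is unchanged) are assumptions made for this sketch. The footnotes state only that the largest clone replaces several singletons and that θ̂ is capped at 100.

```python
import math
from collections import Counter

THETA_MAX = 100.0  # cap on the estimate, as stated in footnote f
THETA_MIN = 1e-8   # assumed lower bound so that log(theta) stays finite


def ewens_log_likelihood(set_sizes, theta):
    """Log-probability of a configuration under Ewens' sampling formula.

    set_sizes lists the size of every set of identical isolates,
    for example [5, 2, 1, 1] for a sample of nine isolates.
    """
    n = sum(set_sizes)
    counts = Counter(set_sizes)  # a_j: number of sets of size j
    ll = math.lgamma(n + 1) - sum(math.log(theta + i) for i in range(n))
    for j, a_j in counts.items():
        ll += a_j * math.log(theta) - a_j * math.log(j) - math.lgamma(a_j + 1)
    return ll


def fit_theta(set_sizes):
    """Return (theta_hat, log-likelihood), with theta_hat in [THETA_MIN, THETA_MAX]."""
    n, k = sum(set_sizes), len(set_sizes)

    def score(theta):
        # Expected number of sets minus the observed number; increasing in theta.
        return sum(theta / (theta + i) for i in range(n)) - k

    if score(THETA_MAX) <= 0:
        theta = THETA_MAX  # estimate hits the cap
    elif score(THETA_MIN) >= 0:
        theta = THETA_MIN  # a single set: the estimate tends to zero
    else:
        lo, hi = THETA_MIN, THETA_MAX
        for _ in range(200):  # bisection on the score equation
            mid = 0.5 * (lo + hi)
            if score(mid) < 0:
                lo = mid
            else:
                hi = mid
        theta = 0.5 * (lo + hi)
    return theta, ewens_log_likelihood(set_sizes, theta)


def modified_configuration(set_sizes, r):
    """Assumed reversal of proliferation: shrink the largest set by r, add r singletons."""
    sizes = sorted(set_sizes, reverse=True)
    return [sizes[0] - r] + sizes[1:] + [1] * r


def clonal_proliferation_test(set_sizes):
    """Likelihood ratio test of the modified model against the nested model."""
    theta0, ll0 = fit_theta(set_sizes)
    best_ll, best_theta, best_r = ll0, theta0, 0
    for r in range(1, max(set_sizes)):  # keep at least one isolate in the largest set
        theta1, ll1 = fit_theta(modified_configuration(set_sizes, r))
        if ll1 > best_ll:
            best_ll, best_theta, best_r = ll1, theta1, r
    stat = max(0.0, 2.0 * (best_ll - ll0))
    # Survival function of the chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return {"theta_actual": theta0, "ll_actual": ll0,
            "theta_modified": best_theta, "ll_modified": best_ll,
            "replaced": best_r, "chi2": stat, "p_value": p_value}


# Hypothetical subject: one set of 12 identical isolates plus three singletons.
result = clonal_proliferation_test([12, 1, 1, 1])
print(result)
```

The P-value uses the identity that, for one degree of freedom, the chi-square tail probability equals erfc(√(χ²/2)), which avoids a dependency outside the standard library.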