Use this to add sequences with partial sequence data into an existing tree. The current tree topology will not be optimized after insertion.
The branch lengths of the added partial sequences represent the number of (weighted) mutations against the best matching full-length sequence (FLS).
Only the region where both sequences overlap is taken into account, and the distance to the FLS is weighted by the length of the overlap.
As for non-partial sequences, when a filter is used, only the positions not removed by the filter are taken into account to calculate the number of mutations.
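
The distance calculation described above can be sketched roughly as follows. This is only an illustration, not the program's actual code: the function name partial_distance, the keep mask standing in for a filter, the treatment of '-' and '.' as "no data", and the weighting as mismatches divided by overlap length are all assumptions made for this example.

```python
GAPS = set("-.")  # assumption: these characters mean "no data" at a position


def partial_distance(partial, full, keep=None):
    """Compare two aligned sequences column by column.

    keep is an optional list of booleans (a hypothetical stand-in for a
    filter); columns where keep[i] is False are ignored.
    Returns (mismatches, overlap, weighted_distance).
    """
    if len(partial) != len(full):
        raise ValueError("sequences must be aligned to the same length")

    mismatches = 0
    overlap = 0
    for i, (p, f) in enumerate(zip(partial, full)):
        if keep is not None and not keep[i]:
            continue  # column excluded by the filter
        if p in GAPS or f in GAPS:
            continue  # at least one sequence has no data: outside the overlap
        overlap += 1
        if p != f:
            mismatches += 1

    # Assumed weighting: mismatches per overlapping position.
    weighted = mismatches / overlap if overlap else float("inf")
    return mismatches, overlap, weighted
```

For example, partial_distance("--ACGT----", "GGACCTAAGG") compares only the four columns where the partial sequence has data and finds one mismatch there.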
The insertion order has no effect on the resulting placement of the partial sequences. First, the best matching full-length sequence (FLS) is determined for each partial sequence; afterwards, all partial sequences are placed next to their detected FLS.
Partial sequences often have equal distances to more than one FLS. This happens whenever two FLS differ only in alignment regions where the partial sequence has no data. In that case a warning is printed ("Insertion of '<name>' is ambiguous") and one of the ambiguous insertion possibilities is chosen. This is more likely to happen with small amounts of sequence data (i.e. very short sequences). Consider removing these species from the tree, as their placement might be meaningless or misleading.
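
The search for the best matching FLS and the ambiguity warning can be illustrated with the minimal sketch below. It is not the program's implementation: the names mismatch_rate and place_partials, the mismatches-per-overlap distance, and the choice of the first candidate in case of a tie are assumptions made for this example.

```python
def mismatch_rate(partial, full):
    """Mismatches per overlapping column (assumed distance measure)."""
    pairs = [(p, f) for p, f in zip(partial, full)
             if p not in "-." and f not in "-."]
    if not pairs:
        return float("inf")  # no overlap at all
    return sum(p != f for p, f in pairs) / len(pairs)


def place_partials(partials, full_length):
    """Map each partial sequence name to the name of its best matching FLS.

    Each partial sequence is compared against the full-length sequences
    only, never against other partial sequences, so the order in which
    the partial sequences are processed does not change the result.
    """
    placement = {}
    warnings = []
    for name, seq in partials.items():
        scored = {fls: mismatch_rate(seq, fseq)
                  for fls, fseq in full_length.items()}
        best = min(scored.values())
        candidates = [fls for fls, dist in scored.items() if dist == best]
        if len(candidates) > 1:
            warnings.append(f"Insertion of '{name}' is ambiguous")
        placement[name] = candidates[0]  # assumption: take the first tie
    return placement, warnings


full_length = {"fls_a": "ACGTACGT", "fls_b": "ACGTACGA", "fls_c": "TTGTACGT"}
partials = {"short": "ACGT----", "other": "TTGT----"}
placement, warnings = place_partials(partials, full_length)
```

Here fls_a and fls_b differ only in the last column, where "short" has no data, so its placement is ambiguous; "other" matches fls_c uniquely.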