Google
More docs on the ARB website.
See also index of helppages.
Last update on 04. Mar 2022 .
Main topics:
Related topics:

Branch analysis

OCCURRENCE  

ARB_NT/Tree/Branch analysis

 

DESCRIPTION  

Branch analysis functions

Functions provided here may be useful to

  • detect wrong placed species or groups (or poor data, which might also lead to wrong placement)
  • detect anomalies caused by (wrong) tree reconstruction
  • gather information about tree topologies.

Each function reports several values gathered during execution.

 

Analyse distances in tree  

For the whole tree

  • the in-tree-distance (ITD = sum of all branchlengths) and
  • the per-species-distance (PSD = ITD / number of species) are displayed.

For all leafs in the tree the following values are calculated:

  • mean distance to all other leafs
  • minimum distance to any other leaf
  • maximum distance to any other leaf

It reports the mean and the range of each of these 3 values separately for all and for marked species.

 

Using the PSD to compare trees  

The PSD is useful when comparing tree topologies (based on similar sets of species) that were reconstructed using different methods. Imagine you have two trees:

  • tree_raxml (reconstructed with RAxML)
  • tree_arbpars (reconstructed with ARB parsimony)

The PSDs of both trees will be quite different (maybe by factor 50 or 60). Calculating the ratio of both PSDs, give you a good value for scaling the branchlengths of (a copy of) one of the trees. For example the PSDs might be

  • PSDr = PSD(tree_raxml) = 0.002219
  • PSDp = PSD(tree_arbpars) = 0.118483

Now you may scale the branchlengths of tree_arbpars by factor 0.01873 (=PSDr/PSDp) or those of tree_raxml by factor 53.39 (==PSDp/PSDr) to ease comparison of the two trees.

 

Mark long branches  

For each furcation in the tree, the relative difference between the distances of its subtrees is calculated.

'Distance' here is the sum of all branches between the furcation and the least distant leaf of the left resp. right subtree.

Relative difference     Meaning
                        The nearer subtree has at least
10%                     90%
50%                     50%
75%                     25%
90%                     10%
                        of the farther subtrees distance.

Starting from the tree-tips, this function marks the more distant subtree of any furcation where the relative and absolute difference are above the specified minimas.

When a subtree has been marked, all further furcations between that subtree and the root of the whole tree will be ignored.

Poorly aligned sequences often result in long branches in the tree. Being able to identify those branches quickly helps to find those sequences.

The indented workflow is

  • search long branches
  • check alignment and data and fix any problems
  • recalculate tree parts (see ´Add species with local optimization´)
  • search again. Now you may find other branches, nearer to the tree root.

 

Mark deep leafs  

Marks alls leafs in tree that have

  • depth above min.depth and
  • root-distance above min.root-distance

'depth' is the number of branches between root and leaf. Multifurcations are respected properly.

The 'root-distance' is the sum of the lengths of all branches between the root and a leaf.

 

Mark degenerated branches  

Branches are considered degenerated when two subtrees of an inner node differ in size (=number of members) by a reasonable factor.

This function allows you to specify that degeneration factor.

For each degenerated inner node, the smaller subtree will be marked as whole. The not-marked subtree will be examined for further degenerated nodes.

Common reasons for degenerated trees:

  • subsequently adding species using the 'quick add marked'-feature of ARB parsimony without ever optimizing the whole tree.
  • some "phylogenetic areas" are explored more thoroughly than others, resulting in unbalanced representation of the evolution as it took place. This is especially relevant if your database contains many clone-variants and you try to calculate a tree.
    Solutions:
  • Optimize your tree. For big trees you might try to
    • mark questionable species using this function and then
    • perform local/global optimization of marked species in ARB parsimony.

  • Replace over-represented areas by one or few representatives (see also ´Cluster detection´). Calculate a new or optimize an existing tree with that subset of species. Then quick-add previously removed species into that tree.

 

Automarking  

If the 'Auto mark?'-toggle is checked, changing any of the parameters will instantly trigger the execution of the corresponding mark function.

 

NOTES  

To compare the information of two or more trees, open new ARB_NT-window using 'File/New window' and popup their 'Branch analysis'-windows.

 

EXAMPLES  

None

 

WARNINGS  

None

 

BUGS  

No bugs known