HANDLING:
Fill in the values you consider appropriate. The default values are the ones that worked best in the first test runs. Several criteria are evaluated (see 'THE WEIGHTS' below for details).
A final quality value (a percentage) is calculated for each sequence, and all sequences below the given threshold can be marked.
HOW IT WORKS:
The section "weights" offers several options to fill in.
These are some of the criteria used to evaluate the quality of the sequences. Each value is the share of its criterion in the final evaluation formula. All values are percentages, so together they should sum to 100.
The final evaluation value is stored in the field 'quality/ali_XXX/evaluation'.
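The role of the weights can be illustrated with a minimal sketch. It assumes that every criterion yields a score between 0 and 100 and that the final value is a plain weighted average. The tool's actual formula may differ, and the function name, criterion names and numbers below are hypothetical (they are not the tool's defaults).

```python
def evaluate(scores, weights):
    """Combine per-criterion scores (0..100) into one quality percentage.

    Assumption: a plain weighted average; the real formula may differ.
    """
    total = sum(weights.values())
    if total != 100:
        raise ValueError(f"weights sum to {total}, expected 100")
    return sum(scores[name] * weights[name] for name in weights) / 100.0


# Hypothetical weights (percentages) and per-criterion scores.
weights = {"bases": 5, "deviation": 15, "no_helices": 15,
           "consensus": 50, "iupac": 5, "gc": 10}
scores = {"bases": 100, "deviation": 80, "no_helices": 90,
          "consensus": 70, "iupac": 100, "gc": 60}

quality = evaluate(scores, weights)
print(quality)  # 76.5
```

A sequence with this value would fall below a threshold of, say, 80 and could therefore get marked.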
THE WEIGHTS:
Base analysis:
This is the number of bases that are stored in the sequence. "-" and "." are not counted.
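As a minimal sketch of this count (the function name is hypothetical), every character except "-" and "." is treated as a base:

```python
def count_bases(sequence):
    """Count all characters of an aligned sequence except '-' and '.'."""
    return sum(1 for ch in sequence if ch not in "-.")


print(count_bases("AC-G..U"))  # 4
```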
Deviation:
This is the deviation of a sequence's number of bases from the average number of bases in its group.
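One possible way to express this deviation is sketched below. It assumes the deviation is reported as an absolute difference relative to the group average, in percent. Whether the tool uses a relative or an absolute measure is not stated here, and the function name is hypothetical.

```python
def length_deviation(n_bases, group_counts):
    """Relative deviation (in percent) of n_bases from the group average.

    Assumption: absolute difference divided by the group mean.
    """
    if not group_counts:
        raise ValueError("group must contain at least one sequence")
    mean = sum(group_counts) / len(group_counts)
    return 100.0 * abs(n_bases - mean) / mean


# A sequence with 90 bases in a group averaging 100 bases.
print(length_deviation(90, [100, 110, 90]))  # 10.0
```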
No Helices:
This is the number of positions in a sequence where a helix structure is expected, but the paired bases form no bond (i.e. the pair is one of AA AC CC CT CU TT UU).
The numbers of weak and strong base pairings are also calculated and stored in quality database fields ('number_of_weak_helix' and 'number_of_strong_helix'), but are NOT used for the final evaluation value.
Unlike in 'Define Helix Symbols', it is not possible to define which base pairings count as "none", "weak" or "strong". The sequence quality tool always uses the default values.
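The "no bond" count can be sketched as follows. The list of non-bonding pairs is taken from the text above. The representation of the helix as a list of paired alignment positions, as well as the function name, are assumptions made for this illustration only.

```python
# Pairs that form no bond, as listed above; order within a pair is ignored.
NO_BOND = {frozenset(p) for p in ("AA", "AC", "CC", "CT", "CU", "TT", "UU")}


def count_no_helix(sequence, pairs):
    """Count helix positions whose two bases form no bond.

    pairs: hypothetical list of (i, j) index pairs expected to be paired.
    """
    seq = sequence.upper()
    return sum(1 for i, j in pairs if frozenset((seq[i], seq[j])) in NO_BOND)


seq = "GCAAAUCC"
pairs = [(0, 7), (1, 6), (2, 5), (3, 4)]  # G-C, C-C, A-U, A-A
print(count_no_helix(seq, pairs))  # 2
```

Here C-C and A-A are counted, while G-C and A-U are not.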
Consensus:
For each named group found in the tree (selected below) a consensus sequence is calculated.
Every species' sequence is compared against the consensus sequences of all groups of which the species is a member.
The comparison takes into account both conformity with and deviation from the consensus sequence.
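The idea can be illustrated with a simplified sketch. It assumes the consensus is the most frequent character in each alignment column and that conformity is the percentage of positions matching the consensus. The tool's actual consensus calculation is not described here and may be more elaborate. Both function names are hypothetical.

```python
from collections import Counter


def consensus(aligned):
    """Majority character per column of equally long aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def conformity(sequence, cons):
    """Percentage of positions where the sequence matches the consensus."""
    matches = sum(1 for a, b in zip(sequence, cons) if a == b)
    return 100.0 * matches / len(cons)


group = ["ACGU", "ACGA", "ACCU"]
cons = consensus(group)
print(cons)                      # ACGU
print(conformity("ACCU", cons))  # 75.0
```

A species that belongs to several named groups would be compared against the consensus of each of them in the same way.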
IUPAC:
This is the number of IUPAC ambiguity codes stored in a sequence.
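A minimal sketch of this count is given below. It uses the standard IUPAC nucleotide ambiguity codes (R, Y, S, W, K, M, B, D, H, V, N). Whether the tool counts exactly this set, in particular N, is an assumption, and the function name is hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes (assumed set, including N).
AMBIGUITY_CODES = set("RYSWKMBDHVN")


def count_ambiguity_codes(sequence):
    """Count characters that are IUPAC ambiguity codes."""
    return sum(1 for ch in sequence.upper() if ch in AMBIGUITY_CODES)


print(count_ambiguity_codes("ACGNRY-U"))  # 3
```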
GC proportion:
This is the deviation of a sequence's GC proportion from that of its group.
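As an illustration, the sketch below assumes that the GC proportion is the fraction of G and C among all bases (gaps excluded) and that the group value is the mean of the members' proportions. Both points are assumptions, as are the function names.

```python
def gc_proportion(sequence):
    """Fraction of G and C among all bases; '-' and '.' are ignored."""
    bases = [ch for ch in sequence.upper() if ch not in "-."]
    if not bases:
        return 0.0
    return sum(1 for ch in bases if ch in "GC") / len(bases)


def gc_deviation(sequence, group):
    """Absolute difference between a sequence's GC proportion and the
    mean GC proportion of its group (assumed definition)."""
    group_mean = sum(gc_proportion(s) for s in group) / len(group)
    return abs(gc_proportion(sequence) - group_mean)


print(gc_proportion("GC-AU."))                  # 0.5
print(gc_deviation("GGCC", ["GCAU", "GCAU"]))   # 0.5
```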