Last update on 25. Nov 2018.

Translate DNA to Protein





Translates nucleic acid sequences. The alignment of the amino acid sequences is adapted to that of the nucleic acids. The one letter code is used.

See NOTES below for intended workflow and code table details.



To allow easy translation of gene sequences, you may choose to use two fields of each species:

  • codon_start, which can be 1, 2 or 3
  • transl_table, which has to be a valid genetic code number. Refer to the numbers in the codon table selector to determine which genetic codes are known by ARB. These numbers are the same as the translation table numbers used in the EMBL database.

These two fields are filled in automatically when genes are extracted to gene-species (see 'Extract genes to gene-species').

If both fields are missing, the values selected for 'Start position' and 'Codon table' are used. If only one of the two fields is missing, an error is raised.
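The fallback rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not ARB's actual code: the function name `resolve_settings` and the representation of a species as a plain dictionary are assumptions made for the example.

```python
def resolve_settings(fields, default_start, default_table):
    """Return (codon_start, transl_table) for one species.

    fields: dict holding the species' fields (hypothetical representation).
    default_start / default_table: the values selected in the window for
    'Start position' and 'Codon table'.
    """
    has_start = "codon_start" in fields
    has_table = "transl_table" in fields

    if has_start and has_table:
        start = int(fields["codon_start"])
        if start not in (1, 2, 3):
            raise ValueError(f"codon_start must be 1, 2 or 3, got {start}")
        return start, int(fields["transl_table"])

    if not has_start and not has_table:
        # Both fields missing: fall back to the values selected in the window.
        return default_start, default_table

    # Exactly one field is present: treated as an error.
    missing = "transl_table" if has_start else "codon_start"
    raise ValueError(f"field '{missing}' is missing")
```

For example, `resolve_settings({}, 1, 11)` falls back to the window values, while a species that carries only 'codon_start' raises an error.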



  1. Select source and destination alignment from the respective subwindows.
  2. Select reading frame by pressing the 'Start position' button and selecting first, second or third absolute position. Note that setting the cursor to the start of the reading frame in ARB_EDIT4 will update the value of 'Start position'. Alternatively you may select 'choose best': in this case the translation start position is chosen out of the set of positions 1, 2 and 3 such that the number of stop codons is minimised. This is done separately for each translated sequence.
  3. Select the codon table to use.
  4. Press the 'TRANSLATE' button. All marked sequences will be translated.
    DNA:    ---UGG...GUAUGGUUA
    PRO:    -W.VWL
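The 'choose best' option from step 2 can be illustrated with a small Python sketch. It makes several assumptions that are not stated on this page: only the three stop codons of the standard genetic code are counted, triplets are read from alignment columns, and ties are resolved in favour of the lowest start position. The names `count_stops` and `choose_best_start` are hypothetical.

```python
# Assumption: stop codons of the standard genetic code only.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def count_stops(seq, start):
    """Count stop codons when reading triplets from position `start` (1, 2 or 3)."""
    seq = seq.upper().replace("T", "U")
    offset = start - 1
    return sum(seq[i:i + 3] in STOP_CODONS
               for i in range(offset, len(seq) - 2, 3))


def choose_best_start(seq):
    """Return the start position (1, 2 or 3) with the fewest stop codons.

    min() keeps the first minimum, so ties go to the lowest position
    (an assumption made for this sketch).
    """
    return min((1, 2, 3), key=lambda start: count_stops(seq, start))
```

Because the choice is made separately for each sequence, a function like this would be called once per marked sequence.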



If the 'Save settings' toggle is checked, the values used for the start position and the translation table are written into the corresponding fields ('codon_start' and 'transl_table') of every translated species. This happens in both manual and automatic mode.

If the 'Translate all data' toggle is checked, the translation inserts an 'X' in front of the generated amino acid sequence whenever you select starting position 2 or 3 and there are nucleotides in front of that starting position. Later automatic realignments would fail if that 'X' were missing!
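The condition for the leading 'X' can be written down as a short Python sketch. It assumes that '.' and '-' are the only gap characters and that "in front of the starting position" refers to the alignment columns before it. Both function names are hypothetical and do not come from ARB.

```python
GAP_CHARS = ".-"  # assumption: the only gap characters in the alignment


def needs_leading_x(dna, start):
    """True if start is 2 or 3 and a nucleotide precedes the start position."""
    leading = dna[:start - 1]
    return start in (2, 3) and any(c not in GAP_CHARS for c in leading)


def with_leading_x(protein, dna, start):
    """Prefix the translated sequence with 'X' when nucleotides were skipped."""
    return "X" + protein if needs_leading_x(dna, start) else protein
```

With start position 2, a sequence beginning with a nucleotide gets the 'X', while one beginning with a gap does not.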



The program does NOT begin at the first three bases, but at the first three alignment positions. This means that each of your three-letter codons should start at an alignment position that lies a multiple of three columns after the start position.

Example:        (### = codon for amino acid #)
DNA:            ...111...222...333444
PRO:            .1.2.34
DNA:            ..111...222....333444
PRO:            XXXX.34     // 1 2 are out of sync
DNA:            ...111...22.2..333444
PRO:            .1.XX34         // bad alignment for 2
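The three cases above can be reproduced with a minimal Python sketch that reads the alignment three columns at a time. It is not ARB's implementation: the function name `translate_columns` and the `lookup` dictionary are hypothetical, and turning every triplet that is neither a pure gap nor a known codon into 'X' is an assumption based on the example output.

```python
def translate_columns(dna, lookup, start=1):
    """Translate by alignment columns, one output character per triplet.

    lookup: dict mapping a complete codon to its one-letter symbol
            (hypothetical stand-in for a codon table).
    """
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        triplet = dna[i:i + 3]
        if triplet == "...":
            protein.append(".")
        elif triplet == "---":
            protein.append("-")
        else:
            # Codons split by gaps or shifted out of frame are not in
            # the lookup and therefore become 'X'.
            protein.append(lookup.get(triplet, "X"))
    return "".join(protein)


# Placeholder codons as used in the example: '111' stands for amino acid 1.
DEMO = {"111": "1", "222": "2", "333": "3", "444": "4"}

print(translate_columns("...111...222...333444", DEMO))  # .1.2.34
print(translate_columns("..111...222....333444", DEMO))  # XXXX.34
print(translate_columns("...111...22.2..333444", DEMO))  # .1.XX34
```

In the second case a single missing gap column shifts codons 1 and 2 out of frame. In the third case the gap inside codon 2 splits it across two triplets.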


Refer to 'Translation tables' for details.



If the editor is open, the reading frame changes after translation.