| * ReadSeq -- 1 Feb 93
| * Reads and writes nucleic/protein sequences in various
| * formats. Data files may have multiple sequences.
| * Copyright 1990 by d.g.gilbert
| * biology dept., indiana university, bloomington, in 47405
| * e-mail: email@example.com
| * This program may be freely copied and used by anyone.
| * Developers are encourged to incorporate parts in their
| * programs, rather than devise their own private sequence
| * format.
| * This should compile and run with any ANSI C compiler.
| * Please advise me of any bugs, additions or corrections.
Readseq has been updated. There have been a number of enhancements and a few bug corrections since the previous general release in Nov 91 (see below). If you are using earlier versions, I recommend you update to this release.
Readseq is particularly useful as it automatically detects many sequence formats, and interconverts among them. Formats added to this release include
MSF multi sequence format used by GCG software
PAUP's multiple sequence (NEXUS) format
PIR/CODATA format used by PIR
ASN.1 format used by NCBI
Pretty print with various options for nice looking output.
As well, Phylip format can now be used as input. Options to reverse-compliment and to degap sequences have been added. A menu addition for users of the GDE sequence editor is included.
This program is available thru Internet gopher, as
gopher ftp.bio.indiana.edu browse into the IUBio-Software+Data/molbio/readseq/ folder select the readseq.shar document
Or thru anonymous FTP in this manner:
my_computer> ftp ftp.bio.indiana.edu (or IP address 188.8.131.52)
ftp> cd molbio/readseq
ftp> get readseq.shar
readseq.shar is a Unix shell archive of the readseq files. This file can be editted by any text editor to reconstitute the original files, for those who do not have a Unix system or an Unshar program. Read the top of this .shar file for further instructions.
There are also pre-compiled executables for the following computers: Silicon Graphics Iris, Sparc (Sun Sparcstation & clones), VMS-Vax, Macintosh. Use binary ftp to transfer these, except Macintosh. The Mac version is just the command-line program in a window, not very handy.
C source files:
readseq.c ureadseq.c ureadasn.c ureadseq.h
Readme (this doc)
Readseq.help (longer than this doc)
Formats (description of sequence file formats)
add.gdemenu (GDE program users can add this to the .GDEmenu file)
Stdfiles -- test sequence files
Makefile -- Unix make file
Make.com -- VMS make file
*.std -- files for testing validity of readseq
-- for interactive use
readseq my.1st.seq my.2nd.seq -all -format=genbank -output=my.gb
-- convert all of two input files to one genbank format output file
readseq my.seq -all -form=pretty -nameleft=3 -numleft -numright -numtop -match
-- output to standard output a file in a pretty format
readseq my.seq -item=9,8,3,2 -degap -CASE -rev -f=msf -out=my.rev
-- select 4 items from input, degap, reverse, and uppercase them
cat *.seq | readseq -pipe -all -format=asn > bunch-of.asn
-- pipe a bunch of data thru readseq, converting all to asn
The brief usage of readseq is as follows. The "" denote optional parts of the syntax:
readSeq (27Dec92), multi-format molbio sequence reader.
usage: readseq [-options] in.seq > out.seq
-a[ll] select All sequences
-c[aselower] change to lower case
-C[ASEUPPER] change to UPPER CASE
-degap[=-] remove gap symbols
-i[tem=2,3,4] select Item number(s) from several
-l[ist] List sequences only
-o[utput=]out.seq redirect Output
-p[ipe] Pipe (command line, <stdin, >stdout)
-r[everse] change to Reverse-complement
-v[erbose] Verbose progress
-f[ormat=]# Format number for output, or
-f[ormat=]Name Format name for output:
| 1. IG/Stanford 10. Olsen (in-only)
| 2. GenBank/GB 11. Phylip3.2
| 3. NBRF 12. Phylip
| 4. EMBL 13. Plain/Raw
| 5. GCG 14. PIR/CODATA
| 6. DNAStrider 15. MSF
| 7. Fitch 16. ASN.1
| 8. Pearson/Fasta 17. PAUP
| 9. Zuker 18. Pretty (out-only)
Pretty format options:
-wid[th]=# sequence line width
-tab=# left indent
-col[space]=# column space within sequence line on output
-gap[count] count gap chars in sequence numbers
-nameleft, -nameright[=#] name on left/right side [=max width]
-nametop name at top/bottom
-numleft, -numright seq index on left/right side
-numtop, -numbot index on top/bottom
-match[=.] use match base for 2..n species
-inter[line=#] blank line(s) between sequence blocks
4 May 92