[Jmol-users] Protein sequence alignment display/analysis: MSAReveal.Org

Hi All.

Eric Martz posted this announcement on the Jmol list about his new
alignment visualisation tool! This could be another candidate for the
HTML publisher mechanism.



-------- Forwarded Message --------
Subject: [Jmol-users] Protein sequence alignment display/analysis: MSAReveal.Org
Date: Mon, 26 Sep 2016 12:07:03 -0400
From: Eric Martz <emartz@microbio.umass.edu>
Reply-To: jmol-users@lists.sourceforge.net
To: jmol-users@lists.sourceforge.net

Apologies for this off-topic message which involves Jmol only
tangentially, but will hopefully be of interest to some Jmol users.

Recently I wanted to view some FASTA format protein sequence alignments
and was not able to find free software that made me happy. So I made

MSAReveal.Org (free, open source)

  * Paste in protein sequences or alignments in FASTA format.
  * Genus and species, gene names, and UniProt IDs are extracted from
    the headers and tabulated. UniProt IDs are linked to UniProt.
  * Checkboxes control coloring of amino acids/groups.
  * Touching an amino acid reports its 3-letter abbreviation and
    sequence number in a tooltip. You don’t have to memorize all 20
    one-letter amino acid codes.
  * The sequence listing can be horizontally scrolled, or wrapped at a
    specified number of amino acids.
  * A sequence numbering offset can be specified with “start=N” in the
    header of each sequence. A negative number enables numbering to
    start at 1 for the mature protein, after a negatively numbered
    signal sequence.
  * A consensus is reported with the frequencies of amino acids in each
  * Statistics are tabulated, including lengths (excluding gaps),
    percent identity, percent in gaps, percent charged,
    percent aromatic, net charge at neutral pH, etc.
  * Minimum, average, and maximum are given for each column in Statistics.
  * The Statistics Table can be sorted on any column.
  * A search mechanism finds and highlights sequence fragments/motifs
    regardless of gaps. It accepts multiple amino acid possibilities at
    a given position.
  * Search hits are listed with links so you can jump to any hit instantly.
  * Numerous irregularities or warnings are reported, including the
    ambiguous codes B, J, O, U, X, Z and illegal characters, which can
    be located easily with the search mechanism.
  * A description of each alignment, added to the first header, will be
  * A description of each sequence, added to its header, will be
    displayed when its taxon is touched.
  * A Protein Data Bank ID, when added to the header (e.g. “PDB=2ace”)
    will be tabulated and linked to display the 3D model in
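
As a rough illustration of two of the features listed above (the
"start=N" numbering offset and the gap-tolerant motif search), here is a
minimal Python sketch. It is not MSAReveal's code: the function names,
the bracket syntax for alternative amino acids, the use of "-" as the gap
character, and the choice to count straight through zero after a negative
offset are all assumptions made for this example.

```python
import re


def parse_start(header, default=1):
    """Return the numbering offset given as 'start=N' in a FASTA header.

    Falls back to `default` when no such token is present.
    """
    match = re.search(r"\bstart=(-?\d+)", header)
    return int(match.group(1)) if match else default


def residue_numbers(aligned, start=1):
    """Map each alignment column to a sequence number (None for gaps).

    Assumption: numbering simply increments, so a negative start passes
    through 0. The real tool may handle this differently.
    """
    numbers = []
    current = start
    for char in aligned:
        if char == "-":
            numbers.append(None)
        else:
            numbers.append(current)
            current += 1
    return numbers


def find_motif(aligned, motif, start=1):
    """Find a motif in an aligned sequence, ignoring gap characters.

    `motif` uses a hypothetical syntax: single letters, or [XY] for
    alternatives at one position. Returns a list of tuples
    (first_column, end_column_exclusive, first_residue_number).
    Hits are non-overlapping, as with re.finditer.
    """
    tokens = re.findall(r"\[[A-Z]+\]|[A-Z]", motif.upper())
    if not tokens:
        return []
    # Allow any number of gap characters between consecutive positions.
    pattern = re.compile("-*".join(tokens))
    numbers = residue_numbers(aligned, start)
    return [
        (hit.start(), hit.end(), numbers[hit.start()])
        for hit in pattern.finditer(aligned.upper())
    ]


header = ">sp|P12345|TEST start=-2"
aligned = "MK-TGS--KLV"
hits = find_motif(aligned, "G[ST]K", parse_start(header))
```

Here the motif G, then S or T, then K is found even though two gap
columns separate the S from the K in the aligned sequence.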

A more detailed overview with snapshots:

Demonstration and test alignments are built in. Instructions are
provided for downloading sequences from UniProt and aligning them with
free, easy, quick Jalview (using MAFFT, TCOFFEE, MUSCLE, etc.).
MSAReveal.Org displays any FASTA alignment, e.g. from the ConSurf Server.
It works in all popular browsers on Windows or OS X, and has been tested
with up to a total of one million amino acids and with alignments
containing hundreds of sequences.

Sincerely, Eric

Eric Martz, Professor Emeritus, Dept Microbiology
University of Massachusetts, Amherst MA US
Martz.MolviZ.Org <http://Martz.MolviZ.Org>