[Re: Cleaning up protein sequences]

Sorry, meant to copy this to the list.
G.

-------- Original Message --------
Subject: Re: [Jalview-discuss] Cleaning up protein sequences
Date: Mon, 26 Jul 2010 09:39:05 +0100
From: Geoff Barton <g.j.barton@dundee.ac.uk>
Organisation: University of Dundee
To: Jeremy Semeiks <jeremy.semeiks@utsw.edu>
References: <AANLkTiniaVsJP6B4N=ybGKqXpfE+3zofNL5knPWpNWED@mail.gmail.com>

I'm not sure if this is exactly what you want, but in Jalview you can
"hide" columns or rows much like you can in a spreadsheet. Any
subsequent operation (e.g. alignment) is then done only on what is
visible. I use this feature to help clean up big alignments without
losing the full sequences, or to keep a big alignment while only working
on a small part. This feature is also useful for selecting a subset of
the alignment for display or subsequent tree building etc.
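
To illustrate the idea only (this is not Jalview's implementation or API), a
minimal Python sketch of "operate on what is visible" might look like the
following. It assumes the alignment is a plain dict of name -> aligned string,
that columns are 0-based, and that the helper name visible_view is
hypothetical.

```python
def visible_view(alignment, hidden_cols=(), hidden_rows=()):
    """Return a copy of the alignment without the hidden columns and rows.

    alignment   : dict mapping sequence name -> aligned sequence string
    hidden_cols : iterable of 0-based column indices to hide (assumption)
    hidden_rows : iterable of sequence names to hide (assumption)
    The original alignment is left untouched, so nothing is lost.
    """
    hidden_cols = set(hidden_cols)
    hidden_rows = set(hidden_rows)
    return {
        name: "".join(ch for i, ch in enumerate(seq) if i not in hidden_cols)
        for name, seq in alignment.items()
        if name not in hidden_rows
    }


full = {"seq1": "ABCDE", "seq2": "FGHIJ"}
# Hide columns 1 and 2 and the whole of seq2; 'full' still holds everything.
view = visible_view(full, hidden_cols=range(1, 3), hidden_rows=["seq2"])
```

Any later step (re-alignment, tree building) would then be given the view
rather than the full alignment.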

Geoff.

Jeremy Semeiks wrote:

Hi,

Suppose I've created a protein alignment based on a few hundred
sequences from a PSI-BLAST query. Many of the sequences in this
alignment will contain extra junk regions or, conversely, omitted
regions. I want to clean up these sequences so I can re-align them and
analyze them further.

When I'm reviewing my alignment in Jalview, I'd like to change the
junk regions into gaps. I don't want to just delete the junk with the
backspace key, because that would mess up the alignment and make my
review harder.

As far as I can tell, there's no single command that would allow me to
change a selection to all gaps in Jalview. I've thought of a few
different workarounds, but they are pretty inefficient.
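
One such workaround, done outside Jalview, could be sketched in Python as
below. This is only a rough sketch under assumptions: the aligned sequences
are held as plain strings, the region is given as 0-based, end-exclusive
column positions, and mask_region is a hypothetical helper name, not a
Jalview command.

```python
def mask_region(aligned_seq, start, end, gap="-"):
    """Replace columns start..end-1 of one aligned sequence with gaps.

    The sequence keeps its length, so the rest of the alignment stays
    in register (unlike deleting the residues).
    """
    if not 0 <= start <= end <= len(aligned_seq):
        raise ValueError("region is outside the sequence")
    return aligned_seq[:start] + gap * (end - start) + aligned_seq[end:]


alignment = {
    "seq1": "MKT-AYIAKQR",
    "seq2": "MKTXXXXAKQR",  # columns 3..6 stand in for a junk region
}
# Turn the junk region of seq2 into gaps without shifting any columns.
alignment["seq2"] = mask_region(alignment["seq2"], 3, 7)
```

The masked alignment could then be written back out (for example as FASTA)
and reloaded for review.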

Am I missing something? Or is there better software than Jalview to
use for this kind of thing?

Actually, the ideal general solution to my problem would be for
Jalview to automatically run tblastn on specified sequences. Then it
could either find a missing region and insert it or verify that a long
run of junk-like stuff is in fact junk and delete it. This is what I
do manually for particularly valuable sequences. But I realize that
this might be hard to automate.

Thanks,
Jeremy
_______________________________________________
Jalview-discuss mailing list
Jalview-discuss@jalview.org
http://www.compbio.dundee.ac.uk/mailman/listinfo/jalview-discuss

--
Geoff Barton, Professor of Bioinformatics, College of Life Sciences
University of Dundee, Scotland, UK. g.j.barton@dundee.ac.uk
Tel:+44 1382 385860/388731 (Fax:385764) www.compbio.dundee.ac.uk

The University of Dundee is a registered Scottish charity, No. SC015096
