ID & similarity measures for an MSA?

I was wondering whether it is possible to extract from Jalview, for a given multiple sequence alignment (one that has not been calculated by Jalview's external servers, but simply pasted in through the text window or the like), measures of identity and similarity (say, for the chosen colouring scheme, or for the ClustalX or BLOSUM62 matrices) between all the constituent protein sequences. Many thanks for answering this simple question, Fernando Bazan

Hello Fernando.


The short answer is no: Jalview does not currently provide an option for exporting similarity matrices. However, it would be trivial to extend Jalview to enable this, since such matrices are already calculated for both the tree and PCA analyses.
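Until such an export exists, the identity part of the question can be approximated outside Jalview. The following is a minimal Python sketch, not Jalview's own calculation: it assumes the alignment is already available as equal-length gapped strings, and it assumes one particular definition of percent identity (identical residues divided by the number of columns in which neither sequence has a gap). Other definitions use a different denominator and give different numbers. The names `percent_identity` and `identity_matrix` are hypothetical.

```python
from itertools import combinations

# Characters treated as gaps (an assumption; adjust for your alignment format).
GAP_CHARS = set("-.")


def percent_identity(a, b):
    """Percent identity over columns where both sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in GAP_CHARS or y in GAP_CHARS:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(alignment):
    """Build a symmetric matrix from a dict of name -> aligned sequence.

    Returns (names, matrix); the diagonal is fixed at 100.0.
    """
    names = list(alignment)
    n = len(names)
    matrix = [[100.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        pid = percent_identity(alignment[names[i]], alignment[names[j]])
        matrix[i][j] = matrix[j][i] = pid
    return names, matrix


if __name__ == "__main__":
    toy = {"s1": "ACDE-F", "s2": "ACDEGF", "s3": "AC-EGY"}
    names, matrix = identity_matrix(toy)
    for name, row in zip(names, matrix):
        print(name, row)
```

A similarity score based on a substitution matrix such as BLOSUM62 would follow the same pairwise loop, with the match test replaced by a lookup of the residue pair's score.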

You are not the first person to ask about such a function, which suggests that it might be of use to other Jalview users. So, I'll add this to the Jalview issues database. However, with regard to this new feature, I have a few questions:

1. Do you have a format in mind for matrix export? There are a number of common matrix formats in use in bioinformatics (e.g. PAUP, ClustalW similarity matrix, MCL, CLANS, etc.), and I'm not sure which is the preferred format (I guess that answers my question: implement them all!).

2. I am not sure what you mean by a colour scheme or 'ClustalX' matrix. Do you mean that a sequence similarity matrix could be generated from the similarity of colours for each sequence in the alignment? This is possible, but would take more time to implement, because the colours are only generated at the last stage of rendering the alignment, rather than by a low-level analysis function.

3. How would this be used? I could imagine some kind of 'save matrix' function in the user interface, but would you want an equivalent method in the JavaScript API? This would facilitate integrating the alignment view with something else in the web app that uses the similarity matrix (e.g. something that would cluster the sequences and generate a tree or sequence groups that could be imported back into Jalview).
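To make the format question concrete, here is a minimal Python sketch of what a 'save matrix' step might emit. It deliberately does not reproduce any of the formats named above: it assumes a plain tab-delimited square matrix with sequence names as row and column labels, which most spreadsheet and clustering tools can read. The function name `format_matrix` and its `precision` parameter are hypothetical.

```python
def format_matrix(names, matrix, precision=1):
    """Render a square matrix as tab-delimited text with row and column labels.

    The first line is a header row (empty corner cell, then the names);
    each later line is a sequence name followed by its row of values.
    """
    if len(matrix) != len(names) or any(len(row) != len(names) for row in matrix):
        raise ValueError("matrix must be square and match the number of names")
    lines = ["\t".join([""] + list(names))]
    for name, row in zip(names, matrix):
        cells = [f"{value:.{precision}f}" for value in row]
        lines.append("\t".join([name] + cells))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = format_matrix(["s1", "s2"], [[100.0, 75.0], [75.0, 100.0]])
    print(text, end="")
```

A labelled square layout like this keeps both triangles of the matrix, which is redundant for symmetric measures but avoids any ambiguity about row order when the file is read back.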

Thanks for the question!
Jim.

On 09/06/2010 23:46, Fernando Bazan wrote: