Score matrix question

Clarification needed please.

Jalview’s BLOSUM62 matrix is nearly but not quite symmetric:

C → R scores -3 (ResidueProperties.BLOSUM62[1, 4])

R → C scores +3 (ResidueProperties.BLOSUM62[4, 1])

(help page shows this the other way round - I think)

Is this a typo? I thought pairwise scoring has to be symmetric (PAM250 matrix is).

The help page for PCA says that the Jalview mode calculation (e.g. using BLOSUM62) creates an asymmetric score matrix. Is that correct, or is it actually symmetric (if the score matrix also is)?

Thanks

mungo

The University of Dundee is a registered Scottish Charity, No: SC015096

Mungo Carstairs
Jalview Computational Scientist

The Barton Group
Division of Computational Biology

School of Life Sciences

University of Dundee, Dundee, Scotland, UK

www.jalview.org

www.compbio.dundee.ac.uk
g.m.carstairs@dundee.ac.uk

BLOSUM62 should be symmetrical, so that is is a typo.

This is the one I use that comes from NCBI BLAST. The amino acid order is defined by the string on the second line.

G.

BLOSUM 62 matrix made from BLOCKS v. 5.0 and scaled in half-bits.
ARNDCQEGHILKMFPSTWYVBZX
4 -1 -2 -2 0 -1 -1 0 -2 -1 -1 -1 -1 -2 -1 1 0 -3 -2 0 -2 -1 0
-1 5 0 -2 -3 1 0 -2 0 -3 -2 2 -1 -3 -2 -1 -1 -3 -2 -3 -1 0 -1
-2 0 6 1 -3 0 0 0 1 -3 -3 0 -2 -3 -2 1 0 -4 -2 -3 3 0 -1
-2 -2 1 6 -3 0 2 -1 -1 -3 -4 -1 -3 -3 -1 0 -1 -4 -3 -3 4 1 -1
0 -3 -3 -3 9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1 -3 -3 -2
-1 1 0 0 -3 5 2 -2 0 -3 -2 1 0 -3 -1 0 -1 -2 -1 -2 0 3 -1
-1 0 0 2 -4 2 5 -2 0 -3 -3 1 -2 -3 -1 0 -1 -3 -2 -2 1 4 -1
0 -2 0 -1 -3 -2 -2 6 -2 -4 -4 -2 -3 -3 -2 0 -2 -2 -3 -3 -1 -2 -1
-2 0 1 -1 -3 0 0 -2 8 -3 -3 -1 -2 -1 -2 -1 -2 -2 2 -3 0 0 -1
-1 -3 -3 -3 -1 -3 -3 -4 -3 4 2 -3 1 0 -3 -2 -1 -3 -1 3 -3 -3 -1
-1 -2 -3 -4 -1 -2 -3 -4 -3 2 4 -2 2 0 -3 -2 -1 -2 -1 1 -4 -3 -1
-1 2 0 -1 -3 1 1 -2 -1 -3 -2 5 -1 -3 -1 0 -1 -3 -2 -2 0 1 -1
-1 -1 -2 -3 -1 0 -2 -3 -2 1 2 -1 5 0 -2 -1 -1 -1 -1 1 -3 -1 -1
-2 -3 -3 -3 -2 -3 -3 -3 -1 0 0 -3 0 6 -4 -2 -2 1 3 -1 -3 -3 -1
-1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4 7 -1 -1 -4 -3 -2 -2 -1 -2
1 -1 1 0 -1 0 0 0 -1 -2 -2 0 -1 -2 -1 4 1 -3 -2 -2 0 0 0
0 -1 0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1 1 5 -2 -2 0 -1 -1 0
-3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1 1 -4 -3 -2 11 2 -3 -4 -3 -2
-2 -2 -2 -3 -2 -1 -2 -3 2 -1 -1 -2 -1 3 -3 -2 -2 2 7 -1 -3 -2 -1
0 -3 -3 -3 -1 -2 -2 -3 -3 3 1 -2 1 -1 -2 -2 0 -3 -1 4 -3 -2 -1
-2 -1 3 4 -3 0 1 -1 0 -3 -4 0 -3 -3 -2 0 -1 -4 -3 -3 4 1 -1
-1 0 0 1 -3 3 4 -2 0 -3 -3 1 -1 -3 -1 0 -1 -3 -2 -2 1 4 -1
0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2 0 0 -2 -1 -1 -1 -1 -1
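A matrix posted in this layout can be checked for symmetry by parsing it and comparing each entry with its mirror. The sketch below is only an illustration: the names `parse_matrix` and `asymmetric_pairs` are made up here, the title line is left out, and for brevity it uses just the top-left 5x5 corner (A, R, N, D, C) of the matrix above rather than all 23 rows.

```python
def parse_matrix(text):
    """Parse a residue-order line followed by one row of integer scores per residue."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    alphabet = lines[0].strip()
    rows = [[int(tok) for tok in ln.split()] for ln in lines[1:]]
    if len(rows) != len(alphabet) or any(len(r) != len(alphabet) for r in rows):
        raise ValueError("matrix is not square or does not match the residue order")
    return alphabet, rows


def asymmetric_pairs(alphabet, rows):
    """Return (a, b, score[a][b], score[b][a]) for every pair whose two scores differ."""
    n = len(alphabet)
    return [
        (alphabet[i], alphabet[j], rows[i][j], rows[j][i])
        for i in range(n)
        for j in range(i + 1, n)
        if rows[i][j] != rows[j][i]
    ]


# Top-left 5x5 corner of the matrix posted above (residues A, R, N, D, C).
CORNER = """
ARNDC
4 -1 -2 -2 0
-1 5 0 -2 -3
-2 0 6 1 -3
-2 -2 1 6 -3
0 -3 -3 -3 9
"""

alphabet, rows = parse_matrix(CORNER)
print(asymmetric_pairs(alphabet, rows))  # [] : every entry matches its mirror

# Reproduce the discrepancy reported in the thread: [4][1] is +3 while [1][4] is -3.
broken = [row[:] for row in rows]
broken[4][1] = 3
print(asymmetric_pairs(alphabet, broken))  # [('R', 'C', -3, 3)]
```

Running the same check over all 23 rows would flag any other mismatched pair in one pass.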

_______________________________________________
Jalview-dev mailing list
[Jalview-dev@jalview.org](mailto:Jalview-dev@jalview.org)
[http://www.compbio.dundee.ac.uk/mailman/listinfo/jalview-dev](http://www.compbio.dundee.ac.uk/mailman/listinfo/jalview-dev)

On 03/02/2017 09:17, Mungo Carstairs (Staff) wrote:

-- 
Geoff Barton | Professor of Bioinformatics | Head of Division of Computational Biology   
School of Life Sciences | University of Dundee, Scotland, UK | [g.j.barton@dundee.ac.uk](mailto:g.j.barton@dundee.ac.uk) 
Tel: +44 1382 385860 | [www.compbio.dundee.ac.uk](http://www.compbio.dundee.ac.uk) | twitter: @gjbarton
 


Hi. I have to say I'm embarrassed by this one, but then, that's life.
The error has been in the code since the dawn of time, it seems!

> BLOSUM62 should be symmetrical, so that is is a typo.

[sic] :-)

Mungo - may I suggest that rather than simply fixing the typo, we:
1. Provide a legacy option to allow the broken BLOSUM62 matrix to be used
in place of the real matrix (see, e.g.,
https://issues.jalview.org/browse/JAL-728).

2. Move the score matrices out of static initialisers into flat files
under resources, either as the NCBI matrix format or the aaindex format:
http://www.genome.jp/dbget-bin/www_bget?aaindex:HENS920102

(there might not be an issue about importing AAindex matrices yet..)
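The two suggestions above could fit together roughly as follows. This is a minimal sketch under stated assumptions, not Jalview code: `load_score_matrix`, `LEGACY_OVERRIDES` and the `legacy` flag are hypothetical names, the file layout is the simple one from Geoff's message (residue-order line, then score rows) rather than the full NCBI or AAindex format, and only the A, R, N, D, C corner of the matrix is used.

```python
import io

# Hypothetical table of legacy overrides, keyed by matrix name. The one entry
# mirrors the discrepancy reported in this thread (index [4][1] holding +3
# instead of -3); it is an illustration, not a copy of Jalview's source.
LEGACY_OVERRIDES = {"BLOSUM62": {(4, 1): 3}}


def load_score_matrix(name, stream, legacy=False):
    """Read a residue-order line followed by one row of integer scores per residue."""
    lines = [ln for ln in stream.read().splitlines() if ln.strip()]
    alphabet = lines[0].strip()
    rows = [[int(tok) for tok in ln.split()] for ln in lines[1:]]
    if legacy:
        # Re-apply the historical values so that old results can be reproduced.
        for (i, j), value in LEGACY_OVERRIDES.get(name, {}).items():
            rows[i][j] = value
    return alphabet, rows


# Stand-in for a resource file: the A, R, N, D, C corner of BLOSUM62.
FLAT_FILE = "ARNDC\n4 -1 -2 -2 0\n-1 5 0 -2 -3\n-2 0 6 1 -3\n-2 -2 1 6 -3\n0 -3 -3 -3 9\n"

_, fixed = load_score_matrix("BLOSUM62", io.StringIO(FLAT_FILE))
_, legacy = load_score_matrix("BLOSUM62", io.StringIO(FLAT_FILE), legacy=True)
print(fixed[4][1], legacy[4][1])  # -3 3
```

Keeping the corrected values in the file and the historical ones in a separate override table means the resource stays a faithful copy of the published matrix.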

> Is this a typo? I thought pairwise scoring has to be symmetric (PAM250
> matrix is).

Not sure that holds in general (there's no requirement in biology for
it, and most alignment methods don't depend on it).

Log odds matrices as formulated by Henikoff & Henikoff are symmetric,
however.
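A short sketch of why, following the usual statement of the Henikoff & Henikoff (1992) formulation. The symbols are the conventional ones and are not taken from this thread: $q_{ij}$ is the observed frequency of the unordered residue pair $(i, j)$ in the blocks, $p_i$ is the background frequency of residue $i$, and the factor 2 gives half-bit units.

```latex
\[
e_{ij} =
\begin{cases}
p_i^2      & i = j \\
2\,p_i p_j & i \neq j
\end{cases}
\qquad
s_{ij} = \operatorname{round}\!\left( 2 \log_2 \frac{q_{ij}}{e_{ij}} \right)
\]
```

Because pairs are counted without regard to order, $q_{ij} = q_{ji}$ and $e_{ij} = e_{ji}$, so $s_{ij} = s_{ji}$ by construction.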

> The help page for PCA says that the Jalview mode calculation (e.g.
> using BLOSUM62) creates an asymmetric score matrix. Is that correct,

I guess that's just a knock-on effect.

However, it's not really relevant - the 'Jalview mode' is the variant of
the seqspace algorithm first provided by Jalview.

I discovered we were doing something different to the original method
back in 2012:
https://issues.jalview.org/browse/JAL-1125

> or is it actually symmetric (if the score matrix also is)?

Perhaps you can verify?
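One way to approach that check is sketched below. It assumes a plain sum-of-pairs similarity over aligned columns, which is not necessarily what Jalview's PCA computes; `pair_score`, `similarity_matrix` and `is_symmetric` are hypothetical names, and the substitution scores are just the A, R, N, D, C corner of BLOSUM62.

```python
ALPHABET = "ARNDC"
CORNER = [
    [4, -1, -2, -2, 0],
    [-1, 5, 0, -2, -3],
    [-2, 0, 6, 1, -3],
    [-2, -2, 1, 6, -3],
    [0, -3, -3, -3, 9],
]


def pair_score(seq_a, seq_b, rows):
    """Sum substitution scores over aligned columns; seq_a supplies the row index."""
    index = {res: k for k, res in enumerate(ALPHABET)}
    return sum(rows[index[a]][index[b]] for a, b in zip(seq_a, seq_b))


def similarity_matrix(seqs, rows):
    """All-against-all sequence similarity under the given substitution scores."""
    return [[pair_score(x, y, rows) for y in seqs] for x in seqs]


def is_symmetric(m):
    return all(m[i][j] == m[j][i] for i in range(len(m)) for j in range(len(m)))


seqs = ["ARC", "ACC"]  # column 2 aligns R with C, the pair affected by the typo

good = similarity_matrix(seqs, CORNER)
print(good, is_symmetric(good))  # [[18, 10], [10, 22]] True

# Same calculation with the reported typo: entry [4][1] set to +3.
broken_rows = [row[:] for row in CORNER]
broken_rows[4][1] = 3
bad = similarity_matrix(seqs, broken_rows)
print(bad, is_symmetric(bad))  # [[18, 10], [16, 22]] False
```

Under this assumption the sequence-level matrix is symmetric whenever the substitution matrix is, and a single mismatched entry is enough to break it.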

Thanks again for your eagle-eyed exhaustive testing, Mungo!
Jim.

On 03/02/2017 09:28, Geoff Barton wrote:

--
-------------------------------------------------------------------
Dr JB Procter, Jalview Coordinator, The Barton Group
Division of Computational Biology, School of Life Sciences
University of Dundee, Dundee DD1 5EH, UK.
+44 1382 388734 | www.jalview.org | www.compbio.dundee.ac.uk