I have seen in large sequence alignments that the conservation score is 0 or “-” at all positions. I think this is because the conservation score counts the amino acid characteristics (e.g. “small”, “polar” or “positive”) that are either true for ALL of the sequences in the alignment or false for ALL of them. The more sequences there are, the more likely it is that at least one sequence will differ at a given position, in which case that characteristic cannot be counted as conserved at that position.
Nonetheless, there must be some way of scoring how conserved a position is (i.e. the chemical similarity of the amino acids at a given position in the alignment), no matter how large the alignment? I can’t find this in Jalview. There are variations similar to this - e.g. colour by conservation, BLOSUM62 score or percentage identity… but I would prefer to use a score of how similar the amino acids at a given position are, without the score focussing on the consensus amino acid or requiring all amino acids to match a category. Is that possible?
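To make the distinction concrete, here is a minimal Python sketch (not Jalview's code). It contrasts an all-or-nothing property count with a mean pairwise property agreement that does not depend on a consensus residue. The `PROPERTIES` table is a simplified assumption for illustration, not the exact table used by Jalview or AMAS, and the function names `strict_score` and `mean_pairwise_similarity` are hypothetical. Gaps are simply ignored here, which is also an assumption.

```python
from math import comb

# Simplified, illustrative property table. This is an assumption for the
# example and NOT the exact table used by Jalview or AMAS.
PROPERTIES = {
    "hydrophobic": set("ACFGHIKLMTVWY"),
    "polar": set("CDEHKNQRSTWY"),
    "small": set("ACDGNPSTV"),
    "positive": set("HKR"),
    "negative": set("DE"),
    "aromatic": set("FHWY"),
}


def _residues(column):
    """Return the residues of one alignment column, ignoring gaps."""
    return [r for r in column.upper() if r != "-"]


def strict_score(column):
    """Count properties that are present in every residue or absent from every residue."""
    residues = _residues(column)
    if not residues:
        return 0
    score = 0
    for members in PROPERTIES.values():
        k = sum(r in members for r in residues)
        if k == 0 or k == len(residues):
            score += 1
    return score


def mean_pairwise_similarity(column):
    """Fraction of (residue pair, property) combinations that agree, from 0 to 1."""
    residues = _residues(column)
    n = len(residues)
    if n < 2:
        return 0.0
    agreeing = 0
    for members in PROPERTIES.values():
        k = sum(r in members for r in residues)
        # Pairs agree when both residues have the property or both lack it.
        agreeing += comb(k, 2) + comb(n - k, 2)
    return agreeing / (comb(n, 2) * len(PROPERTIES))


if __name__ == "__main__":
    column = "L" * 99 + "D"  # one outlier among 100 sequences
    print(strict_score(column), round(mean_pairwise_similarity(column), 3))
```

With this table, a single outlier among 100 sequences removes four of the six properties from the strict count, while the pairwise score barely moves. That is the behaviour I am after.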
Jim Procter will be able to answer this more completely, but he is away at the moment. In the meantime: the default conservation score in Jalview is Zvelebil’s method, which is sensitive to gaps in a column. And of course, if your alignment has a lot of columns that do not show much conservation, you will see zeros. In the AMAS implementation (https://www.compbio.dundee.ac.uk/www-amas/) you can tune the number of gaps allowed per column, but I don’t think we built that into Jalview.
This was mainly because what I usually do in Jalview is first cluster a big alignment and then select subsets for analysis, since in a big alignment it is often just a few sequences that cause the issues. Alternatively, cluster the alignment, cut the tree at a suitable point to separate out the outlier sequences, and then simply colour by groups for the initial analysis.
There is a video that takes you through these steps if you are not sure how to do it. I hope this helps, Geoff.
P.S. You can also calculate conservation with different methods in Jalview - see the “Web Services” menu. There are currently 17 different conservation methods, so one of those might fit the behaviour you are looking for.
Web Services → Conservation → Change AACon Settings
You can select which methods you want to see under the alignment. These update as you edit the alignment. Sometimes it is good to turn off the automatic update if you are editing a really big alignment. That option is also in the menu.
As I said before though, for large alignments I recommend that you first cluster the alignment and select the most informative regions/sequences, then consider the physico-chemical properties in each column or between subgroups in your sequence set. Jalview was developed for exactly this kind of analysis: to make it easy to work with big alignments. You might find this short video helpful in doing this kind of analysis:
I want to colour a .pdb file by conservation. Although I can show the profiles for the different conservation methods below the alignment, I can’t see any option to colour the alignment according to one of those methods. Can this be done? Alternatively, can I export a list of conservation scores, so that I can manually put them in the B-factor column of a .pdb file?
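For the manual route, this is roughly what I have in mind: a minimal Python sketch that assumes the scores have already been exported and mapped from alignment columns to PDB residue numbers (that mapping is not shown). The function name `set_bfactors` and the `(chain, residue number)` dictionary layout are hypothetical. The column positions follow the fixed-width PDB ATOM/HETATM record layout: chain ID in column 22, residue number in columns 23-26, B-factor in columns 61-66.

```python
def set_bfactors(pdb_lines, scores, default=0.0):
    """Return PDB lines with the B-factor field replaced by per-residue scores.

    pdb_lines: iterable of lines without trailing newlines.
    scores: dict mapping (chain_id, residue_number) -> float (hypothetical layout).
    default: value written for residues that have no score.
    """
    out = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 54:
            chain = line[21]            # column 22
            resnum = int(line[22:26])   # columns 23-26
            value = scores.get((chain, resnum), default)
            padded = line.ljust(66)     # make sure the B-factor field exists
            line = f"{padded[:60]}{value:6.2f}{padded[66:]}"  # columns 61-66
        out.append(line)
    return out


if __name__ == "__main__":
    import sys

    # Usage sketch: python set_bfactors.py input.pdb > output.pdb
    # The scores below are placeholders, not real conservation values.
    example_scores = {("A", 1): 0.5}
    with open(sys.argv[1]) as handle:
        lines = handle.read().splitlines()
    print("\n".join(set_bfactors(lines, example_scores)))
```

The resulting file could then be coloured by B-factor in a structure viewer. Residues with insertion codes are not handled in this sketch.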