Exporting conservation scores for one sequence in an alignment

dmgarcia · 4 June 2014 17:33

Hello,

When I use “Export Annotation” from the Conservation track under an MSA, it generates a .csv file containing conservation scores for as many residues as there are in the longest protein in the alignment. Is there a way to extract the conservation values corresponding to one of the other proteins in the alignment? I would like to extract conservation scores only for the residues present in certain proteins in my alignment, in an Excel-friendly format and without all the gaps.

Thank you,
David

jalviewcrowdadmin · 5 June 2014 09:23

Hi David.

> Is there a way I can extract the conservation values corresponding to one of the other proteins in the alignment? I would like to be able to extract conservation scores for the residues only present in certain proteins in my alignment (in an excel friendly format), without all the gaps.

I've lodged a feature request about this (http://issues.jalview.org/browse/JAL-1516). I also came up with a fairly messy script that might do what you need:

Open Jalview's groovy console (http://www.jalview.org/help/html/features/groovy.html) and paste in the following:

// very messy script to output the scores in annotation rows for the first sequence in a selection on the topmost alignment
def curviewport = Jalview.getAlignframes()[Jalview.getAlignframes().length-1].getViewport();
if (curviewport.getSelectionGroup()) {
   // gets selection for 'first' alignment - note this is the 'oldest' one - can't access the current alignment as yet
   def selreg = curviewport.getSelectionGroup();
   def gaps = selreg.getSequenceAt(0).gapMap(); // aligned positions of first sequence selected
   String csvfile="";
   curviewport.getAlignment().getAlignmentAnnotation().eachWithIndex{ aa, apos ->
     String csv=""
     gaps.eachWithIndex{col,spos -> if (col>=selreg.getStartRes() && col<=selreg.getEndRes()) {
       // output height of histogram
       csv+=","+aa.annotations[col].value;
       // Uncomment to output string shown in tooltip
       // csv+=","+aa.annotations[col].description;
     }}
     if (csv.length()>0) {
         csvfile+=aa.label+csv+"\n"
     }
   }
   print csvfile;
} else {
     // print the hint explicitly rather than relying on the console echoing the script result
     println "Select a region in the alignment window.";
}
///
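
To make the column lookup in the script easier to follow, here is a minimal standalone sketch of the idea, with no Jalview classes involved. It assumes that gapMap() yields, for each residue of a sequence, the 0-based alignment column that residue occupies. The closure gapMapOf, the example sequence and the scores are all made up for illustration.

```groovy
// Standalone illustration only: nothing here calls Jalview.
// gapMapOf is a hypothetical stand-in for the array that gapMap() is assumed to return.
def gapMapOf = { String aligned ->
    // collect the 0-based column index of every non-gap character
    (0..<aligned.length()).findAll { !(aligned[it] in ['-', '.', ' ']) }
}

def aligned = "MK--LV-A"                 // one aligned sequence, gaps shown as '-'
def scores  = [9, 8, 3, 2, 7, 6, 1, 5]   // made-up per-column annotation values

def cols = gapMapOf(aligned)                  // columns that hold a residue
def perResidue = cols.collect { scores[it] }  // one score per residue, gaps skipped

println cols
println perResidue
```

Each entry of perResidue lines up with one residue of the chosen sequence, which is what the script does for every annotation row within the selected column range.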

Hope that helps!
Jim

ps. You can also find this script in a comment on the feature request: http://issues.jalview.org/browse/JAL-1516?focusedCommentId=13296&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13296

On 04/06/2014 18:33, David M Garcia wrote:

jalviewcrowdadmin · 5 June 2014 11:10

Hi Jim,

Heh - I didn’t test the copy/paste to Excel part… here’s a better version that prints tab-separated (TSV) output. See the comments for how to change it to CSV. As a tip, Excel also has a ‘Text to Columns’ function (under Data) that splits on commas or tabs when cut/pasted data has landed in a single column.

// Requested by David M Garcia (v2)
// very messy script to output the scores in annotation rows
// for the first sequence in a selection on the topmost alignment
def curviewport = Jalview.getAlignframes()[Jalview.getAlignframes().length-1].getViewport();

// TSV output by default.
// change "\t" to "," to output CSV file
def sep = "\t";

if (curviewport.getSelectionGroup()) {
   // gets selection for topmost alignment
   def selreg = curviewport.getSelectionGroup();
   // get aligned positions of first sequence selected
   def gaps = selreg.getSequenceAt(0).gapMap();
   String csvfile="";
   curviewport.getAlignment().getAlignmentAnnotation().eachWithIndex{ aa, apos ->
     String csv=""
     gaps.eachWithIndex{col,spos -> if (col>=selreg.getStartRes() && col<=selreg.getEndRes()) {
       // a row may have no entry at some columns; leave those cells empty
       def ann = (aa.annotations != null && col < aa.annotations.length) ? aa.annotations[col] : null;
       // output height of histogram
       csv+=sep+(ann == null ? "" : ann.value);
       // Uncomment to output string shown in tooltip
       // csv+=sep+(ann == null ? "" : ann.description);
     }}
     if (csv.length()>0) {
         // csv already starts with sep, so no extra separator is needed after the label
         csvfile+=aa.label+csv+"\n"
     }
   }
   print csvfile;
} else {
   println "Select a region in the alignment window.";
}
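
The script prints one line per annotation row, with the residues running across the line. For plotting or sorting in Excel it may be more convenient to have one residue per line instead. The following is a rough sketch of how the printed text could be flipped; transposeDelimited is a hypothetical helper and the sample values are invented, so treat it as a starting point rather than part of the script.

```groovy
// Hypothetical helper: turn "one line per annotation row" into "one line per residue".
def transposeDelimited = { String text, String sep ->
    // split into non-empty lines, then into cells; limit -1 keeps trailing empty cells
    def rows = text.readLines().findAll { it }.collect {
        it.split(java.util.regex.Pattern.quote(sep), -1) as List
    }
    // pad short rows so that transpose() does not drop columns
    def width = rows*.size().max() ?: 0
    rows = rows.collect { it + [''] * (width - it.size()) }
    // swap rows and columns, then rebuild the delimited text
    rows.transpose().collect { it.join(sep) }.join('\n')
}

// Made-up output in the same shape the script prints (label, then one value per residue)
def sample = "Conservation\t9\t8\t7\nQuality\t1.5\t2.5\t3.5\n"
def flipped = transposeDelimited(sample, "\t")
println flipped
```

The first line of the result holds the annotation labels and every following line holds the values for one residue. If a file is preferred over copy/paste, the string could be saved with something like new File("conservation.tsv").text = flipped, where the file name is only an example.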

On 05/06/2014 11:37, David M Garcia wrote: