Jalview and a2m files (UCSC SAM suite of hidden Markov model tools)

jalviewcrowdadmin · 30 May 2011 01:27

Hi Saira.

Thanks for this email. I’ve cc’ed it to Jalview’s discussion list - since it will be of general interest.

Although Jalview can read the aligned FASTA format a2m (see below), one feature of this format, “insertion positions”, poses problems when I try to create coloured and annotated alignments for publication purposes.

As far as I can tell, Jalview can read an a2m file but cannot automatically convert the N-terminal, internal and C-terminal insertions (lowercase letters) into numbers specifying the length of each insertion (see lmna.out). The result is an unwieldy and excessively long alignment.

Yes. I see what you mean. I wasn’t aware that the SAM tools had their own FASTA extension (there are a few!), and I can see that visualizing certain types of result in this format would be a real pain if the tool did not honour the insertion annotation. I’ve opened a bug in our issues database about this (http://issues.jalview.org/browse/JAL-835). Jalview should somehow recognise that this is an a2m file and hide or exclude flanking sequence regions marked as insertions when it imports the file.

For the moment, I can only recommend that you use the ‘hide columns’ function to exclude the flanking regions. Simply select the flanking regions of the alignment and press ‘H’.
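
Another possible stopgap, outside Jalview, is to preprocess the a2m file before loading it. The following is only a minimal sketch, assuming the usual SAM a2m convention (uppercase letters and '-' are match columns; lowercase letters and '.' are insertions). The function names read_fasta, strip_insertions and collapse_insertions are made up for illustration and are not part of Jalview or SAM.

```python
import re


def read_fasta(text):
    """Parse FASTA/a2m text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def strip_insertions(seq):
    """Keep match columns only: drop lowercase residues and '.' padding."""
    return "".join(c for c in seq if not (c.islower() or c == "."))


def collapse_insertions(seq):
    """Replace each run of inserted residues with its length (for display)."""
    seq = seq.replace(".", "")  # '.' only pads insert columns
    return re.sub(r"[a-z]+", lambda m: str(len(m.group(0))), seq)


if __name__ == "__main__":
    # Tiny in-memory example; a real file would be read from disk instead.
    a2m = ">s1\nmkAC-DEfgh\n>s2\n..ACGDE...\n"
    for name, seq in read_fasta(a2m):
        # Match columns only: every sequence ends up the same length.
        print(">" + name)
        print(strip_insertions(seq))
```

The output of strip_insertions is ordinary aligned FASTA containing only the match columns, so it should load without the long flanking regions. collapse_insertions gives the "numbers in place of insertions" view, but that form is meant for reading by eye, not for loading as an alignment.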

I would like also to annotate the alignment with secondary structure elements taken from the RCSB/PDB entries and non-synonymous coding SNPs for the sequence HsapreLMNA_2.

Do you have any suggestions and/or ideas?

These are slightly different issues.
Firstly, secondary structure and SNP sequence annotation is available from various DAS annotation sources - but you need to find out whether HsapreLMNA_2 is in the UniProt database first. If it is, edit its sequence ID to include the UniProt accession, using the edit name/description dialog box opened by right-clicking the sequence’s ID in the alignment window.
Secondly, whilst you might be able to retrieve the PDB secondary structure assignments from some source (UniProt, Pfam, or the PDB’s own DAS sources), Jalview doesn’t currently translate secondary structure sequence features into alignment annotation automatically (like the secondary structure annotation shown in the example file that launches when Jalview starts up for the first time). This means you’d have to create the helices and sheets on the alignment manually. This is a bit onerous, but it is the only means available at the moment - I’m hoping that a future version will allow much more flexibility with regard to moving annotation between the region below the alignment and highlighting on the alignment.

I hope this helps a bit, and if you’d like to register on issues.jalview.org, you can post a ‘watch’ on the a2m feature so you’ll get an email when someone starts working on it!

Jim.