RNA support for Jalview - looking good!

Hi Lauren ! Thanks for the nice comments :slight_smile:

I just took a look at what you did for the GSoC and it looks great! I
might use it to help me make figures for my thesis proposal.

Great! A word of warning though: we still haven't got VARNA snapshots stored within the Jalview project, and VARNA snapshots from one version of VARNA are not compatible with more recent versions, so you can't save any annotated RNA diagrams at the moment. Yann Ponty is working on a new file format for storing the VARNA display, but it won't be ready for a month or two yet.

It wasn't clear to me at first how to access VARNA to look at the
structure of the sequences, but I figured it out. Highlighting the
nucleotide your mouse is hovering over in the alignment is very
useful.

It is! (Though you might notice a tiny bug in the highlighted position if you are using a 'trimmed' model and there are gaps in the alignment.)

Did the documentation for VARNA get put in the "Nucleic Acid
Support" section? I couldn't find it in the webstart version.

It actually is in the nucleic acids section, but there was a problem with the automated build, so the version you used didn't include it.

Happily, thanks to your email, I just fixed the issue, so you should be able to see the help the next time you launch Jan's Jalview build.

What do you think of the 'structure consensus' histogram and base-pair sequence logo?

Jim.


On 05/09/2011 00:48, Lauren Lui wrote:

Hi Jim,

Sorry for such a late response. I'm finally done with everything
related to my orals, which went well.

It wasn't immediately clear to me how to read the base-pair sequence
logo but I figured it out after reading the documentation. =) I like
that when I hover over the base pair I can see the percentages of the
different types of base pairs.

Having two bases squeezed into one column makes it difficult for me to
tell the difference between G and C, but I can always look at the
sequence consensus to see which type of base pair I'm probably looking
at. When there's quite a few different base pairs that occur in one
position, it's hard to see what is occurring. Is there a way to
export the percentages of the different types of base pairs?

I work with non-canonical base pairs with the kink-turn sequence. GA
base pairs actually stabilize the structure. I'd prefer if the base
pairs that are displayed are based on the secondary structure
specified, instead of just the canonical (and GU) base pairs. However,
I can see that displaying all base pairs might be problematic.

My advisor is showing Jan's version in his RNA bioinformatics class
today. Do you think that output to a stockholm file would be
possible?

Cheers,

Lauren


Hi All.

I've been doing a bit of work on incorporating RNA support into the next Jalview release and thought I'd give you all a progress update. I can also make a few comments on Lauren's last email at the same time :wink:

It wasn't immediately clear to me how to read the base-pair sequence
logo but I figured it out after reading the documentation. =) I like
that when I hover over the base pair I can see the percentages of the
different types of base pairs.

This is pretty essential, IMHO.

Having two bases squeezed into one column makes it difficult for me to
tell the difference between G and C, but I can always look at the
sequence consensus to see which type of base pair I'm probably looking
at. When there's quite a few different base pairs that occur in one
position, it's hard to see what is occurring.

Yes. I've fixed this to some extent by implementing a 'normalised sequence logo', which is something Kersten asked for. In a normalised logo, all the base-pair stacks add up to the same height (rather than additionally being scaled by the histogram height). This makes for a more readable display, but it still gets difficult if you are interested in the rarer base pairs.
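To make the difference between the two logo styles concrete, here is a minimal Python sketch. It is not Jalview's code: the function name `logo_heights`, its parameters, and the example counts are all assumptions made for illustration. It only shows the arithmetic described above, where each base pair gets a share of the stack in proportion to its frequency, and the stack is either scaled by the histogram bar or always drawn at full height.

```python
def logo_heights(pair_counts, column_height, normalised=False, full_height=1.0):
    """Return {base_pair: glyph_height} for one logo column (illustrative only).

    pair_counts   -- hypothetical counts, e.g. {"GC": 6, "AU": 3, "GU": 1}
    column_height -- histogram bar height for this column, between 0 and 1
    normalised    -- if True, every stack fills full_height regardless of the bar
    """
    total = sum(pair_counts.values())
    if total == 0:
        return {}
    # A scaled logo shrinks the whole stack to the histogram bar height;
    # a normalised logo always uses the full height.
    stack = full_height if normalised else full_height * column_height
    return {pair: stack * n / total for pair, n in pair_counts.items()}


counts = {"GC": 6, "AU": 3, "GU": 1}
scaled = logo_heights(counts, column_height=0.5)
normal = logo_heights(counts, column_height=0.5, normalised=True)
```

With a bar height of 0.5, the rare GU pair gets a glyph half as tall in the scaled logo as in the normalised one, which is why normalising helps readability but cannot rescue very rare pairs.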

Is there a way to
export the percentages of the different types of base pairs?

There is, but it's still a bit buggy for the structure consensus. Typically, you should be able to right-click on an annotation row and use the 'export' option to export that line as a CSV or Jalview annotation row. However, a bug is preventing the structure consensus percentages from being exported in the CSV.

I work with non-canonical base pairs with the kink-turn sequence. GA
base pairs actually stabilize the structure. I'd prefer if the base
pairs that are displayed are based on the secondary structure
specified, instead of just the canonical (and GU) base pairs. However,
I can see that displaying all base pairs might be problematic.

I'll let Jan look into that, but I think it should actually be OK. There are a finite number of possible interactions, after all!
The reason the combinations are limited currently is that the 'structure consensus' measures the fraction of canonical (+GU) vs non-canonical interactions present in the column, and the logo excludes the non-canonical ones. I guess it might be worth having an option to include all pairs in the logo display for exactly this kind of situation.
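As a rough illustration of the calculation being discussed, the following Python sketch tallies the base-pair types found at two paired alignment columns and then sums the share that is canonical (Watson-Crick plus GU). It is an assumption-laden sketch, not Jalview's implementation: the names `pair_percentages`, `canonical_fraction` and `CANONICAL`, and the toy sequences, are invented for the example.

```python
from collections import Counter

# Watson-Crick pairs plus the GU wobble, in both orientations.
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def pair_percentages(sequences, i, j):
    """Percentage of each base-pair type at paired columns i and j (0-based)."""
    counts = Counter()
    for seq in sequences:
        a, b = seq[i].upper(), seq[j].upper()
        # Skip rows with a gap or ambiguity code at either position.
        if a in "ACGU" and b in "ACGU":
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 100.0 * n / total for pair, n in counts.items()}


def canonical_fraction(percentages):
    """Sum of the percentages that belong to canonical (+GU) pairs."""
    return sum(p for pair, p in percentages.items() if pair in CANONICAL)


# Toy alignment: column 0 pairs with column 4.
toy = ["GAAAC", "GAAAC", "GAAAU", "GAAAA", "-AAAC"]
pcts = pair_percentages(toy, 0, 4)
```

Here the GA pair in the fourth row is counted and reported alongside GC and GU, which is the "include all pairs" behaviour suggested above; restricting the result to keys in `CANONICAL` would give the more limited display.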

My advisor is showing Jan's version in his RNA bioinformatics class
today.

I was really pleased to hear this - I hope the class liked it !

Do you think that output to a stockholm file would be
possible?

heh - unfortunately not at the time you wrote. Sorry about that. However, it is now a real priority, I think.

Here's a summary of what's been happening with the Jalview code w.r.t. RNA:

1. All GSOC2010 and GSOC2011 code is now merged into the Jalview 'development' branch on the git repository. This means that the new RNA viz features will be available in the next major jalview release (I think this will be 2.8, since there have already been some big changes 'under the hood' over the past few weeks).

2. WUSS annotation display and local structure consensus are now available in the applet as well as the application.

3. Secondary-structure dependent calculations (like helix colouring and local structure consensus) are now updated when the user modifies secondary structure using the annotation editing functions or by deleting or inserting whole columns in the alignment.
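Point 3 comes down to re-deriving the paired columns whenever the structure line changes. A minimal Python sketch of that idea follows. It handles only a simplified subset of WUSS (the four bracket types `<>`, `()`, `[]`, `{}`, with every other character treated as unpaired, and no pseudoknot letters), and the function name `parse_pairs` is hypothetical, not taken from the Jalview code.

```python
OPEN_TO_CLOSE = {"<": ">", "(": ")", "[": "]", "{": "}"}
CLOSE_TO_OPEN = {close: open_ for open_, close in OPEN_TO_CLOSE.items()}


def parse_pairs(structure):
    """Return sorted (open_col, close_col) pairs from a bracket string."""
    stacks = {open_: [] for open_ in OPEN_TO_CLOSE}
    pairs = []
    for col, ch in enumerate(structure):
        if ch in OPEN_TO_CLOSE:
            stacks[ch].append(col)
        elif ch in CLOSE_TO_OPEN:
            stack = stacks[CLOSE_TO_OPEN[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at column {col}")
            pairs.append((stack.pop(), col))
        # Any other character is treated as an unpaired position.
    leftover = sorted(col for stack in stacks.values() for col in stack)
    if leftover:
        raise ValueError(f"unmatched opening bracket at columns {leftover}")
    return sorted(pairs)


before = parse_pairs("<<..>>")   # pairs before an edit
after = parse_pairs("<<.>>")     # same helix after deleting one loop column
```

Deleting a column shifts the closing positions, so anything keyed on the pair list (helix colouring, the structure consensus) has to be recomputed from the new pairs rather than patched in place.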

This is all available from the development branch build, which is currently at http://www.compbio.dundee.ac.uk/user/ws-dev1/jalview/develop. There's a demo of RNA secondary structure display in the applet here: http://www.compbio.dundee.ac.uk/user/ws-dev1/jalview/develop/applet/applets.html

The only outstanding RNA-related issue that needs to be fixed before release is to include full support for storing and retrieving RNA secondary structure annotation in the Jalview project. However, there are also a whole bunch of 'nice to haves', some of which may need to be deferred to the next release:

1. Stockholm export. (not essential but would be really good if we can round-trip).
2. Complete and debug the integration of VARNA in the desktop using new VARNA listeners/methods
3. Allow VARNA integration for applet.
4. Allow VARNA state to be stored and retrieved from a Jalview project.

#4 is dependent on having a suitable format for storing the VARNA display state. Yann is working on this, but the format implementation might not be ready for the first release. It will be essential for publication, though. The others are all fair game, if anyone has some time to spare to do some Java hacking!
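For anyone wondering what item 1 (Stockholm export) involves at its simplest, here is a hedged Python sketch of a writer for a minimal Stockholm file: the `# STOCKHOLM 1.0` header, one line per aligned sequence, an optional `#=GC SS_cons` line carrying the consensus secondary structure, and the `//` terminator. The function name `to_stockholm` and the example data are assumptions; a real exporter in Jalview would be written in Java and would need to cover many more annotation types to round-trip properly.

```python
def to_stockholm(sequences, ss_cons=None):
    """Format a non-empty list of (name, aligned_sequence) tuples as Stockholm text."""
    lengths = {len(seq) for _, seq in sequences}
    if ss_cons is not None:
        lengths.add(len(ss_cons))
    if len(lengths) != 1:
        raise ValueError("all sequences and SS_cons must have the same length")

    # Pad the label column so that every sequence starts at the same offset.
    labels = [name for name, _ in sequences]
    if ss_cons is not None:
        labels.append("#=GC SS_cons")
    width = max(len(label) for label in labels)

    lines = ["# STOCKHOLM 1.0"]
    for name, seq in sequences:
        lines.append(f"{name:<{width}} {seq}")
    if ss_cons is not None:
        lines.append(f"{'#=GC SS_cons':<{width}} {ss_cons}")
    lines.append("//")
    return "\n".join(lines) + "\n"


text = to_stockholm([("seq1", "GGAAACC"), ("seq2", "GCAAAGC")], "<<...>>")
```

Reading such a file back and recovering the same `SS_cons` string is the round-trip mentioned in the list above.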

Jim.


On 11/10/2011 22:29, Lauren Lui wrote: