groovy scripting on jalview startup

I was wondering if anyone had developed a template for a groovy start-up script that, after initial loading of a sequence set, can fire off one of the web services (e.g., Muscle).

thx,
-David

David M. Goodstein, Ph.D.
Phytozome Group Lead
Plant and Computational Genomics Group
Joint Genome Institute - U.S. Dept. of Energy
Center for Integrative Genomics - UC Berkeley

Hey David .. thanks for the email.

I was wondering if anyone had developed a template for a groovy start-up script that, after initial loading of a sequence set, can fire off one of the web services (e.g., Muscle).

nope :frowning: but that is mostly because there are not many people working with groovy in Jalview as yet.

Looking at the code, this probably isn't immediately straightforward to do, since the web service clients in jalview desktop are designed to be run on an already displayed alignment. That being said, I do have an interactive testing script that loads a file into a new alignment window which could be adapted to launch a web service job and close the window once the job has completed.

Since the adaptation of the script might be made easier if I modify some of Jalview's java code, I've opened a bug to collect all the git commits together: http://issues.jalview.org/browse/JAL-894

What timescale were you thinking of for being able to do this?
Jim.

···

On 03/08/2011 00:09, David M. Goodstein wrote:

Hi David. Your mail was too large to come through the mailing list, but I've included your reply to my last email below..

Yesterday!

oops - didn't make that deadline :wink:

Actually, a little bit of background. Our project (http://www.phytozome.net) has been using Jalview for 6+ years. We branched our local install from v2.0.6 (28/9/2005) and made several in-code modifications, to allow us to pull sequences and alignments directly from our database and expose various tools.

Wow - I like the rearrangements you've made to the calculations menu! The 'run gblocks' option is very intuitive - although it does proliferate alignment windows. My ideal way of including gblocks-like services is to create a new alignment view from the results that would exclude phylogenetically uninformative columns.

We are currently bundling up Phytozome for licensing and distribution to academic and commercial users. For licensing purposes (and to benefit from subsequent Jalview development), it is infinitely better if Phytozome works with the unmodified Jalview, so we've been trying to see if we can get the current release to work with our system. One of the first issues we've encountered is trying to reproduce this behavior:

1) go to http://www.phytozome.net/show_cluster.php?method=2355&search=1&searchText=clusterid%3A28761972&detail=1
2) click on Align Family Members -> Load member protein sequences

You'll see it pulls up a Jalview window of the unaligned sequences, and then a window with the sequences aligned. There's a bit going on under the hood: the system first checks our database to see if we have already stored a pre-computed MSA for this gene family and, if so, pulls it out and immediately displays it in a second window. If none is found, and the number of sequences is 50 or less, an MSA calculation is automatically launched (via CGI), and the result is displayed upon completion.
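
The conditional behaviour described above can be sketched as a small decision function. This is only an illustration in Python, not Phytozome's or Jalview's actual code: `choose_action`, `MAX_AUTO_ALIGN` and the returned labels are hypothetical names, and the only detail taken from the description is the limit of 50 sequences.

```python
from typing import Optional

# Limit taken from the description above; the constant name is hypothetical.
MAX_AUTO_ALIGN = 50


def choose_action(cached_msa: Optional[str], n_sequences: int) -> str:
    """Decide what to do once the unaligned sequences are on screen.

    cached_msa  -- a pre-computed alignment from the database, or None
    n_sequences -- number of sequences in the gene family
    """
    if cached_msa is not None:
        # A stored MSA exists: display it straight away.
        return "show_cached"
    if n_sequences <= MAX_AUTO_ALIGN:
        # Small enough family: launch the alignment automatically.
        return "launch_alignment"
    # Too many sequences: leave it to the user to start an alignment.
    return "no_action"
```

A start-up script would presumably perform the same check before deciding whether to call an alignment service.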

Ok. I presume you've already managed to implement something that can do the cache-check and CGI submission step.

We had hoped to reproduce this conditional launching of the MSA calculation via a Groovy script, but you point out that apparently --groovy /SCRIPT/ on the Jalview startup fires off a script /before/ completion of the initial load, correct? So what would be really useful for us would be the ability to have that commandline script execute subsequent to the initial sequence load.

Yes. In fact, that is already what happens (I just checked the code to make sure :slight_smile: ): if you have passed in an alignment on the command line, the script is only executed once the alignment is loaded. I've also just checked in an example script that will do roughly what you need - see
http://source.jalview.org/gitweb/?p=jalview.git;a=blob;f=examples/groovy/alignLoadedFile.groovy;

The remaining problem was that the groovy argument in the currently released versions is only able to execute a script from a local file or from STDIN. In your case, you'd want to specify a script as a URL. So - completely ignoring any security issues (see - http://issues.jalview.org/browse/JAL-899 ) I've tweaked Jalview to enable this functionality, and added in some progress indicators so the user knows what's going on whilst groovy gets its act together (since the script fetch step seems to take ages on my machine .. any suggestions on speeding that step up would be much appreciated). Get the latest code from the development URL or from the git repository at http://source.jalview.org/git/jalview.git

We're also working on getting various coloring behaviors to work in the new Jalview (e.g., when we send up more than one gene family to Jalview at the same time, each family's sequence labels get a distinct color).

ah - I guess this is related to this jalview-discuss post from Joni Fazo at lbl.gov (http://www.jalview.org/pipermail/jalview-discuss/2011-August/000644.html).

Traditionally, this would be done by setting a sequence group's 'idColour' property in an annotation file (http://www.jalview.org/help/html/features/annotationsFormat.html) - but you could also script the creation of groups for the currently displayed alignments in groovy. Bear in mind, however, that these colours would be 'per alignment' annotation, so you'd have to reapply them when the alignment completes. Of course, I suspect that ultimately you would prefer an automatic colouring mechanism that gets applied for new alignment windows. If you (or someone in your group) fancies implementing a new 'colour by ID' system for Jalview, I'd be happy to assist.
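
As a rough illustration of the 'one colour per family' idea, the sketch below picks a distinct hex colour for each gene family by spacing hues evenly around the colour wheel. It is written in Python purely for illustration and does not touch Jalview at all: `family_colours` and the family names are hypothetical, and how the resulting values are written into an 'idColour' property should be checked against the annotation file documentation linked above.

```python
import colorsys


def family_colours(families):
    """Map each family name to a distinct RRGGBB hex string.

    Repeated names get the same colour; first-seen order is kept.
    """
    unique = list(dict.fromkeys(families))  # drop repeats, keep order
    colours = {}
    for i, name in enumerate(unique):
        hue = i / len(unique)  # evenly spaced hues in [0, 1)
        r, g, b = colorsys.hsv_to_rgb(hue, 0.6, 0.9)
        colours[name] = "%02x%02x%02x" % (
            round(r * 255), round(g * 255), round(b * 255))
    return colours


if __name__ == "__main__":
    # Hypothetical family labels, one entry per sequence.
    for name, colour in family_colours(["famA", "famB", "famA", "famC"]).items():
        print(name, colour)
```

Evenly spaced hues stay easy to tell apart for a handful of families; with very many families neighbouring colours become hard to distinguish, so a fixed palette may be the better choice there.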

I hope this helps! The exercise with groovy helped squish a few bugs related to progress bars, so Jalview has benefited already :slight_smile:

Jim.

···

On 03/08/2011 18:55, David M. Goodstein wrote:

Thanks Jim. This is super-helpful. We will pull down the code changes and sample groovy scripts and see if we can apply them to our case.

-David

···

On Aug 5, 2011, at 9:13 AM, Jim Procter wrote:
