I drilled down into the Group by Tree function to see why it seems slow.
With an alignment of 50 sequences x 1800 (I’m using Mikk Puustusmaa’s DNA data), if you click ‘deep’ in the tree (so creating many groups), the refresh with group colouring takes over a second (Jalview desktop 2.8.2).
It turns out most of the time is spent in SequenceGroup.recalcConservation(). Each call takes around 13 ms, but it is made once for every group.
For example:
This gets more noticeable in a split frame view when there are two alignments performing this action.
Not sure if there are any easy optimisations here.
It still happens with no annotations displayed and with the Purine/Pyrimidine colour scheme selected, which feels unnecessary…?
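As a rough illustration of how the per-group cost adds up, here is a minimal cost model. The 13 ms figure is the one reported above; the group counts, the assumption of exactly one recalculation per group per view, and the class and method names are illustrative only and not taken from Jalview.

```java
public class GroupRecalcCost {
    /**
     * Estimated refresh cost in milliseconds, assuming (hypothetically) that each
     * group triggers exactly one recalculation in each open view.
     */
    public static double estimateMillis(int groups, double millisPerGroup, int views) {
        return groups * millisPerGroup * views;
    }

    public static void main(String[] args) {
        double perGroup = 13.0; // per-call time reported in this message
        // Group counts below are illustrative, not measured.
        for (int groups : new int[] {10, 25, 50}) {
            System.out.printf("%d groups: %.0f ms in one view, %.0f ms in a split frame%n",
                    groups,
                    estimateMillis(groups, perGroup, 1),
                    estimateMillis(groups, perGroup, 2));
        }
    }
}
```

Under this model the cost grows linearly with the number of groups and doubles in a split frame, which matches the observation that the delay is more noticeable there.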
The University of Dundee is a registered Scottish Charity, No: SC015096
Mungo Carstairs
Jalview Computational Scientist
The Barton Group
Division of Computational Biology
College of Life Sciences
University of Dundee, Dundee, Scotland, UK.
www.jalview.org
www.compbio.dundee.ac.uk
Update:
The time is all spent in
AAFrequency.completeConsensus()
so this would be the place to put any optimisation effort in if wanted.
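For orientation, a column-by-column consensus pass over one group might look like the sketch below. This is not Jalview's AAFrequency code; the class name, the method signature, the gap character and the tie-breaking rule are all assumptions made for illustration.

```java
import java.util.Arrays;

public class ColumnConsensusSketch {
    /**
     * For each column, returns the most frequent non-gap ASCII character among the
     * given sequences. Ties go to the lowest character code; a column with no
     * residues yields '-'. Hypothetical helper, not part of Jalview.
     */
    public static char[] consensus(String[] seqs, int width) {
        char[] result = new char[width];
        int[] counts = new int[128];
        for (int col = 0; col < width; col++) {
            Arrays.fill(counts, 0);
            // Count residues in this column, skipping gaps and short sequences.
            for (String s : seqs) {
                if (col < s.length()) {
                    char c = s.charAt(col);
                    if (c != '-' && c < 128) {
                        counts[c]++;
                    }
                }
            }
            // Pick the most frequent residue for the column.
            char best = '-';
            int bestCount = 0;
            for (char c = 0; c < 128; c++) {
                if (counts[c] > bestCount) {
                    bestCount = counts[c];
                    best = c;
                }
            }
            result[col] = best;
        }
        return result;
    }
}
```

In this sketch the per-column reset and scan is paid once for every group regardless of how many sequences the group holds, so many small groups cost more than one large one. Whether the same pattern accounts for the time seen in Jalview would need profiling inside the method itself.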
mungo
From: jalview-dev-bounces@jalview.org on behalf of Mungo Carstairs (Staff) <g.m.carstairs@dundee.ac.uk>
Sent: 03 March 2015 11:34
To: Jalview Development List
Subject: [Jalview-dev] Group by Tree - performance notes
Hi.
> I drilled down into the Group by Tree function to see why it seems slow.
Yes. Conservation is manually calculated - without much regard to colourschemes. It’s an ugly inefficiency that was to be addressed in a future issue related to:
http://issues.jalview.org/browse/JAL-961
I’ve added the following discussion to a new issue:
http://issues.jalview.org/browse/JAL-1680
Any decision as to whether group conservations are needed needs to know about:
- state of group consensus/conservation autoannotation display
- apply to all groups checkbox
- current global colourscheme
If conservation is needed, then either it has to be calculated before the groups are created, or there will be a visible delay after the groups are created whilst an alignment calculation worker does its thing.
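The three inputs listed above could feed a single check along the following lines. This is only a sketch of one possible rule; the class, method and parameter names are hypothetical, and the actual conditions under which Jalview needs group conservation may well differ.

```java
public class GroupConservationPolicy {
    /**
     * One possible (assumed) rule: group conservation is needed if the group
     * conservation/consensus annotation is displayed, or if the global colour
     * scheme depends on conservation and is being applied to all groups.
     */
    public static boolean needsGroupConservation(boolean groupAnnotationShown,
                                                 boolean applyToAllGroups,
                                                 boolean colourSchemeUsesConservation) {
        if (groupAnnotationShown) {
            return true;
        }
        return applyToAllGroups && colourSchemeUsesConservation;
    }
}
```

With a check like this, the case reported earlier in the thread (no annotations displayed, a colour scheme that does not use conservation) would return false, and the per-group calculation could be skipped.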
Jim.
--
-------------------------------------------------------------------
Dr JB Procter, Jalview Coordinator, The Barton Group
Division of Computational Biology, College of Life Sciences
University of Dundee, Dundee DD1 5EH, UK.
+44 1382 388734 | [www.jalview.org](http://www.jalview.org) | [www.compbio.dundee.ac.uk](http://www.compbio.dundee.ac.uk)