Help needed - Marking up a list of peptides onto a protein alignment

Hi

I have been looking for a way to map peptide sequences onto alignments. I recently came across Jalview and it looks like this might do it. I am particularly interested in the features list created by the searches.

Here is what I want to achieve:

I have an alignment.

I have a long list of peptide sequences (>1000 peptides) divided into 6 separate groups. Most of these peptides will not be present in my alignment but some will. I want to search the alignment for these peptides and mark up wherever a 100% match is found. I want to use 6 different colours for my 6 peptide groups.

It looks like it should be possible for me to define a feature list file which contains the peptide sequence, the peptide group and the colour I want for each group.

But I don’t know how to create and use such a feature list file. Any help would save me 100s of hours. I am currently manually shading each peptide onto alignments using GeneDoc.

Thank you for your help.

Best Wishes

Manoj

Hello Manoj,

Thanks for getting in touch. Let me check I have understood what you are aiming for.

Jalview lets you search any alignment for any sequence motif (which could be an exact match or a regular expression pattern). You can search for all matches, and create sequence features on the alignment for matches found (http://www.jalview.org/help/html/features/search.html). When you create a feature, you can assign a name and a colour to it.

It sounds like you want to do this but with a couple of complications:

  • you have a large number of motifs (peptides) you want to search for
  • these are grouped, and you want to generate a distinct feature type / colour for each group

Assuming that is right, I can have a think about whether we could provide a script to perform this in one action (or maybe one action per group - which would be simpler and I think still save you a lot of time).

Best regards,

Mungo

Mungo Carstairs
Jalview Computational Scientist

The Barton Group
Division of Computational Biology

School of Life Sciences

University of Dundee, Dundee, Scotland, UK

www.jalview.org

www.compbio.dundee.ac.uk
g.m.carstairs@dundee.ac.uk

We’re Scottish University of the Year again!
The Times / Sunday Times Good University Guide 2016 and 2017


From: jalview-discuss-bounces@jalview.org jalview-discuss-bounces@jalview.org on behalf of Manoj Kumar manoj.kumar@manchester.ac.uk
Sent: 21 December 2019 19:15
To: jalview-discuss@jalview.org jalview-discuss@jalview.org
Subject: [Jalview-discuss] Help needed - Marking up a list of peptides onto a protein alignment

-----------------------------------------------------------------

Manoj Kumar PhD FHEA

Research Fellow

Faculty of Biology, Medicine and Health

Michael Smith Building, University of Manchester

Oxford Road, Manchester. M13 9PT.


The University of Dundee is a registered Scottish Charity, No: SC015096

Hello Manoj,

Jalview can’t do this directly (though we can raise an enhancement request for it), but it can be done by adapting and running the attached script in the Groovy console (see http://www.jalview.org/help/html/features/groovy.html).

First format your search motifs in a plain text, tab-delimited file with columns for motif (peptide to match), feature name to generate, and feature colour (a colour name, rgb or hex colour, see http://www.jalview.org/help/html/features/featuresFormat.html#colourdefs).
For example:
STS	motif1	green
LRS	motif1	green
LKS	motif2	blue
NTQ	motif2	blue

This example can be used with Jalview example file http://www.jalview.org/examples/uniref50.fa if you want to try this out.
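If your peptides currently live in six separate lists, a small helper can produce the three-column, tab-delimited file described above. The following is only a sketch in Python, not part of the attached Groovy script; the function names, the `example_groups` data and the output path are hypothetical and used purely for illustration.

```python
from pathlib import Path


def build_motif_lines(groups):
    """Turn {feature_name: (colour, [peptides])} into tab-delimited lines.

    Each line has the three columns described in the thread:
    motif, feature name, feature colour.
    """
    lines = []
    for feature_name, (colour, peptides) in groups.items():
        for peptide in peptides:
            peptide = peptide.strip()
            if peptide:  # skip empty entries
                lines.append(f"{peptide}\t{feature_name}\t{colour}")
    return lines


def write_motifs_file(groups, out_path):
    """Write the motif lines to a plain text file (path is an assumption)."""
    text = "\n".join(build_motif_lines(groups)) + "\n"
    Path(out_path).write_text(text, encoding="utf-8")


# Hypothetical data mirroring the example in this message.
example_groups = {
    "motif1": ("green", ["STS", "LRS"]),
    "motif2": ("blue", ["LKS", "NTQ"]),
}
```

Calling `write_motifs_file(example_groups, "motifs.txt")` would then write the same four lines as the example, ready to be referenced from the script.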

1. Edit the attached script to include the full path to your ‘motifs file’ at line 11.
2. Load your alignment into Jalview, and open the Groovy console (from the Tools menu). NB this may not work if running Jalview with Java 11, but should work with Java 8.
3. Paste the script into the Groovy window.
4. Check you don’t have an active ‘sub-selection’ (hit the Esc key), as this would restrict any ‘Find’ action.
5. In the alignment window, Calculate menu, choose ‘Run Groovy console script’.

All being well, this should find motif matches in the alignment, and create and colour sequence features for the matches.
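For readers who cannot run the Groovy console, the same idea can be approximated outside Jalview. The sketch below is not the attached script: it is a standalone Python illustration that searches ungapped sequences for exact peptide matches and emits text in what I understand to be the Jalview features file layout (colour lines, then `description, sequenceId, sequenceIndex, start, end, featureType`; check the featuresFormat help page before relying on it). It assumes every sequence is numbered from residue 1 and that gaps are written as `-` or `.`; all function names are hypothetical.

```python
def find_exact_matches(sequences, motifs):
    """Find every exact occurrence of each peptide in each sequence.

    sequences: {sequence_id: aligned_sequence} (gaps '-' and '.' are ignored)
    motifs:    list of (peptide, feature_name, colour)
    Returns (peptide, sequence_id, start, end, feature_name) tuples,
    with 1-based inclusive residue positions (assumes numbering starts at 1).
    """
    hits = []
    for seq_id, aligned in sequences.items():
        residues = aligned.replace("-", "").replace(".", "").upper()
        for peptide, feature_name, _colour in motifs:
            start = residues.find(peptide)
            while start != -1:
                hits.append((peptide, seq_id, start + 1,
                             start + len(peptide), feature_name))
                # Step by one so overlapping occurrences are also found.
                start = residues.find(peptide, start + 1)
    return hits


def to_features_text(hits, motifs):
    """Render hits as tab-delimited features text (assumed layout)."""
    colours = {}
    for _peptide, feature_name, colour in motifs:
        colours.setdefault(feature_name, colour)
    lines = [f"{name}\t{colour}" for name, colour in colours.items()]
    for peptide, seq_id, start, end, feature_name in hits:
        # -1 is used for sequenceIndex when the sequence is given by ID.
        lines.append(f"{peptide}\t{seq_id}\t-1\t{start}\t{end}\t{feature_name}")
    return "\n".join(lines) + "\n"
```

The resulting text could be saved to a file and loaded onto the alignment as a features file. If your sequence IDs carry a start offset (for example `/10-120`), the positions would need shifting accordingly, which this sketch does not do.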

The script is coded to accept a fourth column for feature score if wanted, say if some of the motifs you are matching are more significant / informative than others. This would allow you to graduate the feature colours between lowest and highest scores, or to filter based on a threshold (see http://www.jalview.org/help/html/features/featureschemes.html).

Feel free to ignore this if it is not wanted; scores will then all default to zero.

In the example above, I have assigned the same feature name to each motif (in the same group / with the same colour), enabling a score-graduated colour scheme if wanted.

Alternatively, you could give each motif a different feature name (e.g. just repeat the motif as the second column) - this would then allow you to hide or show selected motifs in Jalview.

Any problems, or further requests, please let me know!

Best regards,

Mungo

groovyMotifs.txt (988 Bytes)


From: jalview-discuss-bounces@jalview.org jalview-discuss-bounces@jalview.org on behalf of Mungo Carstairs (Staff) g.m.carstairs@dundee.ac.uk
Sent: 24 December 2019 20:58
To: Manoj Kumar manoj.kumar@manchester.ac.uk; jalview-discuss@jalview.org jalview-discuss@jalview.org
Subject: Re: [Jalview-discuss] Help needed - Marking up a list of peptides onto a protein alignment

And just to add:

The script ignores lines starting with ‘#’, so you can use these as comments in the data file to document its contents, change history or whatever.

The generated features are assigned the peptide motif as description, so this shows in the tooltip when mousing over the alignment. Obvious enough when doing exact matches, but the script will also work with regular expression patterns.
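To check a motifs file before running it through Jalview, the conventions described in this thread (tab-delimited columns, an optional fourth score column that defaults to zero, and ‘#’ comment lines) can be mirrored in a few lines of Python. This is a hedged sketch of those rules only, not a port of the attached Groovy script; `parse_motifs` is a hypothetical name, and skipping blank lines is my own assumption.

```python
def parse_motifs(text):
    """Parse motifs-file text into (motif, feature_name, colour, score) tuples.

    Lines starting with '#' are treated as comments; blank lines are skipped
    (an assumption). A missing fourth column gives a score of 0.0.
    """
    entries = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) < 3:
            raise ValueError(
                f"line {line_number}: expected at least 3 tab-delimited "
                f"columns, found {len(fields)}"
            )
        motif, feature_name, colour = fields[:3]
        score = float(fields[3]) if len(fields) > 3 and fields[3] else 0.0
        entries.append((motif, feature_name, colour, score))
    return entries
```

Running this over a file first gives an early error for rows that were saved with spaces instead of tabs, which is an easy mistake to make when exporting from a spreadsheet or word processor.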

Mungo


From: jalview-discuss-bounces@jalview.org jalview-discuss-bounces@jalview.org on behalf of Mungo Carstairs (Staff) g.m.carstairs@dundee.ac.uk
Sent: 26 December 2019 17:42
To: Manoj Kumar manoj.kumar@manchester.ac.uk; jalview-discuss@jalview.org jalview-discuss@jalview.org
Subject: Re: [Jalview-discuss] Help needed - Marking up a list of peptides onto a protein alignment

For any interested readers, https://issues.jalview.org/browse/JAL-3499 has the example Groovy script and input file attached, and will track any follow-up permanent enhancement to Jalview.
