fetch of multiple PDB sequences

maria_luisa_rodrigue · 7 April 2009 11:02

Dear all,

I'm trying to load multiple sequences from the PDB database, but Jalview is not recognizing the multiple accession codes that I'm giving as input to the sequence fetcher!

When I give 2GZD;2HV8 as input, I get the following message:
"Error retrieving the 2GZD;2HV8 from PDB"

It seems that it is not using the semi-colon to separate different codes...

Thanks in advance,

Luisa Rodrigues

jimp · 7 April 2009 11:41

Hello Maria.

I do apologise - you've hit a bug which was actually fixed several
months ago in the development version of Jalview, but was not backported
to the latest build of the release version.

A new build of the current release branch is under way, and it
includes this fix. See the version archive for the links to the
different builds (http://www.jalview.org/versions.html).
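
For readers wondering what "using the semi-colon to separate different codes" amounts to, here is a minimal sketch. It is not Jalview's code: the function name `split_accessions` and the validation pattern are assumptions made for illustration, and the pattern only covers classic four-character PDB IDs (a digit followed by three alphanumerics).

```python
import re

# Assumption: classic four-character PDB IDs only (digit + 3 alphanumerics).
PDB_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}$")

def split_accessions(query, sep=";"):
    """Split a fetcher query such as '2GZD;2HV8' into individual PDB IDs."""
    # Trim whitespace, normalise case, and drop empty pieces (e.g. a trailing ';').
    ids = [part.strip().upper() for part in query.split(sep)]
    ids = [part for part in ids if part]
    bad = [part for part in ids if not PDB_ID.match(part)]
    if bad:
        raise ValueError(f"Not valid PDB IDs: {bad}")
    return ids

codes = split_accessions("2GZD;2HV8")
print(codes)
```

Each returned code would then be fetched on its own, rather than passing the whole string "2GZD;2HV8" to the PDB as a single accession.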

Sorry!
Jim.

--
-------------------------------------------------------------------
J. B. Procter (ENFIN/VAMSAS) Barton Bioinformatics Research Group
Phone/Fax:+44(0)1382 388734/345764 http://www.compbio.dundee.ac.uk
The University of Dundee is a Scottish Registered Charity, No. SC015096.

jimp · 13 April 2009 12:24

Hi Maria.

The PDB viewer in Jalview 2.4 does keep track of mappings between a
sequence and a structure, but it ignores the SEQRES entry, and instead
extracts the sequence directly from the residues in the alpha carbon
trace (i.e. the displayed sequence only contains residues with an alpha
carbon ATOM entry). In the case of 2ZET chain A, you'll notice that the
displayed sequence around the loop is:

>PDB>2ZET|2ZET|A/57-60
GASG

If you mouse over this region, the PDBRESNUM feature labels give the PDB
residue number field and insertion code for each residue's corresponding
CA ATOM entry, but these values are not used to map features.
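
The behaviour described above can be sketched in a few lines. This is an illustration under stated assumptions, not Jalview's implementation: the helper names `make_ca_record` and `ca_trace` are hypothetical, the records are synthetic and truncated after the insertion-code column (no coordinates), and the residue numbers 57, 58, 62, 63 are chosen to mimic the gap reported in this thread. Alternate locations and HETATM records are ignored.

```python
# One-letter lookup restricted to the residues used in this toy example.
THREE_TO_ONE = {"GLY": "G", "ALA": "A", "SER": "S"}

def make_ca_record(serial, res_name, chain, res_seq, icode=" "):
    """Build a synthetic, truncated PDB ATOM record for a CA atom (columns 1-27)."""
    return f"ATOM  {serial:>5d}  CA  {res_name:>3s} {chain}{res_seq:>4d}{icode}"

def ca_trace(pdb_lines, chain):
    """Return (sequence, rows) read from CA ATOM records of one chain.

    Each row is (sequential position, PDBRESNUM label). Sequential numbering
    starts at the first residue's PDB number and then simply counts up,
    so a gap in the structure shifts every later position.
    """
    sequence, rows, start = [], [], None
    for line in pdb_lines:
        if not line.startswith("ATOM  "):
            continue
        if line[12:16].strip() != "CA" or line[21:22] != chain:
            continue
        res_seq = int(line[22:26])      # PDB residue number field
        icode = line[26:27].strip()     # insertion code, usually blank
        if start is None:
            start = res_seq
        rows.append((start + len(sequence), f"{res_seq}{icode}"))
        sequence.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(sequence), rows

records = [
    make_ca_record(1, "GLY", "A", 57),
    make_ca_record(2, "ALA", "A", 58),
    make_ca_record(3, "SER", "A", 62),  # residues 59-61 absent from the model
    make_ca_record(4, "GLY", "A", 63),
]
sequence, rows = ca_trace(records, "A")
print(sequence)
for position, pdbresnum in rows:
    print(position, pdbresnum)
```

With these synthetic records the sequence is GASG at positions 57-60, and the serine sits at sequential position 59 while its PDBRESNUM label reads 62, which is the mismatch reported for Ser62.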

Unfortunately, although this isn't perfect for everyone, there are no
plans (currently) to change this behaviour - but I think there is a
workaround for you.

It sounds like what you really want to do is to import the UniProt
sequences associated with each structure. This means the alignment
viewer contains the full sequence, and the CA-trace sequences of any
structures will be aligned to this, so features in the UniProt
coordinate system are properly located on the structure. If you don't
have the UniProt IDs handy, then have a look at PICR
(www.ebi.ac.uk/Tools/picr) which you can use to look up the UniProt IDs
that cross-reference the structures you want to view.
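
To show why aligning the CA-trace sequence to the full sequence puts features in the right place, here is a rough sketch. It does not reproduce Jalview's alignment method, which is not described in this thread: Python's `difflib.SequenceMatcher` stands in for a real pairwise alignment, the function name `map_trace_to_full` is hypothetical, and both sequences are made-up toy data, not Rab27b.

```python
from difflib import SequenceMatcher

def map_trace_to_full(full_seq, trace_seq):
    """Map 1-based CA-trace positions to 1-based full-sequence positions.

    Uses the matching blocks found by difflib as a stand-in for a proper
    pairwise alignment; residues missing from the trace get no entry.
    """
    matcher = SequenceMatcher(None, full_seq, trace_seq, autojunk=False)
    mapping = {}
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            mapping[block.b + offset + 1] = block.a + offset + 1
    return mapping

# Toy data: a 20-residue "full" sequence and a trace missing residues 8-10.
full_seq = "ACDEFGHIKLMNPQRSTVWY"
trace_seq = full_seq[:7] + full_seq[10:]

trace_to_full = map_trace_to_full(full_seq, trace_seq)
full_to_trace = {full: trace for trace, full in trace_to_full.items()}

# A feature annotated at full-sequence position 11 lands on trace position 8.
print(full_to_trace[11])
# Residues 8-10 are not in the structure, so they have no trace position.
print(full_to_trace.get(9))
```

The point of the design is that feature coordinates stay in the full-sequence numbering, and the structure is reached through the mapping, so missing loops no longer shift anything.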

I hope this helps.
Jim.

Maria Luisa Rodrigues wrote:

···

Hi Jim,

I'm using the PDB fetcher option, which is really nice, but I've noticed
that it is not able to cope with missing residues in the structure. It
seems that the program reads the number of the first residue of a chain
and then numbers all the other residues sequentially from that. For
instance, since residues 59-61 of 2ZET chain A (Rab27b protein) are
missing in the structure (I think they should have been included in the
model with high B factors...) Jalview numbered Rab27b Ser62 as Ser59!
This caused an error in the sequence feature annotation.
Am I doing something wrong?

Thanks again,

Luisa
