2. A Structural Classification of Proteins
Database for the Investigation of Sequences
and Structures
Website:
http://scop.mrc-lmb.cam.ac.uk/scop/
Manual classification of protein structural
domains based on similarities of their
structures and amino acid sequences.
Created in 1994.
Maintained by Alexei G. Murzin and his
colleagues at the Laboratory of Molecular
Biology in Cambridge, England.
3. • It provides a detailed and
comprehensive description of the
relationships among known protein
structures.
• Classification is organized in hierarchical levels.
• It is freely accessible.
• The current version of SCOP is 2.03
(October 2013), distributed as SCOPe (SCOP-extended):
http://scop.berkeley.edu/
4. Family (identical structure and function)
◦ Proteins whose pairwise residue identities are 30%
or greater (or whose pairwise sequence similarity exceeds 25%)
◦ Proteins with lower sequence identities
whose functions and structures are nevertheless very
similar
◦ Clear evolutionary relationship
Superfamily (common structure and function)
◦ Probable common ancestry
◦ Families whose proteins have low sequence
identities but whose structures and, in many
cases, functional features suggest that a common
evolutionary origin is probable.
◦ Example: the variable and constant domains of
immunoglobulins.
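The 30% residue-identity rule of thumb for families can be illustrated with a small check on a pair of already-aligned sequences. This is only a sketch: the function names and the threshold constant are hypothetical, gaps are assumed to be written as "-", and identity is computed over gap-free columns, which is one of several conventions in use. SCOP itself is curated manually, so a check like this could at most flag candidate family members.

```python
FAMILY_IDENTITY_THRESHOLD = 30.0  # percent; rule of thumb quoted for SCOP families


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def suggests_same_family(aligned_a: str, aligned_b: str) -> bool:
    """True when pairwise identity meets the 30% rule of thumb (assumption: inclusive)."""
    return percent_identity(aligned_a, aligned_b) >= FAMILY_IDENTITY_THRESHOLD
```

Pairs that fall below the threshold may still belong to one family or superfamily when structure and function are very similar, which is why the sequence criterion is only the first of the bullets above.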
5. Fold (common core structure )
◦ Major structural similarity
◦ SSEs (secondary structure elements) in similar arrangement
◦ Superfamilies and families are defined as having a common
fold if their proteins have the same major secondary
structures in the same arrangement and with the same
topological connections.
Class
◦ Similar secondary structure content
◦ Folds have been grouped into five structural classes:
1. all-α, those whose structure is essentially formed by α-helices;
2. all-β, those whose structure is essentially formed by β-sheets;
3. α/β, those with α-helices and β-strands that are largely interspersed;
4. α+β, those in which α-helices and β-strands are largely segregated;
5. multi-domain, those with domains of different fold and
for which no homologues are known at present.
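The four levels (class, fold, superfamily, family) map naturally onto a four-field record. SCOP releases also label each node with a dotted string of the form class.fold.superfamily.family (the "sccs", which these slides do not mention), where the letters a to e stand for the five classes listed above. The sketch below, with hypothetical type and function names, parses such a string and reports the deepest level two entries share.

```python
from typing import NamedTuple

# Class letters for the five classes listed above.
CLASS_NAMES = {
    "a": "all-α",
    "b": "all-β",
    "c": "α/β",
    "d": "α+β",
    "e": "multi-domain",
}


class Sccs(NamedTuple):
    cls: str          # class letter, e.g. "b"
    fold: int
    superfamily: int
    family: int


def parse_sccs(text: str) -> Sccs:
    """Split a string such as 'b.1.1.1' into its four hierarchy fields."""
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected class.fold.superfamily.family, got {text!r}")
    cls, fold, superfamily, family = parts
    return Sccs(cls, int(fold), int(superfamily), int(family))


def deepest_shared_level(a: Sccs, b: Sccs) -> str:
    """Name the lowest level at which two entries are still grouped together."""
    shared = "none"
    for name, x, y in zip(("class", "fold", "superfamily", "family"), a, b):
        if x != y:
            break
        shared = name
    return shared
```

For example, `deepest_shared_level(parse_sccs("b.1.1.1"), parse_sccs("b.1.1.2"))` returns `"superfamily"`: the two entries sit in different families of the same superfamily.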
10. Access methods:
◦ Enter SCOP at the top of the hierarchy
◦ Keyword search of SCOP entries
◦ From a PDB identifier
◦ SCOP parseable files (MRC site)
◦ All SCOP releases and reclassified entry
history (MRC site)
◦ pre-SCOP - preview of the next release
◦ SCOP domain sequences and PDB-style coordinate
files (ASTRAL)
◦ Hidden Markov Model library for SCOP superfamilies
(SUPERFAMILY)
◦ Structural alignments for proteins with non-trivial
relationships (SISYPHUS)
◦ Online resources of potential interest to SCOP users
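Two of the access routes above, the parseable files and lookup from a PDB identifier, can be combined in a few lines. The sketch below assumes the tab-separated layout of the classification file (domain identifier, PDB identifier, region, sccs string, numeric node identifier, and a comma-separated list of node identifiers per level); check the layout against the release you download. The sample records, PDB codes, and function names are made up for illustration and are not copied from any SCOP release.

```python
from collections import defaultdict

# Illustrative records only; identifiers are invented, not taken from a release.
SAMPLE = """\
# sid\tpdb\tregion\tsccs\tsunid\tnodes
d0aaaa1\t0aaa\tA:1-100\ta.1.1.1\t100001\tcl=1,cf=2,sf=3,fa=4,dm=5,sp=6,px=100001
d0aaaa2\t0aaa\tA:101-200\tb.1.1.1\t100002\tcl=11,cf=12,sf=13,fa=14,dm=15,sp=16,px=100002
d0bbba_\t0bbb\tA:\tb.1.1.1\t100003\tcl=11,cf=12,sf=13,fa=14,dm=15,sp=17,px=100003
"""


def parse_cla_line(line: str) -> dict:
    """Parse one tab-separated classification record into a dictionary."""
    sid, pdb_id, region, sccs, sunid, nodes = line.rstrip("\n").split("\t")
    # "cl=1,cf=2,..." becomes {"cl": 1, "cf": 2, ...}
    node_ids = {key: int(value) for key, value in
                (item.split("=") for item in nodes.split(","))}
    return {"sid": sid, "pdb_id": pdb_id, "region": region,
            "sccs": sccs, "sunid": int(sunid), "nodes": node_ids}


def index_by_pdb(text: str) -> dict:
    """Group domain records by PDB identifier, skipping comments and blank lines."""
    index = defaultdict(list)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        record = parse_cla_line(line)
        index[record["pdb_id"]].append(record)
    return dict(index)
```

With the sample above, `index_by_pdb(SAMPLE)["0aaa"]` holds two domain records for one PDB entry, mirroring how a single structure can contribute domains to different classes.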
18. SCOP classification is used:
1. As a reference data set for developing automatic
classification methods used in analyzing families,
superfamilies, and folds
2. For integrative structural data mining to develop
predictive methods and structure-comparison
tools
19. • To understand the evolution of protein enzymatic
functions, evolutionary change of protein folds, and
hierarchical structural evolution.
• To study distantly related proteins with the
same fold.
• To study sequence and structure variability and
their interdependence in homologous proteins.
• To derive amino acid similarity matrices and
substitution tables useful for sequence
comparison and fold recognition studies.
20. • To study the structural anatomy of folds and
domains and to extract structural principles for use in
protein design experiments.
• SCOP domains have been used to study
combinations of different domains and their
decomposition in multidomain proteins.
• Recent structural genomics projects have been
using SCOP extensively to identify new targets.
• SCOP families have been used to develop
value-added and more specialized databases.
21. • Experimental structural biologists can explore the region of
structure space near the protein they are currently studying.
• Molecular biologists: the categorization helps locate
proteins of interest, and the links make exploration easy.
• Theoreticians will likely find it most useful to browse the
wide range of protein folds currently known.
• SCOP also has pedagogical use, because it organizes structures
in an easily comprehensible manner.
22. • SCOP can be used for detailed searching of
particular families.
• Analysis of the growth of structural data
gives an estimate of the total number of
protein folds and superfamilies that exist in
nature.
• It provides a level of classification not
present in the Protein Data Bank (PDB).