Mathematical tools developed in the context of Shannon information theory were used to analyze the meaning of the BLOSUM score, which was split into three components termed the BLOSUM spectrum (or BLOSpectrum). These relate, respectively, to the sequence convergence (the stochastic similarity of the two protein sequences), to the background frequency divergence (the typicality of the amino acid probability distribution in each sequence), and to the target frequency divergence (the compliance of the amino acid variations between the two sequences with the protein model implicit in the BLOCKS database).
This site applies that decomposition to classify a sequence within the SCOP database and to produce the fingerprints of a SCOP family.
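The three-way split described above can be illustrated with a small sketch. It is not the code running on this site. It assumes an ungapped alignment of two equal-length sequences, an unrounded log-odds score averaged per aligned position (in bits), and mutual information as the sequence convergence term. The function name `blospectrum` and the `background` and `target` dictionaries are hypothetical inputs chosen for illustration; real background and target frequencies would come from the BLOCKS database.

```python
import math
from collections import Counter


def blospectrum(seq_x, seq_y, background, target):
    """Split the average log-odds score of an ungapped alignment into three terms.

    background: dict mapping a residue a to its background frequency p[a]
    target:     dict mapping a pair (a, b) to its target frequency q[a, b]
    Both are hypothetical inputs; every aligned pair must have q[a, b] > 0.
    """
    if len(seq_x) != len(seq_y) or not seq_x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(seq_x)

    # Empirical joint and marginal frequencies observed in the alignment.
    joint = {pair: c / n for pair, c in Counter(zip(seq_x, seq_y)).items()}
    fx = {a: c / n for a, c in Counter(seq_x).items()}
    fy = {b: c / n for b, c in Counter(seq_y).items()}

    # Average log-odds score per aligned position (no integer rounding).
    score = sum(
        f * math.log2(target[a, b] / (background[a] * background[b]))
        for (a, b), f in joint.items()
    )
    # Sequence convergence, taken here as the mutual information I(X;Y).
    convergence = sum(
        f * math.log2(f / (fx[a] * fy[b])) for (a, b), f in joint.items()
    )
    # Target frequency divergence D(joint || target).
    target_div = sum(f * math.log2(f / target[a, b]) for (a, b), f in joint.items())
    # Background frequency divergence D(fx || p) + D(fy || p).
    background_div = sum(f * math.log2(f / background[a]) for a, f in fx.items())
    background_div += sum(f * math.log2(f / background[b]) for b, f in fy.items())

    return {
        "score": score,
        "convergence": convergence,
        "background_divergence": background_div,
        "target_divergence": target_div,
    }


# Toy two-letter alphabet; the target marginals equal the background frequencies.
p = {"A": 0.6, "B": 0.4}
q = {("A", "A"): 0.45, ("A", "B"): 0.15, ("B", "A"): 0.15, ("B", "B"): 0.25}
spectrum = blospectrum("AABAB", "AABBA", p, q)
```

Under these assumptions the score equals convergence plus background divergence minus target divergence, which follows from expanding the logarithm term by term; pairs that never occur in the alignment contribute nothing to any of the sums.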
On this website you can:
- Build a database
- Classify a sequence
- Draw fingerprints of a database
- Find the BLOSpectrum between two sequences
References:
- A. Casagrande and F. Fabris.
Family Fingerprints: A Global Approach to Structural Classification
Journal of Bioinformatics and Computational Biology, 2012.
- A. Casagrande and F. Fabris.
SCOP Family Fingerprints: An Information Theoretic Approach to Structural Classification of Protein Domains
Proceedings of Computational Structural Bioinformatics Workshop 2011 (CSBW 2011), Atlanta (USA), November 12, 2011.
- F. Fabris, A. Sgarro and A. Tossi.
Splitting the BLOSUM Score into Numbers of Biological Significance
EURASIP Journal on Bioinformatics and Systems Biology, 2007.