Benchmark dataset

The benchmark dataset and the contacts predicted by 15 predictors can be downloaded from the following links.

Link Description
Target information (17 KB) The target information of the dataset. The file has 5 columns:
   1. target: The name of each target
   2. MSA used by MetaPSICOV: The number of sequences in the MSA used by MetaPSICOV
   3. N_eff: The number of remaining sequences after filtering by hhfilter
   4. structural class: a for alpha proteins, b for beta proteins and c/d for alpha+beta proteins
   5. target type: easy, medium, and hard
Sequences (75 KB) Amino acid sequences of proteins in the dataset
Structures (13 M) PDB structures of proteins in the dataset
MSA (225 M) The multiple sequence alignments used by MetaPSICOV
Annotations of all residue-residue pairs (60 M) Annotations of all residue-residue pairs for all targets. Each file has 4 columns:
   1. res1: The first residue of a pair
   2. res2: The second residue of a pair
   3. distance: The distance (in Angstrom) between the first residue and the second residue
   4. label: TRUE for contact pairs (i.e., distance <= 8 Angstrom), FALSE for non-contact pairs (i.e., distance > 8 Angstrom)
Predicted contacts on the dataset of 680 proteins (967 M) The contacts predicted by 15 locally installed contact predictors on the dataset of 680 proteins
Data on CASP12 (25 M) The contact annotations for 17 CASP12 targets and the assessment results for the contacts predicted by the 15 locally installed predictors.
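
The four-column layout of the residue-residue annotation files (res1, res2, distance, label) can be read with a few lines of Python. The following is only a minimal sketch: it assumes the columns are whitespace-separated and that a header line starting with "res1" may be present, neither of which is stated on this page. The function names parse_annotations and is_contact are hypothetical helpers, and the sample rows are made up for illustration. Residue identifiers are kept as strings because their exact format is not documented here.

```python
import io

# Contact threshold in Angstrom, taken from the label definition
# (TRUE if distance <= 8, FALSE otherwise).
CONTACT_THRESHOLD = 8.0


def parse_annotations(handle):
    """Yield (res1, res2, distance, label) tuples from an annotation file.

    Assumption: columns are whitespace-separated and an optional header
    line begins with 'res1'. Adjust if the real files differ.
    """
    for line in handle:
        fields = line.split()
        if len(fields) != 4 or fields[0] == "res1":
            continue  # skip blank lines, malformed lines, and the header
        res1, res2, distance, label = fields
        yield res1, res2, float(distance), label.upper() == "TRUE"


def is_contact(distance, threshold=CONTACT_THRESHOLD):
    """Apply the contact definition: distance <= threshold."""
    return distance <= threshold


# Made-up sample rows standing in for a real annotation file.
sample = io.StringIO(
    "res1 res2 distance label\n"
    "1 2 3.8 TRUE\n"
    "1 9 8.0 TRUE\n"
    "1 30 15.2 FALSE\n"
)

records = list(parse_annotations(sample))

# Rows whose label disagrees with the distance-based definition.
mismatches = [r for r in records if is_contact(r[2]) != r[3]]
print(len(records), len(mismatches))
```

For a downloaded file, the same functions could be applied by replacing the in-memory sample with open(path), where path points to one annotation file from the archive.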


Reference: