The similarity between the original training and test samples. In each of the files below, the first line lists the PDB IDs of the test proteins, and the first column lists the PDB IDs of the training proteins.
pairwise structural similarity
pairwise sequence similarity
Similarities inside the training and test sets
(not used in the paper but may be useful for related studies)
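The layout described for the training-versus-test files (test PDB IDs on the first line, training PDB IDs in the first column) can be read with a few lines of Python. The following is only a sketch under stated assumptions: the files are assumed to be whitespace-delimited text with one similarity value per training/test pair, which this page does not state; the function names and the toy IDs and values are made up for illustration.

```python
import io


def read_similarity_matrix(handle, sep=None):
    """Parse a training-by-test similarity table.

    Assumed layout (the delimiter is a guess, not stated by the source):
      first line:   PDB IDs of the test proteins
      other lines:  a training PDB ID followed by one value per test protein
    sep=None splits on any whitespace; pass "," or "\t" if the files differ.
    """
    rows = [line.strip().split(sep) for line in handle if line.strip()]
    if len(rows) < 2:
        raise ValueError("expected a header line and at least one data row")
    header, body = rows[0], rows[1:]

    n_values = len(body[0]) - 1
    if n_values < 1:
        raise ValueError("data rows must contain an ID and at least one value")
    # Keep only the last n_values header fields, so an optional corner
    # label in front of the test IDs is tolerated.
    test_ids = header[-n_values:]

    table = {}
    for fields in body:
        train_id, values = fields[0], fields[1:]
        if len(values) != len(test_ids):
            raise ValueError(f"row {train_id!r} has {len(values)} values, "
                             f"expected {len(test_ids)}")
        table[train_id] = dict(zip(test_ids, (float(v) for v in values)))
    return test_ids, table


def max_similarity_to_training(test_ids, table):
    """For each test protein, the highest similarity to any training protein."""
    return {t: max(row[t] for row in table.values()) for t in test_ids}


if __name__ == "__main__":
    # Toy data with hypothetical IDs and values, not taken from the real files.
    toy = io.StringIO("1abc 2xyz\n3def 0.9 0.1\n4ghi 0.3 0.7\n")
    ids, sims = read_similarity_matrix(toy)
    print(max_similarity_to_training(ids, sims))
```

The per-test-protein maximum is one possible way to summarise such a table, for example when deciding which training proteins are too similar to a given test protein; it is shown here as an illustration and is not claimed to be the procedure used in the paper.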
Y. Li and J. Yang, Structural and Sequence Similarity Makes a Significant Impact on Machine-Learning-Based Scoring Functions for Protein-Ligand Interactions, Journal of Chemical Information and Modeling, 57: 1007-1012 (2017).