The following links provide the datasets used in the development of CellPPD:



CellPPD Dataset

Dataset | Description | Download
CPPsite1 | Contains 708 known CPPs (positive examples) and 708 randomly generated peptides as non-CPPs. | P, N
CPPsite2 | Contains 187 known CPPs with high uptake efficiency as positive examples and an equal number of randomly generated peptides as non-CPPs. | P, N
CPPsite3 | Contains 187 known CPPs with high uptake efficiency as positive examples and 187 CPPs with low uptake efficiency as non-CPPs. | P, N
Independent | Contains 99 known CPPs collected from the literature, which were not included in the training set, and 99 randomly generated non-CPPs. | P, N
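Each dataset is provided as a positive (P) file and a negative (N) file, so a typical first step is to merge the two into one labeled list. The sketch below illustrates this under an assumption not stated on this page: that each file holds one peptide sequence per line, possibly with FASTA-style header lines. The function name `label_peptides` and the toy sequences are hypothetical.

```python
from typing import Iterable, List, Tuple


def label_peptides(
    positive_lines: Iterable[str], negative_lines: Iterable[str]
) -> List[Tuple[str, int]]:
    """Return (sequence, label) pairs: 1 for CPP, 0 for non-CPP.

    Assumes one peptide per line; the actual file layout may differ.
    """
    labeled = []
    for label, lines in ((1, positive_lines), (0, negative_lines)):
        for line in lines:
            seq = line.strip().upper()
            # Skip blank lines and FASTA-style header lines, if present.
            if not seq or seq.startswith(">"):
                continue
            labeled.append((seq, label))
    return labeled


# Toy input standing in for the contents of a P file and an N file.
example = label_peptides(["GRKKRRQRRRPPQ\n", "\n"], ["ACDEFGHIK\n"])
print(example)
```

With real files, the two arguments could be open file handles, since iterating over a text file yields its lines.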




Benchmark Datasets

Dataset | Description | Download | Reference
Sanders-2011a | Contains 111 experimentally validated CPPs and an equal number of non-CPPs (generated randomly from the chicken proteome). | P, N | PMID: 21779156
Sanders-2011b | Contains 111 experimentally validated CPPs and 34 experimentally validated non-CPPs. | P, N | PMID: 21779156
Sanders-2011c | Contains 111 known CPPs and 111 peptides randomly sampled from 34 known non-CPPs. | P, N | PMID: 21779156
Dobchev-2010 | Contains 74 known CPPs and 24 known non-CPPs. | P, N | PMID: 20402661
Hansen-2008 | Contains 66 known CPPs and 19 known non-CPPs. | P, N | PMID: 18045726
Hallbrink-2005 | Contains 53 known CPPs and 16 known non-CPPs. | P, N | Link
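Several of the datasets listed above are balanced while others are not, which is worth keeping in mind when comparing accuracy across them. As a rough sketch, the counts given in the descriptions can be tabulated to show the fraction of positive examples in each set. The dictionary layout and the helper `positive_fraction` are illustrative choices, not part of CellPPD.

```python
# (CPP count, non-CPP count) copied from the dataset descriptions above.
counts = {
    "CPPsite1": (708, 708),
    "CPPsite2": (187, 187),
    "CPPsite3": (187, 187),
    "Independent": (99, 99),
    "Sanders-2011a": (111, 111),
    "Sanders-2011b": (111, 34),
    "Sanders-2011c": (111, 111),
    "Dobchev-2010": (74, 24),
    "Hansen-2008": (66, 19),
    "Hallbrink-2005": (53, 16),
}


def positive_fraction(positives: int, negatives: int) -> float:
    """Fraction of examples in a dataset that are CPPs."""
    return positives / (positives + negatives)


for name, (pos, neg) in counts.items():
    frac = positive_fraction(pos, neg)
    print(f"{name}: {pos} CPPs, {neg} non-CPPs, {frac:.2f} positive")
```

On an imbalanced set such as Sanders-2011b, a classifier that labels every peptide as a CPP already scores well on plain accuracy, so a class-aware measure may be more informative there.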