ESLpred2

ESLpred2 is an improved version of ESLpred, our earlier and most popular method, which can predict four major localizations (cytoplasmic, mitochondrial, nuclear and extracellular) with an accuracy of 88%. Besides adding new protein features, the present method incorporates two training datasets: the RH2427 dataset (a generalized dataset for eukaryotic proteins) and the BaCelLo dataset (a kingdom-specific dataset covering animal, fungi and plant proteins).

The major improvement of ESLpred2 over ESLpred comes from using position-specific scoring matrices (PSSM) as an input feature on the same RH2427 dataset. The SVM model based on profile composition classified proteins with an accuracy of 88.6%. Training the SVM model on the whole PSSM profile together with whole and N-terminal sequence composition yielded an accuracy of 91.7%. Finally, combining similarity-search-based information with the amino acid composition of a single sequence (whole and N-terminal) and the profiles raised the overall accuracy to ~94% and the average accuracy over the four localizations to 93.1%. With this hybrid approach, cytoplasmic, mitochondrial, nuclear and extracellular proteins were predicted with accuracies of 89.6%, 90.7%, 96.4% and 95.7%, respectively.

ESLpred2 also attained accuracies of 80.8%, 75.9% and 76.6% for kingdom-specific animal, fungi and plant proteins, respectively, which are the best accuracies reported to date for the same dataset. ESLpred2 thus provides additional promising features for the prediction of subcellular localization.

Click DATA to obtain the PK7579 dataset of eukaryotic proteins used to develop ESLpred2.

If you are using this webserver, please cite: Aarti Garg and G. P. S. Raghava (2008). ESLpred2: Improved Method for Predicting Subcellular Localization of Eukaryotic Proteins. BMC Bioinformatics 9:503.
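As an illustration of the whole and N-terminal sequence composition features mentioned in the description above, the following minimal Python sketch computes a 40-value feature vector from a single protein sequence. It is not the ESLpred2 implementation: the function names, the N-terminal window length (`n_term_len=25`) and the handling of non-standard residues are assumptions made for this example only.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the percentage of each standard amino acid in seq (20 values).

    Non-standard characters (e.g. X, B, Z) are ignored; this is an
    assumption of the sketch, not a documented ESLpred2 rule.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [100.0 * counts[aa] / total for aa in AMINO_ACIDS]


def composition_features(seq, n_term_len=25):
    """Concatenate whole-sequence and N-terminal composition (40 values).

    n_term_len is a hypothetical window size chosen for illustration.
    """
    return composition(seq) + composition(seq[:n_term_len])


if __name__ == "__main__":
    example = "MKKLA" * 10  # toy sequence, not a real protein
    features = composition_features(example)
    print(len(features))
```

A vector of this kind could then be passed, alongside profile-derived features, to any SVM library for training; the choice of library and kernel is left open here.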