BIOTECHNO 2014, The Sixth International Conference on Bioinformatics, Biocomputational Systems and Biotechnologies


Improving Protein Sub-cellular Localization Prediction Through Semi-supervised Learning

Authors:
Jorge Alberto Jaramillo-Garzón
César Germán Castellanos-Domínguez

Keywords: Sub-cellular localization, Gene Ontology, Semi-supervised, Support Vector Machines

Abstract:
Predicting the sub-cellular localization of proteins is a fundamental task in bioinformatics, since it can provide useful information for determining their function. Several prediction techniques have been proposed in recent years, and methods based on machine learning have achieved state-of-the-art classification performance, usually employing support vector machines and neural networks. However, those methods need large numbers of labeled samples (proteins with known function) to train accurate classifiers, and such information is not easily available for this task. In this paper, an alternative methodology based on semi-supervised learning is proposed. This type of machine learning makes it possible to use unlabeled samples (which are easily available) to improve the estimation of the classifiers. All the steps needed to apply semi-supervised learning to the prediction of protein sub-cellular localization are described in detail, and the methodology is compared with the standard supervised alternative. The results show that semi-supervised learning significantly improves the prediction performance of the classifier in several cases, proving to be a valuable tool in bioinformatics.

Pages: 99 to 103

Copyright: Copyright (c) IARIA, 2014

Publication date: April 20, 2014

Published in: conference

ISSN: 2308-4383

ISBN: 978-1-61208-335-3

Location: Chamonix, France

Dates: from April 20, 2014 to April 24, 2014