Developing a Machine Learning Model Using Gene Expression for Breast Cancer Prediction

Babatunde Abdulrauph Olarewaju; Alausa Babatunde Mubarak; Oke Afeez Adeshina

doi:10.53974/unza.zajlis.9.2.208

Published Dec 30, 2025

Babatunde Abdulrauph Olarewaju

University of Ilorin

Alausa Babatunde Mubarak

University of Ilorin

Oke Afeez Adeshina

Federal College of Education, Nigeria

Abstract

Recent advances in genomics have produced large gene expression datasets that offer insight into cancer biology. This study investigates an ensemble machine learning model that combines K-Nearest Neighbors (KNN), a Support Vector Classifier (SVC), and XGBoost to predict and classify breast cancer subtypes from gene expression profiles. The methodology comprised data preprocessing, including one-hot encoding, followed by model training and evaluation using standard metrics. The ensemble model achieved an overall accuracy of 90.32% and a precision of 0.9240, indicating few false positives, which is a key consideration for clinical diagnostics. Its F1-score of 0.9015 indicates balanced performance across precision and recall. In a comparative analysis, individual baseline models (SVM, RF) reported higher raw accuracy of about 99%; the proposed ensemble is nonetheless presented as a robust and interpretable framework optimized for reliable multi-class discrimination.
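The abstract does not say how the three classifiers are combined or how the multi-class precision and F1-score are averaged. As a rough illustration only, the sketch below assumes hard majority voting and support-weighted averaging. The subtype labels, the per-model predictions, and the helper names (`one_hot`, `majority_vote`, `weighted_scores`) are all hypothetical and are not taken from the study.

```python
from collections import Counter


def one_hot(labels):
    """Encode categorical labels as 0/1 indicator vectors (classes in sorted order)."""
    classes = sorted(set(labels))
    column = {c: j for j, c in enumerate(classes)}
    vectors = [[1 if column[y] == j else 0 for j in range(len(classes))]
               for y in labels]
    return classes, vectors


def majority_vote(*model_predictions):
    """Hard voting: per sample, take the most common label.

    Ties are broken in favour of the earliest-listed model (an assumption).
    """
    voted = []
    for votes in zip(*model_predictions):
        counts = Counter(votes)
        best = max(counts.values())
        voted.append(next(v for v in votes if counts[v] == best))
    return voted


def weighted_scores(y_true, y_pred):
    """Accuracy plus support-weighted precision and F1 for multi-class labels."""
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    accuracy = sum(t == p for t, p in pairs) / n
    precision = f1 = 0.0
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec_c = tp / (tp + fp) if tp + fp else 0.0
        rec_c = tp / (tp + fn) if tp + fn else 0.0
        f1_c = 2 * prec_c * rec_c / (prec_c + rec_c) if prec_c + rec_c else 0.0
        weight = (tp + fn) / n  # share of samples whose true class is c
        precision += weight * prec_c
        f1 += weight * f1_c
    return {"accuracy": accuracy, "precision": precision, "f1": f1}


# Made-up labels and predictions standing in for the three base models.
y_true = ["s1", "s1", "s2", "s2", "s3", "s3"]
knn_pred = ["s1", "s2", "s2", "s2", "s3", "s1"]
svc_pred = ["s1", "s1", "s2", "s3", "s3", "s3"]
xgb_pred = ["s1", "s1", "s1", "s2", "s3", "s1"]

ensemble = majority_vote(knn_pred, svc_pred, xgb_pred)
scores = weighted_scores(y_true, ensemble)
print(ensemble)
print(scores)
```

Under these assumptions, a false positive for a class lowers that class's precision while leaving its recall untouched, which is why precision and F1-score can differ even when overall accuracy is high.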

How to Cite

OLAREWAJU, Babatunde Abdulrauph; MUBARAK, Alausa Babatunde; ADESHINA, Oke Afeez. Developing a Machine Learning Model Using Gene Expression for Breast Cancer Prediction. Zambia Journal of Library & Information Science (ZAJLIS), ISSN: 2708-2695, [S.l.], v. 9, n. 2, p. 10-21, dec. 2025. ISSN 2708-2695. Available at: <https://zajlis.unza.zm/index.php/journal/article/view/208>. Date accessed: 23 july 2026. doi: https://doi.org/10.53974/unza.zajlis.9.2.208.

Issue

Vol 9 No 2 (2025): Zambia Journal of Library & Information Science

Section

Information and Communication Technologies (ICTs)
