Komparasi Algoritma Decision Tree, Naive Bayes dan KNN dalam Klasifikasi Kanker Payudara

Authors

  • Muhammad Abdul Jabbar Universitas Dipa Makassar; Jl. Perintis Kemerdekaan KM. 9 Makassar, Indonesia
  • Erfan Hasmin Universitas Dipa Makassar; Jl. Perintis Kemerdekaan KM. 9 Makassar, Indonesia
  • Sunardi Universitas Dipa Makassar; Jl. Perintis Kemerdekaan KM. 9 Makassar, Indonesia
  • Cucut Susanto Universitas Dipa Makassar
  • Wilem Musu Universitas Dipa Makassar; Jln. Perintis Kemerdekaan 9, Makassar, 0411-587194

DOI:

https://doi.org/10.22303/csrid.14.3.2022.258-270

Keywords:

Breast Cancer, Decision Tree, Naive Bayes, KNN

Abstract

Breast cancer is a cancer that forms in the cells of the breast, where cancer cells grow out of control. It can occur in people of any sex. In 2020, the Global Cancer Observatory recorded 684,996 deaths and 2,261,419 new cases [1]. Given this mortality, both men and women should look after their health through measures such as early detection and avoiding risk factors for cancer. The data in this study come from the UCI Machine Learning Repository. The study aims to compare three data mining algorithms for classifying breast cancer: Decision Tree, Naive Bayes, and KNN, each evaluated with two validation methods, Hold-Out and K-Fold. The test results show that KNN achieved the highest accuracy under both methods, 98% with Hold-Out and 96% with K-Fold, while Naive Bayes reached 95% with Hold-Out and 95% with K-Fold, and Decision Tree reached 94% with Hold-Out and 93% with K-Fold.
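
The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes scikit-learn, uses the Wisconsin Diagnostic breast cancer data bundled with scikit-learn (the abstract names only the UCI repository, not the specific dataset), and picks an 80/20 hold-out split, 10 folds, k = 5 neighbours, and feature scaling for KNN as hypothetical settings. The accuracies it prints will not necessarily match the figures reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled Wisconsin Diagnostic dataset stands in for the UCI data.
X, y = load_breast_cancer(return_X_y=True)

# Assumption: default hyperparameters, k = 5, and scaling before KNN.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}


def evaluate(models, X, y, test_size=0.2, n_splits=10, seed=0):
    """Return {name: (hold_out_accuracy, mean_k_fold_accuracy)}."""
    # Hold-Out: one stratified train/test split.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    # K-Fold: stratified folds over the full dataset.
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for name, model in models.items():
        hold_out = model.fit(X_tr, y_tr).score(X_te, y_te)
        # cross_val_score clones the model, so each fold is fitted from scratch.
        k_fold = cross_val_score(model, X, y, cv=folds).mean()
        results[name] = (hold_out, k_fold)
    return results


results = evaluate(models, X, y)
for name, (hold_out, k_fold) in results.items():
    print(f"{name}: Hold-Out={hold_out:.3f}, K-Fold={k_fold:.3f}")
```

Using the same split and the same folds for all three classifiers keeps the comparison like for like; changing the seed or split ratio is a quick way to see how sensitive the ranking is.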

References

Cancer Council, “Breast cancer | Causes, Symptoms & Treatments | Cancer Council,” 2020. https://www.cancer.org.au/cancer-information/types-of-cancer/breast-cancer (accessed May 10, 2022).

Kementerian Kesehatan Republik Indonesia, “Kanker Payudara Paling Banyak di Indonesia, Kemenkes Targetkan Pemerataan Layanan Kesehatan,” 2022. https://www.kemkes.go.id/article/view/22020400002/kanker-payudaya-paling-banyak-di-indonesia-kemenkes-targetkan-pemerataan-layanan-kesehatan.html (accessed May 10, 2022).

Global Cancer Observatory, “Cancer Today,” 2020. https://gco.iarc.fr/ (accessed May 11, 2022).

A. Osareh and B. Shadgar, “Machine learning techniques to diagnose breast cancer,” in 2010 5th International Symposium on Health Informatics and Bioinformatics, 2010, pp. 114–120. doi: 10.1109/HIBIT.2010.5478895.

W. Ananda, M. Safii, and M. Fauzan, “Prediksi Jumlah Hasil Panen Sawit Menggunakan Algoritma Naive Bayes,” TIN Terap. Inform. Nusant., vol. 1, no. 10, pp. 513–519, 2021.

A. Andriani, “Sistem prediksi penyakit diabetes berbasis decision tree,” J. Bianglala Inform., vol. I, no. 1, pp. 1–10, 2013.

Y. I. Kurniawan and T. I. Barokah, “Klasifikasi Penentuan Pengajuan Kartu Kredit Menggunakan K-Nearest Neighbor,” J. Ilm. Matrik, vol. 22, no. 1, pp. 73–82, 2020, doi: 10.33557/jurnalmatrik.v22i1.843.

W. Musu, A. Ibrahim, and Heriadi, “Pengaruh Komposisi Data Training dan Testing terhadap Akurasi Algoritma C4.5,” Pros. Semin. Ilm. Sist. Inf. Dan Teknol. Inf., vol. X, no. 1, pp. 186–195, 2021.

F. Kurniawan and Ivandari, “Komparasi Algoritma Data Mining Untuk Klasifikasi Penyakit Kanker Payudara,” IC-Tech, vol. XII, no. 1, pp. 1–8, 2017, [Online]. Available: http://jurnal.stmik-wp.ac.id

A. Asuncion and D. Newman, “About,” 2017. https://archive.ics.uci.edu/ml/about.html (accessed May 14, 2022).

G. R. Shinde, S. Majumder, H. R. Bhapkar, and P. N. Mahalle, “Exploratory Data Analysis,” in Quality of Work-Life During Pandemic: Data Analysis and Mathematical Modeling, G. R. Shinde, S. Majumder, H. R. Bhapkar, and P. N. Mahalle, Eds. Singapore: Springer Singapore, 2022, pp. 97–105. doi: 10.1007/978-981-16-7523-2_7.

G. A. Marcoulides, Discovering Knowledge in Data: an Introduction to Data Mining, vol. 100, no. 472. 2005. doi: 10.1198/jasa.2005.s61.

M. A. Berry and G. Linoff, “Data mining techniques - for marketing, sales, and customer support,” 1997.

D. M. S. Kurniawan, Pengenalan Machine Learning Python. Jakarta: PT ELEX MEDIA KOMPUTINDO, 2020.

N. Jayanti, S. Puspitodjati, and T. Elida, “Teknik Klasifikasi Pohon Keputusan Untuk Memprediksi Kebangkrutan Bank Berdasarkan Rasio Keuangan Bank,” Proceeding, Semin. Ilm. Nas. Komput. dan Sist. Intelijen (KOMMIT 2008) ISSN 1411-6286, no. Kommit, pp. 101–107, 2008.

D. Sartika, D. I. Sensuse, U. Indo, G. Mandiri, and F. I. Komputer, “Perbandingan Algoritma Klasifikasi Naive Bayes, Nearest Neighbour, dan Decision Tree pada Studi Kasus Pengambilan Keputusan Pemilihan Pola Pakaian,” vol. 1, no. 2, pp. 151–161, 2017.

K. Chomboon, P. Chujai, P. Teerarassammee, K. Kerdprasop, and N. Kerdprasop, “An Empirical Study of Distance Metrics for k-Nearest Neighbor Algorithm,” pp. 280–285, 2015, doi: 10.12792/iciae2015.051.

T. Rismawan, A. W. Irawan, W. Prabowo, and S. Kusumadewi, “Sistem Pendukung Keputusan Berbasis Pocket Pc Sebagai Penentu Status Gizi Menggunakan Metode Knn (K-Nearest Neighbor),” Teknoin, vol. 13, no. 2, pp. 18–23, 2008, doi: 10.20885/teknoin.vol13.iss2.art5.

I. A. Nikmatun and I. Waspada, “Implementasi Data Mining untuk Klasifikasi Masa Studi Mahasiswa Menggunakan Algoritma K-Nearest Neighbor,” J. SIMETRIS, vol. 10, no. 2, pp. 421–432, 2019.

Published

2022-12-20

How to Cite

Jabbar, M. A., Hasmin, E., Sunardi, Susanto, C., & Musu, W. (2022). Komparasi Algoritma Decision Tree, Naive Bayes dan KNN dalam Klasifikasi Kanker Payudara. Computer Science Research and Its Development Journal, 14(3), 258–270. https://doi.org/10.22303/csrid.14.3.2022.258-270

Section

Articles