Diagnosis of breast cancer with Stacked autoencoder and Subspace kNN


Adem K.

Physica A: Statistical Mechanics and its Applications, vol.551, 2020 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 551
  • Publication Date: 2020
  • DOI: 10.1016/j.physa.2020.124591
  • Journal Name: Physica A: Statistical Mechanics and its Applications
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Arctic & Antarctic Regions, Compendex, INSPEC, Public Affairs Index, zbMATH, Civil Engineering Abstracts
  • Keywords: Breast cancer, Stacked autoencoder, Subspace kNN
  • Sivas Cumhuriyet University Affiliated: No

Abstract

Breast cancer is one of the most common and deadliest cancers in women worldwide. Research on this disease has become very important because it facilitates early diagnosis, clinical applications and faster response to treatment in diseases such as cancer. In this study, an approach is proposed in which a Subspace kNN algorithm is used together with a stacked autoencoder (SAE) to diagnose the disease on a breast cancer microarray data set for the first time. Such hybrid approaches can provide better results when classifying high-dimensional, uncertain data sets. The data set used in the study was taken from the Kent Ridge-2 database. It consists of 97 samples (51 benign, 46 malignant) and 24482 attributes. The performance of the proposed method was evaluated and the results were compared with other well-known dimension reduction and machine learning methods. In the comparison, the proposed SAE and Subspace kNN approach reduced the data set to 100 attributes and achieved 91.24% accuracy. This is a notable classification accuracy, especially for high-dimensional data sets. The importance of this study is that models built with various classifiers to raise the success rate of the stacked autoencoder-softmax classifier model were applied to the breast cancer microarray data set for the first time. In this regard, it is considered that, in future work, the proposed method can offer a solution for automation-based diagnostic decision support systems.
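The abstract does not give implementation details, so the following is only a minimal sketch of the general random-subspace kNN idea, not the author's method. It assumes the SAE step has already produced 100 encoded attributes per sample and replaces them with synthetic Gaussian data (97 samples, split 51/46 as in the abstract). All function names, the subspace size, the number of learners and k are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k):
    """Majority vote among the k nearest training rows, using only `features`."""
    dists = []
    for row, label in zip(train_X, train_y):
        # Squared Euclidean distance restricted to the chosen feature subset.
        d = sum((row[j] - x[j]) ** 2 for j in features)
        dists.append((d, label))
    dists.sort(key=lambda t: t[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def make_subspaces(n_features, n_learners, subspace_size, seed=0):
    """Draw one random feature subset per ensemble member."""
    rng = random.Random(seed)
    return [rng.sample(range(n_features), subspace_size) for _ in range(n_learners)]


def subspace_knn_predict(subspaces, train_X, train_y, x, k=3):
    """Each member runs kNN in its own subspace; the ensemble takes a majority vote."""
    votes = Counter(knn_predict(train_X, train_y, x, feats, k) for feats in subspaces)
    return votes.most_common(1)[0][0]


# Synthetic stand-in for SAE-encoded data (assumption: 100 attributes per sample).
rng = random.Random(42)
n_features = 100
train_X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(51)]
train_X += [[rng.gauss(3.0, 1.0) for _ in range(n_features)] for _ in range(46)]
train_y = [0] * 51 + [1] * 46  # 0 = benign, 1 = malignant (labels are illustrative)

# Odd learner count and odd k avoid tied votes in this two-class setting.
subspaces = make_subspaces(n_features, n_learners=15, subspace_size=10)

pred_benign = subspace_knn_predict(subspaces, train_X, train_y, [0.0] * n_features)
pred_malignant = subspace_knn_predict(subspaces, train_X, train_y, [3.0] * n_features)
print(pred_benign, pred_malignant)
```

Because each ensemble member sees only a random subset of the attributes, the distance computation stays cheap and the members make partly independent errors, which is the usual motivation for subspace ensembles on high-dimensional data.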