Analyze Important Features of PIMA Indian Database For Diabetes Prediction Using KNN



Abstract: Diabetes is a chronic, non-communicable disease, a
long-term health condition that affects how the body uses glucose,
the sugar that supplies it with energy. In Indonesia, diabetes
ranks as the sixth highest cause of death, following conditions
related to childbirth. In 2021, Indonesia had a total of 19.5 million
diabetes patients, the fifth-highest number in the world. Several
machine learning studies have used the PIDD (PIMA Indian Diabetes
Dataset) to predict diabetes. In this research, data complexity is
considered alongside prediction accuracy. The study analyzes the
important features in the PIMA Indian database using the KNN
(k-nearest neighbor) method for classification. The results show
that KNN with k = 22 achieves the highest accuracy, 83.12%. The
analysis also found that the features KNN requires to achieve high
accuracy on the PIMA Indian database are, in order of importance,
glucose, age, insulin, blood pressure, Body Mass Index, pregnancy,
skin thickness, and diabetes pedigree function. However, when used
in the KNN classification method, the diabetes pedigree function
feature was found to be unnecessary and irrelevant, and it can be
removed.
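
The abstract does not state how feature importance was measured. As a minimal sketch, assuming a leave-one-feature-out comparison (an assumption, not necessarily the authors' procedure), the idea of ranking features by how much KNN accuracy falls when each one is removed could look like the following. All function names are hypothetical, and the data is a small synthetic stand-in rather than the PIMA dataset; only k = 22 is taken from the abstract.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Return the majority label among the k nearest training rows (Euclidean distance)."""
    neighbours = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


def drop_feature(X, idx):
    """Copy of X with the column at position idx removed."""
    return [row[:idx] + row[idx + 1:] for row in X]


def feature_ablation(train_X, train_y, test_X, test_y, k, names):
    """Baseline accuracy plus, per feature, the accuracy lost when it is dropped."""
    base = accuracy(train_X, train_y, test_X, test_y, k)
    loss = {}
    for i, name in enumerate(names):
        reduced = accuracy(
            drop_feature(train_X, i), train_y, drop_feature(test_X, i), test_y, k
        )
        loss[name] = base - reduced
    return base, loss


def make_synthetic(n, seed=0):
    """Hypothetical stand-in data: column 0 carries the class signal, column 1 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label, 0.5), rng.gauss(0.0, 0.5)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic(300)
    # k = 22 follows the abstract; the 200/100 split is an arbitrary choice.
    base, loss = feature_ablation(
        X[:200], y[:200], X[200:], y[200:], 22, ["signal", "noise"]
    )
    print("baseline accuracy:", base)
    for name, delta in sorted(loss.items(), key=lambda kv: -kv[1]):
        print(name, "accuracy lost when dropped:", delta)
```

A feature whose removal leaves accuracy unchanged, or improves it, is a candidate for removal, which is the kind of conclusion the abstract reports for the diabetes pedigree function. Because KNN is distance-based, real features measured on different scales would normally be standardized before a comparison like this.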


Detail Information

Publisher
JURNAL SISFOKOM (SISTEM INFORMASI DAN KOMPUTER) : Indonesia
Collation
12
Language
Indonesia
ISBN/ISSN
2598-7305