SIMILAR QUESTIONS IDENTIFICATION ON INDONESIAN LANGUAGE SUBJECTS USING MACHINE LEARNING
Question similarity analysis evaluates how similar the questions in a collection are to one another, for example in a question-and-answer forum or on other platforms. Its aim is to improve the performance of such forums, so that a new question submitted by a user can be identified as similar to a question already in the database. Research on question similarity has so far been carried out on non-Indonesian datasets. The purpose of this research is to identify similar questions in a collection of questions written in Indonesian. The methods used are Support Vector Machine (SVM) and IndoBERT. For feature extraction, we evaluate the lexical and syntactic features of each question. For lexical feature extraction, we use cosine similarity to measure how close two questions are when each is represented as a vector. For syntactic feature extraction, we use an Indonesian part-of-speech tagger (POS Tag). The dataset is a collection of questions from Indonesian language subjects at the primary and secondary school levels. The results show that the SVM performs best with the cosine similarity feature, reaching an accuracy of 85%. Using the POS Tag feature, or the combination of POS Tag and cosine similarity, causes the model to overfit, and accuracy drops to 77%. The IndoBERT model reaches an accuracy of 95%.
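As an illustration of the lexical feature described in the abstract, the following is a minimal sketch of cosine similarity between two questions. The record does not say how the questions are vectorized, so the term-frequency vectors and the simple word tokenizer here are assumptions, and the function names and the three example questions are hypothetical, not taken from the study's dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(question_a, question_b):
    """Cosine similarity between the term-frequency vectors of two questions."""
    tf_a = Counter(tokenize(question_a))
    tf_b = Counter(tokenize(question_b))
    # Only terms present in both questions contribute to the dot product.
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))
    # An empty question has no direction, so treat it as dissimilar to everything.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical example questions, for illustration only.
q1 = "Apa yang dimaksud dengan kalimat utama?"
q2 = "Apa yang dimaksud dengan gagasan utama?"
q3 = "Sebutkan ciri-ciri teks prosedur!"

print(cosine_similarity(q1, q2))  # five of six words shared: high score
print(cosine_similarity(q1, q3))  # no shared words: 0.0
```

In a setup like the one the abstract outlines, a score of this kind would be computed for each question pair and passed to the classifier as a feature. How the study actually builds its vectors and feeds them to the SVM is not stated in this record.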
Detail Information
Series Title | - |
---|---|
Call Number | - |
Publisher | Jurnal Nasional Pendidikan Teknik Informatika (JANAPATI): Indonesia, 2023 |
Collation | 005 |
Language | English |
ISBN/ISSN | 2089-8673 |
Classification | NONE |
Content Type | - |
Media Type | - |
Carrier Type | - |
Edition | - |
Subject(s) | |
Specific Detail Info | - |
Statement of Responsibility | - |
Other Information
Accreditation | - |
---|---|