Feature Expansion Using Word2vec for Hate Speech Detection on Indonesian Twitter with Classification Using SVM and Random Forest

Erwin Budi Setiawan - Personal Name
Mila Putri Kartika Dewi - Personal Name

Hate speech is common on Twitter. Because tweets are limited to 280 characters, users rely on many word variations, which can cause vocabulary mismatches. This study addresses that problem by building a hate speech detection system for Indonesian-language Twitter. It uses a dataset of 20,571 tweets and applies Feature Expansion with Word2vec to reduce vocabulary mismatch. Bag of Words (BOW) and Term Frequency-Inverse Document Frequency (TF-IDF) are used to represent feature values in tweets. Two classifiers are compared: Support Vector Machine (SVM) and Random Forest (RF). Feature Expansion with TF-IDF weighting and Random Forest classification gives the best accuracy, 88.37%. Across several tests, Feature Expansion with TF-IDF weighting increased accuracy in detecting hate speech and helped overcome vocabulary mismatches.
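The abstract does not spell out how the expansion is applied, so the following is only a minimal sketch of one plausible reading: when a vocabulary feature is absent from a tweet, a similar word found in the tweet is counted in its place before TF-IDF weighting. The `SIMILAR` table is a hypothetical, hand-written stand-in for Word2vec nearest neighbours, the tweets are toy data, and the function names and the smoothed IDF formula are assumptions rather than the authors' implementation.

```python
import math
from collections import Counter

# Hypothetical similarity table standing in for Word2vec nearest neighbours.
# A real system would derive these lists from a trained embedding model.
SIMILAR = {
    "benci": ["muak", "sebal"],
    "bodoh": ["tolol", "goblok"],
}

def idf_table(docs, vocab):
    """Smoothed inverse document frequency for each vocabulary term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}

def tfidf_row(doc, vocab, idf, similar=None):
    """TF-IDF vector for one tokenised tweet, optionally with feature expansion."""
    tf = Counter(doc)
    length = max(len(doc), 1)  # guard against empty tweets
    row = []
    for term in vocab:
        count = tf[term]
        if count == 0 and similar is not None:
            # Assumed expansion rule: borrow the count of similar words
            # that do occur in the tweet.
            count = sum(tf[s] for s in similar.get(term, []))
        row.append(count / length * idf[term])
    return row

# Toy tokenised tweets (illustrative only, not from the study's dataset).
docs = [
    ["saya", "benci", "kamu"],
    ["dia", "muak", "sekali"],
    ["kamu", "bodoh"],
]
vocab = ["benci", "bodoh", "kamu"]
idf = idf_table(docs, vocab)

plain = [tfidf_row(d, vocab, idf) for d in docs]
expanded = [tfidf_row(d, vocab, idf, SIMILAR) for d in docs]

for before, after in zip(plain, expanded):
    print([round(v, 3) for v in before], "->", [round(v, 3) for v in after])
```

In this toy run the second tweet shares no word with the vocabulary, so its plain vector is all zeros; after expansion the "benci" feature becomes non-zero because "muak" is listed as similar. The expanded vectors would then be passed to a classifier such as SVM or Random Forest, which is not shown here.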

Detail Information

Series Title	-
Call Number	-
Publisher	JURNAL MEDIA INFORMATIKA BUDIDARMA : Indonesia, 2022
Collation	006
Language	English
ISBN/ISSN	2614-5278
Classification	NONE
Content Type	-

Media Type	-
Carrier Type	-
Edition	-
Subject(s)	Random Forest; SINTA 3; Word2Vec; Hate Speech; Feature Expansion; Support Vector Machine (SVM); Indonesian Twitter
Specific Detail Info	-
Statement of Responsibility	-

Other Information

Accreditation	-
