Hidayat, NasrulAkhmad (2014) Pengklasteran Novel Menggunakan Algoritma Shrinking Based Shared Nearest Neighbour (SSNN) [Novel Clustering Using the Shrinking Based Shared Nearest Neighbour (SSNN) Algorithm]. Sarjana thesis, Universitas Brawijaya.
Abstract
Novels are among the most popular kinds of books. As the number of novels and the variety of their genres grow, a good grouping of novels is needed to make it easier for users to search for and obtain information. This study uses the Shrinking Based Shared Nearest Neighbour (SSNN) algorithm together with two Unsupervised Feature Selection methods, Term Contribution (TC) and Document Frequency (DF), to cluster novels. The SSNN algorithm uses the data-movement concept of the Data Shrinking algorithm to strengthen the density of the neighbourhood graph. Feature selection, in turn, is used to reduce the dimensionality of the data in text clustering and thereby improve clustering performance. The test results show an f-measure of 0.88 for the SSNN algorithm. Clustering errors occur when the values entered for the parameters k (nearest neighbours), CP (Current Point), and MP (Move Point) are too small, so that some neighbouring points are left out of the cluster-formation process. In the test of the effect of the Unsupervised Feature Selection methods, DF performs better than TC because DF retains terms that have a high term frequency and occur in many documents, which yields fairly good similarity values between documents compared with TC during cluster formation. In terms of computation time, DF is faster than TC by 0.5 to 5 seconds, because the DF calculation is simpler than the TC calculation.
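The neighbourhood graph that the abstract refers to can be illustrated with a minimal sketch of plain shared-nearest-neighbour similarity. This is not the thesis's SSNN implementation: the data-shrinking step and the CP and MP parameters are not modelled, and the function names, the cosine measure, and the sparse-dictionary document vectors are assumptions made for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dict: term -> weight)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def k_nearest(vectors, k):
    """Return, for each document index, the set of its k most similar documents."""
    neighbours = []
    for i, vi in enumerate(vectors):
        # Rank the other documents by descending similarity; ties break on index.
        ranked = sorted(
            (j for j in range(len(vectors)) if j != i),
            key=lambda j: (-cosine(vi, vectors[j]), j),
        )
        neighbours.append(set(ranked[:k]))
    return neighbours


def snn_similarity(neighbours, i, j):
    """Count shared neighbours, but only if i and j list each other as neighbours."""
    if i not in neighbours[j] or j not in neighbours[i]:
        return 0
    return len(neighbours[i] & neighbours[j])
```

With a small k, a document can fail to appear in any mutual neighbour list and so receives an SNN similarity of 0 to every other document, which is consistent with the abstract's remark that parameter values that are too small leave some points out of cluster formation.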
English Abstract
Novels are among the most popular kinds of books. As the number of novels and the variety of their genres increase, a good grouping is needed to help users search for and obtain the information they want. This study uses the Shrinking Based Shared Nearest Neighbour (SSNN) algorithm together with two Unsupervised Feature Selection methods, Term Contribution (TC) and Document Frequency (DF), to cluster novels. The SSNN algorithm applies the data-movement concept of the Data Shrinking algorithm to strengthen the density of the neighbourhood graph. Feature selection is applied to handle the high-dimensional feature problem in text clustering: by reducing the dimensionality of the data, it can improve clustering performance. The test results show an f-measure of 0.88 for the SSNN algorithm. Errors occur mainly when the values entered for the parameters k (nearest neighbours), CP (Current Point), and MP (Move Point) are too small, so that some neighbouring points are not included in the cluster-formation process. The Document Frequency (DF) method performs better than the Term Contribution (TC) method because DF retains terms that occur in many documents, which gives better similarity values between documents than TC during cluster formation. In terms of computation time, DF is faster than TC by 0.5 to 5 seconds, because the DF calculation is simpler than the TC calculation.
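The difference in cost between the two feature selection methods can be seen in a minimal sketch. It assumes the commonly used definitions: DF counts the documents that contain a term, and TC sums, over all pairs of distinct documents, the product of the term's normalised tf-idf weights. The thesis may use a different weighting, and all function names here are hypothetical.

```python
import math
from collections import Counter, defaultdict


def document_frequency(docs):
    """DF(t): number of documents (token lists) that contain term t."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def tfidf_vectors(docs, df):
    """Length-normalised tf-idf weights per document (assumed weighting)."""
    n = len(docs)
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        weights = {t: c * math.log(n / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: (w / norm if norm else 0.0) for t, w in weights.items()})
    return vectors


def term_contribution(vectors):
    """TC(t) = sum over i != j of w(t, d_i) * w(t, d_j).

    Computed as (sum of weights)^2 minus (sum of squared weights).
    """
    total = defaultdict(float)
    squares = defaultdict(float)
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w
            squares[t] += w * w
    return {t: total[t] * total[t] - squares[t] for t in total}


def top_terms(scores, k):
    """Keep the k highest-scoring terms; ties break alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

DF needs only one counting pass over the documents, whereas TC first requires weighting and normalising every document vector, which fits the abstract's observation that DF is the simpler and faster of the two.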
Item Type: | Thesis (Sarjana) |
---|---|
Identification Number: | SKR/FTIK/2014/206/051406221 |
Subjects: | 000 Computer science, information and general works > 005 Computer programming, programs, data |
Divisions: | Fakultas Ilmu Komputer > Teknik Informatika |
Depositing User: | Budi Wahyono Wahyono |
Date Deposited: | 30 Oct 2014 14:45 |
Last Modified: | 20 Oct 2021 04:46 |
URI: | http://repository.ub.ac.id/id/eprint/145994 |