International Journal On Advances in Software, volume 17, numbers 3 and 4, 2024


Graph Based Text Classification Using a Word-Reduced Heterogeneous Graph

Authors:
Hiromu Nakajima
Minoru Sasaki

Keywords: text classification; graph convolutional neural network; Word-Reduced Heterogeneous Graph; semi-supervised learning

Abstract:
Text classification determines the label of a document from cues such as word co-occurrence and word frequency, and it has been studied with a variety of approaches. Conventional graph-based text classification methods represent the relationships between words, between words and documents, and between documents as edge weights between nodes, and then train a graph neural network on the resulting graph. However, such methods require a very large amount of memory, which can cause operational problems or make large datasets impossible to process in some environments. In this study, we propose a more compact graph structure that removes words appearing in only one document, which we consider unnecessary for text classification. This approach conserves memory, and the saved memory makes it possible to use larger trained models. The results show that the method reduces memory usage while maintaining the accuracy of conventional approaches. Using the saved memory, the proposed method was able to employ larger trained models, and its classification accuracy improved markedly compared to the conventional method.
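
The word-reduction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `reduce_vocabulary` and `graph_node_counts`, the whitespace tokenization, and the toy corpus are all assumptions made for illustration. The sketch computes each word's document frequency, keeps only words that occur in at least two documents, and compares the number of graph nodes (document nodes plus word nodes) before and after the reduction.

```python
from collections import Counter


def reduce_vocabulary(docs):
    """Return the set of words that occur in at least two documents.

    Hypothetical helper: tokenization is a plain whitespace split.
    """
    # Document frequency: each word is counted once per document.
    doc_freq = Counter(word for doc in docs for word in set(doc.split()))
    return {word for word, count in doc_freq.items() if count >= 2}


def graph_node_counts(docs):
    """Return (full, reduced) node counts for a document-word graph.

    Each graph has one node per document plus one node per word.
    """
    full_vocab = {word for doc in docs for word in doc.split()}
    kept_vocab = reduce_vocabulary(docs)
    return len(docs) + len(full_vocab), len(docs) + len(kept_vocab)


# Toy corpus (made up for this example).
docs = [
    "graph neural network for text",
    "text classification with graph structure",
    "memory usage of large models",
]

kept_words = reduce_vocabulary(docs)
full_nodes, reduced_nodes = graph_node_counts(docs)
print(sorted(kept_words))         # words shared by two or more documents
print(full_nodes, reduced_nodes)  # node counts before and after reduction
```

Because the adjacency matrix of such a graph has one row and one column per node, removing word nodes shrinks both dimensions at once, which is one plausible reason the reduction saves memory.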

Pages: 143 to 152

Copyright: Copyright (c) to authors, 2024. Used with permission.

Publication date: December 30, 2024

Published in: journal

ISSN: 1942-2628