ICCGI 2012, The Seventh International Multi-Conference on Computing in the Global Information Technology
Chinese Blog Classification Based on Text Classification and Multi-feature
Authors:
Jianzhuo Yan
Suhua Yang
Liying Fang
Keywords: text classification; Chinese blog classification; short text classification; feature expansion; multi-feature integration
Abstract:
Chinese blogs have become one of the most important sources of information in China. Their content varies widely, so classifying them is of great significance. A Chinese blog post has several features (title, body text, tags, and user-defined types), and these features differ in length. Traditional text classification methods alone do not perform well on Chinese blogs. In this paper, Chinese blogs are classified using several of these features, with a traditional text classification technique or a short text classification technique chosen according to the length of each feature. In addition, a feature expansion method is applied to the sparse features of short text, and the features are integrated by linear training. Experimental results show that the proposed method improves classification accuracy.
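The multi-feature integration step described in the abstract can be illustrated with a minimal sketch. It assumes each blog feature (title, body, tags) already has its own classifier producing per-class scores, and that integration is a weighted linear sum of those scores. The function names, class labels, scores, and weights below are all hypothetical; the abstract does not give the paper's actual classifiers or its trained weights, so fixed weights stand in for whatever linear training would learn.

```python
from typing import Dict


def integrate_scores(
    feature_scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Combine per-feature class scores with a weighted linear sum.

    feature_scores maps a feature name (e.g. "title") to a mapping of
    class label -> score. weights maps a feature name to its weight.
    Both are illustrative assumptions, not values from the paper.
    """
    combined: Dict[str, float] = {}
    for feature, scores in feature_scores.items():
        # Features without a weight contribute nothing.
        w = weights.get(feature, 0.0)
        for label, score in scores.items():
            combined[label] = combined.get(label, 0.0) + w * score
    return combined


def classify(
    feature_scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> str:
    """Return the class label with the highest combined score."""
    combined = integrate_scores(feature_scores, weights)
    return max(combined, key=lambda label: combined[label])


# Hypothetical scores from three per-feature classifiers for one post.
example_scores = {
    "title": {"sports": 0.6, "finance": 0.4},
    "body": {"sports": 0.3, "finance": 0.7},
    "tags": {"sports": 0.8, "finance": 0.2},
}
# Hypothetical weights; in the paper these would come from linear training.
example_weights = {"title": 0.3, "body": 0.5, "tags": 0.2}

predicted = classify(example_scores, example_weights)
```

In this example the title and tags favor one class while the longer body text favors the other, so the result depends on how much weight each feature receives, which is why the weights would be learned rather than fixed by hand.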
Pages: 271 to 276
Copyright: Copyright (c) IARIA, 2012
Publication date: June 24, 2012
Published in: conference
ISSN: 2308-4529
ISBN: 978-1-61208-202-8
Location: Venice, Italy
Dates: from June 24, 2012 to June 29, 2012