ICCGI 2012, The Seventh International Multi-Conference on Computing in the Global Information Technology


Chinese Blog Classification Based on Text Classification and Multi-feature

Authors:
Jianzhuo Yan
Suhua Yang
Liying Fang

Keywords: text classification; Chinese blog classification; short text classification; feature expansion; multi-feature integration

Abstract:
Chinese blogs have become one of the most important sources of information in China. Their content varies widely, so classifying them is of great significance. A Chinese blog post has several features: the title, the body text, tags, and user-defined types, and these features differ in length. Traditional text classification methods do not perform well when applied to Chinese blog classification. In this paper, Chinese blogs are classified using several of these features, with a traditional text classification technique or a short text classification technique chosen according to the length of each feature. In addition, a feature expansion method is adopted for the sparse features of short text, and the features are integrated by linear training. Experimental results show that the proposed method improves classification accuracy.
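
The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the related-word table, the keyword lists standing in for trained classifiers, the category names, and the fixed weights are all hypothetical assumptions made for illustration. In particular, the abstract says the features are integrated by linear training, whereas the weights here are set by hand.

```python
from collections import Counter

# Hypothetical related-word table used to expand sparse short-text features.
RELATED = {"nba": ["basketball", "sports"], "stocks": ["finance", "market"]}

# Hypothetical keyword lists standing in for trained per-category classifiers.
KEYWORDS = {
    "sports": {"basketball", "sports", "match", "team"},
    "finance": {"finance", "market", "bank", "fund"},
}


def expand(tokens):
    """Append related words to a short token list (feature expansion)."""
    out = list(tokens)
    for t in tokens:
        out.extend(RELATED.get(t, []))
    return out


def score(tokens):
    """Return normalized per-category scores for one blog feature."""
    counts = Counter()
    for cat, words in KEYWORDS.items():
        counts[cat] = sum(1 for t in tokens if t in words)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in KEYWORDS}


def classify(blog, weights):
    """Linearly combine per-feature scores and return the best category."""
    combined = {c: 0.0 for c in KEYWORDS}
    for feature, w in weights.items():
        tokens = blog.get(feature, [])
        if feature in ("title", "tags"):  # short features are expanded first
            tokens = expand(tokens)
        for c, s in score(tokens).items():
            combined[c] += w * s
    return max(combined, key=combined.get)


# Assumed weights; a trained system would learn these from labeled blogs.
WEIGHTS = {"title": 0.3, "body": 0.4, "tags": 0.3}
blog = {
    "title": ["nba", "finals"],
    "body": ["team", "match", "bank"],
    "tags": ["nba"],
}
print(classify(blog, WEIGHTS))
```

Short features (title, tags) go through expansion before scoring, while the longer body text is scored directly, which mirrors the choice between short text and traditional text techniques described above.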

Pages: 271 to 276

Copyright: Copyright (c) IARIA, 2012

Publication date: June 24, 2012

Published in: conference

ISSN: 2308-4529

ISBN: 978-1-61208-202-8

Location: Venice, Italy

Dates: from June 24, 2012 to June 29, 2012