ICCGI 2011, The Sixth International Multi-Conference on Computing in the Global Information Technology


Significance of Low Frequent Words in Patent Classification

Authors:
Akmal Saeed Khattak
Gerhard Heyer

Keywords: patent classification; text classification; taxonomy; International Patent Classification (IPC)

Abstract:
Low-frequency terms are often considered noise, but in patent documents they may be technical terms. This paper shows the significance of low-frequency terms in patent classification. Our experiments show that low-frequency terms cannot be ignored in patents, as they give better performance in terms of F-measure and accuracy than high-frequency terms. In the experiments presented, the set of low-frequency terms outperforms the set of high-frequency terms in classifying patent documents.
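
The abstract does not say how the low- and high-frequency term sets are built, so the following is only a minimal sketch of one plausible way to do it, not the authors' procedure. It assumes that frequency means document frequency and that a single cut-off separates the two sets. The function names, the `threshold` parameter, and the toy documents are all hypothetical.

```python
from collections import Counter


def split_by_document_frequency(docs, threshold):
    """Split a vocabulary into low- and high-frequency term sets.

    docs: list of token lists (one list per document).
    threshold: hypothetical cut-off; terms occurring in at most this
    many documents are treated as low-frequency.
    """
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    low = {term for term, n in df.items() if n <= threshold}
    high = {term for term, n in df.items() if n > threshold}
    return low, high


def restrict(tokens, vocabulary):
    """Keep only the tokens that belong to the given vocabulary."""
    return [t for t in tokens if t in vocabulary]


# Toy documents invented for illustration; not data from the paper.
docs = [
    "a device comprising a piezoelectric actuator".split(),
    "a method comprising a step of annealing".split(),
    "a device and a method comprising a sensor".split(),
]

low_terms, high_terms = split_by_document_frequency(docs, threshold=1)

# Each document can then be represented with either term set before
# training and evaluating a classifier on it.
low_view = [restrict(d, low_terms) for d in docs]
high_view = [restrict(d, high_terms) for d in docs]
```

In this toy example the low-frequency set holds the domain-specific words ("piezoelectric", "actuator", "annealing", "sensor") alongside a few rare function words, while the high-frequency set holds generic claim vocabulary ("a", "device", "method", "comprising"). A comparison like the one the abstract describes would train the same classifier on each view and compare F-measure and accuracy.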

Pages: 8 to 13

Copyright: Copyright (c) IARIA, 2011

Publication date: June 19, 2011

Published in: conference

ISSN: 2308-4529

ISBN: 978-1-61208-139-7

Location: Luxembourg City, Luxembourg

Dates: from June 19, 2011 to June 24, 2011