ICCGI 2011, The Sixth International Multi-Conference on Computing in the Global Information Technology
Significance of Low Frequent Words in Patent Classification
Authors:
Akmal Saeed Khattak
Gerhard Heyer
Keywords: patent classification; text classification; taxonomy; International Patent Classification (IPC)
Abstract:
Low-frequency terms are often treated as noise, but in patent documents they may be technical terms. This paper shows the significance of low-frequency terms in patent classification. Our experiments show that low-frequency terms cannot be ignored in patents, as they give better performance in terms of F-measure and accuracy than high-frequency terms. Experiments are presented to show that a set of low-frequency terms outperforms a set of high-frequency terms in classifying patent documents.
Pages: 8 to 13
Copyright: Copyright (c) IARIA, 2011
Publication date: June 19, 2011
Published in: conference
ISSN: 2308-4529
ISBN: 978-1-61208-139-7
Location: Luxembourg City, Luxembourg
Dates: from June 19, 2011 to June 24, 2011