ICONS 2014, The Ninth International Conference on Systems
A Novel Technique for Retrieving Source Code Duplication
Authors:
Yoshihisa Udagawa
Keywords: Java source code, Control statement, Method identifier, Similarity measure, Derived sequence retrieval model, Sorensen-Dice index
Abstract:
This paper proposes a new approach to detecting clones in source code, with the aim of improving the safety of software systems. Its main contributions are a mining algorithm that explores program structure and a similarity measure, tailored to sequentially structured text, for retrieving similar source code fragments. Retrieval experiments were conducted on Apache Tomcat 7, a large open source Java program. The results show that the proposed mining algorithm finds a set of frequent sequences within one minute and that the proposed similarity measure is a better indicator than the Sorensen-Dice index.
Pages: 172 to 177
Copyright: Copyright (c) IARIA, 2014
Publication date: February 23, 2014
Published in: conference
ISSN: 2308-4243
ISBN: 978-1-61208-319-3
Location: Nice, France
Dates: from February 23, 2014 to February 27, 2014
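
Illustrative note: the abstract compares the proposed similarity measure against the Sorensen-Dice index. The sketch below shows only that baseline, computed as 2|A ∩ B| / (|A| + |B|) over multisets of tokens. It is not the paper's own measure, which is not described on this page. The function name `dice_index` and the token sequences are hypothetical, chosen for illustration only.

```python
from collections import Counter


def dice_index(a, b):
    """Sorensen-Dice index over multisets of tokens: 2|A n B| / (|A| + |B|)."""
    ca, cb = Counter(a), Counter(b)
    total = sum(ca.values()) + sum(cb.values())
    if total == 0:
        # Convention assumed here: two empty sequences count as identical.
        return 1.0
    # Counter & Counter keeps the minimum count of each shared token.
    overlap = sum((ca & cb).values())
    return 2 * overlap / total


# Hypothetical sequences of control statements and method identifiers.
seq_a = ["if", "for", "read", "close"]
seq_b = ["for", "if", "read", "write"]

print(dice_index(seq_a, seq_b))        # 0.75
print(dice_index(seq_a, seq_a[::-1]))  # 1.0
```

The second call returns 1.0 because the index ignores token order: a sequence and its reverse are treated as identical. That property is worth keeping in mind when comparing it with a measure designed for sequentially structured text.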