ICONS 2014, The Ninth International Conference on Systems


A Novel Technique for Retrieving Source Code Duplication

Authors:
Yoshihisa Udagawa

Keywords: Java source code, Control statement, Method identifier, Similarity measure, Derived sequence retrieval model, Sorensen-Dice index

Abstract:
In this paper, we propose a new approach to detecting clones in source code, with the aim of improving the safety of software systems. The main contributions are a mining algorithm that explores program structure and a similarity measure, tailored to sequentially structured texts, for retrieving similar source code fragments. Retrieval experiments were conducted using Apache Tomcat 7, a large open-source Java program. The results show that the proposed mining algorithm finds a set of frequent sequences within one minute, and that the proposed similarity measure is a better indicator than the Sorensen-Dice index.
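
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of the Sorensen-Dice index, 2|A ∩ B| / (|A| + |B|), computed over multisets of tokens. It is not the paper's proposed similarity measure, which the abstract does not define. The function name `dice_index` and the token sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter


def dice_index(a, b):
    """Sorensen-Dice index over multisets of tokens: 2|A & B| / (|A| + |B|)."""
    ca, cb = Counter(a), Counter(b)
    total = sum(ca.values()) + sum(cb.values())
    if total == 0:
        # Convention assumed here: two empty sequences count as identical.
        return 1.0
    # Counter & Counter keeps the minimum count of each shared token.
    overlap = sum((ca & cb).values())
    return 2 * overlap / total


# Hypothetical sequences of control statements and method identifiers.
seq_a = ["if", "read", "for", "write", "close"]
seq_b = ["for", "write", "if", "read", "close"]  # same tokens, different order
seq_c = ["if", "read", "while", "flush"]

print(dice_index(seq_a, seq_b))
print(dice_index(seq_a, seq_c))
```

In this sketch `seq_a` and `seq_b` receive the maximum score even though their statements appear in a different order, because the index compares token counts only. That order-insensitivity may be one reason a measure tailored to sequentially structured texts is of interest, although the abstract itself does not state this.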

Pages: 172 to 177

Copyright: Copyright (c) IARIA, 2014

Publication date: February 23, 2014

Published in: conference

ISSN: 2308-4243

ISBN: 978-1-61208-319-3

Location: Nice, France

Dates: from February 23, 2014 to February 27, 2014