VALID 2024, The Sixteenth International Conference on Advances in System Testing and Validation Lifecycle


Precise Code Fragment Clone Detection

Authors:
Mariam Arutunian
Matevos Mehrabyan
Sevak Sargsyan
Hayk Aslanyan

Keywords: code clones; program static analysis; binary code; source code

Abstract:
Detecting duplicate code fragments, referred to as "clones", is essential for various aspects of software management, maintenance, and security. This article presents a novel method for detecting code fragment clones that is applicable to both source and binary code. The method addresses the limitations of existing tools, which often focus on detecting clones of entire functions and are typically specialized for either source or binary code, but not both. The developed algorithm analyzes input code fragments against the target project and outputs all detected fragment clones. For fragment clone detection, it uses program dependence graphs, a data structure that unifies the data and control flow of a function. In the first step, source and binary code are converted to a program dependence graph representation. Then a unified algorithm is applied to detect maximal similar subgraphs. Code fragments corresponding to the detected similar subgraphs are considered clones. The experimental evaluation of the proposed method demonstrates its effectiveness, with an average precision of 96.9% and recall of 92.9% for binary code, and an average precision of 96.5% and recall of 93.8% for source code.
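
The general idea of matching a code fragment against a target through program dependence graphs can be illustrated with a minimal sketch. This is not the authors' algorithm: the `PDG` class, the greedy `grow_match` expansion, the choice of the smallest node id as seed, and the `min_ratio` threshold are all hypothetical simplifications, and the sketch assumes every fragment node is reachable from the seed along data or control edges.

```python
from collections import deque


class PDG:
    """Toy program dependence graph: labeled nodes, typed directed edges."""

    def __init__(self):
        self.labels = {}  # node id -> operation label, e.g. "add"
        self.succ = {}    # node id -> set of (edge kind, target node id)

    def add_node(self, node, label):
        self.labels[node] = label
        self.succ.setdefault(node, set())

    def add_edge(self, src, dst, kind):
        # kind is "data" or "control" (hypothetical edge vocabulary)
        self.succ[src].add((kind, dst))


def grow_match(frag, target, seed_f, seed_t):
    """Greedily extend a node mapping from one seed pair along matching edges."""
    mapping = {seed_f: seed_t}
    used = {seed_t}
    queue = deque([seed_f])
    while queue:
        f = queue.popleft()
        t = mapping[f]
        for kind, f_next in sorted(frag.succ[f]):
            if f_next in mapping:
                continue
            # Take the first unused target successor with the same edge kind
            # and the same node label; a real matcher would score candidates.
            for t_kind, t_next in sorted(target.succ[t]):
                if (t_kind == kind and t_next not in used
                        and target.labels[t_next] == frag.labels[f_next]):
                    mapping[f_next] = t_next
                    used.add(t_next)
                    queue.append(f_next)
                    break
    return mapping


def find_clones(frag, target, min_ratio=0.8):
    """Return mappings that cover at least min_ratio of the fragment's nodes."""
    results = []
    root = min(frag.labels)  # assumption: smallest node id is the seed
    for t, label in sorted(target.labels.items()):
        if label == frag.labels[root]:
            mapping = grow_match(frag, target, root, t)
            if len(mapping) >= min_ratio * len(frag.labels):
                results.append(mapping)
    return results
```

Because the matcher only sees node labels and edge kinds, the same code could in principle run on graphs built from either source or binary code, which mirrors the unified representation described in the abstract. The greedy expansion is a stand-in; it does not guarantee maximal similar subgraphs.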

Pages: 7 to 14

Copyright: Copyright (c) IARIA, 2024

Publication date: September 29, 2024

Published in: conference

ISSN: 2308-4316

ISBN: 978-1-68558-199-2

Location: Venice, Italy

Dates: from September 29, 2024 to October 3, 2024