Exploring the Use of Large Language Models for Data Extraction for Systematic Reviews in Software Engineering

Laiq, Muhammad

ICSEA 2025, The Twentieth International Conference on Software Engineering Advances

Authors:
Muhammad Laiq

Keywords: LLMs; Data extraction; Systematic mapping study; Literature review; Systematic reviews.

Abstract:
To support evidence-based decision-making, software engineering employs systematic reviews to collect and consolidate relevant literature on a specific research topic. However, conducting systematic reviews is a labor-intensive and time-consuming task. Recent advancements in Large Language Models (LLMs), such as Generative Pre-trained Transformer (GPT) models, offer opportunities to streamline and reduce the manual effort required, particularly in data extraction for Systematic Mapping Studies (SMS). This study evaluates the performance of GPT-4o in extracting data from 46 primary studies of an SMS by comparing the results of automated extraction with the data extracted manually. Our evaluation revealed that GPT-4o achieves an average accuracy of approximately 79%. Although these results indicate that the entire process cannot be fully automated, GPT-4o can be a supportive tool in a semi-automated workflow. Therefore, we recommend using LLMs, such as GPT-4o, for an initial phase of automated extraction, followed by human validation and refinement.
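The comparison described in the abstract, automated extraction checked against manual extraction, can be illustrated with a minimal Python sketch. It is not the study's evaluation procedure: the field names, study identifiers, toy values, and the exact-match criterion after whitespace and case normalization are all assumptions made for illustration, and the numbers it produces are not results from the paper. The `flag_for_review` helper, also hypothetical, mirrors the recommended semi-automated workflow by listing the values a human would need to validate.

```python
from statistics import mean


def normalize(value):
    """Trim whitespace and ignore case before comparing (an assumed matching rule)."""
    return value.strip().casefold()


def field_accuracy(manual, automated):
    """Per field, the fraction of studies whose automated value matches the manual one."""
    fields = sorted({field for record in manual.values() for field in record})
    scores = {}
    for field in fields:
        studies = [s for s in manual if field in manual[s]]
        hits = sum(
            normalize(automated.get(s, {}).get(field, "")) == normalize(manual[s][field])
            for s in studies
        )
        scores[field] = hits / len(studies)
    return scores


def flag_for_review(manual, automated):
    """List (study, field) pairs where the automated value disagrees with the manual one."""
    return [
        (study, field)
        for study in sorted(manual)
        for field in sorted(manual[study])
        if normalize(automated.get(study, {}).get(field, "")) != normalize(manual[study][field])
    ]


# Hypothetical toy data: four studies, two extraction fields.
manual = {
    "S1": {"research_type": "Evaluation", "method": "Case study"},
    "S2": {"research_type": "Solution proposal", "method": "Experiment"},
    "S3": {"research_type": "Evaluation", "method": "Survey"},
    "S4": {"research_type": "Validation", "method": "Experiment"},
}
automated = {
    "S1": {"research_type": "evaluation ", "method": "Case study"},
    "S2": {"research_type": "Solution proposal", "method": "Case study"},
    "S3": {"research_type": "Evaluation", "method": "Survey"},
    "S4": {"research_type": "Validation", "method": "Experiment"},
}

scores = field_accuracy(manual, automated)
average_accuracy = mean(scores.values())
to_review = flag_for_review(manual, automated)
```

In practice, free-text fields would likely need a looser matching rule than exact string equality, and every flagged pair would go to a human reviewer for validation and refinement.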

Pages: 13 to 16

Copyright: Copyright (c) IARIA, 2025

Publication date: September 28, 2025

Published in: conference

ISSN: 2308-4235

ISBN: 978-1-68558-296-8

Location: Lisbon, Portugal

Dates: from September 28, 2025 to October 2, 2025