Please use this identifier to cite or link to this item:
http://dx.doi.org/10.25673/120248
Title: | Data extractions using a large language model (Elicit) and human reviewers in randomized controlled trials: a systematic comparison |
Author(s): | Bianchi, Joleen; Hirt, Julian; Vogt, Magdalena; Vetsch, Janine |
Issue Date: | 2025 |
Type: | Article |
Language: | English |
Abstract: | Aim: We aimed to compare data extractions from randomized controlled trials performed by Elicit and by human reviewers. Background: Elicit is an artificial intelligence tool that may automate specific steps in conducting systematic reviews. However, the tool's performance and accuracy have not been independently assessed. Methods: For the comparison, we sampled 20 randomized controlled trials from which data had been extracted manually by a human reviewer. We assessed seven variables—study objectives, sample characteristics, sample size, study design, interventions, outcomes measured, and intervention effects—and classified the results as “more,” “equal,” “partially equal,” or “deviating” extractions. The STROBE checklist was used to report the study. Results: We analysed 20 randomized controlled trials from 11 countries. The studies covered diverse healthcare topics. Across all seven variables, Elicit extracted “more” data in 29.3% of cases, “equal” in 20.7%, “partially equal” in 45.7%, and “deviating” in 4.3%. Elicit provided “more” information for the variables study design (100%) and sample characteristics (45%). In contrast, for more nuanced variables, such as “intervention effects,” Elicit's extractions were less detailed, with 95% rated as “partially equal.” Conclusions: Elicit was capable of extracting partly correct data for our predefined variables. Variables such as “intervention effects” or “interventions” may require a human reviewer to complete the data extraction. Our results suggest that verification by human reviewers is necessary to ensure that all relevant information is captured completely and correctly by Elicit. Implications: Systematic reviews are labor‐intensive. The data extraction process may be facilitated by artificial intelligence tools. Use of Elicit may require a human reviewer to double‐check the extracted data. |
URI: | https://opendata.uni-halle.de//handle/1981185920/122207 http://dx.doi.org/10.25673/120248 |
Open Access: |
License: |
Journal Title: | Cochrane evidence synthesis and methods |
Publisher: | Wiley |
Publisher Place: | [Hoboken, New Jersey] |
Volume: | 3 |
Issue: | 4 |
Original Publication: | 10.1002/cesm.70033 |
Page Start: | 1 |
Page End: | 6 |
Appears in Collections: | Open Access Publikationen der MLU |
Files in This Item:
File | Size | Format
---|---|---
Cochrane Evidence Synthesis and Methods - 2025 - Bianchi - Data Extractions Using a Large Language Model Elicit and Human.pdf | 271.3 kB | Adobe PDF