Applying large language models to extract information from crop trait prioritization studies
| Published in: | Plants, People, Planet vol. 8, no. 1 (Jan 1, 2026), p. 176-185 |
|---|---|
| Main author: | |
| Other authors: | , , |
| Published: | John Wiley & Sons, Inc. |
| Subjects: | |
| Online access: | Citation/Abstract, Full Text, Full Text - PDF |
| Abstract: | Societal Impact Statement: Investigation of farmers', consumers', and other stakeholders' trait preferences is vital for the adoption and impact of improved crop varieties. While qualitative research methods are known to increase the depth and scope of information from respondents, only 5% of previous trait preference studies used qualitative data in their analyses. We show that AI‐based natural language processing, particularly GPTs, is both a time and cost‐effective mechanism for accurately analyzing open‐ended trait preference data. This will contribute to the selection and prioritization of breeding targets to better meet end‐user needs, with implications for food security and health outcomes globally. Summary: Crop trait preference research is critical for the development of improved crop varieties, guiding breeding programs in setting trait priorities and targets that represent farmers' and consumers' needs. However, there is a dearth of methodological harmonization in trait preference studies, leading to high heterogeneity in collected data and analysis frameworks, which constrains comparability between studies. Qualitative research tools using open‐ended questions are among the most common methods used to elucidate crop trait preferences, but only a fraction of these data are used in analysis. The ascendance of AI tools in data analysis provides an opportunity to enhance capitalization of these data from open‐ended question types. We use natural language processing (NLP) techniques, including generative pretrained transformer (GPT) models, to elucidate labels from open‐ended question responses and perform multilabel text classification. We compare these labels to pre‐codes from close‐ended questions, as well as to existing crop trait ontology terms. We find that analyzing responses to open‐ended questions using NLP leads to information gain, including an increase in diversity of traits and insight into their social functions. We conclude that using NLP‐based approaches would allow breeding teams to extract trait terms from open‐ended question responses efficiently and to compare these to both existing ontology terms and close‐ended survey data. Our findings reveal the importance of using open‐ended questions to inform survey codes in mixed methods research design for trait preference studies. |
|---|---|
| ISSN: | 2572-2611 |
| DOI: | 10.1002/ppp3.70075 |
| Source: | Publicly Available Content Database |