The Pl@ntNet system leverages global crowdsourcing to collect and annotate plant observations, generating vast datasets with inherent labeling noise due to varying user expertise. Effective label aggregation is essential for training accurate models, but the scale of the data poses significant challenges to traditional methods.
In this talk, we present our cooperative label aggregation strategy, which estimates user expertise as a dynamic trust score based on their ability to correctly identify plant species. We evaluate the method on a newly released subset of the Pl@ntNet database focused on European flora, comprising over 6 million observations contributed by 800,000 users.
We then explore ongoing work on sequential label collection to further enhance data quality. By integrating a personalized recommendation system based on contextual multi-armed bandits, we aim to match each user with plant observations aligned with their current expertise. This approach promises a tailored labeling experience that improves both user engagement and annotation accuracy.
This talk is based on our recent paper (https://hal.science/hal-04603038) and current work.
Enhancing Crowdsourced Plant Identification: From Label Aggregation to Personalized Recommendations
Seminar
Speaker's organization (or team for internal seminars)
INRIA Lille
Speaker name
Tanguy Lefort
Abstract
Location
Room C1.2.04
Date
Workshop end date