Sessions

Evaluation

Measuring Actual Privacy of Obfuscated Queries in
Information Retrieval
Francesco Luigi De Faveri, Guglielmo Faggioli and Nicola
Ferro
Retrieve, Annotate, Evaluate, Repeat: Leveraging Multimodal
LLMs for Large-Scale Product Retrieval Evaluation
Kasra Hosseini, Thomas Kober, Josip Krapac, Roland
Vollgraf, Weiwei Cheng and Ana Peleteiro Ramallo
Context Example Selection For LLM Generated Relevance
Assessments
Jack McKechnie, Graham McDonald and Craig Macdonald
Corpus Subsampling: Estimating the Effectiveness of Neural
Retrieval Models on Large Corpora
Maik Fröbe, Andrew Parry, Harrisen Scells, Shuai Wang,
Shengyao Zhuang, Guido Zuccon, Martin Potthast and Matthias
Hagen
PEIR: Modeling Performance in Neural Information Retrieval
Pooya Khandel, Andrew Yates, Ana Lucia Varbanescu, Maarten
de Rijke and Andy Pimentel
Towards Reliable Testing for Multiple Information Retrieval
System Comparisons
David Otero, Javier Parapar and Alvaro Barreiro

Domain-specific tasks and specific user groups

The Impact of Mainstream-Driven Algorithms on
Recommendations For Children
Robin Ungruh, Alejandro Bellogín and Maria Soledad Pera
Evaluating LLM Abilities to Understand Tabular Electronic
Health Records: A Comprehensive Study of Patient Data
Extraction and Retrieval
Jesus Lovon-Melgarejo, Martin Mouysset, Jo Oleiwan, Jose G
Moreno, Christine Damase-Michel and Lynda Tamine
exHarmony: Authorship and Citations for Benchmarking the
Reviewer Assignment Problem
Sajad Ebrahimi, Sara Salamat, Negar Arabzadeh, Mahdi
Bashari and Ebrahim Bagheri
Leveraging Query Terms for Efficient Legal Document
Recommendation
André Rolim, Leandro Marinho, Edleno Moura, Marcos
Domingues and Ricardo Oliveira
Advancing Math Formula Search Using Diverse Structural and
Symbolic Representations
Sumedh Vemuganti, Ayu Seiya and Nickvash Kani

From facts and fairness to adversaries

LIBRA: Measuring Bias of Large Language Model from a Local
Context
Bo Pang, Tingrui Qiao, Caroline Walker, Chris Cunningham
and Yun Sing Koh
Opt-in Transparent Fairness for Recommender Systems
Bjørnar Vassøy, Benjamin Kille and Helge Langseth
Enhancing FEVER-Style Claim Fact-Checking Against
Wikipedia: A Diagnostic Taxonomy and Generative Framework
Anton Chernyavskiy, Dmitry Ilvovsky and Preslav Nakov
News Without Borders: Domain Adaptation of Multilingual
Sentence Embeddings for Cross-lingual News Recommendation
Andreea Iana, Fabian David Schmidt, Goran Glavaš and Heiko
Paulheim
Towards Efficient and Explainable Hate Speech Detection via
Model Distillation
Paloma Piot and Javier Parapar
Enhancing Utility in Differentially Private Recommendation
Data Release via Exponential Mechanism
Antonio Ferrara, Angela Di Fazio, Alberto Carlo Maria
Mancino, Tommaso Di Noia and Eugenio Di Sciascio

Graphs & RAG

Graph Representation of Tables+Text and Compact Subgraph
Retrieval for QA Tasks
Vishwajeet Kumar, Jaydeep Sen, Bhawna Chelani and Soumen
Chakrabarti
Higher Order Knowledge Graph Embeddings
Giuseppe Pirrò
Town Mice versus Country Mice: Urban Bias in Job
Recommender Systems
Roan Schellingerhout, Francesco Barile and Nava Tintarev
Graph-Convolutional Networks: Named Entity Recognition and
Large Language Model Embedding in Document Clustering
Imed Keraghel and Mohamed Nadif
Leveraging Retrieval-Augmented Generation for Keyphrase
Synonym Suggestion
Jorge Gabín and Javier Parapar
Is Relevance ‘Lost in Transmission’ from Retriever to
Generator?
Fangzheng Tian, Debasis Ganguly and Craig Macdonald

Recommenders

Repeat-bias-aware Optimization of Beyond-accuracy Metrics
for Next Basket Recommendation
Yuanna Liu, Ming Li, Mohammad Aliannejadi and Maarten de
Rijke
CountNet: Utilising Repetition Counts in Sequential
Recommendation
Aleksandr V. Petrov, Efi Karra Taniskidou and Sean Murphy
Feature Attribution Explanations of Session-based
Recommendations
Simone Borg Bruun, Maria Maistro and Christina Lioma
Embedding Cultural Diversity in Prototype-based Recommender
Systems
Armin Moradi, Nicola Neophytou, Florian Carichon and
Golnoosh Farnadi
Town Mice versus Country Mice: Urban Bias in Job
Recommender Systems
Roan Schellingerhout, Francesco Barile and Nava Tintarev
LLM is Knowledge Graph Reasoner: LLM’s Intuition-aware
Knowledge Graph Reasoning for Cold-start Sequential
Recommendation
Keigo Sakurai, Ren Togo, Takahiro Ogawa and Miki Haseyama

Conversational and Robust IR

Zero-Shot and Efficient Clarification Need Prediction in
Conversational Search
Lili Lu, Chuan Meng, Federico Ravenda, Mohammad Aliannejadi
and Fabio Crestani
Improving the Re-Usability of Conversational Search Test
Collections
Zahra Abbasiantaeb, Chuan Meng, Leif Azzopardi and Mohammad
Aliannejadi
Malevolence Attacks Against Pretrained Dialogue Models
Pengjie Ren, Ruiqi Li, Zhaochun Ren, Zhumin Chen, Maarten
de Rijke and Yangjun Zhang
mFollowIR: a Multilingual Benchmark for Instruction
Following in Information Retrieval
Orion Weller, Benjamin Chang, Eugene Yang, Mahsa Yarmohammadi, Sam Barham, Sean MacAvaney, Arman Cohan, Luca Soldaini, Benjamin Van Durme and Dawn Lawrie
Query Performance Prediction using Dimension Importance
Estimators
Guglielmo Faggioli, Nicola Ferro, Raffaele Perego and
Nicola Tonellotto
On the Robustness of Generative Information Retrieval
Models: An Out-of-Distribution Perspective
Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Changjiang Zhou,
Maarten de Rijke and Xueqi Cheng

About rankers and rerankers

Guiding Retrieval using Large Language Models
Mandeep Rathee, Sean MacAvaney and Avishek Anand
Set-Encoder: Permutation-Invariant Inter-Passage Attention
for Listwise Passage Re-Ranking with Cross-Encoders
Ferdinand Schlatt, Maik Fröbe, Harrisen Scells, Shengyao
Zhuang, Bevan Koopman, Guido Zuccon, Benno Stein, Martin
Potthast and Matthias Hagen
Rank-without-GPT: Building GPT-Independent Listwise
Rerankers on Open-Source Large Language Models
Xinyu Zhang, Sebastian Hofstätter, Patrick Lewis, Raphael
Tang and Jimmy Lin
An Investigation of Prompt Variations for Zero-shot LLM-based Rankers
Shuoqi Sun, Shengyao Zhuang, Shuai Wang and Guido Zuccon
Can Large Language Models Effectively Rerank News Articles
for Background Linking?
Marwa Essam and Tamer Elsayed
One size doesn’t fit all: Predicting the Number of Examples
for In-Context Learning
Manish Chandra, Debasis Ganguly and Iadh Ounis

Across modalities and languages

A Multi-modal Recipe for Improved Multi-domain
Recommendation
Zixuan Yi and Iadh Ounis
Towards Identity-Aware Cross-Modal Retrieval: a Dataset and
a Baseline
Nicola Messina, Lucia Vadicamo, Leo Maltese and Claudio
Gennaro
Patent Figure Classification using Large Vision-language
Models
Sushil Awale, Eric Müller-Budack and Ralph Ewerth
MVAM: Multi-View Attention Method for Fine-grained
Image-Text Matching
Wanqing Cui, Rui Cheng, Jiafeng Guo and Xueqi Cheng
Maybe you are looking for CroQS: Cross-modal Query
Suggestion for Text-to-Image Retrieval
Giacomo Pacini, Fabio Carrara, Nicola Messina, Nicola
Tonellotto, Giuseppe Amato and Fabrizio Falchi
Visual Latent Captioning – Towards Verbalizing Vision
Transformer Encoders
Sogol Haghighat, Tim Daniel Metzler, Santosh Thoduka and
Sebastian Houben

Efficiency in IR and NLP

MURR: Model Updating with Regularized Replay for Searching
a Document Stream
Eugene Yang, Nicola Tonellotto, Dawn Lawrie, Sean
MacAvaney, James Mayfield, Douglas Oard and Scott Miller
Token Pruning Optimization for Efficient Dense Retrieval
with Multi-Vector Representations
Shanxiu He, Mutasem Al-Darabsah, Suraj Nair, Jonathan May,
Tarun Agarwal, Tao Yang and Choon Hui Teo
CUP: a Framework for Resource-Efficient Review-Based
Recommenders
Ghazaleh Haratinezhad Torbati, Anna Tigunova, Gerhard
Weikum and Andrew Yates
LSTM-based Selective Dense Text Retrieval Guided by Sparse
Lexical Retrieval
Yingrui Yang, Parker Carlson, Yifan Qiao, Wentai Xie,
Shanxiu He and Tao Yang
Leveraging High-Resolution Features for Improved Deep
Hashing-based Image Retrieval
Aymen Berriche, Mehdi Zakaria Adjal and Riyadh Baghdadi
Decoding the Hierarchy: A Hybrid Approach to Hierarchical
Multi-Label Text Classification
Fatos Torba, Christophe Gravier, Charlotte Laclau,
Abderrahmen Kammoun and Julien Subercaze

Findings

Evaluating Auto-complete Ranking for Diversity and Relevance
Sonali Singh, Sachin Farfade and Prakash Mandayam Comar
Semantically Proportioned nDCG for Explaining ColBERT’s
Learning Process
Ariane Mueller and Craig Macdonald
Exploring the relationship between listener receptivity and
source of music recommendations
John Paul Vargheese, Marianne Wilson, Katherine Stephen,
Rachel Salzano and David Brazier
Uncertainty Estimation in the Real World: A study on Music
Emotion Recognition
Karn N Watcharasupat, Yiwei Ding, Aleksandra T Ma, Pavan
Seshadri and Alexander Lerch
Unraveling the Impact of Visual Complexity on Search as
Learning
Wolfgang Gritz, Anett Hoppe and Ralph Ewerth
Semi-supervised image-based narrative extraction: A case
study with historical photographic records
Fausto German, Brian Keith, Mauricio Matus, Diego Urrutia
and Claudio Meneses
FrameworkX: A Reusable RAG Framework and Baselines for
TrackY
Ronak Pradeep, Nandan Thakur, Sahel Sharifymoghaddam, Eric
Zhang, Ryan Nguyen, Daniel Campos, Nick Craswell and Jimmy
Lin
Lost but Not Only in the Middle: Positional Bias in
Retrieval Augmented Generation
Jan Hutter, David Rau, Maarten Marx and Jaap Kamps
Evaluating Sequential Recommendations in the Wild: A Case
Study on Offline Accuracy, Click Rates, and Consumption
Anastasiia Klimashevskaia, Snorre Alvsvåg, Christoph
Trattner, Alain D. Starke, Astrid Tessem and Dietmar Jannach
Biased PromptORE: Enhancing Relation Extraction in Gendered
Languages and Complex Texts – The Case of Spanish Documents
from the XVI Century
Héctor López Hidalgo, Michel Boeglin, David Kahn, Josiane
Mothe, Diego Ortiz and David Panzoli
Efficient Session Retrieval Using Topical Index Shards
Gijs Hendriksen, Djoerd Hiemstra and Arjen de Vries

CLEF & Repro Tracks 1

EXIST 2025: Learning with Disagreement for Sexism Identification and Characterization in Tweets, Memes, and TikTok Videos
Laura Plaza, Jorge Carrillo-De-Albornoz, Iván Arcos, Paolo Rosso, Damiano Spina, Enrique Amigó, Julio Gonzalo and Roser Morante
Towards Reproducibility of Interactive Retrieval Experiments: Framework and Case Study
Jana Isabelle Friese and Norbert Fuhr
Combining and Evaluating Query Performance Predictors: A Reproducibility Study
Sourav Saha, Suchana Datta, Dwaipayan Roy, Mandar Mitra and Derek Greene
LongEval at CLEF 2025: Longitudinal Evaluation of IR Model Performance
Matteo Cancellieri, Alaa El-Ebshihy, Tobias Fink, Petra Galuščáková, Gabriela González Sáez, Lorraine Goeuriot, David Iommi, Jüri Keller, Petr Knoth, Philippe Mulhem, Florina Piroi, David Pride and Philipp Schaer
On the Reproducibility of: Adapting Learned Sparse Retrieval for Long Documents
Emmanouil Georgios Lionis and Jia-Huei Ju
Fact vs. Fiction: Are the Reportedly “Magical” LLM-Based Sequential Recommenders Reproducible?
Shirin Tahmasebi, Narjes Nikzad, Amir H. Payberah, Meysam Asgari-Chenaghlu and Mihhail Matskin
BioASQ at CLEF2025: The thirteenth edition of the large-scale biomedical semantic indexing and question answering challenge
Anastasios Nentidis, Georgios Katsimpras, Anastasia Krithara, Martin Krallinger, Miguel Rodriguez Ortega, Natalia Loukachevitch, Andrey Sakhovskiy, Elena Tutubalina, Grigorios Tsoumakas, George Giannakoulas, Alexandra Bekiaridou, Athanasios Samaras, Giorgio Maria Di Nunzio, Nicola Ferro, Stefano Marchesin, Laura Menotti, Gianmaria Silvello and Georgios Paliouras
A Reproducibility Study on Consistent LLM Reasoning for Natural Language Inference over Clinical Trials
Artur Guimarães, João Magalhães and Bruno Martins
eRisk 2025: Contextual and Conversational Approaches for Depression Challenges
Javier Parapar, Anxo Perez, Xi Wang and Fabio Crestani
LifeCLEF 2025 Teaser: Challenges on Species Presence Prediction and Identification, and Individual Animal Identification
Alexis Joly, Lukáš Picek, Stefan Kahl, Hervé Goëau, Lukas Adam, Christophe Botella, Diego Marcos, Maximilien Servajean, César Leblanc, Theo Larcher, Jiri Matas, Klara Janouskova, Vojtěch Čermák, Kostas Papafitsoros, Robert Planqué, Willem-Pier Vellinga, Holger Klinck, Tom Denton, Pierre Bonnet and Henning Müller
ImageCLEF 2025: Multimedia Retrieval in Medical, Social Media and Content Recommendation Applications
Bogdan Ionescu, Henning Müller, Dan-Cristian Stanciu, Ahmad Idrissi-Yaghir, Ahmedkhan Radzhabov, Alba García Seco de Herrera, Alexandra Andrei, Andrea Storås, Asma Ben Abacha, Benjamin Bracke, Benjamin Lecouteux, Benno Stein, Cécile Macaire, Christoph Friedrich, Cynthia Sabrina Schmidt, Didier Schwab, Dimitar Dimitrov, Emmanuelle Esperança-Rodier, Gabriel Constantin, Hendrik Damm, Henning Schäfer, Ivan Rodkin, Johannes Kiesel, Johannes Rückert, Liviu-Daniel Stefan, Louise Bloch, Martin Potthast, Maximilian Heinrich, Helmut Becker, Ivan Koychev, Josep Malvehy, Michael Riegler, Mihai Dogariu, Noel Codella, Pål Halvorsen, Preslav Nakov, Raphael Brüngel, Roberto Andres Novoa, Rocktim Jyoti Das, Steven A. Hicks, Sushant Gautam, Tabea M. G. Pakull, Vajira Thambawita, Vassili Kovalev, Wen-Wai Yim and Zhuohan Xie

CLEF & Repro Tracks 2

CLEF 2025 SimpleText Track: Simplify Scientific Text (and Nothing More)
Liana Ermakova and Jaap Kamps
CLEF 2025 JOKER Lab: Humour in the Machine
Liana Ermakova, Anne-Gwenn Bosser, Tristan Miller and Ricardo Campos
QuantumCLEF 2025 – The Second Edition of the Quantum Computing Lab at CLEF
Andrea Pasin, Maurizio Ferrari Dacrema, Paolo Cremonesi, Washington Cunha, Marcos Goncalves and Nicola Ferro
Overview of PAN 2025: Generative AI Detection, Multilingual Text Detoxification, Multi-Author Writing Style Analysis, and Generative Plagiarism Detection
Janek Bevendorff, Daryna Dementieva, Maik Fröbe, Bela Gipp, André Greiner-Petter, Jussi Karlgren, Maximilian Mayerl, Preslav Nakov, Alexander Panchenko, Martin Potthast, Artem Shelmanov, Efstathios Stamatatos, Benno Stein, Yuxia Wang, Matti Wiegmann and Eva Zangerle
The CLEF-2025 CheckThat! Lab: Subjectivity, Fact-Checking, Claim Extraction & Normalization, and Retrieval
Firoj Alam, Julia Maria Struß, Tanmoy Chakraborty, Stefan Dietze, Salim Hafid, Katerina Korre, Arianna Muti, Preslav Nakov, Federico Ruggeri, Sebastian Schellhamm, Vinay Setty, Megha Sundriyal, Konstantin Todorov and Venktesh Viswanathan
Revisiting Language Models in Neural News Recommender Systems: A Reproducibility Study
Yuyue Zhao, Jin Huang, David Vos and Maarten de Rijke
Overview of Touché 2025: Argumentation Systems
Johannes Kiesel, Çağrı Çöltekin, Marcel Gohsen, Sebastian Heineking, Maximilian Heinrich, Maik Fröbe, Tim Hagen, Mohammad Aliannejadi, Tomaž Erjavec, Matthias Hagen, Matyáš Kopp, Nikola Ljubešić, Katja Meden, Nailia Mirzakhmedova, Vaidas Morkevičius, Harrisen Scells, Ines Zelch, Martin Potthast and Benno Stein
Reproducing HotFlip for Corpus Poisoning Attacks in Dense Retrieval
Yongkang Li, Panagiotis Eustratiadis and Evangelos Kanoulas
TalentCLEF at CLEF2025: Skill and Job Title Intelligence for Human Capital Management
Luis Gasco, Hermenegildo Fabregat, Laura García-Sardiña, Daniel Deniz, Alvaro Rodrigo and Rabih Zbib
A Reproducibility Study for Joint Information Retrieval and Recommendation in Product Search
Simone Merlo, Guglielmo Faggioli and Nicola Ferro
Are Representation Disentanglement and Interpretability Linked in Recommendation Models? A Critical Review and Reproducibility Study
Ervin Dervishaj, Tuukka Ruotsalo, Maria Maistro and Christina Lioma
ELOQUENT CLEF Shared Tasks for Evaluation of Generative Language Model Quality, 2nd edition
Jussi Karlgren, Ekaterina Artemova, Ondřej Bojar, Timothee Mickus, Vladislav Mikhailov, Magnus Sahlgren, Erik Velldal and Lilja Øvrelid