Notices

Events/Seminars: (Seminar) Capability Rubrics with LLM Annotations

Author: Administrator | Posted: 2025-04-14

Title: Capability Rubrics with LLM Annotations


Speaker: Prof. José Hernández-Orallo @ Universitat Politècnica de València


Time: 16:00 to 17:00, May 8th, 2025


Location: Online


Language: English speech & English slides


Abstract:

I'll present general scales for AI evaluation that can explain what common AI benchmarks really measure, extract ability profiles of AI systems, and predict their performance on new task instances, both in- and out-of-distribution. These scales are based on natural language rubrics that standard language models use to annotate thousands of instances from 63 textual tasks, with good inter-rater agreement. These demand levels make high predictive power possible at the instance level, providing better estimates than black-box baseline predictors based on embeddings or fine-tuning, especially in out-of-distribution settings (new tasks and new benchmarks).


Bio:

José Hernández-Orallo is Professor at the Universitat Politècnica de València (UPV), Spain, and Senior Research Fellow at the Leverhulme Centre for the Future of Intelligence, University of Cambridge, UK. He received a B.Sc. and an M.Sc. in Computer Science from UPV, partly completed at the École Nationale Supérieure de l'Électronique et de ses Applications (France), and a Ph.D. in Logic and Philosophy of Science with a doctoral extraordinary prize from the University of Valencia. His academic and research activities have spanned several areas of artificial intelligence, machine learning, data science and intelligence measurement, with a focus on a more insightful analysis of the capabilities, generality, progress, impact and risks of artificial intelligence. He has published five books and more than two hundred journal articles and conference papers on these topics. His research in the area of machine intelligence evaluation has been covered by several popular outlets, such as The Economist, New Scientist and Nature. He keeps exploring a more integrated view of the evaluation of natural and artificial intelligence, as vindicated in his book "The Measure of All Minds" (Cambridge University Press, 2017, PROSE Award 2018). He is a member of AAAI, CLAIRE and ELLIS, and a EurAI Fellow.