Notice

[Seminar] Towards Cost Efficient Use of Pre-trained Models

Posted by: Administrator · Date: 2025-05-15

Title: Towards Cost Efficient Use of Pre-trained Models


Speaker: Prof. Alan Ritter @ Georgia Tech


Time: 14:00-15:00, May 20, 2025


Location: Hybrid


Language: English speech & English slides


Abstract:

Large language models (LLMs) are driving rapid advances in AI, but these breakthroughs come with substantial costs. Training state-of-the-art models demands significant GPU resources for both pretraining and inference, as well as labeled data for post-training. In this talk, I will explore cost-utility tradeoffs that arise across several stages of model development, aiming to inform more efficient decision-making. First, I will examine pretraining-based adaptation, which incurs high computational costs when applied to new domains. Second, I will show that training and distilling large models can offer a cost-effective path to improved performance. Third, I will compare the tradeoffs between supervised fine-tuning and preference-based methods such as Direct Preference Optimization (DPO). Finally, I will present a method for extracting experimental data from scientific tables, enabling automated meta-analyses across thousands of papers on arXiv.org.
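As background for the supervised fine-tuning vs. preference-optimization comparison mentioned above, the standard per-example DPO loss can be sketched in a few lines. This is an illustrative sketch of the published DPO objective, not material from the talk; the function name and inputs (summed token log-probabilities of the chosen and rejected responses under the policy and the frozen reference model) are assumptions for the example.

```python
import math

def dpo_loss(pi_logp_chosen, pi_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * (policy-vs-reference
    log-ratio of the chosen response minus that of the rejected one))."""
    margin = beta * ((pi_logp_chosen - ref_logp_chosen)
                     - (pi_logp_rejected - ref_logp_rejected))
    # Numerically this is -log(sigmoid(margin)) = log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))

# When the policy already prefers the chosen response relative to the
# reference, the margin is positive and the loss drops below log(2).
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))
```

The hyperparameter `beta` controls how strongly the policy is pushed away from the reference model; unlike supervised fine-tuning, no gold target sequence is needed, only a preference pair.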


Bio:

Alan Ritter is an associate professor in the College of Computing at Georgia Tech. He carried out some of the earliest work on the use of language models to develop chatbots, including training them via end-to-end reinforcement learning. Alan is the recipient of various awards, including an NDSEG Fellowship, NSF CAREER Award, Amazon Research Award, and a Sony Faculty Innovation Award, along with multiple paper awards. His research has garnered media attention from WIRED, TNW, Bloomberg, and VentureBeat.