Cornell Tech - Award-Winning Paper Unravels Challenges of Scaling Language Models

By: Sarah Marquart

Alexander “Sasha” Rush, Associate Professor of Computer Science at Cornell Tech, and his colleagues from Hugging Face earned an Outstanding Main Track Runner-Up award at the Conference on Neural Information Processing Systems (NeurIPS) in December 2023.

Their winning paper, Scaling Data-Constrained Language Models, was among six recognized by the awards committee out of a record 13,321 submissions. The team’s research delves into the science of scaling large language models (LLMs), particularly studying the impact of training dataset size.

The authors explain that if the training of LLMs, the technology behind AI chatbots like ChatGPT, continues to scale indefinitely, we will quickly reach the point where there isn’t enough existing data to support further learning. According to an October 2022 study the authors cite, high-quality English language data could be exhausted as soon as this year, with low-quality data following as early as 2030.

Anticipating these impending challenges, Rush and his colleagues explored optimal strategies for scaling large language models in data-limited environments. They focused on solutions that strike a delicate balance between performance and cost, taking into account elements such as computational resources and environmental strain.

Their award-winning research revealed that there are indeed limits on the scaling horizon and suggested the need for more effective utilization of available data. The authors are optimistic that their findings will pave the way for understanding how models gain their capabilities using existing data.

“Large language models are powered by data, and they get better because of high-quality human-written text,” says Rush. “It’s critical to remember that the work of writers, from journalists to stack-overflow experts, forms the basis of what we call generative AI.”


