In the past few years, Machine Learning and Large Language Models have taken the world by storm, with ChatGPT reaching over 180 million users and receiving approximately 1.6 billion visits per month in 2024. But this rapid growth raises a question: How do we keep Large Language Models accessible and efficient at a pace that matches their rapid growth?

A large part of the current struggle to make these programs more efficient lies in the fact that the hardware supporting them has not advanced as quickly as the software running on it. Every time someone types a question into ChatGPT, five computers work to produce an answer – a task that consumes substantial resources and will only grow more costly if current trends continue.

If we want to increase the accessibility, efficiency, and performance of Machine Learning, we need to improve the hardware it runs on. This is the focus of Mohamed Abdelfattah, an Assistant Professor at Cornell Tech, who has received a prestigious U.S. National Science Foundation (NSF) Faculty Early Career Development (CAREER) Award to develop specialized computer chips and software that enhance AI performance.

The award supports his research proposal, “Efficient Large Language Model Inference Through Codesign: Adaptable Software Partitioning and FPGA-based Distributed Hardware,” for a five-year period from 2024 through 2029 with a total amount of $883,082.

“The key challenge is still scaling; we need to make these models bigger and add more data to make them capable, but we don’t yet have the right computing platforms,” said Abdelfattah. “Rethinking hardware architecture together with software and algorithms is crucial for unleashing the generative tasks. The NSF project proposes optimizing the entire computing stack, composed of three main areas of algorithms, software, and hardware, to make distributed and large-scale language models run more efficiently.”
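The "adaptable software partitioning" in the project's title refers broadly to splitting a large model across many devices so they can share the work of inference. As a loose illustration only – not the project's actual algorithm – the sketch below greedily divides an LLM's layers into contiguous, roughly cost-balanced blocks, one per device. The function name and the greedy heuristic are hypothetical, chosen just to make the idea concrete.

```python
# Illustrative sketch of model partitioning for distributed inference:
# assign contiguous blocks of layers to devices so each device carries
# roughly equal compute. A simplified stand-in, not the proposal's method.

def partition_layers(layer_costs, num_devices):
    """Split layers (given per-layer costs) into contiguous, balanced blocks."""
    total = sum(layer_costs)
    target = total / num_devices  # ideal compute per device
    partitions, block, acc = [], [], 0.0
    for i, cost in enumerate(layer_costs):
        block.append(i)
        acc += cost
        # Close the block once it reaches the per-device target, while
        # leaving at least one layer for each remaining device.
        remaining_layers = len(layer_costs) - i - 1
        remaining_devices = num_devices - len(partitions) - 1
        if acc >= target and remaining_devices > 0 and remaining_layers >= remaining_devices:
            partitions.append(block)
            block, acc = [], 0.0
    partitions.append(block)
    return partitions

# Example: 8 equal-cost layers over 4 devices -> 2 layers per device.
print(partition_layers([1.0] * 8, 4))
```

In practice the cost model, device capabilities, and communication overhead all matter, which is why the proposal pairs the software partitioning with codesigned FPGA-based hardware rather than treating the two in isolation.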

Before becoming a professor at Cornell Tech, Abdelfattah spent six years as a principal scientist at Samsung Electronics working on hardware. During his time in industry he realized that Machine Learning was the future, so he began exploring ways to combine his hardware expertise with the field.

When Machine Learning and Large Language Models took off, and the scale of Artificial Intelligence grew exponentially beyond what it had been even a few years earlier, the research and engineering community faced a dilemma: a new technology that worked remarkably well but was developing so rapidly that it was unsustainable. Abdelfattah saw the dilemma as an opportunity to drastically push the limits of running this novel technology at scale.

“Large Language Models had all the makings of a challenging research project to tackle,” he said. “Getting these systems to work efficiently at the rate they’re developing is a massive feat that we can’t overcome by making our systems 10% better; the level of impact requires us to work toward a solution that makes them 100% better.”

Abdelfattah’s work in making Machine Learning more efficient and accessible is crucial for its financial sustainability and growth. Presently, Large Language Models struggle with economic profitability: platforms end up losing money because of the energy consumption required to deploy them.

Decreasing energy consumption, paired with increased efficiency, will make the technology viable to deploy at large scale, opening up what Machine Learning and Large Language Models have the potential to achieve. With greater accessibility allowing more people to use these models in everything from coding to the legal field, Abdelfattah’s work will help shape the future of our productivity and efficiency.