Imperial News

Research into more efficient AI hardware and software supported by AMD donation

by David Silverman

Imperial has received a donation from high performance and adaptive computing company AMD to support research into machine learning.

AMD made the donation in recognition of the excellent research of Professor George Constantinides and Dr Aaron Zhao in Imperial’s Department of Electrical and Electronic Engineering. It will support research to improve the performance and energy consumption of large language models (LLMs) similar to those used for ChatGPT.

Professor Constantinides said: “I’m very grateful to AMD for supporting our research, which will contribute to further advances in artificial intelligence and the upskilling of research students who will be well placed to contribute to the next wave of advances in hardware-oriented machine learning.”

Reconfiguring processors

Computer processors are more complex versions of the circuits that control devices such as toasters. Like many of those simpler circuits, they make use of logic gates, which follow simple rules that link their inputs to their outputs (binary 1s or 0s). For example, an OR gate outputs ‘1’ if at least one of its inputs is a 1, otherwise it outputs ‘0’. A modern computer processor contains millions of logic gates joined in a configuration that allows them to perform advanced computations.
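As a purely illustrative sketch (not from the article), the gate behaviour described above can be modelled in a few lines of Python, including how simple gates combine into a useful unit such as a one-bit half adder:

```python
# Illustrative model of logic gates as functions on bits (0 or 1).

def OR(a, b):
    # Outputs 1 if at least one input is 1, otherwise 0.
    return 1 if (a or b) else 0

def AND(a, b):
    # Outputs 1 only if both inputs are 1.
    return 1 if (a and b) else 0

def XOR(a, b):
    # Outputs 1 if exactly one input is 1.
    return 1 if a != b else 0

def half_adder(a, b):
    """Combine gates to add two bits: XOR gives the sum bit,
    AND gives the carry bit."""
    return XOR(a, b), AND(a, b)

print(OR(0, 0), OR(0, 1), OR(1, 1))  # 0 1 1
print(half_adder(1, 1))              # (0, 1): sum 0, carry 1
```

Chaining millions of such gates in a fixed arrangement is, in essence, what a manufactured processor does.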


Creating an optimal processor design requires knowing what kind of software the processor will run – for example, whether it is mainly processing text or images, and how important features such as speed and accuracy are in a given use case. Specialist applications such as LLMs, which trade in a specific type of data and are highly processing-intensive, could therefore benefit from special hardware.

It is not usually cost-effective to manufacture a processor with just one application in mind. However, reconfigurable processors known as field-programmable gate arrays (FPGAs) allow programmers to reconfigure sets of logic gates in the way they choose.

FPGAs are used in settings such as laboratories and are available for large-scale use via cloud services such as Amazon Web Services and Microsoft Azure. They can be used to experiment with new processor designs before those designs are manufactured as fixed-function integrated circuits.

The Imperial researchers will use some of the funding from AMD to explore how to configure FPGAs to deliver higher-performing LLMs.

Finding the right numbers

One task the research group is pursuing is to find the optimal design of arithmetic units in processors, which carry out arithmetic over numbers represented using bits.


All computations, even over text or images, reduce to computations over numbers. Yet real numbers can contain infinitely long strings of digits, and computers have only a limited number of bits (binary 1s or 0s) with which to represent them.
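A familiar illustration of this limit (not from the article) is that even a simple decimal like 0.1 has no exact finite binary representation, so stored values are approximations:

```python
# With a finite number of bits, many real numbers can only be
# stored approximately. 0.1 and 0.2 are both rounded to the
# nearest representable binary value, so their sum is not
# exactly 0.3.
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False
```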

Computers therefore have to work with approximations of real numbers, and engineers aim to design number systems that provide sufficient precision while also optimising for accuracy, speed, and energy efficiency.

Recent research by Professor Constantinides and Dr Zhao showed that machine learning models can be trained to perform very accurately even on processors that represent numbers with an unusually low level of precision.

“AI models such as LLMs, which generate text, and diffusion models, which generate images, have a very interesting characteristic – they are usually very forgiving. If you represent numbers in a very lossy manner – with fewer bits, or a more finite precision – they tolerate this because there are inherently a lot of redundancies in the network,” explains Dr Zhao. “By nature they are very effective at recognising patterns in noisy data and more generally working around whatever constraints you place on them.”
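The kind of lossy, fewer-bit representation Dr Zhao describes can be sketched with a hypothetical example (not the researchers’ actual method): snapping values onto a coarse grid with a chosen number of bits, as a low-precision arithmetic unit might, and observing the error that the model would have to tolerate:

```python
# Hypothetical sketch of lossy low-bit representation: round a
# value onto a uniform grid with 2**bits levels in [lo, hi].

def quantize(x, bits, lo=-1.0, hi=1.0):
    levels = 2 ** bits - 1          # number of grid steps
    step = (hi - lo) / levels
    x = min(max(x, lo), hi)         # clamp into the representable range
    return lo + round((x - lo) / step) * step

w = 0.731                           # an example model weight
for bits in (8, 4, 2):
    q = quantize(w, bits)
    print(f"{bits}-bit: {q:.4f}  (error {abs(w - q):.4f})")
```

With fewer bits the grid is coarser and the error grows; the research cited above found that models can be trained to remain accurate despite such losses.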

Optimising machine learning models

The researchers are also aiming to optimise the design of machine learning models that are trained and deployed on the processors. “Usually I’ve got some hardware and I have to write software for it. But with FPGAs we have to do it the other way round. The reality is we have to look at both of these ends,” explains Professor Constantinides. “Because we can develop the processor architecture around the algorithm, we get really high performance.” 

The future of machine learning

Dr Zhao says that this research could inform the design of FPGA configurations and machine learning models used, for example, to accelerate scientific research. It also has the potential to support the continued development of processors designed specifically for machine learning, such as AMD’s neural processing units (NPUs).

“If AI architectures keep changing and evolving, then it’s really hard to have fixed silicon optimised for them. In this case, reconfigurable devices can shine,” he explains. “But FPGAs naturally run at a lower clock frequency – the configurability comes at a cost to clock rate. If we see AI models becoming stable, then we also see our research as trying to influence the next generation of ASIC [application-specific integrated circuit] processor design.”

This could potentially enable low-power AI-enabled laptops, cloud-provided assistants and AI embedded into future cars and robots.