What is a Large Language Model (LLM)?


A Large Language Model (LLM) is a type of AI system trained on massive amounts of text data to understand and generate human language, producing natural-sounding text in response to prompts. LLMs are typically built on deep learning architectures, most often transformers, and learn the relationships between words, phrases, and meanings.

LLMs are called “large” because of the volume of text they are trained on (trillions of words) and their number of parameters (often hundreds of billions). Examples of LLMs you may have heard of include OpenAI’s GPT family, Google’s PaLM, and Meta’s LLaMA models.

How do LLMs work?

LLMs work by predicting the most likely next word (or token) in a sequence, given the words that came before it. They learn from enormous text corpora through self-supervised learning, modeling the statistical patterns of language without any human-provided labels.
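The next-token idea can be sketched with a toy bigram model (the corpus and the counting approach here are purely illustrative; real LLMs use neural networks over subword tokens, not word counts):

```python
# Toy next-token prediction: the "labels" come from the text itself,
# which is what makes the training self-supervised.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# "Training": count which word follows each word in the corpus.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next word observed after `word`."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # prints "cat" -- it follows "the" most often
```

An LLM does the same kind of prediction, but with a learned neural network instead of raw counts, which lets it generalize to sequences it has never seen.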

Most LLMs are built on the transformer architecture, which uses mechanisms such as self-attention to capture relationships between words across long sequences of text. Input tokens pass through many layers of processing, and the transformer builds internal representations of context, semantics, and even style. At inference time, the model applies this knowledge to generate responses, summaries, code snippets, translations, and more.
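A minimal sketch of the self-attention mechanism mentioned above (shapes and inputs are illustrative; real transformers use learned query/key/value projections, multiple heads, and many stacked layers):

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over token embeddings X.

    Each token's output is a weighted blend of every token's vector,
    with weights derived from pairwise similarity -- this is how the
    model relates words across a sequence.
    """
    d = X.shape[-1]
    Q, K, V = X, X, X                       # real models learn these projections
    scores = Q @ K.T / np.sqrt(d)           # similarity between every token pair
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V                      # mix values by attention weight

X = np.random.randn(4, 8)   # 4 tokens, each an 8-dimensional embedding
out = self_attention(X)
print(out.shape)            # (4, 8): one contextualized vector per token
```

Because every token attends to every other token, the model can connect words that are far apart in the text, which older sequential architectures struggled with.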

What are the key capabilities of LLMs?

LLMs are versatile tools with a wide range of applications across industries. Their primary capabilities include:

  • Text generation: drafting articles, emails, and marketing copy
  • Summarization: condensing long documents into their key points
  • Translation: converting text between languages
  • Question answering: responding to queries in natural language
  • Code generation: writing, explaining, and debugging source code

What are the limitations of LLMs?

Despite their strengths, LLMs also face notable constraints that organizations and users must account for:

  • Hallucinations: they can produce fluent but factually incorrect output
  • Bias: they can reproduce biases present in their training data
  • Stale knowledge: they only know what appeared in their training data, up to a cutoff date
  • Limited context: they can only consider a fixed amount of text at once
  • Cost: training and running them requires substantial compute resources

Got questions?

Ask our consultants today—we’re excited to assist you!

TALK TO US