Causal LLMs and Seq2Seq Architectures

The Hugging Face libraries have become the de-facto standard how to access foundation models from Python, both for inference and fine-tuning. This post describes how to use the Hugging Face APIs for different types of Large Language Models.

As part of the Transformers library Hugging Face provides Auto Classes which works well when accessing popular models. For other models and when training models it’s important to understand the different types of architectures:

Causal Language Modeling (CLM) - decoder
Masked Language Modeling (MLM) - encoder
Sequence-to-Sequence (Seq2Seq) - encoder and decoder

Below is a quick summary of the great blog Understanding Causal LLM’s, Masked LLM’s, and Seq2Seq: A Guide to Language Model Training Approaches.

Causal Language Modeling

Causal Language Modeling is typically used in decoder-based architectures, for example GPT, to generate text and for summarization. Causal Language Modeling is an autoregressive method where the model is trained to predict the next token in a sequence given the previous tokens.

Hugging Face API: transformers.AutoModelForCausalLM

  
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id="gpt2"
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

Masked Language Modeling

Masked Language Modeling is typically used in encoder-based architectures, for example BERT, to predict masked content as needed for text classification, sentiment analysis, and named entity recognition.

Hugging Face API: transformers.AutoModelForMaskGeneration

  
from transformers import AutoTokenizer, AutoModelForMaskedLM
model_id="bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

Sequence-to-Sequence

Sequence-to-Sequence is typically used in encoder-decoder-based architectures, for example FLAN-T5, to do translations, question answering and speech to text. Check out the Hugging Face documentation for details.

Hugging Face API: transformers.AutoModelForSeq2SeqLM

  
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_id="google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

Next Steps

The three architectures have pros and cons as explained in the referenced blog.

To learn more, check out the Watsonx.ai documentation and the Watsonx.ai landing page.