Retrieval Augmented Generation (RAG) solutions like Question Answering are not easy to evaluate. This post summarizes some options with their pros and cons. Question Answering solutions are more t...
Retrieval Augmented Generation with Chroma and LangChain
ChatGPT is more than just the GPT model. Similarly, the AI task Question Answering involves more than invoking a single model. This post describes a simple flow that leverages vector search via Chroma...
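A minimal sketch of the retrieval step of such a flow, assuming a small sentence-transformers embedding model and the classic LangChain import paths (both are examples and the paths may differ between LangChain versions):

```python
# Index a few documents in Chroma and fetch the ones most similar to a question.
# Model name and import paths are assumptions, not necessarily the post's setup.
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

documents = [
    "Chroma is an open source embedding database.",
    "LoRA freezes the original weights and trains small adapter matrices.",
    "Watsonx.ai is IBM's studio for working with foundation models.",
]

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = Chroma.from_texts(documents, embeddings)

question = "Which database can store embeddings?"
for doc in store.similarity_search(question, k=2):
    print(doc.page_content)
```

The retrieved passages would then be placed into the prompt of the language model that generates the final answer.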
Searching Documents related to Questions via Embeddings
In large language models, embeddings are numerical representations of words, phrases, or sentences that capture their meaning and context. This post describes how to use embeddings to search and rerank documents...
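A minimal sketch of search plus reranking, assuming the sentence-transformers library; the bi-encoder and cross-encoder model names are examples, not necessarily the ones used in the post:

```python
# Step 1: embedding-based search with a bi-encoder.
# Step 2: rerank the best candidates with a cross-encoder that reads
# question and passage together.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

passages = [
    "Chroma stores embeddings and supports similarity search.",
    "LoRA adds small trainable matrices to a frozen model.",
    "Falcon is an open source large language model.",
]
question = "How can I search documents by meaning?"

bi_encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
scores = util.cos_sim(bi_encoder.encode(question), bi_encoder.encode(passages))[0]
candidates = sorted(zip(passages, scores.tolist()), key=lambda x: x[1], reverse=True)[:2]

cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
rerank_scores = cross_encoder.predict([(question, passage) for passage, _ in candidates])
for (passage, _), score in sorted(zip(candidates, rerank_scores), key=lambda x: x[1], reverse=True):
    print(round(float(score), 3), passage)
```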
Fine-tuning small LLMs with Output from large LLMs
Innovation in the AI space continues at an incredible pace. This post describes a recently evolved technique that allows smaller fine-tuned models to almost reach the performance of much larger models...
Generating synthetic Data with Large Language Models
When fine-tuning large language models, there is often not enough data available. This post describes how to use the Falcon model to generate synthetic data. An incredible feature of Large Language Models...
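A minimal sketch of such a generation step, assuming the instruct variant of Falcon via the transformers pipeline API; the model name, prompt, and sampling settings are examples, and a GPU is needed for a model of this size:

```python
# Prompt a large language model to produce synthetic training examples.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="tiiuae/falcon-7b-instruct",  # example model, not necessarily the post's choice
    device_map="auto",
)

prompt = (
    "Write three different questions a customer might ask about a late delivery, "
    "each on its own line."
)
output = generator(prompt, max_new_tokens=120, do_sample=True, temperature=0.8)
print(output[0]["generated_text"])
```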
Instruction Fine-tuning of Large Language Models
This post explains what ‘instruct’ versions of Large Language Models are and how instructions can be used for efficient fine-tuning. Often there are instruct versions of popular Large Language Models...
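As a rough illustration, an instruction-style training record typically combines an instruction, an optional input, and the expected response into one prompt. The Alpaca-style template below is an assumption; real instruct models each define their own format:

```python
# Turn a plain example into an instruction-style training record.
def to_instruction_record(instruction: str, context: str, response: str) -> str:
    # The section markers below are an assumed template, not a universal standard.
    return (
        "### Instruction:\n" + instruction + "\n\n"
        "### Input:\n" + context + "\n\n"
        "### Response:\n" + response
    )

record = to_instruction_record(
    instruction="Summarize the support ticket in one sentence.",
    context="Customer reports that the mobile app crashes when uploading photos.",
    response="The mobile app crashes during photo uploads.",
)
print(record)
```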
Efficient Fine-tuning with PEFT and LoRA
Classic fine-tuning of Large Language Models typically changes most of the models' weights, which requires a lot of resources. LoRA-based fine-tuning freezes the original weights and only trains a small number of additional weights...
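A minimal sketch of a LoRA setup with the PEFT library; the base model and hyperparameters are examples, not the post's exact configuration:

```python
# Wrap a frozen base model with small trainable LoRA adapters.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,                # rank of the adapter matrices
    lora_alpha=32,      # scaling factor applied to the adapter output
    lora_dropout=0.05,
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```

Only the injected adapter weights are updated during training, which is why the trainable fraction printed at the end is typically well below one percent of all parameters.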
Running Python locally
There are several ways to run Python code locally. Often it is desirable to run it in virtual environments and containers to be able to run multiple configurations in parallel and to easily remove co...
Open Source LLMs in Watsonx.ai
Watsonx.ai is IBM’s offering to train, validate, tune, and deploy generative AI based on foundation models. It comes with several open source models which are briefly described in this post. As a ...
Causal LLMs and Seq2Seq Architectures
The Hugging Face libraries have become the de-facto standard for accessing foundation models from Python, both for inference and fine-tuning. This post describes how to use the Hugging Face APIs for both architectures...
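A minimal sketch contrasting the two architectures; the model names are small examples chosen so the snippet can run on a CPU:

```python
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
)

# Causal (decoder-only) model: continues the prompt token by token.
causal_tok = AutoTokenizer.from_pretrained("gpt2")
causal_model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = causal_tok("Large language models are", return_tensors="pt")
out = causal_model.generate(**inputs, max_new_tokens=20)
print(causal_tok.decode(out[0], skip_special_tokens=True))

# Seq2Seq (encoder-decoder) model: maps an input sequence to a new output sequence.
s2s_tok = AutoTokenizer.from_pretrained("google/flan-t5-small")
s2s_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
inputs = s2s_tok("Translate to German: How are you?", return_tensors="pt")
out = s2s_model.generate(**inputs, max_new_tokens=20)
print(s2s_tok.decode(out[0], skip_special_tokens=True))
```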