heidloff.net - Building is my Passion
Niklas Heidloff

Retrieval Augmented Generation with Chroma and LangChain

ChatGPT is more than just the GPT model. Similarly, the AI task of Question Answering is also more than invoking just one model. This post describes a simple flow that leverages vector search via Chroma...
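
To give an idea of such a flow, here is a minimal sketch assuming LangChain and a local Chroma store; the file name, embedding model, and import paths are illustrative and vary across LangChain versions.

```python
# Minimal RAG sketch: index document chunks in Chroma, retrieve the most
# relevant ones for a question, and build a prompt from them.
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# Split a document into chunks and index them in a local Chroma collection.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(open("document.txt").read())
store = Chroma.from_texts(chunks, HuggingFaceEmbeddings())

# Retrieve the chunks most similar to the question and pass them to an LLM.
question = "What does the document say about embeddings?"
relevant = store.similarity_search(question, k=3)
prompt = (
    "Answer based on the context:\n"
    + "\n".join(doc.page_content for doc in relevant)
    + f"\nQuestion: {question}"
)
```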

Searching Documents related to Questions via Embeddings

In large language models, embeddings are numerical representations of words, phrases, or sentences that capture their meaning and context. This post describes how to use embeddings to search and rerank...
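
As a rough illustration, a small sketch of embedding-based search with the sentence-transformers library; the model name all-MiniLM-L6-v2 and the example texts are illustrative choices, not taken from the post.

```python
# Embed documents and a query, then rank documents by cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
documents = ["Chroma is a vector database.", "LoRA trains small adapter layers."]
doc_embeddings = model.encode(documents, convert_to_tensor=True)

query_embedding = model.encode("What is a vector database?", convert_to_tensor=True)
scores = util.cos_sim(query_embedding, doc_embeddings)[0]
print(documents[int(scores.argmax())])  # most similar document
```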

Fine-tuning small LLMs with Output from large LLMs

Innovation in the AI space is continuing at an incredible speed. This post describes a new technique that has evolved recently which allows smaller fine-tuned models to almost reach the performance...
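
A hedged sketch of the fine-tuning side of this idea: training a small model (GPT-2 here, purely illustrative) on instruction/response pairs produced earlier by a larger teacher model. The file teacher_outputs.jsonl and its field names are assumptions.

```python
# Fine-tune a small "student" model on text generated by a larger "teacher" model.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical JSONL with {"instruction": ..., "response": ...} pairs from the teacher.
dataset = load_dataset("json", data_files="teacher_outputs.jsonl")["train"]

def tokenize(example):
    text = example["instruction"] + "\n" + example["response"]
    return tokenizer(text, truncation=True, max_length=512, padding="max_length")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="student", num_train_epochs=1),
    train_dataset=dataset.map(tokenize),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```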

Generating synthetic Data with Large Language Models

When fine-tuning large language models, there is often not enough data available. This post describes how to use the Falcon model to generate synthetic data. An incredible feature of Large Language Models...
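
As a rough sketch, synthetic examples can be generated with a Falcon instruct model via the transformers pipeline; the prompt, sample count, and output file below are illustrative assumptions.

```python
# Generate synthetic training examples with Falcon and store them as JSONL.
import json
from transformers import pipeline

generator = pipeline("text-generation", model="tiiuae/falcon-7b-instruct",
                     trust_remote_code=True)

prompt = "Write a customer question about resetting a password and a helpful answer."
samples = []
for _ in range(3):
    out = generator(prompt, max_new_tokens=200, do_sample=True, temperature=0.8)
    samples.append({"prompt": prompt, "generated": out[0]["generated_text"]})

with open("synthetic.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```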

Instruction Fine-tuning of Large Language Models

This post explains what ‘instruct’ versions of Large Language Models are and how instructions can be used for efficient fine-tuning. Often there are instruct versions of popular Large Language Models...
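
For illustration, here is a minimal sketch of an Alpaca-style instruction template, a commonly used (not necessarily this post's) way to turn instruction/response pairs into training text.

```python
# Format an instruction/response pair into a single training prompt.
def format_example(example: dict) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['response']}"
    )

print(format_example({"instruction": "Summarize the text.",
                      "response": "The text explains instruction fine-tuning."}))
```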

Efficient Fine-tuning with PEFT and LoRA

Classic fine-tuning of Large Language Models typically changes most weights of the models, which requires a lot of resources. LoRA-based fine-tuning freezes the original weights and only trains a small...
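
A minimal sketch with the Hugging Face peft library; GPT-2 and the hyperparameters are illustrative. The point is that the base weights stay frozen and only the small adapter matrices are trainable.

```python
# Wrap a base model with LoRA adapters so that only a tiny fraction of weights is trained.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("gpt2")
lora_config = LoraConfig(
    r=8,                        # rank of the adapter matrices
    lora_alpha=16,
    target_modules=["c_attn"],  # attention projection in GPT-2
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports the small share of trainable weights
```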

Running Python locally

There are several ways to run Python code locally. Often it is desirable to run it in virtual environments and containers to be able to run multiple configurations in parallel and to easily remove co...
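
As one small example, Python's built-in venv module can create such an isolated environment programmatically, equivalent to running "python -m venv .venv" in a shell; the directory name is an arbitrary choice.

```python
# Create an isolated virtual environment with its own pip.
import venv

builder = venv.EnvBuilder(with_pip=True, clear=False)
builder.create(".venv")
# Activate it afterwards, e.g. "source .venv/bin/activate" on Linux/macOS.
```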

Open Source LLMs in Watsonx.ai

Watsonx.ai is IBM’s offering to train, validate, tune, and deploy generative AI based on foundation models. It comes with several open source models which are briefly described in this post. As a ...

Causal LLMs and Seq2Seq Architectures

The Hugging Face libraries have become the de-facto standard for accessing foundation models from Python, both for inference and fine-tuning. This post describes how to use the Hugging Face APIs fo...
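
A small sketch contrasting the two architectures with the Hugging Face APIs; gpt2 and google/flan-t5-small are illustrative model choices, not the ones discussed in the post.

```python
# Causal (decoder-only) vs. seq2seq (encoder-decoder) inference with transformers.
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

# Causal LM: continues the prompt token by token.
causal_tok = AutoTokenizer.from_pretrained("gpt2")
causal_lm = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = causal_tok("Large language models are", return_tensors="pt")
print(causal_tok.decode(causal_lm.generate(**inputs, max_new_tokens=20)[0]))

# Seq2seq LM: encodes the full input, then decodes a new output sequence.
seq_tok = AutoTokenizer.from_pretrained("google/flan-t5-small")
seq_lm = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
inputs = seq_tok("Translate to German: How are you?", return_tensors="pt")
print(seq_tok.decode(seq_lm.generate(**inputs, max_new_tokens=20)[0],
                     skip_special_tokens=True))
```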

Decoding Methods for Generative AI

With watsonx.ai, you can train, validate, tune, and deploy generative AI based on foundation models. This post explains some of the strategies to instruct Large Language Models how to generate text...
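
As a rough illustration of such decoding strategies with the Hugging Face generate() API (gpt2, the prompt, and the parameter values are illustrative, not taken from the post):

```python
# Compare greedy decoding, beam search, and sampling for the same prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The future of AI is", return_tensors="pt")

greedy = model.generate(**inputs, max_new_tokens=20, do_sample=False)
beam = model.generate(**inputs, max_new_tokens=20, num_beams=4, early_stopping=True)
sampled = model.generate(**inputs, max_new_tokens=20, do_sample=True,
                         temperature=0.7, top_p=0.9)

for output in (greedy, beam, sampled):
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```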

Disclaimer
The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.