heidloff.net - Building is my Passion
Niklas Heidloff

Deploying LLMs via Hugging Face on IBM Cloud

With the Text Generation Inference toolkit from Hugging Face, Large Language Models can be hosted efficiently. This post describes how to run open-source models or fine-tuned models on IBM Cloud. T...

Fine-tuning LLMs via Hugging Face on IBM Cloud

The speed of innovation in the AI community is amazing. What didn’t seem possible a year ago is standard today. Fine-tuning is a great example. With the latest progress, you can fine-tune sm...

Foundation Models, Transformers, BERT and GPT

Since I’m excited by the incredible capabilities which technologies like ChatGPT and Bard provide, I’m trying to understand better how they work. This post summarizes my current understanding about...

Mixtral Agents with Tools for Multi-turn Conversations

Larger Large Language Models like ChatGPT can be prompted to behave as agents for specific use cases. They can return output in certain formats, and they can return instructions to invoke code. Thi...

Highlights of my technical Work in 2023

What a great year 2023 has been! When ChatGPT was published at the end of 2022, I knew it would change the world. I wanted to learn and understand this technology. Fortunately, through my network ...

Evaluating LoRA Fine-Tuning Results

After Large Language Models have been fine-tuned, the quality needs to be evaluated. This post describes a simple example utilizing a custom evaluation mechanism. For standard LLM tasks there ar...

Fine-Tuning LLMs with LoRA on a small GPU

Smaller and/or quantized Large Language Models can be fine-tuned on a single GPU. For example, for FLAN T5 XL (3b), an Nvidia V100 GPU with 16GB is sufficient. This post demonstrates a simple example ...

Preparing LLM LoRA Fine-Tuning locally

Before the actual fine-tuning of Large Language Models can start, data needs to be prepared and the code needs to be checked to confirm that it works. This post describes how to do this efficiently locally...

Deploying a Virtual Server with GPU in the IBM Cloud

Fine-tuning Large Language Models requires GPUs. When tuning small and/or quantized models, single GPUs can be sufficient. This post explains how to leverage an Nvidia V100 GPU in the IBM Cloud. Ov...

Training Models locally via Containers

Training and fine-tuning models takes time. During this process, it’s important to see progress. This post describes how to visualize output in TensorBoard running locally. The Hugging Face Trainer...

Disclaimer
The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.