After Large Language Models have been fine-tuned, their quality needs to be evaluated. This post describes a simple example utilizing a custom evaluation mechanism. For standard LLM tasks there are ...
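A custom check can be as simple as generating predictions for a few labeled samples and counting matches. Below is a minimal sketch along those lines, assuming a seq2seq model fine-tuned for a classification-style task; the model path and the samples are placeholders, not values from the post.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder path to a fine-tuned model; replace with your own checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained("./flan-t5-fine-tuned")
tokenizer = AutoTokenizer.from_pretrained("./flan-t5-fine-tuned")

# Hypothetical evaluation samples with expected labels
samples = [
    {"prompt": "Classify the sentiment: I love this product.", "expected": "positive"},
    {"prompt": "Classify the sentiment: This is terrible.", "expected": "negative"},
]

correct = 0
for sample in samples:
    inputs = tokenizer(sample["prompt"], return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=10)
    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Custom criterion: case-insensitive exact match against the expected label
    if prediction.strip().lower() == sample["expected"]:
        correct += 1

print(f"Accuracy: {correct / len(samples):.2f}")
```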
Fine-Tuning LLMs with LoRA on a small GPU
Smaller and/or quantized Large Language Models can be fine-tuned on a single GPU. For example, for FLAN T5 XL (3b) an NVIDIA V100 GPU with 16 GB is sufficient. This post demonstrates a simple example ...
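With Hugging Face PEFT, LoRA trains only small low-rank adapter matrices while the base model stays frozen, which is what keeps the memory footprint on a single GPU manageable. Below is a minimal sketch of the adapter setup; the rank, alpha, and target modules are typical starting values, not settings taken from the post.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM

# Load the base model (quantized loading via bitsandbytes can reduce memory further)
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xl")

# Attach LoRA adapters to the query and value projections of the T5 attention blocks
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,                 # rank of the low-rank update matrices
    lora_alpha=32,       # scaling factor for the updates
    lora_dropout=0.05,
    target_modules=["q", "v"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```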
Preparing LLM LoRA Fine-Tuning locally
Before the actual fine-tuning of Large Language Models can start, data needs to be prepared and the code needs to be verified. This post describes how to do this efficiently locally ...
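One way to do this is to tokenize a tiny subset of the data and run it end to end before renting GPU time. A sketch, assuming a JSON lines file with input and output fields; the file name and field names are placeholders.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")

# Assumed JSON lines file with "input" and "output" fields
dataset = load_dataset("json", data_files="train.json", split="train")

def preprocess(example):
    # Tokenize the prompt and attach the tokenized target as labels
    model_inputs = tokenizer(example["input"], truncation=True, max_length=512)
    labels = tokenizer(example["output"], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# A tiny subset is enough to verify that tokenization and a training step work locally
tokenized = dataset.select(range(8)).map(preprocess, remove_columns=dataset.column_names)
print(tokenized[0].keys())
```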
Deploying a Virtual Server with GPU in the IBM Cloud
Fine-tuning Large Language Models requires GPUs. When tuning small and/or quantized models, a single GPU can be sufficient. This post explains how to leverage an NVIDIA V100 GPU in the IBM Cloud. ...
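Once the server is provisioned and the NVIDIA drivers are installed, it is worth verifying that the GPU is actually visible before starting a tuning run. A minimal check with PyTorch:

```python
import torch

# Confirm that PyTorch sees the provisioned GPU and report its memory
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Memory (GB):", torch.cuda.get_device_properties(0).total_memory / 1024**3)
else:
    print("No GPU detected; check the NVIDIA driver and CUDA installation")
```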
Training Models locally via Containers
Training and fine-tuning models takes time. During this process it’s important to see progress. This post describes how to visualize output in TensorBoard running locally. The Hugging Face Trainer ...
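With the Hugging Face Trainer, TensorBoard logging is mostly a matter of configuration. A sketch of the relevant TrainingArguments; the paths are placeholders.

```python
from transformers import TrainingArguments

# Write metrics to ./logs so a locally running TensorBoard instance can display them
training_args = TrainingArguments(
    output_dir="./output",
    logging_dir="./logs",     # directory TensorBoard reads from
    logging_steps=10,         # log metrics every 10 steps
    report_to=["tensorboard"],
)
# Then, in a separate terminal: tensorboard --logdir ./logs
```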
Metrics to evaluate Search Results
Via Retrieval Augmented Generation, search results can be passed as context into prompts to Large Language Models to help the models generate good responses. Passing the right search results is ...
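Two common retrieval metrics are precision at k and reciprocal rank. A small self-contained sketch of both; the document IDs in the example are made up.

```python
def reciprocal_rank(results: list[str], relevant: set[str]) -> float:
    """Return 1/rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, doc_id in enumerate(results, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def precision_at_k(results: list[str], relevant: set[str], k: int) -> float:
    """Return the fraction of the top-k results that are relevant."""
    return sum(1 for doc_id in results[:k] if doc_id in relevant) / k

# Made-up example: only doc1 is relevant and it is ranked second
print(reciprocal_rank(["doc3", "doc1", "doc7"], {"doc1"}))    # 0.5
print(precision_at_k(["doc3", "doc1", "doc7"], {"doc1"}, 3))  # 0.3333...
```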
Hybrid and Vector Searches with Elasticsearch
Semantic searches allow finding relevant information even if there are no classic keyword matches. Recent research has shown that combinations of semantic and classic keyword searches often outperform ...
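In recent Elasticsearch 8.x versions, a single search request can combine a classic match clause with an approximate kNN clause. A sketch, assuming an index named documents with a text field and a 384-dimensional embedding field filled with all-MiniLM-L6-v2 vectors; these names are illustrations, not values from the post.

```python
from elasticsearch import Elasticsearch
from sentence_transformers import SentenceTransformer

es = Elasticsearch("http://localhost:9200")
# The query vector must come from the same embedding model used at indexing time
encoder = SentenceTransformer("all-MiniLM-L6-v2")

query = "how do I reset my password?"
query_vector = encoder.encode(query).tolist()

response = es.search(
    index="documents",                     # assumed index name
    query={"match": {"text": query}},      # classic keyword part
    knn={                                  # semantic part: approximate kNN
        "field": "embedding",
        "query_vector": query_vector,
        "k": 10,
        "num_candidates": 100,
    },
    size=10,
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["text"])
```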
Semantic Searches with Elasticsearch
In its most recent versions, Elasticsearch provides semantic search. This post summarizes how this new functionality can be utilized. Semantic searches allow finding relevant information even if ...
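Under the hood this relies on dense_vector fields. A sketch of a matching index mapping; the index name and the 384 dimensions (which fit all-MiniLM-L6-v2 embeddings) are assumptions for illustration.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# dense_vector field for 384-dimensional embeddings, compared via cosine similarity
es.indices.create(
    index="documents",
    mappings={
        "properties": {
            "text": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 384,
                "index": True,
                "similarity": "cosine",
            },
        }
    },
)
```

Queries against such an index can then use the kNN clause shown in the previous snippet.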
Tokenizing Text for Vector Searches with Java
Vector-based searches allow finding semantically relevant information without the presence of keywords. Often vector-based search engines can only handle documents of limited length. This post describes ...
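The post implements this in Java; for consistency with the other snippets here, the same idea is sketched in Python: split a long document into overlapping token windows that stay below the embedding model's limit. The window and overlap sizes are arbitrary examples.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

def chunk_text(text: str, max_tokens: int = 256, overlap: int = 32) -> list[str]:
    """Split text into overlapping chunks that fit the model's token limit."""
    token_ids = tokenizer.encode(text, add_special_tokens=False)
    chunks = []
    step = max_tokens - overlap
    for start in range(0, len(token_ids), step):
        window = token_ids[start:start + max_tokens]
        chunks.append(tokenizer.decode(window))
        if start + max_tokens >= len(token_ids):
            break
    return chunks
```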
Enhancements of LLMs via Self-Reflections
One of the key challenges of LLMs is hallucination. Retrieval Augmented Generation (RAG) reduces hallucination but cannot eliminate it. This post summarizes a new concept to address this shortcoming ...
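The concrete mechanism depends on the approach the post summarizes; purely as an illustration of the general idea, a generate-critique-revise loop could look like the sketch below, where llm is a hypothetical callable wrapping whatever model client is used.

```python
def generate_with_reflection(llm, question: str, context: str, max_rounds: int = 2) -> str:
    """Generate an answer, then let the model critique and revise it.

    llm is a hypothetical callable that takes a prompt string and returns
    the model's completion; any client can be plugged in.
    """
    answer = llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
    for _ in range(max_rounds):
        critique = llm(
            f"Context:\n{context}\n\nQuestion: {question}\nAnswer: {answer}\n"
            "Is the answer fully supported by the context? "
            "Reply SUPPORTED or list the unsupported claims."
        )
        if critique.strip().startswith("SUPPORTED"):
            break
        answer = llm(
            f"Context:\n{context}\n\nQuestion: {question}\n"
            f"Previous answer: {answer}\nCritique: {critique}\n"
            "Write a revised answer that only uses information from the context."
        )
    return answer
```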