heidloff.net - Building is my Passion

Evaluating Question Answering Scenarios with watsonx.governance

watsonx.governance is IBM’s AI governance offering for managing and monitoring AI solutions, including generative AI. It is available as SaaS or as software that can be run anywhere. This post demonstrates how to evaluate a Question Answering scenario with an external model running on Azure.

Here is a definition of IBM watsonx.governance:

IBM watsonx.governance was built to help you direct, manage and monitor the artificial intelligence (AI) activities of your organization:

1. Govern generative AI (gen AI) and machine learning (ML) models from any vendor including IBM watsonx.ai, Amazon Sagemaker and Bedrock, Google Vertex and Microsoft Azure.
2. Evaluate and monitor for model health, accuracy, drift, bias and gen AI quality.
3. Access powerful governance, risk and compliance capabilities featuring workflows with approvals, customizable dashboards, risk scorecards and reports.
4. Use factsheet capabilities to collect and document model metadata automatically across the AI model lifecycle.

Documentation and blog posts:

Overview

The Question Answering use case in this post uses a GPT model running on Azure, but the same mechanism works for other models too. Question Answering is often used in RAG (Retrieval Augmented Generation) pipelines.

image

Evaluations

Several GenAI evaluations provided by watsonx have been defined for this model, for example answer quality, faithfulness and answer relevance. The test sets used for the evaluations need to provide ground truth information.
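To make the role of ground truth concrete, here is a minimal sketch of a test set and a simple reference-based score. The column names (`question`, `context`, `generated_text`, `reference_answer`) and the token-overlap F1 metric are assumptions chosen for illustration. They are not the schema or the metrics that watsonx.governance uses internally.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical test set: one row per question, with a ground truth answer.
TEST_SET = [
    {
        "question": "Where is the Eiffel Tower?",
        "context": "The Eiffel Tower is a landmark in Paris.",
        "generated_text": "The Eiffel Tower is in Paris.",
        "reference_answer": "Paris",
    },
]


def to_csv(rows):
    """Serialize the test set to CSV text, e.g. for upload as evaluation data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def words(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(prediction, reference):
    """Token-overlap F1 between a generated answer and the ground truth."""
    pred, ref = words(prediction), words(reference)
    if not pred or not ref:
        return 0.0
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(to_csv(TEST_SET))
    for row in TEST_SET:
        print(token_f1(row["generated_text"], row["reference_answer"]))
```

Without the `reference_answer` column there would be nothing to compare the generated text against, which is why reference-based evaluations such as answer quality depend on ground truth in the test set.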

image

Results are displayed after the evaluations.

image

image

For RAG scenarios, the documents that were passed to the model to generate answers are displayed.

image

You can even select different parts of the answer to see which documents they came from.
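The idea of tracing answer parts back to source documents can be sketched with a simple word-overlap heuristic. This is only an illustration of the concept under the assumption that overlap is a good enough signal. It is not the attribution algorithm that watsonx.governance uses, and the function names are hypothetical.

```python
import re


def word_set(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def attribute(answer, documents):
    """Pair each answer sentence with the index of the best matching document.

    The index is None when no document shares any word with the sentence.
    """
    result = []
    for sentence in split_sentences(answer):
        tokens = word_set(sentence)
        best_index, best_score = None, 0.0
        for index, document in enumerate(documents):
            score = len(tokens & word_set(document)) / len(tokens) if tokens else 0.0
            if score > best_score:
                best_index, best_score = index, score
        result.append((sentence, best_index))
    return result


if __name__ == "__main__":
    docs = ["The Eiffel Tower is in Paris.", "Mount Fuji is in Japan."]
    answer = "The Eiffel Tower stands in Paris. Mount Fuji is located in Japan."
    for sentence, doc_index in attribute(answer, docs):
        print(doc_index, sentence)
```

A production system would more likely compare embeddings or use a dedicated attribution model, since plain word overlap is easily misled by common words.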

image

Faithfulness, answer relevance and unsuccessful requests are also displayed graphically.

image

Code

Since this example uses an external model, a detached prompt template needs to be created, for example in a notebook.
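A detached prompt template records the prompt and the details of a model that is hosted outside watsonx, so that evaluations can be tracked even though inference happens elsewhere. The sketch below models that information with a plain dataclass. The class, its field names and all values are hypothetical and only illustrate what such a template typically captures. The real template is created with the IBM SDKs, whose calls are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class DetachedPromptSketch:
    """Illustrative stand-in for a detached prompt template (not the IBM SDK)."""

    name: str
    model_id: str          # model identifier at the external provider
    model_provider: str    # where the model is hosted
    model_url: str         # endpoint of the external deployment
    task: str              # the kind of task being evaluated
    prompt_template: str   # prompt text with {placeholders}
    prompt_variables: list = field(default_factory=list)

    def render(self, **values):
        """Fill the prompt placeholders; fail loudly if a variable is missing."""
        missing = [name for name in self.prompt_variables if name not in values]
        if missing:
            raise ValueError(f"Missing prompt variables: {missing}")
        return self.prompt_template.format(**values)


# All values below are placeholders, not real endpoints or deployments.
qa_template = DetachedPromptSketch(
    name="qa-azure-gpt",
    model_id="my-gpt-deployment",
    model_provider="Azure OpenAI",
    model_url="https://example.invalid/openai/deployments/my-gpt-deployment",
    task="question_answering",
    prompt_template=(
        "Answer the question using only the context.\n\n"
        "Context: {context}\n\nQuestion: {question}\n\nAnswer:"
    ),
    prompt_variables=["context", "question"],
)

if __name__ == "__main__":
    print(qa_template.render(context="Mount Fuji is in Japan.",
                             question="Where is Mount Fuji?"))
```

The prompt variables matter for evaluation: they are the columns that a test set would need to supply, alongside the generated answer and the ground truth.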

image

Next Steps

To learn more, check out the watsonx.ai documentation and the watsonx.ai landing page.

Disclaimer
The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.