IBM Watson NLP (Natural Language Understanding) and Watson Speech containers can be run locally, on-premises, or in Kubernetes and OpenShift clusters. Via REST and WebSockets APIs, AI can easily be embedded in applications. This post describes how to run Watson Speech To Text locally in Minikube.
To set some context, check out the landing page IBM Watson Speech Libraries for Embed.
The Watson Speech To Text library is available as containers providing REST and WebSockets interfaces. While this offering is new, the underlying functionality has been used and optimized for a long time in IBM offerings like the IBM Cloud SaaS service for STT and IBM Cloud Pak for Data.
A trial is available if you want to try it yourself. The container images are stored in an IBM container registry that is accessed via an IBM Entitlement Key.
How to run STT locally via Minikube
My post Running IBM Watson Speech to Text in Containers explained how to run Watson STT locally in Docker. The instructions below describe how to deploy Watson Speech To Text locally to Minikube via kubectl and yaml files.
First, you need to install Minikube, for example via brew on macOS. Next, Minikube needs to be started with more memory and disk size than the Minikube defaults. I’ve used the settings below, which are more than required, but I wanted to leave space for other applications. Note that you also need to give your container runtime more resources. For example, if you use Docker Desktop, navigate to Preferences > Resources to do this.
$ brew install minikube
$ minikube start --cpus 12 --memory 16000 --disk-size 50g
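Once Minikube has started, you can verify that the cluster is up and that kubectl points to it. These are standard Minikube and kubectl commands, shown here just as a sanity check:

$ minikube status
$ kubectl get nodes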
Next, a namespace and a pull secret for the IBM container registry need to be created.
$ kubectl create namespace watson-demo
$ kubectl config set-context --current --namespace=watson-demo
$ kubectl create secret docker-registry \
  --docker-server=cp.icr.io \
  --docker-username=cp \
  --docker-password=<your IBM Entitlement Key> \
  -n watson-demo \
  ibm-entitlement-key
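Kubernetes can only pull the protected images from cp.icr.io if the pod spec references this secret. The yaml files used in the next step already take care of that; just to illustrate the standard Kubernetes mechanism, a pull secret is referenced in a pod spec like this:

spec:
  imagePullSecrets:
    - name: ibm-entitlement-key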
Clone the repo with the Kubernetes yaml files, deploy Watson Speech To Text, and wait for the pods to come up.
$ git clone https://github.com/nheidloff/watson-embed-demos.git
$ kubectl apply -f watson-embed-demos/minikube-speech-to-text/kubernetes/
$ kubectl get pods --watch
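After the pods have reached the Running state, you can list what the yaml files created in the namespace:

$ kubectl get deployments,services,pods -n watson-demo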
To use other models, modify deployment.yaml. The snippet below shows the init container that loads the telephony model; an example of swapping in a different model follows after it.
- name: watson-stt-en-us-telephony
  image: cp.icr.io/cp/ai/watson-stt-en-us-telephony:1.0.0
  args:
    - sh
    - -c
    - cp model/* /models/pool2
  env:
    - name: ACCEPT_LICENSE
      value: "true"
  resources:
    limits:
      cpu: 1
      ephemeral-storage: 1Gi
      memory: 1Gi
    requests:
      cpu: 100m
      ephemeral-storage: 1Gi
      memory: 256Mi
  volumeMounts:
    - name: models
      mountPath: /models/pool2
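For example, to load a different model, point the init container at another model image. The image name and tag below are assumptions for illustration only; check the Watson Speech To Text model catalog linked at the end of this post for the exact values:

- name: watson-stt-en-us-multimedia
  # assumed image name and tag, verify against the model catalog
  image: cp.icr.io/cp/ai/watson-stt-en-us-multimedia:1.0.0
  args:
    - sh
    - -c
    - cp model/* /models/pool2
  env:
    - name: ACCEPT_LICENSE
      value: "true"
  volumeMounts:
    - name: models
      mountPath: /models/pool2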
When you open the Kubernetes Dashboard (via ‘minikube dashboard’), you’ll see the deployed resources. The pod contains the runtime container and four init containers (two specific models, a generic model and a utility container).
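If you prefer the terminal over the dashboard, the same information is available via kubectl; 'kubectl describe' lists the init containers of the pod (replace the pod name with the one returned by 'kubectl get pods'):

$ kubectl get pods -n watson-demo
$ kubectl describe pod <stt-pod-name> -n watson-demo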
To invoke Watson Speech To Text, port forwarding can be used.
$ kubectl port-forward svc/ibm-watson-tts-embed 1080
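If port 1080 is already taken on your machine, kubectl port-forward also accepts an explicit local:remote mapping; the curl commands below would then need to use the chosen local port instead:

$ kubectl port-forward svc/ibm-watson-tts-embed 8080:1080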
Invoke the REST API with a sample audio file.
$ curl "http://localhost:1080/speech-to-text/api/v1/recognize" \
--header "Content-Type: audio/wav" \
--data-binary @watson-embed-demos/demo.wav
{
"result_index": 0,
"results": [
{
"final": true,
"alternatives": [
{
"transcript": "ibm watson speech to text can easily be embedded in applications",
"confidence": 0.85
}
]
}
]
}
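The recognize endpoint also supports query parameters, for example to request word timestamps or multiple alternatives. The parameters below come from the public Watson Speech To Text API and are assumed to behave the same way in the embedded container:

$ curl "http://localhost:1080/speech-to-text/api/v1/recognize?timestamps=true&max_alternatives=3" \
  --header "Content-Type: audio/wav" \
  --data-binary @watson-embed-demos/demo.wav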
To find out more about Watson Speech To Text and Watson for Embed in general, check out these resources:
- Watson Speech To Text Documentation
- Watson Speech To Text Model Catalog
- Watson Speech To Text SaaS Model Catalog
- Watson Speech To Text SaaS API docs
- Trial
- Entitlement key
- Automation for Watson NLP Deployments
- Running IBM Watson NLP locally in Containers
- Running IBM Watson Speech to Text in Containers
- Running IBM Watson Text to Speech in Containers