OpenShift AI Ops Self-Healing Workshop

Welcome to the OpenShift AI Ops Self-Healing Workshop!

OpenShift 4.18+
AI Powered
ML KServe

In this hands-on workshop, you’ll learn how to leverage AI-powered self-healing capabilities in OpenShift. You’ll train machine learning models, interact with your cluster using natural language, and watch automated remediation in action.

What You’ll Learn

Module    Description

Module 0  Introduction & Architecture - Understand the hybrid deterministic-AI approach and platform components
Module 1  ML Model Training with Tekton - Train anomaly detection and predictive analytics models using automated pipelines
Module 2  Deploy MCP Server & Configure Lightspeed - Set up the MCP Server and connect OpenShift Lightspeed to your platform (Required)
Module 3  End-to-End Self-Healing - Chat with your cluster using Lightspeed, deploy apps, break things, and watch AI fix them
Module 4  Extra Credit - Advanced ML - LSTM networks, ensemble methods, and deploying your own custom models
Module 5  Notebook Catalog & Use Cases - Comprehensive guide to 33+ notebooks covering all platform capabilities

What’s Already Deployed

Your workshop environment comes pre-configured with:

Component             Description

OpenShift Lightspeed  AI assistant integrated into the OpenShift console
MCP Server            Go service connecting Lightspeed to cluster tools
Coordination Engine   Go service orchestrating remediation workflows
KServe ML Models      Anomaly detection + capacity forecasting models
Jupyter Workbench     Notebook environment for ML development
ArgoCD                GitOps deployment via Validated Patterns
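
To make the KServe entry above more concrete: models served by KServe are typically queried over HTTP. The sketch below only builds a request in the shape of the KServe v1 inference protocol (`/v1/models/<name>:predict` with an `instances` payload) and does not send it. The host, the model name `anomaly-detector`, and the feature values are hypothetical placeholders, and whether the workshop's models expose the v1 or v2 protocol is an assumption to verify in your environment.

```python
import json


def build_predict_request(host: str, model_name: str, instances: list) -> tuple[str, str]:
    """Return (url, json_body) for a KServe v1-style predict call.

    Nothing is sent over the network; this only assembles the request.
    The host and model name passed in are assumptions supplied by the caller.
    """
    url = f"https://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances})
    return url, body


if __name__ == "__main__":
    # Hypothetical host, model name, and feature vector, for illustration only.
    url, body = build_predict_request(
        "anomaly-detector.example.com",
        "anomaly-detector",
        [[0.42, 0.91, 0.13]],
    )
    print(url)
    print(body)
```

The modules that follow show how the deployed models are actually named and called; treat this only as a picture of what a prediction request may look like.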

Prerequisites

Skills Required

  • Basic OpenShift/Kubernetes knowledge (pods, deployments, services)

  • Familiarity with the OpenShift web console

  • Basic understanding of machine learning concepts (helpful but not required)

Access Information

Your instructor will provide the credentials for your workshop environment.

Verify Your Access

  1. Open the OpenShift Console: https://console-openshift-console.apps.{guid}.example.com

  2. Log in with the credentials your instructor provided

  3. Click the Lightspeed icon (sparkle icon) in the top-right corner

  4. If the chat interface opens, you’re ready to go!
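
The console address in step 1 follows a fixed pattern around your environment's GUID. As a minimal sketch, the URL can be assembled as shown below. The helper name `console_url` and the sample GUID `abc12` are illustrative assumptions and not part of the workshop tooling.

```python
def console_url(guid: str) -> str:
    """Build the OpenShift console URL for a workshop environment.

    The URL pattern is taken from step 1 above; the function name and the
    empty-string check are illustrative assumptions.
    """
    guid = guid.strip()
    if not guid:
        raise ValueError("guid must be a non-empty string")
    return f"https://console-openshift-console.apps.{guid}.example.com"


if __name__ == "__main__":
    # "abc12" is a made-up GUID; substitute the one your instructor provides.
    print(console_url("abc12"))
```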

Workshop Duration

Module    Duration  Notes

Module 0  15 min    Introduction & overview
Module 1  30 min    ML training pipelines
Module 2  20 min    MCP Server & Lightspeed setup (Required)
Module 3  45 min    Interactive self-healing demo
Module 4  45+ min   Extra credit (Jupyter notebooks)
Module 5  30 min    Notebook catalog reference

Total: ~110 min (just under 2 hours) for the core Modules 0-3; add 75+ min for the extra credit and catalog (Modules 4-5)

Getting Help

If you encounter issues during the workshop:

  1. Check the Troubleshooting sections in each module

  2. Ask your instructor

  3. Review the Troubleshooting Guide

Let’s Get Started!

When you’re ready, proceed to Module 0: Introduction & Architecture to learn how the platform works before diving into hands-on exercises.


Architecture Note: The MCP Server and Coordination Engine are written in Go for production performance, while the Jupyter notebooks use Python for ML and data science. You don’t need to write any Go code to use the platform!