
Red Hat OpenShift AI installation and setup

May 1, 2024
Diego Alvarez Ponce, Kaitlyn Abdo
Related topics:
Artificial intelligence, Edge computing
Related products:
Red Hat OpenShift, Red Hat OpenShift AI

    Welcome to the third article in this series, which covers the process of preparing and running computer vision models at the edge. The articles that compose the series are:

    • How to install single node OpenShift on AWS
    • How to install single node OpenShift on bare metal
    • Red Hat OpenShift AI installation and setup
    • Model training in Red Hat OpenShift AI
    • Prepare and label custom datasets with Label Studio
    • Deploy computer vision applications at the edge with MicroShift

    Introduction

    Red Hat OpenShift AI is a comprehensive platform designed to streamline the development, deployment, and management of data science and machine learning applications in hybrid and multi-cloud environments. Leveraging the Red Hat OpenShift app dev platform, OpenShift AI empowers data science teams to exploit container orchestration capabilities for scalable and efficient deployment. 

    In this tutorial, we will prepare and install the Red Hat OpenShift AI operator and its components. This includes enabling the use of GPU and the storage configuration. The next article in this series will cover how to use the operator for AI model training.

    LVM storage installation

    OpenShift AI requires storage when creating workbenches and deploying notebooks. Therefore, one of the prerequisites is to install the Logical Volume Manager Storage (LVMS) operator on our single node OpenShift (SNO). LVMS uses the Linux Logical Volume Manager (LVM) as a back end to provide the necessary storage space for OpenShift AI on our SNO.

    Note

    LVMS requires an empty, dedicated disk to provision storage. Before installing the operator, make sure an empty disk is available.

    The easiest and most convenient way to install the operator is via the OpenShift web console:

    1. In your SNO web console, navigate to the Operators section on the left-hand menu.
    2. Select OperatorHub. This will show the marketplace catalog integrated in Red Hat OpenShift Container Platform (OCP) with the different operators available.
    3. In the search field, type LVMS.
    4. Select the LVM Storage operator (Figure 1) and click Install on the right side of the screen.
    LVM Storage Operator.
    Figure 1: After searching LVMS, select the LVM Storage operator that will populate in the search results.
    5. On the configuration page, keep the default values and press Install again.
    6. Wait while the installation finishes, then press the Create LVMCluster button that appears.
    7. In the configuration form, you can change some of the parameters, like the instance name, device class, etc. Check the default box under storage > deviceClasses to use lvms-vg1 as the default storage class.
    8. Press the Create button to start the custom resource creation.
    9. You can wait for the Status to become Ready, or check the deployment progress from the command line. In the terminal connected to your SNO, run the following command:

      watch oc get pods -n openshift-storage
    10. Wait until you see all pods running:

      NAME                                        READY         STATUS     RESTARTS      AGE
      lvms-operator-5656d84f77-ntlzm              1/1           Running    0             2m45s
      topolvm-controller-7dd48b6556-dg222         5/5           Running    0             109s
      topolvm-node-xgc79                          4/4           Running    0             87s
      vg-manager-lj4kd                            1/1           Running    0             109s
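
    For automation, the same LVMCluster custom resource the console creates can be applied as a manifest. The following is a sketch: the thin pool values shown are common defaults and may differ in your operator version, so treat it as illustrative rather than authoritative.

    ```yaml
    # Sketch of the LVMCluster custom resource the console creates.
    # Thin pool values are typical defaults; verify against your LVMS version.
    apiVersion: lvm.topolvm.io/v1alpha1
    kind: LVMCluster
    metadata:
      name: my-lvmcluster
      namespace: openshift-storage
    spec:
      storage:
        deviceClasses:
          - name: vg1
            default: true          # makes lvms-vg1 the default storage class
            thinPoolConfig:
              name: thin-pool-1
              sizePercent: 90
              overprovisionRatio: 10
    ```

    Apply it with oc apply -f and the operator provisions the volume group on the empty disk.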

    And just like that, you've deployed the first operator. But this is only the beginning. Let's continue with the next step: configuring the node for GPU detection.

    Node Feature Discovery installation

    Now, let's focus on configuring our node so the GPU can be detected. Red Hat’s supported approach is using the NVIDIA GPU Operator. Before installing it, there are a couple of prerequisites we need to meet. 

    The first is installing the Node Feature Discovery (NFD) Operator. This operator manages the detection and configuration of hardware features in our SNO. The process is quite similar to the one we just followed.

    1. In the web console, locate the Operators section on the left menu again.
    2. Click OperatorHub to access the catalog.
    3. Once there, type NFD in the text box. We will get two results.
    4. In this case, I will install the operator that is supported by Red Hat (Figure 2). Click Install.
    Node Feature Discovery Operator.
    Figure 2: After searching NFD, select the Node Feature Discovery operator that will populate in the search results.
    5. This opens a second page with different configurable parameters. Keep the default values and press the Install button.
    6. This triggers the operator installation. Once finished, press View Operator.
    7. Under the NodeFeatureDiscovery component, click Create instance.
    8. As before, keep the default values and click Create. This instance will label the node with its detected hardware features.
    9. Verify the installation by running the following command in your terminal:

      watch oc get pods -n openshift-nfd
    10. Wait until all the pods are running:

      NAME                                          READY         STATUS     RESTARTS      AGE
      nfd-controller-manager-7758f5d99-9zpjw        2/2           Running    0             2m4s
      nfd-master-798b4885-4qfhq                     1/1           Running    0             10s
      nfd-worker-7gjhv                              1/1           Running    0             10s
    11. The Node Feature Discovery Operator uses PCI vendor IDs to identify hardware in our node. 0x10de is the PCI vendor ID assigned to NVIDIA, so we can verify that the label is present on our node by running this command:

      oc describe node | egrep 'Labels|pci'
    12. There, you can spot the 0x10de label present:

      Labels:        beta.kubernetes.io/arch=amd64
                     feature.node.kubernetes.io/pci-102b.present=true
                     feature.node.kubernetes.io/pci-10de.present=true
                     feature.node.kubernetes.io/pci-14e4.present=true
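
    The NodeFeatureDiscovery instance created in the console can also be declared as a manifest. A minimal sketch, assuming the operator's defaults are acceptable (the instance name is illustrative, and exact spec fields vary between operator versions):

    ```yaml
    # Minimal NodeFeatureDiscovery instance; the operator fills in
    # operand defaults and deploys the master and worker pods.
    apiVersion: nfd.openshift.io/v1
    kind: NodeFeatureDiscovery
    metadata:
      name: nfd-instance          # illustrative name
      namespace: openshift-nfd
    spec: {}
    ```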

    The Node Feature Discovery Operator has been installed correctly. This means our GPU hardware can be detected, so we can continue and install the NVIDIA GPU Operator.

    NVIDIA GPU Operator installation

    The NVIDIA GPU Operator manages and automates the software provisioning needed to configure the GPU we just exposed. Follow these instructions to install the operator:

    1. Again, navigate to Operators in the web console. 
    2. Move to the OperatorHub section.
    3. In the search field, type NVIDIA.
    4. Select the NVIDIA GPU Operator (Figure 3) and press Install.

      NVIDIA GPU Operator.
      Figure 3: After searching NVIDIA, select the NVIDIA GPU operator that will populate in the search results.
    5. It’s not necessary to modify any values. Click Install again.
    6. When the operator is installed, press View Operator.
    7. You can create the operand by clicking Create instance in the ClusterPolicy section.
    8. Skip the values configuration part and click Create.
    9. While the ClusterPolicy is created, we can see the progress from our terminal by running this command:

      watch oc get pods -n nvidia-gpu-operator
    10. You will know it has finished when you see an output similar to the following:

      NAME                                                     READY     STATUS         RESTARTS      AGE
      gpu-feature-discovery-wkzpf                              1/1       Running        0             15d
      gpu-operator-76c4c94788-59rfh                            1/1       Running        0             15d
      nvidia-container-toolkit-daemonset-5t5dp                 1/1       Running        0             15d
      nvidia-cuda-validator-m5x4k                              0/1       Completed      0             15d
      nvidia-dcgm-8sn57                                        1/1       Running        0             15d
      nvidia-dcgm-exporter-hnjc6                               1/1       Running        0             15d
      nvidia-device-plugin-daemonset-467zm                     1/1       Running        0             15d
      nvidia-device-plugin-validator-bqfr6                     0/1       Completed      0             15d
      nvidia-driver-daemonset-412.86.202301061548-0-kpkjp      2/2       Running        0             15d
      nvidia-node-status-exporter-6chdx                        1/1       Running        0             15d
      nvidia-operator-validator-jj8c4                          1/1       Running        0             15d 
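
    Beyond checking the operator pods, you can sanity-check the whole driver and device plugin stack by scheduling a pod that requests a GPU. This is a sketch: the sample image name is illustrative, and whatever CUDA sample image you use must be compatible with the installed driver.

    ```yaml
    # Smoke-test pod: requests one GPU and runs a CUDA sample workload.
    # The image below is illustrative; substitute one that matches your driver.
    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-smoke-test
    spec:
      restartPolicy: Never
      containers:
        - name: cuda-vectoradd
          image: nvidia/samples:vectoradd-cuda11.2.1
          resources:
            limits:
              nvidia.com/gpu: 1
    ```

    If the pod reaches Completed, the node is exposing the nvidia.com/gpu resource and workloads can use the GPU.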

    We have just completed the GPU setup. At this point, we can select our GPU for model training. There is one last thing to take care of before installing OpenShift AI: enabling the Image Registry Operator.

    Enable Image Registry Operator

    On platforms that do not provide shareable object storage, such as bare metal, the OpenShift Image Registry Operator bootstraps itself as Removed. This allows OpenShift to be installed on these platform types. OpenShift AI requires the image registry to be enabled again in order to deploy workbenches.

    1. In your Terminal, ensure you don't have any running pods in the openshift-image-registry namespace:

      oc get pod -n openshift-image-registry -l docker-registry=default
    2. Now we will need to edit the registry configuration:

      oc edit configs.imageregistry.operator.openshift.io
    3. Under storage: { }, include the following lines, making sure you leave the claim name blank so the PVC is created automatically:

      spec:
      ...
        storage:
          pvc:
            claim:
    4. Also, change the managementState field from Removed to Managed:

      spec:
      ...
        managementState: Managed
    5. The PVC will be created with Shared access (RWX). However, we need ReadWriteOnce (RWO). Back in the web console, go to the Storage menu.
    6. Navigate to the PersistentVolumeClaims section.
    7. Make sure you have selected Project: openshift-image-registry at the top of the page. If you cannot find it, enable the Show default namespaces button.
    8. You will see the image-registry-storage PVC as Pending. A PVC's access mode cannot be modified, so we need to delete the existing one and recreate it with the correct accessMode. Click the three dots on the right side and select Delete PersistentVolumeClaim.
    9. Recreate the PVC by clicking Create PersistentVolumeClaim.
    10. Complete the following fields as shown and click Create when done:
    • StorageClass: lvms-vg1
    • PersistentVolumeClaim name: image-registry-storage
    • AccessMode: Single User (RWO)
    • Size: 30 GiB
    • Volume mode: Filesystem
    11. In a few seconds, you will see the PVC status as Bound (Figure 4).
    PVC status.
    Figure 4: In the OpenShift console, the status of the image-registry-storage PVC shows Bound. 
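
    If you prefer the command line, a manifest equivalent to the console form values above can be applied with oc apply -f:

    ```yaml
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: image-registry-storage
      namespace: openshift-image-registry
    spec:
      accessModes:
        - ReadWriteOnce        # "Single User (RWO)" in the console form
      resources:
        requests:
          storage: 30Gi
      storageClassName: lvms-vg1
      volumeMode: Filesystem
    ```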

    With this last step, you have installed and configured the necessary infrastructure and prerequisites for Red Hat OpenShift AI. 

    Red Hat OpenShift AI installation

    Red Hat OpenShift AI combines the scalability and flexibility of containerization with the capabilities of machine learning and data analytics. With OpenShift AI, data scientists and developers can efficiently collaborate, deploy, and manage their models and applications.

    You can have your OpenShift AI operator installed and working in just a couple of minutes:

    1. From the web console, navigate back to the Operators tab and select OperatorHub.
    2. Type OpenShift AI to search for the component in the operator catalog.
    3. Select the Red Hat OpenShift AI operator (Figure 5) and click Install.
    Red Hat OpenShift AI Operator.
    Figure 5: After searching OpenShift AI, select the Red Hat OpenShift AI operator that will populate in the search results.
    4. The default values will already be configured, so we will not need to modify any of them. To start the installation, press the Install button again.
    5. Once the status has changed to Succeeded, we can confirm that the operator deployment has finished.
    6. Now we need to create the required custom resource. Select Create DataScienceCluster.
    7. Keep the predefined values and press Create.
    8. Wait for the Phase to become Ready, which means the operator is ready to be used.
    9. We can access the OpenShift AI web console from the OCP console. On the right side of the top navigation bar, you will find a square icon formed by 9 smaller squares. Click it and select Red Hat OpenShift AI from the drop-down menu, as shown in Figure 6.
    Red Hat OpenShift AI console drop-down menu.
    Figure 6: Open the Red Hat OpenShift AI Web Console from OCP.
    10. A new tab will open. Log in again using your OpenShift credentials (kubeadmin and password).
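
    For scripted installs, the DataScienceCluster custom resource can also be created from a manifest. The following is a rough sketch of the defaults the console pre-fills; the resource name is illustrative, and component names and API group follow the opendatahub.io CRDs, which may vary between OpenShift AI versions:

    ```yaml
    # Illustrative DataScienceCluster; verify field names against your version.
    apiVersion: datasciencecluster.opendatahub.io/v1
    kind: DataScienceCluster
    metadata:
      name: default-dsc           # illustrative name
    spec:
      components:
        dashboard:
          managementState: Managed
        workbenches:
          managementState: Managed
        datasciencepipelines:
          managementState: Managed
    ```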

    Welcome to the Red Hat OpenShift AI landing page (Figure 7). It is on this platform where the magic will happen, as you'll learn in the next article. 

    Red Hat OpenShift AI landing page.
    Figure 7: Red Hat OpenShift AI landing page.

    Video demo

    The following video covers the process of installing Red Hat OpenShift AI on the single node, along with the underlying operators like Logical Volume Manager Storage (LVMS), Node Feature Discovery (NFD), and NVIDIA GPU.

    Next steps

    In this article, we installed the operators that are indispensable for Red Hat OpenShift AI. We started with the storage setup and ended with the GPU enablement, which will speed up the training process we will see in the next article.

    From here, we will move away from infrastructure and enter the world of artificial intelligence and computer vision. Check out the next article to keep learning about Red Hat OpenShift AI: Model training in Red Hat OpenShift AI.

    Last updated: September 30, 2024
