Intel Corporation (INTC) Earnings Call Transcript & Summary

December 13, 2023

NASDAQ US Information Technology Semiconductors and Semiconductor Equipment special 54 min

Earnings Call Speaker Segments

Austin Webb

executive
#1

Good morning, good afternoon and good evening, and welcome to Tech.Decoded, Intel's Technical Webinar Series for Software Developers. Thanks for tuning in to today's episode, How to Deploy Faster ML Workloads on Azure Confidential Computing Clusters. I'd like to introduce today's speakers. Kelli Belcher is an AI Solutions Engineer at Intel with numerous years of experience across financial services, health care and tech industries. In her current role, Kelli helps build machine learning solutions using Intel's portfolio of open AI software tools and as a Developer Evangelist for Intel oneAPI AI analytics toolkit. Joining Kelli today will be Bob Chesebrough. Bob is a Developer Evangelist at Intel with over 20 years of experience in software development, optimization on Intel platforms and AI. Currently, Bob works with the universities and developers evangelizing and helping them understand AI and oneAPI concepts. My name is Austin Webb and I'll be your host. A few notes about this platform we are using. You can access the biographies of our speakers in the top left-hand column. The following section contains downloadable resources, including a copy of today's presentation. Across the bottom of the screen, you see a series of icons. This will give you access to the slides, the Q&A and the survey. You can also access closed captioning there as well. You can resize and move around all of these windows to best fit your view, however you like. If you have any issues with the presentation sound, slides or other, please note it in the Q&A and a producer will help you out. You can post questions at any time in the Q&A window, our experts will be responding there in real time and at the very end, they'll go over some questions as well, time permitting, of course. This webinar is being recorded, and a replay will be available immediately after we conclude. You can access that by using your attendee link for this presentation. And with that, Kelli, I'll hand it off to you.

Kelli Belcher

executive
#2

Thank you, Austin. Hi, everyone. My name is Kelli, and I'm an AI Solutions Engineer at Intel. I'm very excited to talk about optimized cloud modules with you, and How to Deploy Faster Machine Learning Workloads on Microsoft Azure. So for the agenda for today, I'll first introduce the Intel AI Stack, which consists of both hardware and software optimizations. Then I'll provide an overview of the Intel Optimized Cloud Modules we have developed for a variety of different cloud service providers, including Microsoft Azure, Amazon Web Services, and Google Cloud Platform. Then we'll go over the Intel optimizations for machine learning that are included in the cloud modules, and I'll provide a brief introduction to Kubernetes concepts and Kubeflow Pipelines. Then we'll conclude with a demo and walkthrough on building the full XGBoost Kubeflow pipeline on the Azure cloud. So without any further ado, let me introduce you to the Intel AI Stack. If you take a look at this top-level overview of the AI pipeline, we can divide it into 3 main sections. The first step is data preparation, which involves ingesting, analyzing, formatting and cleaning the data. The second step is model training and development. And the third is deploying the model into production. So let's start from the beginning of the pipeline, the data preparation step. Data preparation is where around 80% of the time is spent on AI solutions. It is a very iterative process requiring large amounts of memory, data storage, management and data transformations. And this is where the benefits of CPUs can be used to scale up the performance and run in parallel on multiple CPU cores. Now let's take a look at the far end of the pipeline on the right. Once the optimal model has been selected for deployment and used in production, a CPU provides excellent efficiency for inference workloads and is widely used for AI inferencing tasks.
And in the middle here, the model development step, I would split this into 2, machine learning and deep learning training. Classical or traditional machine learning has a similar story to data preparation. It most often takes place on a CPU, making use of many different mathematical operations. Deep learning is more resource-intensive and requires more of the vectorization and parallelization capabilities of the underlying hardware, which is often best suited to a GPU, though there is also a way to achieve very close performance by selecting a different data precision type with our fourth generation Xeon CPUs. So since each part of the AI pipeline is quite different from the others, the hardware capabilities of Xeon CPUs, paired with the AI software that Intel offers, play a critical role in providing significant value to customers, developers and data scientists at all stages of the pipeline. And with many different pieces in the AI pipeline, there are many different optimization types, and it's important to have all of the optimizations for different architectures in one place. And we at Intel have developed a solution for this. So let me introduce you to the Intel oneAPI toolkit. This toolkit provides the software tools and optimizations needed to efficiently develop high-performance applications and solutions across a variety of Intel architectures, including CPU, GPU, FPGA and VPU. It is a suite of complementary toolkits, a base toolkit and specialty add-ons like the AI Analytics Toolkit, that include best-in-class compilers, powerful performance libraries, and analysis, debug and porting tools. So talking about the segment-specific tools, it includes different solutions for HPC, for AI, for IoT and for working with graphics. So you can also visualize this as a set of optimized mathematical functions and operations, which is used in our optimized frameworks and libraries and provides additional performance enhancements to your workloads.
And we can also talk about AI software at different levels of depth. One way I like to look at the stack is by starting at the first layer, which is where the low-level performance libraries, compilers, kernels and software components operate very close to the hardware and provide many performance enhancements. Now, while many developers operate at this level, it is not where the majority of AI developers are. Nowadays, most of the data engineers and data scientists operate at the frameworks layer, the second layer here. This is the layer which has all the machine learning and deep learning libraries, tools and frameworks that make things easier for developers. And on top of this layer is typically where the AI platforms and MLOps tools are. These platforms and MLOps tools typically unify the data science journey end-to-end, from data preparation to deployment, and help to improve the productivity of data scientists and cross-functional teams in the organization. So at the high level, you can see these 3 key layers that represent the software stack, mapped onto the data science journey. And now let's take a closer look at the Intel architecture that helps enable this journey. So underneath this stack, we have the hardware. This includes the number of cores, frequency, vector accelerations, and different instruction sets like AVX2, AVX-512, VNNI, Intel DL Boost, AMX, and XMX for GPUs, which help to significantly boost the performance of AI workloads at the hardware level. And to take advantage of that, at the performance libraries layer, we have our oneAPI toolkit, which is the unifying software layer across all of our hardware. So this includes some of the key libraries like oneDAL, oneDNN, oneCCL, and oneMKL, that help to bring out the best performance of Intel CPU and GPU hardware. In order to simplify the work of programmers with the diversity of hardware at hand, that is CPU, GPU, FPGA and other accelerators, or in short, at Intel we like to call them XPUs.
Intel offers oneAPI, which includes a unified cross-architecture language based on C++ standards. And building upon that, various optimized domain-specific libraries are also included, such as oneDNN, the oneAPI Deep Neural Network Library; the oneAPI Math Kernel Library, or oneMKL, which optimizes mathematical operations on Intel hardware; and oneDAL, the library for data analytics, which in turn powers several Intel optimizations for traditional machine learning libraries. And based on this oneAPI ecosystem, Intel has developed a collection of AI software tools, some of which are included in the Intel AI Analytics Toolkit, while others are stand-alone products like OpenVINO. So the next layer here is the frameworks layer. These packages represent some of the most popular frameworks used in data science and machine learning workloads, such as Pandas, Spark, SciPy, NumPy, Scikit-learn, XGBoost, PyTorch, and TensorFlow. And a lot of our optimizations for these frameworks and libraries are upstreamed, or need only a few lines of code to unlock the benefits of the optimizations. So once you're done with creating the models, you go into the post-training or customization phase, where you further tune your model to get the best performance. This is a critical phase because there are a lot of TCO benefits to be had by making sure that the models are highly optimized and sustainable. And this is also one of the key differences between proof of concept and production. To put your models into production, you first need to achieve the best performance and accuracy ratio. And here is where the Intel Neural Compressor, which provides multiple performance improvements, fits in. So there's a wide variety of optimizations, and while developers can get each of these ingredients separately to suit their needs, we have also pre-integrated and cross-validated these into a very efficient package called the AI Analytics Toolkit.
And this toolkit includes many of the oneAPI optimizations out of the box, packaged to provide a turnkey, quick-start data science environment. In addition, we also have oneAPI-powered AI reference kits and Intel-optimized cloud modules to help bring all these components together. Also, in our repository on GitHub, you can find many additional code samples, which include optimizations for training and inference. And we also have a set of hands-on, ready-to-run generative AI notebooks with different use cases, like stable diffusion and fine-tuning LLMs, that are hosted on the Intel Developer Cloud. And now I'm pleased to introduce the Intel-optimized cloud modules. As I mentioned in the previous slide, the Intel-optimized cloud modules help to bring together all of the components of our software and hardware optimizations through the different phases of the AI pipeline across a variety of cloud service providers, including AWS, Azure and GCP. These are open-source reference architectures, or modules, with a full set of instructions to facilitate building and deploying optimized, efficient and scalable AI solutions, with all source code published on GitHub. And this is a high-level overview of some of the tools and optimizations that are implemented within the set of modules. At the DevOps level, we have modules that incorporate Kubernetes and Kubeflow, Docker, Terraform and FastAPI. Then at the application level, we have a variety of different AI tools and oneAPI optimizations that showcase different machine learning and deep learning use cases. For machine learning, we have modules that implement XGBoost optimizations and the Intel Extension for Scikit-learn for accelerating Scikit-learn operations. We'll touch more on these libraries later in the session. And for deep learning use cases, we have modules around fine-tuning an LLM and a stable diffusion module that uses Intel optimizations for PyTorch and oneDNN under the hood.
These modules also use oneCCL optimizations for distributing the fine-tuning process across multiple nodes in the cluster. And these are all designed for use with specific instruction sets for different cloud service providers like GCP, Azure and AWS, and leverage Intel hardware offerings on their platforms, like Xeon CPUs and Intel Software Guard Extensions, Intel SGX. So I want to focus a little more on some of the modules we have currently released for Azure. This module, XGBoost pipeline on Kubernetes, provides an optimized training and inference pipeline using XGBoost to predict the probability of a loan default. The reference solution utilizes the Intel optimizations for XGBoost, Intel oneDAL and the Intel Extension for Scikit-learn in the full end-to-end machine learning pipeline. This module can be used as a reference solution for data preprocessing, machine learning for binary classification tasks, model inference and performance analytics, and applications deployed on Azure Kubernetes clusters. The module also demonstrates how [indiscernible] can be updated with new data and trained incrementally, which aims to tackle challenges such as data shift and very large data sets. The solution architecture uses Docker for application containerization and stores the image in Azure Container Registry. The application is then deployed on a cluster managed by the Azure Kubernetes Service. Our cluster runs on confidential computing virtual machines, leveraging the Intel Software Guard Extensions. An Azure file share is used for persistent data and model storage, and an Azure load balancer is provisioned by our Kubernetes service, which the client uses to interact with the application. The XGBoost pipeline on Kubernetes consists of 3 main API endpoints. The first is data preprocessing. This endpoint receives the raw CSV file, creates the data preprocessing pipeline and then generates training and test data of the size specified.
The train and test data, as well as our preprocessor, are stored in the Azure file share for later use. The second endpoint is model training. This endpoint begins training an XGBoost classifier. The model validation results are then stored in the Azure file share, and this endpoint also provides an option to continue training the model as new data becomes available. And the third endpoint is model inference. This endpoint retrieves a trained XGBoost model from the Azure file share and converts it into an inference-optimized Daal4Py model to calculate the predictions on the test set. The inference results, containing the predicted risk probabilities, are then returned to the client and stored in the Azure file share. And linked here on the slide you can find the open source implementation for this module on GitHub. The second module builds on top of the previous module with the implementation of Kubeflow, and this is actually the module that we'll be demoing in today's session. This solution is derived from the loan default risk prediction AI reference kit and has been refactored to achieve better modularity for Kubeflow pipelines. The cloud solution architecture utilizes an Azure Kubernetes cluster running on Intel Software Guard Extensions virtual machines. An Azure Container Registry is attached to the Kubernetes cluster with the image that the Kubeflow pipeline uses to build the containerized Python components. The Kubeflow layer is then installed onto the Azure cluster, and when the XGBoost pipeline is deployed onto Kubeflow, each of the pods in the pipeline is assigned to and run on one of the Intel SGX nodes. So there are 7 different components in the pipeline. The first component loads the data from the URL provided in the pipeline run parameters and synthetically augments the data to the size specified. It then saves the new data set as an output artifact in the Kubeflow MinIO volume.
This dataset is then read into the next component, which subsets the data into training and test sets for model evaluation and saves these files as dataset artifacts. The third component creates a data preprocessing pipeline to prepare the data for the XGBoost model. This component loads the train and test features from the MinIO storage and transforms the categorical variables using [indiscernible] encoding, imputes the missing values and power-transforms some numerical features. The next component then trains the XGBoost model using the software accelerations powered by oneAPI, and without any code changes needed, the XGBoost optimizations are automatically enabled on the Intel Xeon nodes in the cluster. To further optimize the model prediction speed, we convert the trained XGBoost model into an inference-optimized Daal4Py classifier in the convert-XGBoost-to-Daal4Py component. Then we use the inference-optimized Daal4Py model to compute both the binary class labels and probabilities. This component will return the classification report as well as 2 Kubeflow metrics artifacts, the area under the curve and the accuracy of the model. And then in the last component of the pipeline, we calculate the ROC curve using the CPU-accelerated version from the Intel Extension for Scikit-learn. The results are stored as a Kubeflow classification metric and can be viewed in the visualization tab of the ROC curve component. We'll be going through each of the components in the pipeline in a little more detail in today's demo. So those were our 2 primary modules for machine learning that we've developed for Azure, but we also have many more third-party cloud optimization modules for Azure, AWS and GCP on our cloud solutions overview page, which you can visit by scanning the QR code on the lower right side of your screen, and we'll be updating the page as we release new modules in the future. So as I mentioned in the previous slide, today we'll be building an XGBoost Kubeflow pipeline.
But before we do that, let's first take a closer look at the Intel-optimized frameworks and packages that are included in the module. Let's begin with our framework optimization for Scikit-learn. Scikit-learn is one of the most widely used machine learning packages for Python. It includes implementations of the standard machine learning algorithms for classification problems, regression and clustering. The Intel Extension for Scikit-learn is available as part of the AI Analytics Toolkit or can be installed as a separate package into your environment. This framework optimization seamlessly speeds up your Scikit-learn applications for Intel CPUs and GPUs across single and multi-node configurations, and requires very minimal code changes, which we'll review in this slide. There are actually a few different ways to enable the extension. My preference is the option on the left, which requires only 2 lines of code to use the Intel-optimized Scikit-learn libraries. You just import the patch with from sklearnex import patch_sklearn, then apply the patch before importing your Scikit-learn libraries, and this will automatically import the Intel-accelerated versions of the Scikit-learn algorithms and functions that you're using. There's also a similar command to undo the patching at any time, the unpatch_sklearn command. In addition, the patch can also be applied to individual Scikit-learn algorithms, which you can specify individually to patch only the selected ones. And we have more detailed documentation on importing the additional supported algorithms on our GitHub. And another way to optimize your Scikit-learn code is by applying the patch directly from the command line, shown on the right side of the slide. So now let's take a closer look at what this change enables for your Scikit-learn workloads.
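As a rough illustration of the two-line patch described above, here is a minimal sketch. The guard function enable_intel_sklearn and the toy dataset are my own additions for illustration, not part of the module; the sketch falls back to stock scikit-learn when the extension is not installed.

```python
import importlib.util


def enable_intel_sklearn() -> bool:
    """Apply the Intel Extension for Scikit-learn patch if it is installed.

    Hypothetical helper for this sketch; returns True when the patch was applied.
    """
    if importlib.util.find_spec("sklearnex") is None:
        return False  # extension missing: stock scikit-learn is used unchanged
    from sklearnex import patch_sklearn

    patch_sklearn()  # must run before the scikit-learn imports below
    return True


patched = enable_intel_sklearn()

if importlib.util.find_spec("sklearn") is not None:
    # These imports resolve to the accelerated versions when the patch is active.
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    # Small synthetic dataset, only to show that the rest of the code is unchanged.
    X, y = make_classification(n_samples=200, n_features=8, random_state=0)
    model = SVC().fit(X, y)
    print("patched:", patched, "train accuracy:", model.score(X, y))
```

To patch only selected algorithms, patch_sklearn accepts a list of names, for example patch_sklearn(["SVC"]), and the command-line route mentioned above is python -m sklearnex followed by the script name.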
The accelerated libraries are powered by the oneAPI Data Analytics Library, oneDAL, in the back end, which has an optimized mathematical function subset that enables very performant results. For example, the support vector classifier could be up to 200 times faster for training, and for inference it can achieve over 1,000x speedup. And the Intel Extension for Scikit-learn also supports optimizations beyond the Scikit-learn algorithms shown here on the screen. It also accelerates Scikit-learn tasks and operations such as the train-test split and calculating the ROC curve, which are implemented in the XGBoost Kubeflow pipeline module. So you can see, just by adding those 2 lines of code to your Scikit-learn workload, there's a huge opportunity to really speed up your Scikit-learn code as well as to save cost and improve productivity. And now let's take a look at our optimization for another well-known machine learning framework called XGBoost. XGBoost is a widely used machine learning algorithm that is known for its speed and accuracy in solving a variety of different data science tasks. It uses a gradient boosting framework, which is a sequence of different decision tree models connected through a gradient descent algorithm to minimize the error. To optimize training of XGBoost models, Intel has upstreamed optimizations into the open source distribution of XGBoost. These optimizations help to speed up XGBoost's histogram tree building with automatic memory prefetching. When building the histograms, the work is automatically partitioned across multiple processing threads, parallelizing XGBoost's split function, which provides a more efficient implementation and reduces memory consumption. To further accelerate inference on Intel CPUs with features that have not yet been ported into the open source distribution of XGBoost, we recommend a switch to the oneDAL backend through the inclusion of Daal4Py. So now let's look at an example of this.
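The switch to the oneDAL backend for inference can be sketched roughly as follows. This assumes a daal4py release that still provides get_gbt_model_from_xgboost and gbt_classification_prediction (newer releases may expose a different conversion entry point); the wrapper name predict_with_daal4py is hypothetical, and the imports are deferred so the sketch loads even where daal4py is not installed.

```python
def predict_with_daal4py(xgb_classifier, X_test, n_classes: int = 2):
    """Run inference for a trained XGBoost classifier through oneDAL.

    Hypothetical wrapper for illustration. xgb_classifier is assumed to be a
    fitted xgboost.XGBClassifier and X_test a 2-D NumPy array of features.
    """
    import daal4py as d4p  # deferred import: only needed when the function is called

    # Convert the underlying booster into a oneDAL gradient-boosted-trees model.
    daal_model = d4p.get_gbt_model_from_xgboost(xgb_classifier.get_booster())

    # Ask for both class labels and class probabilities in a single pass.
    algorithm = d4p.gbt_classification_prediction(
        nClasses=n_classes,
        resultsToEvaluate="computeClassLabels|computeClassProbabilities",
    )
    result = algorithm.compute(X_test, daal_model)
    return result.prediction, result.probabilities
```

Training stays in XGBoost as usual; only the prediction step changes, so the model's accuracy should be unaffected by the conversion.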
Here we can see that by using XGBoost as usual and by importing Daal4Py for the inference step, we can achieve a 23x performance boost with no loss in accuracy whatsoever. This is achieved by using Intel's optimizations under the hood, such as a more efficient model representation in memory and the usage of the AVX-512 instruction set on Intel Xeon CPUs. And Daal4Py usage is very easy to implement. You just import your model, convert it with Daal4Py by calling get_gbt_model_from_xgboost, and then make predictions with the optimizations enabled by using the GBT prediction function. So these are some of the main software optimizations that we've included in the machine learning cloud modules. And now, before we begin building the pipeline, let's briefly go over some of the infrastructure tools and technologies that are implemented in the XGBoost Kubeflow pipeline. So let's begin with Kubernetes. Kubernetes is an open-source container orchestration platform that is designed for running production workloads at scale. It enables application services to maintain continuous availability, to scale up resources during periods of high usage, and to release those that are no longer needed. To do this, Kubernetes leverages the flexibility of containers and coordinates the deployment and distribution of containerized applications in a more efficient way in the Kubernetes cluster. Kubernetes clusters are made up of a set of computers, either physical or virtual machines, that are called nodes. The nodes in the cluster are connected to one another and work together as a single unit. To run the application containers, Kubernetes creates pods with the containers inside them. The pod is the smallest deployable unit in a Kubernetes application. A Kubernetes pod consists of one or more containers along with the shared resources and information that are needed to run the containers. Pods are then scheduled to the best available node for the containerized application to run on.
With Kubernetes, you can also create replicas of your pods, so that you have a backup in case one were to fail, and you can configure autoscaling of the cluster resources, like increasing or decreasing the number of pods through horizontal scaling, or adjusting the computing resources like the memory size or CPU core count through vertical scaling. Kubernetes is also known for its portability and can be deployed on your local machine, on the cloud or in the data center. In this session, we'll be using the managed Kubernetes service on the Azure cloud. And now let's take a look at the machine learning platform for Kubernetes called Kubeflow. Like Kubernetes, Kubeflow is an open-source platform that extends the benefits of scalability and portability to machine learning applications. The Kubeflow platform provides a curated set of components that are designed for managing and deploying the resources needed throughout all phases of the machine learning application life cycle. Some of these components include Kubeflow Pipelines, which is an end-to-end orchestration tool for defining and running machine learning workflows. Kubeflow also has native support for Kubeflow Notebooks and RStudio, and operators for training machine learning models. It also includes a system for hyperparameter tuning and serverless inference interfaces for model serving. In the module that we'll be deploying today, we'll be integrating the Intel software optimizations that we went over earlier into a Kubeflow pipeline that is running on an Azure Kubernetes cluster. And so now I'll hand it over to Ben Consolvo, who will be doing a demo of the setup for the module and running the Kubeflow pipeline.

Benjamin Consolvo

executive
#3

Hello. My name is Ben Consolvo, I'm an AI Solutions Engineering Manager at Intel. Welcome to the Intel Cloud Optimization Modules for Microsoft Azure, XGBoost Kubeflow pipeline. In this video, I will be going over the hands-on setup of Azure resources, installing Kubeflow and deploying a Kubeflow pipeline. So let's begin with the setup of Azure. There are a few prerequisites that need to be installed locally in order to get started. As you can see, I am running in a terminal window in VS Code with Windows Subsystem for Linux, which you can see in the bottom left corner, installed with an Ubuntu 22.04 image. And I can run this lsb_release command to see what version of Ubuntu I'm running. We'll be using the Microsoft Azure CLI, as well as kubectl to interact with and check on our Kubernetes cluster, and kustomize, which is used to install Kubeflow, and we will be using Docker also. So you can install the latest version of the Microsoft Azure CLI with this command, the curl command, piping it into bash and running the script. If you already have the Azure CLI installed, you can check it by running az --version. All right. So we can install kubectl so that we can interface with our Kubernetes cluster. And to install the latest version of kubectl, again, we can run a curl command, I'll paste that in here. And then yes, once we get the binary downloaded, and it looks like it's downloaded here, we can put it in a better location so that we can run kubectl from anywhere. So we're going to put it in /usr/local/bin/kubectl. And then we can check the version of kubectl with this version command, and we're just outputting it in YAML format. So it looks like we have version 1.28.1 installed. And finally, to install kustomize, which is a tool that will help us when we install Kubeflow later, we can run this curl command. And then -- so we can see it's put this binary into our home directory here.
And yes, so what we can do is we can actually put this as an alias into our .bashrc file so that we can run this from -- yes, from anywhere that we are in our command line. All right. And then we can source our .bashrc file to make sure that we get that alias. And then yes, we should be able to check the version of our kustomize installation. So it looks like we have 5.1.1 for the stand-alone version. And now Docker does need to be installed as well, and I'm using Windows. So I'm going to use Docker Desktop with the Windows Subsystem for Linux. So I'll show you Docker Desktop. These are the installation instructions, I won't go through this, but essentially just download the executable and run it. And then once you have Docker Desktop installed, one of the things that I wanted to point out is that once you have this up and running, make sure to go into your Settings, go to Resources, WSL integration, and make sure that the integration is enabled with your additional distro here. When I had this, it was deactivated, so I wanted to make sure that that's active, and also make sure here under General that it's using the WSL 2-based engine. All right. So that's for Docker. And then what you need to do with Docker in the command line is you need to make sure to add your user to the docker group with this -- the sudo command. So it looks like I already have that done, and then adding the user here, and then newgrp docker. Okay. So now we can dive into setting up our Azure resources and installing Kubeflow. So the next step is actually to git clone the code base and then change my directory to the main folder. So I've actually already cloned the code base for myself, but here's the git clone command in case you need to run that. And then what I'm going to do is I'm going to cd into, or change directory into, this cloud optimization folder. And then within that, as you can see on the left, there are some other folders that I have showing.
So I'm going to go into the Kubeflow folder, pipelines and then XGBoost. So within this folder, I'll show it here, pipelines, XGBoost, we have a Dockerfile as well as a source folder, the src folder that you can see there, with the Python script that is used to build the Kubeflow pipeline. So first, what I'm going to do is I'm actually going to log into Azure with this az login command. And what it does is it opens up a browser and then you can just log in with this. And then you should see from your command line an indication that you've been able to successfully log in, like this. Now I'm going to create a resource group that will hold all of the Azure resources for our solution. So in my case, I'm going to call the resource group Intel AKS Kubeflow-1 and set my location to Central U.S. So in general, I actually like to create -- I already created this. I like to create an env.sh file that has all of the environment variables that I want to save. So I'm showing those here, the resource group, RG, and then the location, Central U.S. And then I have a couple of others that I'll cover later. What I want to do is create this resource group with az group create, -n and then RG, -l and the location. Actually, I need to source these environment variables first to make sure that I have them saved in my bash session, and then I'll run that command that I just canceled out of. All right. So now I'm creating this resource group, and it looks like it was successful. So now we can actually go to the Azure portal, which I'll pull up here, and then I can click on Resource Groups, and I can see that I've created this Intel AKS Kubeflow-1 resource group. So now we can create the Azure container registry to store our container image for the Kubeflow pipeline. Another environment variable we can set is for this Azure container registry, which I'm showing up here, this Kubeflow registry 1. So I'm going to clear this.
And yes, the command that we use is az acr create, and then we're specifying which resource group, which we've already defined, and then we want to name it according to that ACR that I've already saved as an environment variable, and then the SKU would be Standard. Okay. So it looks like we were able to create that Azure container registry. And before we build our image, I'm just going to first log into the Azure container registry with this acr login command. Okay. Next, we'll build our image using this Dockerfile, okay? So I'm going to open up the Dockerfile. Here at the top, what you can see is that for the container components in the Kubeflow pipeline, we're going to pull the base image from Python 3.10 and then copy the requirements, which is a list of packages to install for Python, into that image. And then we're going to install all those requirements. So just make sure that you're in this directory that I'm showing on the screen, this pipelines, XGBoost folder, because that's where the Dockerfile sits. And to build and push our image to the Azure registry, we'll use this ACR, or Azure Container Registry, build command. So I'll type it out here: build, image, and this is the image name, which I'm going to tag as latest; registry, which is our ACR; RG for our resource group; the file is the Dockerfile; and then we're just going to run it on the present working directory. So that's the dot. All right. So yes, this may take a couple of minutes. But yes, we'll go ahead and wait for this. All right. To verify that our image was successfully pushed, now that this is done running, we can actually use this acr repository show command. So the nice thing here is it shows us the image name, when it was last updated, and the registry that it is stored in. Okay. So now we're ready to create our Azure Kubernetes Service cluster. We're going to do this in 2 steps to set up the confidential computing nodes.
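For reference, the resource group and container registry steps described so far can be laid out as a dry-run sketch that only prints the Azure CLI commands. The variable names mirror the env.sh idea from the demo, but the default values and the image name are placeholders I have assumed; they are not shown in the recording.

```python
import os
import shlex

# Placeholder defaults (assumptions); in the demo these come from a sourced env.sh.
RG = os.environ.get("RG", "intel-aks-kubeflow-1")
LOC = os.environ.get("LOC", "centralus")
ACR = os.environ.get("ACR", "kubeflowregistry1")
IMAGE = "xgboost-pipeline:latest"  # hypothetical image name and tag

commands = [
    # Resource group that holds every Azure resource for the solution.
    ["az", "group", "create", "-n", RG, "-l", LOC],
    # Container registry that stores the pipeline image.
    ["az", "acr", "create", "--resource-group", RG, "--name", ACR, "--sku", "Standard"],
    ["az", "acr", "login", "--name", ACR],
    # Build from the Dockerfile in the current directory and push to the registry.
    ["az", "acr", "build", "--image", IMAGE, "--registry", ACR, "-g", RG,
     "--file", "Dockerfile", "."],
    # Confirm the image landed in the registry.
    ["az", "acr", "repository", "show", "--name", ACR, "--image", IMAGE],
]

for cmd in commands:
    print(shlex.join(cmd))  # dry run: print each command instead of executing it
```

Keeping each command as an argument list makes it straightforward to replace the print with subprocess.run(cmd, check=True) once the values have been checked against your own subscription.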
First, we'll create a system node pool and enable the confidential computing add-on for the cluster. For this node pool, I'm using a Standard D4 v5 virtual machine, which is a third-gen Xeon, and this is the node that will host the AKS system pods like CoreDNS and metrics-server. We'll also provision a standard Azure load balancer for the cluster and attach the container registry we created in the previous step, which will allow our cluster to pull images from the registry. So I'm just going to open up the environment variables script again and show you this last environment variable for AKS, the Azure Kubernetes Service cluster that we're creating. So that's already saved in our bash session. I'm going to clear the screen and then copy this command because it's quite long, but it's just specifying some standard things that we need to have for our cluster. So yes, go ahead and create the cluster. It looks like we were able to spin up our cluster. This should just take a few minutes to spin up, and then we're going to add a confidential computing node pool using the Intel Software Guard Extensions virtual machines. All right. So now, just to make sure that the registry was attached to the cluster successfully, I'm going to run the check-acr command. So that's az aks check-acr, then the name of the cluster, the resource group that we're in, and then the name of the ACR. All right. So it looks like we are able to pull images from the Kubeflow registry, as it says here. So now we will go ahead and add the Intel third-gen Xeon CPU nodes with Software Guard Extensions, the SGX VM node pool, to the cluster. So I'm going to add two 4-core SGX nodes, which are Standard DC4 v3 instances on Azure. I'll name this node pool Intel SGX, which will be referred to later when we add the node affinity to the Kubeflow pipeline. So I'm going to copy that command here to add this node pool and go ahead and run that. All right.
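The two-step cluster setup above might look roughly like the following, again assembled as argument lists in Python rather than executed. The talk does not read out the long `az aks create` command, so the flags here are an assumption based on what the speaker says it does (confidential computing add-on, standard load balancer, attached registry); the cluster name, the exact VM SKU strings, and the `intelvm=sgx` label spelling are likewise illustrative.

```python
def build_cluster_commands(rg="intel-aks-kubeflow-1", loc="centralus",
                           acr="kubeflowregistry1",
                           aks="aks-intel-sgx-kubeflow"):
    """Return the AKS setup steps as a dict of Azure CLI argument lists."""
    # Step 1: system node pool with the confidential computing add-on.
    create = [
        "az", "aks", "create", "--name", aks, "--resource-group", rg,
        "--location", loc, "--node-count", "1",
        "--node-vm-size", "Standard_D4_v5",      # assumed SKU string
        "--enable-addons", "confcom",
        "--enable-managed-identity", "--generate-ssh-keys",
        "--load-balancer-sku", "standard",
        "--attach-acr", acr,
    ]
    # Verify that the cluster can pull from the attached registry.
    check_acr = ["az", "aks", "check-acr", "-n", aks, "-g", rg,
                 "--acr", f"{acr}.azurecr.io"]
    # Step 2: SGX node pool, labeled so pipeline pods can target it later.
    add_sgx_pool = [
        "az", "aks", "nodepool", "add", "--name", "intelsgx",
        "--resource-group", rg, "--cluster-name", aks,
        "--node-vm-size", "Standard_DC4s_v3",    # assumed SKU string
        "--node-count", "2",
        "--labels", "intelvm=sgx",               # assumed label spelling
    ]
    return {"create": create, "check_acr": check_acr,
            "add_sgx_pool": add_sgx_pool}
```

The label passed with `--labels` is the piece that matters later: it is what the pipeline's node selector matches against when scheduling pods onto the SGX nodes.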
It looks like we were able to spin up our node pool. So now I just need to get the access credentials to the cluster, and this will merge them into the .kube/config file so that kubectl can use them. To get those credentials, we can do az aks get-credentials, then the name of the AKS cluster, and then the name of the resource group. All right. So it already exists, but that's fine. I can overwrite it. As you can see, it's saved into this .kube/config file. And we can verify that the cluster credentials were set with this kubectl command, kubectl config current-context. And then we can see that it was set to the AKS name that we specified. Okay. Great. And now, to make sure that all of our VMs are running, I'm going to use this kubectl get nodes command, and I can see that I have these 3 nodes running. So we have 2 Intel SGX nodes and one standard node from the system node pool. And then, to check that the SGX device plug-ins are running, I'm going to use the kubectl get pods command with -A for all namespaces. And then we can see at the bottom that we do have our pods running here, the SGX plug-in and also the SGX webhook. Now that we've deployed all of our Azure resources for the pipeline, we'll install Kubeflow and set up the Kubernetes resources. To install Kubeflow on Azure, we're going to first clone the Kubeflow manifests GitHub repo. I'll show the command here, but I've already cloned it, so I won't actually execute the command. All right. And then I'm going to make sure that I'm in this manifests folder. So I'll go in there. The first thing I'm going to do is create a password to access the Kubeflow dashboard. The default password is 12341234, but I'm going to change that to something unique and hash it using a library called Passlib. So I'll do that with this Python command, and then I'm going to use the password Kubeflow12345. And then I get this hash that gets printed out.
And so what I'm going to do is copy this into this config file, the config-map YAML file, where there's a hash. And you can see up here where the file is located in my VS Code: manifests, common, dex, base, config-map. And so I'm going to replace this hash. You can also see that the default user name is this [email protected]. I'm going to save that file. So now I've edited that line with the new hash, around line 22. The next thing I'm going to do is change the Istio ingress gateway from ClusterIP to a LoadBalancer. To do that, I'm going to open up the service.yaml; again, you can find where that is with the directory path here. And so I'm going to change this from ClusterIP to LoadBalancer. This will configure an external IP address so that we can use the dashboard from within our browser. For Azure Kubernetes clusters, we also need to disable something called the Admission Enforcer for the Istio webhook. So I'm going to go to this install.yaml file, around line 2807, and add an annotation below the mutating webhook configuration metadata. So in this metadata section, I'm going to add this annotation. And then I'm going to save that file. And the last manifest file I'm going to update is the Istio gateway, so that we can access the dashboard over HTTPS. So under the HTTP server, I'm going to add tls, for the Transport Layer Security protocol, and set httpsRedirect to true. And then below that, I'm going to add a second port, 443. I'll just paste this in here now, so you can see it. All right. And then we're setting the protocol to HTTPS and the hosts to all. We will be using simple TLS, and we'll create the certificate for the server once Istio's load balancer has been created with our external IP address. All right. So go ahead and save this kf-istio-resources YAML. Okay. So now I'm going to install Kubeflow using the kustomize while loop, which will apply all of the manifest files that we just edited with one command.
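As a rough illustration of the gateway edit described above, the sketch below models the Istio Gateway resource as a Python dictionary and applies the two changes: an HTTPS redirect on the existing HTTP server and a second server on port 443 using simple TLS. The starting resource and the secret name `kubeflow-tls-cert` are assumptions for illustration; the actual manifest in the Kubeflow repository may differ in detail.

```python
import copy
import json

# Assumed shape of the gateway before editing (HTTP on port 80 only).
DEFAULT_GATEWAY = {
    "apiVersion": "networking.istio.io/v1alpha3",
    "kind": "Gateway",
    "metadata": {"name": "kubeflow-gateway"},
    "spec": {
        "selector": {"istio": "ingressgateway"},
        "servers": [
            {"port": {"number": 80, "name": "http", "protocol": "HTTP"},
             "hosts": ["*"]},
        ],
    },
}


def enable_https(gateway, credential_name):
    """Return a copy of the gateway with HTTPS redirect and a 443 server."""
    gw = copy.deepcopy(gateway)
    servers = gw["spec"]["servers"]
    # Redirect plain HTTP traffic to HTTPS.
    for server in servers:
        if server["port"]["protocol"] == "HTTP":
            server["tls"] = {"httpsRedirect": True}
    # Add the HTTPS server; the certificate secret is created later,
    # once the load balancer has an external IP.
    servers.append({
        "port": {"number": 443, "name": "https", "protocol": "HTTPS"},
        "hosts": ["*"],
        "tls": {"mode": "SIMPLE", "credentialName": credential_name},
    })
    return gw


if __name__ == "__main__":
    # JSON is valid YAML, so this output could be pasted into a manifest.
    print(json.dumps(enable_https(DEFAULT_GATEWAY, "kubeflow-tls-cert"), indent=2))
```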
Just make sure that you are sitting in the manifests directory. We can see that we are. So let's go ahead and run this while command. This will take a few minutes for all the resources to spin up. Okay. Now we can check that all of the pods are running. We can do that with this kubectl get pods -A. So now we're just going to create a certificate for the TLS protocol. First, I want to get an IP address. I'll do that with this kubectl command, and we should be able to get our external IP. We can see it here on the third line. And we'll actually use this external IP when we create our certificate. So I've already created a blank certificate file here. And then I'm going to copy over the Istio certificate information in here, and the only thing that I'll need to add is this external IP address, and then I'll save that certificate file. And then I can run this kubectl apply on the Istio certificate file to create the certificate. And let's check that it was successfully created. All right. It looks like it was. Great. So now we are ready to log into Kubeflow and run our pipeline. Before we log into the Kubeflow dashboard, let's first take a look at the Kubernetes cluster on the Azure portal to see a couple of helpful monitoring tools. If we click into Kubernetes services and then click into our Kubernetes service, you can see the 2 node pools that we have with their node shapes listed, as well as the Kubernetes version that we're running. And then if we click into the Kubernetes resources, services and ingresses page, we can see a listing of all of our Kubernetes services that are running. And we can even see our external IP from the Istio ingress gateway. So we have this that we can use to access the Kubeflow dashboard. You can also type this address into your browser. So I'm going to go ahead and click this to open it up.
So when you first launch the dashboard, you might get this warning, and that's just because we're using a self-signed certificate. But you can replace this with an SSL CA certificate if you have one. So I'm just going to click Advanced and then proceed. And then, if you recall, we have our [email protected] that we left as default, and we changed our password to Kubeflow12345. All right. Now that we've logged into the Kubeflow dashboard, we can upload our created pipeline. But before we do that, I wanted to walk you through the pipeline code and how to create it. I've pulled up the Kubeflow pipeline Python code, and as you can see, it sits within our source folder. So we first import the KFP, or Kubeflow Pipelines, libraries. And then we store the URI of our image in the Azure Container Registry in this Intel XGB D4P image variable. This variable is referenced in building each component of the pipeline. I just have to make sure to update the Azure Kubeflow registry name here, since I named it with a 1. So in the first component of the pipeline, you can see this @dsl.component decorator. That's used to indicate a part of the pipeline. And we have this load data component. So we're going to read in a credit risk CSV file from a public URL and synthetically augment the data to the size specified in the pipeline parameters. This new data set is then saved as an output artifact in Kubeflow's MinIO volume. So this data set is passed along after we're through with this component of the pipeline, and I'll go to the next one. The next one is called create train/test set. This creates the training and test sets as 4 files, X train, Y train, X test and Y test, and then they're saved as artifacts. All right. And for the next component of the pipeline, we have this one called preprocess features. This is where we preprocess the training and test features.
It reads in the X train and X test files, transforms the categorical features using one-hot encoding, imputes the missing values, and power-transforms the numerical features. These preprocessing steps are saved as a scikit-learn pipeline that is fit on the training features and then used to transform the train and test sets. So we'll go to the next component. This is the part where we train the XGBoost model, and we are using the tree method hist. This will automatically enable the Intel optimizations for XGBoost, since they have been upstreamed into the package. So you just want to make sure that you're using the most up-to-date version of XGBoost so that you get the most optimization benefits. Moving on to the next component, we're going to convert XGBoost to daal4py. This is where we load in the XGBoost model and convert it to an inference-optimized daal4py classifier. And if you're not sure what daal4py is, it's a Python API for the oneAPI Data Analytics Library. This daal4py model is saved as an output artifact and is read back into the next component to evaluate the model's performance on the hidden test data set. So next, we move on to the inference component of the pipeline, called daal4py inference. We send the X test data into the daal4py model to calculate the predictions, and this will return a daal4py classifier prediction result object, which has both the target label predictions and the probabilities stored as NumPy arrays. This component returns a few things: first, it returns a CSV file with the classification report, and it also returns 2 Kubeflow metric artifacts, for the area under the curve and the accuracy. These can be viewed in the visualizations tab of this component of the pipeline once we open up the Kubeflow UI. All right. And then the last component of our pipeline. We have this plot ROC curve. Here we read in the predicted data with the probabilities and the true class labels, and we compute the ROC curve.
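To make the last step concrete, here is a dependency-free sketch of how an ROC curve and its area can be computed from true labels and predicted probabilities. It is a stand-in for illustration only; the pipeline component in the demo presumably relies on a library routine rather than hand-written code like this.

```python
def roc_curve(y_true, y_score):
    """Return (fpr, tpr) lists by sweeping a threshold over the scores.

    y_true holds 0/1 labels; y_score holds predicted probabilities
    for the positive class.
    """
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    # Highest scores first: lowering the threshold admits them in order.
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve using the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
               for k in range(1, len(fpr)))
```

For example, `roc_curve([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` yields a curve whose area is 0.75, since one negative sample is scored above one positive sample.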
These results are returned as a Kubeflow classification metrics artifact, which will be displayed in the graph on the visualizations tab. So these are all the components of the pipeline. But now we have to put it all together in a function. So we have this function decorated with @dsl.pipeline, the Intel XGBoost daal4py pipeline, and this is where we're putting together all the components. The data size can be any specified number below 1 million for testing and benchmarking, and the data URL is the address where the credit risk data CSV file is hosted. Then in our pipeline, we call each of the components that we defined above, and these will run as pipeline tasks. After each task, I've added the Kubernetes node selector. You can see that here, for example. When scheduling the pods for the pipeline tasks, the node selector will look for a node with a matching label key and value pair. The label key, Intel VM, and the value, SGX, make up the label that we assigned to the SGX node pool when we added it to the AKS cluster. Each component that returns an artifact can be used in the next component by calling that task's output with the variable's name. And at the very end of the Python file, after we get through all these, we have our main function. That will help us output a YAML pipeline file that we can upload directly into the Kubeflow UI. And now, to generate the YAML file for the pipeline, I'm going to run this Python script. All right. And so that created a YAML file, which we can see here, and that's what we will upload into the Kubeflow UI. The data set used for this demo is a set of 32,581 simulated loans and was sourced from Kaggle, as you can see here. It has 11 features, including customer and loan characteristics. So I'll pull up the data set here. It has these 11 features and one response, which is the loan status, the final outcome of the loan, whether it defaults or not.
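The node selector behavior described above can be sketched in plain Python: a pod that carries a node selector is only eligible for nodes whose labels contain every key and value pair in that selector. The node names and the `intelvm`/`sgx` spelling below are illustrative assumptions; in the pipeline code itself, the equivalent step would be a call along the lines of `kubernetes.add_node_selector(task, label_key=..., label_value=...)` from the KFP Kubernetes extension.

```python
def matches(node_labels, node_selector):
    """True if the node carries every key/value pair in the selector."""
    return all(node_labels.get(key) == value
               for key, value in node_selector.items())


def schedulable_nodes(nodes, node_selector):
    """Return the names of nodes a pod with this selector could land on."""
    return [name for name, labels in nodes.items()
            if matches(labels, node_selector)]


# Hypothetical cluster: one system node and two labeled SGX nodes.
NODES = {
    "aks-system-0": {"kubernetes.io/os": "linux"},
    "aks-intelsgx-0": {"kubernetes.io/os": "linux", "intelvm": "sgx"},
    "aks-intelsgx-1": {"kubernetes.io/os": "linux", "intelvm": "sgx"},
}

if __name__ == "__main__":
    print(schedulable_nodes(NODES, {"intelvm": "sgx"}))
```

With the selector `{"intelvm": "sgx"}`, only the two SGX nodes qualify, which is why every pipeline task in the demo ends up on the confidential computing node pool rather than the system node.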
So we will just use the original data set as a starting point, and we can generate a lot more training and test data by slightly perturbing the values in the data set. In order to use it for a Kubeflow pipeline, you will need to host it at a URL, like on a public storage bucket somewhere. The advantage of this is that you can have a URL that serves a stream of updated data, and you can keep retraining with recurring runs in Kubeflow on a schedule. I am back at the Kubeflow dashboard, so we can upload our created pipeline. You can go to the left panel here, click on Pipelines, go to Upload Pipeline, and then we'll upload the YAML file that we already created. So here it is. So upload that, keep the description the same as the pipeline name, and then we'll go to Create. So now we have our pipeline here, and we can go back to Home. What we want to do now is go ahead and create an experiment with that uploaded pipeline. We'll give it the same name with just experiment1, and then we'll choose the pipeline that we've already uploaded here. And then we are also required to put in a data set size. So I'm putting in 100,000. That's the generated data set size. And then, as I mentioned before, we have to host our data somewhere, so I've done that, and I'll put that URL in. And yes, now we can look at this experiment. Here, I just wanted to show you the portions of the pipeline running, sped up, because this does take a few minutes to run. All right. And it looks like our pipeline is done running, so we can go in and click on some of these individual components and take a look at the visualization of this ROC curve. And that concludes showing you the Kubeflow pipeline.
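The idea of growing the data set by slightly perturbing values, mentioned at the start of this segment, could look something like the sketch below. The talk does not specify the perturbation scheme, so the uniform noise of up to 5 percent, the sampling with replacement, and the column names are all assumptions made for illustration.

```python
import random


def augment(rows, target_size, numeric_cols, noise=0.05, seed=0):
    """Sample rows with replacement and jitter their numeric columns.

    Each numeric value is scaled by a random factor in [1 - noise, 1 + noise];
    other columns, including the label, are copied unchanged.
    """
    rng = random.Random(seed)
    augmented = []
    for _ in range(target_size):
        row = dict(rng.choice(rows))  # copy so the source row is untouched
        for col in numeric_cols:
            if row[col] is not None:  # leave missing values missing
                row[col] = row[col] * (1 + rng.uniform(-noise, noise))
        augmented.append(row)
    return augmented


# Hypothetical rows shaped like a loan data set.
SAMPLE = [
    {"person_income": 59000.0, "loan_amnt": 35000.0, "loan_grade": "D", "loan_status": 1},
    {"person_income": 9600.0, "loan_amnt": 1000.0, "loan_grade": "B", "loan_status": 0},
]

if __name__ == "__main__":
    bigger = augment(SAMPLE, 10, ["person_income", "loan_amnt"])
    print(len(bigger), bigger[0])
```

Fixing the seed keeps the generated data reproducible between runs, which helps when comparing benchmark results across different data set sizes.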

Kelli Belcher

executive
#4

Thank you, Ben, for the excellent demo. And I think we have a few minutes for questions. Bob, are there any questions?

Robert Chesebrough

executive
#5

Sorry, I was trying to get to my mute. No, I'm not really -- not seeing any questions, Kelli. There -- well, actually hang on a second, let's see. I think that there may have been something from another channel. So let me just look here Kelli, real quick. I think there was something -- just looking for it here. Yes. So how are the Kubeflow pipeline pods scheduled to the Intel nodes in the cluster?

Kelli Belcher

executive
#6

Yes. So to schedule the pods in the Kubeflow pipeline to the Intel nodes in the cluster, the Kubeflow Pipelines SDK version 2 has a Kubernetes extension, which has a function to add a node selector. And the node selector will look for the nodes in the cluster with the label, the key-value pair, that we assigned to the Intel SGX nodes when we created them.

Robert Chesebrough

executive
#7

Awesome. Just a couple of others, not in any particular order. What are the performance benefits of using XGBoost optimizations and the daal4Py?

Kelli Belcher

executive
#8

Yes. So during the testing and benchmarking of the loan default risk prediction kit, the XGBoost optimizations enabled a speedup of up to 1.5x, and for inference with a batch size of 1 million, the Intel oneDAL optimizations enabled a speedup of 4.4x.

Robert Chesebrough

executive
#9

Awesome. And then [indiscernible] really this one last one. Is there a way to run this pipeline on other Kubernetes platforms?

Kelli Belcher

executive
#10

Yes. We actually have a pipeline available for GCP using the Google Kubernetes engine with C3 instances, which are the fourth generation Intel Xeon CPUs. And we also have a general sample in the Kubeflow pipelines GitHub that can be run anywhere that Kubeflow is running.

Robert Chesebrough

executive
#11

Awesome. Well, I think that's all that I see, Kelli.

Kelli Belcher

executive
#12

Okay. Thanks, Bob. I'll hand it back over to Austin.

Austin Webb

executive
#13

Thank you, Kelli and Bob, for the webinar today. If you want to watch the replay, you can at any time using your attendee link. A replay will also be e-mailed out in the next few weeks. A quick reminder to complete and submit the short survey that will pop up automatically at the end of this webinar. Thank you so much for joining us, and we'll see you next time.
