From eac635e43978eaa8a9a0a821281603e256bd7eae Mon Sep 17 00:00:00 2001
From: Tuomas Katila
Date: Fri, 16 Sep 2022 15:24:46 +0300
Subject: [PATCH 1/2] gpu: fix documentation links

Signed-off-by: Tuomas Katila
---
 cmd/gpu_nfdhook/README.md | 2 +-
 cmd/gpu_plugin/README.md  | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/cmd/gpu_nfdhook/README.md b/cmd/gpu_nfdhook/README.md
index 9cc558d85..2735f8470 100644
--- a/cmd/gpu_nfdhook/README.md
+++ b/cmd/gpu_nfdhook/README.md
@@ -40,7 +40,7 @@ Following labels are created by default. You may turn numeric labels into extend
 name | type | description|
 -----|------|------|
 |`gpu.intel.com/millicores`| number | node GPU count * 1000. Can be used as a finer grained shared execution fraction.
-|`gpu.intel.com/memory.max`| number | sum of detected [GPU memory amounts](#GPU-memory) in bytes OR environment variable value * GPU count
+|`gpu.intel.com/memory.max`| number | sum of detected [GPU memory amounts](#gpu-memory) in bytes OR environment variable value * GPU count
 |`gpu.intel.com/cards`| string | list of card names separated by '`.`'. The names match host `card*`-folders under `/sys/class/drm/`. Deprecated, use `gpu-numbers`.
 |`gpu.intel.com/gpu-numbers`| string | list of numbers separated by '`.`'. The numbers correspond to device file numbers for the primary nodes of given GPUs in kernel DRI subsystem, listed as `/dev/dri/card<num>` in devfs, and `/sys/class/drm/card<num>` in sysfs.
 |`gpu.intel.com/tiles`| number | sum of all detected GPU tiles in the system.
diff --git a/cmd/gpu_plugin/README.md b/cmd/gpu_plugin/README.md
index 255b168e8..9dbc42ea6 100644
--- a/cmd/gpu_plugin/README.md
+++ b/cmd/gpu_plugin/README.md
@@ -6,7 +6,10 @@ Table of Contents
 * [Modes and Configuration Options](#modes-and-configuration-options)
 * [Installation](#installation)
     * [Pre-built Images](#pre-built-images)
-    * [Fractional Resources](#fractional-resources)
+    * [Install to all nodes](#install-to-all-nodes)
+    * [Install to nodes with Intel GPUs with NFD](#install-to-nodes-with-intel-gpus-with-nfd)
+    * [Install to nodes with Intel GPUs with Fractional resources](#install-to-nodes-with-intel-gpus-with-fractional-resources)
+        * [Fractional resources details](#fractional-resources-details)
     * [Verify Plugin Registration](#verify-plugin-registration)
 * [Testing and Demos](#testing-and-demos)
 * [Issues with media workloads on multi-GPU setups](#issues-with-media-workloads-on-multi-gpu-setups)

From 9b3ee06cb19086436888b06f2d0792eb5c139aee Mon Sep 17 00:00:00 2001
From: Eero Tamminen
Date: Fri, 9 Sep 2022 17:10:07 +0300
Subject: [PATCH 2/2] Add GPU plugin README prerequisites section

Signed-off-by: Eero Tamminen
---
 cmd/gpu_plugin/README.md | 105 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 100 insertions(+), 5 deletions(-)

diff --git a/cmd/gpu_plugin/README.md b/cmd/gpu_plugin/README.md
index 9dbc42ea6..5ff09e224 100644
--- a/cmd/gpu_plugin/README.md
+++ b/cmd/gpu_plugin/README.md
@@ -5,6 +5,11 @@ Table of Contents
 * [Introduction](#introduction)
 * [Modes and Configuration Options](#modes-and-configuration-options)
 * [Installation](#installation)
+    * [Prerequisites](#prerequisites)
+        * [Drivers for discrete GPUs](#drivers-for-discrete-gpus)
+            * [Kernel driver](#kernel-driver)
+            * [User-space drivers](#user-space-drivers)
+        * [Drivers for older (integrated) GPUs](#drivers-for-older-integrated-gpus)
     * [Pre-built Images](#pre-built-images)
     * [Install to all nodes](#install-to-all-nodes)
     * [Install to nodes with Intel GPUs with NFD](#install-to-nodes-with-intel-gpus-with-nfd)
@@ -19,7 +24,8 @@ Table of Contents
 ## Introduction
 
 Intel GPU plugin facilitates Kubernetes workload offloading by providing access to
-discrete (including Intel® Data Center GPU Flex Series) and integrated Intel GPU device files.
+discrete (including Intel® Data Center GPU Flex Series) and integrated Intel GPU devices
+supported by the host kernel.
 
 Use cases include, but are not limited to:
 - Media transcode
@@ -50,6 +56,95 @@ The following sections detail how to obtain, build, deploy and test the GPU devi
 Examples are provided showing how to deploy the plugin either using a DaemonSet or
 by hand on a per-node basis.
 
+### Prerequisites
+
+Access to a GPU device requires firmware, kernel and user-space
+drivers that support it. The firmware and kernel driver must be on the
+host; user-space drivers go in the GPU workload containers.
+
+Intel GPU devices supported by the current kernel can be listed with:
+```
+$ grep i915 /sys/class/drm/card?/device/uevent
+/sys/class/drm/card0/device/uevent:DRIVER=i915
+/sys/class/drm/card1/device/uevent:DRIVER=i915
+```
+
+#### Drivers for discrete GPUs
+
+##### Kernel driver
+
+For now, the kernel needs to be built from sources. Later there will
+also be pre-built kernels and/or DKMS GPU module distro packages for
+the enterprise / long-term-support kernels.
+
+While the last 5.x upstream Linux kernel releases already had preliminary
+discrete Intel GPU support, kernel v6.x or newer should be used.
+
+In upstream kernels, discrete GPU support needs to be enabled with the
+`i915.force_probe=<PCI_ID>` kernel command line option until the relevant
+kernel driver features have been completed upstream:
+https://www.kernel.org/doc/html/latest/gpu/rfc/index.html
+
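+For example, on a GRUB-based distro the option could be enabled roughly
+as follows (illustrative only; `56c1` is a placeholder PCI ID, see below
+for how to list the actual IDs on your host):
+```
+# In /etc/default/grub, add the option to the kernel command line:
+#   GRUB_CMDLINE_LINUX="i915.force_probe=56c1"
+# then regenerate the GRUB config and reboot. (On distros without
+# update-grub, use e.g. grub2-mkconfig -o /boot/grub2/grub.cfg.)
+$ sudo update-grub && sudo reboot
+```
+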
+PCI IDs for the Intel GPUs on a given host can be listed with:
+```
+$ lspci | grep -e VGA -e Display | grep Intel
+88:00.0 Display controller: Intel Corporation Device 56c1 (rev 05)
+8d:00.0 Display controller: Intel Corporation Device 56c1 (rev 05)
+```
+
+(`lspci` lists GPUs with display support as "VGA compatible controller",
+and server GPUs without display support as "Display controller".)
+
+Mesa "Iris" 3D driver header provides a mapping between GPU PCI IDs and their Intel brand names:
+https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/include/pci_ids/iris_pci_ids.h
+
+If your kernel build does not find the correct firmware version for
+a given GPU from the host (see `dmesg | grep i915` output), the latest
+firmware versions are available upstream:
+https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/i915
+
+##### User-space drivers
+
+Until new enough user-space drivers (supporting also discrete GPUs)
+are available directly from distribution package repositories, they
+can be installed to containers from Intel package repositories. See:
+https://dgpu-docs.intel.com/installation-guides/index.html
+
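+As an illustration, a container image could get the user-space drivers
+with something like the following (package names are from Ubuntu 22.04
+repositories and are enough only for GPUs the distro already supports;
+for discrete GPUs, substitute the Intel repository setup from the link
+above):
+```
+# OpenCL ICD loader + Intel compute runtime, VA-API media driver, clinfo
+$ apt-get update && apt-get install -y \
+      intel-opencl-icd intel-media-va-driver clinfo
+```
+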
+An example container is listed in [Testing and demos](#testing-and-demos).
+
+Validation status against the *upstream* kernel is listed in the user-space drivers' release notes:
+* Media driver: https://github.com/intel/media-driver/releases
+* Compute driver: https://github.com/intel/compute-runtime/releases
+
+#### Drivers for older (integrated) GPUs
+
+For older (integrated) GPUs, new enough firmware and kernel drivers
+are typically already included with the host OS, and new enough
+user-space drivers (for the GPU containers) are available in the host
+OS repositories.
+
 ### Pre-built Images
 
 [Pre-built images](https://hub.docker.com/r/intel/intel-gpu-plugin)
@@ -155,8 +250,8 @@ master
 ## Testing and Demos
 
 We can test the plugin is working by deploying an OpenCL image and running `clinfo`.
-The sample OpenCL image can be built using `make intel-opencl-icd` and must be made
-available in the cluster.
+The [intel-opencl-icd](../../demo/intel-opencl-icd/) sample OpenCL image, built
+using `make intel-opencl-icd` and available from DockerHub, is used for this.
 
 1. Create a job:
 
@@ -174,8 +269,8 @@ available in the cluster.
    ```
 
-   If the pod did not successfully launch, possibly because it could not obtain the gpu
-   resource, it will be stuck in the `Pending` status:
+   If the pod did not successfully launch, possibly because it could not obtain
+   the requested GPU resource, it will be stuck in the `Pending` status:
 
    ```bash
    $ kubectl get pods