Configure GPU-enabled nodes

Kubernetes includes support for managing graphics processing units (GPUs) across different nodes in a cluster, using device plugins.

In ArcGIS Enterprise on Kubernetes, you can implement a device plugin to enable GPU nodes in a cluster to optimize GIS workflows, such as those pertaining to raster analytics and deep learning. By default, capabilities such as raster analytics are configured to run in CPU mode, but they can also run in GPU mode when GPU resources are available. Consider whether your workloads will benefit from GPU-enabled nodes, as these nodes are generally more expensive to use.

Ensure that you have sufficient GPU resources for all of your GPU-enabled workloads. For example, a single raster analytics pod consumes one GPU; to support 10 replicas of that pod, you need 10 GPUs available across your GPU-enabled nodes.
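
To confirm how many GPUs are currently schedulable, you can list each node's allocatable nvidia.com/gpu resource. The following command is a sketch that assumes the NVIDIA device plugin is already installed and advertising GPUs:

    kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"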

Enable GPU

To enable the use of GPU for workloads, complete the following steps:

  1. Verify that your instance has the NVIDIA device plugin for Kubernetes installed.

    The NVIDIA device plugin for Kubernetes is a DaemonSet that allows you to expose the number of GPUs on each node of a cluster, run GPU-enabled containers, and track the health of the GPUs. Many cloud environments are preconfigured with GPU nodes. If the device plugin is not installed, see the NVIDIA device plugin for Kubernetes documentation for details and installation steps. If you've deployed on-premises, your administrator must enable GPU on each node in the cluster.

    Note:

    At this release, ArcGIS Enterprise on Kubernetes is supported only with NVIDIA GPUs.
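
    To check whether the plugin is already running, look for its DaemonSet; the namespace and name depend on how it was installed (a DaemonSet named nvidia-device-plugin-ds in the kube-system namespace is a common default):

    kubectl get daemonsets --all-namespaces | grep -i nvidia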

  2. Optionally, create a custom label for your GPU nodes.

    It is recommended that you label GPU nodes to allow workloads to be scheduled to these nodes using pod placement rules. Use the following command to label each node:

    kubectl label nodes <your-node-name> <your-key>=<your-value>
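
    For example, to confirm which nodes carry the label, list nodes with a matching label selector:

    kubectl get nodes -l <your-key>=<your-value>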
    

  3. Configure services to use GPU.

    If you are enabling GPU for notebook services, see View and edit runtimes for information on setting GPU units per node. If you are enabling GPU for raster analytics, see Enable GPU resources for raster analytics.
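
    For reference, Kubernetes workloads consume GPUs through the nvidia.com/gpu extended resource advertised by the device plugin. The following pod spec is only an illustrative sketch, not an ArcGIS Enterprise manifest; the name and image are placeholders:

    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-example                # hypothetical name for illustration
    spec:
      containers:
      - name: gpu-container            # hypothetical container name
        image: <your-gpu-image>        # placeholder image
        resources:
          limits:
            nvidia.com/gpu: 1          # request one GPU from the device plugin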

  4. Optionally, use taints to ensure that only workloads that require GPU are scheduled to GPU-enabled nodes.

    Apply tolerations to any workloads that should run on GPU-enabled nodes before tainting the nodes. Use the following command to taint a GPU node:

    kubectl taint nodes <your-node-name> nvidia.com/gpu=Exists:NoExecute
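
    For reference, a pod tolerates this taint when its spec includes a matching toleration. The following snippet is a sketch that matches the taint above; how you apply it depends on how the workload is configured (for example, through pod placement rules):

    tolerations:
    - key: "nvidia.com/gpu"
      operator: "Equal"
      value: "Exists"
      effect: "NoExecute"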
    

