Configure raster analytics—ArcGIS Enterprise on Kubernetes

Raster analytics provides built-in tools and functions for preprocessing, orthorectification and mosaicking, remote sensing analysis, and an extensive range of math and trigonometry operators. Your custom functions can extend your organization's analytical capabilities even further.

Raster analytics is also designed to streamline and simplify collaboration and sharing. Users across your organization can contribute data, processing models, and expertise to an imagery project and share results with individuals, departments, and organizations within your enterprise.

Introduction to raster analytics

In ArcGIS Enterprise on Kubernetes, raster analytics is a flexible raster processing, storage, and sharing system that employs distributed computing and storage technology. Use raster analytics to apply raster analysis tools and raster functions provided in ArcGIS, build custom functions and tools, or combine multiple tools and functions into raster processing chains to run custom algorithms on large collections of raster data. Source data and processed results are stored, published, and shared across your enterprise according to your needs and priorities.

This capability can be expanded further with cloud computing resources. Image processing and analysis jobs that once took days or weeks can now be done in minutes or hours, and jobs that were previously too large to attempt can now be completed.

Tip:

Cloud data storage is a requirement for on-premises and cloud deployments. It is used to store raster analytics outputs.

Confirm that your administrator has allocated sufficient resource quota and worker nodes to support this premium capability.

Enable raster analytics

The following configuration steps may require changes to the way you've deployed ArcGIS in your organization; review them carefully before proceeding. To enable raster analytics as a capability for the organization, complete the following steps:

Ensure that your organization has both of the required raster stores: one cloud store and one relational store.
The cloud store is used to store the raster output of the analysis. When your organization is deployed in a cloud environment, it is recommended to use a cloud storage service from the same cloud provider. The relational store is used for storing mosaic datasets while creating a hosted imagery layer or when the raster analysis generates a collection output. See Manage raster stores for information on how to add a raster store.
In ArcGIS Enterprise Manager, click the Capabilities button in the sidebar.
The capabilities page appears.
Turn on the Raster analytics toggle button.
A message appears indicating that the process to enable may take some time.
Click Enable.
A request to enable raster analytics is submitted. This process will validate prerequisites and activate supporting resources. The following system services will be automatically started:
- RasterAnalysisTools
- RasterProcessing
- RasterProcessingGPU
- RasterRendering

If the capability fails to enable, review the steps above and confirm that both raster stores have been added, the raster analytics license is valid and available, and the system services have started. Review the logs to identify any unmet requirements for this capability.

Raster analytics is now configured. You can begin to use raster analysis tools and host imagery in your organization. Learn how to use raster analysis and deep learning.

Additionally, learn how to tune raster analytics workflows.
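If you have kubectl access to the cluster, you can also check from the command line that pods for the raster services are running. The following is a minimal sketch, not an official procedure. The helper name list_raster_pods is hypothetical, and it assumes that the pod names contain the word raster and that you know the namespace where your organization is deployed; neither is stated in this topic.

```shell
# Hypothetical helper: list pods in a namespace whose names mention "raster".
# Assumption: pod names include the service name (for example RasterProcessing).
# $1 is the namespace where the organization is deployed.
list_raster_pods() {
  kubectl get pods --namespace "$1" -o wide | grep -i 'raster'
}

# Example (replace arcgis with your own namespace):
#   list_raster_pods arcgis
```

Pods that are not in the Running state are a reasonable place to start when reviewing the logs.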

Enable GPU resources for raster analytics

Raster analysis, especially deep learning analysis and AI tools, can use graphics processing unit (GPU) resources in your cluster. You can configure GPU resources to enhance performance for deep learning models and inferencing tools. For deep learning, GPUs provide processing speed, resource efficiency, and scalability for your organization. Because they can handle massively parallel computations, GPUs are indispensable for raster analysis and deep learning. Individual services perform better for both tasks when GPUs are configured for each service.

To enable GPU resources for raster analytics for your organization, work with a cluster administrator to complete the following steps:

Enable GPU for nodes in your Kubernetes cluster.
It is recommended that you create a custom label for your GPU nodes. For example, to label a node named node1 with the key raster and the value GPU, use the following command:
```
kubectl label nodes node1 raster=GPU
```
Sign in to ArcGIS Enterprise Manager as an administrator.
Ensure raster analytics is enabled.
Use pod placement settings to ensure services that require GPU are scheduled to GPU-enabled nodes.
There are two raster analysis services that can leverage GPU nodes for processing data: RasterAnalysisTools and RasterProcessingGPU. It is recommended to configure pod placement for the RasterProcessingGPU service so that it can leverage the GPU. If your workflows involve training models, then it is recommended to also configure pod placement for the RasterAnalysisTools service.
1. Click Services > System services.
2. Click the service name, and then click Pod placement.
3. To apply a node affinity rule that ensures the service's pods are scheduled on GPU nodes, provide the following information in the Node affinity section and click Add:
  - Type—Required
  - Key—Specify the key used for labeling the GPU node, for example raster.
  - Operator—In
  - Value—Specify the value used for labeling the GPU node, for example GPU.
4. To ensure your raster analysis workloads can run on GPU nodes while using taints to exclude other workloads from those nodes, provide the following information in the Tolerations section and click Add:
  - Effect—No Execute
  - Key—nvidia.com/gpu
  - Operator—Exists
5. Click Save and wait for pods to be scheduled.
  Caution:
  It is important to wait for the pods to be scheduled before proceeding. Otherwise the pods can become stuck in a pending state because Kubernetes cannot find suitable nodes matching the GPU request and placement rules.
Enable GPU for each service that should be configured to use GPU.
1. At the top of the service page, click Settings.
2. Turn on Enable GPU.
3. Click Save.
Optionally, taint the GPU nodes to ensure other workloads are not scheduled to GPU-enabled nodes.
```
kubectl taint nodes <your-node-name> nvidia.com/gpu=Exists:NoExecute
```

Tune raster analytics

For optimal performance and scalability with raster analytics, consider the following recommendations:

When creating your organization, use a cloud storage service for the object store.
Increase worker nodes. Premium capabilities such as raster analytics require a minimum of one additional worker node for each architecture profile to support the added capabilities. Work with your administrator to determine which architecture profile was selected when creating the organization and increase worker nodes accordingly. Depending on the analysis that you will perform, you may need more than one additional node.
Use GPU-enabled nodes when possible. Optionally, configure node affinity and tolerations to run GPU workloads exclusively on GPU nodes. This is increasingly important when conducting deep learning and AI workflows.
By modifying service deployments with additional pods and resources, you can increase overall availability and throughput for your workflows as follows:
- For workflows that do not include deep learning and AI, scale the RasterProcessing service deployment.
- When training deep learning models, scale the RasterAnalysisTools service deployment.
- When conducting inferencing workflows, scale the RasterProcessingGPU service deployment.
Some raster analysis tools distribute computations across multiple worker pods and write temporary data while performing analysis. Use ephemeral volumes when performing distributed raster analysis on large jobs that need to load data to a temporary space for processing. See configure ephemeral volumes for details.
When running raster analytics processes, it is recommended that you decrease the log level from Debug. If you're not actively troubleshooting an issue, use the Warning level. If you require finer-grained logging detail, use an alternative log level instead.
Consider setting a parallel processing factor to control the number of raster processing service instances that can be used for processing the data.
Memory requirements may need to be monitored and increased based on the type of analysis workflow and size of the data processed. For example, for deep learning and AI workflows, larger batch sizes require more memory.
Raster analytics tools have preconfigured limits. Some jobs may require additional capacity for the tools to function properly. You can edit the scaling properties to increase memory for different microservices.
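When deciding whether to add worker nodes or memory, it can help to see what each node offers. The following sketch uses standard kubectl commands wrapped in hypothetical helper functions. It assumes that GPUs are advertised under the nvidia.com/gpu resource name (as with the NVIDIA device plugin) and that a metrics server is installed for the usage query; neither is confirmed by this topic.

```shell
# Report allocatable CPU, memory, and GPUs for every node.
# Assumption: GPUs are exposed as the nvidia.com/gpu resource.
# Nodes without GPUs show <none> in the GPU column.
node_capacity() {
  kubectl get nodes -o custom-columns='NODE:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Show current CPU and memory usage per node.
# Assumption: metrics-server (or an equivalent metrics API) is running.
node_usage() {
  kubectl top nodes
}

# Examples:
#   node_capacity
#   node_usage
```

Comparing the two outputs while a large job is running gives a rough indication of whether a workflow is limited by memory, CPU, or GPU availability.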
