Kubernetes concepts and architecture—ArcGIS Enterprise on Kubernetes

To understand the ArcGIS Enterprise on Kubernetes system requirements and prerequisites, familiarize yourself with Kubernetes concepts and architecture. Common terms are described below and used throughout the documentation.

Cluster

A cluster is a set of worker nodes that work together to run containerized applications. A Kubernetes cluster is a prerequisite to deploy ArcGIS Enterprise on Kubernetes.

Node

A node is a virtual or physical machine that has been specified with a minimum set of hardware requirements. The number of nodes required for deployment varies based on the architecture profile selected during deployment. Review system requirements to determine minimum node requirements for deployment.

Namespace

A namespace is used to organize objects in the Kubernetes cluster. A cluster can contain one or more namespaces, each specified with a unique name. Deleting a namespace removes all of its objects. A namespace is a prerequisite to deploy ArcGIS Enterprise on Kubernetes.

Role-based access control (RBAC)

Role-based access control is used to manage permissions for resources within the cluster.

Resource quotas

Resource quotas are used to control resource limits such as CPU and memory within a designated namespace.
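
As a rough illustration of how a quota constrains a namespace, the sketch below checks whether a new request still fits under the namespace's limits. It is a simplified model, not the Kubernetes implementation; the function name, the resource keys, and the numbers (CPU in millicores, memory in MiB) are all hypothetical.

```python
def fits_quota(hard, used, requested):
    """Return True if used + requested stays within the hard limit for every limited resource."""
    return all(
        used.get(resource, 0) + amount <= hard[resource]
        for resource, amount in requested.items()
        if resource in hard  # resources without a quota entry are not limited here
    )


# Hypothetical quota and current usage for one namespace.
hard_limits = {"cpu_millicores": 8000, "memory_mib": 16384}
current_use = {"cpu_millicores": 6000, "memory_mib": 12000}

small_request = {"cpu_millicores": 1000, "memory_mib": 2048}
large_request = {"cpu_millicores": 4000, "memory_mib": 2048}

print(fits_quota(hard_limits, current_use, small_request))
print(fits_quota(hard_limits, current_use, large_request))
```

In this model, the small request is accepted and the large request is rejected because it would push CPU use past the namespace limit.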

Secrets

A secret is used to store and manage passwords, tokens, keys, or other highly sensitive information in the namespace.
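
For illustration, a Kubernetes Secret manifest stores each value base64-encoded under its data field. The sketch below builds such a manifest as a Python dictionary. The secret name, namespace, and credentials are hypothetical placeholders, not values used by ArcGIS Enterprise on Kubernetes.

```python
import base64
import json


def build_secret(name, namespace, values):
    """Return a Secret manifest with each value base64-encoded under `data`."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": encoded,
    }


# Hypothetical names and credentials, for illustration only.
secret = build_secret(
    "example-admin-credentials",
    "example-namespace",
    {"username": "admin", "password": "changeme"},
)
print(json.dumps(secret, indent=2))
```

Note that base64 is an encoding, not encryption, so access to secrets should still be restricted.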

Pod

A pod is the smallest unit in a Kubernetes cluster. Pods are an abstraction over a container or, in some cases, a set of containers. Each pod in a cluster is assigned a unique and internally available IP address.

Deployment

A deployment is a grouping of a designated set of pods, ReplicaSets, and Services that work together to run containerized applications. A deployment manages updates for a set of pods so that they can be scaled and updated with optimized resource limits. In ArcGIS Enterprise on Kubernetes, administrators manage GIS services and apps as deployments, using ArcGIS Enterprise Manager or the ArcGIS Enterprise Administrator API.

ReplicaSet

A deployment uses a ReplicaSet to ensure that a designated number of replicas is running for a given pod. A ReplicaSet creates or deletes pods as needed to reach the desired number of replicas specified through a deployment. ReplicaSets are stateless.
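
The create-or-delete behavior described above can be sketched as a comparison between the current and desired replica counts. This is a minimal model of the idea, not the actual controller logic; the function name is hypothetical.

```python
def plan_reconcile(current_replicas, desired_replicas):
    """Return the action and pod count needed to reach the desired replica count."""
    difference = desired_replicas - current_replicas
    if difference > 0:
        return ("create", difference)
    if difference < 0:
        return ("delete", -difference)
    return ("none", 0)


print(plan_reconcile(2, 3))  # one pod is missing
print(plan_reconcile(5, 3))  # two pods are surplus
print(plan_reconcile(3, 3))  # already at the desired count
```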

StatefulSets

StatefulSets are used to manage pods that have been created with the same set of specifications. They maintain a unique ID for each pod within their respective deployment to ensure pods are unique and assigned appropriately to persistent storage volumes. StatefulSets are stateful.
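
To make the unique ID concrete: in Kubernetes, each pod in a StatefulSet is named with the set name plus an ordinal index, and each claim created from a volume claim template is named from the template name, the set name, and the same ordinal. The sketch below generates those pairs; the set name and template name are hypothetical.

```python
def stateful_identities(set_name, claim_template, replicas):
    """Return (pod name, persistent volume claim name) pairs for each ordinal."""
    return [
        (f"{set_name}-{ordinal}", f"{claim_template}-{set_name}-{ordinal}")
        for ordinal in range(replicas)
    ]


# Hypothetical StatefulSet "example-store" with a claim template named "data".
for pod_name, claim_name in stateful_identities("example-store", "data", 3):
    print(pod_name, "->", claim_name)
```

Because the names are derived from the ordinal, a replacement pod with the same ordinal is matched to the same claim.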

Service

A service is used to group a set of pods and provide access to them as needed. A service uses a single domain name system (DNS) name for the group of pods and directs internal network traffic to a series of pods to manage workload and provide interpod communication. A service is defined by a static IP address that can be attached and recognized by participating pods.

The life cycles of a service and pod are independent, meaning that a pod can terminate and be replaced by a new pod. When this occurs, the service recognizes the new IP address accordingly.

A service ensures that network traffic can be directed to a current set of pods designated for the workload.
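
One way to picture this is through label selectors, which Kubernetes services commonly use to decide which pods receive traffic. The following sketch is a simplified model, with hypothetical pod names, labels, and IP addresses, showing that the set of target addresses follows the pods as they are replaced.

```python
def select_endpoints(selector, pods):
    """Return the IPs of pods whose labels contain every key/value pair in the selector."""
    return [
        pod["ip"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Hypothetical selector and pods.
selector = {"app": "example-map-service"}
pods = [
    {"name": "example-pod-a", "ip": "10.0.0.11", "labels": {"app": "example-map-service"}},
    {"name": "example-pod-b", "ip": "10.0.0.12", "labels": {"app": "example-map-service"}},
    {"name": "other-pod", "ip": "10.0.0.40", "labels": {"app": "something-else"}},
]
before = select_endpoints(selector, pods)

# example-pod-a terminates and a replacement pod arrives with a new IP address.
pods[0] = {"name": "example-pod-c", "ip": "10.0.0.27", "labels": {"app": "example-map-service"}}
after = select_endpoints(selector, pods)

print(before)
print(after)
```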

Ingress

Ingress is used to route external traffic into the cluster. It may also be used to provide additional load balancing to services within the cluster.

Persistent volume

Persistent volumes are resources that are available to the cluster for the purpose of data and system resource storage. Containers and pods in the cluster access persistent volumes using a persistent volume claim. The following information pertains to persistent volumes:

An administrator can connect the Kubernetes cluster to physical storage such as cloud storage or a network file system (NFS).
Persistent volumes are not dependent on a pod's life cycle.
Persistent volumes are not tied to a specific node; they are associated with the entire cluster.
Persistent volumes reside outside of the namespace to provide storage to the entire cluster.

Persistent volumes are defined by an administrator as a series of properties. The required properties vary based on the physical storage type. Persistent volumes can be provisioned statically prior to deployment. They can also be provisioned dynamically during deployment when a storage class is used.

For additional details about persistent volumes, review system requirements.

Storage class

A storage class describes attributes for an available storage type. A storage class is used when persistent volumes are dynamically provisioned. A storage class is defined as a set of properties, and its provisioner attribute identifies the component that creates the volumes. Each physical storage provider includes a provisioner.

Persistent volume claim

A persistent volume claim requests or claims persistent volumes to provide resources for a pod. A pod claims storage through the persistent volume claim, which in turn requests storage through a storage class. Persistent volume claims reside inside the namespace. Persistent storage is used to manage saved data, ranging from configuration files and logs to member content.
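
The relationship between a claim and a storage class can be sketched with two manifests built as Python dictionaries. The field names follow the Kubernetes StorageClass and PersistentVolumeClaim resources; the object names, namespace, provisioner string, and requested size are hypothetical and should be replaced with values from your storage provider and the system requirements.

```python
import json


def build_storage_class(name, provisioner):
    """Return a StorageClass manifest; storage classes are not namespaced."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
    }


def build_claim(name, namespace, storage_class_name, size):
    """Return a PersistentVolumeClaim manifest that requests storage from a storage class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "storageClassName": storage_class_name,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names, provisioner, and size.
storage_class = build_storage_class("example-default", "example.com/hypothetical-provisioner")
claim = build_claim("example-data-claim", "example-namespace", "example-default", "16Gi")
print(json.dumps([storage_class, claim], indent=2))
```

The claim carries a namespace while the storage class does not, which mirrors the scoping described above.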

Pod-to-pod communication

Pods communicate with each other through a service in a virtual network. Individual pods are assigned an IP address, which the service recognizes throughout the life cycle of the pod. When a pod terminates, possibly because an application in its container has failed, a replacement pod and associated IP address are created in its place.

Topology spread constraints

A topologySpreadConstraint is a property that can be applied to pod templates to control how pods are scheduled across a cluster topology. Each unique value assigned to the same label key defines a topology domain. The topologySpreadConstraint ensures even distribution across the topology domains according to a defined maxSkew property. If maxSkew is set to 1, the number of pods in any two topology domains can differ by at most 1 replica. For example, with three replicas and three topology domains, the scheduler would not allow a distribution of 0, 1, and 2 replicas and would schedule 1, 1, and 1 instead.
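
The maxSkew example above can be worked through with a small model. The sketch assumes the skew for a candidate domain is its pod count after placement minus the lowest current count across domains, and that placements exceeding maxSkew are refused. The function names and zone names are hypothetical, and choosing the least-loaded allowed domain is a simplification of how the real scheduler ranks candidates.

```python
def allowed_domains(counts, max_skew):
    """Return the domains where one more pod keeps the skew within max_skew."""
    lowest = min(counts.values())
    return [
        domain
        for domain, count in counts.items()
        if (count + 1) - lowest <= max_skew
    ]


def schedule(domains, replicas, max_skew):
    """Place replicas one at a time, always in the least-loaded allowed domain."""
    counts = {domain: 0 for domain in domains}
    for _ in range(replicas):
        candidates = allowed_domains(counts, max_skew)
        target = min(candidates, key=lambda domain: counts[domain])
        counts[target] += 1
    return counts


# Hypothetical topology domains.
zones = ["zone-a", "zone-b", "zone-c"]
print(schedule(zones, 3, 1))

# With counts of 0, 1, and 1, only the empty domain may take the next pod.
print(allowed_domains({"zone-a": 0, "zone-b": 1, "zone-c": 1}, 1))
```

Under these assumptions, three replicas end up spread one per domain, and a distribution of 0, 1, and 2 cannot be reached.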
