Data and publishing in ArcGIS Enterprise—ArcGIS Enterprise on Kubernetes

Data storage and management is a vital aspect of your ArcGIS Enterprise deployment. It shapes how your organization accesses, manages, contributes, and edits data, and it provides the foundation for how your data can be used. ArcGIS Enterprise allows you to store source data for your web services and layers in user-managed data storage locations or data storage that is managed by ArcGIS.

User-managed storage locations are the data sources that you or others in your organization manage, such as databases, folders, or cloud storage locations. Conversely, when you use data storage locations managed by ArcGIS, you do not manage or access the underlying databases. In a single ArcGIS Enterprise deployment, you will likely use both types of data storage; you don't have to choose just one or the other.

Tip:

For more information on these terms and concepts, see the Data in ArcGIS: User Managed and ArcGIS Managed technical paper.

When you publish a web map, layer, or service to ArcGIS Enterprise, you determine how your data will be managed. The first step in this process is to decide if you'll copy data or register your data with one (or more) of the federated servers that comprise your ArcGIS Enterprise portal.

Register or copy the data

When you publish from ArcGIS Pro, you determine the location of the data used by your web layers. For most data sources, you can either register your data source, in which case the web layers access the data in the data source, or have ArcGIS copy the data to a location managed by ArcGIS, which can be a system-managed data store or a federated server. If your data source is a cloud data warehouse, you always register the data source, but you can create a snapshot of the data when you publish. The snapshot copies the subset of data included in the query layer that accesses the cloud data warehouse and places it in the system-managed data store for the web service to access. You can refresh the contents of this snapshot from the web layer's item page in the portal. Doing so overwrites the data in the system-managed data store with data from the registered data source.

You can also add files to your organization and publish from the ArcGIS Enterprise portal. In this case, the data is always copied to one of the system-managed data stores.

Register data stores

When you add a user-managed data store and publish web layers, the web layers reference the data in the data source. If the data in the registered data source changes, you will see those changes in the web layer. The only exception to this is when you create a snapshot for data published from a cloud data warehouse.

The following are cases when registering data is recommended or required:

You have multiple clients accessing and updating the source data.
If you have apps directly editing the source data, apps editing the source data through services, or conversion or ETL processes that load data from contractors to your source, publish map or feature layers that reference the data source. That way, the people who use the layers can see changes to the data as they are made in the source.
You use versioned data from an enterprise geodatabase.
If you publish from a map that contains versioned enterprise geodatabase data and you copy the data, the copied data no longer participates in the version. Edits made through the published feature layer cannot take advantage of multiuser editing functionality.
You use archive-enabled data from an enterprise geodatabase.
Data owners enable archiving so they can see changes in the data over time. If you copy data from the source when you publish a feature layer, it's no longer part of the archive and you cannot see the changes made to the data after it's copied.
You have large feature classes or feature classes with complex geometries.
The greater the number of features and the more complex the shapes, the longer it takes to copy the data. Examples of complex shapes include polygons or lines with thousands of vertices, such as coast lines or meandering rivers.
To save resources in the system-managed data store, you created a raster tile cache, vector tile cache, scene cache, or 3D tiles dataset and stored it in a folder or cloud data store that you control, and you will publish a tile layer, vector tile layer, scene layer, or 3D tiles layer that references the appropriate cache.
You're working with data or file types that can only be published from a registered data source.
If you publish the following types of data from ArcGIS Pro, you must register your data source with a federated server and publish to it:
- Geoprocessing scripts or models
- Dynamic maps
- KML
You don't want ArcGIS to clean up the data when you delete the web layer.
Data that is copied to ArcGIS Enterprise is automatically deleted by the system when the service or portal item associated with the data is deleted. If users only interact with the data through the web layer (in other words, the web layer essentially is the data), you want the data and web layer to be deleted at the same time. However, if the service or portal item is only one way that people access the data, you need the data to remain in the data source. In that case, register the data source with a federated server and publish.
Tip:
If you copy the data when you publish a feature layer and later decide you need to keep the data, export the data from the portal item, move the data into a database or enterprise geodatabase you register with a federated server, and publish.
Your database connection references a cloud data warehouse.

Copy data

Copying your data is like taking a snapshot of your source data at the time you publish. Unlike items created from registered data, items created from copied data do not receive dynamic updates from the data source as it changes. If you don't need your web layer to access the source data, copying the data when you publish is a suitable workflow.

The following are cases when you may prefer to copy data:

You're loading a file to the portal to publish from it.
Users outside your firewall need access to the data.
You and other users will only access the data through the web layer.
You're using an app or functionality that requires hosted layers.
You're working with a type of data that requires you to copy the data when you publish.
You're publishing from data in a cloud data warehouse, but your ArcGIS Enterprise portal is not in the cloud. In this case, making a snapshot of the data may improve performance when querying the web layer.

Copied data can be either user-managed or managed by ArcGIS.
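
The register-or-copy guidance above can be summarized as a small decision helper. The following is a minimal sketch for illustration only: the PublishScenario class, its field names, and the recommend_storage function are hypothetical and are not part of any ArcGIS API. It only encodes the cases listed in this topic and assumes that uploading a file to the portal takes precedence over every other consideration.

```python
from dataclasses import dataclass


@dataclass
class PublishScenario:
    """Hypothetical flags describing a dataset; names are illustrative only."""
    uploaded_file_in_portal: bool = False        # file added to the portal
    cloud_data_warehouse: bool = False           # connection to a cloud data warehouse
    multiple_clients_edit_source: bool = False   # apps or ETL processes change the source
    versioned_or_archived: bool = False          # enterprise geodatabase versioning or archiving
    large_or_complex_features: bool = False      # many features or many vertices
    references_prebuilt_cache: bool = False      # cache stored in a folder or cloud store
    register_only_type: bool = False             # geoprocessing script/model, dynamic map, KML
    keep_data_after_layer_deleted: bool = False  # data must outlive the web layer


def recommend_storage(scenario: PublishScenario) -> str:
    """Return "register" or "copy" based on the cases listed in this topic."""
    # Files published from the portal are always copied to a system-managed data store.
    if scenario.uploaded_file_in_portal:
        return "copy"
    # A cloud data warehouse is always registered (a snapshot is optional).
    if scenario.cloud_data_warehouse:
        return "register"
    # Any one of the remaining cases is enough to favor registering the data.
    reasons_to_register = (
        scenario.multiple_clients_edit_source,
        scenario.versioned_or_archived,
        scenario.large_or_complex_features,
        scenario.references_prebuilt_cache,
        scenario.register_only_type,
        scenario.keep_data_after_layer_deleted,
    )
    return "register" if any(reasons_to_register) else "copy"


if __name__ == "__main__":
    print(recommend_storage(PublishScenario(versioned_or_archived=True)))
```

A real decision usually weighs several of these factors together, so treat the result as a starting point rather than a rule.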

Data managed by ArcGIS

Data managed by ArcGIS is the hosted data in your organization. The services built from hosted data always reside in the organization.

Many common workflows and their subsequent outputs in ArcGIS Enterprise depend on the ability to create hosted layers. Hosted layers are created not only when you upload a dataset and explicitly publish it as a new layer, but also as the output of many actions in ArcGIS Enterprise, such as running analysis tools, and as part of distributed collaboration workflows in which feature layers are copied.

Where to publish

Members of your organization can publish from files in the portal, from ArcGIS Pro, from data store items, or using ArcGIS API for Python.

Publish from files in the portal

You can publish hosted web layers from certain files you add to your organization.

When you publish from files in the portal, the services for the resultant layers always run on nodes in your portal.

The following table lists the files you can upload and the hosted web layers you publish from them:


Files	Type of layer
CSV file, Microsoft Excel file, GeoJSON file, zipped shapefile, zipped file geodatabase	Hosted feature layer
Tile package (.tpkx), service definition (.sd) file, or vector tile package (.vtpk)	Hosted tile layer
Scene layer package (.slpk)	Hosted scene layer
3D tiles package (.3tz)	Hosted 3D tiles layer

For information on publishing each type of layer, see Publish hosted feature layers, Publish hosted tile layers, Publish hosted vector tile layers, Publish hosted scene layers, and Publish hosted 3D tiles layers.
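
As an illustration, the table above can be expressed as a lookup that a script might use to check an upload before adding it to the portal. This is a minimal sketch, not an ArcGIS API: the dictionary and function names are hypothetical, and the .csv, .xlsx, .geojson, and .zip extensions are assumptions, because the table names those file types without giving extensions.

```python
from pathlib import Path
from typing import Optional

# Mirrors the table above. Extensions marked "assumed" are not stated in the table.
HOSTED_LAYER_BY_EXTENSION = {
    ".csv": "Hosted feature layer",      # assumed extension for CSV files
    ".xlsx": "Hosted feature layer",     # assumed extension for Microsoft Excel files
    ".geojson": "Hosted feature layer",  # assumed extension for GeoJSON files
    ".zip": "Hosted feature layer",      # assumed: zipped shapefile or file geodatabase
    ".tpkx": "Hosted tile layer",
    ".sd": "Hosted tile layer",
    ".vtpk": "Hosted tile layer",
    ".slpk": "Hosted scene layer",
    ".3tz": "Hosted 3D tiles layer",
}


def hosted_layer_type(filename: str) -> Optional[str]:
    """Return the hosted layer type for a file name, or None if it is not listed."""
    extension = Path(filename).suffix.lower()
    return HOSTED_LAYER_BY_EXTENSION.get(extension)


if __name__ == "__main__":
    for name in ("parcels.csv", "city.3tz", "notes.txt"):
        print(name, "->", hosted_layer_type(name))
```

Only the last extension of the file name is inspected, so a name such as roads.gdb.zip is treated as a .zip file.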

Publish from ArcGIS Pro

When you add layers to maps and scenes in ArcGIS Pro, you can share the layers as web layers. Depending on what type of layer you create, the layer's data is copied to an ArcGIS managed data store or it remains in your registered data source. When you choose to keep the data in your registered data source, you also choose the server on which the service will run.

Copy all data

When you choose an option under Copy all data while publishing from ArcGIS Pro, the resultant web layer uses a copy of the data rather than the source data in your map or scene. Certain web layers require you to copy all data. They include the following:

Vector tile layers
Vector tile layers are shared (published) from point, line, polygon, or multipoint feature layers in your map. Layer data is cached locally. The service runs in your portal, and a tile layer item is created.
See Author a map for vector tile creation in the ArcGIS Pro help for information about how to create a map that meets the requirements for publishing a vector tile layer.
Tile layers
Tile layers are published from maps in ArcGIS Pro. Publishing a tile layer creates a cached map service on nodes in the portal and a tile layer item in your organization. The tile caches are stored in the portal, just as they are when you publish a tile package or service definition file in the portal. See Author a web map in the ArcGIS Pro help for information about publishing maps and layers as tile layers.
Scene layers
When you share a LAS scene layer or create a scene layer package in ArcGIS Pro, the scene service you publish runs on nodes in the portal and the cache is stored in an ArcGIS managed data store.

When you publish a feature layer in ArcGIS Pro, you have a choice of where to store the data. If you choose Copy all data when publishing, data is copied to the ArcGIS managed data store.

Copy data when you publish web tools

You can share a geoprocessing script or model from ArcGIS Pro to your organization. When you do this, you decide whether to copy the data used in the script or model to the server—which creates a static copy of the data the service uses—or create a reference that the service can access.

Copying data is a suitable way to publish a web tool to the hosting server when the data is small. When the data is large, copying it can take a long time and is not recommended.

Reference registered data

If you want your web layers to reference your source data, you must register your data source with a federated server where you want the service to run or register the source data with your organization. This ensures that the service can access the data. See Manage registered data stores in the ArcGIS Pro help for more information.

When you publish feature layers from a registered database, the data remains in the source database or enterprise geodatabase, and a feature layer item and map image layer item are created in your organization. A map service with feature access enabled is also created on the server you selected when you published. If the database connection accesses a cloud data warehouse, the data remains in the source, but only a feature layer item is created in your organization.

To publish a map image layer from a map in ArcGIS Pro, you must register the data source (or sources) with the federated server to which you publish or with your organization. All the data in the map stays in your registered data source, a map service is created on the federated server or nodes in the portal, and a map image layer is created in the organization.

For more information on publishing maps and features to federated servers, see Layers published to your portal's federated servers.

Publish web tools using referenced data

When you share a geoprocessing script or model from ArcGIS Pro, you can choose to reference registered data rather than copy all data. If you want the tool to reference the data, the data must be in an enterprise geodatabase or a folder that you registered with the federated server to which you publish or with your organization if you publish to the portal.

Publish from data store items in the portal

When you add a database data store item to the portal, you can bulk publish feature layers and map image layers that reference the data in a relational database or enterprise geodatabase accessed through the data store item.

You can share folder and cloud storage data store items that contain pre-created caches with others so they can publish tile, vector tile, 3D tiles, or scene layers that reference the caches in the data stores.

Publish using ArcGIS API for Python

You can use the Item class in the GIS module of ArcGIS API for Python to publish items to your ArcGIS Enterprise portal using Python scripts and notebooks. See the ArcGIS API for Python sample notebooks for content publishers for scenario-based examples.
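
The following is a minimal sketch of that workflow, assuming the arcgis package is installed and that your version provides ContentManager.add and Item.publish (newer releases may prefer adding items through a folder object instead). The helper names publish_uploaded_file and connect are hypothetical, and the portal URL, credentials, and file name in the usage note are placeholders.

```python
def connect(url, username, password):
    """Return a GIS connection to the portal (requires the arcgis package)."""
    # Imported lazily so the rest of the sketch can be read without arcgis installed.
    from arcgis.gis import GIS
    return GIS(url, username, password)


def publish_uploaded_file(gis, path, title, item_type):
    """Add a file to the portal, then publish it as a hosted web layer.

    gis is expected to behave like arcgis.gis.GIS. The file is copied to the
    portal, so the published layer does not reference the original file.
    """
    # Step 1: add the file as an item in the organization.
    item = gis.content.add({"title": title, "type": item_type}, data=path)
    # Step 2: publish the item; the result is the new hosted layer item.
    return item.publish()
```

A call might look like the following, with every value replaced by your own:

    gis = connect("https://portal.example.com/arcgis", "publisher", "password")
    layer_item = publish_uploaded_file(gis, "parcels.csv", "Parcels", "CSV")

Because the file is uploaded first, this corresponds to the copy workflow described earlier in this topic, not to publishing from a registered data source.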
