Automate workflows with ArcGIS Notebooks

Notebook services supports several methods of workflow automation: scheduled notebook tasks, webhooks, and custom scripts that call the Execute Notebook operation in the administrative API.

Note:

To automate a notebook using scheduled tasks, webhooks, or the Run Notebook API, the notebook must use a runtime of version 3.0 or later.

Schedule notebook tasks

Notebook authors can schedule ArcGIS Notebooks for automated running at a fixed time in the future, either once or on a recurring basis. Creating tasks to schedule notebooks allows you to automate routine workflows, run data-intensive processes during off-peak usage hours, and regularly update datasets. For example, you can schedule a notebook to do the following:

  • Import data from an online source that updates monthly, automatically clean the data and apply necessary transformations, and move the data to your workspace
  • Run a big data analysis workflow that requires heavy processing power overnight, when your machine resources are otherwise unused
  • Compile a list of users who have created accounts in your organization during the past week and send the list to you in an email

You can create one or more tasks for a notebook. By default, the notebook author or administrator can create a maximum of 20 tasks. If the ownership of a notebook is changed, any tasks associated with that notebook will be deactivated and assigned to the new owner.

Scheduled tasks allow you to parameterize notebooks. A parameterized notebook contains generic code that adapts to varying inputs without your interaction. The parameters you choose are inserted into the notebook when a task runs and can optionally be saved to the notebook. For example, a parameterized notebook can generate region-wide air pollution reports on a recurring basis. The notebook can have multiple scheduled tasks, one for each region to be studied, and each task can feed its own parameterized input, such as city name and pollution type, into the notebook.
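The air pollution example can be sketched as follows. This is a minimal illustration, not the product's injection mechanism: the variable names city_name and pollution_type and the helper build_report_title are hypothetical, and the default values stand in for whatever a scheduled task would supply.

```python
# Default parameter values (hypothetical names). A scheduled task would
# supply its own values for these variables when it runs the notebook.
city_name = "Springfield"
pollution_type = "PM2.5"


def build_report_title(city: str, pollutant: str) -> str:
    """Return a report title built only from the supplied parameters."""
    return f"Air pollution report: {pollutant} in {city}"


# The rest of the notebook uses the parameters rather than hard-coded values.
report_title = build_report_title(city_name, pollution_type)
print(report_title)
```

Keeping every region-specific value in one place means the same notebook can serve each task unchanged.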

Note:

Administrators can view details, edit, pause and resume, or delete each active notebook task in the site from the Manage tasks window.

You can configure the task to save the state of the notebook to the original notebook item after completion.

A static HTML view of the notebook will be saved for each scheduled task that is run.

Using the Manage tasks window from the Notebook services home page, administrators can view details, edit, pause and resume, or delete each active notebook task in the site. Administrators and notebook authors with the schedule notebook privilege can view details, edit, pause and resume, or delete a notebook task in the details page of the notebook or in the task pane of the notebook editor.

If a previous run of a task is still running, a new scheduled task run will be skipped. For example, if a task is scheduled to run every 15 minutes, but an instance of that task runs for 20 minutes, the next scheduled run will be skipped. If this occurs regularly, the task owner should adjust the scheduled time interval so that there is no overlap between runs.
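The skipping behavior described above can be modeled with a short simulation. This is a simplified sketch that assumes every run takes the same amount of time; the function name simulate_schedule is hypothetical and is not part of Notebook services.

```python
def simulate_schedule(interval_min: int, run_duration_min: int, total_min: int):
    """Return (started, skipped) lists of trigger times in minutes.

    A trigger is skipped when the previous run has not finished yet.
    """
    started, skipped = [], []
    busy_until = 0  # minute at which the current run finishes
    for trigger in range(0, total_min, interval_min):
        if trigger < busy_until:
            skipped.append(trigger)
        else:
            started.append(trigger)
            busy_until = trigger + run_duration_min
    return started, skipped


# A task scheduled every 15 minutes whose runs take 20 minutes.
started, skipped = simulate_schedule(interval_min=15, run_duration_min=20, total_min=90)
print("started:", started)  # started: [0, 30, 60]
print("skipped:", skipped)  # skipped: [15, 45, 75]
```

In this model every second trigger is lost, so widening the interval to 30 minutes would remove the overlap.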

For more information on scheduled notebook tasks, see the Schedule a notebook task help topic.

Scheduled task limits

There are certain user and organizational limits related to scheduled notebook tasks.

Change maximum concurrent automated notebook runs (maxAutomatedNotebookJobsPerManager)

When a notebook is run by a scheduled task, webhook, or the Execute Notebook API, Notebook services automatically opens a new deployment and runs the notebook without user interaction. By default, Notebook services allows a maximum of 10 concurrent notebook runs for each notebook automation service deployment. Depending on the resources available on the Kubernetes cluster, an administrator can adjust this limit by modifying the maxAutomatedNotebookJobsPerManager configuration property of Notebook services. Any automated notebook request submitted after this limit is reached is added to a queue and runs once the number of automated notebook runs falls below the limit. Queued tasks fail if their wait time exceeds the timeout period.

Note:

This does not limit the number of notebooks that are run interactively from the notebook editor.
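The queueing of automated runs can be pictured with the following sketch. It only illustrates the idea of a concurrency limit with a first-in, first-out queue. The function names are hypothetical, the first-in, first-out order is an assumption, and the wait-time timeout is not modeled.

```python
from collections import deque


def admit_requests(request_ids, max_concurrent=10):
    """Split incoming requests into those started at once and those queued."""
    running, queued = [], deque()
    for request_id in request_ids:
        if len(running) < max_concurrent:
            running.append(request_id)
        else:
            queued.append(request_id)
    return running, queued


def on_run_finished(finished_id, running, queued):
    """Free a slot and start the oldest queued request, if any."""
    running.remove(finished_id)
    if queued:
        running.append(queued.popleft())


# Twelve automated requests arrive while the limit is 10.
running, queued = admit_requests(range(1, 13), max_concurrent=10)
print(len(running), "running;", list(queued), "queued")  # 10 running; [11, 12] queued

# When run 3 finishes, request 11 leaves the queue and starts.
on_run_finished(3, running, queued)
print(len(running), "running;", list(queued), "queued")  # 10 running; [12] queued
```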

You can change the maxAutomatedNotebookJobsPerManager limit by following the steps below.

Note:

Increasing the limit can result in additional resources used on the nodes.

  1. Sign in to your ArcGIS Enterprise Administrator API as an administrator.
  2. Click Notebooks > Configuration > Update Configuration.
  3. Click Settings > Site.
  4. Change the value for the maxAutomatedNotebookJobsPerManager property.
  5. Click Update Notebook Configuration.

Maximum active scheduled notebook tasks per user

Each notebook author with the privilege to schedule notebooks can create up to a maximum of 20 active notebook tasks. Once this limit is reached, the user cannot create new scheduled tasks. A new task can be created once an existing task changes from Active to Complete, Failed, or Inactive. This limit can be changed by updating the ExecuteNotebooksUserLimit property using the Update limits operation in the ArcGIS Enterprise Administrator API.

Maximum active scheduled notebook tasks per organization

The maximum number of active scheduled notebook tasks for an organization is limited to 200. This limit represents the total number of active tasks that can be owned by all users across an organization. Once this limit is reached, users cannot create new scheduled notebook tasks. This limit can be changed by updating the ExecuteNotebooksOrgLimit property using the Update limits operation in the ArcGIS Enterprise Administrator API.

Number of results reported for a scheduled task

The results of a task are reported and maintained for 30 runs. Any task runs prior to the most recent 30 runs of a task are permanently deleted. This limit can be changed by updating the TaskRunHistoryCount property using the Update limits operation in the ArcGIS Enterprise Administrator API.
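The retention rule behaves like a fixed-size buffer that keeps only the most recent results. The sketch below uses Python's collections.deque to illustrate this. It is an analogy, not a description of how Notebook services stores run results, and the constant name is hypothetical.

```python
from collections import deque

TASK_RUN_HISTORY_COUNT = 30  # default number of retained runs stated above

# A deque with maxlen discards the oldest entry when a new one is appended.
history = deque(maxlen=TASK_RUN_HISTORY_COUNT)
for run_number in range(1, 36):  # record 35 runs
    history.append(run_number)

# Runs 1 through 5 have been dropped; runs 6 through 35 remain.
print(len(history), history[0], history[-1])  # 30 6 35
```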

Automatic failure of a task

Any task that fails five consecutive times is automatically switched to a failed state and no longer runs. To resume the task, the task owner must identify and rectify the cause of the failure, confirm that the notebook runs successfully without any user interaction, and then change the task back to the Active state. This limit can be changed by updating the FailedRunsDisableTask property using the Update limits operation in the ArcGIS Enterprise Administrator API.
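The consecutive-failure rule can be expressed as a small function. This is a sketch of the logic only, assuming that a successful run resets the failure count; the function name task_state_after is hypothetical.

```python
def task_state_after(run_results, failed_runs_limit=5):
    """Return "Failed" once the limit of consecutive failures is hit, else "Active".

    run_results is an iterable of booleans, True for a successful run.
    """
    consecutive_failures = 0
    for succeeded in run_results:
        if succeeded:
            consecutive_failures = 0  # assumption: a success resets the count
        else:
            consecutive_failures += 1
            if consecutive_failures >= failed_runs_limit:
                return "Failed"
    return "Active"


# Four failures, one success, then four more failures: never five in a row.
print(task_state_after([False] * 4 + [True] + [False] * 4))  # Active
# Five failures in a row switch the task to the failed state.
print(task_state_after([True] + [False] * 5))  # Failed
```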

Run Notebook API

Administrators and notebook authors can also automate a notebook to run without user interaction by using the Execute Notebook operation in the ArcGIS Enterprise Administrator API. The operation runs a notebook immediately when called. To run a notebook at a set time or on a recurring interval, call the operation from a custom script and schedule that script with a tool such as cron or Windows Task Scheduler.
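A custom script of this kind might look like the sketch below, which uses only the Python standard library. Everything specific in it is an assumption rather than something this topic confirms: the base URL, the /notebooks/execute path, and the parameter names itemId and notebookParameters are placeholders to check against the Execute Notebook operation topic, and obtaining the token is not shown.

```python
import json
import urllib.parse
import urllib.request

# Placeholders and assumptions; verify against the Execute Notebook operation topic.
ADMIN_URL = "https://organization.example.com/context/admin"  # hypothetical site
EXECUTE_PATH = "/notebooks/execute"  # assumed path of the operation


def build_execute_request(item_id, token, parameters=None):
    """Build (but do not send) a POST request for the Execute Notebook operation."""
    form = {
        "itemId": item_id,  # assumed parameter name for the notebook item
        "token": token,
        "f": "json",
    }
    if parameters:
        # Assumed parameter name; values are passed as a JSON string.
        form["notebookParameters"] = json.dumps(parameters)
    data = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(ADMIN_URL + EXECUTE_PATH, data=data, method="POST")


def run_notebook(item_id, token, parameters=None):
    """Send the request and return the decoded JSON response."""
    request = build_execute_request(item_id, token, parameters)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


# Inspect the request without contacting a server.
request = build_execute_request("<notebook item ID>", "<token>")
print(request.get_method(), request.full_url)
```

Saved as a script that calls run_notebook, this could be run by a cron entry or a Windows Task Scheduler task at the interval you need.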

To learn more, see the Execute Notebook operation topic in the Administrator Directory reference guide.