Note:
This tool is now available in Map Viewer, the modern map-making tool in ArcGIS Enterprise. To learn more, see Find Hot Spots (Map Viewer).
The Find Hot Spots tool will determine if there is any statistically significant clustering in the spatial pattern of your data.
Workflow diagram
Examples
A city's police department is conducting an analysis to determine if there is a relationship between violent crimes and unemployment rates. An expanded summer job program will be implemented for high schools in areas where there is high violent crime and high unemployment. Find Hot Spots will be used to find areas with statistically significant crime and unemployment hot spots.
A political strategist wants to know which regions showed the strongest or weakest support for a particular political party in the last election. This information could be helpful in guiding campaign strategies for future elections. The strategist subtracts the proportion of Democrat votes from the proportion of Republican votes and uses Find Hot Spots to find the hot and cold spots in the differences. The hot spots (red) will denote strong Republican support, while the cold spots (blue) will denote strong Democrat support.
A conservation officer is studying disease in trees to prioritize which areas of the forest should receive treatment and to learn more about areas that are showing some resistance. The Find Hot Spots tool can be used to find clusters of diseased (hot spots) and healthy (cold spots) trees.
Usage notes
The input features can be points or areas.
The Find clusters of high and low parameter is used to evaluate the spatial arrangement of features. If your features are areas, a field must be chosen. Clustering will be determined using the numbers in the chosen field. Point features can be analyzed using a field or the Point Counts option. If Point Counts is used, the tool will determine if the points themselves are clustered, rather than clusters of high and low field values.
If points are being analyzed with Point Counts, two additional options will be available. The Count points within parameter allows the points to be aggregated within a Fishnet Grid, Hexagon Grid, or an area layer from the Contents pane, such as counties or ZIP Codes. The Define where points are possible parameter is used to create an area or multiple areas of interest. The three options for this parameter are None, meaning all points are used, an area defined by an area layer from the Contents pane, and areas created using the Draw tool.
Your data can be normalized using the Divide by parameter. The Esri Population data uses GeoEnrichment and requires the use of credits. Another option is to normalize using a field from the input layer (available when the Find clusters of high and low parameter is set to a field, rather than Point Counts). Values that can be used for normalization include number of households or area.
Note:
Esri Population data is not available for the Divide by parameter when your organization has a custom GeoEnrichment service configured.
The Options drop-down menu can be used to set a specific Cell Size value or Distance Band value for your analysis.
The output layer includes additional fields containing information such as the statistical significance of each feature, the p-value, and the z-score. The output layer also contains information about the statistical analysis in the Description section of its Item Details page.
How Find Hot Spots works
Even random spatial patterns exhibit some degree of clustering. In addition, our eyes and brains naturally try to find patterns even when none exist. Consequently, it can be difficult to know if the patterns in your data are the result of spatial processes at work or just the result of random chance. This is why researchers and analysts use statistical methods such as Find Hot Spots (Getis-Ord Gi*) to quantify spatial patterns.
The Find Hot Spots tool calculates the Getis-Ord Gi* (pronounced G-i-star) statistic for each feature in a dataset. The resultant z-scores and p-values tell you where features with either high or low values cluster spatially. The Find Hot Spots tool calculates optimal defaults based on the characteristics of the input data and automatically applies a False Discovery Rate (FDR) correction. Each feature is analyzed in the context of neighboring features. A feature with a high value is interesting but may not be a statistically significant hot spot. To be a statistically significant hot spot, a feature will have a high value and be surrounded by other features with high values. The local sum for a feature and its neighbors is compared proportionally to the sum of all features; when the local sum is very different from the expected local sum, and when that difference is too large to be the result of random chance, a statistically significant z-score results.
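The comparison of a local sum to its expected value can be sketched in a few lines of Python. This is a minimal illustration of the published Getis-Ord Gi* formula, not the tool's implementation: the function name `gi_star_z_scores`, the binary fixed-distance weights, and the `distance_band` argument are assumptions made for the example, and the FDR correction is not applied.

```python
import math


def gi_star_z_scores(values, coords, distance_band):
    """Return a Getis-Ord Gi* z-score for each feature.

    Assumption for this sketch: binary weights, where feature j is a
    neighbor of feature i (weight 1) when it lies within distance_band
    of i. Each feature counts as its own neighbor.
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation of the analysis values (population form).
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if s == 0:
        raise ValueError("Gi* is undefined when all values are identical")

    z_scores = []
    for xi, yi in coords:
        weights = [
            1.0 if math.hypot(xi - xj, yi - yj) <= distance_band else 0.0
            for xj, yj in coords
        ]
        w_sum = sum(weights)
        w_sq_sum = sum(w * w for w in weights)
        # Local sum for the feature and its neighbors.
        local_sum = sum(w * v for w, v in zip(weights, values))
        # Local sum expected if values were spread evenly over all features.
        expected = mean * w_sum
        denom = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        # denom is 0 when every feature is a neighbor; report 0 in that case.
        z_scores.append((local_sum - expected) / denom if denom > 0 else 0.0)
    return z_scores


# A 5 x 5 grid of features with a block of high values in one corner.
coords = [(x, y) for x in range(5) for y in range(5)]
values = [10 if x < 2 and y < 2 else 1 for x, y in coords]
z = gi_star_z_scores(values, coords, distance_band=1.5)
```

In this example the feature at (0, 0) has a high value and is surrounded by high values, so its z-score is strongly positive, while the feature at (4, 4) sits among low values and receives a negative z-score.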
When you do find statistically significant clustering in your data, you have valuable information. Knowing where and when clustering occurs can provide important clues about the processes promoting the patterns you're seeing. Knowing that residential burglaries, for example, are consistently higher in particular neighborhoods is vital information if you need to design effective prevention strategies, allocate scarce police resources, initiate neighborhood watch programs, authorize in-depth criminal investigations, or identify potential suspects.
Analyze area features
Data is available for area features such as census tracts, counties, voter districts, hospital regions, parcels, park and recreation boundaries, watersheds, land cover classifications, and climate zones. When your analysis layer contains area features, you must specify a numeric field that will be used to find clusters of high and low values. This field can represent the following:
- Counts (such as the number of households)
- Rates (such as the proportion of the population holding a college degree)
- Averages (such as the mean or median household income)
- Indices (such as a score indicating whether household spending on sporting goods is above or below the national average)
With the field you provide, the Find Hot Spots tool will create a map (the result layer) showing the areas with statistically significant clusters of high values (hot spots: red) and low values (cold spots: blue).
Analyze point features
A variety of data is available as point features. Examples of features most often represented as points include crime incidents, schools, hospitals, emergency call events, traffic accidents, water wells, trees, and boats. Sometimes you will be interested in analyzing data values (a field) associated with each point feature. In other cases, you will only be interested in evaluating the clustering of the points. The decision on whether to provide a field will depend on the question you are asking.
Find clusters of high and low values associated with point features
Provide an analysis field to answer questions such as Where do high and low values cluster? The field you select can represent the following:
- Counts (such as the number of traffic accidents at street intersections)
- Rates (such as city unemployment, where each city is represented as a point feature)
- Averages (such as the mean math test score among schools)
- Indices (such as a consumer satisfaction score for car dealerships across the county)
Find clusters of high and low point counts
For some point data, typically when each point represents an event, incident, or indication of presence or absence, there won't be an obvious analysis field to use. In these cases, you can find where clustering is unusually intense or sparse (that is, statistically significant). For this analysis, area features (a fishnet grid that the tool creates, or an area layer that you provide) are placed over the points and the number of points that fall within each area is counted. The tool then finds clusters of high and low point counts associated with each area feature.
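As a rough illustration of the counting step, the following Python sketch assigns each point to a square fishnet cell and tallies the points per cell. The function name `count_points_in_fishnet`, the `cell_size` and `origin` arguments, and the `(column, row)` cell keys are hypothetical choices for this example; the tool chooses its own grid.

```python
import math
from collections import Counter


def count_points_in_fishnet(points, cell_size, origin=(0.0, 0.0)):
    """Count points per square fishnet cell.

    Returns a dict mapping (column, row) cell indexes to point counts.
    Only cells that contain at least one point appear in the result.
    """
    ox, oy = origin
    counts = Counter()
    for x, y in points:
        col = math.floor((x - ox) / cell_size)
        row = math.floor((y - oy) / cell_size)
        counts[(col, row)] += 1
    return dict(counts)


points = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (2.1, 2.9)]
cell_counts = count_points_in_fishnet(points, cell_size=1.0)
```

The resulting counts play the role of the analysis field: clusters of high and low counts are then evaluated for each cell.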
Define where points are possible
Specify an area layer, or draw areas, to define a study area that covers all locations where the incident point features could possibly occur. For this option, the Find Hot Spots tool will overlay your defined study area with a fishnet grid and count the points falling within each fishnet square. When you do not indicate where incident points are possible using this option, the Find Hot Spots tool will only analyze fishnet squares that contain at least one point. When you use this option to define where points are possible, however, the analysis will be done for all fishnet squares that fall within the bounding areas you define.
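The difference between the two behaviors can be sketched as follows. This is an illustrative Python example only: `cells_for_analysis` and `study_area_cells` are hypothetical names, and cells are represented as simple `(column, row)` keys.

```python
def cells_for_analysis(counts, study_area_cells=None):
    """Select the fishnet cells that take part in the analysis.

    counts: dict mapping a cell key to the number of points in that cell.
    study_area_cells: optional iterable of cell keys inside the area where
    points are possible (a hypothetical input for this sketch).
    """
    if study_area_cells is None:
        # No study area defined: only cells containing points are analyzed.
        return {cell: n for cell, n in counts.items() if n > 0}
    # Study area defined: every cell inside it is analyzed, and cells
    # without points contribute a count of zero.
    return {cell: counts.get(cell, 0) for cell in study_area_cells}


counts = {(0, 0): 3, (1, 0): 1}
study_area = [(0, 0), (1, 0), (0, 1), (1, 1)]

without_area = cells_for_analysis(counts)
with_area = cells_for_analysis(counts, study_area)
```

With a study area, the empty cells at (0, 1) and (1, 1) enter the analysis with a count of zero, which adds information about where points could have occurred but did not.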
Count points within aggregation areas
In some cases, area features such as census tracts, police beats, or parcels make more sense for your analysis than the default fishnet grid.
Choose to divide by
There are two common approaches to identify hot and cold spots:
- By count—When you analyze a particular dataset, you often want to find hot and cold spots of the number of features in each aggregation area across your study area. For instance, you can find hot spots where the highest numbers of crimes have occurred and cold spots where the lowest numbers of crimes have occurred to allocate resources.
- By intensity—On the other hand, analyzing and understanding patterns that take into account underlying distributions that influence a particular phenomenon can also be meaningful. This concept is often referred to as normalization, or the process of dividing one numeric attribute value by another to minimize differences in values based on the size of areas or the number of features in each area. For instance, with crime, you may want to understand where there are clusters of high and low numbers of crimes that take into account the underlying population. In that case, you can count the number of crimes in each area (whether that area is a fishnet grid or a different area dataset) and divide that total number of crimes by the total population in that area. This gives you a crime rate, or the number of crimes per capita. Finding hot and cold spots of crime per capita answers a different question that can also help guide decision making.
Both ways of analyzing the data in your study area are valid; it just depends on what question you are asking.
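To make the distinction concrete, the following Python sketch divides a count by a population for each area. The function name `rate_per_capita`, the area names, and the numbers are invented for illustration, and skipping areas with missing or zero population is an assumption of this sketch rather than documented tool behavior.

```python
def rate_per_capita(counts, population):
    """Divide each area's count by its population.

    Areas with a missing or zero population are skipped (an assumption
    made for this sketch) because a rate cannot be computed for them.
    """
    rates = {}
    for area, count in counts.items():
        pop = population.get(area)
        if not pop:
            continue
        rates[area] = count / pop
    return rates


# Hypothetical data: two areas with the same number of crimes.
crimes = {"Area A": 50, "Area B": 50, "Area C": 4}
residents = {"Area A": 10000, "Area B": 1000, "Area C": 0}

crime_rates = rate_per_capita(crimes, residents)
```

Area A and Area B look identical when analyzed by count, but Area B's rate is ten times higher once the underlying population is taken into account.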
Choosing an appropriate attribute to divide by is important. Confirm that the attribute you choose for the Divide by parameter does, in fact, influence the distribution of the particular phenomenon you are analyzing.
When you choose Esri Population for the Divide by parameter, the population data from the Esri Demographics Global Coverage is used. Confirm that the resolution of the data available for the area that you are interested in is compatible with the size of the areas that are being enriched (either aggregation areas you provide or fishnet squares being created).
Interpret results
The output from the Find Hot Spots tool is a map. For the points or the areas in this result layer map, the darker the red or blue colors appear, the more confident you can be that clustering is not the result of random chance. Points or areas displayed using beige, on the other hand, are not part of any statistically significant cluster; the spatial pattern associated with these features may be the result of random chance. Sometimes the results of your analysis will indicate that there are no statistically significant clusters at all. This is important information. When a spatial pattern is random, you have no clues about underlying causes. In these cases, all of the features in the results layer will be beige. However, when you do find statistically significant clustering, the locations where clustering occurs are important clues about what may be creating the clustering. For example, finding statistically significant spatial clustering of cancer associated with certain environmental toxins can lead to policies and actions designed to protect people. Similarly, finding cold spots of childhood obesity associated with schools promoting after-school sports programs can provide strong justification for encouraging these types of programs more broadly.
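One way to picture how a z-score translates into darker or lighter colors is the following Python sketch. It converts a z-score to a two-sided p-value under a standard normal distribution and assigns a signed confidence bin. The function name `confidence_bin`, the 90, 95, and 99 percent cutoffs, and the bin values from -3 to 3 are assumptions chosen for illustration, and the sketch does not apply the FDR correction that the tool applies.

```python
import math


def confidence_bin(z_score):
    """Classify a z-score into a signed confidence bin.

    Assumed encoding for this sketch:
      +3 / -3  hot / cold spot at 99 percent confidence
      +2 / -2  hot / cold spot at 95 percent confidence
      +1 / -1  hot / cold spot at 90 percent confidence
       0       not statistically significant
    """
    # Two-sided p-value for a standard normal z-score.
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    if p_value < 0.01:
        level = 3
    elif p_value < 0.05:
        level = 2
    elif p_value < 0.10:
        level = 1
    else:
        level = 0
    return level if z_score > 0 else -level


bins = [confidence_bin(z) for z in (2.6, -2.0, 1.7, 0.5)]
```

Features in the outer bins would be drawn in the darkest red or blue, and features in bin 0 would be drawn in beige.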
Troubleshoot
The statistical method used by the Find Hot Spots tool is based on probability theory and, consequently, needs a minimum number of features to operate effectively. This statistical method also requires a variety of counts or analysis field values. If you are analyzing crime incidents by census tract, for example, and end up with exactly the same number of crimes in each tract, the tool cannot solve. The following table provides an explanation of the messages you may encounter when you use the Find Hot Spots tool:
Message | Problem | Solution |
---|---|---|
The analysis options you selected require a minimum of 60 points to compute hot and cold spots. | There aren't enough point features in your point analysis layer to compute reliable results. | Add more points to your analysis layer. Alternatively, you can define bounding analysis areas, to add information about where points could have occurred but didn't. With this method, you need a minimum of 30 points. You can also provide aggregation areas that overlay your points. You need a minimum of 30 polygon areas and 30 points within those areas for this analysis. If you have at least 30 points, you can specify an analysis field. This changes the question from where are there many or few points to where do high and low analysis field values cluster spatially. |
The analysis options you selected require a minimum of 30 points with valid data in the analysis field in order to compute hot and cold spots. | There aren't enough points, or enough points associated with non-null analysis field values, in your analysis layer to compute reliable results. | If you have fewer than 30 points, this analysis method is not appropriate for your data. If you have 30 or more points and you are seeing this message, the analysis field you specified may have null values. Points with null analysis field values are skipped. Another possibility is that you have an active filter reducing the number of points available for analysis. |
The analysis options you selected require a minimum of 30 polygons with valid data in the analysis field in order to compute hot and cold spots. | There aren't enough polygon areas, or enough area features associated with non-null analysis field values, in your analysis layer to compute reliable results. | If you have fewer than 30 polygon areas, this analysis method is not appropriate for your data. If you have 30 or more areas and you are seeing this message, the analysis field you specified may have null values. Polygon areas with null analysis field values are skipped. Another possibility is that you have an active filter reducing the number of polygon areas available for analysis. |
The analysis option you selected requires a minimum of 30 points to be inside the bounding polygon areas. | Only points that fall within the bounding analysis areas you draw or provide are analyzed. To provide reliable results, at least 30 points should be inside the bounding analysis areas. | If you do not have at least 30 points, this method is not appropriate for your data. With a minimum of 30 features, the solution is often to provide different, perhaps larger, bounding analysis areas. Another option is to provide an area layer with a minimum of 30 aggregation polygons that overlay at least 30 of your points. When you provide aggregation areas, analysis is performed on the point counts within each area. |
The analysis option you selected requires a minimum of 30 points to be inside the aggregation polygons. | Only the points that fall inside the aggregation polygons are included in the analysis. To provide reliable results, at least 30 points should be inside the polygon areas you provide. | If you do not have at least 30 points, this method is not appropriate for your data; otherwise, you should draw or provide bounding analysis areas that overlay at least 30 of your points. The bounding areas should reflect all the locations where points could possibly occur. |
The analysis option you selected requires a minimum of 30 aggregation areas. | The option you selected overlays the aggregation areas on top of your points and counts the number of points falling within each area. A minimum of 30 counts (30 areas) are needed to provide reliable results. | Reliable results can be computed if you provide a minimum of 30 points that fall within a minimum of 30 aggregation areas. If you don't have 30 aggregation areas, you can draw or provide bounding analysis areas that overlay at least 30 of your points. These bounding areas should reflect all the locations where points could possibly occur. |
Hot and cold spots cannot be computed when the number of points in every polygon area is identical. Try different polygon areas or different analysis options. | When the Find Hot Spots tool counted the number of points within each aggregation area, it found that the counts were all identical. To compute results, this tool requires at least some variation in the count values obtained. | You can provide alternative aggregation areas that do not result in all areas having the exact same number of points. Rather than aggregation areas, you can also draw or provide bounding analysis areas. Alternatively, you can specify an analysis field. However, this changes the question from where are there many or few points to where do high and low analysis field values cluster spatially. |
There is not enough variation in point locations to compute hot and cold spots. Coincident points, for example, reduce spatial variation. You can try providing a bounding area, aggregation areas (a minimum of 30), or an Analysis Field. | Based on the number of points and how spread out they are, the tool creates a fishnet grid to overlay your points. After counting the number of points that fall within each fishnet square and removing squares with zero counts, there were fewer than 30 squares left. This tool requires a minimum of 30 counts (30 squares) to provide reliable results. | If your points occupy few unique locations (if there are many coincident points), a solution is to either provide aggregation areas that overlay your points or draw or provide bounding analysis areas indicating where points are and are not possible. Another option is to specify an analysis field. However, this changes the question from where are there many or few points to where do high and low analysis field values cluster spatially. |
There is not enough variation among the points within the bounding polygon areas. You can try providing larger boundaries. | Based on point locations and number of points, the tool creates a fishnet grid to overlay your points. After counting the number of points that fall within each fishnet square and removing squares that are outside the bounding analysis areas, fewer than 30 fishnet squares were left. This tool requires a minimum of 30 counts (30 squares) to provide reliable results. | If your points are located at a variety of locations inside the bounding analysis areas, you may just need to make or provide larger boundaries. If your points occupy few unique locations (if there are many coincident points), a solution is to provide aggregation areas that overlay your points. Another option is to specify an analysis field. However, this changes the question from where are there many or few points to where do high and low analysis field values cluster spatially. |
All of the values for your analysis field are likely the same. Hot and cold spots cannot be computed when there is no variation in the field being analyzed. | Most likely you specified an analysis field that has the same value for all of your points or area features in the analysis layer. The statistic used by this tool cannot solve unless there is a variety of values to work with. | You can specify a different analysis field or, for point features, analyze point densities rather than point values. |
We were not able to compute hot and cold spots for the data provided. If appropriate, try specifying an Analysis Field. | While unlikely, when the tool created a fishnet grid and counted the number of points within each square, the counts for all squares were identical. | Provide your own aggregation areas, draw or provide bounding analysis areas, or specify an analysis field. |
Cell size should be smaller than the distance band. | You provided a Distance Band value that is smaller than the size of each grid cell. | Review the units specified for both Distance Band and Cell Size, use the default value calculated by the tool, or use a value that is larger than the size of a single grid cell. |
Additional information about the algorithms used by the Find Hot Spots tool can be found in How Optimized Hot Spot Analysis works.
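Several of the conditions in the table above can be checked before running an analysis. The following Python sketch is a simplified pre-check, not the tool's validation logic: the function name `check_analysis_inputs`, its arguments, and the returned messages are hypothetical, and only the 30-feature minimum, the need for variation, and the Distance Band versus Cell Size relationship are covered.

```python
MIN_FEATURES = 30  # minimum number of features with valid values, per the table


def check_analysis_inputs(values, cell_size=None, distance_band=None):
    """Return a list of problems found in the inputs (empty if none).

    values: analysis field values or counts, with None for null values.
    cell_size, distance_band: optional numbers in the same units.
    """
    problems = []
    # Features with null values are skipped, so they do not count.
    valid = [v for v in values if v is not None]
    if len(valid) < MIN_FEATURES:
        problems.append(
            f"only {len(valid)} features have valid values; "
            f"at least {MIN_FEATURES} are needed"
        )
    if valid and len(set(valid)) == 1:
        problems.append("all values are identical; some variation is needed")
    if cell_size is not None and distance_band is not None:
        if distance_band < cell_size:
            problems.append("the distance band is smaller than the cell size")
    return problems


ok = check_analysis_inputs(list(range(30)))
too_few = check_analysis_inputs([None] * 20 + list(range(20)))
no_variation = check_analysis_inputs([5] * 40, cell_size=100, distance_band=50)
```

Here `ok` is an empty list, `too_few` reports that only 20 features have valid values, and `no_variation` reports both the identical values and the distance band problem.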
Similar tools
Use Find Hot Spots to determine if there is any statistically significant clustering in the spatial pattern of your data. Other tools that may be useful are described below.
Map Viewer Classic analysis tools
To find outliers in the spatial pattern of your data, use the Find Outliers tool.
To create a density map of your point or line features, use the Calculate Density tool.
ArcGIS Pro analysis tools
Find Hot Spots executes the same statistic used in the Hot Spot Analysis (Getis-Ord Gi*) and Optimized Hot Spot Analysis tools.