Section Two: Overlay Analysis

The name Overlay Analysis comes from a time when cartographers solved location problems by stacking clear sheets printed with features such as roads and rivers. Today it refers to a set of geoprocessing tools in the software. This category of tools examines how datasets interact with each other spatially and answers questions like “Does this point or polyline feature lie inside or outside of some designated polygon?”, “Do these roads start or end on public land?”, or “How many feet of river lie inside the water district’s jurisdiction?”. We begin our discussion of geoprocessing tools with the overlay analysis category because it is the foundation of all geoprocessing tasks and contains many of the most commonly used tools.

In ArcGIS specifically, overlay analysis tools update both the geometry and the attribute table by creating a new output feature class or shapefile in which the "answer" to the spatial question is depicted. To better understand how overlay analysis tools are used to solve problems in the GIS, we will dive right into several of them. There are more overlay analysis tools than those presented here, but examining how a few of them work will help you understand the entire category.

7.2.2: Vector Overlay

The majority of the work we do in the GIS uses vector files, and while the software has some really neat and very complex tools in its toolbox, the basic overlay analysis tools are the foundation of all the work technicians do, regardless of how in-depth the project is or how long the technician has been in the field. Tools like Erase, Intersect, and Union are so common that two of the three can be found in the Geoprocessing shortcut menu, along with some of the foundational proximity analysis tools. ESRI, the maker of ArcGIS, recognizes the foundational status of these tools and created that shortcut menu for ease of use (which, by the way, can be customized to include other tools that a technician uses frequently).


Erase works like an eraser, removing from the features in the input layer all the areas where the erase layer is coincident. That means the output layer is made up of features consisting of all the areas where the two were not coincident. Erase is one of our two "cookie cutter" tools.

I'm going to make a general assumption that at least once in your life, you've made cookies using a cookie cutter and a sheet of rolled out dough. During the process of cookie making, you roll out a sheet of dough, select a cookie cutter shape, and press the cookie cutter into the dough. The result is a cookie shaped exactly like the cookie cutter, because the shape of the cookie is determined by the place where the cookie dough and the cookie cutter were coincident. The Erase tool follows the same process but keeps the opposite piece: if you threw away all of those neatly cut out cookies and baked the remaining sheet of dough, you would have run the Erase tool. In this example, the output layer (the sheet of cookie dough after cutting) is made up of features where the input layer (the original, complete sheet of cookie dough) and the Erase layer (the cookie cutter) were not coincident. In the Extraction Analysis section of this chapter, we will look at the second cookie cutter tool, Clip, which is the opposite of Erase: Clip keeps the cookies where everything is coincident and deletes the dough where they are not.

Figure 7.1: Erase Tool (Overlay Analysis) vs Clip Tool (Extraction Analysis) in Cookie Form
Clip and Erase as Cookies
The Clip and Erase tools are both "cookie cutter" tools, but they are opposites of each other. They both use an Input Layer (the cookie dough, seen on the left side of this image) and either an Erase or Clip Features Layer (the cookie cutter), but the Erase tool (an Overlay Analysis tool) keeps the cookie dough and deletes the cookies, as seen on the right side of this image, while the Clip tool (an Extraction Analysis tool) keeps the cookies and deletes the dough, as seen in the center of this image.
Erase is used when you want to answer the question: “What lies outside the area where the Erase features are coincident with features from the Input layer?”
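
The cookie analogy can be sketched in a few lines of Python. This is only an illustration, not how ArcGIS implements either tool: it assumes each layer has been simplified to the set of unit grid squares it covers, and the names erase, clip, dough, and cutter are hypothetical.

```python
# Illustration only: each layer is reduced to the set of (column, row)
# grid squares it covers. The real tools work on true polygon geometry.

def erase(input_layer, erase_layer):
    """Keep the parts of the input that the erase layer does NOT cover."""
    return input_layer - erase_layer


def clip(input_layer, clip_layer):
    """Keep only the parts of the input that the clip layer covers."""
    return input_layer & clip_layer


# A 4 x 4 sheet of dough and a 2 x 2 cutter pressed into its middle.
dough = {(x, y) for x in range(4) for y in range(4)}
cutter = {(x, y) for x in (1, 2) for y in (1, 2)}

leftover_dough = erase(dough, cutter)  # Erase: the sheet with a hole in it
cookies = clip(dough, cutter)          # Clip: the cut-out cookie
```

Put back together, the two outputs rebuild the original sheet of dough, which is why the two tools are described as opposites.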


The default installation of ArcGIS provides a fully customizable Geoprocessing menu which contains the six most commonly used tools in ArcGIS. The menu was designed for the convenience of the technician, and for the same reason the tools presented in it can be removed or changed. On a side note, custom menus can be created in ArcMap containing, for example, all the tools you might need for a single project, so you do not have to search for them each time. These menus can be kept in place, hidden from view, or completely deleted.

The first of the “top six” tools set by the default installation of ArcMap is Intersect. Intersect looks for features which are common to all of the input layers. In contrast to Erase, which uses a “binary” input model (that is, just two input layers), Intersect uses a multiple input model, meaning the tool will find the areas common to two or more input layers.

Intersect is used when you want to answer the question: “Where are the places the input layers have in common?”
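
The multiple input model can be sketched the same way. As a simplifying assumption, each layer is again treated as the set of grid squares it covers; the function intersect and the layer names public_land, watershed, and burn_area are hypothetical and do not come from ArcGIS.

```python
def intersect(*layers):
    """Return the grid squares covered by every one of the input layers."""
    if not layers:
        return set()
    return set.intersection(*layers)


# Three overlapping rectangular layers on the same grid (made-up extents).
public_land = {(x, y) for x in range(0, 4) for y in range(0, 4)}
watershed = {(x, y) for x in range(2, 6) for y in range(1, 5)}
burn_area = {(x, y) for x in range(3, 7) for y in range(0, 3)}

# Only the squares present in all three layers survive.
common = intersect(public_land, watershed, burn_area)
```

Adding a fourth or fifth layer only narrows the result further, since a square must appear in every input to be kept.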

Union vs Merge vs Append

While Erase and Intersect look at how features interact with each other, Union, Merge, and Append combine data from two or more input datasets of the same geometry type. The difference is that Union splits coincident features at the lines where they intersect and adds a field to the attribute table expressing the relationship between the layers. Merge simply combines the features where they exist in the world, without altering features or changing or annotating the attribute table, and creates a new output dataset. Lastly, Append, like Merge, combines several input datasets of the same geometry type; however, instead of creating a new dataset, the tool appends the second, third, fourth, et cetera, layers to the first dataset.
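
The differences among the three tools can be sketched as follows. This is a toy model under stated assumptions, not the ArcGIS implementation: a layer is a list of features, each feature is a dictionary with a name and the set of grid squares it covers, and the from_a and from_b fields merely stand in for the relationship field the real Union tool adds.

```python
def union(layer_a, layer_b):
    """Split features where they overlap and record each piece's source."""
    covered_a = set().union(*(a["cells"] for a in layer_a))
    covered_b = set().union(*(b["cells"] for b in layer_b))
    pieces = []
    for a in layer_a:
        for b in layer_b:
            shared = a["cells"] & b["cells"]
            if shared:  # the coincident piece, cut along the overlap
                pieces.append({"from_a": a["name"], "from_b": b["name"],
                               "cells": shared})
    for a in layer_a:
        only_a = a["cells"] - covered_b
        if only_a:
            pieces.append({"from_a": a["name"], "from_b": None,
                           "cells": only_a})
    for b in layer_b:
        only_b = b["cells"] - covered_a
        if only_b:
            pieces.append({"from_a": None, "from_b": b["name"],
                           "cells": only_b})
    return pieces


def merge(*layers):
    """Combine unchanged features into a brand-new layer."""
    return [feature for layer in layers for feature in layer]


def append(target, *layers):
    """Add the features of the other layers onto the first layer in place."""
    for layer in layers:
        target.extend(layer)
    return target


circles = [{"name": "circle", "cells": {(0, 0), (1, 0), (0, 1), (1, 1)}}]
rectangles = [{"name": "rect", "cells": {(1, 1), (2, 1)}}]

unioned = union(circles, rectangles)    # three pieces, one of them shared
merged = merge(circles, rectangles)     # new layer, inputs untouched
appended = append(circles, rectangles)  # circles itself now holds both
```

Note that union changes the geometry (one overlap becomes its own feature), merge leaves both inputs as they were, and append modifies its first argument.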

At this stage in the game, it's not about memorizing the different tools and how they work, but instead about noting that different tools accomplish similar tasks with slightly different outputs.  Some tools take attributes into account while other tools simply deal with geometry. Some tools create tables, some create vector files, and some convert vectors to rasters or the other way around.  It's more about just exploring what tools can do, running data through the tool, and examining the output.  Running lots of different tools which are similar with the same datasets will help a technician grow their skill set and move forward as a more confident analyst.

Union (on the left), Merge (center), and Append (right) combine data from two or more input data sets. The difference is that Union creates new features along cut lines where the input features are coincident (as seen inside the green output circle where the purple rectangles intersect the yellow circle) and adds a field to the attribute table expressing the relationship, while Merge simply combines the data sets and all of their original attributes, creating a new output dataset.

7.2.3: Raster Overlay

In Chapter Three, we looked at rasters, stating that they are a grid of cells, with each cell containing a single value to represent the real world. For example, in classified rasters the values can represent the landscape, such as 3 for bare land and 5 for forested land, and with Digital Elevation Models (DEMs) and the layers derived from them, the values represent terrain values such as elevation, aspect, and slope.

Just like with vectors, we can perform overlay analysis with rasters, using a different series of tools. Vector tools only work with vector layers and raster tools only work with raster layers; with the exception of a vector layer designating an extent in some raster tools, there are no tools which combine vectors and rasters for geoprocessing. By “overlaying” one raster on another, we can compare the cell values of one raster with those of the other and find relationships.

Map Algebra

Map algebra is a blanket term for comparing or analyzing the values contained within the cells of raster layers using algebraic functions such as addition, subtraction, multiplication, and division; statistical calculations such as mean, median, and standard deviation; relational operations such as greater than, less than, or equal to; and Boolean operations such as NOT, AND, and OR. By examining the mathematical result of comparing two raster values, relationships between the two layers can be derived.

One example of Map Algebra is the ability to find change in the landscape, known as change detection. Since we know that all rasters are made up of a complete grid, we can overlay one on another and compare the cells that line up. Suppose we classified an image of a forest from 1999 with a spatial resolution of 30 meters, assigning values of ‘5’ for forest and ‘3’ for bare land, then repeated the process for an image from 2015 (with the same spatial resolution, as that does make a difference when it comes to comparing raster images). We would then have an input raster (1999) and a comparison raster (2015) to use in our map algebra tool. Applying map algebra to the input and comparison rasters will expose the cells which changed from bare land to forest.

Figure 7.2: Map Algebra Used to Determine Change in the Landscape Over Time, or Change Detection
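
The change detection example can be worked through on a tiny grid. The sketch below is plain Python rather than an ArcGIS tool, the 3 x 3 cell values are invented for illustration, and the function name local_difference is hypothetical; only the class codes (5 for forest, 3 for bare land) come from the text.

```python
FOREST, BARE = 5, 3

# Made-up classified rasters on the same grid and at the same resolution.
raster_1999 = [
    [3, 3, 5],
    [3, 5, 5],
    [5, 5, 5],
]
raster_2015 = [
    [5, 3, 5],
    [5, 5, 3],
    [5, 5, 5],
]


def local_difference(later, earlier):
    """Subtract cell by cell; both rasters must share the same grid."""
    return [
        [new - old for new, old in zip(later_row, earlier_row)]
        for later_row, earlier_row in zip(later, earlier)
    ]


change = local_difference(raster_2015, raster_1999)
# 2 = bare land became forest, -2 = forest became bare land, 0 = no change

# Flag only the cells that went from bare land (3) to forest (5).
gained_forest = [
    [1 if value == FOREST - BARE else 0 for value in row]
    for row in change
]
```

Because the subtraction is done cell by cell, the two rasters must line up exactly, which is why matching spatial resolution matters.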

Images with different spatial resolutions can be compared as well; there are just some additional steps that must be taken to resample one of the images, that is, to change its spatial resolution by sampling the values inside a certain count of pixels and taking the most common value or the average value, depending on the goal of the resampling tool.

Figure 7.3: Resampling Raster Images
To compare rasters with Map Algebra, it is sometimes necessary to resample an image so the spatial resolution (the size of the pixel as compared to the real world) matches between the input layers. Resampling is sometimes done by taking the most common value, as in this example, or the average value within a group of pixels.
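
Both resampling approaches can be sketched on a small grid. This is a simplified illustration, not the ArcGIS Resample tool: it assumes the raster's dimensions divide evenly by the aggregation factor, and the function name resample, its parameters, and the cell values are all hypothetical.

```python
from collections import Counter
from statistics import mean


def resample(raster, factor, how="majority"):
    """Aggregate factor x factor blocks of cells into one coarser cell."""
    coarse = []
    for r in range(0, len(raster), factor):
        coarse_row = []
        for c in range(0, len(raster[0]), factor):
            block = [
                raster[rr][cc]
                for rr in range(r, r + factor)
                for cc in range(c, c + factor)
            ]
            if how == "majority":
                # Most common value in the block (suits classified data).
                coarse_row.append(Counter(block).most_common(1)[0][0])
            else:
                # Average value in the block (suits continuous data).
                coarse_row.append(mean(block))
        coarse.append(coarse_row)
    return coarse


fine = [
    [5, 5, 3, 3],
    [5, 3, 3, 3],
    [3, 3, 5, 5],
    [3, 5, 5, 5],
]

by_majority = resample(fine, 2, how="majority")
by_mean = resample(fine, 2, how="mean")
```

For classified values such as 3 and 5, the majority rule keeps the output meaningful; an average of 4.5 does not correspond to any land cover class, so averaging is better suited to continuous values such as elevation.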

Map algebra is not limited to just arithmetic. As seen in the table, statistical, trigonometric, and relational functions are possible.

Type of function | Example Operations | Example Use
Arithmetic | Add, Subtract, Multiply, Divide | Finding total risk out of individual risk factors
Statistics | Minimum, Maximum, Mean, Median | Finding statistical trends
Relational Operations | Greater than, less than, equal to | Comparing values, finding all cells = X
Boolean | Not, And, Or | Can be used in combination with relational operators; find all cells = X and cells = Y
Trigonometry | Sine, Cosine, Tangent, Arcsine |
Exponential and logarithmic | Exponents and Logs |
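
As a small illustration of the relational and Boolean rows of the table, the sketch below combines two tests cell by cell. The rasters, their values, and the 1,000 unit elevation threshold are invented for the example; only the forest code of 5 comes from the text.

```python
# Made-up rasters on the same 2 x 2 grid.
landcover = [
    [5, 3],
    [5, 5],
]
elevation = [
    [900, 1200],
    [1500, 800],
]

# Relational operations: one true/false answer per cell.
is_forest = [[value == 5 for value in row] for row in landcover]
is_high = [[value > 1000 for value in row] for row in elevation]

# Boolean AND: a cell passes only if both tests are true for that cell.
high_forest = [
    [int(forest and high) for forest, high in zip(forest_row, high_row)]
    for forest_row, high_row in zip(is_forest, is_high)
]
```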

Map Algebra Location Operations

Map Algebra can compare the pixels of two images in four different ways, not just one pixel to one pixel (a local operation) as we saw with the change detection example. Having different means of comparing raster pixels allows ArcGIS to fully utilize the assumed relationships between pixels (because every cell has an equal height, width, and center-to-center spacing, assumptions can be made about rasters that cannot be made about vectors). The four types of operations are local, global, zonal, and focal (or neighborhood) operations. Map algebra is a more intermediate tool than the vector tools we are learning about in Introduction to GIS, but it's a good introduction to how raster tools work, and a solid understanding of the basics of raster layer interactions is important to understanding how all geoprocessing tools work.

Local operations

Calculations are performed between identically positioned raster cells; that is, cell A1 (in the sense of a Cartesian coordinate system) on the input raster is compared to cell A1 on the comparison raster, B1 to B1, etc. Our forest change detection example is a local operation: each 1999 cell is compared only with the 2015 cell in the same position to detect change over time.

Global operations

Compare the value of a single cell on the input raster to all the cells on the comparison raster. The most common use for global operations is to determine distance from the source cell to all the other cells.

An example would be to find the distance from the source of cable service in a neighborhood to all the homes in a given area. If the cable company had a cutoff distance of 3 miles, global operations can find all the cells which fall within 3 miles of the source cell (main cable box) and return a raster layer with values of 1 for in and 0 for out. The output raster could then be layered on top of the image that was classified to show a neighborhood map of in and out areas for service.
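
The cable service example might be sketched as follows. This is an illustration under stated assumptions rather than an ArcGIS distance tool: distance is measured in a straight line between cell centers, each cell is taken to be 1.5 miles wide, and the function name within_distance and its parameters are hypothetical.

```python
import math


def within_distance(rows, cols, source, cell_size, cutoff):
    """Return a 1/0 raster: 1 where a cell center is within cutoff of source."""
    source_row, source_col = source
    result = []
    for r in range(rows):
        result.append([
            1 if math.hypot(r - source_row, c - source_col) * cell_size <= cutoff
            else 0
            for c in range(cols)
        ])
    return result


# A 5 x 5 neighborhood with the main cable box in the center cell.
service = within_distance(rows=5, cols=5, source=(2, 2),
                          cell_size=1.5, cutoff=3.0)
```

Every cell in the output depends on the single source cell, which is what makes this a global rather than a local operation.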

Zonal Operations

Using a zone from the input layer, zonal map algebra is performed only on the cells in the comparison raster which fall inside that zone. In the figure, the output raster shows calculations for only the cells which match up between the input raster’s zone and the comparison raster.

An example of zonal operations would be to find suitable habitat for a species. If you knew the species was happiest on a 6% slope, on a slope that faced east, and in an area above 3,000 feet, you can create one raster for each requirement, then use a zonal operation to find only the cells where all three factors line up. Layer the final output raster over an image of the area and the suitable habitat pops out.
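
One common kind of zonal calculation, a statistic computed separately inside each zone, can be sketched as below. This is a simplified stand-in rather than an ArcGIS zonal tool; the function name zonal_mean, the zone numbers, and the slope values are all invented for illustration.

```python
from statistics import mean


def zonal_mean(zones, values):
    """Average the value raster separately within each zone."""
    cells_by_zone = {}
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            # Group each value under the zone its cell falls in.
            cells_by_zone.setdefault(zone, []).append(value)
    return {zone: mean(cells) for zone, cells in cells_by_zone.items()}


# The zone raster assigns every cell to zone 1 or zone 2.
zones = [
    [1, 1, 2],
    [1, 2, 2],
]
# The value raster holds a made-up percent slope for each cell.
slope = [
    [4, 6, 10],
    [8, 12, 14],
]

mean_slope = zonal_mean(zones, slope)
```

Cells in zone 1 never influence the result for zone 2, so the zone raster acts as a mask that decides which cells take part in each calculation.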

Focal Operations

Using neighborhood values from within a single raster, focal operations compare the neighborhood to one cell, then move to the next cell and compare a new neighborhood, and so on, with the intention of finding a relationship or pattern which occurs within one raster.

Imagine focal operations as a moving window, looking at the pattern of a few cells, then the window moves, and compares the new group of cells to find the overall pattern.

An example of a focal operation is point density. The tool uses a window to count the number of points in one area, then moves on and counts inside the next area. It saves all the counts, sees where the count was high and where the count was low, then returns a new raster where high and low are colored differently. These maps are an example of continuous data.
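
The moving window idea can be sketched with a simple neighborhood sum. This is an illustration, not the ArcGIS Point Density tool: it assumes the points have already been counted per cell, uses a square 3 x 3 window that is cut off at the raster's edges, and the function name focal_sum and the counts are hypothetical.

```python
def focal_sum(raster, radius=1):
    """Sum each cell's square neighborhood into a new raster."""
    rows, cols = len(raster), len(raster[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            # The window is clamped so it never reaches past the raster edge.
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += raster[rr][cc]
            result[r][c] = total
    return result


# Made-up number of points falling inside each cell.
points_per_cell = [
    [0, 1, 0, 0],
    [2, 1, 0, 0],
    [0, 0, 0, 1],
]

density = focal_sum(points_per_cell)
```

High values in the output mark cells surrounded by many points and low values mark sparse areas, giving the smooth, continuous surface described above.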