Section Eight: Select by Location
Let's look at an example where we have two shapefiles: a point layer representing earthquake epicenters and a polygon layer representing the contiguous United States (the "lower 48"), and we want to know which of those earthquakes occurred within the State of California. When we open and examine the attribute table for the earthquake point layer, we find that there are fields for the year, month, day, and strength of each earthquake, yet there is no field for "State". Since our initial goal was to complete a simple SQL query for "Epicenter_State" = 'California', we are at a standstill. No field designating state name means no ability to complete a simple SQL query on the table.
We still need to know which earthquakes happened within the borders of the State of California, and we've determined there is no field for the state name, so we turn to another tool in our GIS toolbox - Select by Location. Select by Location is the second most common way we find data: it selects features from one layer based on their spatial relationship with another layer. In the image below, we can see the earthquakes which fall inside the California border, since our eyes can determine which ones fall inside, which ones fall outside, and which ones fall on the border. In a way, the software can "see" this too, when we tell it what to "look" for using the Select by Location tool.
5.8.2: The Select by Location Dialog Box: Top to Bottom
Like we did with the Select by Attribute dialog box, we are going to look at the Select by Location dialog box from top to bottom, using our earthquakes in California example, now that we've seen both the missing state field in the earthquake point layer's attribute table and the result of running the tool. If you haven't yet, take a minute to examine the images above. Note the lack of the field needed to complete a Select by Attribute (specifically a field with a "state" header which would designate what state the earthquake occurred in), and note how the query performed by the Select by Location tool selected (highlighted) features within the earthquake layer because those incidents intersected the US states layer (the desired query). Also note that California is selected, but - as we will see - not as a result of the Select by Location query, but instead as an input to the tool.
The Select by Location tool is still performing queries on the tables - asking the table a question and returning those values where the tool finds the relationship is true - just in this case, the query is less a question which follows the simple SQL query format and more of a question based on relationships. They are both types of table queries, just approached in different ways - non-spatial and spatial, respectively. SQL queries, as we read in a previous section, are not unique or limited to GIS, as SQL is a database language and databases pop up all over the place. Select by Location relationship queries, however, are unique to GIS, as the only selection criterion is a spatial relationship - how the features interact with each other using relationships such as intersect, are exactly the same as, and fall completely inside.
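The difference between the two kinds of query can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how ArcMap is implemented: the earthquake records, their coordinates, the square standing in for California, and the function name point_in_polygon are all made up for the example.

```python
# Made-up records: attributes and a location, but no state field.
earthquakes = [
    {"id": 1, "year": 1990, "strength": 5.1, "xy": (2.0, 2.0)},
    {"id": 2, "year": 1995, "strength": 6.0, "xy": (6.0, 5.0)},
    {"id": 3, "year": 2003, "strength": 4.7, "xy": (1.0, 3.5)},
]

# A simple square standing in for the California polygon (hypothetical).
california = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]


def point_in_polygon(point, polygon):
    """Ray-casting test: True when the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal ray running right from the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Non-spatial (attribute) query: there is no state field, so nothing matches.
by_attribute = [q["id"] for q in earthquakes
                if q.get("Epicenter_State") == "California"]

# Spatial query: the only criterion is where each point sits.
by_location = [q["id"] for q in earthquakes
               if point_in_polygon(q["xy"], california)]

print(by_attribute)
print(by_location)
```

The attribute query has nothing to compare against, while the spatial query only needs the geometry of the two layers.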
|Launch the Select by Location Dialog Box
Select by Location can be found in the Selection menu. Unlike Select by Attribute, Select by Location can only be launched via the Selection Menu.
The selection methods are the same as in the Select by Attribute dialog box, albeit with slightly different wording. Most of the time we are performing new selections, so the dialog box defaults to "select features from". The technician needs to change the selection method whenever a selection task calls for something other than a new selection.
When it comes to our California earthquakes example, we used the default of "select features from", since our earthquake point file had no selection to start, and we wanted to identify the features which meet our defined relationship query.
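One way to think about the selection methods is as set operations between the features already selected and the features that match the query. The sketch below assumes that reading. The function name apply_selection and the short method keys are hypothetical, and only the "new" case corresponds to the default "select features from" wording quoted above.

```python
def apply_selection(current, matches, method="new"):
    """Combine the current selection with the features matching the query."""
    current, matches = set(current), set(matches)
    if method == "new":      # start a fresh selection (the dialog default)
        return matches
    if method == "add":      # add the matches to what is already selected
        return current | matches
    if method == "remove":   # take the matches out of what is already selected
        return current - matches
    if method == "subset":   # keep only already-selected features that match
        return current & matches
    raise ValueError(f"unknown selection method: {method}")


already_selected = {1, 2}   # feature IDs selected before the tool runs
query_matches = {2, 3}      # feature IDs that satisfy the relationship

for method in ("new", "add", "remove", "subset"):
    print(method, sorted(apply_selection(already_selected, query_matches, method)))
```

In the earthquakes example nothing was selected beforehand, so every method except "new" would have started from an empty set.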
|Setting the Target Layer(s)
In our earthquakes of California example, the question we are asking is: "Which of the 5,875 earthquakes represented in the point layer happened within the border of the State of California?" After we've established our selection method (create a new selection where there was not one before), we are ready to start placing the different parts of our relationship query into the Select by Location toolbox.
The first part of the question is asking "Which of the 5,875 earthquakes represented in the point layer...", since we are curious about the earthquakes. That would make the Target Layer the earthquakes layer, since the target layer(s) are those where we would like to make the selection - the target of our question, and we note the target layer by placing a check in the box in the appropriate place in the tool.
In our earthquakes of California example, we only have one target layer - the earthquakes - but the Select by Location tool can have many target layers and can select features from all of them at once. Maybe we have an earthquakes layer, a point layer for building locations, and a polyline layer for major freeways, none of which has a state field, and we wish to know which of the features from all the layers fall within the borders of California (this is rather unlikely, but we will make it true for the purpose of the example). We could, in this case, place a check in the box next to the name of all three layers, making them the targets of our relationship query. In turn, when the tool runs, it will select the features from all three layers which meet the criteria set in the relationship query.
Like with the Select by Attribute dialog box, the option to “Only show selectable layers in this list” will limit the list to only layers which are marked as such in the List by Selectable portion of the Table of Contents.
|Setting the Source Layer
Continuing with our question, after we've set the target layer as the layer from which we'd like to have the selection made, the next portion we need to look at is "...the State of California?". It is a bit out of order compared to the way we phrased our question, but as we saw with Select by Attribute, machine thinking and syntax structure aren't always the same as human thinking and the structure of the English language. It's our job, both with Select by Attribute and Select by Location, to set up the query in the way the software expects, not the software's job to figure out what we are trying to say. If we leave it to the software to guess, we will lose each and every time, and the software will not do what we want it to do.
Within the syntax and structure of the Select by Location tool, the Source Layer is the "selector" layer, as in it is the layer for which the relationship with the target layer is established. We would like to select from the earthquakes those which happened in the State of California, so the earthquakes are the target and the US_States is the source. When we look at the relationship types, or the Spatial Selection Method, this relationship will be further solidified by the wording of the tool.
In Section Four of this chapter, we learned about how we can use selections once they are made within attribute tables, and one of those uses was Limit the Input Features for a Geoprocessing Tool. Technically, Select by Location is a geoprocessing tool, as it processes spatial data via a tool, so making a selection prior to running the tool can be helpful.
In the case of our earthquakes in California example, we see that the polygon layer is not made up of just California, but of all the US_States (cleverly noted by the name of the layer), meaning we need to limit the tool to run only for the State of California. We see that in the US_States layer, there is a selection of one feature - the State of California. Having just the one state selected will help limit the input features of the geoprocessing tool (Select by Location), which is noted by placing a check mark in the box.
Just below the Source Layer dropdown, we see a check box labeled Use Selected Features. This tells the tool not to use all of the states, but to limit the relationship query to only the State of California. When we look at the US_States attribute table, we see that there is (1 out of 51 Selected), and that one feature is California. Below the Source Layer dropdown (where US_States is set), the Use Selected Features box notes (1 feature selected), referring to the State of California. The Select by Location tool both recognizes and honors a selection made in the Source Layer.
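The effect of Use Selected Features can be sketched as a filter applied to the source layer before any spatial test is made. The sketch below is an illustration under simplifying assumptions: each state is reduced to a made-up rectangle (xmin, ymin, xmax, ymax), the earthquake coordinates are invented, and the names in_rect and select_by_location are hypothetical.

```python
# Hypothetical source layer: three states as simple rectangles.
us_states = {
    "California": (0.0, 0.0, 4.0, 4.0),
    "Nevada": (4.0, 0.0, 8.0, 4.0),
    "Oregon": (0.0, 4.0, 4.0, 8.0),
}

# Hypothetical target layer: earthquake ID -> (x, y).
earthquakes = {1: (2.0, 2.0), 2: (6.0, 2.0), 3: (1.0, 6.0), 4: (3.0, 1.0)}


def in_rect(point, rect):
    """True when the point falls inside or on the edge of the rectangle."""
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax


def select_by_location(points, sources, use_selected=False, selected=()):
    """Return the IDs of points that intersect any source feature in use."""
    if use_selected:
        # Honor the selection: only the selected source features take part.
        sources = {name: r for name, r in sources.items() if name in selected}
    return sorted(pid for pid, xy in points.items()
                  if any(in_rect(xy, r) for r in sources.values()))


# Box unchecked: every state takes part, so every earthquake is selected.
print(select_by_location(earthquakes, us_states))

# Box checked with California selected: only earthquakes in California.
print(select_by_location(earthquakes, us_states,
                         use_selected=True, selected={"California"}))
```

Leaving the box unchecked is not an error; it simply asks a different question, namely which earthquakes intersect any state at all.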
|Understanding the Spatial Selection Method
The last part of the question we need to address is "...happened within the border of...", referring to which earthquakes occurred within the boundary of the State of California. When you examine the picture at the start of this section, you can see that some of the points land inside and some outside of the boundary of the State of California. The Select by Location tool needs you to establish the Spatial selection method for the target layer feature(s), meaning the tool needs to understand how you would like the source and the target layers to interact in order to select features within the target layer. The default is "intersect the source layer feature", meaning that in order to be selected, features (rows) in the target layer must intersect the features in the source layer.
In our earthquakes of California example, we have set the tool to examine what points in the earthquake layer intersect the polygon which makes up the State of California. There are other spatial selection methods, as explained in the table below. This, again, is not a chance for you to memorize how the tool works, but understand that the tool explores spatial relationships between the target and source layers in many different ways.
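To see why the choice of method matters, consider a point that sits exactly on the border. The sketch below compares two relationships, intersect and falling completely inside, using a made-up rectangle for California and invented coordinates. The function names are hypothetical, and the way the boundary is treated here is an assumption of the sketch rather than a statement of how ArcMap handles every case.

```python
california = (0.0, 0.0, 4.0, 4.0)   # hypothetical (xmin, ymin, xmax, ymax)

# Earthquake 2 sits exactly on the eastern border of the rectangle.
earthquakes = {1: (2.0, 2.0), 2: (4.0, 2.0), 3: (6.0, 2.0)}


def intersects(point, rect):
    """Inside the rectangle or touching its boundary."""
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax


def completely_within(point, rect):
    """Strictly inside the rectangle, not touching its boundary."""
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin < x < xmax and ymin < y < ymax


def select(points, rect, relationship):
    """Return the IDs of points for which the relationship holds."""
    return sorted(pid for pid, xy in points.items() if relationship(xy, rect))


print(select(earthquakes, california, intersects))
print(select(earthquakes, california, completely_within))
```

The same two layers give different selections depending only on which relationship is asked for.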
|Apply a Search Distance
In addition to the spatial selection methods available, you can apply an additional search distance to the relationship to extend the spatial search beyond the definite boundaries of the source layer. For example, if you wanted to know not only which earthquakes occurred within the State of California, but which ones happened within the state AND within a mile of the state border, you could apply a search distance.
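A search distance can be pictured as widening the net around the source feature. The sketch below assumes a made-up rectangle for California, invented earthquake coordinates, and map units of miles; distance_to_rect and select_within are hypothetical names, and a real state outline would need a more general distance calculation.

```python
import math

california = (0.0, 0.0, 4.0, 4.0)   # hypothetical (xmin, ymin, xmax, ymax)

# Hypothetical earthquake ID -> (x, y), in the same units as the rectangle.
earthquakes = {1: (2.0, 2.0), 2: (4.5, 2.0), 3: (7.0, 2.0), 4: (5.0, 5.0)}


def distance_to_rect(point, rect):
    """Shortest distance from the point to the rectangle (0 when inside)."""
    x, y = point
    xmin, ymin, xmax, ymax = rect
    dx = max(xmin - x, 0.0, x - xmax)
    dy = max(ymin - y, 0.0, y - ymax)
    return math.hypot(dx, dy)


def select_within(points, rect, search_distance=0.0):
    """Select points inside the rectangle or within the search distance of it."""
    return sorted(pid for pid, xy in points.items()
                  if distance_to_rect(xy, rect) <= search_distance)


print(select_within(earthquakes, california))                       # no search distance
print(select_within(earthquakes, california, search_distance=1.0))  # within one unit
```

With no search distance only the earthquake inside the rectangle is selected; with a distance of one unit, the earthquake half a unit outside the border is picked up as well.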
If you ever run the tool and you notice the selection happened, but so did a whole bunch that surround the intended selection, you most likely have checked this box. Whenever the tool outcome was unexpected, examine all of the tool inputs, as ArcMap will always do exactly what you tell it to do. If the outcome was not as you anticipated, you most likely told it to do something you didn't want it to do and you need to find that mistake in the tool input. This is true for all things ArcMap, as you will discover over and over in the course of lab.