Section Six: Select by Attribute Part One - Understanding SQL Expressions
When you visit a Denny’s diner, the experience tends to be the same each time. After you arrive, you’re greeted and shown to a table. The host provides you with a menu and informs you that the server will be right with you. You take some time to look over the menu, examining the pictures and reading the descriptions of each dish. If you’re super hungry, you’re more likely to pick a dish based on the pictures, and if you’re less hungry, you are more likely to take more time reading the descriptions. After a few minutes "arc minute" 1/60th of a degree. Usually denoted by " , the server greets you and asks you the name of the dish you’d like, to which you answer with the name of the dish (or if you're really, really, really hungry, you just point at the picture and grunt in an affirming way). They write down the choice for each seat at the table on the ticket, presents the ticket to the kitchen, and return after a bit with your scrumptious choice.
Much like the menu at a diner, spatial data is a combination “pictures” (the points, polylines, and polygons we see in data view), item names (the layer name), and descriptions (the attributes). We add data to an ArcMap session, it is with the intent of examining the data and/or performing some sort of analysis of that data. It is also most likely that we need to examine the attributes for specific features in order to complete that analysis. Like we do when we order food at the diner, in GIS we look over the features, sometimes selecting a feature based on the picture (using Interactive Selection to select the feature's geometry on the map) and other times, selecting a feature based on the description (using Select by Attribute to select features based on the attributes). In this section, we will begin to look at the process of how we go about picking, or selecting, those specific features in the attribute table.
Back at the diner, when it came time for the server to turn the food ticket over to the kitchen, they didn’t just write down just anything on the ticket. The diner has an established manner of taking orders so the kitchen, which is always in a rush, can understand exactly what each person ordered, included special requests. If each server used their own made-up shorthand for menu items, it would cause chaos in the kitchen, leading to frequent mistakes.
We learned last chapter that the key to the definition of a geodatabase is the ability to quickly and efficiently retrieve spatial data utilizing the established relationships shared by the data. Since all the data we encounter in GIS follows the structure of (non-spatial data tables; shapefiles), or is already a member of a geodatabase (feature classes), the rules which apply to databases apply to all GIS data.
Much like the diner uses an established “language” to take and make order quickly and correctly, GIS utilizes a database electronic storage container with a top-down structure in which the items contained are related to each other and that relationship allows for the data to be quickly and efficiently queried and retrieved for use. language called Structured Query Language, or SQL (pronounced: See-Kwel) for short to accomplish the “quick and efficient data access” part. Simply put, SQL is a computer language used to query, or ask questions of, a database electronic storage container with a top-down structure in which the items contained are related to each other and that relationship allows for the data to be quickly and efficiently queried and retrieved for use. via an established structure of expressions.
In GIS, we refer to the process of creating SQL queries to select features in the attribute table as Select by Attribute, or using the attributes in a table to build expressions to return desired values. Since the structure used has been established, it is our job as GIS technicians to learn the basics of creating proper query expressions to most efficiently do our job.
While GIS uses SQL, it is not a language unique to GIS databases. This universal database electronic storage container with a top-down structure in which the items contained are related to each other and that relationship allows for the data to be quickly and efficiently queried and retrieved for use. language is beneficial to GIS because it allows a technician to work with any RDMS written in SQL across several GIS and non-spatial software suites. Since SQL is a huge language with lots of rules written for numerous spatial and non-spatial applications, we will only focus on the rules and syntax (the established structure of a computer language needed to execute commands) needed to complete our job as introduction to GIS students.
5.6.2: Simple SQL Expressions and the Select by Attributes Tool Dialog Box
All simple SQL table expressions follow the format: <using one of the four selection types> SELECT * FROM <layer/non-spatial data table name> WHERE: <field name><comparison operator><value> to query a table and return features. Within ArcGIS, the Select by Attribute dialog box automatically populates the SELECT * FROM <layer/non-spatial data table name> WHERE: portion of the SQL equation, effectively eliminating all other SQL table query types in ArcGIS. This means that as a GIS technician, the only portions of a simple SQL expression you need to complete (and memorize) are <using one of the four selection types> and <field name><comparison operator><value>
Like many ArcGIS tool dialog boxes, the Select by Attribute dialog box is designed to be read and populated from top to bottom. This makes it easy for new technicians to fill in the boxes properly, as the required information is presented in a logical and expected format. In future classes, you will learn how to bypass the dialog boxes and create Python expressions to really amp up your GIS abilities in terms of efficiency throughout a project, but for now, we will focus on the user-friendly (I promise) dialog boxes.
Knowing what we do so far (what is a spatial layer, the fact that SQL is like ordering off an established menu, the result of queries are selected/highlighted features on the map and in the table), let's look at the different parts of the Select by Attribute dialog box, not to memorize the steps, but to explore what goes into the tool and what the expected result of the tool might be. As we work our way through this section, each part of the tool dialog box will be equated back to the Denny's diner example to help better explain the purpose.
Only show selectable layers in this list:
Now that we’ve looked at how to list layers in the Table of Contents by Selection, we can understand the basis of the Only show selectable layers in this list check box. To reduce the "Layer" list size, layers which do not need to participate in a Select by Attribute can be set to non-selectable, and when used in combination with the “only show...” check box, the list can be dramatically reduced in size.
The Selection Method dialog box offers us four options when it comes to selecting data:
Create new selection
Selects only the feature for which the values fit the query criteria and clears any other features which may be currently selected
Most of the time, we are performing new selections, and that is what the dialog box defaults to.
Add to current selection
Selects additional features by means of another query and adds it to the list of selected features. Add to current selection can be used an endless amount of times within one table, well, at least until all the features are selected and there are no more to add to the list.
Remove from current selection
If there is a current selection (features highlighted in the table), remove from current selection will deselect any features which match a new query. This method would be used when you need to select some features, examine them, and decide which ones to move forward with by using another query. Multiple queries can be used sequentially to reduce the pool of selected features until only the final selection is left.
|Select from current selection
|Similar and opposite to remove from current selection, Select from Current Selection re-selects features you do want based on a new query instead of removing them from the list.
|The SQL Diner: Selection Method
When you are first seated, there is nothing to eat or drink on the table - nothing is selected. You're presented with the menu, which you take a minute to look over while waiting for the server. They will come over and initially take a drink order, a new selection you've made.
The sever brings out the drinks, and asks if you are ready to order the meal. You inform them that you are not quite ready to make the meal choice, but you’d like to order some mozzarella sticks. When the mozzarella sticks come out, they are placed in the center of the table, in addition to the drinks. The server doesn’t take the drinks away, they simply add the mozzarella sticks to your current selection, the drinks, and takes your order for the main dish.
Before the main dish arrives at the table, you’ve finished the mozzarella sticks and the server takes the plate away, removing the appetizer from the current selection, leaving just the drinks.
After the main dish arrives, which often is a plate consisting of two or three things, such as hash browns on the side of the omelet. Since it’s hardly possible (or polite) to eat everything at once, you need to pick what to eat first, or select something from the current selection.
We have already learned that columns in GIS are called fields, and the column headers are called the Field Name. The structure of a simple query is <field name><comparison operator><value>, which means SQL requires us to state to which field our query applies. The Select by Attribute tool dialog box addresses this portion of the expression with a list of all the field headers for the table, as defined by the "Layer" dropdown or from whatever table the dialog box was launched from, as we learned earlier that if the tool is launched from the table itself, the assumption is the query will apply to that table.
Comparison Operator Buttons
Sticking with the structure for a simple SQL query, <field name><comparison operator><value>, we have looked at where the field name list comes from within the Select by Attribute box (from the available fields within a specific attribute or non-spatial data table), so the next part is the comparison operator. Comparison operators are found in the Select by Attribute box as a series of buttons, along with the Logical Operators, a set of operators used when we create complex SQL expressions, the topic of the next section of the text.
With SQL expressions, comparison operators compare what you've asked the query to evaluate with what is actually in the table, aka look for some designated word within the table, and are broken into two main categories - those which query for exact values and those which query for approximate values. For example, if you were looking for a feature in a State layer's attribute table with the exact name of Colorado (SQL: "Name" = 'Colorado'), you would be asking the Select by Attribute tool to query the table (as to query is to ask a question, the question you are asking is "hey table, do you have a feature by the exact name of Colorado?"), looking for a value in the 'Name' field which is spelled exactly like you typed it - C-o-l-o-r-a-d-o.
Approximate comparison operators look for, well, approximate values. When you're not too sure of all the available values, but you're mostly sure of what might be in the table, instead of scrolling through the whole table and carefully looking for anything that might meet what you are looking for, you can use a wild card comparison operator instead. For example, if you had a table with 19,000 records of historic sites, and you are only interested in those which might have the word "Roman" in the feature name, you could use a wild card SQL expression such as "Historic_Site_Name" LIKE '%Roman%', which would return all the Roman Villas and the Historic Roman Sites.
Get Unique Values Button:
When we looked at the structure of attribute tables in an earlier section, we learned that for each cell in the attribute table (or non-spatial data table), there is only one value (or record). This means what you see is what you get. If a cell in the "Breakfast Items" field contains the value "Moons Over MyHammie", the only value in that cell is "Moons Over MyHammie", and nothing else. There are no hidden values, no way to make the table show anything else in that cell, and no alternate spellings in that cell (although, another cell may contain an alternate spelling). We call this value a unique value, meaning that it appears as that exact spelling, including capitals and spaces, at least once.
The Get Unique Values button looks for these unique spellings within a single field and lists them out for you in the Unique Values box. Since the button is getting unique values, even if a value appears more than once in a field, it only appears once in the unique values box. For example, if we were looking at the City_Name field for a USA cities layer, the city of Santa Fe would appear four times, since there are four Santa Fe's in the United States. However, if I had the Select by Attributes tool dialog box open and clicked the Get Unique Values button in relation to the City_Name field, the value "Santa Fe" would only appear once in the unique values box since the spelling, spacing, and capitalization is exactly the same for all four of the Santa Fe's in the United States. One unique value, four locations in the United States.
To simplify searching though possibly thousands of unique values, the Unique Value search box (labeled "Go To:" and seen to the right of the Get Unique Values button) will jump to anything you type. For example, if you were looking at the unique values for US State names and entered a “C” in the search box, the list will jump to “California”, and if you type “Co”, it will jump to “Colorado”, and if you type “Con” it will jump to “Connecticut”. You can search by typing as many letters as you need to get to the desired value or just the first letter and scrolling through the list.