In Select by Attribute Part One, we looked at all the components of the Select by Attribute tool dialog box, really trying to understand the basics of SQL and how that plays a role in creating logical expressions, resulting in the features being selected in attribute and non-spatial data tables. In this section, we will look at how to build a proper simple SQL expression, how to combine simple expressions to create complex ones, and some special case expressions we need to use to perform some specific tasks.
5.7.2: Simple Expressions
In the previous section, the established structure of simple SQL expressions were introduced as <field name><comparison operator><value>, where the expression queries the table for a value within a specific field. The comparison operator can be either be an exact or an approximate operator, meaning that within a stated search field, the returned value must be exactly like it's presented in the expression or approximately so, meeting the characters as typed and the wild card characters where appropriate.
Simple expressions are simple because they are used only to return one exact value from one field (regardless of how many times that value appears) or one set of approximate characters and wild cards, and will always follow the established structure. When it comes to learning SQL, simple expressions are definitely easier to learn, since that established structure consists of only three parts - what field to search in, what relationship the value presented in the equation has with the desired returned value, and the actual value itself.
Building Simple SQL Expressions
We’ve looked at all the variables individually, so now let’s put it all together and build some SQL expressions. To return the feature “Massachusetts” withing a States layer:
✔ Again, this section is not to memorize how to the skill, but to watch a demonstration of the main concept behind creating simple SQL expressions.
Result: “State_Name” = ‘Massachusetts’
5.7.3: Complex Structured Query Language Expressions
While it is possible to create simple query after simple query, utilizing the selection method dropdown, there is a faster an more efficient way of creating expressions when more than one value is known to be a contributing factor in the decision to select specific features, and we refer to these as complex SQL queries. Complex Queries can be defined as “two or more simple queries connected by a logical operator, such as AND or OR”, meaning a complex query can only exist if a series of simple queries are strung together.
In-N-Out Burger in California has a pretty simple menu: a hamburger (one patty, no cheese), a cheeseburger (one patty and one slice of cheese), and the Double-Double (two patties and two slices of cheese). They also serve fries, three kinds of shake (strawberry, chocolate, and vanilla), a drink (small, medium, and large). The extremely (and famously) limited menu hasn't changed since 1948 with the exception of lemon aid in 2003 and hot chocolate in 2018. Lemon aid was served long before 2003, but it simply wasn't printed on the menu itself. Because of the extremely limited menu, it makes for a good example of learning SQL. Few choices equate to an easy explanation.
If you think of SQL as a language (which is is), and a simple query as a basic sentence, a complex query must be several simple sentences connected by logical operators. In English, we might say “May I have an order of fries, a burger, and a Lemon-Up?”, which will return a burger, an order of fries, and a Lemon-Up. The language of SQL has its own rules: each item must be ordered with a complete sentence in order to return an answer. If your server only spoke SQL, you would have to say: “SANDWICH= BURGER” AND “SIDE = FRIES” AND “DRINK = LEMON-UP”. Three complete, simple queries in order to return the meal you’d like to eat.
Just like you wouldn’t walk up to the counter at In N Out, say “BURGER!” “FRIES!” “LEMON-UP!”, and (reasonably) expect to get a burger in return, SQL doesn’t like a list of values. In SQL, it is totally and completely incorrect to say “MEAL = BURGER AND FRIES AND LEMON-UP”. In English, with a real person working the register, there is a fairly good chance you might get food if you just say a string of nouns, but as we all know, In N Out serves at least three kinds of burgers. Does BURGER! mean hamburger, cheeseburger, or Double-Double? Do you know the (no longer secret, thanks internet) menu? Did you mean 4x4 or grilled cheese? Did you want your burger Protein Style? What would your random shouts return?
If that counter person spoke only SQL, and you shouted a random string of nouns at them, there would be no chance of food and they would return an error. It is your job as the GIS technician to learn the rules of SQL and not SQL’s job to sort out what on Earth you are talking about.
When we learned the definition of a simple query, we said it always followed the same format, and that is totally still true. When it comes to the word “operator”, however, it specifically means comparison operator, meaning the operator is comparing the values in the specifically stated field. If you used a “greater than” statement, the comparison operator is comparing all the values and sorting those which are above the stated number and those which are below, and returning those above ones. When you use an equal sign, the comparison operator is comparing the value in the expression to all the values listed in the given field, returning those which are exactly the same, character for character.
As we’ve said above, complex queries are a series of simple queries strung together with logical operators. Logical operators are operators which tell SQL that it needs to complete the entire expression to return the proper values. There are several logical operators, but in 101, we focus on “AND” and “OR”. In short, the OR operator means anything is true, while the AND operator means everything is true. If you create an SQL query which reads “BURGER = DOUBLE DOUBLE” AND “BURGER = CHEESEBURGER”, you’ve told SQL you want everything to be true and return one feature (row in the attribute table) where the burger is a Double-Double and a cheeseburger at the same time, it’s not possible. The sandwich has either a single patty and single piece of cheese, or two patties and two pieces of cheese. There is no such burger which has one patty/cheese slice and two patties/cheese slices at the same time (don’t just say: well, then it has 3 patties and 3 cheese slices. That is a whole new burger called a 3x3 - a different and unique value). When selecting values from the same field, it’s not possible for everything to be true. In our In N Out scenario, the only way to have our SQL-only speaking cashier return a cheeseburger and and Double Double at the same time is to use the OR statement - anything is true “BURGER = DOUBLE DOUBLE” AND “BURGER = CHEESEBURGER”, which would return to you a tray with a cheeseburger (the English) and a Double-Double.
When you are going between fields, you can use AND or OR, depending on if you want anything to be true or everything to be true. Let’s say we have several fields describing how we want our burger. If there is a a series of Boolean-type fields for lettuce, tomato, special sauce, grilled onions, raw onions, pickle chips, and Animal Style, we could make complex AND expressions to get the burger just the way we would want it.
For example, a burger with two patties, two slices of cheese, lettuce, tomato, and grilled onions would read “BURGER = DOUBLE DOUBLE” AND “LETTUCE = YES” AND “TOMATO = YES” AND “GRILLED_ONIONS = YES” AND “RAW_ONIONS = NO” AND “PICKLE_CHIPS = NO” AND “ANIMAL_STYLE = NO”. This expression assumes there is a row (feature) for every possible burger combination, since this AND expression would be designed to return only one row.
When it comes to databases, this isn’t exactly how the expressions work, as creating a row in a table for every possible combination would be silly and quite challenging. We use other tactics, such as joins, relates, and IF/THEN statements to combine data. Those are more complex ideas with databases, so for now, we will go with the assumption there are many, many rows with many, many combinations of data.
5.7.4: Logical Operators
We can break operators into two main categories (which we've been introduced to already): comparison operators, which compare what the value in the expression is to the available values in the attribute or non-spatial data table, and logical operators, which create a more complex relationship between two simple SQL queries in order to select a feature with several attributes found to be true or false. In the previous section, we looked at a table of comparison operators and examined how they compared the value given to the values in the table for a specific field, and in this section, we are going to take the same look at logical operators.
|AND||When building complex expressions, the AND logical operator is used to combine two (or more) simple expressions together to find a feature for which both (or more) expressions are true, meaning if an expression is built to find State Name = ‘California’ AND State Name = ‘Colorado’, the query will look for a feature named both California AND Colorado. AND is used to build complex expressions such as State Population greater than or equal to 65000 AND State Area less than 125000 - which will return only states with a population greater than or equal to 65000 AND have a total area less than 125000 square miles.|
|OR||OR is used to combine two (or more) simple expressions together to find features that meet any of the criteria defined, meaning if an expression is built to find State Name = ‘California’ OR State Name = ‘Colorado’, the query will look for a feature named either California OR Colorado.|
|NOT||Used to find values that are not (any of the available operators) to the value defined in the expression. Similar to the Not Equal to button, NOT can be used in front of an expression to make it not equal or not equal to or greater than the defined value, such as NOT State Name less than or equal to ‘California’ will return Colorado to Wyoming, alphabetically - which is the same as greater than . NOT, instead, is most often used in combination with LIKE, such as Owner Name NOT LIKE ‘Br%’, which will return every name in the list that does not start with “Br”.|
This AND That means both this and that must apply to one feature (both queries must be true for a single feature for it to be selected) while either This OR That means either this or that can apply to one feature (one or the other must be true for the item to be selected). For example, if we had a table that had one field for “Feature_Type” and another for “Feature_Status”, an example of an AND statement may be “Feature_Type” = ‘mine’ AND “Feature_Status” = ‘abandoned’ — the only features which would be returned are those where the feature is an abandoned mine. If the expression was “Feature_Type” = ‘mine’ OR “Feature_Status” = ‘abandoned’, any feature which was a mine or was abandoned would be returned (including both abandoned mines and abandoned houses and active mines.... since in all these cases both “mines” is true and (OR to the software) “abandoned” is true.)
|Figure 5.13: Logical Operators AND, OR, NOT|
When building a expression to return two or more values from one field, such as an expression to return both the state of California and the state of Colorado, we must remember SQL sees this as a This OR That situation, “State_Name” = ‘California’ OR “State_Name” = ‘Colorado’, since there will never be a feature in the table which has a state name of Colorado AND a state name of California. All states only have one name, and since we just learned AND expressions require all things in the expression to be true, there will never be a feature selected. According to SQL, there is nothing wrong with the statement based on the syntax (“State_Name” = ‘California’ AND “State_Name” = ‘Colorado’), as it meets the definition of a complex expression - two complete simple statements connected with a logical operator, however, it breaks the logical rules. The only way to return two features based on criteria from the same field is to use an OR statement - a complex statement where any of the criteria are true.
Notice the complex query isn’t written “State_Name” = ‘California’ OR ‘Colorado’, as that is not in line with the definition of a complex query.
|Figure 5.14: Using OR to Select Two Features from One Field|
|OR looks for features that meet any of the criteria||AND looks for a single feature which matches all the criteria|
When building complex queries, any sort of simple query can participate, as long as the logical operator is appropriate. Complex queries can be fairly simple, like our example of “State_Name” = ‘California’ OR “State_Name” = ‘Colorado’, or they can be very complex, utilizing multiple fields, return ranges, wild cards, sub-queries, and a whole lot more not covered in GIS 101.
Complex queries, per the definition, string together two or more simple queries, which mean you must build a simple query, then append to it another simple query - an entire other query.
5.7.5: Saving and Loading SQL Queries
There will be times where you have built a very complex query, verified it, and wish to save it for future use, to share it with a colleague, or in case of calamity, like ArcMap crashing or accidentally closing the window. After clicking Verify button , use the Save button to open the save dialog box, give your query a meaningful name and save location and click Save again to save the file.
If you have a saved query from an earlier session or that someone shared with you, use the Load button to locate the SQL query file (.exp) and populate the Expression Builder window. Once the query has loaded, any modifications you need to make can be done before clicking Apply button .
5.7.6: Special Cases with SQL Expressions
Special Case 1: Finding NULL and Blank Cell Values in a Table
When using SQL to find a string value, we learned we can use the expression Field_Name = ‘Value’ to return values that match the designated value exactly, however, there is a special case when this will not work. We learned in Chapter Four that databases do not like empty cells, and since ArcGIS follows database rules, there cannot be empty cells in ArcGIS tables either, so depending on the table type, cells are populated either with a display value of NULL value or an empty value (meaning there is a 'secret' null in the background, but for readability, the cell shows as blank).
Depending on which display value (as there is no "real" value in the cell, just a 'placeholder' of either " " or "NULL"), we use different SQL expressions to select features. When the cell displays "NULL", we need to use a special case SQL expression of Field_Name IS NULL or Field_Name IS NOT NULL, which basically tells ArcGIS to find the display value Null and not cells populated with the specific text “Null”. When the display value is blank (" "), we can follow the regular rule of "Field_Name" = ' '. It's okay right now that these might be tough to keep straight and you might not be sure which one to use. For now, the take away lesson of the display value cells is when you see a blank cell in ArcMap, it's not really empty. There are display values and each kind of display value is searched for in one of two ways.
Special Case 2: Using the NOT Logical Operator
NOT is a logical operator that will find the opposite of the simple query. For example, NOT “State_Name” = ‘Colorado’ will find all values except Colorado (notice the NOT operator comes before the field name).
We just learned that we use AND for complex queries that find records where all of the simple queries are true and OR to find when any of the simple queries are true. While this remains true when using the logical operator NOT, the way we use AND and OR may seem contradictory to the rule. When we use “positive” SQL queries and we need to find any values which are true, we use OR as the conjunction, especially when we are looking for two or more values in the same field. When we use “negative” SQL queries, we need to use AND to combine all of the simple queries together. For example, if we want to select all of the states except California and Colorado, the SQL statement: NOT “State_Name” = ‘California’ AND NOT “State_Name” = ‘Colorado’ is correct.
5.7.7: A Summary Table of Building SQL Queries