Relational databases store information in tables — in columns that describe aspects of an item and rows that tie together the columns. The SQL In clause is an example of a SQL operation that addresses a specialized programming need: To iterate through collections (sets of related data). To illustrate, let’s consider the offerings of a large department store, described by the following SQL:
CREATE TABLE superstore (
dept VARCHAR(20),
kind VARCHAR(20),
item VARCHAR(50)
);
We add sample data:
INSERT INTO superstore
VALUES
( 'bedroom', 'mattresses', 'Happy Fluffy Dreams' ),
( 'bedroom', 'bedding', 'Comfy Comforter' ),
( 'bedroom', 'bedding', 'Exceedingly Firm Pillow' ),
( 'kitchen', 'cookware', 'Comically Huge Slotted Spoon' ),
( 'kitchen', 'cookware', 'Overly Heavy Cast Iron Pan' ),
( 'kitchen', 'scullery', 'Optimistic Motivational Poster' ),
( 'kitchen', 'scullery', 'Humorous Hand Towels' );
Notice the top-down hierarchy: the store is divided into departments, which in turn groups their offerings by kind, into which the products are organized.
Using or And = to Build a Collection
The canonical way of finding all products of one kind or another is to have a WHERE
clause with an equality test for each possibility, each test connected with a boolean OR operator (because we want to find out if any of the options match the object kind). The SQL to evaluate whether a value is either 'cookware'
or 'scullery'
is:
SELECT
*
FROM
superstore
WHERE
kind = 'cookware' OR kind = 'scullery' ;
dept | kind | item |
kitchen | cookware | Comically Huge Slotted Spoon |
kitchen | cookware | Overly Heavy Cast Iron Pan |
kitchen | scullery | Optimistic Motivational Poster |
kitchen | scullery | Humorous Hand Towels |
Using IN to Build a Collection
“Syntactic sugar” — syntax within a programming language designed to make the code easier to read, express, and maintain over the code lifetime — assists with this task, as the following SQL shows the repetitive equality tests linked by OR
replaced by the IN
operator:
SELECT * FROM superstore;
SELECT
*
FROM
superstore
WHERE
kind IN ('cookware', 'scullery');
Writing, maintaining, and reading such a list — ('a', 'b', 'c', 'd')
— is much less prone to programmer error than the repeated kind = 'a' OR
boolean test blocks.
Using AND and != to Build an Anti-Collection
Exactly opposite from the previous examples, now we concentrate on building an anti-collection — elements that are not members of the specified set. Using a combination of AND
and !=
(not equals) our SQL looks like this:
SELECT
*
FROM
superstore
WHERE
kind != 'cookware' AND kind != 'scullery' ;
dept | kind | item |
bedroom | mattresses | Happy Fluffy Dreams |
bedroom | bedding | Comfy Comforter |
bedroom | bedding | Exceedingly Firm Pillow |
Using NOT IN to Build An Anti-Collection
Continuing to show how syntactic sugar can make code easier to read and maintain, this is how NOT IN
simplifies the previous code:
SELECT
*
FROM
superstore
WHERE
kind NOT IN ('cookware', 'scullery');
Using IN With a Subquery to Dynamically Fetch Values
To extend our example, consider our store’s website allows visitors to specify their purchasing interests. The following code creates a preferences table and populates it with values for an enthusiastic home cook:
CREATE TABLE preferences (
interest VARCHAR(20)
);
INSERT INTO preferences
VALUES
( 'cookware' ),
( 'scullery' );
In the examples above, the kind IN (‘cookware’, ‘scullery’) action has hard-coded values. The preferences table allows for a dynamic value look-up, through a “subquery” (a SQL Select which returns values to its parent SQL Select.
To build a customized collection for this visitor, in order to display enticing products, we use IN
with a subquery:
SELECT
*
FROM
superstore
WHERE
kind IN (SELECT interest
FROM preferences );
Using a subquery allows for a clean separation of product and preference, decreasing chances of ill-considered side-effects when either are changed in future code revisions.
dept | kind | item |
kitchen | cookware | Comically Huge Slotted Spoon |
kitchen | cookware | Overly Heavy Cast Iron Pan |
kitchen | scullery | Optimistic Motivational Poster |
kitchen | scullery | Humorous Hand Towels |
The Difference Between IN and EXISTS
In the examples above, the IN
operation iterates over a small list of elements, whether hard-coded or provided by a subquery, which are each examined in turn. When the list becomes large, performance falters. The EXISTS
operator provides a higher-performing alternative.
While EXISTS
is not exact syntactic sugar for IN, it can be used to a similar effect. EXISTS
returns values only if the subquery comparison is true, resulting in better performance over very large datasets.
In a fashion similar to SQL table joins, a column of common values is needed. In the following example, the connection between product and preference is the WHERE
kind = interest comparison.
SELECT
*
FROM
superstore
WHERE EXISTS (SELECT idea
FROM preferences
WHERE kind = interest );
Conclusion
Covered here are ways of creating collections and anti-collections by the parallel methods of:
IN
vs=
andOR
NOT IN
vs!=
andAND
Also covered are the concepts of “syntactic sugar” and the benefits of using specialized keywords (like IN
and EXISTS
) to enhance the reliability of code and decrease the long-term maintenance costs. Lastly, the relationship between the IN
and EXISTS
clauses is demonstrated.
To learn more about SQL, check out other SQL blog posts and enroll in our SQL Nanodegree program.
SQL Sample Table Code
The following will build and populate the sample data referred to in this blog post. Test SQL interactively at SQL Fiddle or ExtendsClass.
CREATE TABLE superstore (
dept VARCHAR(20),
kind VARCHAR(20),
item VARCHAR(50)
);
INSERT INTO superstore
VALUES
( 'bedroom', 'mattresses', 'Happy Fluffy Dreams' ),
( 'bedroom', 'bedding', 'Comfy Comforter' ),
( 'bedroom', 'bedding', 'Exceedingly Firm Pillow' ),
( 'kitchen', 'cookware', 'Comically Huge Slotted Spoon' ),
( 'kitchen', 'cookware', 'Overly Heavy Cast Iron Pan' ),
( 'kitchen', 'scullery', 'Optimistic Motivational Poster' ),
( 'kitchen', 'scullery', 'Humorous Hand Towels' );
CREATE TABLE preferences (
interest VARCHAR(20)
);
INSERT INTO preferences
VALUES
( 'cookware' ),
( 'scullery' );