Every day, billions of queries are entered into the Google search bar. We’re constantly looking for something — the definition of a word, the bus schedule, the lyrics to that certain song. However, in the realm of computer science the term “search” has a slightly broader meaning. With search algorithms, we’re often not looking for a single item, but for a series of steps: the optimal strategy for solving a given problem. That fact is what makes this family of algorithms so central to so many different applications.

What Are Search Algorithms?

Search algorithms seek to answer a question by exploring a defined domain. Let’s say that we want to look up someone’s phone number. The domain, or search space, is the set of all possible answers to our question — in this case, all the entries of the phone book. Clever search algorithms employ strategies to reduce the search space as much as possible, narrowing down the set of possible solutions (which can number in the millions) to a small subset. In our phone book example, the data is one-dimensional and ordered which, as we will see, is about the most easily manageable of the various types of search spaces.

Because of the importance of structuring the data to be explored, search is closely linked to sorting algorithms. To learn more about sorting algorithms, have a look at our Sorting Algorithms Explained blog post.

Examples of Searches

Let’s look at a couple of common problems that require smart solutions because of the dimensionality of their search space.

Timetable Optimization

A university campus has a large number of rooms whose use must be coordinated to satisfy several constraints, such as the timetables of professors and students or the compatibility of certain classes. Because there are so many possible combinations of courses and rooms, the search space can quickly become too large to be searched with exhaustive methods.

Pathfinding

Here’s an even harder problem. Route planning systems like bbbike work by suggesting the best path from one point to another. Note that “best” doesn’t always mean fastest — it could also mean the calmest with the least traffic, or one that leads you through a park. In practice, route planning has to convert the messy reality of a geographical entity, such as a city, into a graph of nodes and edges in order to turn it into a searchable space.

Integer Factorization

In factorization, we break down an integer into the product of prime numbers. Cryptography exploits the complexity of this problem. RSA encryption keys consist of large integers which are the product of two prime numbers. This process is easy to do in one direction — simply pick two very large primes and multiply them — but hard to reverse. The higher the integer, the larger the search space. However, greater computing power and the invention of new algorithms allow for the decryption of increasingly large encryption keys.

How Do Search Algorithms Work?

The simplest search algorithm imaginable looks at all possible combinations and applies each one to the item sought. The complexity of this algorithm exactly equals the size of the search space. In the example of the unordered library, we might get lucky and find the book that we are looking for early on, but in the worst case, we would have to go through all the books until we reach the very end of the search space. This method is called brute force, and while it is sometimes the only feasible solution to a problem, it can hardly be called an efficient strategy. So let’s turn our attention to some of the more interesting algorithms out there.

Search Algorithms You Should Know

Binary Search

The binary search algorithm is often taught in computer science classes due to its simplicity and effectiveness: It exploits the structure of ordered data. Let’s return to our phone number example. The phone book is ordered alphabetically; let’s say the person we want to call is our friend Eli. Binary search works by recursively splitting the data into subsets of equal size, then comparing the results to the search index. So we open our phone book in the middle and see that we have landed at the letter J. Since the letter E comes before that, we can ignore the second half and zoom in closer on the first half. We open it again, now at the letter D. This time, we ignore the first half and continue with the second half. We repeat this strategy until we find the name we’re looking for.

Because of the binary nature of the search, the complexity is reduced to O(log n). This means that for a phone book consisting of 1,000 entries, we need to submit no more than 10 queries before we are guaranteed to find the page with Eli’s number. And if we know that our data is uniformly distributed, we can use an even faster version of this algorithm called interpolation search. In this search strategy, the algorithm exploits the uniform distribution by making smarter guesses about the position of the search term. We don’t split the search space right down the middle, instead choosing areas closer to the actual position we seek.

The A* Algorithm

Let’s think about the much harder problem of pathfinding. Since spatial data is at least two-dimensional, the search space grows much faster than with our phone book. However, we’re used to our mobile phones recommending reasonably good routes in seconds. How do they do that?

Truth be told, they use a combination of different algorithms and other strategies to solve this difficult problem — but at the heart of most pathfinding systems is a variant of the A* algorithm. Pathfinding’s difficulty lies in the fact that we want the optimal route between two nodes without having to explore all possible routes. A* employs two strategies that, used together, considerably narrow down the search space.

Whenever the algorithm decides on the next node to visit, it computes an estimate of the candidate nodes’ distance to the target. In addition, A* uses the edge-weight method made popular by Dijkstra’s algorithm. In this method, not all edges have the same cost; some will have lower costs (for instance, a well-paved and rarely used road), while some will be assigned higher costs (perhaps a road riddled with potholes that’s always crowded during rush hour).

If you’d like to know more about this algorithm and to see some interactive examples of it in action, have a look at the A* tutorial from Red Blob Games.

Why Are Search Algorithms Useful?

We’ve just outlined some ways in which search algorithms can be useful. In general, they help us process our data more efficiently. This is true for the data-rich environment of the internet, too, where it is increasingly important to be found. In fact, an entire industry has devoted itself to promoting websites to the top of a Google search — search engine optimization (SEO). In SEO, it is the job of the content and SEO teams to optimize websites in a way that will attract more people.

If you’re interested in learning more about SEO, check out our Digital Marketing Nanodegree.

Summary

In this tutorial, we learned about the importance of search in the computer science domain. We looked at the ubiquity of search algorithms in web and mobile applications and at how it extends even to the physical world. We explored search-space reduction and covered two of the most popular search algorithms, binary search and A*.

If you’re interested in learning more about algorithms, enroll in Udacity’s Introduction to Programming Nanodegree.

Start Learning