03/07/2016

697 views

# Which Search Algorithm? - Part 2

In the first installation of Which Search Algorithm, I covered exhaustive and binary searches. I also briefly went through the required knowledge for getting the best out of these articles. This time I will be covering depth first and breadth first searches, so let's jump straight into the subject.

Depth and breadth first searches are performed on graph
structures. For this article I am going to assume we are dealing with an undirected graph in it's most abstract sense. We wont assume that we are using
a graph of any specific type. As an example, if we assumed we were only dealing with trees we could make assumptions about not having to deal
with cyclic paths. By assuming that *anything goes* in our graph we can theoretically write searches that will work on more varieties of graph.

Before we talk about both searches we had better define our nodes so that we know what we are dealing with. I will leave out getters and setters for brevity (please remember encapsulation in your own code), as well as any validation we might want to do when constructing the nodes.

- Java
- JavaScript
- Python

```
public class Node {
public int value;
public Set<Node> connections = new HashSet<>();
}
```

## Depth First Search

Performing a depth first search simply means following possible edges (or connections) in the graph to the end of the path first before going back and searching alternative routes. It is the computer science equivalent of looking for your keys by thoroughly searching each cupboard, draw, or shelf in a single room before moving onto the next room. Below are some simple examples of depth first searches.

- Java
- JavaScript
- Python

```
Node depthFirstSearch(Node root, int value){
Stack<Node> stack = new Stack<Node>();
Set<Node> visited = new HashSet<Node>();
if(root.value == value) return root;
visited.add(root);
stack.add(root);
while(!stack.isEmpty()){
Node node = stack.pop();
for(Node child : node.connections){
if(child.value == value){
return child;
} else if(!visited.contains(child)){
stack.add(child);
visited.add(child);
}
}
}
return null;
}
```

Essentially the way this works is we keep track of each node we look at and then we add the node's connections into a stack to check later. Because stacks are
last-in-first-out (LIFO) we end up dropping down a layer in the graph each time we check the next item in the stack. This is what it means to search **depth
first**. Take a look at the graphic below for a visual representation of how it works. Try searching for a number in the graph and
watch how the search is performed.

## Breadth First Search

Running a breadth first search is the opposite approach to searching depth first. When you search breadth first you check all nearby nodes first before wandering off down any particular path. Where a depth first search goes as far as it can away from your starting position before coming back to check a different path, a breadth first search checks everything close by first before moving onto nodes that are farther away.

There is also an interesting and useful side effect of a breadth first search. It can be used to identify the shortest path from your starting position (or root node), to the node you are looking for. As such, breadth first searches are often used for path-finding. Below are some simple examples.

- Java
- JavaScript
- Python

```
Node breadthFirstSearch(Node root, int value){
Queue<Node> queue = new LinkedList<Node>();
Set<Node> visited = new HashSet<Node>();
if(root.value == value) return root;
visited.add(root);
queue.add(root);
while(!queue.isEmpty()){
Node node = queue.poll();
for(Node child : node.connections){
if(child.value == value){
return child;
} else if(!visited.contains(child)){
queue.add(child);
visited.add(child);
}
}
}
return null;
}
```

If you were paying attention to the above examples you may have noticed that they look almost exactly the same. That's because they
*are* almost exactly the same. The only difference in the implementation is that the breadth first search uses a queue instead of a stack for storing nodes to
visit next. This way the first nodes you check get visited next (fist-in-first-out) and so you always search by spreading out from where you started. This is
ultimately what it means to search **breadth first**. There is another graphic below demonstrating this. Try searching for a number.

## Which One Should We Use?

Determining whether you want to use a depth first or breadth first search depends on the nature of your data as well as the kind of information you want to find. Imagine a family tree expressed as a graph with yourself as the root. If you wanted to perform a search for a living but distant relative you will be wanting to perform a breadth first search. If, on the other hand, you wanted to find historic ancestors from hundreds of years ago a depth first search makes much more sense.

If you have completely unstructured data with no pattern to the depth or breadth it doesn't make a great deal of difference which search type you choose
(both have a complexity of **O(n+e)** where `n` is the number of nodes and `e` is the number of edges/connections). Conversely, knowing a
little about the structure of your graph can tell you how you should be searching it. There is no hard and fast rules for choosing, it is about knowing
your data, and knowing what you are looking for. This stackoverflow answer offers
some good rules of thumb.