Donate profile for Rudi at Stack Overflow, Q&A for professional and enthusiast programmers

03/07/2016

383 views

Which Search Algorithm? - Part 2


Which search algorithm to use?

In the first installation of Which Search Algorithm, I covered exhaustive and binary searches. I also briefly went through the required knowledge for getting the best out of these articles. This time I will be covering depth first and breadth first searches, so let's jump straight into the subject.


Depth and breadth first searches are performed on graph structures. For this article I am going to assume we are dealing with an undirected graph in it's most abstract sense. We wont assume that we are using a graph of any specific type. As an example, if we assumed we were only dealing with trees we could make assumptions about not having to deal with cyclic paths. By assuming that anything goes in our graph we can theoretically write searches that will work on more varieties of graph.


Before we talk about both searches we had better define our nodes so that we know what we are dealing with. I will leave out getters and setters for brevity (please remember encapsulation in your own code), as well as any validation we might want to do when constructing the nodes.


public class Node {
    public int value;
    public Set<Node> connections = new HashSet<>();
}


Depth First Search


Performing a depth first search simply means following possible edges (or connections) in the graph to the end of the path first before going back and searching alternative routes. It is the computer science equivalent of looking for your keys by thoroughly searching each cupboard, draw, or shelf in a single room before moving onto the next room. Below are some simple examples of depth first searches.


Node depthFirstSearch(Node root, int value){
    Stack<Node> stack = new Stack<Node>();
    Set<Node> visited = new HashSet<Node>();
    if(root.value == value) return root;
    visited.add(root);
    stack.add(root);
    while(!stack.isEmpty()){
        Node node = stack.pop();
        for(Node child : node.connections){
            if(child.value == value){
                return child;
            } else if(!visited.contains(child)){
                stack.add(child);
                visited.add(child);
            }
        }
    }
    return null;
}

Essentially the way this works is we keep track of each node we look at and then we add the node's connections into a stack to check later. Because stacks are last-in-first-out (LIFO) we end up dropping down a layer in the graph each time we check the next item in the stack. This is what it means to search depth first. Take a look at the graphic below for a visual representation of how it works. Try searching for a number in the graph and watch how the search is performed.


1 2 3 4 5 6 7 8 9 10 11 12 13 14 You are using an awful browser and it does not support inline SVG.

Breadth First Search


Running a breadth first search is the opposite approach to searching depth first. When you search breadth first you check all nearby nodes first before wandering off down any particular path. Where a depth first search goes as far as it can away from your starting position before coming back to check a different path, a breadth first search checks everything close by first before moving onto nodes that are farther away.


There is also an interesting and useful side effect of a breadth first search. It can be used to identify the shortest path from your starting position (or root node), to the node you are looking for. As such, breadth first searches are often used for path-finding. Below are some simple examples.


Node breadthFirstSearch(Node root, int value){
    Queue<Node> queue = new LinkedList<Node>();
    Set<Node> visited = new HashSet<Node>();
    if(root.value == value) return root;
    visited.add(root);
    queue.add(root);
    while(!queue.isEmpty()){
        Node node = queue.poll();
        for(Node child : node.connections){
            if(child.value == value){
                return child;
            } else if(!visited.contains(child)){
                queue.add(child);
                visited.add(child);
            }
        }
    }
    return null;
}
Note: The JavaScript implementation of Breadth First Search above is not efficient. The use of the .shift() method is slow ( O(n) rather than O(1) ). I have used a JavaScript array for brevity but I suggest finding a correct queue implementation for your own code.

If you were paying attention to the above examples you may have noticed that they look almost exactly the same. That's because they are almost exactly the same. The only difference in the implementation is that the breadth first search uses a queue instead of a stack for storing nodes to visit next. This way the first nodes you check get visited next (fist-in-first-out) and so you always search by spreading out from where you started. This is ultimately what it means to search breadth first. There is another graphic below demonstrating this. Try searching for a number.


1 2 3 4 5 6 7 8 9 10 11 12 13 14 You are using an awful browser and it does not support inline SVG.

Which One Should We Use?


Determining whether you want to use a depth first or breadth first search depends on the nature of your data as well as the kind of information you want to find. Imagine a family tree expressed as a graph with yourself as the root. If you wanted to perform a search for a living but distant relative you will be wanting to perform a breadth first search. If, on the other hand, you wanted to find historic ancestors from hundreds of years ago a depth first search makes much more sense.


If you have completely unstructured data with no pattern to the depth or breadth it doesn't make a great deal of difference which search type you choose (both have a complexity of O(n+e) where n is the number of nodes and e is the number of edges/connections). Conversely, knowing a little about the structure of your graph can tell you how you should be searching it. There is no hard and fast rules for choosing, it is about knowing your data, and knowing what you are looking for. This stackoverflow answer offers some good rules of thumb.




Rudi Kershaw

Web & Software Developer, Science Geek, and Research Enthusiast