Anonymous asked in Computers & InternetProgramming & Design · 8 months ago

Help with Java Programming?

Does anybody know how to write a Java code to implement a depth first search (DFS) algorithm to find the largest file on your hard drive, and then how to verify that the implementation of your algorithm works correctly and the largest file can indeed be identified, a breads first (BFS) search algorithm needs to be implemented in the unit test.

1 Answer

  • 8 months ago

    The breadth-first and depth-first programs are nearly the same--especially if you implement them both with loops instead of recursion.  In pseudocode, they both look like:

        Create an empty "to-do list" of directories to search.

        Add the root of the filesystem to the "to-do list".

        While the "to do list" is not empty:

              Remove the first directory from the "to-do list".

              For each file in that directory:

                  See if it's the largest file seen so far.

              End for

              For each subdirectory in that directory:

                   Add the subdirectory to the "to-do list".

              End For

         End While

    The only difference is that in a BFS, the "to do list" is a queue data structure, and in a DFS it's  a stack instead.

    For a recursive DFS, simply replace the "add to the to-do list" action with a recursive call to immediately scan the subdirectory.

    The implementation in Java depends on whether you're using methods of the classic class to scan a directory, or using paths and streams from the java.nio.file package introduced in Java 7.  Your instructor should have shown you how to list then entries in a directory.

    A couple of suggestions for implementing this:

    1. Even though I wrote the body of the "while" loop above as two separate for-loops, you should implement this with a single scan of the directory being searched with if/else to test whether the entry is a file or a subdirectory.  That will cut the number of system calls roughly in half.

    2. Design the top-level interface to scan starting at any given directory; and use a fairly small subdirectory tree for testing.  Run times can be quite long on a large HD or SSD, and *much* longer on a networked filesystem.

Still have questions? Get answers by asking now.