Parallel Path Traversal

We are not sure that this one exists but we have used the following parallel traversal to speed-up Corrade's initial folder refresh.

The traditional concept is to use a queue for a breadth-first traversal and a stack for a depth-first traversal. For both kinds of traversals each folder is either popped or dequeued one element at a time and then multiple folders are pushed, respectively, enqueued to be visited next. Given a certain amount of processors, it is possible to pop or dequeue a slice of the stack, respectively, queue and process the waiting folders in parallel.

fuss_algorithms_trees_traversal_parallel_queue_overview.svg

If you want to follow the analogy of queues or stacks, the parallel processing occurs when, queued at a store, there are several tellers that can serve clients instead of a single teller.

In the following example, we will use a queue - but exactly the same idea applies to stacks:

// Create the queue of folders.
Queue<InventoryFolder> inventoryFolders = new Queue<InventoryFolder>();
// Enqueue the first folder (root).
inventoryFolders.Enqueue(Client.Inventory.Store.RootFolder);
 
// Create a list of semaphores indexed by the folder UUID.
Dictionary<UUID, AutoResetEvent> FolderUpdatedEvents = new Dictionary<UUID, AutoResetEvent>();
object FolderUpdatedEventsLock = new object();
EventHandler<FolderUpdatedEventArgs> FolderUpdatedEventHandler = (sender, args) =>
{
    // Enqueue all the new folders.
    Client.Inventory.Store.GetContents(args.FolderID).ForEach(o =>
    {
        if (o is InventoryFolder)
        {
            inventoryFolders.Enqueue(o as InventoryFolder);
        }
    });
    FolderUpdatedEvents[args.FolderID].Set();
};
 
do
{
    // Dequeue all the folders in the queue (can also limit to a number of folders).
    HashSet<InventoryFolder> folders = new HashSet<InventoryFolder>();
    do
    {
        folders.Add(inventoryFolders.Dequeue());
    } while (!inventoryFolders.Count.Equals(0));
    // Process all the dequeued elements in parallel.
    Parallel.ForEach(folders.Where(o => o != null), o =>
    {
        // Add an semaphore to wait for the folder contents.
        lock (FolderUpdatedEventsLock)
        {
            FolderUpdatedEvents.Add(o.UUID, new AutoResetEvent(false));
        }
        Client.Inventory.FolderUpdated += FolderUpdatedEventHandler;
        Client.Inventory.RequestFolderContents(o.UUID, Client.Self.AgentID, true, true,
            InventorySortOrder.ByDate);
        // Wait on the semaphore.
        FolderUpdatedEvents[o.UUID].WaitOne(Configuration.SERVICES_TIMEOUT, false);
        Client.Inventory.FolderUpdated -= FolderUpdatedEventHandler;
        // Remove the semaphore for the folder.
        lock (FolderUpdatedEventsLock)
        {
            FolderUpdatedEvents.Remove(o.UUID);
        }
    });
} while (!inventoryFolders.Count.Equals(0));

The principle is that whilst dequeueing a folder is fast, the actual processing of the folder contents take a considerable long time such that we spawn multiple threads to process the folders in parallel. Note that the queueing and dequeueing could also be done in parallel but that the important part is that the elements in the queue get processed in parallel.


fuss/algorithms/path_traversal/parallelism.txt · Last modified: 2017/02/22 18:30 (external edit)

Access website using Tor Access website using i2p


For the copyright, license, warranty and privacy terms for the usage of this website please see the license, privacy and plagiarism pages.