The question comes from the online course Algorithm Design and Analysis (Part II) in coursera. It is an all-pairs shortest path problem. As mentioned in the wikipedia, a more straightforward solution with Floyd–Warshall algorithm takes \(O(N^3)\) complexity, and the more computational efficient approach is to use a combination of Dijkstra’s algorithm, Bellman-Ford algorithm, and Johnson’s algorithm, which chould decrease the complexity to \(O(N^2logN)\). I implement the latter one for more interesting and challenging. There are several obstacles make the implementation a little bit trickier than I thought.
I have tried two or three different versions of Dijkstra’s algorithm while solving other math puzzles, and this time I use the built-in heapq function in Python. Keeping the “about-to explore” nodes in order and extracting the smallest/largest cost one in every iteration are the essense of this algorithm. But the cost to every node has to be updated when a node moves from “about-to explore” to “fully explored” category. At first, I just remove the original value and push the updated one into the heap. Because the heapq has no updated or remove methods, I can only use the remove mehtod of the list. This works fine for small amount of nodes (200) but the heap cannot keep right struture when the data becomes just a little bigger (500). Then I have to heapify it every time I update a “about-to explore” node.
The Bellman-Ford algorithm is more expensive compared with Dijkstra’s algirhtm if the graph is densily connected, but it deals with negative edge cost. This is also the one that helps me fully appreciate the dynamic programming.The implementation is intuitive and straightforward. For every vertex, the brutal force search performs to find the current optimal solution based on the previous knowledge. The optimization for space complexity is to only store the most recent result–keep an 2*N array and use a pointer filp-flop in every iteration is more effecient than keeping two 1*N array, discarding the older one and reclaiming a new one in every iteration. Aother optimization I make is to store the “about-to explore” nodes in a bucket, just as what Dijkstra’s does, which eliminates plenty of unnecessary calculation. But its tricky aspect is that because this algorithm is computing “distributed”, different from the “centralized” spanning of Dijkstra’s, if several “exploring” vertices point to a same “about-to explore” vertex, only the optimal cost should be kept.
Finally, the two algorithms above are connected by the Johnson’s algorithm. Dijkstra’s algorithm cannot deal with negative edge costs, but much more fast than Bellman-Ford especially in the case of “all-pairs”. Furthermore, the negative costs cannot be got rid of by adding uniformly for every edge. Johnson’s algorithm neatly solves it by adding pseudo-node to find weights of vertices and changing values of edges concerning their connected vertices. There is no hidden obstacle in the implementation. Only the special care should be taken for manipulating the nodes. It reminds me coding in C…
The overall algorithm solves the question and runs kind of standard. Still, I do not satisfy with the update procedure of heap. I will write a new version of this data structure in the near future and see if it can provide significant boost for the performance.
p.s. The vistualization of code block is optimized from the original Octopress style to Github-style based on the tutorial.
p.p.s. The inline latex-style formula comes with Kramdown, MathJax and this tutorial. Besides, there is another minor bug to deal with.