Ben Fulton
Look at this graph. How wide is it?
It looks like S and V, or perhaps Y, are the nodes that are the farthest apart. If the numbers on the edges are accurate representations, the distance from S to Y and S to V should be the same, but if you add them up, you see that they aren’t. S to U to V sums up to 11, while S to X to Y is 7.
So our intuitive answer to the question of how wide the graph is is not quite right. We have to find the two nodes that are farthest apart. But if the graph has a loop, or cycle, you could just go around the loop forever and have two nodes that are infinitely far apart!
Let’s say instead that the distance between two nodes is always the shortest possible distance. Then the width, or diameter, of the graph is the longest, shortest distance between any two nodes.
We’ll represent our graph as a Python dictionary. The keys will be the names of the nodes, the values will be lists of 2-tuples, where the first value is the connecting node and the second value is the distance between the nodes. The graph in the picture will look like this:
graph = dict()
graph['S'] = [('U', 10), ('X', 5)]
graph['U'] = [('X', -2), ('V', 1)]
graph['V'] = [('Y', -5)]
graph['X'] = [('U', 3), ('Y', 2)]
graph['Y'] = [('V', 6), ('S', 7)]
The method for calculating this has been known for more than fifty years and is known as the Floyd-Warshall Algorithm. Here’s an example of implementing it in Python.
First, let’s create a dictionary to track the distances between any two nodes. Since we don’t know the distances between the nodes yet, we’ll set all the values to one million.
dist = defaultdict(lambda: defaultdict(lambda: 1000000))
Now let’s fill out the values we do know. We know that the distance from a node to itself is zero.
for v in graph.keys():
dist[v][v] = 0
We know the distances between the nodes that have direct paths. It could be that the direct path isn’t the shortest path, but it’s a place to start:
for u, values in graph.iteritems():
for v in values:
dist[u][v[0]] = v[1]
Note that it’s possible to define a graph where a distance from a node to itself is something other than zero. If it is, that will be picked up in the previous lines of code.
Now, we’ll go through and try to improve the distance between every pair of nodes. Say we look at nodes i and j. We’ll look at every node k, and figure out whether the distance from i to k to j is less than the distance from i to j that we’re already aware of.
for k in graph.keys():
for i in graph.keys():
for j in graph.keys():
if dist[i][j] > dist[i][k] + dist[k][j]:
dist[i][j] = dist[i][k] + dist[k][j]
After this triple loop, our dist dictionary will hold the shortest distances between any two nodes.
So what is the diameter of our graph?
print max(product(dist.keys(), dist.keys()), key=lambda x: dist[x[0]][x[1]])
('Y', 'U')
The shortest distance from Y to U is to go from Y to S to X to U. The edge lengths are 7, 5, and 3 respectively. So the diameter of our graph is 15.