# 代写Project | Data Structure代做 | 作业Algorithm | Java – Graph Representation

### CITS2200 project – Graph Representation

#### Question 1: Shortest Path

In this project, our vertices represent Wikipedia pages, and our edges represent links from one page to another. This means that our graph is directed, but unweighted. In an unweighted graph the length of the shortest path from one vertex to another is the minimum number of edges that must be traversed in order to get from our source to our destination. We can find the lengths of these shortest paths by performing a Breadth First Search (BFS), which enumerates vertices according to the number of edges they are away from our starting vertex. That is, a BFS will visit our starting vertex, then all vertices that are one edge away, then two, and so on. By maintaining an array of distances from our starting vertex, we can fill this array in as we perform our BFS. When the BFS has finished, our array will hold

the length of the shortest path from our source to each vertex it can reach, or the original value of the array if no such path exists. A BFS visits each vertex at most once, and for each vertex iterates over every edge leaving that vertex. This means every vertex and every edge will be considered at most once, giving our BFS an expected and worst-case time complexity of O ( V + E ), which is optimal.

#### Question 3: Strongly Connected Components

A Strongly Connected Component (SCC) is a set of vertices that can all reach each other. This implies that a vertex belongs to exactly one SCC, which may even be just that vertex. This question asks us to partition the vertices of the graph into the SCCs that contain them. There are a several known optimal algorithms, including, in chronological order of publication, Tarjan’s algorithm, Dijkstra’s Path-Based SCC algorithm, and Kosaraju’s algorithm. In the reference solution, we have chosen to implement Kosaraju’s algorithm as it is the simplest to implement. Kosaraju’s algorithm is based on the observation that the SCCs in the original graph are the same as those in the transpose graph (that is, the graph with all edges reversed). This is because for the vertices i and j to be in the same SCC, there must be a path from i to j and from j to i , and so in the transpose graph there must be therefore be a path from j to i and from i to j , meaning they are in the same SCC. We can easily modify addEdge() in order to maintain a copy of the transpose graph as well as the original. We can compute the set of all vertices a vertex can reach by performing a DFS starting at that vertex. Doing this is in both the original graph and the transpose graph is sufficient to compute the SCC to which this vertex belongs, but requires computing the intersection of these two sets, which can be computationally expensive. Kosaraju’s algorithm uses a property of the DFS order through the original graph in order to ensure that the DFS through the transpose graph only explores this intersection. The observation is that any vertex whose subtree was explored before another in the DFS order (this is called post-order) either must not have a path to that other vertex, or is a descendant of it in the DFS tree. We can therefore DFS through the transpose graph starting from the last vertex in post-order, and any vertices it visits must be part of its SCC, as any vertex that is earlier in the post-order that our starting vertex can reach must therefore have been able to reach and be reached from this starting vertex in the original graph. We can then DFS again from the next highest vertex in the post-order that is not yet visited in order to find all its SCC, and so on until we have found all the SCCs. This algorithm requires just two full-graph DFSs, each of which visit every vertex exactly once and consider each edge leaving a vertex exactly once, giving a time complexity for Kosaraju’s algorithm of O ( V + E ), which is optimal.

#### Question 4: Graph Centers

The center of a graph can be thought about in the same terms as the center of a circle: It is the point from which the most distant other point is as close as possible. In a graph we can think about this as the vertex from which the most distant other vertex (by shortest path distance) is as close as possible; that is, the vertex with minimum eccentricity. Unlike circles, however, the definition is not wholly unambiguous: It is possible for a graph to have multiple centers, and for there to be vertices that are unable to reach another vertex. This question permits us to make our own decision as to how to define centers in the case of a disconnected graph, and so for the reference implementation we have chosen to ignore nonexistent paths when computing the eccentricity of a vertex. We did not test for ambiguous cases, and any reasonable way of handling them was accepted. To find the set of vertices with minimum eccentricity, we need simply compute the eccentricity of each

vertex, which is the length of the maximum length shortest path from this vertex to any other. There are many ways of computing this, but we conveniently already have an optimal method for computing the lengths of the shortest paths to each vertex: Our BFS from question 1. To compute the eccentricity of a vertex we need simply perform a BFS from that vertex and take the maximum value from the distances array produced. Doing this once for each starting vertex and maintaining a list of vertices with minimum eccentricity gives us a list of graph centers. This method requires us to perform one BFS per vertex, giving an overall time complexity of O ( V ( V + E )), which reduces to O ( V E ), which is optimal. 2

• V

#### Common Mistakes

We observed a number of common mistakes while marking this project. Here we have compiled a list of some of them, with explanations where appropriate: Using and misanalysing an adjacency matrix: An adjacency matrix takes O ( V^2 ) time to build and requires O ( V ) time to find all neighbours of a vertex. This means that BFS and DFS are no longer O ( V + E ), but rather O ( V^2 ). This made many people’s solutions to all problems slower than expected. Transposing an adjacency matrix: There is no reason to transpose an adjacency matrix; It takes O ( V^2 ) time to do so. Just swap the indices: adjMat[i][j] = transpose[j][i] Using Dijkstra’s algorithm: Many people solved question 1 using Dijkstra’s algorithm, which in an unweighted graph has the exact same function as a BFS, but is slower. Many people also mistook Dijkstra’s algorithm for BFS and vice versa. Misanalysing Dijkstra’s algorithm: Dijkstra’s algorithm was also commonly analysed incorrectly. The O ( E + V log V ) worst-case complexity given on Wikipedia is only when using a special Data structure called a Fibonacci heap as the basis for the priority queue. java does not include Fibonacci heap, and nor did any project submissions. Using a more typical binary heap based priority queue, such as the one in the Java standard library gives O ( E log V + V log V ). A linear time priority queue gives O ( E V + V^2 ). Using Math.pow(2, n) instead of (1 << n) : Java’s Math.pow() function is not constant time, but is rather logarithmic in the power. This introduced an O ( log n ) factor that is not present when using bit-shifts. Not trying every start vertex for Hamiltonian Path: If you are using the backtracking DFS to solve Hamiltonian path, you must remember to try every vertex as a potential starting vertex. This is likely due to people adapting Hamiltonian cycle algorithms. Incorrectly claiming to use HeldKarp: Many people reported using the HeldKarp DP algorithm described above, while actually using backtracking or another factorial algorithm. Misanalysing calling getShortestPath() for all pairs of vertices: Many people computed vertex eccentricities for getCenters() by first calling their getShortestPath() method for all pairs of vertices using a pair of nested for loops, and then analyzed this

incorrectly as being O ( V ) when it is actually or worse depending on the 2 O ( V V )) 2 ( + E shortest path algorithm used. Using Floyd-Warshall: Floyd-Warshall is a good All Pairs Shortest Path algorithm for weighted graphs. Our graph is unweighted, however, meaning repeated BFS is better. Returning the wrong answer for edge cases: Many people forgot to follow the spec in several cases. These included not returning -1 when no shortest path exists, not returning 0 for the shortest path from a vertex to itself, not returning an empty array when no Hamiltonian path exists, and returning arrays containing nulls. In Java a 2D array is just an array of arrays, and so can be jagged rather than square. Not implementing the interface: An interface was provided to ensure that your code was compatible with our testing systems. Many people either failed to implement the interface correctly or modified the interface. Be glad we bothered to mark your submissions. Not compiling or missing code: Many submissions did not compile, often due to missing code. Where possible we fixed them. Be glad we bothered to mark your submissions.