**Graph Data structure**

A graph is an abstract data structure representation of connected nodes (also called vertices) by various edges (or the link/distance between nodes). Theres two kinds of graphs, **directed** and **undirected**. Directed means that each set of nodes are connected by edges, where the edges have a direction associated with them. For example, Node A has a edge to Node B, however Node B does not have an edge back to Node A. In an undirected graph, all nodes are connected by edges that are 2-way (as in Node A can has an edge to Node B and vice versa). For this tutorial, our main focus will be the directed graph(s).

Representing a directed graph; in terms of mathematical use, means being able to also represent the costs/distances between nodes and edges (if and edge ‘E’ exists between some vertices U and V). In mathematics, a graph is represented like so: *V = {V1, V2, V3, V4…}* where V is the set of the vertices in the graph, and, *E = {(V1,V2, 3), (V2,V4,7), (V1,V3, 5), (V3, V1, 5)}* where E represents the set of edges we have.

When a vertex has a known edge to another vertex, or two vertices have an edge, we say that those vertices are **adjacent**. Adjacency will be important later when we implement graph traversal algorithms. In directed graphs, for example, if Node A has an edge to Node B, then A is adjacent to B. However, if Node B does not have an edge back to Node A, then B isnt adjacent to A.

Graphs with specified distances between nodes are called **weighted graphs**. As for graphs with no distances, those are called **unweighted graphs**. Common instances of unweighted graphs tend to be associated with undirected graphs, and in most/common cases, the default distances for a set of nodes is constant (1).

In the context of computer science, a matrix representation is used to represent a graph, called an **adjacency matrix**. An adjacency matrix is essentially a 2D array (or matrix) where each row represents a vertex (node) and column represents a destination vertex. In the matrix, if there is an edge between two vertices, then a distance greater than 0 is specified. If no edge exists we put 0.

**For example, in the adjacency matrix below, here’s how we read the table:**

- From vertex A to A, the distance is 0. There is no edge going to the same vertex and it has no edge to itself.
- From vertex A to C, the cost is 5. From vertex B to C, the cost is 1. And from vertex B to D, the cost is 3.
- From vertex C to A, there is a 0 because there is no edge going back and forth to it.

**Graph Functionality:**

In most cases, a graph is defined by the following functions:

**CreateGraph():**you could also create and initialize your graph using a constructor, but doing in a method is better practice and creates less clutter in terms of how the code will look like. This method basically creates the graph structure using information from a source like a text file or database, and creates an adjacency matrix (or list) based on the information (usually the matrix is 2D array, and the list is a an array with a linked list for each index).**AddEdge(node u, node v, int edge):**adds an edge between two vertices**RemoveEdge(node u, node v):**removes an edge between two vertices**Adjacent(node u, node v):**determines/tests whether an edge exists between a node U and node V**Neighbors(node u):**lists all nodes V adjacent to a node U if edge(s) exist**GetNodeValue(node u):**returns the value or data associated with a specified node**SetNodeValue(data, node):**sets a value or data to a specified node**Display():**displays the graph as an adjacency matrix, in some graphical interfaces both the graph and table are drawn

**Graph Traversal (Breadth First Search):**

- In order for us to do something with this graph, we have to traverse it (visiting all the nodes)
- Breadth First Search is a method of graph & tree traversal
- It works by starting a some source node, and explores the neighbouring nodes first, then moves on the next level of neighbouring nodes
- For this tutorial, we will be doing a variant of the breadth first search: Dijkstra’s Shortest Path algorithm

**Interactive Breadth First Search**

**Pseudo-code:**

1procedureBFS(G,v) where G is our Graph, and v is a vertex 2 letQbe a queue 3Q.enqueue(v) 4 labelvas discovered 5whileQis not empty 6v←Q.dequeue() 7display(v) //display the vertex8for alledges fromvtowinG.adjacentEdges(v)do9ifwis not labeled as discovered 10Q.enqueue(w) 11 labelwas discovered

**Dijkstra’s Algorithm (shortest path):**

One of the most prominent and common uses of the graph data structure is to perform Dijkstra’s shortest path algorithm. Basically, we have a graph, and some starting point, and we determine the shortest path to visit within the graph to reach some target (sometimes, it can also be the shortest path that visits all the nodes). This algorithm can be of great use, a real world example is Google Maps and getting directions to and from addresses. It is an example of a breadth first search (a variant).

In essence:

The idea of the algorithm is very simple.

1. It maintains a list of unvisited vertices.

2. It chooses a vertex (the source) and assigns a maximum possible cost (i.e. infinity) to every other vertex.

3. The cost of the source remains zero as it actually takes nothing to reach from the source vertex to itself.

4. In every subsequent step of the algorithm it tries to improve(minimize) the cost for each vertex. Here the cost can be distance, money or time taken to reach that vertex from the source vertex. The minimization of cost is a multi-step process.

a) For each unvisited neighbor (vertex 2, vertex 3, vertex 4) of the current vertex (vertex 1) calculate the new cost from the vertex (vertex 1).

b) For e.g. the new cost of vertex 2 is calculated as the minimum of the two ( (existing cost of vertex 2) or (sum of cost of vertex 1 + the cost of edge from vertex 1 to vertex 2) )

5. When all the neighbors of the current node are considered, it marks the current node as visited and is removed from the unvisited list.

6. Select a vertex from the list of unvisited nodes (which has the smallest cost) and repeat step 4.

7. At the end there will be no possibilities to improve it further and then the algorithm ends

Further detail.

**Dijkstra Interactive Animation/Visual**

*Algorithm (pseudo-code):*

1functionDijkstra(Graph,source): 2 create arrays dist[], prev[] 3 create PriorityQueue pq 4 5for eachvertexvinGraph:// Initialization6 dist[v] ← INFINITY// Unknown distance from source to v7 prev[v] ← 0// Previous node in optimal path from source8 9 dist[source] ← 0// Distance from source to source10 enqueue source // source node 11 enqueue u to pq // All nodes in the adjacency matrix (unvisited nodes) 12whilepq is not empty: 13u← vertex in pq with min dist[u]// Source node will be selected first14 removeufrompq15 16for eachneighborvofu:// where v is still in Q.17alt← dist[u] + length(u,v) 18ifalt< dist[v]:// A shorter path to v has been found19 dist[v] ←alt20 prev[v] ←u21 pq.enqueue(u, dist[v]) 22returndist[], prev[]

**Implementation in C#:**

For my implementation, I used my priority queue implementation that I made to perform Dijkstra’s on an adjacency matrix. ** Please note that I use non-zero arrays for my adjacency matrix implementation**:

class Graph { class NodeData { private int index; public string data; public NodeData(string data, int index) { this.index = index; this.data = data; } } /// <summary> /// 4 attributes /// A list of vertices (to store node information for each index such as name/text) /// a 2D array - our adjacency matrix, stores edges between vertices /// a graphSize integer /// a StreamReader, to read in graph data to create the data structure /// </summary> private List<NodeData> vertices; private int graphSize; private StreamReader sr; private int[,] adjMatrix; private const int infinity = 9999; public Graph() { vertices = new List<NodeData>(); sr = new StreamReader("graph.txt"); CreateGraph(); } private void CreateGraph() { //get the graph size first graphSize = Convert.ToInt32(sr.ReadLine()) + 1;//non-zero arrays, add 1 adjMatrix = new int[graphSize, graphSize]; //ASSUME ALL DATA HAS BEEN READ FROM A TEXT FILE & ADJACENCY MATRIX HAS BEEN INITIALIZED } public void RunDijkstra()//runs dijkstras algorithm on the adjacency matrix { Console.WriteLine("***********Dijkstra's Shortest Path***********"); int[] distance = new int[graphSize]; int[] previous = new int[graphSize]; for (int i = 1; i < graphSize; i++) { distance[i] = infinity; previous[i] = 0; } int source = 1; distance = 0; PriorityQueue<int> pq = new PriorityQueue<int>(); //enqueue the source pq.Enqueue(source, adjMatrix); //insert all remaining vertices into the pq for (int i = 1; i < graphSize; i++) { for (int j = 1; j < graphSize; j++) { if (adjMatrix[i, j] > 0) { pq.Enqueue(i, adjMatrix[i, j]); } } } while (!pq.Empty()) { int u = pq.dequeue_min(); for (int v = 1; v < graphSize; v++)//scan each row fully { if (adjMatrix[u,v] > 0)//if there is an adjacent node { int alt = distance[u] + adjMatrix[u, v]; if (alt < distance[v]) { distance[v] = alt; previous[v] = u; pq.Enqueue(u, distance[v]); } } } } //distance to 1..2..3..4..5..6 etc lie inside each index for (int i = 1; i < graphSize; i++) { Console.WriteLine("Distance from {0} to {1}: {2}", source, i, distance[i]); } printPath(previous, source, graphSize - 1); } private void printPath(int[] path, int start, int end) { //prints a path, given a start and end, and an array that holds previous //nodes visited Console.WriteLine("Shortest path from source to destination:"); int temp = end; Stack<int> s = new Stack<int>(); while (temp != start) { s.Push(temp); temp = path[temp]; } Console.Write("{0} ", temp);//print source while (s.Count != 0) { Console.Write("{0} ", s.Pop());//print successive nodes to destination } } public void AddEdge(int vertexA, int vertexB, int distance) { if (vertexA > 0 && vertexB > 0 && vertexA <= graphSize && vertexB <= graphSize) { adjMatrix[vertexA, vertexB] = distance; } } public void RemoveEdge(int vertexA, int vertexB) { if (vertexA > 0 && vertexB > 0 && vertexA <= graphSize && vertexB <= graphSize) { adjMatrix[vertexA, vertexB] = 0; } } public bool Adjacent(int vertexA, int vertexB) { //checks whether two vertices are adjacent, returns true or false return (adjMatrix[vertexA, vertexB] > 0); } public int length(int vertex_u, int vertex_v)//returns a distance between 2 nodes { return adjMatrix[vertex_u, vertex_v]; } public void Display() //displays the adjacency matrix { Console.WriteLine("***********Adjacency Matrix Representation***********"); Console.WriteLine("Number of nodes: {0}\n", graphSize - 1); foreach (NodeData n in vertices) { Console.Write("{0}\t", n.data); } Console.WriteLine();//newline for the graph display for (int i = 1; i < graphSize; i++) { Console.Write("{0}\t", vertices[adjMatrix[i,0]].data); for (int j = 1; j < graphSize; j++) { Console.Write("{0}\t", adjMatrix[i,j]); } Console.WriteLine(); Console.WriteLine(); } Console.WriteLine("Read the graph from left to right"); Console.WriteLine("Example: Node A has an edge to Node C with distance: {0}", length(1,3)); } private void DisplayNodeData(int v)//displays data/description for a node { Console.WriteLine(vertices[v].data); } }

**My output on an example graph:**

Pingback: Graphs: Depth First Traversal (C#) | Bits and Pieces of Code

Pingback: Priority Queue Tutorial (C#, C++, Java) | Bits and Pieces of Code