<![CDATA[Mostly Harmless Code]]>https://code.apiad.nethttps://cdn.hashnode.com/res/hashnode/image/upload/v1709738809290/apmaCWBYe.jpegMostly Harmless Codehttps://code.apiad.netRSS for NodeSun, 08 Sep 2024 19:11:02 GMT60<![CDATA[The Hitchhiker's Guide to Graphs - Ep 00]]>https://code.apiad.net/hitchhikers-graphs-00https://code.apiad.net/hitchhikers-graphs-00Wed, 06 Mar 2024 15:04:17 GMT<![CDATA[<p>What do the World Wide Web, your brain, the corpus of scientific knowledge accumulated by all of humanity, the entire list of people you've ever met, and the city you live in have in common?</p><p>These are all very different types of things, from physical to virtual to social, but they share an essential trait. They are all <em>networks</em> that establish <em>relationships</em> between some <em>entities</em>.</p><p>The World Wide Web is a network of interconnected computational resources, data, software, and hardware infrastructure. Your brain is a network of interconnected neurons. The accumulated human knowledge is also a network of interconnected ideas, as all discoveries depend upon prior knowledge and unlock potential discoveries. Your city is also an interconnected network of roads and buildings. And the people you know also form a network, as many of them know each other, or know someone who knows someone you know.</p><p>As distinct as these things are, they all share common properties in their networked nature. For example, you can think of how close two entities in this network are. 
The meaning of distance will be different if you're considering physical networks like roads versus information networks like published papers with citations to other papers versus social networks like your Facebook friends, but in all cases, there is some sense in which some entities are closer together than others.</p><p>What if we could study this abstract notion of <em>networks of interconnected elements</em>, and understand the fundamental properties of all sorts of networks all at once? Welcome to graph theory!</p><blockquote><p>This article is an excerpt from my upcoming book on graphs, <em>The Hitchhiker's Guide to Graphs</em>. The book is available as an early release for premium supporters of my main blog, <a target="_blank" href="https://blog.apiad.net">Mostly Harmless Ideas</a>. You can also directly buy the <a target="_blank" href="https://store.apiad.net/l/graphs">early access pass</a> and get the current draft and all future updates forever.</p></blockquote><p>This article introduces graph theory and its applications. It explains what a graph is and how to implement a computational representation suitable for most applications.</p><p>In future articles, we will explore specific algorithms for many practical problems, from pathfinding to game playing to social network analysis. If you are interested in the underlying theory, you can check out my series on graph theory, <em>The Computer Scientist's Guide to Graph Theory</em>, on <a target="_blank" href="https://thepalindrome.org/t/alejandros-lectures">The Palindrome</a>.</p><h2 id="heading-what-is-a-graph">What is a graph?</h2><p>Intuitively, a graph is just a (finite) collection of elements (which we will often call <em>vertices</em>, although in many places you'll see them called <em>nodes</em> as well) connected to each other via <em>edges</em>. 
Thus, a graph represents an abstract <em>relation space</em>, in which the edges define who's related to whom, whatever the nature of that relation is.</p><p>A graph is formally defined as an object composed of two sets: one set of elements we call vertices and another set of elements we call edges, with the detail that edges are nothing but sets of two vertices. Thus, in principle, there is nothing about an edge that matters beyond which two vertices it connects (we will see this isn't exactly the case when we examine some special types of graphs).</p><p>Graphs are abstract objects, which means they don't have an intrinsic visualization. However, we can visualize graphs by drawing dots for vertices and lines for edges. An example of a simple graph is shown below:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735655034/e7273600-3e20-4adf-bea6-95c7beb10776.png" alt class="image--center mx-auto" /></p><p>This graph is composed of the vertices <code>a,b,c,d,e</code> and the edges <code>ab, ae, bc, bd, cd, ce, de</code>. Of course, there is nothing intrinsic to the names or the exact location of the vertices in the drawing. Except in very concrete cases, such as when a graph represents a geometrical or geographical object, the layout of a graph is arbitrary, and thus the same graph can be drawn in an infinite number of ways.</p><p>The most important characteristic of a vertex is its <strong>degree</strong>, which counts the edges that connect this vertex to other vertices in the graph. Thus, in the previous example, the degree of vertex <code>a</code> is 2, because only edges <code>ab</code> and <code>ae</code> connect <code>a</code> with another vertex. In contrast, each of the remaining vertices has degree 3. We name the set of vertices adjacent to an arbitrary vertex <code>v</code> its <em>neighborhood</em>. 
Thus, the neighborhood of vertex <code>c</code> is the set of vertices <code>{b,d,e}</code>.</p><p>It should now be self-evident that the degree of a vertex is the size of its neighborhood.</p><h2 id="heading-programming-with-graphs">Programming with graphs</h2><p>Computationally speaking, you can think of a graph as an abstract class (or an interface in languages that support that notion) that provides two key operations: listing all nodes, and determining whether two nodes are adjacent.</p><p>In Python, we can achieve this with an abstract class (using the <code>abc</code> module). Since nodes can be literally anything, from numbers to people to diseases, we use a generic type.</p><pre><code class="lang-python">from abc import ABC, abstractmethod
from typing import Generic, Iterator, TypeVar

T = TypeVar("T")  # nodes can be of any type


class Graph(Generic[T], ABC):
    @abstractmethod
    def nodes(self) -> Iterator[T]:
        pass

    @abstractmethod
    def adjacent(self, x: T, y: T) -> bool:
        pass

    # ... rest of class Graph
</code></pre><blockquote><p>See on GitHub: <a target="_blank" href="https://github.com/apiad/graphs/blob/main/graphs/core.py#L9-L18">https://github.com/apiad/graphs/blob/main/graphs/core.py#L9-L18</a></p></blockquote><p>You may be wondering, what if we want to modify the graph? 
While that makes total sense in some applications, we want to use this graph abstraction as flexibly as possible (e.g., as a read-only interface to some external resource, such as the graph of your Twitter followers), so we don't want to constrain ourselves to graphs that are stored in local memory or that can be modified. In any case, specific local implementations of this interface will certainly have methods to add or remove nodes or edges.</p><p>Just from the previous definition, we can already start to implement general methods on graphs, whatever their underlying implementation. For example, we can already compute the neighborhood of any given vertex, albeit with an extremely slow procedure:</p><pre><code class="lang-python"># class Graph(...)
    # ...

    def neighborhood(self, x: T):
        # Generic fallback: test x against every node in the graph.
        for y in self.nodes():
            if self.adjacent(x, y):
                yield y

    def degree(self, x: T) -> int:
        return len(list(self.neighborhood(x)))

    # ...
</code></pre><blockquote><p>See on GitHub: <a target="_blank" href="https://github.com/apiad/graphs/blob/main/graphs/core.py#L21-L32">https://github.com/apiad/graphs/blob/main/graphs/core.py#L21-L32</a></p></blockquote><p>This is the worst way to compute neighborhoods, since it tests every node in the graph, but it works. 
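</p><p>To see these generic methods in action, here is a minimal, self-contained sketch. It repeats a condensed copy of the <code>Graph</code> class above and adds a small read-only subclass, <code>EdgeSetGraph</code>, which is a hypothetical name used only for illustration and is not part of the library. It encodes the example graph from the previous section and relies entirely on the inherited <code>neighborhood</code> and <code>degree</code>.</p>

```python
from abc import ABC, abstractmethod
from typing import Generic, Iterator, TypeVar

T = TypeVar("T")


class Graph(Generic[T], ABC):
    """Condensed copy of the abstract class shown above."""

    @abstractmethod
    def nodes(self) -> Iterator[T]:
        pass

    @abstractmethod
    def adjacent(self, x: T, y: T) -> bool:
        pass

    def neighborhood(self, x: T) -> Iterator[T]:
        # Generic fallback: test x against every node in the graph.
        for y in self.nodes():
            if self.adjacent(x, y):
                yield y

    def degree(self, x: T) -> int:
        return len(list(self.neighborhood(x)))


class EdgeSetGraph(Graph[str]):
    """Hypothetical read-only graph backed by a fixed set of edges."""

    def __init__(self, nodes: str, edges: list[str]) -> None:
        self._nodes = list(nodes)
        # Each undirected edge is stored as a frozenset of its two endpoints.
        self._edges = {frozenset(e) for e in edges}

    def nodes(self) -> Iterator[str]:
        return iter(self._nodes)

    def adjacent(self, x: str, y: str) -> bool:
        return frozenset((x, y)) in self._edges


# The example graph: vertices a..e and the seven edges listed earlier.
g = EdgeSetGraph("abcde", ["ab", "ae", "bc", "bd", "cd", "ce", "de"])
print(sorted(g.neighborhood("c")))  # ['b', 'd', 'e']
print(g.degree("a"))                # 2
```

<p>The subclass only answers the two abstract questions, yet neighborhoods and degrees come for free, at the price of one adjacency test per node in the graph.</p><p>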
In cases where we have nothing better, this method will do, but some specific representations of graphs we will see shortly can override these methods and provide more efficient implementations.</p><h2 id="heading-computational-representations-of-graphs">Computational representations of graphs</h2><p>There are several computational representations of graphs, each with its own advantages and limitations.</p><p>The most straightforward representation is called the <em>adjacency list</em> method, which references all neighbors of a given node in a structure associated with that node, such as an array. In Python, we can store a dictionary of nodes mapping to a set of their adjacent nodes. We use a set so that we can answer as quickly as possible whether two nodes are adjacent.</p><pre><code class="lang-python">class AdjGraph(Graph[T]):
    def __init__(self, *nodes, directed=False) -> None:
        super().__init__()
        # Map each node to the set of its adjacent nodes.
        self._links = {n: set() for n in nodes}
        self._directed = directed

    @property
    def directed(self):
        return self._directed

    def nodes(self) -> Iterator[T]:
        return iter(self._links)

    def adjacent(self, x: T, y: T) -> bool:
        return y in self._links[x]

    def neighborhood(self, x: T):
        return iter(self._links[x])

    def degree(self, x: T) -> int:
        return len(self._links[x])

    # ... rest of AdjGraph
</code></pre><blockquote><p>See on GitHub: <a target="_blank" href="https://github.com/apiad/graphs/blob/main/graphs/core.py#L79-L101">https://github.com/apiad/graphs/blob/main/graphs/core.py#L79-L101</a></p></blockquote><p>Note that this implementation allows computing the neighborhood much more directly. 
It also allows us to dynamically modify the graph by adding vertices and edges.</p><pre><code class="lang-python"># class AdjGraph(...)
    # ...

    def add(self, *nodes: T):
        for n in nodes:
            # Skip nodes that are already present instead of bailing out.
            if n in self._links:
                continue

            self._links[n] = set()

        return self

    def link(self, x: T, y: T):
        if x == y:
            raise ValueError("Self-links not allowed.")

        self.add(x)
        self.add(y)
        self._links[x].add(y)

        if not self._directed:
            self._links[y].add(x)

        return self

    # ...
</code></pre><blockquote><p>See on GitHub: <a target="_blank" href="https://github.com/apiad/graphs/blob/main/graphs/core.py#L104-L129">https://github.com/apiad/graphs/blob/main/graphs/core.py#L104-L129</a></p></blockquote><p>Notice that we are quite flexible in our modification methods, i.e., we don't complain if a vertex is already added. We take care of adding new vertices in <code>link</code> if necessary. 
This makes it way easier to use our implementation to dynamically construct a graph without the hassle of verifying that we aren't adding duplicates, at the cost of a minimal performance overhead.</p><h3 id="heading-a-bit-of-syntax-sugar">A bit of syntax sugar</h3><p>If you noticed that <code>return self</code> at the end of the <code>link</code> method, you may be wondering why that line is there. The reason is so we can chain successive calls to <code>link</code> to create custom graphs quickly. For example:</p><pre><code class="lang-python">g = AdjGraph(<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>).link(<span class="hljs-number">2</span>,<span class="hljs-number">3</span>).link(<span class="hljs-number">1</span>,<span class="hljs-number">4</span>)</code></pre><p>This pattern is often called a fluent interface and is common in object-oriented design. It is not strictly necessary here, but it is a nice bit of syntactic sugar, and we can treat ourselves sometimes, right?</p><p>There are a few other similar methods in <code>AdjGraph</code> that let you quickly build a graph, adding paths, cycles, and other common structures with a single function call and using method chaining to combine multiple operations in a single line. We will use and explain them when we need them.</p><hr /><p>Another commonly used representation is the <em>adjacency matrix</em> method, in which we store a separate structure (like a two-dimensional array) that explicitly marks which pairs of nodes are related. The main advantage of this representation is that there is a single place to query or modify adjacency information. 
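</p><p>As a rough sketch of that idea, the following class stores adjacency in a plain list of lists. The name <code>MatrixGraph</code> and the restriction to integer nodes <code>0..n-1</code> are assumptions made for illustration only; this is not an implementation taken from the book's library.</p>

```python
class MatrixGraph:
    """Hypothetical adjacency-matrix graph over the nodes 0..n-1."""

    def __init__(self, n: int) -> None:
        # n x n table of booleans; cell [x][y] is True when x and y are adjacent.
        self._matrix = [[False] * n for _ in range(n)]

    def nodes(self):
        return iter(range(len(self._matrix)))

    def adjacent(self, x: int, y: int) -> bool:
        # A single lookup in a single structure.
        return self._matrix[x][y]

    def link(self, x: int, y: int) -> "MatrixGraph":
        if x == y:
            raise ValueError("Self-links not allowed.")
        # Undirected edge: mark both symmetric cells.
        self._matrix[x][y] = True
        self._matrix[y][x] = True
        return self

    def neighborhood(self, x: int):
        # The whole row must be scanned, even if x has few neighbors.
        return (y for y, linked in enumerate(self._matrix[x]) if linked)


m = MatrixGraph(4).link(0, 1).link(1, 2)
print(m.adjacent(0, 1), m.adjacent(0, 2))  # True False
print(list(m.neighborhood(1)))             # [0, 2]
```

<p>In this sketch, adjacency queries and updates touch one table, while listing the neighbors of a node walks an entire row of <code>n</code> cells.</p><p>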
The main disadvantages are the extra wasted space, since the matrix holds an entry for every pair of nodes (unless you use a sparse matrix), and the added complexity in computing the neighborhood of a given node (unless you use an extra adjacency list for that, which nullifies the main advantage).</p><p>For those reasons, we don't gain much with adjacency matrices, and thus, we will primarily use the <code>AdjGraph</code> implementation throughout the book. It provides a nice balance between flexibility and performance, although it is neither the most flexible nor the most performant implementation possible. When we need to, we will devise other, more appropriate implementations.</p><h2 id="heading-common-graphs">Common graphs</h2><p>Throughout this series, we will refer to several common graphs by name. These graphs appear repeatedly in proofs and examples, so it pays to enumerate them briefly here.</p><p><strong>The complete graph</strong> <code>K_n</code> is the graph of <code>n</code> vertices and all possible edges. It is, by definition, the densest graph one can have with <code>n</code> vertices. Here is one example:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735717047/ad1ca7bb-6aec-4f0c-9322-487a34b93818.png" alt class="image--center mx-auto" /></p><p><strong>The path graph</strong> <code>P_n</code> is a graph composed of <code>n</code> vertices stitched together in sequence, hence it's a path. (We will formalize this concept in the next chapter.) Here is one example:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735736090/a6447129-52a0-4de8-b7e9-3f2091459b49.png" alt class="image--center mx-auto" /></p><p><strong>The cycle graph</strong> <code>C_n</code> is a closed loop of <code>n</code> vertices. So, just like a path, but the first and last vertices are also adjacent. 
Here is one example:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735759151/4725eb27-076c-4297-abe4-8af69eefb77b.png" alt class="image--center mx-auto" /></p><p><strong>The random uniform graph</strong> <code>U(n,p)</code> is a graph of <code>n</code> vertices, where the edge between each pair of vertices exists with probability <code>p ∈ [0,1]</code>. It is the simplest random graph one can conceive. Here is one example:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735793713/8350927a-4ef2-4572-8238-35c787c95eb6.png" alt class="image--center mx-auto" /></p><h2 id="heading-other-types-of-graphs">Other types of graphs</h2><p>So far, we've been talking about <em>undirected and unweighted graphs</em>, so called because each edge has no specific direction and no associated cost.</p><p>Thus, we can say that either the edge <code>ab</code> or the edge <code>ba</code> is in the graph, because both are the same edge, and it would be redundant and incorrect to mention them both. And every edge is equally important.</p><p>However, in some applications, we will need a bit more information. Two specific types of graphs that appear repeatedly are <em>directed</em> and <em>weighted</em> graphs, sometimes both in the same problem.</p><h3 id="heading-directed-graphs">Directed graphs</h3><p>In some applications, it is interesting to give a direction to edges and consider that <code>ab</code> is different from <code>ba</code>. For example, in modeling transportation networks, sometimes you have single-direction routes. 
These are called <strong>directed graphs</strong>, and although they are essential in practical applications, the fundamental theory is almost identical to undirected graphs, so we will only mention them when there's some relevant difference.</p><p>Here is an example of a directed graph:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735813446/0918256f-b379-4a19-a7c9-701b00456c1b.png" alt class="image--center mx-auto" /></p><h3 id="heading-weighted-graphs">Weighted graphs</h3><p>In planning and routing, edges often represent roads or general connections that involve some cost: either a distance, a price, or any other similar notion of cost. In these cases, we use weighted graphs, where each edge has an associated number called a weight. We can ask questions like what is the optimal path between two nodes (the one for which the sum of the weights of the edges involved is smallest).</p><p>We will see more weighted graphs very soon.</p><h2 id="heading-moving-on">Moving on</h2><p>Now that we have laid out the foundational theory and basic implementation, we are ready to move on to specific types of graphs and some concrete problems. In upcoming articles, we will examine specific problems, learn the necessary theory, and design clever algorithms to solve them.</p>]]><![CDATA[<p>What do the World Wide Web, your brain, the corpus of scientific knowledge accumulated by all of humanity, the entire list of people you've ever met, and the city you live in have in common?</p><p>These are all very different types of things, from physical to virtual to social, but they share an essential trait. They are all <em>networks</em> that establish <em>relationships</em> between some <em>entities</em>.</p><p>The World Wide Web is a network of interconnected computational resources, data, software, and hardware infrastructure. Your brain is a network of interconnected neurons. 
The accumulated human knowledge is also a network of interconnected ideas, as all discoveries depend upon prior knowledge and unlock potential discoveries. Your city is also an interconnected network of roads and buildings. And the people you know also form a network, as many of them know each other, or know someone who knows someone you know.</p><p>As distinct as these things are, they all share common properties in their networked nature. For example, you can think of how close two entities in this network are. The meaning of distance will be different if you're considering physical networks like roads versus information networks like published papers with citations to other papers versus social networks like your Facebook friends, but in all cases, there is some sense in which some entities are closer together than others.</p><p>What if we could study this abstract notion of <em>networks of interconnected elements</em>, and understand the fundamental properties of all sorts of networks all at once? Welcome to graph theory!</p><blockquote><p>This article is an excerpt from my upcoming book on graphs, <em>The Hitchhiker's Guide to Graphs</em>. The book is available as an early release for premium supporters of my main blog, <a target="_blank" href="https://blog.apiad.net">Mostly Harmless Ideas</a>. You can also directly buy the <a target="_blank" href="https://store.apiad.net/l/graphs">early access pass</a> and get the current draft and all future updates forever.</p></blockquote><p>This article introduces graph theory and its applications. It explains what a graph is and how to implement a computational representation suitable for most applications.</p><p>In future articles, we will explore specific algorithms for many practical problems, from pathfinding to game playing to social network analysis. 
If you are interested in the underlying theory, you can check out my series on graph theory, <em>The Computer Scientist's Guide to Graph Theory</em>, on <a target="_blank" href="https://thepalindrome.org/t/alejandros-lectures">The Palindrome</a>.</p><h2 id="heading-what-is-a-graph">What is a graph?</h2><p>Intuitively, a graph is just a (finite) collection of elements (which we will often call <em>vertices</em>, although in many places you'll see them called <em>nodes</em> as well) connected to each other via <em>edges</em>. Thus, a graph represents an abstract <em>relation space</em>, in which the edges define who's related to whom, whatever the nature of that relation is.</p><p>A graph is formally defined as an object composed of two sets: one set of elements we call vertices and another set of elements we call edges, with the detail that edges are nothing but sets of two vertices. Thus, in principle, there is nothing about an edge that matters beyond which two vertices it connects (we will see this isn't exactly the case when we examine some special types of graphs).</p><p>Graphs are abstract objects, which means they don't have an intrinsic visualization. However, we can visualize graphs by drawing dots for vertices and lines for edges. An example of a simple graph is shown below:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735655034/e7273600-3e20-4adf-bea6-95c7beb10776.png" alt class="image--center mx-auto" /></p><p>This graph is composed of the vertices <code>a,b,c,d,e</code> and the edges <code>ab, ae, bc, bd, cd, ce, de</code>. Of course, there is nothing intrinsic to the names or the exact location of the vertices in the drawing. 
Except in very concrete cases, such as when a graph represents a geometrical or geographical object, the layout of a graph is arbitrary, and thus the same graph can be drawn in an infinite number of ways.</p><p>The most important characteristic of a vertex is its <strong>degree</strong>, which counts the edges that connect this vertex to other vertices in the graph. Thus, in the previous example, the degree of vertex <code>a</code> is 2, because only edges <code>ab</code> and <code>ae</code> connect <code>a</code> with another vertex. In contrast, each of the remaining vertices has degree 3. We name the set of vertices adjacent to an arbitrary vertex <code>v</code> its <em>neighborhood</em>. Thus, the neighborhood of vertex <code>c</code> is the set of vertices <code>{b,d,e}</code>.</p><p>It should now be self-evident that the degree of a vertex is the size of its neighborhood.</p><h2 id="heading-programming-with-graphs">Programming with graphs</h2><p>Computationally speaking, you can think of a graph as an abstract class (or an interface in languages that support that notion) that provides two key operations: listing all nodes, and determining whether two nodes are adjacent.</p><p>In Python, we can achieve this with an abstract class (using the <code>abc</code> module). 
Since nodes can be literally anything, from numbers to people to diseases, we use a generic type.</p><pre><code class="lang-python">from abc import ABC, abstractmethod
from typing import Generic, Iterator, TypeVar

T = TypeVar("T")  # nodes can be of any type


class Graph(Generic[T], ABC):
    @abstractmethod
    def nodes(self) -> Iterator[T]:
        pass

    @abstractmethod
    def adjacent(self, x: T, y: T) -> bool:
        pass

    # ... rest of class Graph
</code></pre><blockquote><p>See on GitHub: <a target="_blank" href="https://github.com/apiad/graphs/blob/main/graphs/core.py#L9-L18">https://github.com/apiad/graphs/blob/main/graphs/core.py#L9-L18</a></p></blockquote><p>You may be wondering, what if we want to modify the graph? While that makes total sense in some applications, we want to use this graph abstraction as flexibly as possible (e.g., as a read-only interface to some external resource, such as the graph of your Twitter followers), so we don't want to constrain ourselves to graphs that are stored in local memory or that can be modified. In any case, specific local implementations of this interface will certainly have methods to add or remove nodes or edges.</p><p>Just from the previous definition, we can already start to implement general methods on graphs, whatever their underlying implementation. 
For example, we can already compute the neighborhood of any given vertex, albeit with an extremely slow procedure:</p><pre><code class="lang-python"># class Graph(...)
    # ...

    def neighborhood(self, x: T):
        # Generic fallback: test x against every node in the graph.
        for y in self.nodes():
            if self.adjacent(x, y):
                yield y

    def degree(self, x: T) -> int:
        return len(list(self.neighborhood(x)))

    # ...
</code></pre><blockquote><p>See on GitHub: <a target="_blank" href="https://github.com/apiad/graphs/blob/main/graphs/core.py#L21-L32">https://github.com/apiad/graphs/blob/main/graphs/core.py#L21-L32</a></p></blockquote><p>This is the worst way to compute neighborhoods, since it tests every node in the graph, but it works. In cases where we have nothing better, this method will do, but some specific representations of graphs we will see shortly can override these methods and provide more efficient implementations.</p><h2 id="heading-computational-representations-of-graphs">Computational representations of graphs</h2><p>There are several computational representations of graphs, each with its own advantages and limitations.</p><p>The most straightforward representation is called the <em>adjacency list</em> method, which references all neighbors of a given node in a structure associated with that node, such as an array. In Python, we can store a dictionary of nodes mapping to a set of their adjacent nodes. 
We use a set so that we can answer as quickly as possible whether two nodes are adjacent.</p><pre><code class="lang-python">class AdjGraph(Graph[T]):
    def __init__(self, *nodes, directed=False) -> None:
        super().__init__()
        # Map each node to the set of its adjacent nodes.
        self._links = {n: set() for n in nodes}
        self._directed = directed

    @property
    def directed(self):
        return self._directed

    def nodes(self) -> Iterator[T]:
        return iter(self._links)

    def adjacent(self, x: T, y: T) -> bool:
        return y in self._links[x]

    def neighborhood(self, x: T):
        return iter(self._links[x])

    def degree(self, x: T) -> int:
        return len(self._links[x])

    # ... rest of AdjGraph
</code></pre><blockquote><p>See on GitHub: <a target="_blank" href="https://github.com/apiad/graphs/blob/main/graphs/core.py#L79-L101">https://github.com/apiad/graphs/blob/main/graphs/core.py#L79-L101</a></p></blockquote><p>Note that this implementation allows computing the neighborhood much more directly. It also allows us to dynamically modify the graph by adding vertices and edges.</p><pre><code class="lang-python"># class AdjGraph(...)
    # ...

    def add(self, *nodes: T):
        for n in nodes:
            # Skip nodes that are already present instead of bailing out.
            if n in self._links:
                continue

            self._links[n] = set()

        return self

    def link(self, x: T, y: T):
        if x == y:
            raise ValueError("Self-links not allowed.")

        self.add(x)
        self.add(y)
        self._links[x].add(y)

        if not self._directed:
            self._links[y].add(x)

        return self

    # ...
</code></pre><blockquote><p>See on GitHub: <a target="_blank" href="https://github.com/apiad/graphs/blob/main/graphs/core.py#L104-L129">https://github.com/apiad/graphs/blob/main/graphs/core.py#L104-L129</a></p></blockquote><p>Notice that we are quite flexible in our 
modification methods, i.e., we don't complain if a vertex is already added. We also take care of adding new vertices in <code>link</code> if necessary. This makes it much easier to use our implementation to construct a graph dynamically, without the hassle of verifying that we aren't adding duplicates, at a minimal cost in performance.</p><h3 id="heading-a-bit-of-syntax-sugar">A bit of syntax sugar</h3><p>If you noticed that <code>return self</code> at the end of the <code>link</code> method, you may be wondering why that line is there. The reason is so we can chain successive calls to <code>link</code> to create custom graphs quickly. For example:</p><pre><code class="lang-python">g = AdjGraph(<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>).link(<span class="hljs-number">2</span>,<span class="hljs-number">3</span>).link(<span class="hljs-number">1</span>,<span class="hljs-number">4</span>)</code></pre><p>This pattern is often called a fluent interface and is common in object-oriented design. It is not strictly necessary here, but it is a nice bit of syntactic sugar, and we can treat ourselves sometimes, right?</p><p>There are a few other similar methods in <code>AdjGraph</code> that let you quickly build a graph, adding paths, cycles, and other common structures with a single function call and using method chaining to combine multiple operations in a single line. We will use and explain them when we need them.</p><hr /><p>Another commonly used representation is the <em>adjacency matrix</em> method, in which we store a separate structure (like a two-dimensional array) that explicitly marks which pairs of nodes are related. The main advantage of this representation is that there is a single place to query or modify for adjacency information. 
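</p><p>To make the idea concrete, here is a minimal sketch of an adjacency matrix for an undirected graph. It is not part of the code that accompanies this series: the class name <code>MatrixGraph</code> and its attributes are hypothetical, chosen only to mirror the <code>AdjGraph</code> methods, and the sketch assumes the set of nodes is fixed at construction time.</p><pre><code class="lang-python">class MatrixGraph:
    """Hypothetical undirected graph stored as a square boolean matrix."""

    def __init__(self, *nodes):
        # Assign each distinct node a row/column index, in insertion order.
        self._index = {n: i for i, n in enumerate(dict.fromkeys(nodes))}
        size = len(self._index)
        # matrix[i][j] is True when the nodes at indices i and j are adjacent.
        self._matrix = [[False] * size for _ in range(size)]

    def link(self, x, y):
        i, j = self._index[x], self._index[y]
        # Undirected: mark the pair in both directions.
        self._matrix[i][j] = True
        self._matrix[j][i] = True
        return self

    def adjacent(self, x, y):
        # Adjacency is a single lookup in the matrix.
        return self._matrix[self._index[x]][self._index[y]]

    def neighborhood(self, x):
        # Finding the neighbors means scanning the whole row of x.
        row = self._matrix[self._index[x]]
        return [n for n, j in self._index.items() if row[j]]


g = MatrixGraph(1, 2, 3, 4).link(2, 3).link(1, 4)
print(g.adjacent(2, 3))
print(g.neighborhood(1))
</code></pre><p>In this sketch, <code>adjacent</code> is a direct lookup, while <code>neighborhood</code> has to walk an entire row regardless of how many neighbors the node actually has.</p><p>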
The main disadvantages are the extra wasted space (unless you use a sparse matrix) and the added complexity in computing the neighborhood of a given node (unless you use an extra adjacency list for that, which nullifies the main advantage).</p><p>For those reasons, we don't gain much with adjacency matrices, and thus, we will primarily use the <code>AdjGraph</code> implementation throughout the book. It provides a nice balance between flexibility and performance, although it is neither the most flexible nor the most performant implementation possible. When we need to, we will devise other, more appropriate implementations.</p><h2 id="heading-common-graphs">Common graphs</h2><p>Throughout this series, we will refer to several common graphs by name. These graphs appear repeatedly in proofs and examples, so it pays to enumerate them briefly here.</p><p><strong>The complete graph</strong> <code>K_n</code> is the graph of <code>n</code> vertices and all possible edges. It is, by definition, the densest graph one can have with <code>n</code> vertices. Here is one example:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735717047/ad1ca7bb-6aec-4f0c-9322-487a34b93818.png" alt class="image--center mx-auto" /></p><p><strong>The path graph</strong> <code>P_n</code> is a graph composed of <code>n</code> vertices stitched together in sequence, hence it's a path. (We will formalize this concept in the next chapter.) Here is one example:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735736090/a6447129-52a0-4de8-b7e9-3f2091459b49.png" alt class="image--center mx-auto" /></p><p><strong>The cycle graph</strong> <code>C_n</code> is a closed loop of <code>n</code> vertices. So, just like a path, but the first and last vertices are also adjacent. 
Here is one example:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735759151/4725eb27-076c-4297-abe4-8af69eefb77b.png" alt class="image--center mx-auto" /></p><p><strong>The random uniform graph</strong> <code>U(n,p)</code> is a graph of <code>n</code> vertices, where the edge between each pair of vertices exists with probability <code>p ∈ [0,1]</code>. It is the simplest random graph one can conceive. Here is one example:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735793713/8350927a-4ef2-4572-8238-35c787c95eb6.png" alt class="image--center mx-auto" /></p><h2 id="heading-other-types-of-graphs">Other types of graphs</h2><p>So far, we've been talking about <em>undirected and unweighted graphs</em>, so called because each edge has no specific direction and no associated cost.</p><p>Thus, we can say that either the edge <code>ab</code> or the edge <code>ba</code> is in the graph, because both are the same edge, and it would be redundant and incorrect to mention them both. Every edge is also equally important.</p><p>However, in some applications, we will need a bit more information. Two specific types of graphs that appear repeatedly are <em>directed</em> and <em>weighted</em> graphs, sometimes both in the same problem.</p><h3 id="heading-directed-graphs">Directed graphs</h3><p>In some applications, it is interesting to give a direction to edges and consider that <code>ab</code> is different from <code>ba</code>. For example, in modeling transportation networks, sometimes you have single-direction routes. 
These are called <strong>directed graphs</strong>, and although they are essential in practical applications, the fundamental theory is almost identical to undirected graphs, so we will only mention them when there's some relevant difference.</p><p>Here is an example of a directed graph:</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709735813446/0918256f-b379-4a19-a7c9-701b00456c1b.png" alt class="image--center mx-auto" /></p><h3 id="heading-weighted-graphs">Weighted graphs</h3><p>In planning and routing, edges often represent roads or general connections that involve some cost, be it a distance, a price, or any other similar notion. In these cases, we use weighted graphs, where each edge has an associated number called a weight. We can then ask questions such as what the optimal path between two nodes is, that is, the path for which the sum of the weights of the edges involved is smallest.</p><p>We will see more weighted graphs very soon.</p><h2 id="heading-moving-on">Moving on</h2><p>Now that we have laid out the foundational theory and basic implementation, we are ready to move on to specific types of graphs and some concrete problems. In upcoming articles, we will examine specific problems, learn the necessary theory, and design clever algorithms to solve them.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1709736111899/55c72a6f-c6f7-47a1-8461-16ce8c0a40cc.jpeg