Networks Resilience
Recent collaborations with my colleagues at Beijer Institute, Princeton and Posdam Institute have constantly bring me back to the topic of how is resilience understood from a network perspective. While resilience scholars constantly remind us that network structure are a key feature of resilient systems (e.g. modularity) [e.g. Biggs]; very little is know on what is the functional form between these qualitative claims and the actual resilience of systems. For example, if I give you a dataset consisting of 100 cities with a plethora of networks metrics already measured for you, could you figure out which city is more resilient? Would you be able to rank them? Or reach out to a particular set of majors and warn them that their cities are losing resilience? Would the resilience of say the traffic system be correlated with the resilience of the sewage system, or other important urban functionalities we all enjoy?
While most of the social-ecological resilience literature remains highly conceptual and therefore frustratingly inconclusive when it comes to answer applied questions like the ones above; network scientist have done some progress on approximating network resilience their way. That being said, the two communities do not necessarily agree on what “network structure” or “resilience” means. Understanding the different nuances helps shed light on what is know and where are the gaps. That is the purpose of this mini-review.
Definitions
A network is a representation of a system that maps interactions (links, edges) of the system elements (nodes, vertex). They are fancy words to mean points and lines, they can be anything you like: airports and air travel, people and friendship, banks and loans, landscapes and movement of people or animals. This simplification will enable answering some research questions, but be limiting for others. Resilience is the ability of a system to maintain its function and structure despite disturbance. You can make it more complicated than that, but it is often the case that the complications depend on contextual details of the system under study, not really the definition of the concept. So in the case of an air travel network, the function is transportation; and you can have function (and resilience) of the elements (airplanes) as well as the system as a whole (air traffic).
Resilience of what to what?
That is the crux of the problem, right? When looking at a system from a network perspective, you can break the question as resilience of some node attribute, link attribute, or some emergent property of the network as a whole. It gets even more complicated when you decide to include the arrow of time and allow dynamics on your network. Then all of the sudden resilience could be influenced by the rate of change of some node or link attribute (e.g. traffic jams in our city example, or a change in predation rate between two species in a food web), that in the static case you would have not been able to detect. Below I review some of these cases and point to references you might find useful.
Percolation
A percolation is a classic example of a phase transition (aka. critical transitions or regime shifts); where a networked system can be characterised either by a giant component — a large cluster that comprises most of the network—, or a large set of isolated nodes and small groups. Early work on network resilience focused on perturbation experiments (node deletion) to produce percolations, assuming that once the system has fallen apart some key function or feature is lost. For example, once a critical number of server nodes are down, the internet collapses. The question is, what is the minimum number of servers that can be down (or proportion of nodes) without collapsing the internet? Albert et al (2000) investigated exactly that question by performing node deletion experiments in large networks with different architectures or structures. By structures, they mean here different degree distributions (a statistical measure related to the number of links per node), such as scale-free networks, small-world networks, random networks and others.
They found that scale-free networks are more resilient to random failures but not to targeted attacks (attacking the hubs or nodes with a lot of connections). Under random failures, scale-free networks continue working as usual. Networks with exponential degree distributions, on the other hand, do not exhibit resilience to random attacks.
Cascades
On a similar vein, you might think of the case when the disturbance is not node or link deletion, but instead is failing at doing something the node was expected to do. That failure in one node, or small locality of the network, might or might not spread to the network as a whole. That is the case of cascading failures.
Lee et al (2011) studied that problem on the case of spreading economic crises, by looking how an economic crisis of a country could induce a crisis on a trading partner. They measure the importance of a node for the resilience of the system as the avalanche size and avalanche duration (or how far a cascade goes, how long it last) once it is perturbed. The found that an economic crisis in countries like the USA, Germany or Japan can induce a global recession, while a crisis in a country like Colombia or Venezuela is locally contained.
Another example that builds on the concept of cascades is the work on power grids, where electricity is flowing through the system, a local failure could be compensated by neighbouring stations, but large failures can scale up to become large black outs. Charlie Brummit and collaborators (2012) showed that when several power grids are interconnected, some connectivity is good because it avoid larger cascading failures, but too much connectivity actually amplifies them. Their results goes back to the intuition that modularity provides the system with resilience, but they go a step further in measuring what is the optimal level of connectivity that reduces failure. Brummit has also expanded his work to coupled systems with critical transitions on them (2015), and models of cascading failure on global supply chain dynamics and poverty traps (2017).
Cascades have also been studied on the context of simple coupled dynamical systems. They are toy models with differential equations that can exhibit bifurcations (regime shift like dynamics). Now this bifurcations are on the nodes (not like percolation which was a feature of the full network), and the authors are typically interested on how tipping in one element can cascade to another. Brummit (2015) found that when systems are connected the tipping of one can affect another tipping, but he also found that in a chain of 3 systems such as \(X \rightarrow Y \rightarrow Z\), the tipping of the first can hop the second and tip the third. Dekker et al (2018) used a similar approach to see if two climate systems with tipping points (e.g. ENSO and AMOC) obeying different types of bifurcations (e.g. Fold, Hopf) could predict each other when coupled through critical slowing down early warning signals. The found that the ability to predict coupling between system is rather limited.
My colleagues at PIK have also pursue some work with connected toy models to investigate tipping cascades. They find for example that some network motifs (geometrical shapes) can destabilise networks and large average clustering make them vulnerable (Krönke et al 2019). When looking in more detail of the kind of motifs that makes a network unstable, they found that feedforward loops decrease the coupling strength necessary to initiate a cascade, a result that they apply on the moisture recycling network of the Amazon rainforest (Wunderling et al 2020a), or some climate tipping elements (Wunderling 2020b).
The examples above assume that i) you have small number of systems, ii) you know the combinatorics a priori — you know which systems are connected and which ones not, and iii) you know the mechanism through which the coupling emerge (e.g. temperature or rain). Relaxing the three assumptions above leaves you with a problem of how to discover which systems can be interconnected. Structurally discovering mechanisms of cascades is what our paper advanced (Rocha et al. 2018) by developing simple algorithms that takes the causal structure of two systems —interpreted as networks— and can tell whether they could be connected by sharing drivers, one way connections, or two way interactions. For ecological regime shifts, we found that climate is a typical connector, but even if the climate crises were all of the sudden off, ecosystem could connect through other mechanisms that include water and nutrient transport, food production and biomass appropriation, or fires and fragmentation.
Flows
This is a stream of the literature that I have not followed so closely, but comprises problems where the function of the system of interest is captured by the ability of links to provide some flow. Examples include traffic jams on a city, or money flow between financial institutions, or pollination in an agricultural landscape. Traffic jams
Synchronization
Dynamic networks
This section is dedicated to my colleague Anne-Sophie Crépin, and digs a bit deeper on what happens when you add the arrow of time.
Some of the literature cited already approaches system whose relevant variables (nodes, links) change over time obeying a set of rules or equations. These are dynamical systems. But the references so far deal with small number of systems, the dynamics are embedded on what happens to say nodes (and their time series) but links don’t change over time, at least not in magnitude, direction, or disappearance given node dynamics. This section cover some of the more dynamically oriented cases.
In a non-networked system, one typically studies the stability (and resilience) of the system by means of linear stability analysis around fix points (see e.g. Strogatz 1994). One assumes very small perturbations around the vicinity of fixed points (equilibria) and if the eigenvalue is \(\lambda < 0\) the system will be attracted to the fix point, if \(\lambda >0\) it will be repelled. When you have more than one dimension (more than one state variable of interest), determining the stability properties becomes a bit more interesting, because the eigenvalues do not have to be real, they can be imaginary or complex, giving rise to more interesting dynamics such as growing or damping oscillations, or limit cycles.
But in networked system, the stability properties of the system do not depend only on the dynamic of its elements (nodes). The stability of the system will depend on the intrinsic dynamics of the system (\(f(x)\)), the coupling dynamics between nodes (link dynamics or \(g(x_i, x_j)\)), and the structure of the system (often denoted as the adjacency matrix \(A_{ij}\)).
Undirected networks:
Mark Newman in his Networks Introduction book (Ch 18) derives the master stability equations, and some special cases, for undirected networks (when \(A_{ij}\) is symmetric). Sparing you the math derivations, he reaches the following conclusions:
- For the non-symmetric case (meaning each node can have different fixed points): the stability depends on the partial derivatives of the change of the node dynamics as response to itself, the change in the link dynamics as response to the origin node, and the link dynamics as response to the receiving node. These derivatives can be added into a matrix \(M_{ij}\) (in his notation) with eigenvalues \(\mu_r\) and eigenvector \(v_r\). If all real parts of \(mu_i\) are positive, the fixed point will be repelling, and attracting if all real parts are negative. If mixed, it will be a saddle point (stable in some directions while unstable in others).
- For the case where the fixed point is symmetric (\(x_i = x \forall i\)), either all nodes have the same number of connections, or the link dynamics cancel out. Thus the position of the fix points are independent of network structure. Newman further derives the following stability condition when the link function only depends on one node \(g(x_j)\): \(1/kn > - \gamma/ \alpha < 1/k1\) which can be rewritten as \(1/kn < - \frac{dg/dx}{df/dx} < 1/k1\)
What does it mean? The middle term is the ratio of the partial derivatives for the link dynamics with respect to the node (\(dg/dx\)) and the node dynamics with respect to itself (\(df / dx\)). The upper and lower bounds are the reciprocals of the largest (\(k_n\)) and smaller (\(k_1\)) eigenvalues of the adjacency matrix \(A_{ij}\). If the inequality is satisfied.
- For the case when the link dynamics depends on two arguments like \(g(x_i,x_j) = g(x_i) - g(x_j)\), the conditions simplifies to \(1/ \lambda_n > - \frac{dg/dx}{df/dx} | x= x^*\). Where \(\lambda_n\) is the largest eigenvalue of the graph laplacian \(L\).
The results generalise for undirected networks where multiple dynamics occur in each node (for example a classic SIR epidemic model). The take home message is that a master stability condition separates network structure from dynamics. The eigenvalues are properties solely of the structure (the adjacency matrix), while their limits (min and max eigenvalue) are properties of the dynamics. Depending of the case, one would use the adjacency matrix or laplacian.
Directed networks:
In the directed case, complications arise from the fact that the Laplacian matrix is not symmetrical. The results based on the Laplacian matrix of undirected networks do not generalise to the directed case, although some candidates have been suggested (Newman 2010). For example, Asllani et al (2014) studied the case of balanced directed networks (the outgoing connectivity equals the incoming one). For their linear stability analysis, one restriction was that the eigenvectors should be linearly independent. They conclude that instabilities arise in homogeneous directed networks when the \(|(tr J_\alpha)_{Re}| \leq \gamma\), or the absolute value of the real part of the trace of the Jacobbian is less than gamma. Where:
\(\gamma = \sqrt{\frac{A + \sqrt{A^2 + B^2}}{2}}\)
and
\(A = [(trJ_\alpha)_{Re}]^2- [(trJ_\alpha)_{Im}]^2 - 4(detJ_\alpha)_{Re}\)
Stepping back from the equation, their central result is that instabilities arise in directed networks not only from the dynamics of the nodes. The topology of the space where these interactions are embedded plays an important role and can be responsible for the onset of dynamic instabilities.
Asllani and collaborators further explored the case of directed non-normal networks (Asllani and Carletti 2018; Asllani et al 2018). Non-normal means that a suitable orthogonal basis of eigenvectors does not exist. A matrix is non-normal when \(AA^T \neq A^TA\), it is often the case when the network is directed (asymmetrical), has low reciprocity, lack cycles (feedbacks), and has a hierarchical organisation (like food webs). They found that sparse random networks whose nodes fall into a directed chainlike structure have a good degree of non-normality. The non-normality of the network increase the return time to equilibrium after perturbation, thus affecting the system’s resilience. They further proposed different measures of non-normality including the spectral abscissa (for long term behaviour), the numerical abscissa (short term behaviour), the Hermitian part of the matrix \(M\) that defines the system of equations, the pseudo spectrum, the Kreiss constant (or transient amplification), and the Henrici’s departure of normality (Asllani et al 2018).
So what is missing? directed graphs with feedbacks or the typical causal networks that underlie regime shift dynamics.