Hyperharmonic analysis for the study of high-order information-theoretic signals
- Derived From
- Derived On
Network representations often cannot fully account for the structural richness of complex systems spanning multiple levels of organisation. Recently proposed high-order information-theoretic signals are well-suited to capture synergistic phenomena that transcend pairwise interactions; however, the exponential-growth of their cardinality severely hinders their applicability. In this work, we combine methods from harmonic analysis and combinatorial topology to construct efficient representations of high-order information-theoretic signals. The core of our method is the diangonalisation of a discrete version of the Laplace-de Rham operator, that geometrically encodes structural properties of the system. We capitalise these ideas by developing a complete workflow for the construction of hyperharmonic representations of high-order signals, which is applicable to a wide range of scenarios.
Keywords: high-order phenomena, Laplace operator, harmonic analysis, signal processing, information theory
The principle of representing interdependencies as networks has revolutionised complexity science by introducing a systematic approach to gain insight into the inner structure of a wide range of complex systems (Newman, 2018). These networks provided a lingua franca to describe the properties of interdependencies found in chemical, biological, social, and technological systems (Vasiliauskaite and Rosas, 2020), and have enabled great advances in a widening range of areas including computational neuroscience (Rubinov and Sporns, 2010), human evolution (Donges et al., 2011), financial analysis (Bonanno et al., 2004), and epidemic spreading (Pastor-Satorras and Vespignani, 2001), just to name a few. However, by their very nature these methods focus on the analysis of pairwise interactions, and hence are prone to miss important high-order synergistic phenomena that are a hallmark of many complex systems.
This critical limitation of traditional network analyses has been acknowledged by a number of recent research efforts that focus on developing techniques to study high-order interactions (Petri et al., 2014, Iacopini et al., 2019, Battiston et al., 2020). These developments have provided novel techniques capable of, for example, detecting non-local structures (Petri et al., 2013), highlighting the role of inhomogeneities in functional connections (Petri et al., 2014), and characterising discontinuous transitions (Iacopini et al., 2019). However, it is important to notice that many of these approaches are based on hypergraphs and other high-order structures build solely from pairwise statistics, and hence their scope remains limited.
An attractive set of tools to further extend the reach of high-order analyses can be found in a parallel body of work, which originated in efforts to develop tools to capture high-order statistical phenomena related to the brain (Tononi et al., 1994, Schneidman et al., 2003a, Latham and Nirenberg, 2005, Ganmor et al., 2011). Particularly interesting are multivariate extensions of Shannon’s mutual information, including the Interaction Information (McGill, 1954), Total Correlation (Watanabe, 1960), and Dual Total Correlation (Han, 1978), which can be used to gain insights about the high-order structure exhibited by groups of three or more interdependent variables (see e.g. Timme et al. (2014), Baudot et al. (2019)). In this work we focus on the recently proposed O-information (Rosas et al., 2019), which is a principled tool to identify synergy-dominated systems, and has have found to be relevant for analysing various complex systems — including the study of neural spiking data (Stramaglia et al., 2020) and aging in fMRI data (Gatica et al., 2020).
When applied to large systems, metrics such as the O-information are naturally represented as high-order signals whose domain is the set of all hyper-edges on a regular hypergraph. Because the cardinality of these hypegraphs grows super-exponentially with the system size, a key open problem is to find efficient ways to represent the content of these signals. While simple approaches such as computing the average of the signal across dimensions can be effective (Gatica et al., 2020), an important challenge is to find principled ways to generate low-dimensional representations of these signals while preserving their intrinsic relational structure across the hypergraph. A popular technique to study similar issues in the case of traditional (weighted) networks is to transform the signals into a basis of eigenvectors of the graph Laplacian (Belkin and Niyogi, 2003), which has been used with great success in the context of graph signal processing (see e.g. Shuman et al. (2013), Sandryhaila and Moura (2014), Atasoy et al. (2016, 2017), Expert et al. (2017)). Related methods based on harmonic analysis and combinatorial topology, which we refer to as hyperharmonic analysis, have been recently developed to study high-order signals (see e.g. Barbarossa and Sardellitti (2020)); however, they have not yet — to the best of our knowledge — been used to analyse high-order information-theoretic signals. Other aspects of the relationship between high-order information-theoretic quantities and topology has been explored in Baudot and Bennequin (2015), Baudot (2019).
This article establishes a bridge between the domains of high-order information-theory and hyperharmonic analysis, and introduces a complete workflow for the study of high-order information-theoretic signals using the hyperharmonic modes of a structural simplicial complex. Our choice of hyperharmonic modes is based on a discrete version of the Laplace-de Rham operator that geometrically encodes the strength of low-order interactions. Our workflow makes no assumptions about the structure of the data, and hence can be applied to a broad range of scenarios. As a proof of concept, we illustrate our approach analysing the musical scores of the latter symphonies written by F.J. Hadyn, were our results demonstrate the far superior dimensionality-reduction capabilities of our method compared to other representations.
The rest of the article is structured as follows. First, Section 2 introduces key notions from multivariate information theory, reviewing the state-of-the-art of high-order metrics. Then, Section 3 presents fundamental notions from combinatorial topology, required to define the Fourier transform of higher-dimensional signal. Section 4 introduces our proposed workflow, and Section 5 illustrates the method on Hadyn’s symphonies. Finally, Section 6 summarizes our conclusions and discusses future work.
2 High order information-theoretic measures
Let us consider a scientist studying a complex system, whose state is described by the vector . Let us assume that the scientist has enough data to allow for the construction of a reliable statistical description of its joint probability, which is denoted by . The goal of this study is to leverage the statistics encoded in in order to understand the structure of interdependencies that characterize . This endeavor can lead either to build statistical markers to classify different systems — or different states of the same system, or to build parallels between seemingly heterogeneous systems based on the similarity of their internal structure.
Through this section, random variables are denoted by capital letters (e.g. ) and their realisations by lower case letters (e.g. ). Random vectors and their realisations are denoted by capital and lower case boldface letters, respectively.
2.1 Networks and hypergraphs based on pairwise statistics
A popular way to analyze the interactions within is to represent them as networks, where each variable is represented as a node, and edges between nodes represent the strength of their interaction. A simple way to build such a network is to calculate the correlation matrix with components given by
and use it as an adjacency matrix — either binarising its components via a threshold, or considering weighted edges. This construction, however, only captures linear relationships between the variables. More encompassing analyses often consider non-linear measures of dependency such as Shannon’s mutual information and focus on the matrix of mutual information terms, which are computed as
2.2 High-order statistical effects
It is important to realise that the joint probability distribution may contain substantial information that is not assessed by any of the pairwise marginals . An elegant way of gaining insights into this is provided by information geometry, as presented in Amari (2001) (for related discussions, see also Rosas et al. (2016)). Consider the -marginals that are obtained by marginalising over variables. One can see that the -marginals provide a more detailed description of the system than the -marginals by noting that the latter can be directly computed from the former by marginalising the corresponding variables. In contrast, the process of marginalising involves irreversible information loss, as there are many marginals that are consistent with a given a set of -marginals.
Perhaps the simplest example of high-order statistical dependency is given by the exclusive-or (xor) logic gate. To see this, consider two independent fair coins and , and let
A quick calculation shows that the mutual information matrix of has all its off-diagonal elements equal to zero, making it indistinguishable from an alternative situation where is just another independent fair coin. Therefore, put simply, any “xor-like” effects is completely ignored by constructions based only on pairwise statistics — either networks or hypergraphs.
While commonly neglected, high-order statistics has been recently proven to be instrumental in a number of systems. For example, synergies have been shown to play a key role in distributed interdependent systems, including the most complex types of elementary cellular automata (Rosas et al., 2018). It has also been suggested that high-order interactions could drive thermalisation processes within closed systems (Lindgren and Olbrich, 2017). Additionally, high-order interactions have been argued to play a key role in neural information processing (Wibral et al., 2017) and high-order brain functions (Tononi et al., 1994), being at the core of popular metrics employed in consciousness science (Mediano et al., 2019). Furthermore, it has been recently shown that high-order statistics also play a crucial role in enabling emergent phenomena (Rosas et al., 2020a).
Shannon’s mutual information is limited to capture the dependencies of two groups of variables, but cannot directly assess triple or higher interactions. The most popular non-negative multivariate extensions of the mutual information are the Total Correlation (TC) (Watanabe, 1960) and the Dual Total Correlation (DTC) (Han, 1978), which are defined as
Above, corresponds to Shannon’s entropy, is the conditional Shannon entropy, and is the vector of all variables except (i.e., ). Importantly, both TC and DTC are zero if and only if all variables are jointly statistically independent — i.e. if .
Unfortunately, both TC and DTC provide metrics for high-order interdependency which are difficult to analyse together. A recent approach, introduced in Rosas et al. (2019), proposes to employ a linear transform over these two metrics to obtain the O-information and the S-information, which have more intuitive interpretations.
Given a set of random variables , their O-information is defined as
Similarly, their S-information is defined as
The O-information can be seen as a revision of the measure of neural complexity proposed by Tononi, Sporns and Edelman in Tononi et al. (1994), which provides a mathematical construction that is closer to their original desiderata (Rosas et al., 2019). In effect, the O-information is a signed metric that satisfies the following key properties:
It is zero for systems with only pairwise interdependencies.
It is additive over non-interactive subsystems.
It is maximised by redundant distributions, and minimised by synergistic (“xor-like”) distributions.
Hence, implies a predominance of statistical synergy within the system . Conversely, implies that the system is redundancy-dominated.
On the other hand, the S-information is an over-encompassing account of interdependencies taking place at all orders, being sometimes described as a “very mutual information” (James et al., 2011). In fact, a quick calculation shows that the S-information can be decomposed as , which can be seen as a chain rule where the interdependencies involving each variable are sequentially addressed.
In summary, while the TC and DTC provide alternative representations to the same construct, and provide a complementary account of the system: the latter addressing the overall strength of interdependencies, and the former qualitatively characterising their dominant nature.
2.4 Other metrics of high-order effects
Another popular metric of high-order interdependencies in the Interaction Information, first introduced in (McGill, 1954) for systems with three variables.111The Interaction Information is closely related to the I-measures (Yeung, 1991), the co-information (Bell, 2003), and the multi-scale complexity (Bar-Yam, 2004)1 Building on an application of the inclusion-exclusion principle to entropies, the Interaction Information of is a signed metric given by
where the sum is performed over all subsets , with the cardinality of , and the vector of all variables with indices in . While this measure has a direct interpretation as redundancy minus synergy for , it no longer reflects this balance for larger system sizes (Williams and Beer, 2010, Section V). However, the Interaction Information can still be interpreted for arbitrary under a topological formulation of information, as described in Baudot and Bennequin (2015), Baudot et al. (2019).
Other well-known metrics of high-order effects include the Redundancy-Synergy Index (Chechik et al., 2002, Timme et al., 2014), the Connected Information (Amari, 2001, Schneidman et al., 2003b), and the Partial Entropy Decomposition (Ince, 2017). Furthermore, a detailed exploration of multivariate decompositions can be found in the Partial Information Decomposition framework (Williams and Beer, 2010) and its constantly growing associated literature (Faes et al., 2017, Finn and Lizier, 2018, Ay et al., 2019, Rosas et al., 2020b, Makkeh et al., 2020).
3 Hyperharmonic analysis
This section, subdivided into three parts, introduces the basic concepts from combinatorial topology that we used to decompose high-order signals into hyperharmonic modes. First, Section 3.1 describes the objects over which signals will be considered; these are higher-dimensional versions of weighted graphs known as weighted simplicial complexes. Then, Section 3.2 introduces the algebraic structure used to model high-order signals on weighted simplicial complexes; it consists of a family of inner product spaces, one for each dimension, and a pair of canonical linear maps between adjacent spaces. Finally, Section 3.3 introduces the discrete analogue of the Laplace-de Rham operator, and defines the Fourier basis using a maximal set of linearly independent eigenvectors of this operator. Throughout the presentation, we provide references to more general treatments and original sources when possible.
3.1 Simplicial complexes
A hypergraph is determined by a set of vertices with and a set of hyper-edges , where is the power set of . Furthermore, a hypergraph is said to be a simplicial complex if it satisfies two conditions:
all singletons with are included in , and
if and is a subset of , then .
Please note that the passage to simplicial complexes does not restrict the theory of hypergraphs significantly, since to every hypergraph one can assign a canonical simplicial complex by downward closure. Explicitly, this is the smallest simplicial complex that contains the hypergraph. For an introduction to graphs and hypergraphs, we refer to Berge and Minieka (1973); for a comprehensive introduction to hypergraphs in the context of complex system analysis see Johnson (2013).
If is a simplicial complex, the elements of are referred to as simplices; and their dimension is defined as one less than their cardinality (i.e. the number of elements they connect). This shift can be understood noting that the natural dimension of a point, corresponding to a singleton, is . The subset of consisting of simplices of dimension is denoted by . The elements of are typically called “-simplices”, with the -simplices and -simplices being informally called vertices and edges, respectively. If are the vertices of a simplex, we denote this simplex by .
The simplicial complex with vertices and containing all possible simplices is referred to as the standard -simplex. In our applications, the vertices of the will be associated with a set of random variables .
3.2 Higher-dimensional signals
An -dimensional signal on a simplicial complex is a function , that is to say, an assignment of a real number to each -simplex. We refer to -signals with as high-order signals.
Consider a collection of random variables . For every , the high-order -signals and on are defined by
For a simplicial complex and , we denote by the vector space generated by all the -simplices of , i.e.,
Please note that there is a natural bijection between the set of -signals on and established by
This bijection is used implicitly when referring to the elements of as -signals.222In the mathematics literature, the elements of are called “real-valued -chains”, but we do not use this terminology.2
Let us now consider the linear map defined on basis elements by
with denoting the absence of from the simplex. One can visualise geometrically in terms of the boundary of a basis element. Up to a sign, the basis elements appearing as summands on are all simplices of dimension contained in . Furthermore, the sign is determined in terms of the orientations of the simplices, as illustrated in Figure 1. The maps play a central role in algebraic topology holding a significant amount of topological information. In particular, the difference between the dimension of the kernel of and the dimension of the image of is known as the -Betti number of the simplicial complex, a powerful topological invariant generalising the Euler characteristic. For a systematic treatment of these ideas, please consult Hatcher (2002).
In this work we are not only interested in the topology of , but also on the “geometric” structure encoded by weights assigned to its simplices. A weighted simplicial complex is a pair where is a simplicial complex and is a set of non-negative weight functions. In the following, a weighted standard simplex (see Example 1) is referred to as a structural simplex.
We can build a structural simplex using a collection of random variables by following Example 1, and defining the weights as follows. First, for each vertex in we assign , and for each edge we assigns the mutual information
Subsequently, for an -simplex with , its weight is defined as the mean value of the mutual information of all pairs of random variables in .
Given a weighted simplicial complex , one can encode the structural information provided by (where the subscript denotes the dimension) as an inner product on the vector space of -signals on . Explicitly, for any pair of -simplices we have
Importantly, this inner product allow us to introduce the adjoint operator of , which is denoted by . That is to say, the operator is defined by the identity
which holds for any and . Note that does not depend on the weights , but does.
3.3 High-order Laplace operator and Fourier basis
The classical sine and cosine functions form the basis of the spectral representation of time-domain signals. In effect, classical Fourier analysis guarantees that a large class of functions are expressible in terms of linear combination of these functions. The coefficients of this linear combination are the Fourier coefficients, which correspond to a geometric projection of the original function over this new basis. Interestingly, numerous signals of practical relevance are more compactly represented in the spectral domain, and hence Fourier analysis is often employed as a principled way to perform dimensionality-reduction. For a more extensive exposition of these ideas, we refer the reader to Bracewell (1986).
Importantly, sine and cosine functions are also eigenfunctions of the Laplace operator on the circle — a one-dimensional manifold. Put differently, a spectral representation is equivalent to a diagonalisation of the Laplace operator. Mathematicians have generalized the Laplace operator to higher-dimensional manifolds via the Laplace-de Rham operators. In this more general context, functions on a smooth manifold lie at the bottom of a sequence of higher-dimensional objects on which the corresponding Laplace-de Rham operator acts. The eigenvectors of these operators play a central role in modern geometry — most notably through Hodge theory (Hodge and Atiyah, 1989). For an exposition of these ideas we refer the reader to Morita et al. (2001).
A discrete analogue of the higher-dimensional objects on which the Laplace-de Rham operators act is provided by high-order signals defined over a simplicial complex. In this correspondence, the Laplace-de Rham operators are represented by the discrete Laplace operators first introduced in Eckmann (1944). See also Horak and Jost (2013) where a weighted version of these is presented, and Parzanchevski and Rosenthal (2017) where the connection to random walks is explored. We now introduce the particular version of the discrete Laplace-de Rham operators that we use.
Let be a weighted simplicial complex. The -Laplace operator is defined by
Notice that we are abusing notation by omitting from the notation referencing the Laplace operators since, although does not depend on , and therefore do. For the interested reader, we remark that this Definition 2 has a strong resemblance to the form the Laplace-de Rham operator adopts in Rimmanian geometry. In effect, by denoting by the exterior derivative of differential forms and its adjoint, the Laplace-de Rham operator in this context can be expressed as . Furthermore, the geometry defined by the Riemannian metric is reflected in the Laplace-de Rham operator through the operator only. Consult Morita et al. (2001) for further details.
The well-known graph Laplacian, a central concept in spectral graph theory (Chung et al., 1997), is equivalent to the -Laplace operator defined above — when a weighted graph is regarded as a weighted simplicial complex. Importantly, the -Laplace operator is self-adjoint for any weighted simplicial complex , i.e.
for all . This implies that is diagonalisable; that is to say, there exists a basis of consisting of eigenvectors of .333This result is a more general form of the well-known fact that symmetric (i.e. self-adjoint) matrices are diagonalisable.3 An -Fourier basis for is a maximal set of linearly independent orthonormal (with respect to ) eigenvectors of . Please note that we speak of the Fourier basis, with the understanding that there is a sign choice for each of its elements.
Finally, the hyperharmonic representation of a high-order signal defined over a weighted simplicial complex is given by its change of bases from the canonical to the Fourier one. In the case of graphs, the use of this transformation as a dimensionality-reduction method was pioneered by Belkin and Niyogi (2003), and has served as an early application of the field of graph signal processing — see for example Shuman et al. (2013), Sandryhaila and Moura (2014) or Ortega et al. (2018) for an overview of this field. In recent years, some key ideas from graph signal processing have been adapted to hypergraphs, see for example Barbarossa and Sardellitti (2020) and Schaub and Segarra (2018); however, the applicability of Fourier analysis as a compression tool of high-order signals is, to a large extend, still unexplored territory.
4 Proposed workflow
This section describes our proposed workflow, which capitalises the theoretical constructs elaborated in Sections 2 and 3. The overall pipeline is illustrated in Figure 2, being composed of six steps that are described in the following.
4.1 Steps of the analysis
The proposed pipeline consists in six steps.
From data to a distribution: The starting point of the workflow is multivariate data, which typically takes the shape of a sequence of vectors of dimension (e.g. successive samples of time series of the form at different values of ). Using these data, the first step is to construct a joint probability distribution for a -dimensional vector, denoted as .
The structural simplex: To build the structural simplex, the first step is to use to calculate the mutual information for each pair of variables of the system. These values are then used to define the weights on a pairwise network, where each edge correspond to one of the variables . Subsequently, a weighted -simplex is build for each by taking the average value of all the edges that connect two elements within the simplex.
High-order signals: For each subset of cardinality of , one computes the O-information and S-information and arrange them as high-order signals following Example 2. These signals are stored as -dimensional vectors and , representing them in the canonical basis of simplices. Note that there is a standard order on the elements of this basis, which is given by the lexicographic principle (i.e. if and for all ).
The Laplace operator: Using the weights of the structural simplex constructed in Step (2), one then builds the corresponding -Laplace operator as in Definition 2. Concretely, we use the canonical basis to represent the weights of order within a diagonal matrix , and represents the linear map as a matrix of the same dimensions (for an algorithmic description of the construction of , see B). Then, the discrete -Laplace operator is represented in the canonical basis as the matrix
with and given by
The Fourier basis: The -Fourier basis of the structural simplex is constructed by choosing a maximal linearly independent set of eigenvectors of the -Laplace operator that are orthonormal with respect to the inner product defined by the weights of the structural simplex (see Eq. (8)). Concretely, one needs to find a matrix such that
with a diagonal matrix, which also satisfies
where is the identity matrix of dimension .
Hyperharmonic representation: As a final step, one calculates the Fourier transform of the high-order signals calculated in Step (3), denoted by and , as follows:
This workflow has been designed with modularity and flexibility in mind, in order to facilitate its adaptation to the needs of diverse applications. This section highlights possible variations over the workflow that can better accommodate the specific needs of different scenarios.
First, note that our choice of and in Step (3) as high-order information signals is based on their interpretability, and on their promising value for the analysis of complex systems. Nevertheless, other high-order measures (e.g. the ones described in 2.4) can also be analysed following the same pipeline.
Additionally, Step (2) suggests to build the structural simplex by propagating the underlying mutual information from edges to higher-dimensional simplices via averages. However, one could use other constructions: e.g. use the maximum or minimum mutual information instead of the average. Additionally, one could replace the mutual information with other non-negative metrics of similarity, including the total variation distance, the Wasserstein distance, or the absolute value of the Pearson correlation. Furthermore, one could also use non-negative high-order metrics (such as the TC or DTC, see Section 2.3) for building the structural simplex directly.
Finally, it is important to note that Step (1) is included mainly for pedagogical purposes, but is often omitted in practice. In effect, most modern techniques to estimate information-theoretic quantities from data avoid to build an explicit joint distribution, as this introduces additional biases. For discrete data, we recommend considering Bayesian estimators such as the ones discussed in Archer et al. (2013), and the software package DIT (James et al., 2018). For continuous data, our preferred choice are Kraskof estimators (Kraskov et al., 2004) that can be implemented via JIDT (Lizier, 2014). However, these estimators require substantial amounts of data, and a more flexible alternative are provided by estimators based on Gaussian Copulas (Ince et al., 2017).
5 Proof of concept: Haydn symphonies
As an illustration of the proposed workflow, this section presents an analysis of the degree of dimensionality-reduction that can be attained by performing hyperharmonic analysis over the O-information and S-information signals calculated over a small dataset. For this purpose, we use data from the music scores of Franz Joseph Haydn (1732–1809), one of the most iconic figures of the Classic Period. We focus on Haydn’s latter “London symphonies”, which are typically divided into two groups: Symphonies Nos. 93–98, which were composed during the first visit of Haydn to London, and Symphonies 99–104, which were composed either in Vienna or London during Haydn’s second visit (Clark, 2005).
5.1 Method description
Our analysis is based on electronic scores that are publicly available at http://kern.ccarh.org, from where we extracted the scores of Symphonies No. 93--94 and No. 99--104.444The data for Symphonies 95–98 was not available.4 All symphonies use the same basic instrumentation: flutes, oboes, bassoons, horns, trumpets, timpani, violins, viola, violoncello, and double bass; only some of these symphonies employ clarinets, which were therefore not included in the analysis. To avoid duplicates, our analyses consider only one instrument of each kind, which left an arrangement of nine parts (violoncello and double bass were assumed to be equivalent).
The scores of the four movements of the selected symphonies were pre-processed in Python 3.8.5 using the Music21 package (http://web.mit.edu/music21). Each movement was transformed into nine coupled time series taking 13 possible values — one for each note plus one for the silence, using a small rhythmic duration as time unit. With these data, the joint distribution of the values for the nine-note chords was estimated using their empirical frequency. Note that regularisation methods (such as Laplace smoothing) were not employed, as many configurations (e.g. highly dissonant chords) are never explored in the Classic repertoire.
Our subsequent analyses were restricted to high-order signals depending on the joint probability of no more than instruments. Given the structure of the dataset (8 symphonies with 4 movements each), we computed the structural simplex using a distribution calculated using all the data. In contrast, individual high-order signals were calculated for each movement of each symphony by using only the corresponding data.
To measure compressibility, we used the following metric defined for any basis. Consider a given -dimensional high-order signal, whose coefficients on the given basis are . Without loss of generality we assume that if . Then, we define the functions
with being the (normalised) explained variance by the -th strongest component, and the cumulative explained variance by the strongest components. This definition is motivated by Parseval’s theorem, which guarantees that the sum of the square of the Fourier coefficients is equal to the variance of the signal; hence, the of the Fourier transform of a signal corresponds to the percentage of the variance that is accounted by the strongest components.
The hyperharmonic representation of the considered high-order signals ( and ) was found to be substantially more concentrated than their corresponding canonical representations. Figure 3 illustrates the curves of cumulative explained variance for various dimensions, and Table 1 presents the number of components required to fulfil various reconstruction levels. In particular, it was found that a small number of components suffices to account for most of the variance observed in the hyperharmonic representations of and . Moreover, this good performance is found to be stable across dimensions.
|Cumulative Explained Variance|
|O-info||2||3 / 13||5 / 26||9 / 38||14 / 47||28 / 63|
|3||4 / 22||8 / 41||13 / 57||20 / 71||37 / 95|
|4||1 / 35||3 / 58||6 / 78||12 / 93||29 / 113|
|5||2 / 34||4 / 53||8 / 65||12 / 73||26 / 82|
|S-info||2||2 / 29||4 / 48||6 / 62||8 / 71||19 / 81|
|3||3 / 52||7 / 80||11 / 99||14 / 111||27 / 123|
|4||1 / 57||2 / 86||3 / 103||9 / 113||24 / 123|
|5||2 / 42||4 / 60||8 / 71||13 / 78||26 / 83|
To verify the unique properties of the Laplace operators, an additional control was run were the same signals were transformed according to randomly-generated bases. Interestingly, these representations do not exhibit the degree of dimensionality-reduction shown by the hyperharmonic representations (see A).
Finally, our results also show that the canonical representation of the S-information is much less concentrated than the canonical representation of the O-information. This dissimilarity suggests that the O-information is capturing a more specific signature of the different modalities of interdependency that exist across the orchestra. Moreover, the fact that the dissimilarity is not seen in their hyperharmonic representation further suggests that both signals may actually carry an equivalent amount of information, which happens to be differently localised over the canonical basis.
Complex systems are characterised by having multiple levels of organisation, which makes network analyses focused on pairwise interactions unable to give a full account of their properties. Phenomena beyond pairwise interactions can be effectively captured by high-order information-theoretic metrics; however, their applicability and interpretability is limited due to the fast growth of their cardinality as a function of the system’s size. Here we propose to represent these high-order metrics using the hyperharmonic modes of a geometrical representation of structural properties of the system. This provides a principled approach to constructing low-dimensional representations of high-order signals, which can retain most of the informational content.
Our proposed approach is widely applicable, and promises to enable a range of future explorations. It is our hope that this incursion into the intersection of multivariate information theory and combinatorial topology will motivate further developments on this fertile area of research.
Appendix A Control employing randomly generated bases
Here we review the additional control that confirms that the results presented in Section 5.2 are not due to a limitation of the canonical bases, but are due to the special advantages of the Laplace operators. Specifically, we calculated the average CEV over randomly generated bases for the same high-order signals considered in Section 5.2, and then compared them to the CEV obtained via the hyperharmonic representation. Our result, shown in Figure 4, reveals that the explained variance attainable via randomly generated bases is substantially and consistently lower than the one associated to the Fourier bases. This provides additional evidence on the favorable properties of the hyperharmonic representation of the considered signals.
Appendix B Construction of
The matrices correspond to the representation of the linear maps on the canonical basis (see Section 3.2 and Figure 1). Algorithmically, one can use Eq. (7) to determine the value of each column, inserting either a or a to the entries corresponding to simplices in its boundary — as depicted in Figure 1 — and setting the other entries to 0. To illustrate the procedure, let us present the four matrices that correspond to :
- Amari (2001) S. I. Amari. Information geometry on hierarchy of probability distributions. IEEE Transactions on Information Theory, 2001. ISSN 00189448. doi: 10.1109/18.930911.
- Archer et al. (2013) E. Archer, I. M. Park, and J. W. Pillow. Bayesian and quasi-bayesian estimators for mutual information from discrete data. Entropy, 15(5):1738–1755, 2013.
- Atasoy et al. (2016) S. Atasoy, I. Donnelly, and J. Pearson. Human brain networks function in connectome-specific harmonic waves. Nature Communications, 2016. ISSN 20411723. doi: 10.1038/ncomms10340.
- Atasoy et al. (2017) S. Atasoy, L. Roseman, M. Kaelen, M. L. Kringelbach, G. Deco, and R. L. Carhart-Harris. Connectome-harmonic decomposition of human brain activity reveals dynamical repertoire re-organization under lsd. Scientific reports, 7(1):1–18, 2017.
- Ay et al. (2019) N. Ay, D. Polani, and N. Virgo. Information decomposition based on cooperative game theory. arXiv preprint arXiv:1910.05979, 2019.
- Bar-Yam (2004) Y. Bar-Yam. Multiscale variety in complex systems. Complexity, 2004. ISSN 10762787. doi: 10.1002/cplx.20014.
- Barbarossa and Sardellitti (2020) S. Barbarossa and S. Sardellitti. Topological signal processing over simplicial complexes. IEEE Transactions on Signal Processing, 2020.
- Battiston et al. (2020) F. Battiston, G. Cencetti, I. Iacopini, V. Latora, M. Lucas, A. Patania, J. G. Young, and G. Petri. Networks beyond pairwise interactions: Structure and dynamics, 2020. ISSN 03701573.
- Baudot (2019) P. Baudot. The poincare-shannon machine: statistical physics and machine learning aspects of information cohomology. Entropy, 21(9):881, 2019.
- Baudot and Bennequin (2015) P. Baudot and D. Bennequin. The homological nature of entropy. Entropy, 17(5):3253–3318, 2015.
- Baudot et al. (2019) P. Baudot, M. Tapia, D. Bennequin, and J.-M. Goaillard. Topological information data analysis. Entropy, 21(9):869, 2019.
- Belkin and Niyogi (2003) M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation, 15(6):1373–1396, 2003.
- Bell (2003) A. J. Bell. The co-information lattice. In Proc. 4th Int. Symp. Independent Component Analysis and Blind Source Separation, pages 921–926, 2003.
- Berge and Minieka (1973) C. Berge and E. Minieka. Graphs and Hypergraphs. Graphs and Hypergraphs. North-Holland Publishing Company, 1973. ISBN 9780444103994.
- Bonanno et al. (2004) G. Bonanno, G. Caldarelli, F. Lillo, S. Miccichè, N. Vandewalle, and R. N. Mantegna. Networks of equities in financial markets. In European Physical Journal B, 2004. doi: 10.1140/epjb/e2004-00129-6.
- Bracewell (1986) R. N. Bracewell. The Fourier transform and its applications, volume 31999. McGraw-Hill New York, 1986.
- Chechik et al. (2002) G. Chechik, A. Globerson, M. J. Anderson, E. D. Young, I. Nelken, and N. Tishby. Group redundancy measures reveal redundancy reduction in the auditory pathway. In Advances in neural information processing systems, pages 173–180, 2002.
- Chung et al. (1997) F. Chung, F. Graham, C. C. on Recent Advances in Spectral Graph Theory, N. S. F. (U.S.), A. M. Society, and C. B. of the Mathematical Sciences. Spectral Graph Theory. CBMS Regional Conference Series. Conference Board of the mathematical sciences, 1997. ISBN 9780821803158.
- Clark (2005) C. Clark. The Cambridge Companion to Haydn. Cambridge Companions to Music. Cambridge University Press, 2005. ISBN 9780521833479.
- Donges et al. (2011) J. F. Donges, R. V. Donner, M. H. Trauth, N. Marwan, H.-J. Schellnhuber, and J. Kurths. Nonlinear detection of paleoclimate-variability transitions possibly related to human evolution. Proceedings of the National Academy of Sciences, 108(51):20422–20427, 2011.
- Eckmann (1944) B. Eckmann. Harmonische funktionen und randwertaufgaben in einem komplex. Commentarii Mathematici Helvetici, 17(1):240–255, 1944.
- Expert et al. (2017) P. Expert, S. De Nigris, T. Takaguchi, and R. Lambiotte. Graph spectral characterization of the x y model on complex networks. Physical Review E, 96(1):012312, 2017.
- Faes et al. (2017) L. Faes, D. Marinazzo, and S. Stramaglia. Multiscale information decomposition: Exact computation for multivariate gaussian processes. Entropy, 19(8):408, 2017.
- Finn and Lizier (2018) C. Finn and J. T. Lizier. Pointwise partial information decomposition using the specificity and ambiguity lattices. Entropy, 20(4):297, 2018.
- Ganmor et al. (2011) E. Ganmor, R. Segev, and E. Schneidman. Sparse low-order interaction network underlies a highly correlated and learnable neural population code. Proceedings of the National Academy of sciences, 108(23):9679–9684, 2011.
- Gatica et al. (2020) M. Gatica, R. Cofre, P. A. Mediano, F. E. Rosas, P. Orio, I. Diez, S. Swinnen, and J. M. Cortes. High-order interdependencies in the aging brain. bioRxiv, 2020. doi: https://doi.org/10.1101/2020.03.17.995886.
- Han (1978) T. S. Han. Nonnegative entropy measures of multivariate symmetric correlations. Information and Control, 1978. ISSN 00199958. doi: 10.1016/S0019-9958(78)90275-9.
- Hatcher (2002) A. Hatcher. Algebraic topology. Cambridge University Press, Cambridge, 2002. ISBN 0-521-79160-X; 0-521-79540-0.
- Hodge and Atiyah (1989) W. Hodge and M. Atiyah. The Theory and Applications of Harmonic Integrals. Cambridge mathematical library. Cambridge University Press, 1989. ISBN 9780521358811.
- Horak and Jost (2013) D. Horak and J. Jost. Spectra of combinatorial laplace operators on simplicial complexes. Advances in Mathematics, 244:303–336, 2013.
- Iacopini et al. (2019) I. Iacopini, G. Petri, A. Barrat, and V. Latora. Simplicial models of social contagion. Nature Communications, 2019. ISSN 20411723. doi: 10.1038/s41467-019-10431-6.
- Ince (2017) R. A. Ince. The partial entropy decomposition: Decomposing multivariate entropy and mutual information via pointwise common surprisal. arXiv preprint arXiv:1702.01591, 2017.
- Ince et al. (2017) R. A. Ince, B. L. Giordano, C. Kayser, G. A. Rousselet, J. Gross, and P. G. Schyns. A statistical framework for neuroimaging data analysis based on mutual information estimated via a gaussian copula. Human Brain Mapping, 38(3):1541–1573, 2017. ISSN 10970193. doi: 10.1002/hbm.23471.
- James et al. (2011) R. G. James, C. J. Ellison, and J. P. Crutchfield. Anatomy of a bit: Information in a time series observation. Chaos: An Interdisciplinary Journal of Nonlinear Science, 21(3):037109, 2011.
- James et al. (2018) R. G. James, C. J. Ellison, and J. P. Crutchfield. “dit“: a python package for discrete information theory. Journal of Open Source Software, 3(25):738, 2018.
- Johnson (2013) J. Johnson. Hypernetworks in the science of complex systems, volume 3. World Scientific, 2013.
- Kraskov et al. (2004) A. Kraskov, H. Stögbauer, and P. Grassberger. Estimating mutual information. Physical review E, 69(6):066138, 2004.
- Latham and Nirenberg (2005) P. E. Latham and S. Nirenberg. Synergy, redundancy, and independence in population codes, revisited. Journal of Neuroscience, 25(21):5195–5206, 2005.
- Lindgren and Olbrich (2017) K. Lindgren and E. Olbrich. The approach towards equilibrium in a reversible ising dynamics model: an information-theoretic analysis based on an exact solution. Journal of Statistical Physics, 168(4):919–935, 2017.
- Lizier (2014) J. T. Lizier. Jidt: An information-theoretic toolkit for studying the dynamics of complex systems. Frontiers in Robotics and AI, 1:11, 2014.
- Makkeh et al. (2020) A. Makkeh, A. J. Gutknecht, and M. Wibral. A differentiable measure of pointwise shared information. arXiv preprint arXiv:2002.03356, 2020.
- McGill (1954) W. McGill. Multivariate information transmission. Transactions of the IRE Professional Group on Information Theory, 4(4):93–111, 1954.
- Mediano et al. (2019) P. A. Mediano, F. Rosas, R. L. Carhart-Harris, A. K. Seth, and A. B. Barrett. Beyond integrated information: A taxonomy of information dynamics phenomena. arXiv preprint arXiv:1909.02297, 2019.
- Morita et al. (2001) S. Morita, T. Nagase, A. M. Society, and K. Nomizu. Geometry of Differential Forms. Iwanami series in modern mathematics. American Mathematical Society, 2001. ISBN 9780821810453.
- Newman (2018) M. Newman. Networks. OUP Oxford, 2018. ISBN 9780192527493.
- Ortega et al. (2018) A. Ortega, P. Frossard, J. Kovačević, J. M. Moura, and P. Vandergheynst. Graph signal processing: Overview, challenges, and applications. Proceedings of the IEEE, 106(5):808–828, 2018.
- Parzanchevski and Rosenthal (2017) O. Parzanchevski and R. Rosenthal. Simplicial complexes: spectrum, homology and random walks. Random Structures & Algorithms, 50(2):225–261, 2017.
- Pastor-Satorras and Vespignani (2001) R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters, 2001. ISSN 00319007. doi: 10.1103/PhysRevLett.86.3200.
- Petri et al. (2013) G. Petri, M. Scolamiero, I. Donato, and F. Vaccarino. Topological Strata of Weighted Complex Networks. PLoS ONE, 2013. ISSN 19326203. doi: 10.1371/journal.pone.0066506.
- Petri et al. (2014) G. Petri, P. Expert, F. Turkheimer, R. Carhart-Harris, D. Nutt, P. J. Hellyer, and F. Vaccarino. Homological scaffolds of brain functional networks. Journal of the Royal Society Interface, 2014. ISSN 17425662. doi: 10.1098/rsif.2014.0873.
- Rosas et al. (2016) F. E. Rosas, V. Ntranos, C. J. Ellison, S. Pollin, and M. Verhelst. Understanding interdependency through complex information sharing. Entropy, 18(2):38, 2016.
- Rosas et al. (2018) F. E. Rosas, P. A. Mediano, M. Ugarte, and H. J. Jensen. An information-theoretic approach to self-organisation: Emergence of complex interdependencies in coupled dynamical systems. Entropy, 20(10):793, 2018.
- Rosas et al. (2019) F. E. Rosas, P. A. Mediano, M. Gastpar, and H. J. Jensen. Quantifying high-order interdependencies via multivariate extensions of the mutual information. Physical Review E, 100(3):032305, 2019.
- Rosas et al. (2020a) F. E. Rosas, P. A. Mediano, H. J. Jensen, A. K. Seth, A. B. Barrett, R. L. Carhart-Harris, and D. Bor. Reconciling emergences: An information-theoretic approach to identify causal emergence in multivariate data. arXiv preprint arXiv:2004.08220, 2020a.
- Rosas et al. (2020b) F. E. Rosas, P. A. Mediano, B. Rassouli, and A. Barrett. An operational information decomposition via synergistic disclosure. arXiv preprint arXiv:2001.10387, 2020b.
- Rubinov and Sporns (2010) M. Rubinov and O. Sporns. Complex network measures of brain connectivity: Uses and interpretations. NeuroImage, 2010. ISSN 10538119. doi: 10.1016/j.neuroimage.2009.10.003.
- Sandryhaila and Moura (2014) A. Sandryhaila and J. M. Moura. Discrete signal processing on graphs: Frequency analysis. IEEE Transactions on Signal Processing, 62(12):3042–3054, 2014.
- Schaub and Segarra (2018) M. T. Schaub and S. Segarra. Flow smoothing and denoising: graph signal processing in the edge-space. In 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pages 735–739. IEEE, 2018.
- Schneidman et al. (2003a) E. Schneidman, W. Bialek, and M. J. Berry. Synergy, redundancy, and independence in population codes. Journal of Neuroscience, 23(37):11539–11553, 2003a.
- Schneidman et al. (2003b) E. Schneidman, S. Still, M. J. Berry, W. Bialek, et al. Network information and connected correlations. Physical review letters, 91(23):238701, 2003b.
- Shuman et al. (2013) D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE signal processing magazine, 30(3):83–98, 2013.
- Stramaglia et al. (2020) S. Stramaglia, T. Scagliarini, B. C. Daniels, and D. Marinazzo. Quantifying dynamical high-order interdependencies from the o-information: an application to neural spiking dynamics. arXiv preprint arXiv:2007.16018, 2020.
- Timme et al. (2014) N. Timme, W. Alford, B. Flecker, and J. M. Beggs. Synergy, redundancy, and multivariate information measures: an experimentalist’s perspective. Journal of computational neuroscience, 36(2):119–140, 2014.
- Tononi et al. (1994) G. Tononi, O. Sporns, and G. M. Edelman. A measure for brain complexity: relating functional segregation and integration in the nervous system. Proceedings of the National Academy of Sciences, 91(11):5033–5037, 1994.
- Vasiliauskaite and Rosas (2020) V. Vasiliauskaite and F. E. Rosas. Understanding complexity via network theory: a gentle introduction. arXiv preprint arXiv:2004.14845, 2020.
- Watanabe (1960) S. Watanabe. Information theoretical analysis of multivariate correlation. IBM Journal of research and development, 4(1):66–82, 1960.
- Wibral et al. (2017) M. Wibral, V. Priesemann, J. W. Kay, J. T. Lizier, and W. A. Phillips. Partial information decomposition as a unified approach to the specification of neural goal functions. Brain and cognition, 112:25–38, 2017.
- Williams and Beer (2010) P. L. Williams and R. D. Beer. Nonnegative decomposition of multivariate information. arXiv preprint arXiv:1004.2515, 2010.
- Yeung (1991) R. W. Yeung. A New Outlook on Shannon’s Information Measures. IEEE Transactions on Information Theory, 1991. ISSN 15579654. doi: 10.1109/18.79902.