Hyperharmonic analysis for the study of high-order information-theoretic signals

Anibal M. Medina-Mardones; Fernando E. Rosas; Sebastián E. Rodríguez; Rodrigo Cofré

Identifier: https://froehlichmarcel.inrupt.net/public/dcbc8aba-9f46-4ec8-b9cc-2a0eecc4fcb0

Derived From: https://www.arxiv-vanity.com/papers/2010.01117/

Derived On: 2020-10-08

Anibal M. Medina-Mardones^{1, 2}, Fernando E. Rosas^{3, 4, 5}, Sebastián E. Rodríguez^{6}, Rodrigo Cofré^{7}

^{1} Laboratory of Topology and Neuroscience, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

^{2} Department of Mathematics, University of Notre Dame du Lac, Notre Dame, Indiana, USA

^{3} Data Science Institute, Imperial College London, London SW7 2AZ, UK

^{4} Center for Psychedelic Research, Department of Medicine, Imperial College London, London SW7 2DD, UK

^{5} Center for Complexity Science, Imperial College London, London SW7 2AZ, UK

^{6} CIMFAV-Ingemat, Facultad de Ingeniería, Universidad de Valparaíso, Valparaíso, Chile

^{7} Universidad Técnica Federico Santa María, Departamento de Informática, Valparaíso, Chile

Abstract

Network representations often cannot fully account for the structural richness of complex systems spanning multiple levels of organisation. Recently proposed high-order information-theoretic signals are well-suited to capture synergistic phenomena that transcend pairwise interactions; however, the exponential growth of their cardinality severely hinders their applicability. In this work, we combine methods from harmonic analysis and combinatorial topology to construct efficient representations of high-order information-theoretic signals. The core of our method is the diagonalisation of a discrete version of the Laplace-de Rham operator that geometrically encodes structural properties of the system. We capitalise on these ideas by developing a complete workflow for the construction of hyperharmonic representations of high-order signals, which is applicable to a wide range of scenarios.

September 2020

Keywords: high-order phenomena, Laplace operator, harmonic analysis, signal processing, information theory

1 Introduction

The principle of representing interdependencies as networks has revolutionised complexity science by introducing a systematic approach to gain insight into the inner structure of a wide range of complex systems (Newman, 2018). These networks provided a lingua franca to describe the properties of interdependencies found in chemical, biological, social, and technological systems (Vasiliauskaite and Rosas, 2020), and have enabled great advances in a widening range of areas including computational neuroscience (Rubinov and Sporns, 2010), human evolution (Donges et al., 2011), financial analysis (Bonanno et al., 2004), and epidemic spreading (Pastor-Satorras and Vespignani, 2001), just to name a few. However, by their very nature these methods focus on the analysis of pairwise interactions, and hence are prone to miss important high-order synergistic phenomena that are a hallmark of many complex systems.

This critical limitation of traditional network analyses has been acknowledged by a number of recent research efforts that focus on developing techniques to study high-order interactions (Petri et al., 2014, Iacopini et al., 2019, Battiston et al., 2020). These developments have provided novel techniques capable of, for example, detecting non-local structures (Petri et al., 2013), highlighting the role of inhomogeneities in functional connections (Petri et al., 2014), and characterising discontinuous transitions (Iacopini et al., 2019). However, it is important to notice that many of these approaches are based on hypergraphs and other high-order structures built solely from pairwise statistics, and hence their scope remains limited.

An attractive set of tools to further extend the reach of high-order analyses can be found in a parallel body of work, which originated in efforts to develop tools to capture high-order statistical phenomena related to the brain (Tononi et al., 1994, Schneidman et al., 2003a, Latham and Nirenberg, 2005, Ganmor et al., 2011). Particularly interesting are multivariate extensions of Shannon’s mutual information, including the Interaction Information (McGill, 1954), Total Correlation (Watanabe, 1960), and Dual Total Correlation (Han, 1978), which can be used to gain insights about the high-order structure exhibited by groups of three or more interdependent variables (see e.g. Timme et al. (2014), Baudot et al. (2019)). In this work we focus on the recently proposed O-information (Rosas et al., 2019), which is a principled tool to identify synergy-dominated systems, and has been found to be relevant for analysing various complex systems, including the study of neural spiking data (Stramaglia et al., 2020) and aging in fMRI data (Gatica et al., 2020).

When applied to large systems, metrics such as the O-information are naturally represented as high-order signals whose domain is the set of all hyper-edges on a regular hypergraph. Because the cardinality of these hypergraphs grows exponentially with the system size, a key open problem is to find efficient ways to represent the content of these signals. While simple approaches such as computing the average of the signal across dimensions can be effective (Gatica et al., 2020), an important challenge is to find principled ways to generate low-dimensional representations of these signals while preserving their intrinsic relational structure across the hypergraph. A popular technique to study similar issues in the case of traditional (weighted) networks is to transform the signals into a basis of eigenvectors of the graph Laplacian (Belkin and Niyogi, 2003), which has been used with great success in the context of graph signal processing (see e.g. Shuman et al. (2013), Sandryhaila and Moura (2014), Atasoy et al. (2016, 2017), Expert et al. (2017)). Related methods based on harmonic analysis and combinatorial topology, which we refer to as hyperharmonic analysis, have been recently developed to study high-order signals (see e.g. Barbarossa and Sardellitti (2020)); however, to the best of our knowledge, they have not yet been used to analyse high-order information-theoretic signals. Other aspects of the relationship between high-order information-theoretic quantities and topology have been explored in Baudot and Bennequin (2015), Baudot (2019).

This article establishes a bridge between the domains of high-order information theory and hyperharmonic analysis, and introduces a complete workflow for the study of high-order information-theoretic signals using the hyperharmonic modes of a structural simplicial complex. Our choice of hyperharmonic modes is based on a discrete version of the Laplace-de Rham operator that geometrically encodes the strength of low-order interactions. Our workflow makes no assumptions about the structure of the data, and hence can be applied to a broad range of scenarios. As a proof of concept, we illustrate our approach by analysing the musical scores of the latter symphonies written by F.J. Haydn, where our results demonstrate the far superior dimensionality-reduction capabilities of our method compared to other representations.

The rest of the article is structured as follows. First, Section 2 introduces key notions from multivariate information theory, reviewing the state-of-the-art of high-order metrics. Then, Section 3 presents fundamental notions from combinatorial topology, required to define the Fourier transform of higher-dimensional signals. Section 4 introduces our proposed workflow, and Section 5 illustrates the method on Haydn’s symphonies. Finally, Section 6 summarizes our conclusions and discusses future work.

2 High order information-theoretic measures

Let us consider a scientist studying a complex system, whose state is described by the vector $\boldsymbol{X}^{N} = (X_{0}, \dots, X_{N})$. Let us assume that the scientist has enough data to allow for the construction of a reliable statistical description of its joint probability distribution, which is denoted by $p (X_{0}, \dots, X_{N})$. The goal of this study is to leverage the statistics encoded in $p$ in order to understand the structure of interdependencies that characterize $\boldsymbol{X}^{N}$. This endeavor can lead either to statistical markers that classify different systems (or different states of the same system), or to parallels between seemingly heterogeneous systems based on the similarity of their internal structure.

Throughout this section, random variables are denoted by capital letters (e.g. $X, Y$) and their realisations by lower case letters (e.g. $x, y$). Random vectors and their realisations are denoted by capital and lower case boldface letters, respectively.

2.1 Networks and hypergraphs based on pairwise statistics

A popular way to analyze the interactions within $\boldsymbol{X}^{N}$ is to represent them as networks, where each variable $X_{0}, \dots, X_{N}$ is represented as a node, and edges between nodes represent the strength of their interaction. A simple way to build such a network is to calculate the correlation matrix $R_{\boldsymbol{X}^{N}} := [r_{i, j}]$ with components given by

$r_{i, j} = \sum_{x_{i}, x_{j}} p (x_{i}, x_{j}) \, x_{i} x_{j} - \sum_{x_{i}} p (x_{i}) \, x_{i} \sum_{x_{j}} p (x_{j}) \, x_{j},$ (1)

and use it as an adjacency matrix, either binarising its components via a threshold or considering weighted edges. This construction, however, only captures linear relationships between the variables. More encompassing analyses often consider non-linear measures of dependency such as Shannon’s mutual information and focus on the matrix $I_{\boldsymbol{X}^{N}} := [I (X_{i}; X_{j})]$ of mutual information terms, which are computed as

$I (X_{i}; X_{j}) = \sum_{x_{i}, x_{j}} p (x_{i}, x_{j}) \log \frac{p (x_{i}, x_{j})}{p (x_{i}) p (x_{j})}.$ (2)
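As a rough illustration of Eq. (2), the following sketch estimates the mutual information matrix from integer-coded samples using empirical frequencies (a plug-in estimator). The function name `mutual_information_matrix`, the base-2 logarithm, and the toy data are assumptions made for this example; the text does not prescribe a particular estimator.

```python
import numpy as np

def mutual_information_matrix(samples):
    """Plug-in estimate of I(X_i; X_j), in bits, from discrete samples.

    samples: array of shape (T, N + 1), one column per variable.
    The diagonal is left at zero, since only off-diagonal terms are used as edges.
    """
    samples = np.asarray(samples)
    n_vars = samples.shape[1]
    # Re-code every column as integers 0, 1, 2, ... so it can index a table.
    codes = [np.unique(samples[:, k], return_inverse=True)[1] for k in range(n_vars)]
    mi = np.zeros((n_vars, n_vars))
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            # Empirical joint pmf of (X_i, X_j) as a contingency table.
            joint = np.zeros((codes[i].max() + 1, codes[j].max() + 1))
            np.add.at(joint, (codes[i], codes[j]), 1.0)
            joint /= joint.sum()
            # Product of the two marginals, broadcast to the table shape.
            indep = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
            mask = joint > 0
            mi[i, j] = mi[j, i] = np.sum(joint[mask] * np.log2(joint[mask] / indep[mask]))
    return mi

# Toy data (hypothetical): column 2 copies column 0, column 1 is unrelated to both.
samples = np.array([[0, 0, 0], [0, 1, 0], [1, 0, 1], [1, 1, 1]])
mi = mutual_information_matrix(samples)
```

In this toy case the copied pair shares one bit while the unrelated pairs share none. With real data, plug-in estimates are biased for small sample sizes, so a bias-corrected estimator may be preferable.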

2.2 High-order statistical effects

It is important to realise that the joint probability distribution $p(\boldsymbol{X}^{N})$ may contain substantial information that is not assessed by any of the pairwise marginals $p (X_{i}, X_{j})$. An elegant way of gaining insights into this is provided by information geometry, as presented in Amari (2001) (for related discussions, see also Rosas et al. (2016)). Consider the $k$-marginals that are obtained by marginalising $p(\boldsymbol{X}^{N})$ over $N - k$ variables. One can see that the $k$-marginals provide a more detailed description of the system than the $(k - 1)$-marginals by noting that the latter can be directly computed from the former by marginalising the corresponding variables. In contrast, the process of marginalising involves irreversible information loss, as there are many sets of $k$-marginals that are consistent with a given set of $(k - 1)$-marginals.

Perhaps the simplest example of high-order statistical dependency is given by the exclusive-or (xor) logic gate. To see this, consider two independent fair coins $X_{0}$ and $X_{1}$ , and let

$X_{2} = X_{0} \oplus X_{1} = \begin{cases} 0 & \text{if } X_{0} = X_{1}, \\ 1 & \text{otherwise,} \end{cases}$ (3)

where $\oplus$ denotes the xor operation.

A quick calculation shows that the mutual information matrix $I_{\boldsymbol{X}^{2}}$ of $(X_{0}, X_{1}, X_{2})$ has all its off-diagonal elements equal to zero, making it indistinguishable from an alternative situation where $X_{2}$ is just another independent fair coin. Therefore, put simply, any “xor-like” effect is completely ignored by constructions based only on pairwise statistics, be they networks or hypergraphs.
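The quick calculation can be checked numerically. The sketch below stores the joint distribution as an array with one axis per variable and evaluates Eq. (2) for every pair; the helper name `pairwise_mi` and the use of base-2 logarithms are assumptions of this example.

```python
import numpy as np

def pairwise_mi(p, i, j):
    """I(X_i; X_j) in bits for i < j, from a joint pmf with one array axis per variable."""
    other = tuple(k for k in range(p.ndim) if k not in (i, j))
    pij = p.sum(axis=other)                    # pairwise marginal p(x_i, x_j)
    pi = pij.sum(axis=1, keepdims=True)        # marginal p(x_i)
    pj = pij.sum(axis=0, keepdims=True)        # marginal p(x_j)
    mask = pij > 0
    return float(np.sum(pij[mask] * np.log2(pij[mask] / (pi * pj)[mask])))

# X0, X1 independent fair coins and X2 = X0 xor X1.
p_xor = np.zeros((2, 2, 2))
for a in (0, 1):
    for b in (0, 1):
        p_xor[a, b, a ^ b] = 0.25

# Three independent fair coins.
p_coins = np.full((2, 2, 2), 0.125)

pairs = [(0, 1), (0, 2), (1, 2)]
mi_xor = [pairwise_mi(p_xor, i, j) for i, j in pairs]
mi_coins = [pairwise_mi(p_coins, i, j) for i, j in pairs]
```

Both lists contain only zeros, so the two systems cannot be told apart from their pairwise mutual information.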

While commonly neglected, high-order statistics have recently been shown to be instrumental in a number of systems. For example, synergies have been shown to play a key role in distributed interdependent systems, including the most complex types of elementary cellular automata (Rosas et al., 2018). It has also been suggested that high-order interactions could drive thermalisation processes within closed systems (Lindgren and Olbrich, 2017). Additionally, high-order interactions have been argued to play a key role in neural information processing (Wibral et al., 2017) and high-order brain functions (Tononi et al., 1994), being at the core of popular metrics employed in consciousness science (Mediano et al., 2019). Furthermore, it has been recently shown that high-order statistics also play a crucial role in enabling emergent phenomena (Rosas et al., 2020a).

2.3 O-information

Shannon’s mutual information is limited to capturing the dependencies between two groups of variables, and cannot directly assess triple or higher interactions. The most popular non-negative multivariate extensions of the mutual information are the Total Correlation (TC) (Watanabe, 1960) and the Dual Total Correlation (DTC) (Han, 1978), which are defined as

$TC(\boldsymbol{X}^{N}) := \sum_{i=0}^{N} H(X_{i}) - H(\boldsymbol{X}^{N}),$

$DTC(\boldsymbol{X}^{N}) := H(\boldsymbol{X}^{N}) - \sum_{i=0}^{N} H(X_{i} \mid \boldsymbol{X}^{N}_{-i}).$

Above, $H (X_{i}) = - \sum_{x_{i}} p (x_{i}) \log p (x_{i})$ corresponds to Shannon’s entropy, $H (X_{i} \mid X_{j}) = H (X_{i}, X_{j}) - H (X_{j})$ is the conditional Shannon entropy, and $\boldsymbol{X}^{N}_{-i}$ is the vector of all variables except $X_{i}$ (i.e., $\boldsymbol{X}^{N}_{-i} = (X_{0}, \dots, X_{i-1}, X_{i+1}, \dots, X_{N})$). Importantly, both TC and DTC are zero if and only if all variables $X_{0}, \dots, X_{N}$ are jointly statistically independent, i.e. if $p(\boldsymbol{X}^{N}) = \prod_{i=0}^{N} p(X_{i})$.

Unfortunately, both TC and DTC provide metrics for high-order interdependency which are difficult to analyse together. A recent approach, introduced in Rosas et al. (2019), proposes to employ a linear transform over these two metrics to obtain the O-information and the S-information, which have more intuitive interpretations.

Definition 1

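Given a set of $N + 1$ random variables $\boldsymbol{X}^{N} = (X_{0}, \dots, X_{N})$, their O-information is defined as

$Ω(\boldsymbol{X}^{N}) = TC(\boldsymbol{X}^{N}) - DTC(\boldsymbol{X}^{N}).$ (4)

Similarly, their S-information is defined as

$Σ(\boldsymbol{X}^{N}) = TC(\boldsymbol{X}^{N}) + DTC(\boldsymbol{X}^{N}).$ (5)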

The O-information can be seen as a revision of the measure of neural complexity proposed by Tononi, Sporns and Edelman in Tononi et al. (1994), which provides a mathematical construction that is closer to their original desiderata (Rosas et al., 2019). In effect, the O-information is a signed metric that satisfies the following key properties:

It is zero for systems with only pairwise interdependencies.
It is additive over non-interactive subsystems.
It is maximised by redundant distributions, and minimised by synergistic (“xor-like”) distributions.

Hence, $Ω(\boldsymbol{X}^{N}) < 0$ implies a predominance of statistical synergy within the system $\boldsymbol{X}^{N}$. Conversely, $Ω(\boldsymbol{X}^{N}) > 0$ implies that the system $\boldsymbol{X}^{N}$ is redundancy-dominated.
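To make the sign convention concrete, the following sketch evaluates TC, DTC, and the resulting O-information and S-information for three small distributions. Representing a distribution as a dictionary from outcomes to probabilities, the helper names, and the use of bits are assumptions of this example.

```python
import math
from collections import defaultdict
from itertools import product

def entropy_of(p, idx):
    """Shannon entropy (bits) of the marginal of the pmf p on the positions idx."""
    marg = defaultdict(float)
    for x, px in p.items():
        marg[tuple(x[i] for i in idx)] += px
    return -sum(q * math.log2(q) for q in marg.values() if q > 0)

def tc_dtc(p, n_vars):
    """Total Correlation and Dual Total Correlation of all n_vars variables."""
    everything = tuple(range(n_vars))
    h_joint = entropy_of(p, everything)
    tc = sum(entropy_of(p, (i,)) for i in everything) - h_joint
    # H(X_i | X_{-i}) = H(X) - H(X_{-i})
    conditional = sum(
        h_joint - entropy_of(p, tuple(k for k in everything if k != i))
        for i in everything
    )
    return tc, h_joint - conditional

def o_and_s_information(p, n_vars):
    """Return (Omega, Sigma) = (TC - DTC, TC + DTC)."""
    tc, dtc = tc_dtc(p, n_vars)
    return tc - dtc, tc + dtc

# Synergistic example: X2 = X0 xor X1.
p_xor = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}
# Redundant example: three copies of one fair coin.
p_copy = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
# Three independent fair coins.
p_ind = {x: 0.125 for x in product((0, 1), repeat=3)}

omega_xor, sigma_xor = o_and_s_information(p_xor, 3)     # Omega is negative
omega_copy, sigma_copy = o_and_s_information(p_copy, 3)  # Omega is positive
omega_ind, sigma_ind = o_and_s_information(p_ind, 3)     # both vanish
```

The xor system yields $Ω = -1$ bit and the copy system $Ω = +1$ bit, while both have $Σ = 3$ bits: the S-information does not distinguish them, whereas the sign of the O-information does.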

On the other hand, the S-information is an all-encompassing account of interdependencies taking place at all orders, being sometimes described as a “very mutual information” (James et al., 2011). In fact, a quick calculation shows that the S-information can be decomposed as $Σ(\boldsymbol{X}^{N}) = \sum_{i=0}^{N} I(X_{i}; \boldsymbol{X}^{N}_{-i})$, which can be seen as a chain rule where the interdependencies involving each variable are sequentially addressed.

In summary, while the TC and DTC provide alternative representations to the same construct, $Ω$ and $Σ$ provide a complementary account of the system: the latter addressing the overall strength of interdependencies, and the former qualitatively characterising their dominant nature.

2.4 Other metrics of high-order effects

Another popular metric of high-order interdependencies is the Interaction Information, first introduced by McGill (1954) for systems with three variables (it is closely related to the I-measures (Yeung, 1991), the co-information (Bell, 2003), and the multi-scale complexity (Bar-Yam, 2004)). Building on an application of the inclusion-exclusion principle to entropies, the Interaction Information of $\boldsymbol{X}^{N}$ is a signed metric given by

$I(X_{0}; X_{1}; \dots; X_{N}) := - \sum_{\boldsymbol{γ} \subseteq \{0, \dots, N\}} (-1)^{|\boldsymbol{γ}|} H(\boldsymbol{X}^{\boldsymbol{γ}}),$ (6)

where the sum is performed over all subsets $\boldsymbol{γ} \subseteq \{0, \dots, N\}$, with $|\boldsymbol{γ}|$ the cardinality of $\boldsymbol{γ}$, and $\boldsymbol{X}^{\boldsymbol{γ}}$ the vector of all variables with indices in $\boldsymbol{γ}$. While this measure has a direct interpretation as redundancy minus synergy for $N = 2$, it no longer reflects this balance for larger system sizes (Williams and Beer, 2010, Section V). However, the Interaction Information can still be interpreted for arbitrary $N$ under a topological formulation of information, as described in Baudot and Bennequin (2015), Baudot et al. (2019).

Other well-known metrics of high-order effects include the Redundancy-Synergy Index (Chechik et al., 2002, Timme et al., 2014), the Connected Information (Amari, 2001, Schneidman et al., 2003b), and the Partial Entropy Decomposition (Ince, 2017). Furthermore, a detailed exploration of multivariate decompositions can be found in the Partial Information Decomposition framework (Williams and Beer, 2010) and its constantly growing associated literature (Faes et al., 2017, Finn and Lizier, 2018, Ay et al., 2019, Rosas et al., 2020b, Makkeh et al., 2020).

3 Hyperharmonic analysis

This section, subdivided into three parts, introduces the basic concepts from combinatorial topology that we use to decompose high-order signals into hyperharmonic modes. First, Section 3.1 describes the objects over which signals will be considered; these are higher-dimensional versions of weighted graphs known as weighted simplicial complexes. Then, Section 3.2 introduces the algebraic structure used to model high-order signals on weighted simplicial complexes; it consists of a family of inner product spaces, one for each dimension, and a pair of canonical linear maps between adjacent spaces. Finally, Section 3.3 introduces the discrete analogue of the Laplace-de Rham operator, and defines the Fourier basis using a maximal set of linearly independent eigenvectors of this operator. Throughout the presentation, we provide references to more general treatments and original sources when possible.

3.1 Simplicial complexes

A hypergraph $S = (V, E)$ is determined by a set of vertices $V = \{0, \dots, N\}$ with $N \in \mathbb{N}$ and a set of hyper-edges $E \subseteq \mathcal{P} (V)$, where $\mathcal{P} (V)$ is the power set of $V$. Furthermore, a hypergraph $S$ is said to be a simplicial complex if it satisfies two conditions:

all singletons $\{k\}$ with $k \in V$ are included in $E$, and
if $σ \in E$ and $ρ$ is a subset of $σ$, then $ρ \in E$.

Please note that the passage to simplicial complexes does not restrict the theory of hypergraphs significantly, since to every hypergraph one can assign a canonical simplicial complex by downward closure. Explicitly, this is the smallest simplicial complex that contains the hypergraph. For an introduction to graphs and hypergraphs, we refer to Berge and Minieka (1973); for a comprehensive introduction to hypergraphs in the context of complex system analysis see Johnson (2013).

If $S = (V, E)$ is a simplicial complex, the elements of $E$ are referred to as simplices; and their dimension is defined as one less than their cardinality (i.e. the number of elements they connect). This shift can be understood noting that the natural dimension of a point, corresponding to a singleton, is $0$ . The subset of $E$ consisting of simplices of dimension $n$ is denoted by $S_{n}$ . The elements of $S_{n}$ are typically called “ $n$ -simplices”, with the $0$ -simplices and $1$ -simplices being informally called vertices and edges, respectively. If $v_{0} < \dots < v_{n}$ are the vertices of a simplex, we denote this simplex by $[v_{0}, \dots, v_{n}]$ .

Example 1

The simplicial complex $Δ^{N}$ with vertices $V = \{0, \dots, N\}$ and containing all possible simplices is referred to as the standard $N$-simplex. In our applications, the $N + 1$ vertices of $Δ^{N}$ will be associated with a set of random variables $X_{0}, \dots, X_{N}$.
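Example 1 can be made concrete with a few lines of code. The sketch below lists the $n$-simplices of $Δ^{N}$ as sorted tuples of vertices and counts them; the function name `simplices` and the choice $N = 3$ are assumptions of this illustration.

```python
from itertools import combinations
from math import comb

def simplices(N, n):
    """All n-simplices of the standard N-simplex, as sorted vertex tuples."""
    return list(combinations(range(N + 1), n + 1))

N = 3
edges = simplices(N, 1)
counts = [len(simplices(N, n)) for n in range(N + 1)]
expected = [comb(N + 1, n + 1) for n in range(N + 1)]
total = sum(counts)
```

For $N = 3$ this gives 4 vertices, 6 edges, 4 triangles and 1 tetrahedron, i.e. $2^{N + 1} - 1 = 15$ simplices in total, which shows how quickly the domain of a high-order signal grows with the number of variables.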

3.2 Higher-dimensional signals

An $n$-dimensional signal on a simplicial complex $S$ is a function $α : S_{n} \to \mathbb{R}$, that is to say, an assignment of a real number to each $n$-simplex. We refer to $n$-signals with $n \geq 2$ as high-order signals.

Example 2

Consider a collection of random variables $X_{0}, \dots, X_{N}$. For every $n \in \{2, \dots, N\}$, the high-order $n$-signals $Ω$ and $Σ$ on $Δ^{N}$ are defined by

$Ω ([v_{0}, \dots, v_{n}]) := Ω (X_{v_{0}}, \dots, X_{v_{n}}),$

$Σ ([v_{0}, \dots, v_{n}]) := Σ (X_{v_{0}}, \dots, X_{v_{n}}).$
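A minimal sketch of Example 2 follows: it evaluates the O-information on every 2-simplex of $Δ^{3}$ and stores the values as a vector indexed by simplices. The toy system (three fair coins plus one xor of the first two), the dictionary representation of $p$, and the helper names are assumptions of this illustration.

```python
import math
from collections import defaultdict
from itertools import combinations, product

def entropy_of(p, idx):
    """Shannon entropy (bits) of the marginal of the pmf p on the positions idx."""
    marg = defaultdict(float)
    for x, px in p.items():
        marg[tuple(x[i] for i in idx)] += px
    return -sum(q * math.log2(q) for q in marg.values() if q > 0)

def o_information(p, simplex):
    """Omega = TC - DTC restricted to the variables indexed by the simplex."""
    h_all = entropy_of(p, simplex)
    tc = sum(entropy_of(p, (i,)) for i in simplex) - h_all
    dtc = h_all - sum(
        h_all - entropy_of(p, tuple(k for k in simplex if k != i)) for i in simplex
    )
    return tc - dtc

# Toy system: X0, X1, X2 are independent fair coins and X3 = X0 xor X1.
p = {(a, b, c, a ^ b): 0.125 for a, b, c in product((0, 1), repeat=3)}

N, n = 3, 2
two_simplices = list(combinations(range(N + 1), n + 1))
omega_signal = [o_information(p, s) for s in two_simplices]
```

Only the simplex $[0, 1, 3]$, which contains the xor triple, carries a non-zero (negative) value; the signal therefore localises the synergy on the simplicial complex.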

For a simplicial complex $S$ and $n \in \mathbb{N}$, we denote by $C_{n} (S)$ the vector space generated by all the $n$-simplices of $S$, i.e.,

$C_{n} (S) = \{α_{1} σ_{1} + \dots + α_{r} σ_{r} \mid α_{i} \in \mathbb{R} \text{ and } σ_{i} \in S_{n}\}.$

Please note that there is a natural bijection between the set of $n$-signals on $S$ and $C_{n} (S)$ established by

$α \mapsto \sum_{σ \in S_{n}} α (σ) σ.$

This bijection is used implicitly when referring to the elements of $C_{n} (S)$ as $n$-signals (in the mathematics literature, the elements of $C_{n} (S)$ are called “real-valued $n$-chains”, but we do not use this terminology).

Let us now consider the linear map $\partial_{n} : C_{n} (S) \to C_{n - 1} (S)$ defined on basis elements by

$\partial_{n} ([v_{0}, \dots, v_{n}]) = \sum_{i = 0}^{n} (- 1)^{i} [v_{0}, \dots, \hat{v}_{i}, \dots, v_{n}],$ (7)

with $\hat{v}_{i}$ denoting the absence of $v_{i}$ from the simplex. One can visualise $\partial_{n}$ geometrically in terms of the boundary of a basis element. Up to a sign, the basis elements appearing as summands in $\partial_{n} ([v_{0}, \dots, v_{n}])$ are all simplices of dimension $n - 1$ contained in $[v_{0}, \dots, v_{n}]$. Furthermore, the sign is determined in terms of the orientations of the simplices, as illustrated in Figure 1. The maps $\partial_{n}$ play a central role in algebraic topology, holding a significant amount of topological information. In particular, the difference between the dimension of the kernel of $\partial_{n}$ and the dimension of the image of $\partial_{n + 1}$ is known as the $n$-Betti number of the simplicial complex, a powerful topological invariant generalising the Euler characteristic. For a systematic treatment of these ideas, please consult Hatcher (2002).
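The following sketch builds the matrix of $\partial_{n}$ from Eq. (7) for the standard simplex, with rows and columns indexed by simplices listed as sorted tuples. The function name `boundary_matrix` is an assumption of this example, and the restriction to $n \geq 1$ is a simplification.

```python
import numpy as np
from itertools import combinations

def boundary_matrix(N, n):
    """Matrix of the map from C_n to C_{n-1} on the standard N-simplex, for n >= 1."""
    cols = list(combinations(range(N + 1), n + 1))   # n-simplices
    rows = list(combinations(range(N + 1), n))       # (n-1)-simplices
    row_index = {s: r for r, s in enumerate(rows)}
    D = np.zeros((len(rows), len(cols)))
    for c, simplex in enumerate(cols):
        for i in range(n + 1):
            face = simplex[:i] + simplex[i + 1:]      # drop the i-th vertex
            D[row_index[face], c] = (-1) ** i         # sign from Eq. (7)
    return D

N = 3
D1 = boundary_matrix(N, 1)   # edges -> vertices, shape (4, 6)
D2 = boundary_matrix(N, 2)   # triangles -> edges, shape (6, 4)

# The boundary of a boundary vanishes.
composition = D1 @ D2

# 1-Betti number: dim ker(D1) - dim im(D2).
betti_1 = (D1.shape[1] - np.linalg.matrix_rank(D1)) - np.linalg.matrix_rank(D2)
```

For the standard 3-simplex the composition is the zero matrix and the 1-Betti number is zero, as expected for a complex without one-dimensional holes.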

Figure 1: The linear map $\partial_{n}$ interpreted geometrically in terms of the boundary of a simplex and the induced orientations.

In this work we are interested not only in the topology of $S$, but also in the “geometric” structure encoded by weights assigned to its simplices. A weighted simplicial complex is a pair $(S, w)$ where $S$ is a simplicial complex and $w = \{w_{n} : S_{n} \to \mathbb{R}_{> 0}\}$ is a set of positive weight functions. In the following, a weighted standard simplex (see Example 1) is referred to as a structural simplex.

Example 3

We can build a structural simplex using a collection of random variables $X_{0}, \dots, X_{N}$ by following Example 1, and defining the weights as follows. First, to each vertex $v_{k}$ in $Δ^{N}$ we assign $w_{0} ([v_{k}]) = 1$, and to each edge $[v_{i}, v_{j}]$ we assign the mutual information

$w_{1} ([v_{i}, v_{j}]) = I (X_{v_{i}}; X_{v_{j}}).$

Subsequently, for an $n$-simplex $[v_{0}, \dots, v_{n}]$ with $n > 1$, its weight is defined as the mean value of the mutual information of all pairs of random variables in $\{X_{v_{0}}, \dots, X_{v_{n}}\}$.
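The weights of Example 3 can be sketched as follows, starting from a symmetric matrix of pairwise mutual information values. The function name `structural_weights` and the numerical values of the toy matrix are hypothetical.

```python
import numpy as np
from itertools import combinations

def structural_weights(mi, n):
    """Weights w_n of the structural simplex, keyed by simplex (sorted tuple).

    mi: symmetric (N + 1) x (N + 1) matrix of pairwise mutual information.
    """
    n_vars = mi.shape[0]
    weights = {}
    for simplex in combinations(range(n_vars), n + 1):
        if n == 0:
            weights[simplex] = 1.0                     # every vertex has weight 1
        else:
            # Mean mutual information over all pairs of vertices in the simplex;
            # for an edge this is just the mutual information of its endpoints.
            pair_values = [mi[i, j] for i, j in combinations(simplex, 2)]
            weights[simplex] = float(np.mean(pair_values))
    return weights

# Hypothetical mutual information values for three variables.
mi = np.array([[0.0, 0.2, 0.4],
               [0.2, 0.0, 0.6],
               [0.4, 0.6, 0.0]])

w0 = structural_weights(mi, 0)
w1 = structural_weights(mi, 1)
w2 = structural_weights(mi, 2)
```

Here the single 2-simplex receives the weight $(0.2 + 0.4 + 0.6)/3 = 0.4$. Since the weights are required to be positive, pairs with vanishing mutual information would need separate treatment in practice.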

Given a weighted simplicial complex $(S, w)$ , one can encode the structural information provided by $w_{n}$ (where the subscript denotes the dimension) as an inner product on the vector space of $n$ -signals on $S$ . Explicitly, for any pair of $n$ -simplices we have

$⟨ σ_{i}, σ_{j} ⟩_{w} = \begin{cases} w_{n} (σ_{i}) & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}$ (8)

Importantly, this inner product allows us to introduce the adjoint operator of $\partial_{n + 1}$, which is denoted by $δ_{n}$. That is to say, the operator $δ_{n} : C_{n} (S) \to C_{n + 1} (S)$ is defined by the identity

$⟨ \partial_{n + 1} α, β ⟩_{w} = ⟨ α, δ_{n} β ⟩_{w}$

which holds for any $α \in C_{n + 1} (S)$ and $β \in C_{n} (S)$ . Note that $\partial_{n}$ does not depend on the weights $w$ , but $δ_{n}$ does.
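In matrix terms, writing $D$ for the matrix of $\partial_{n + 1}$ and $W_{n}$, $W_{n + 1}$ for the diagonal matrices of weights, the defining identity together with Eq. (8) gives $δ_{n} = W_{n + 1}^{- 1} D^{⊺} W_{n}$. The sketch below checks this on a single filled triangle; the symbol $D$, the helper name `weighted_adjoint`, and the weight values are assumptions of this example.

```python
import numpy as np

# Matrix of the boundary map for the filled triangle [0, 1, 2]: rows are the
# edges [0,1], [0,2], [1,2] and the single column is the 2-simplex.
D2 = np.array([[1.0], [-1.0], [1.0]])

def weighted_adjoint(D, w_low, w_high):
    """Adjoint of the boundary matrix D with respect to the weighted inner products.

    w_low: weights of the n-simplices; w_high: weights of the (n+1)-simplices.
    Computes W_high^{-1} D^T W_low for diagonal weight matrices.
    """
    return (D.T * w_low[None, :]) / w_high[:, None]

w1 = np.array([0.2, 0.4, 0.6])   # hypothetical edge weights
w2 = np.array([0.4])             # hypothetical triangle weight
delta1 = weighted_adjoint(D2, w1, w2)

rng = np.random.default_rng(0)
alpha = rng.normal(size=1)       # a 2-signal
beta = rng.normal(size=3)        # a 1-signal
lhs = (D2 @ alpha) @ (w1 * beta)        # inner product of boundary(alpha) with beta
rhs = alpha @ (w2 * (delta1 @ beta))    # inner product of alpha with delta(beta)
```

The two inner products agree, and changing `w1` or `w2` changes `delta1` while leaving `D2` untouched, which is the dependence on the weights noted above.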

3.3 High-order Laplace operator and Fourier basis

The classical sine and cosine functions form the basis of the spectral representation of time-domain signals. In effect, classical Fourier analysis guarantees that a large class of functions are expressible in terms of linear combination of these functions. The coefficients of this linear combination are the Fourier coefficients, which correspond to a geometric projection of the original function over this new basis. Interestingly, numerous signals of practical relevance are more compactly represented in the spectral domain, and hence Fourier analysis is often employed as a principled way to perform dimensionality-reduction. For a more extensive exposition of these ideas, we refer the reader to Bracewell (1986).

Importantly, sine and cosine functions are also eigenfunctions of the Laplace operator on the circle — a one-dimensional manifold. Put differently, a spectral representation is equivalent to a diagonalisation of the Laplace operator. Mathematicians have generalized the Laplace operator to higher-dimensional manifolds via the Laplace-de Rham operators. In this more general context, functions on a smooth manifold lie at the bottom of a sequence of higher-dimensional objects on which the corresponding Laplace-de Rham operator acts. The eigenvectors of these operators play a central role in modern geometry — most notably through Hodge theory (Hodge and Atiyah, 1989). For an exposition of these ideas we refer the reader to Morita et al. (2001).

A discrete analogue of the higher-dimensional objects on which the Laplace-de Rham operators act is provided by high-order signals defined over a simplicial complex. In this correspondence, the Laplace-de Rham operators are represented by the discrete Laplace operators first introduced in Eckmann (1944). See also Horak and Jost (2013) where a weighted version of these is presented, and Parzanchevski and Rosenthal (2017) where the connection to random walks is explored. We now introduce the particular version of the discrete Laplace-de Rham operators that we use.

Definition 2

Let $(S, w)$ be a weighted simplicial complex. The $n$ -Laplace operator $Δ_{n} : C_{n} (S) \to C_{n} (S)$ is defined by

$Δ_{n} = \partial_{n + 1} δ_{n} + δ_{n - 1} \partial_{n}.$ (9)

Notice that we are abusing notation by omitting $w$ from the notation referencing the Laplace operators since, although $\partial_{n}$ does not depend on $w$, $δ_{n}$ and therefore $Δ_{n}$ do. For the interested reader, we remark that Definition 2 has a strong resemblance to the form the Laplace-de Rham operator adopts in Riemannian geometry. In effect, by denoting by $d$ the exterior derivative of differential forms and $d^{*}$ its adjoint, the Laplace-de Rham operator in this context can be expressed as $d d^{*} + d^{*} d$. Furthermore, the geometry defined by the Riemannian metric is reflected in the Laplace-de Rham operator through the operator $d^{*}$ only. Consult Morita et al. (2001) for further details.

The well-known graph Laplacian, a central concept in spectral graph theory (Chung et al., 1997), is equivalent to the $0$-Laplace operator defined above when a weighted graph is regarded as a weighted simplicial complex. Importantly, the $n$-Laplace operator is self-adjoint for any weighted simplicial complex $(S, w)$, i.e.

$⟨ Δ_{n} (α), β ⟩_{w} = ⟨ α, Δ_{n} (β) ⟩_{w}$

for all $α, β \in C_{n} (S)$. This implies that $Δ_{n}$ is diagonalisable; that is to say, there exists a basis of $C_{n} (S)$ consisting of eigenvectors of $Δ_{n}$ (this result is a more general form of the well-known fact that symmetric, i.e. self-adjoint, matrices are diagonalisable). An $n$-Fourier basis for $(S, w)$ is a maximal set of linearly independent orthonormal (with respect to $⟨ \cdot, \cdot ⟩_{w}$) eigenvectors of $Δ_{n}$. Please note that we speak of the Fourier basis, with the understanding that there is a sign choice for each of its elements.

Finally, the hyperharmonic representation of a high-order signal defined over a weighted simplicial complex is given by its change of basis from the canonical to the Fourier one. In the case of graphs, the use of this transformation as a dimensionality-reduction method was pioneered by Belkin and Niyogi (2003), and has served as an early application of the field of graph signal processing; see for example Shuman et al. (2013), Sandryhaila and Moura (2014) or Ortega et al. (2018) for an overview of this field. In recent years, some key ideas from graph signal processing have been adapted to hypergraphs, see for example Barbarossa and Sardellitti (2020) and Schaub and Segarra (2018); however, the applicability of Fourier analysis as a compression tool for high-order signals is, to a large extent, still unexplored territory.

4 Proposed workflow

This section describes our proposed workflow, which capitalises on the theoretical constructs elaborated in Sections 2 and 3. The overall pipeline, illustrated in Figure 2, is composed of six steps that are described in the following.

Figure 2: Workflow: (1) Estimate a joint probability from the input data. (2) Build a structural simplex using averages of mutual information values. (3) Compute the O-information (and S-information) and store it as a $\binom{N + 1}{n + 1}$-dimensional vector for every dimension $n$. (4) Compute the $n$-Laplace matrix using the boundary and weight matrices. (5) Diagonalise the Laplace matrices for each dimension. (6) Write the signal in the Fourier basis and return it as output.

4.1 Steps of the analysis

The proposed pipeline consists of six steps.

From data to a distribution: The starting point of the workflow is multivariate data, which typically takes the shape of a sequence of vectors of dimension $N + 1$ (e.g. successive samples of time series of the form $\bis(t)=(s0(t),…,sN(t))$ at different values of $t \in R$ ). Using these data, the first step is to construct a joint probability distribution for a $(N + 1)$ -dimensional vector, denoted as $p (X_{0}, \dots, X_{N})$ .
The structural simplex: To build the structural simplex, the first step is to use $p$ to calculate the mutual information $I (X_{i}; X_{j})$ for each pair of variables of the system. These values are then used to define the weights on a pairwise network, where each edge correspond to one of the variables $X_{0}, \dots, X_{N}$ . Subsequently, a weighted $n$ -simplex is build for each $n = 2, \dots, N$ by taking the average value of all the edges that connect two elements within the simplex.
High-order signals: For each subset of cardinality $n + 1$ of ${X_{0}, \dots, X_{N}}$ , one computes the O-information and S-information and arrange them as high-order signals following Example 2. These signals are stored as $(\frac{N + 1}{n + 1})$ -dimensional vectors $\bion$ and $\bisn$ , representing them in the canonical basis of simplices. Note that there is a standard order on the elements of this basis, which is given by the lexicographic principle (i.e. $[v_{0}, \dots, v_{n}] < [v_{0}^{'}, \dots, v_{n}^{'}]$ if $v_{j} < v_{j}^{'}$ and $v_{i} = v_{i}^{'}$ for all $i < j$ ).
The Laplace operator: Using the weights of the structural simplex constructed in Step (2), one then builds the corresponding $n$ -Laplace operator as in Definition 2. Concretely, we use the canonical basis to represent the $d =$ $(\frac{N + 1}{n + 1})$ weights of order $n + 1$ within a $d \times d$ diagonal matrix $W_{n}$ , and represents the linear map $\partial_{n}$ as a matrix $B_{n}$ of the same dimensions (for an algorithmic description of the construction of $B_{n}$ , see B). Then, the discrete $n$ -Laplace operator is represented in the canonical basis as the matrix

$L_{n} = L_{n}^{up} + L_{n}^{down},$ (10)

with $L_{n}^{up}$ and $L_{n}^{down}$ given by

$L_{n}^{up}$ $= W_{n}^{- 1} B_{n}^{⊺} W_{n + 1} B_{n},$

$L_{n}^{down}$ $= B_{n - 1} W_{n - 1}^{- 1} B_{n - 1}^{⊺} W_{n},$

where $B_{n}^{⊺}$ is the transpose of $B_{n}$ . To recapitulate, Eq. (10) is the matrix representation, in the canonical basis, of Eq. (9) with respect to the inner product given by Eq. (8).
The Fourier basis: The $n$ -Fourier basis of the structural simplex is constructed by choosing a maximal linearly independent set of eigenvectors of the $n$ -Laplace operator that are orthonormal with respect to the inner product $⟨ \cdot, \cdot ⟩_{w}$ defined by the weights of the structural simplex (see Eq. (8)). Concretely, one needs to find a matrix $F_{n}$ such that

$F_{n} L_{n} F_{n}^{- 1} = D_{n}$ (11)

with $D_{n}$ a diagonal matrix, which also satisfies

$(F_{n}^{- 1})^{⊺} W_{n} F_{n}^{- 1} = I_{n},$

where $I_{n}$ is the identity matrix of dimension $\binom{N + 1}{n + 1}$.
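Hyperharmonic representation: As a final step, one calculates the Fourier transform of the high-order signals $\boldsymbol{Ω}_{n}$ and $\boldsymbol{Σ}_{n}$ calculated in Step (3), denoted by $\hat{\boldsymbol{Ω}}_{n}$ and $\hat{\boldsymbol{Σ}}_{n}$, as follows:

$F_{n} \boldsymbol{Ω}_{n} = \hat{\boldsymbol{Ω}}_{n},$

$F_{n} \boldsymbol{Σ}_{n} = \hat{\boldsymbol{Σ}}_{n}.$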
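The last three steps of the workflow can be sketched numerically for a single triangle ($N = 2$, $n = 1$). This is a minimal illustration under stated assumptions, not the authors' implementation: the boundary matrices follow the convention of Appendix B, the weights and the signal `omega` are made-up values, the vertex weights are set to one, and the Laplace operator is assembled directly from the adjoint of the boundary map with respect to the weighted inner product $a^{⊺} W_{n} b$, so its index bookkeeping may differ from Eq. (10). The Fourier matrix is obtained by symmetrising the Laplace matrix with $W_{1}^{1/2}$ before calling a symmetric eigensolver.

```python
import numpy as np

# Boundary matrices of a triangle (N = 2), bases in lexicographic order.
B1 = np.array([[-1.0, -1.0,  0.0],
               [ 1.0,  0.0, -1.0],
               [ 0.0,  1.0,  1.0]])     # columns: edges [0,1], [0,2], [1,2]
B2 = np.array([[1.0], [-1.0], [1.0]])  # column: triangle [0,1,2]

# Hypothetical weights: unit vertex weights (an assumption), arbitrary edge
# weights, and the triangle weight as the average of its edges.
w0 = np.ones(3)
w1 = np.array([0.8, 0.3, 0.5])
w2 = np.array([w1.mean()])

def adjoint(B, w_source, w_target):
    """Adjoint of B (source chains -> target chains) for the weighted
    inner products: W_source^{-1} B^T W_target."""
    return (B.T * w_target) / w_source[:, None]

# 1-Laplace operator assembled from the boundary maps and their adjoints.
L_down = adjoint(B1, w1, w0) @ B1
L_up = B2 @ adjoint(B2, w2, w1)
L1 = L_up + L_down

# L1 is self-adjoint for the weighted inner product, so
# S = W^{1/2} L1 W^{-1/2} is symmetric and eigh applies.
sqrt_w = np.sqrt(w1)
S = sqrt_w[:, None] * L1 / sqrt_w[None, :]
eigenvalues, U = np.linalg.eigh((S + S.T) / 2)

F = U.T * sqrt_w[None, :]     # Fourier matrix: U^T W^{1/2}
F_inv = U / sqrt_w[:, None]   # columns are W-orthonormal eigenvectors of L1

omega = np.array([0.2, -0.1, 0.4])  # made-up high-order signal on the edges
omega_hat = F @ omega               # its hyperharmonic representation
```

With this choice of $F_{1}$, the sum of squared Fourier coefficients equals the weighted squared norm $\boldsymbol{Ω}^{⊺} W_{1} \boldsymbol{Ω}$ of the signal, which is the form of Parseval's identity available in the weighted setting.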

4.2 Variations

This workflow has been designed with modularity and flexibility in mind, in order to facilitate its adaptation to the needs of diverse applications. This section highlights possible variations over the workflow that can better accommodate the specific needs of different scenarios.

First, note that our choice of $Ω$ and $Σ$ in Step (3) as high-order information signals is based on their interpretability, and on their promising value for the analysis of complex systems. Nevertheless, other high-order measures (e.g. the ones described in Section 2.4) can also be analysed following the same pipeline.

Additionally, Step (2) suggests building the structural simplex by propagating the underlying mutual information from edges to higher-dimensional simplices via averages. However, one could use other constructions, e.g. the maximum or minimum mutual information instead of the average. One could also replace the mutual information with other non-negative metrics of similarity, including the total variation distance, the Wasserstein distance, or the absolute value of the Pearson correlation. Furthermore, one could use non-negative high-order metrics (such as the TC or DTC, see Section 2.3) to build the structural simplex directly.
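The alternative constructions above differ only in how edge values are aggregated, which the following sketch makes explicit. The function name `simplex_weights`, its `aggregate` parameter and the similarity matrix `mi` are illustrative assumptions; any symmetric matrix of non-negative pairwise similarities could be passed in place of the mutual information.

```python
from itertools import combinations
import numpy as np

def simplex_weights(similarity, n, aggregate=np.mean):
    """Weight of every n-simplex (n >= 1), in lexicographic order, obtained
    by aggregating the pairwise similarities among its vertices."""
    num_vars = similarity.shape[0]
    weights = {}
    for simplex in combinations(range(num_vars), n + 1):
        edge_values = [similarity[i, j] for i, j in combinations(simplex, 2)]
        weights[simplex] = float(aggregate(edge_values))
    return weights

# Made-up pairwise mutual information for four variables.
mi = np.array([[0.0, 0.5, 0.2, 0.1],
               [0.5, 0.0, 0.4, 0.3],
               [0.2, 0.4, 0.0, 0.6],
               [0.1, 0.3, 0.6, 0.0]])

average_weights = simplex_weights(mi, 2)                 # default: average
maximum_weights = simplex_weights(mi, 2, aggregate=np.max)
minimum_weights = simplex_weights(mi, 2, aggregate=np.min)
```

Because `itertools.combinations` enumerates subsets in lexicographic order, the resulting weights are already aligned with the canonical basis used for the signals.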

Finally, it is important to note that Step (1) is included mainly for pedagogical purposes, but is often omitted in practice. Indeed, most modern techniques to estimate information-theoretic quantities from data avoid building an explicit joint distribution, as this introduces additional biases. For discrete data, we recommend considering Bayesian estimators such as the ones discussed in Archer et al. (2013), and the software package DIT (James et al., 2018). For continuous data, our preferred choice is the family of Kraskov estimators (Kraskov et al., 2004), which can be implemented via JIDT (Lizier, 2014). However, these estimators require substantial amounts of data, and a more flexible alternative is provided by estimators based on Gaussian copulas (Ince et al., 2017).

5 Proof of concept: Haydn symphonies

As an illustration of the proposed workflow, this section presents an analysis of the degree of dimensionality reduction that can be attained by performing hyperharmonic analysis over the O-information and S-information signals calculated over a small dataset. For this purpose, we use data from the music scores of Franz Joseph Haydn (1732–1809), one of the most iconic figures of the Classical period. We focus on Haydn’s late “London symphonies”, which are typically divided into two groups: Symphonies Nos. 93–98, which were composed during the first visit of Haydn to London, and Symphonies Nos. 99–104, which were composed either in Vienna or London during Haydn’s second visit (Clark, 2005).

5.1 Method description

Our analysis is based on electronic scores that are publicly available at http://kern.ccarh.org, from which we extracted the scores of Symphonies Nos. 93–94 and Nos. 99–104 (the data for Symphonies 95–98 were not available). All symphonies use the same basic instrumentation: flutes, oboes, bassoons, horns, trumpets, timpani, violins, viola, violoncello, and double bass; only some of these symphonies employ clarinets, which were therefore not included in the analysis. To avoid duplicates, our analyses consider only one instrument of each kind, which left an arrangement of nine parts (violoncello and double bass were assumed to be equivalent).

The scores of the four movements of the selected symphonies were pre-processed in Python 3.8.5 using the Music21 package (http://web.mit.edu/music21). Each movement was transformed into nine coupled time series taking 13 possible values — one for each note plus one for the silence, using a small rhythmic duration as time unit. With these data, the joint distribution of the values for the nine-note chords was estimated using their empirical frequency. Note that regularisation methods (such as Laplace smoothing) were not employed, as many configurations (e.g. highly dissonant chords) are never explored in the Classic repertoire.
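The empirical-frequency estimate described above can be sketched in a few lines. The helper names and the toy three-part series are assumptions made for illustration (the actual data have nine parts and 13 symbols), and the extraction of the time series from the scores with Music21 is not shown.

```python
from collections import Counter
import numpy as np

def empirical_distribution(series):
    """Relative frequency of each joint state (row) of a (time x parts) array.
    No smoothing is applied, so unobserved states get probability zero."""
    counts = Counter(tuple(int(v) for v in row) for row in series)
    total = sum(counts.values())
    return {state: count / total for state, count in counts.items()}

def marginal_entropy(joint, variables):
    """Shannon entropy, in bits, of the marginal over the given parts."""
    marginal = Counter()
    for state, p in joint.items():
        marginal[tuple(state[v] for v in variables)] += p
    probs = np.array(list(marginal.values()))
    return float(-(probs * np.log2(probs)).sum())

def mutual_information(joint, i, j):
    """I(X_i; X_j) = H(X_i) + H(X_j) - H(X_i, X_j)."""
    return (marginal_entropy(joint, (i,)) + marginal_entropy(joint, (j,))
            - marginal_entropy(joint, (i, j)))

# Toy series: parts 0 and 1 always coincide, part 2 is independent of part 0.
series = np.array([[0, 0, 1],
                   [1, 1, 0],
                   [0, 0, 0],
                   [1, 1, 1]])
joint = empirical_distribution(series)
mi_01 = mutual_information(joint, 0, 1)
mi_02 = mutual_information(joint, 0, 2)
```

In this toy case the two identical parts share one bit, while the independent pair shares none.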

Our subsequent analyses were restricted to high-order signals depending on the joint probability of no more than $6$ instruments. Given the structure of the dataset (8 symphonies with 4 movements each), we computed the structural simplex using a distribution calculated using all the data. In contrast, individual high-order signals were calculated for each movement of each symphony by using only the corresponding data.
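Given a joint distribution, the high-order signals of a fixed order can be assembled as follows. This is a hedged sketch rather than the authors' code: it relies on the standard identities $Ω = TC - DTC$ and $Σ = TC + DTC$ (Rosas et al., 2019), all function names are hypothetical, and the three-variable XOR distribution is a textbook example of synergy used only as a sanity check.

```python
from itertools import combinations
import numpy as np

def entropy(joint, variables):
    """Shannon entropy (bits) of the marginal of `joint` on `variables`."""
    marginal = {}
    for state, p in joint.items():
        key = tuple(state[v] for v in variables)
        marginal[key] = marginal.get(key, 0.0) + p
    probs = np.array([p for p in marginal.values() if p > 0])
    return float(-(probs * np.log2(probs)).sum())

def o_and_s_information(joint, subset):
    """Return (O-information, S-information) of the variables in `subset`."""
    k = len(subset)
    h_all = entropy(joint, subset)
    h_single = [entropy(joint, (v,)) for v in subset]
    h_rest = [entropy(joint, tuple(u for u in subset if u != v))
              for v in subset]
    tc = sum(h_single) - h_all            # total correlation
    dtc = sum(h_rest) - (k - 1) * h_all   # dual total correlation
    return tc - dtc, tc + dtc

def high_order_signals(joint, num_vars, n):
    """O- and S-information of every (n+1)-subset, in lexicographic order."""
    subsets = list(combinations(range(num_vars), n + 1))
    values = np.array([o_and_s_information(joint, s) for s in subsets])
    return subsets, values[:, 0], values[:, 1]

# Sanity check: X2 = X0 XOR X1 with uniform inputs is purely synergistic.
xor_joint = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
subsets, omega, sigma = high_order_signals(xor_joint, num_vars=3, n=2)
```

A negative O-information, as obtained for the XOR triple, indicates synergy-dominated interdependencies, while the S-information measures their overall strength.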

To measure compressibility, we used the following metric defined for any basis. Consider a given $n$-dimensional high-order signal, whose coefficients on the given basis are $α = \{α_{i}\}$. Without loss of generality we assume that $α_{i}^{2} \geq α_{j}^{2}$ if $i \leq j$. Then, we define the functions

${EV}_{α} (k) = \frac{α_{k}^{2}}{\sum_{i} α_{i}^{2}} \quad \text{and} \quad {CEV}_{α} (k) = \sum_{1 \leq i \leq k} {EV}_{α} (i),$ (12)

with ${EV}_{α} (k)$ being the (normalised) variance explained by the $k$-th strongest component, and ${CEV}_{α} (k)$ the cumulative variance explained by the $k$ strongest components. This definition is motivated by Parseval’s theorem, which guarantees that the sum of the squares of the Fourier coefficients is equal to the variance of the signal; hence, the ${CEV} (k)$ of the Fourier transform of a signal corresponds to the fraction of the variance that is accounted for by the $k$ strongest components.
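Eq. (12) translates directly into code. The sketch below sorts the squared coefficients in decreasing order, as the definition assumes, and adds a hypothetical helper `components_needed` that returns the smallest $k$ whose cumulative explained variance reaches a given level; the coefficient vector is made up for illustration.

```python
import numpy as np

def cumulative_explained_variance(coeffs):
    """CEV(k) for k = 1, ..., len(coeffs), following Eq. (12)."""
    energy = np.sort(np.square(coeffs))[::-1]  # strongest components first
    return np.cumsum(energy) / energy.sum()

def components_needed(coeffs, level):
    """Smallest number of components whose CEV is at least `level`."""
    cev = cumulative_explained_variance(coeffs)
    k = int(np.searchsorted(cev, level)) + 1
    return min(k, cev.size)  # guard against rounding when level is 1.0

coeffs = np.array([0.5, 3.0, -1.0, 0.1])  # made-up coefficients on some basis
cev = cumulative_explained_variance(coeffs)
```

Applying `components_needed` to the coefficients of a signal in two different bases gives the kind of pairwise comparison reported in the results.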

5.2 Results

The hyperharmonic representation of the considered high-order signals ( $Ω$ and $Σ$ ) was found to be substantially more concentrated than their corresponding canonical representations. Figure 3 illustrates the curves of cumulative explained variance for various dimensions, and Table 1 presents the number of components required to fulfil various reconstruction levels. In particular, it was found that a small number of components suffices to account for most of the variance observed in the hyperharmonic representations of $Ω$ and $Σ$ . Moreover, this good performance is found to be stable across dimensions.

Figure 3: Comparison of the cumulative explained variance (as defined by Eq. (12)) of high-order signals in their canonical and hyperharmonic representations. The dimensionality-reduction capabilities of hyperharmonic analysis are substantial, and stay consistent across dimensions.

Cumulative Explained Variance
Signal	Dim	60%	80%	90%	95%	99%
O-info	2	3 / 13	5 / 26	9 / 38	14 / 47	28 / 63
	3	4 / 22	8 / 41	13 / 57	20 / 71	37 / 95
	4	1 / 35	3 / 58	6 / 78	12 / 93	29 / 113
	5	2 / 34	4 / 53	8 / 65	12 / 73	26 / 82
S-info	2	2 / 29	4 / 48	6 / 62	8 / 71	19 / 81
	3	3 / 52	7 / 80	11 / 99	14 / 111	27 / 123
	4	1 / 57	2 / 86	3 / 103	9 / 113	24 / 123
	5	2 / 42	4 / 60	8 / 71	13 / 78	26 / 83

Table 1: Number of components needed to recover a given percentage of the cumulative explained variance for the considered signals. Each entry reads: components needed in the Fourier basis / components needed in the canonical basis.

To verify the unique properties of the Laplace operators, an additional control was run where the same signals were transformed according to randomly generated bases. Interestingly, these representations do not exhibit the degree of dimensionality reduction shown by the hyperharmonic representations (see Appendix A).

Finally, our results also show that the canonical representation of the S-information is much less concentrated than the canonical representation of the O-information. This dissimilarity suggests that the O-information is capturing a more specific signature of the different modalities of interdependency that exist across the orchestra. Moreover, the fact that the dissimilarity is not seen in their hyperharmonic representation further suggests that both signals may actually carry an equivalent amount of information, which happens to be differently localised over the canonical basis.

6 Conclusion

Complex systems are characterised by having multiple levels of organisation, which makes network analyses focused on pairwise interactions unable to give a full account of their properties. Phenomena beyond pairwise interactions can be effectively captured by high-order information-theoretic metrics; however, their applicability and interpretability are limited due to the fast growth of their cardinality as a function of the system’s size. Here we propose to represent these high-order metrics using the hyperharmonic modes of a geometrical representation of structural properties of the system. This provides a principled approach to constructing low-dimensional representations of high-order signals, which can retain most of the informational content.

Our proposed approach is widely applicable, and promises to enable a range of future explorations. It is our hope that this incursion into the intersection of multivariate information theory and combinatorial topology will motivate further developments in this fertile area of research.

The authors thank Pedro A. M. Mediano and Umberto Lupo for insightful discussions and valuable suggestions. The authors also thank Kathryn Hess and Yike Guo for supporting this research. A.M-M. acknowledges financial support from Innosuisse grant 32875.1 IP-ICT - 1, and the hospitality of the Max Planck Institute for Mathematics. R.C. acknowledges financial support from Fondecyt Iniciación 2018 Proyecto 11181072. F.R. acknowledges the support of the Ad Astra Chandaria Foundation.

Appendix A Control employing randomly generated bases

Here we review the additional control that confirms that the results presented in Section 5.2 are not due to a limitation of the canonical bases, but are due to the special advantages of the Laplace operators. Specifically, we calculated the average CEV over $80$ randomly generated bases for the same high-order signals considered in Section 5.2, and then compared them to the CEV obtained via the hyperharmonic representation. Our result, shown in Figure 4, reveals that the explained variance attainable via randomly generated bases is substantially and consistently lower than the one associated with the Fourier bases. This provides additional evidence of the favourable properties of the hyperharmonic representation of the considered signals.
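A control of this kind can be sketched as follows. The text does not specify how the random bases were generated, so this sketch assumes bases that are orthonormal for the standard inner product, drawn via the QR decomposition of a Gaussian matrix; the function names, the seed and the test signal are likewise illustrative.

```python
import numpy as np

def cev(coeffs):
    """Cumulative explained variance of a coefficient vector, as in Eq. (12)."""
    energy = np.sort(np.square(coeffs))[::-1]
    return np.cumsum(energy) / energy.sum()

def random_orthonormal_basis(dim, rng):
    """Random orthogonal matrix (columns = basis vectors) from a QR
    decomposition; the sign fix makes the draw uniform over rotations."""
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q * np.sign(np.diag(r))

def average_random_cev(signal, num_bases=80, seed=0):
    """Mean CEV curve of `signal` over `num_bases` random orthonormal bases."""
    rng = np.random.default_rng(seed)
    curves = [cev(random_orthonormal_basis(signal.size, rng).T @ signal)
              for _ in range(num_bases)]
    return np.mean(curves, axis=0)

signal = np.array([0.9, -0.2, 0.1, 0.05, 0.6, -0.3])  # made-up signal
mean_curve = average_random_cev(signal)
```

The resulting mean curve plays the role of the baseline against which the CEV of the hyperharmonic representation is compared.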

Appendix B Construction of $B_{n}$

The matrices $B_{n}$ correspond to the representation of the linear maps $\partial_{n} : C_{n} (Δ^{N}) \to C_{n - 1} (Δ^{N})$ on the canonical basis (see Section 3.2 and Figure 1). Algorithmically, one can use Eq. (7) to determine the value of each column, inserting either a $+ 1$ or a $- 1$ to the entries corresponding to simplices in its boundary — as depicted in Figure 1 — and setting the other entries to 0. To illustrate the procedure, let us present the four matrices that correspond to $N = 3$ :

$B_{0} = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix},$

$B_{1} = \begin{bmatrix} -1 & -1 & -1 & 0 & 0 & 0 \\ +1 & 0 & 0 & -1 & -1 & 0 \\ 0 & +1 & 0 & +1 & 0 & -1 \\ 0 & 0 & +1 & 0 & +1 & +1 \end{bmatrix},$

$B_{2} = \begin{bmatrix} +1 & +1 & 0 & 0 \\ -1 & 0 & +1 & 0 \\ 0 & -1 & -1 & 0 \\ +1 & 0 & 0 & +1 \\ 0 & +1 & 0 & -1 \\ 0 & 0 & +1 & +1 \end{bmatrix},$

$B_{3} = \begin{bmatrix} -1 \\ +1 \\ -1 \\ +1 \end{bmatrix}.$
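The construction can be sketched in a few lines, assuming that Eq. (7) is the usual alternating-sum boundary formula $\partial_{n} [v_{0}, \dots, v_{n}] = \sum_{i} (-1)^{i} [v_{0}, \dots, \hat{v}_{i}, \dots, v_{n}]$, which reproduces the signs of the matrices shown for $N = 3$. The function name `boundary_matrix` is hypothetical; rows and columns follow the lexicographic order of the canonical basis.

```python
from itertools import combinations
import numpy as np

def boundary_matrix(N, n):
    """Matrix B_n of the boundary map C_n -> C_{n-1} of the N-simplex."""
    columns = list(combinations(range(N + 1), n + 1))  # n-simplices
    if n == 0:
        # Vertices have no boundary: a single row of zeros.
        return np.zeros((1, len(columns)), dtype=int)
    rows = {face: r for r, face in enumerate(combinations(range(N + 1), n))}
    B = np.zeros((len(rows), len(columns)), dtype=int)
    for c, simplex in enumerate(columns):
        for i in range(n + 1):
            face = simplex[:i] + simplex[i + 1:]  # drop the i-th vertex
            B[rows[face], c] = (-1) ** i
    return B

B0, B1, B2, B3 = (boundary_matrix(3, n) for n in range(4))
```

A quick consistency check is that consecutive boundary maps compose to zero, i.e. `B1 @ B2` and `B2 @ B3` vanish.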

References

Amari (2001) S. I. Amari. Information geometry on hierarchy of probability distributions. IEEE Transactions on Information Theory, 2001. ISSN 00189448. doi: 10.1109/18.930911.
Archer et al. (2013) E. Archer, I. M. Park, and J. W. Pillow. Bayesian and quasi-bayesian estimators for mutual information from discrete data. Entropy, 15(5):1738–1755, 2013.
Atasoy et al. (2016) S. Atasoy, I. Donnelly, and J. Pearson. Human brain networks function in connectome-specific harmonic waves. Nature Communications, 2016. ISSN 20411723. doi: 10.1038/ncomms10340.
Atasoy et al. (2017) S. Atasoy, L. Roseman, M. Kaelen, M. L. Kringelbach, G. Deco, and R. L. Carhart-Harris. Connectome-harmonic decomposition of human brain activity reveals dynamical repertoire re-organization under lsd. Scientific reports, 7(1):1–18, 2017.
Ay et al. (2019) N. Ay, D. Polani, and N. Virgo. Information decomposition based on cooperative game theory. arXiv preprint arXiv:1910.05979, 2019.
Bar-Yam (2004) Y. Bar-Yam. Multiscale variety in complex systems. Complexity, 2004. ISSN 10762787. doi: 10.1002/cplx.20014.
Barbarossa and Sardellitti (2020) S. Barbarossa and S. Sardellitti. Topological signal processing over simplicial complexes. IEEE Transactions on Signal Processing, 2020.
Battiston et al. (2020) F. Battiston, G. Cencetti, I. Iacopini, V. Latora, M. Lucas, A. Patania, J. G. Young, and G. Petri. Networks beyond pairwise interactions: Structure and dynamics, 2020. ISSN 03701573.
Baudot (2019) P. Baudot. The poincare-shannon machine: statistical physics and machine learning aspects of information cohomology. Entropy, 21(9):881, 2019.
Baudot and Bennequin (2015) P. Baudot and D. Bennequin. The homological nature of entropy. Entropy, 17(5):3253–3318, 2015.
Baudot et al. (2019) P. Baudot, M. Tapia, D. Bennequin, and J.-M. Goaillard. Topological information data analysis. Entropy, 21(9):869, 2019.
Belkin and Niyogi (2003) M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation, 15(6):1373–1396, 2003.
Bell (2003) A. J. Bell. The co-information lattice. In Proc. 4th Int. Symp. Independent Component Analysis and Blind Source Separation, pages 921–926, 2003.
Berge and Minieka (1973) C. Berge and E. Minieka. Graphs and Hypergraphs. North-Holland Publishing Company, 1973. ISBN 9780444103994.
Bonanno et al. (2004) G. Bonanno, G. Caldarelli, F. Lillo, S. Miccichè, N. Vandewalle, and R. N. Mantegna. Networks of equities in financial markets. In European Physical Journal B, 2004. doi: 10.1140/epjb/e2004-00129-6.
Bracewell (1986) R. N. Bracewell. The Fourier transform and its applications, volume 31999. McGraw-Hill New York, 1986.
Chechik et al. (2002) G. Chechik, A. Globerson, M. J. Anderson, E. D. Young, I. Nelken, and N. Tishby. Group redundancy measures reveal redundancy reduction in the auditory pathway. In Advances in neural information processing systems, pages 173–180, 2002.
Chung et al. (1997) F. R. K. Chung. Spectral Graph Theory. CBMS Regional Conference Series in Mathematics. American Mathematical Society, 1997. ISBN 9780821803158.
Clark (2005) C. Clark. The Cambridge Companion to Haydn. Cambridge Companions to Music. Cambridge University Press, 2005. ISBN 9780521833479.
Donges et al. (2011) J. F. Donges, R. V. Donner, M. H. Trauth, N. Marwan, H.-J. Schellnhuber, and J. Kurths. Nonlinear detection of paleoclimate-variability transitions possibly related to human evolution. Proceedings of the National Academy of Sciences, 108(51):20422–20427, 2011.
Eckmann (1944) B. Eckmann. Harmonische funktionen und randwertaufgaben in einem komplex. Commentarii Mathematici Helvetici, 17(1):240–255, 1944.
Expert et al. (2017) P. Expert, S. De Nigris, T. Takaguchi, and R. Lambiotte. Graph spectral characterization of the x y model on complex networks. Physical Review E, 96(1):012312, 2017.
Faes et al. (2017) L. Faes, D. Marinazzo, and S. Stramaglia. Multiscale information decomposition: Exact computation for multivariate gaussian processes. Entropy, 19(8):408, 2017.
Finn and Lizier (2018) C. Finn and J. T. Lizier. Pointwise partial information decomposition using the specificity and ambiguity lattices. Entropy, 20(4):297, 2018.
Ganmor et al. (2011) E. Ganmor, R. Segev, and E. Schneidman. Sparse low-order interaction network underlies a highly correlated and learnable neural population code. Proceedings of the National Academy of sciences, 108(23):9679–9684, 2011.
Gatica et al. (2020) M. Gatica, R. Cofre, P. A. Mediano, F. E. Rosas, P. Orio, I. Diez, S. Swinnen, and J. M. Cortes. High-order interdependencies in the aging brain. bioRxiv, 2020. doi: 10.1101/2020.03.17.995886.
Han (1978) T. S. Han. Nonnegative entropy measures of multivariate symmetric correlations. Information and Control, 1978. ISSN 00199958. doi: 10.1016/S0019-9958(78)90275-9.
Hatcher (2002) A. Hatcher. Algebraic topology. Cambridge University Press, Cambridge, 2002. ISBN 0-521-79160-X; 0-521-79540-0.
Hodge and Atiyah (1989) W. Hodge and M. Atiyah. The Theory and Applications of Harmonic Integrals. Cambridge mathematical library. Cambridge University Press, 1989. ISBN 9780521358811.
Horak and Jost (2013) D. Horak and J. Jost. Spectra of combinatorial laplace operators on simplicial complexes. Advances in Mathematics, 244:303–336, 2013.
Iacopini et al. (2019) I. Iacopini, G. Petri, A. Barrat, and V. Latora. Simplicial models of social contagion. Nature Communications, 2019. ISSN 20411723. doi: 10.1038/s41467-019-10431-6.
Ince (2017) R. A. Ince. The partial entropy decomposition: Decomposing multivariate entropy and mutual information via pointwise common surprisal. arXiv preprint arXiv:1702.01591, 2017.
Ince et al. (2017) R. A. Ince, B. L. Giordano, C. Kayser, G. A. Rousselet, J. Gross, and P. G. Schyns. A statistical framework for neuroimaging data analysis based on mutual information estimated via a gaussian copula. Human Brain Mapping, 38(3):1541–1573, 2017. ISSN 10970193. doi: 10.1002/hbm.23471.
James et al. (2011) R. G. James, C. J. Ellison, and J. P. Crutchfield. Anatomy of a bit: Information in a time series observation. Chaos: An Interdisciplinary Journal of Nonlinear Science, 21(3):037109, 2011.
James et al. (2018) R. G. James, C. J. Ellison, and J. P. Crutchfield. “dit”: a Python package for discrete information theory. Journal of Open Source Software, 3(25):738, 2018.
Johnson (2013) J. Johnson. Hypernetworks in the science of complex systems, volume 3. World Scientific, 2013.
Kraskov et al. (2004) A. Kraskov, H. Stögbauer, and P. Grassberger. Estimating mutual information. Physical review E, 69(6):066138, 2004.
Latham and Nirenberg (2005) P. E. Latham and S. Nirenberg. Synergy, redundancy, and independence in population codes, revisited. Journal of Neuroscience, 25(21):5195–5206, 2005.
Lindgren and Olbrich (2017) K. Lindgren and E. Olbrich. The approach towards equilibrium in a reversible ising dynamics model: an information-theoretic analysis based on an exact solution. Journal of Statistical Physics, 168(4):919–935, 2017.
Lizier (2014) J. T. Lizier. Jidt: An information-theoretic toolkit for studying the dynamics of complex systems. Frontiers in Robotics and AI, 1:11, 2014.
Makkeh et al. (2020) A. Makkeh, A. J. Gutknecht, and M. Wibral. A differentiable measure of pointwise shared information. arXiv preprint arXiv:2002.03356, 2020.
McGill (1954) W. McGill. Multivariate information transmission. Transactions of the IRE Professional Group on Information Theory, 4(4):93–111, 1954.
Mediano et al. (2019) P. A. Mediano, F. Rosas, R. L. Carhart-Harris, A. K. Seth, and A. B. Barrett. Beyond integrated information: A taxonomy of information dynamics phenomena. arXiv preprint arXiv:1909.02297, 2019.
Morita et al. (2001) S. Morita, T. Nagase, and K. Nomizu. Geometry of Differential Forms. Iwanami series in modern mathematics. American Mathematical Society, 2001. ISBN 9780821810453.
Newman (2018) M. Newman. Networks. OUP Oxford, 2018. ISBN 9780192527493.
Ortega et al. (2018) A. Ortega, P. Frossard, J. Kovačević, J. M. Moura, and P. Vandergheynst. Graph signal processing: Overview, challenges, and applications. Proceedings of the IEEE, 106(5):808–828, 2018.
Parzanchevski and Rosenthal (2017) O. Parzanchevski and R. Rosenthal. Simplicial complexes: spectrum, homology and random walks. Random Structures & Algorithms, 50(2):225–261, 2017.
Pastor-Satorras and Vespignani (2001) R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters, 2001. ISSN 00319007. doi: 10.1103/PhysRevLett.86.3200.
Petri et al. (2013) G. Petri, M. Scolamiero, I. Donato, and F. Vaccarino. Topological Strata of Weighted Complex Networks. PLoS ONE, 2013. ISSN 19326203. doi: 10.1371/journal.pone.0066506.
Petri et al. (2014) G. Petri, P. Expert, F. Turkheimer, R. Carhart-Harris, D. Nutt, P. J. Hellyer, and F. Vaccarino. Homological scaffolds of brain functional networks. Journal of the Royal Society Interface, 2014. ISSN 17425662. doi: 10.1098/rsif.2014.0873.
Rosas et al. (2016) F. E. Rosas, V. Ntranos, C. J. Ellison, S. Pollin, and M. Verhelst. Understanding interdependency through complex information sharing. Entropy, 18(2):38, 2016.
Rosas et al. (2018) F. E. Rosas, P. A. Mediano, M. Ugarte, and H. J. Jensen. An information-theoretic approach to self-organisation: Emergence of complex interdependencies in coupled dynamical systems. Entropy, 20(10):793, 2018.
Rosas et al. (2019) F. E. Rosas, P. A. Mediano, M. Gastpar, and H. J. Jensen. Quantifying high-order interdependencies via multivariate extensions of the mutual information. Physical Review E, 100(3):032305, 2019.
Rosas et al. (2020a) F. E. Rosas, P. A. Mediano, H. J. Jensen, A. K. Seth, A. B. Barrett, R. L. Carhart-Harris, and D. Bor. Reconciling emergences: An information-theoretic approach to identify causal emergence in multivariate data. arXiv preprint arXiv:2004.08220, 2020a.
Rosas et al. (2020b) F. E. Rosas, P. A. Mediano, B. Rassouli, and A. Barrett. An operational information decomposition via synergistic disclosure. arXiv preprint arXiv:2001.10387, 2020b.
Rubinov and Sporns (2010) M. Rubinov and O. Sporns. Complex network measures of brain connectivity: Uses and interpretations. NeuroImage, 2010. ISSN 10538119. doi: 10.1016/j.neuroimage.2009.10.003.
Sandryhaila and Moura (2014) A. Sandryhaila and J. M. Moura. Discrete signal processing on graphs: Frequency analysis. IEEE Transactions on Signal Processing, 62(12):3042–3054, 2014.
Schaub and Segarra (2018) M. T. Schaub and S. Segarra. Flow smoothing and denoising: graph signal processing in the edge-space. In 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pages 735–739. IEEE, 2018.
Schneidman et al. (2003a) E. Schneidman, W. Bialek, and M. J. Berry. Synergy, redundancy, and independence in population codes. Journal of Neuroscience, 23(37):11539–11553, 2003a.
Schneidman et al. (2003b) E. Schneidman, S. Still, M. J. Berry, W. Bialek, et al. Network information and connected correlations. Physical review letters, 91(23):238701, 2003b.
Shuman et al. (2013) D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE signal processing magazine, 30(3):83–98, 2013.
Stramaglia et al. (2020) S. Stramaglia, T. Scagliarini, B. C. Daniels, and D. Marinazzo. Quantifying dynamical high-order interdependencies from the o-information: an application to neural spiking dynamics. arXiv preprint arXiv:2007.16018, 2020.
Timme et al. (2014) N. Timme, W. Alford, B. Flecker, and J. M. Beggs. Synergy, redundancy, and multivariate information measures: an experimentalist’s perspective. Journal of computational neuroscience, 36(2):119–140, 2014.
Tononi et al. (1994) G. Tononi, O. Sporns, and G. M. Edelman. A measure for brain complexity: relating functional segregation and integration in the nervous system. Proceedings of the National Academy of Sciences, 91(11):5033–5037, 1994.
Vasiliauskaite and Rosas (2020) V. Vasiliauskaite and F. E. Rosas. Understanding complexity via network theory: a gentle introduction. arXiv preprint arXiv:2004.14845, 2020.
Watanabe (1960) S. Watanabe. Information theoretical analysis of multivariate correlation. IBM Journal of research and development, 4(1):66–82, 1960.
Wibral et al. (2017) M. Wibral, V. Priesemann, J. W. Kay, J. T. Lizier, and W. A. Phillips. Partial information decomposition as a unified approach to the specification of neural goal functions. Brain and cognition, 112:25–38, 2017.
Williams and Beer (2010) P. L. Williams and R. D. Beer. Nonnegative decomposition of multivariate information. arXiv preprint arXiv:1004.2515, 2010.
Yeung (1991) R. W. Yeung. A New Outlook on Shannon’s Information Measures. IEEE Transactions on Information Theory, 1991. ISSN 15579654. doi: 10.1109/18.79902.

[bib.bib1] Amari (2001) S. I. Amari. Information geometry on hierarchy of probability distributions. IEEE Transactions on Information Theory, 2001. ISSN 00189448. doi: 10.1109/18.930911.

[bib.bib2] Archer et al. (2013) E. Archer, I. M. Park, and J. W. Pillow. Bayesian and quasi-bayesian estimators for mutual information from discrete data. Entropy, 15(5):1738–1755, 2013.

[bib.bib3] Atasoy et al. (2016) S. Atasoy, I. Donnelly, and J. Pearson. Human brain networks function in connectome-specific harmonic waves. Nature Communications, 2016. ISSN 20411723. doi: 10.1038/ncomms10340.

[bib.bib4] Atasoy et al. (2017) S. Atasoy, L. Roseman, M. Kaelen, M. L. Kringelbach, G. Deco, and R. L. Carhart-Harris. Connectome-harmonic decomposition of human brain activity reveals dynamical repertoire re-organization under lsd. Scientific reports, 7(1):1–18, 2017.

[bib.bib5] Ay et al. (2019) N. Ay, D. Polani, and N. Virgo. Information decomposition based on cooperative game theory. arXiv preprint arXiv:1910.05979, 2019.

[bib.bib6] Bar-Yam (2004) Y. Bar-Yam. Multiscale variety in complex systems. Complexity, 2004. ISSN 10762787. doi: 10.1002/cplx.20014.

[bib.bib7] Barbarossa and Sardellitti (2020) S. Barbarossa and S. Sardellitti. Topological signal processing over simplicial complexes. IEEE Transactions on Signal Processing, 2020.

[bib.bib8] Battiston et al. (2020) F. Battiston, G. Cencetti, I. Iacopini, V. Latora, M. Lucas, A. Patania, J. G. Young, and G. Petri. Networks beyond pairwise interactions: Structure and dynamics, 2020. ISSN 03701573.

[bib.bib9] Baudot (2019) P. Baudot. The poincare-shannon machine: statistical physics and machine learning aspects of information cohomology. Entropy, 21(9):881, 2019.

[bib.bib10] Baudot and Bennequin (2015) P. Baudot and D. Bennequin. The homological nature of entropy. Entropy, 17(5):3253–3318, 2015.

[bib.bib11] Baudot et al. (2019) P. Baudot, M. Tapia, D. Bennequin, and J.-M. Goaillard. Topological information data analysis. Entropy, 21(9):869, 2019.

[bib.bib12] Belkin and Niyogi (2003) M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation, 15(6):1373–1396, 2003.

[bib.bib13] Bell (2003) A. J. Bell. The co-information lattice. In Proc. 4th Int. Symp. Independent Component Analysis and Blind Source Separation, pages 921–926, 2003.

[bib.bib14] Berge and Minieka (1973) C. Berge and E. Minieka. Graphs and Hypergraphs. Graphs and Hypergraphs. North-Holland Publishing Company, 1973. ISBN 9780444103994.

[bib.bib15] Bonanno et al. (2004) G. Bonanno, G. Caldarelli, F. Lillo, S. Miccichè, N. Vandewalle, and R. N. Mantegna. Networks of equities in financial markets. In European Physical Journal B, 2004. doi: 10.1140/epjb/e2004-00129-6.

[bib.bib16] Bracewell (1986) R. N. Bracewell. The Fourier transform and its applications, volume 31999. McGraw-Hill New York, 1986.

[bib.bib17] Chechik et al. (2002) G. Chechik, A. Globerson, M. J. Anderson, E. D. Young, I. Nelken, and N. Tishby. Group redundancy measures reveal redundancy reduction in the auditory pathway. In Advances in neural information processing systems, pages 173–180, 2002.

[bib.bib18] Chung et al. (1997) F. Chung, F. Graham, C. C. on Recent Advances in Spectral Graph Theory, N. S. F. (U.S.), A. M. Society, and C. B. of the Mathematical Sciences. Spectral Graph Theory. CBMS Regional Conference Series. Conference Board of the mathematical sciences, 1997. ISBN 9780821803158.

[bib.bib19] Clark (2005) C. Clark. The Cambridge Companion to Haydn. Cambridge Companions to Music. Cambridge University Press, 2005. ISBN 9780521833479.

[bib.bib20] Donges et al. (2011) J. F. Donges, R. V. Donner, M. H. Trauth, N. Marwan, H.-J. Schellnhuber, and J. Kurths. Nonlinear detection of paleoclimate-variability transitions possibly related to human evolution. Proceedings of the National Academy of Sciences, 108(51):20422–20427, 2011.

[bib.bib21] Eckmann (1944) B. Eckmann. Harmonische funktionen und randwertaufgaben in einem komplex. Commentarii Mathematici Helvetici, 17(1):240–255, 1944.

[bib.bib22] Expert et al. (2017) P. Expert, S. De Nigris, T. Takaguchi, and R. Lambiotte. Graph spectral characterization of the x y model on complex networks. Physical Review E, 96(1):012312, 2017.

[bib.bib23] Faes et al. (2017) L. Faes, D. Marinazzo, and S. Stramaglia. Multiscale information decomposition: Exact computation for multivariate gaussian processes. Entropy, 19(8):408, 2017.

[bib.bib24] Finn and Lizier (2018) C. Finn and J. T. Lizier. Pointwise partial information decomposition using the specificity and ambiguity lattices. Entropy, 20(4):297, 2018.

[bib.bib25] Ganmor et al. (2011) E. Ganmor, R. Segev, and E. Schneidman. Sparse low-order interaction network underlies a highly correlated and learnable neural population code. Proceedings of the National Academy of sciences, 108(23):9679–9684, 2011.

[bib.bib26] Gatica et al. (2020) M. Gatica, R. Cofre, P. A. Mediano, F. E. Rosas, P. Orio, I. Diez, S. Swinnen, and J. M. Cortes. High-order interdependencies in the aging brain. bioRxiv, 2020. doi: https://doi.org/10.1101/2020.03.17.995886.

[bib.bib27] Han (1978) T. S. Han. Nonnegative entropy measures of multivariate symmetric correlations. Information and Control, 1978. ISSN 00199958. doi: 10.1016/S0019-9958(78)90275-9.

[bib.bib28] Hatcher (2002) A. Hatcher. Algebraic topology. Cambridge University Press, Cambridge, 2002. ISBN 0-521-79160-X; 0-521-79540-0.

[bib.bib29] Hodge and Atiyah (1989) W. Hodge and M. Atiyah. The Theory and Applications of Harmonic Integrals. Cambridge mathematical library. Cambridge University Press, 1989. ISBN 9780521358811.

[bib.bib30] Horak and Jost (2013) D. Horak and J. Jost. Spectra of combinatorial laplace operators on simplicial complexes. Advances in Mathematics, 244:303–336, 2013.

[bib.bib31] Iacopini et al. (2019) I. Iacopini, G. Petri, A. Barrat, and V. Latora. Simplicial models of social contagion. Nature Communications, 2019. ISSN 20411723. doi: 10.1038/s41467-019-10431-6.

[bib.bib32] Ince (2017) R. A. Ince. The partial entropy decomposition: Decomposing multivariate entropy and mutual information via pointwise common surprisal. arXiv preprint arXiv:1702.01591, 2017.

[bib.bib33] Ince et al. (2017) R. A. Ince, B. L. Giordano, C. Kayser, G. A. Rousselet, J. Gross, and P. G. Schyns. A statistical framework for neuroimaging data analysis based on mutual information estimated via a gaussian copula. Human Brain Mapping, 38(3):1541–1573, 2017. ISSN 10970193. doi: 10.1002/hbm.23471.

[bib.bib34] James et al. (2011) R. G. James, C. J. Ellison, and J. P. Crutchfield. Anatomy of a bit: Information in a time series observation. Chaos: An Interdisciplinary Journal of Nonlinear Science, 21(3):037109, 2011.

[bib.bib35] James et al. (2018) R. G. James, C. J. Ellison, and J. P. Crutchfield. “dit“: a python package for discrete information theory. Journal of Open Source Software, 3(25):738, 2018.

[bib.bib36] Johnson (2013) J. Johnson. Hypernetworks in the science of complex systems, volume 3. World Scientific, 2013.

[bib.bib37] Kraskov et al. (2004) A. Kraskov, H. Stögbauer, and P. Grassberger. Estimating mutual information. Physical Review E, 69(6):066138, 2004.

[bib.bib38] Latham and Nirenberg (2005) P. E. Latham and S. Nirenberg. Synergy, redundancy, and independence in population codes, revisited. Journal of Neuroscience, 25(21):5195–5206, 2005.

[bib.bib39] Lindgren and Olbrich (2017) K. Lindgren and E. Olbrich. The approach towards equilibrium in a reversible Ising dynamics model: an information-theoretic analysis based on an exact solution. Journal of Statistical Physics, 168(4):919–935, 2017.

[bib.bib40] Lizier (2014) J. T. Lizier. JIDT: An information-theoretic toolkit for studying the dynamics of complex systems. Frontiers in Robotics and AI, 1:11, 2014.

[bib.bib41] Makkeh et al. (2020) A. Makkeh, A. J. Gutknecht, and M. Wibral. A differentiable measure of pointwise shared information. arXiv preprint arXiv:2002.03356, 2020.

[bib.bib42] McGill (1954) W. McGill. Multivariate information transmission. Transactions of the IRE Professional Group on Information Theory, 4(4):93–111, 1954.

[bib.bib43] Mediano et al. (2019) P. A. Mediano, F. Rosas, R. L. Carhart-Harris, A. K. Seth, and A. B. Barrett. Beyond integrated information: A taxonomy of information dynamics phenomena. arXiv preprint arXiv:1909.02297, 2019.

[bib.bib44] Morita et al. (2001) S. Morita, T. Nagase, and K. Nomizu. Geometry of Differential Forms. Iwanami series in modern mathematics. American Mathematical Society, 2001. ISBN 9780821810453.

[bib.bib45] Newman (2018) M. Newman. Networks. OUP Oxford, 2018. ISBN 9780192527493.

[bib.bib46] Ortega et al. (2018) A. Ortega, P. Frossard, J. Kovačević, J. M. Moura, and P. Vandergheynst. Graph signal processing: Overview, challenges, and applications. Proceedings of the IEEE, 106(5):808–828, 2018.

[bib.bib47] Parzanchevski and Rosenthal (2017) O. Parzanchevski and R. Rosenthal. Simplicial complexes: spectrum, homology and random walks. Random Structures & Algorithms, 50(2):225–261, 2017.

[bib.bib48] Pastor-Satorras and Vespignani (2001) R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters, 2001. ISSN 00319007. doi: 10.1103/PhysRevLett.86.3200.

[bib.bib49] Petri et al. (2013) G. Petri, M. Scolamiero, I. Donato, and F. Vaccarino. Topological Strata of Weighted Complex Networks. PLoS ONE, 2013. ISSN 19326203. doi: 10.1371/journal.pone.0066506.

[bib.bib50] Petri et al. (2014) G. Petri, P. Expert, F. Turkheimer, R. Carhart-Harris, D. Nutt, P. J. Hellyer, and F. Vaccarino. Homological scaffolds of brain functional networks. Journal of the Royal Society Interface, 2014. ISSN 17425662. doi: 10.1098/rsif.2014.0873.

[bib.bib51] Rosas et al. (2016) F. E. Rosas, V. Ntranos, C. J. Ellison, S. Pollin, and M. Verhelst. Understanding interdependency through complex information sharing. Entropy, 18(2):38, 2016.

[bib.bib52] Rosas et al. (2018) F. E. Rosas, P. A. Mediano, M. Ugarte, and H. J. Jensen. An information-theoretic approach to self-organisation: Emergence of complex interdependencies in coupled dynamical systems. Entropy, 20(10):793, 2018.

[bib.bib53] Rosas et al. (2019) F. E. Rosas, P. A. Mediano, M. Gastpar, and H. J. Jensen. Quantifying high-order interdependencies via multivariate extensions of the mutual information. Physical Review E, 100(3):032305, 2019.

[bib.bib54] Rosas et al. (2020a) F. E. Rosas, P. A. Mediano, H. J. Jensen, A. K. Seth, A. B. Barrett, R. L. Carhart-Harris, and D. Bor. Reconciling emergences: An information-theoretic approach to identify causal emergence in multivariate data. arXiv preprint arXiv:2004.08220, 2020a.

[bib.bib55] Rosas et al. (2020b) F. E. Rosas, P. A. Mediano, B. Rassouli, and A. Barrett. An operational information decomposition via synergistic disclosure. arXiv preprint arXiv:2001.10387, 2020b.

[bib.bib56] Rubinov and Sporns (2010) M. Rubinov and O. Sporns. Complex network measures of brain connectivity: Uses and interpretations. NeuroImage, 2010. ISSN 10538119. doi: 10.1016/j.neuroimage.2009.10.003.

[bib.bib57] Sandryhaila and Moura (2014) A. Sandryhaila and J. M. Moura. Discrete signal processing on graphs: Frequency analysis. IEEE Transactions on Signal Processing, 62(12):3042–3054, 2014.

[bib.bib58] Schaub and Segarra (2018) M. T. Schaub and S. Segarra. Flow smoothing and denoising: graph signal processing in the edge-space. In 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pages 735–739. IEEE, 2018.

[bib.bib59] Schneidman et al. (2003a) E. Schneidman, W. Bialek, and M. J. Berry. Synergy, redundancy, and independence in population codes. Journal of Neuroscience, 23(37):11539–11553, 2003a.

[bib.bib60] Schneidman et al. (2003b) E. Schneidman, S. Still, M. J. Berry, W. Bialek, et al. Network information and connected correlations. Physical Review Letters, 91(23):238701, 2003b.

[bib.bib61] Shuman et al. (2013) D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine, 30(3):83–98, 2013.

[bib.bib62] Stramaglia et al. (2020) S. Stramaglia, T. Scagliarini, B. C. Daniels, and D. Marinazzo. Quantifying dynamical high-order interdependencies from the O-information: an application to neural spiking dynamics. arXiv preprint arXiv:2007.16018, 2020.

[bib.bib63] Timme et al. (2014) N. Timme, W. Alford, B. Flecker, and J. M. Beggs. Synergy, redundancy, and multivariate information measures: an experimentalist’s perspective. Journal of Computational Neuroscience, 36(2):119–140, 2014.

[bib.bib64] Tononi et al. (1994) G. Tononi, O. Sporns, and G. M. Edelman. A measure for brain complexity: relating functional segregation and integration in the nervous system. Proceedings of the National Academy of Sciences, 91(11):5033–5037, 1994.

[bib.bib65] Vasiliauskaite and Rosas (2020) V. Vasiliauskaite and F. E. Rosas. Understanding complexity via network theory: a gentle introduction. arXiv preprint arXiv:2004.14845, 2020.

[bib.bib66] Watanabe (1960) S. Watanabe. Information theoretical analysis of multivariate correlation. IBM Journal of Research and Development, 4(1):66–82, 1960.

[bib.bib67] Wibral et al. (2017) M. Wibral, V. Priesemann, J. W. Kay, J. T. Lizier, and W. A. Phillips. Partial information decomposition as a unified approach to the specification of neural goal functions. Brain and Cognition, 112:25–38, 2017.

[bib.bib68] Williams and Beer (2010) P. L. Williams and R. D. Beer. Nonnegative decomposition of multivariate information. arXiv preprint arXiv:1004.2515, 2010.

[bib.bib69] Yeung (1991) R. W. Yeung. A New Outlook on Shannon’s Information Measures. IEEE Transactions on Information Theory, 1991. ISSN 15579654. doi: 10.1109/18.79902.

$L_{n}^{up} = W_{n}^{-1} B_{n}^{⊺} W_{n+1} B_{n},$

$L_{n}^{down} = B_{n-1} W_{n-1}^{-1} B_{n-1}^{⊺} W_{n},$

$F_{n}\,\bion = \hat{\bion},$

$F_{n}\,\bisn = \hat{\bisn}.$

