Information Confusion Reveals an Innate Limit of the Information Processing by Neurons

Information experiences complex transformation processes in the brain, involving various errors. A daunting and critical challenge in neuroscience is to understand the origin of these errors and their effects on neural information processing. While previous efforts have made substantial progresses in studying the information errors in bounded, unreliable and noisy transformation cases, it still remains elusive whether the neural system is inherently error-free under an ideal and noise-free condition. This work brings the controversy to an end with a negative answer. We propose a novel neural information confusion theory, indicating the widespread presence of information confusion phenomenon after the end of transmission process, which originates from innate neuron characteristics rather than external noises. Then, we reformulate the definition of zero-error capacity under the context of neuroscience, presenting an optimal upper bound of the zero-error transformation rates determined by the tuning properties of neurons. By applying this theory to neural coding analysis, we unveil the multi-dimensional impacts of information confusion on neural coding. Although it reduces the variability of neural responses and limits mutual information, it controls the stimulus-irrelevant neural activities and improves the interpretability of neural responses based on stimuli. Together, the present study discovers an inherent and ubiquitous precision limitation of neural information transformation, which shapes the coding process by neural ensembles. These discoveries reveal that the neural system is intrinsically error-prone in information processing even in the most ideal cases. Author summary One of the most central challenges in neuroscience is to understand the information processing capacity of the neural system. Decades of efforts have identified various errors in nonideal neural information processing cases, indicating that the neural system is not optimal in information processing because of the widespread presences of external noises and limitations. These incredible progresses, however, can not address the problem about whether the neural system is essentially error-free and optimal under ideal information processing conditions, leading to extensive controversies in neuroscience. Our work brings this well-known controversy to an end with a negative answer. We demonstrate that the neural system is intrinsically error-prone in information processing even in the most ideal cases, challenging the conventional ideas about the superior neural information processing capacity. We further indicate that the neural coding process is shaped by this innate limit, revealing how the characteristics of neural information functions and further cognitive functions are determined by the inherent limitation of the neural system.

Human brain features the capacity to process the external world information [1,2]. This 2 information processing process is triggered by external inputs and consists of various 3 forms of information coding in neural ensembles [3][4][5]. As shown by vast amount of 4 neuroscience studies, neural coding begins with the neural response initiation that is 5 characterized by neural tuning properties [6][7][8]. Then, the coding process is essentially 6 involves with the spike propagation in neural clusters [9][10][11][12], creating ubiquitous 7 information transformation between neurons. Given with the extensive presence of the 8 inter-neuron information transformation, a pivotal and meaningful question is how 9 precisely the information can be transmitted. 10 Considering the precision limit of neural information transformation, the precision 11 reductions implied by bounded, unreliable and noisy transformation processes are taken 12 into account naturally. It has been unveiled that a significant loss of information is 13 implied when the information amount exceeds the channel capacity of the 14 synapse [13][14][15][16][17][18], leading to a precision reduction by incomplete information. Moreover, 15 various information errors (e.g., errors caused by the channel unreliability [19] or 16 external random noises [20][21][22]) are discovered during the transformation, reflecting the 17 fact that precision is frequently limited by the nonideal transformation environments in 18 neural systems. These progresses, however, might lead to a misconception that the 19 precision limit only exists when the transformation is nonideal. Till now, it still lacks of 20 evidence to characterize the neural system as an inherently error-free information 21 processing system. 22 In this work, we put an end to this controversy by indicating the widespread 23 presence of a kind of innate and noise-independent error, namely information confusion. 24 The confusion happens when different information cases elicit the same receiver 25 response. Different from the information losses and errors during the nonideal 26 transformation, the confusion originates after the transmitted information arrives at the 27 receiver and does not rely on external noises or boundaries. We demonstrate that neural 28 information confusion is caused by the interactions between synaptic connection states 29 and neural tuning properties. Therefore, the precision limit implied by information 30 confusion is intrinsically determined by elementary attributes of the neural system. 31 Inspired by Shannon's information theory, we approach this precision limit in the 32 essentially shaped by the innate neural information processing limit. 48

49
Neural information confusion 50 Neural information channel 51 Using a leaky integrate-and-fire model [7,8,28], we can simulate the electrodynamics of 52 a neural population (see the section "Leaky integrate-and-fire networks" in Methods). 53 Then we present a model to define the channel between an arbitrary neuron N i and its 54 receptive field RF (N i ). 55 In neural systems, the intra-system receptive field for a given neuron is defined as 56 the set of all its pre-synaptic neurons. At any moment, each of those pre-synaptic 57 neurons has its own membrane potential state, and there would be an action potential if 58 the membrane potential reaches the spiking threshold. For the given neuron, those 59 states can determine its pre-synaptic inputs, so it is reasonable to treat the ensemble of 60 them as a message sent from the receptive field to this neuron. In this process, the 61 channel is defined as the set of all synaptic connections between the given neuron and 62 its pre-synaptic neurons (Fig. 1a). In a neural population, the stimulus inputs may not be received by all neurons at 64 the same time. We refer the neurons that receive stimuli directly as input neurons. For 65 these neurons triggered by other neurons in their receptive fields but not receive stimuli 66 directly, we refer them as intermediary neurons. The response preference of each input 67 neuron is characterized by a bell-shaped tuning curve G (r max , s pre , σ), where r max is 68 the maximum response rate, s pre is the preferred stimulus and σ represents the width of 69 the tuning curve. As for intermediary neurons, we do not define explicitly their 70 activities by presetting their tuning curves since they are not directly triggered by 71 stimuli. Theoretically, we can estimate their tuning properties based on the tuning 72 curves of the neurons in its receptive field and the synaptic connections between them 73 ( Fig. 1b and the section "Tuning curve and neural response" in Methods). In Fig. 1b, 74 it can be seen that the estimated tuning curves share similar trends with the observed 75 neural responses in our experiment, which accords with the common interpretation that 76 neurons tend to make stronger responses at the peaks of tuning curves [6,[29][30][31]. With 77 a given tuning curve, each neuron N i can response to stimuli S (Fig. 1c). 78 Neural information confusion in the information space 79 The neural information transmits in discrete and finite form. We define the neural 80 information space of neuron N i as IN (N i ), which contains all possible variant cases of 81 the neural information that N i can receive.

82
To give a clear vision, we mainly present our theory on the spike-based neural 83 information [32,33], where the neural response is either 1 for spiking or 0 for 84 non-spiking (we also demonstrate our theory can be applied on the potential-based 85 neural information [34,35] in the section "Spike-based information space" in Methods). 86 In the neural network, the neural responses of one neuron can also be the bases of 87 synaptic inputs of its post-synaptic neurons. More specifically, in our simplification, the 88 transmitted spike-based information is the excitatory or inhibitory postsynaptic 89 potential. To simulate it, we define the binary vector P i for the neural spiking states of 90 neurons in RF (N i ), and the non-recurrent connection strength matrix C that indicates 91 the synaptic connections between N i and other neurons in the neural network (see the 92 section "Spike-based information space" in Methods). Then, let C i be the i column of 93 C, the neural information can be represented by the Hadamard product of P i and C i 94 ( Fig. 2a and the section "Spike-based information space" in Methods). Thus, for each 95 given neuron N i , the neural information space IN (N i ) (see the section "Spike-based 96 information space" in Methods, equation (12)) is defined as where the matrix C can be set as either constant 98 for simplification or dynamical to fit into the plasticity mechanisms (see the section 99 "Neural zero-error capacity definition for the dynamical transformation process" in 100 Methods). Visualizations of neural information space and graph. (a) We define that RF (N i ) consists of 2 types of neurons with unique tuning curves. Assume that each j type includes only one neuron, which is marked as N j . We know that |IN (N i ) | = 4 and all the possible cases are given above. (b) Assume that both of the 1st and 2nd type of neurons can activate N i independently. Thus, the connected component of G (N i ) contains only the neural information cases where one or both of the 1st and 2nd type of neurons spike. As for the rest cases, they are isolated nodes. (c) The construction of G (N i ) n based on G (N i ). if and only if both I x and I y can make N i spike when N i is not in the refractory period 120 (see the section "Potential-based information graph definition" in Methods). Fig. 2a-b 121 show an example of G (N i ). In our framework, the proposed information confusion is 122 naturally determined by the interactions between synaptic connection states and neural 123 tuning properties, which does not rely on noises.

124
Neural zero-error capacity 125 In information theory, zero-error capacity is used to indicate the supremum of 126 information transformation rates with zero probability of error in a given channel.

127
Following the classical definition, we define the neural zero-error capacity Θ (N i ) as where α is an independence number (see the section "Zero-error capacity definition of As for K * i , since there is an isolated node (marked by a green box) in the graph and it can be only included in one clique, which is K 8 , we can conclude that µ = 1. Similarly, θ = 1 since there is no other clique in this connected component. (c) For the G (N i ) in Fig. 2b, we randomly search the zero-error transformation cases in G (N i ) n with n = 1, · · · , 11 by solving the independent set searching problem. For each G (N i ) n , we randomly pick its independent set 1000 times to obtain a set of zero-error information rates. Moreover, based on our theory, an upper bound of Θ (N i ) can be predicated (marked by red circles), which equals 1. (d-e) The upper bound is marked as "UB". N i is independent means that its spiking state is independent from spiking states of the neurons in RF (N i ), which indicates that N i must be an input neuron rather than an intermediary neuron.
An important property of equation (1) is that the neural zero-error capacity of any 143 given neural information graph has a close relation with the maximum clique 144 assignment λ of the graph. In our work, we suggest that any neural information graph 145 G (N i ) has a maximum clique assignment λ i determined by the properties of RF (N i )

146
(see the section "The maximum clique of neural information graph" in Methods, (24) 147 and (25)). Then, because of the independence number α satisfies α ≤ λ −1 , we can tell 148 that λ can offer an upper bound for the neural zero-error capacity, which is given as ). Thus, combine the results in (24) and (25) where τ j indicates if I j can activate neuron N i . τ j equals 1 when I j can make N i spike 153 and equals 0 when I j can not. 154 We need to pick one node from G (N i ) that has minimum degree (the number of 155 cliques that include this node is smallest), let K * i be the set of all cliques that contain

167
In real neural systems, the situations with E = ∅ are ubiquitous. Thus, we suggest 168 that the limited neural zero-error capacity described by (2)  During neural coding, information is transmitted between neurons and frequently but 188 not necessarily involved with the information confusion. Apart from that, it is known 189 that the neural response preference contained in the neural information is summed and 190 transmitted as well, which usually implies a selectivity generalization along the neural 191 pathway (e.g., the selectivity generalization from the V1 neurons to MT neurons in the 192 visual system [36][37][38]). Due to the similarity between information confusion and We demonstrate a dyeing experiment involves with a network of 500 neurons. In the 197 experiment, the network is assumed to code a stimulus sequence consists of different 198 vectors. Each type of input neuron with unique stimulus selectivity is marked with a 199 specific color and all intermediary neurons are initialized with no innate preference and 200 marked as white (Fig. 4a). For simplification, we use numbers to index those vectors 201 in our results (Fig. 4c). In each iteration, the intermediary neurons are dyed We know that the existence of neural information confusion means that different 213 neural information implies same neural response of the receiver neuron. In the dyeing 214 experiment, each spiking intermediary neuron is dyed with the color averaged from its 215 previous color and the colors of the lately spiked neurons in its receptive field. So, if 216 there is only one message that makes this neuron spikes, then its color will straightly 217 approach to the averaged color of the spiking neurons described by this information 218 (with a straight trajectory). Otherwise, if the variation trajectory of its color is winding 219 or oscillating (not only one message can activate this neuron), then we suggest that Supporting information. As for the selectivity generalization, it only requires the color 225 has at least two non-zero color components (Fig. 4b) and the variation trajectory can 226 be either straight or winding. Together, we conclude that the information confusion is a 227 special case of the selectivity generalization with a winding color variation trajectory 228 and thus they are not equivalent conceptions.  The smaller H * s is, the less noise the coding process of s produces. In other words, the 241 neural activities can be better explained by s. Thus, we define the stimulus subset that 242 can better explain the neural activities as S = {s | s ∈ S, H * s < H * }. Then, we define 243 the coding scope as ζ = | S| |S| (see the section "Definition of the coding scope" in Supporting information.

287
To sum up, the information confusion has both detrimental and beneficial effects on 288 neural coding. On the one hand, the total variability H of neural responses is reduced, 289 which limits the mutual information H * * ; On the other hand, it is highly possible that 290 H * decreases when the stimulus-irrelevant neural activities are controlled and both ζ 291 and H * * H increase, which means that the interpretability of neural responses is improved 292 and the activities can be better explained by a wider range of stimuli. In the present study, we reveal an innate limit of the information processing in neural 296 system by indicating the pervasive information confusion phenomenon during the 297 information transformation between neurons. 298 We first define the neural information confusion in the content of neuroscience and 299 propose a practical method to work out the upper bound of the information 300 transformation rate with zero error of any given neuron (neural zero-error capacity).

301
This systematic theory can be applied to any kind of discrete and finite neural 302 information (e.g., spike-based or potential-based). For a neuron, if the neural zero-error 303 capacity is reached, then there must exist neural information confusion; if it is not 304 reached, there still exists the possibility of neural information confusion. 305 We then propose a practical method to analyze the effects of information confusion 306 on neural coding. The results suggest that the effects of information confusion can be 307 either detrimental or beneficial. On the one hand, it controls the total variability H of 308 neural responses and limits the mutual information H * * . On the other hand, it 309 improves the interpretability of neural responses as that the stimulus-irrelevant neural 310 activities are controlled.

311
As unveiled in our work, the precision limit caused by information confusion features 312 widespread presence during the neural information processing, and is intrinsically 313 determined by neural tuning properties and synapse states. This innate limit plays a 314 critical role in characterizing the neural coding process, leading to the variation of coded 315 information along the information transformation pathway. In sum, we demonstrate 316 that neural system is not an inherently error-free information processing system even 317 under ideal conditions, and its essential limit in information transformation creates 318 significant effects in neural coding.

319
Neural information confusion and relevant topics 320 Besides information confusion, there are three other resources related with the 321 information limitation of neural systems. The first one is the information loss caused by 322 channel capacity [13][14][15][16][17][18]. Information loss happens when the entropy of information 323 exceeds the channel capacity (the maximum transfer rate that the channel supports). Information loss concerns about the limited information transfer rate rather than the 325 transfer precision. So it is more related to the reduction of the total response entropy H 326 in our research [19][20][21][22].

327
The second one is the information limiting correlation observed experimentally. The 328 origin of it still remains controversial. A classical perspective argues that the similarity 329 in tuning properties implies the correlations among neurons for a target stimulus and 330 affect neural coding significantly [39][40][41]. These positive correlations are inevitably 331 caused by the shared input connections between neurons with similar tuning 332 characteristics, which suggests that shared connections between neurons might lead to 333 limited information [42]. Recent studies have contradicted this hypothesis in several 334 aspects. Rather than originate from the shared connections, the limitations are 335 suggested to be caused by the correlations proportional to the product of the derivatives 336 of the tuning curves. They spontaneously emerge in the finite information encoding and 337 storage process of a sufficiently large neural population [43][44][45]. Despite the controversy 338 on the origin mechanism, it is confirmed that the correlation can limit information in 339 neural coding (similar with the information loss). A similar finding is proposed in our 340 research, which suggests that compared with the pre-synaptic variability (information) 341 of neural responses, the post-synaptic one is frequently controlled.

342
The third one is the information error (e.g., errors caused by the channel 343 unreliability [19] or random noises [20][21][22]). Compare with information confusion, there 344 are two main differences. The first difference lies in that the information confusion does 345 not focus on the noises added to the neuron, confusion is determined congenitally by the 346 tuning properties, even in an ideal situation with no external noise. The second 347 difference is that the transfer losses and errors happen during the transformation while 348 the confusion happens after the information arrives at the receiver. In our paper, the recurrent network of leaky integrate-and-fire neurons is used to create 355 the electrodynamics involved in the neural information transformation process. Rather 356 than actuate all neurons directly based on the stimulus as previous researches did [43], 357 we distinguish input neurons from the neuron set and define the stimulus as the 358 synaptic drive for those neurons.

359
The basic element in our simulation is the differential equation for the membrane potential time evolution of leaky integrate-and-fire neurons with current synapses, which is defined as In this equation, item (I) is the leak current. Λ denotes the Heaviside step function, 360 τ p is the leaky membrane time constant and V is the resting potential. If the membrane 361 potential V i is not less than V , then neuron N i will be involved in a hyperpolarization 362 process described based on the leaky mechanism. 363 (II) is the recurrent item, which consists of the spiking mechanism and the synaptic 364 input. In this item, W is the weighted adjacent matrix that defines the connection 365 strength between neurons, where W min and W max are respectively the minimum and maximum connection strength, and V is the spiking threshold. And Q ij t − t n j + t j→i is the synaptic response to the spike, which is given as in which τ m is the membrane time constant. And ξ is given as In equation (5), t n j ≤ t denotes the time of the nth existing spike of neuron j. And t j→i 368 measures the time cost, if j = i, then t j→i is the refractory period; If j = i, then t j→i is 369 the average transmission delay of the spike. Based on equations (4) and (6),

376
(III) is the stimulus drive item, whose definition is learned from [43]. In this item, I 377 is the indicative function and N is the index set of all input neurons.

378
If neuron i is an input neuron, then F i is the synaptic drive of it based on the 379 stimulus S. The definition of F i can vary based on research targets. In our research, the 380 synaptic drive is defined to represent the characterized response preference of input 381 neurons. More specifically, each input neuron N i has its own tuning curve G i , which 382 decides its response strength to given stimuli (see the subsection in Methods for 383 details). Then, we let F i (S (t)) = G i (S (t)), thus the stimulus preference of each input 384 neuron is defined.  In this section, we propose a practical method to measure the upper bound of neural 391 zero-error capacity directly based on the properties of the given neuron.

392
Tuning curve and neural response 393 In our research, the stimulus S is set as a sequence where each S (t) is selected 394 randomly and uniformly from the stimulus interval [S min , S max ].

395
The tuning curve of any given input neuron is defined as G (r max , s pre , σ), in which 396 the preferred stimulus s pre is randomly selected from [S min , S max ] based on a random 397 distribution F . In our research, F is set to be an uniform distribution for simplification. 398 And the maximum response rate r max is randomly selected from an empirical interval 399 [40,60] based on an uniform distribution. σ represents the width of the tuning curve, ].

401
In detail, the mathematical definition of the tuning curve is given as As for the intermediary neuron, its estimated tuning curve can be calculated based 403 on the recurrent connection strength matrix W and the tuning curves of input neurons. 404 Assuming that each input neuron N i has a tuning curve G i , the tuning curve G j of a 405 given intermediary neuron N j is estimated as where I is the indicative function and − → R ij is the shortest route from N i to N j . Based 407 on equation (8), we actually let the tuning curve of intermediary neuron be shaped by 408 the postsynaptic potentials of input neurons.

409
Neural information space definition 410 There are two kinds of widely used neural electrical information forms. The first one is 411 the spike-based and the second one is the potential-based. The spike-based information 412 has 2 symbols (spiking and non-spiking) while the potential-based information has information will be given as following.

417
Spike-based information space To define the spike-based information space, it 418 is necessary to define the neural spiking states in the receptive field of each given 419 October 14, 2020 12/25 neuron. For a neural population with n neurons, we define that each neuron N i is 420 equipped with a spiking state P i (t) at moment t, in which 421 P i (t) = 1, N i emits a spike at moment t, 0, N i emits no spike at moment t.
Then, the spiking states vector P i within the intra-system receptive field of any neuron 422 N i is given as In equation (13), we exclude the spike state of N i itself to leave out the recurrent 424 information. Analogously, we can also define a non-recurrent connection strength matrix 425 C to indicate the synaptic connections between neurons by exclude the recurrent items 426 in W . More specifically, we leave out the all elements on the primary diagonal of W Let C i be the i column of C, by calculating the Hadamard product of P i and C, we can represent any possible case of the spike information that N i can receive, based on which, the spike-based information space IN (N i ) is defined as Potential-based information space The main difference between spike-based 428 and potential-based information lies in that the first one only indicates whether a 429 neuron spikes or not while the second one can not only show the spiking state, but also 430 indicate the membrane potential.

431
For a given neuron N i , based on (3), its potential state is V i (t). When the potential 432 state reaches to V , there will be a spike. In real experiment, the measurement 433 technology can not realize arbitrary precision, there must be a precision limitation of it. 434 So, we can treat V i (t) as discrete.

435
Following the idea used in the definition of the spike-based information space, we 436 define the potential states vector V i within the intra-system receptive field of any 437 October 14, 2020 13/25 neuron N i is given as And the corresponding potential-based information space is given as Measure the number of elements in the neural information space usually result in the similar responses to given stimuli, we can reasonably measure the 443 total variation of neural spiking states based on the characterized response preference. 444 More specifically, we use the Wasserstein distance to indicate the differentiation 445 between the tuning curves of two neurons N x , N y , which is given as where Π G x , G y is the set of all possible joint distributions of G x and G y . For each 447 given joint distribution Σ, equation (15) takes one sample (a, b) from it at each time and 448 eventually works out the expectation of L 2 norm · 2 of all samples. Then, we use the 449 infimum of all possible expectation values to represent the distance between G x and G y . 450 Based on this definition, we define an equivalence relation ∼ W that N x ∼ W N y if 451 and only if d W G x , G y ≤ γ, in which γ is a given threshold. Thus, for neuron N i , its 452 intra-system receptive field RF (N i ) can be classified into RF (N i ) / ∼ W , where each 453 element is a neuron type with specific characterized response preference. In our research, 454 we mark that If we let γ approach to 0, 455 we know that the homogeneity between same type neurons will be increased. Thus, 456 under ideal conditions, we assume that for each type of neurons, they can either all emit 457 spikes or all keep silence when a given stimulus comes in. As a result, the total variation 458 of neural spiking states in the intra-system receptive field of neuron N i is defined as The cardinal number of potential-based information space Following the 460 idea discussed previously, we continue to use the equivalence relation ∼ W to obtain N i . 461 We assume that the measurement accuracy limitation of the membrane potential is ∆V . 462 Thus, for any given neuron N i , the number of all possible potential states of it can be it is clear that (18) means that both of the postsynaptic potentials correspond to I x and 476 I y can reach to the spiking threshold V of V i .

477
To seek for a better representation of the confusion relation, we propose the 478 definition of spike-based information graph. For each given neuron N i , its spike-based 479 information graph is defined as where then we treat the membrane potential states described by I x and I y are same as each 492 other.

493
Then, we can define that Based on the above definition, we can follow the method in (19) to define the 497 potential-based information graph.
Zero-error capacity definition of the neural information space 499 Up to now, we have obtained the neural information space and the corresponding graph 500 (both the spike-based and the potential-based). We then turn to measure the zero-error 501 capacity based on the graph. Note that what we will discuss is independent from the 502 selection of neural information type, so we do not distinguish between different types of 503 neural information.

504
In the information theory, if we treat G (N i ) as the representation of relation 505 between symbols (nodes in the graph), then G (N i ) n represents the relation between the 506 sentences consists of n symbols. We know that if there is an edge between two nodes, 507 then the symbols or sentences represented by them are easy to be confused with each 508 other. For informatics, confusion is a kind of error. So, a meaningful question is to ask 509 about how many symbols or sentences from a given system can be transmitted with 510 zero-error at most. To answer this question, Shannon defined zero-error capacity [23], 511 which is given as where α is the independence number, which indicates how many nodes can be included 513 in the maximum independent set of G (N i ).

514
In this section, we will not go deeper into the discussion of how Shannon create this 515 concept step by step. What we want to emphasize is that Shannon and later 516 researchers [23,26,27], discovered that for any graph G (N i ) where λ (G (N i )) is the maximum clique assignment. Based on this inequality, when it is 518 hard to work out Θ (N i ) directly, we can still obtain a bounded measurement of it.

519
The maximum clique of neural information graph 520 A relevant concept of the neural information graph is the maximum clique assignment. 521 In graph theory, the maximum clique assignment of a graph G is defined as where X is any random distribution X = {X v | v ∈ V } and K is the set of all cliques in 523 the graph.

524
In our research, the maximum clique assignment λ i of any given G (N i ) satisfies • if E = ∅, then there is where we pick one node from G (N i ) that has minimum degree (which also means 527 that the number of cliques include this node is smallest). We then let K * i be the 528 set of all cliques that contain this node, K * * i be the set of all cliques in the same 529 connected component with this node. Next, define |K * i | = µ and |K * * i | = θ. Apart 530 from that, V is the spiking threshold and τ j = τ I j is used to indicate whether 531 I j can activate neuron N i , which is defined as It is clear that τ I j equals 1 when I j can make N i spike and equals 0 when I j 533 can not.

534
To provide a better understanding for (24) and (25), we give the following proof for 535 this theorem. • if E = ∅, we can know that j τ j > 1. Then, based on equation (23), we assign 2 −|Ni| to the nodes in G (N i ), and assign 0 to the edges in G (N i ), meaning that Then, we can also consider the dual problem of the definition of λ, which is given by where Y is a any random distribution Y = {Y K | K ∈ K i }. The assignment of Y 538 is a little tricky, which is given as 545 Based on the assignment described above, it is easy to know which can be rewritten as Thus, based on equation (27) and (30), we can prove that (25) is right when E = ∅. 547 • If E = ∅, we know that j τ j ≤ 1 and there is no edge in G (N i ). So, the maximum clique exists in this graph is each individual node itself. Then, following the second assignment method we use above, we can also prove where |K * i | = 1, |K * * i | = 1 and |G (N i ) / ∼ C | −1 = 2 −|Ni| . Thus, (24) is correct 548 when E = ∅. Here we don't repeat the proof in detail.

549
Specially, to make the calculation easier, for (30), we can have following analyses for 550 µ 551 • j τ I j < 2 |Ni| means that not all neural information cases can make N i spike 552 and G (N i ) is not a complete graph. Under this condition, we know there must be 553 at least one isolated node in the graph, and it implies that since the isolated node is included in only one clique, which is the node itself.

555
• j τ I j = 2 |Ni| means that any neural information case can activate N i , so G (N i ) is a complete graph. Under this condition, every node is included by same amount of cliques. For convenience, we mark 2 |Ni| = n. We then randomly pick one node from G (N i ), as for its corresponding K * i , we know that where K * i,j ⊂ K * i is the set of all the cliques that include this node and have j 556 nodes in total. 557 We can also tell that the θ in equation (30) can be worked out by where G * * is the given connected component and deg (v) measures the degree of v. An 558 important property is that if µ = 1, which means that the connected component is a 559 isolated node, we can know θ = 1 as well since there is only one clique in the connected 560 component (the node itself).

561
Upper bound estimation method 562 Based on (24) and (25), we can simply deduce that for any given G (N i )

563
• if E = ∅, then • if E = ∅, then so combine (36), (37) with (22), we can know An important thing is that (38) is actually the supremum of the neural zero-error 567 capacity if G (N i ) is not a complete graph. The following is the proof is not a complete graph, then we know j τ I j < 2 |Ni| , which can 569 be divided into two cases 570 • j τ I j = 0, which means that E = ∅ and there is no edge in the graph. Under 571 this condition, the biggest independent set in G (N i ) is itself, which deduces that 572 α = 2 |Ni| . So, based on (36), we can know that combined with (22), (39) implies that thus the upper bound predicted by us is the supremum when j τ I j = 0.

575
• 0 < j τ I j < 2 |Ni| , which means that E = ∅ and there must be at least one 576 isolated node in the graph. It is clear that the isolated node has the smallest degree, 577 being 0. Based on (33), we know that µ = 1 under this condition. Similarly, θ = 1 578 can be obtained by (35). Then, (37) can be written as where we know that j (1 − τ j ) measures the number of isolated nodes. And each 580 isolated node is a connected component. As for the un-isolated nodes, they are 581 connected with each other and belongs to one connected component. Thus, the 582 number of connected component number of G (N i ) is 1 + j (1 − τ j ).

583
Since the maximum independent set contains only one node from each connected 584 component, we can know α = λ −1 . Thus, by following (40), we can prove that the 585 upper bound is actually the supremum when 0 < j τ I j < 2 |Ni| .

586
To conclude it, we know that when j τ I j < 2 |Ni| , the upper bound is the 587 supremum.

588
As for the case when G (N i ) is a complete graph, we can directly know that 589 Θ (N i ) = 0 since you can never pick two nodes that have no edge between them in the 590 graph. So, the measurement of upper bound is no longer needed.
Detect the neural information confusion based on the dyeing experiment 633 result 634 To offer a detection for the neural information confusion in real experiments, we propose 635 a method that can be applied directly to the results of the dyeing experiment.

636
Given that in the dyeing experiment, if an intermediary neuron spikes, then it is 637 dyed with the color averaged from its previous color and the colors of the lately spiked 638 neurons in its receptive field. So, it is easy to know that for any intermediary neuron N i 639 • if G (N i ) contains no edge (there is no confusion), then there are two possible 640 cases:

641
-N i never spikes, so the color of it always reminds to be the initial color. The 642 variation trajectory of the color of N i in the color space is a point;

643
there is only one message that can activate N i , so the color of N i gradually 644 approaches to the averaged color of the lately spiked neurons described by 645 this message. The variation trajectory of the color of N i in the color space is 646 straight.

647
• if G (N i ) contains at least one edge (there is confusion), then the color of N i will 648 approach to the averaged color of the lately spiked neurons described by different 649 messages in different iterations. So, the variation trajectory of the color of N i in 650 the color space is winding.

651
Based on those discussions, the detection of the neural information confusion can be 652 realized by verifying whether the color variation trajectory is winding. variation of neural responses that can be explained by stimulus). In [6], researchers have 659 proposed a practical method to calculate them, which are also used in our paper.

660
The first step in this method is to obtain the conditional probability distribution 661 P (r | s) of each neuron, which denotes the probability of that the spiking rate of the 662 neuron is r when the stimulus input is s. Of course, this distribution can be obtained 663 directly in a real neural coding experiment. As for the computational experiment used 664 in our paper, this distribution can be worked out based on the tuning curve, which is 665 given as where G i is the tuning curve of N i and T is the duration.

667
Then, the second step is to work out the total response distribution P i (r), which 668 measures the probability of that the spiking rate of N i is r. In our paper, since the 669 probability distribution of the stimulus P (S) has been given, so we have 670 P i (r) = s P i (r | s) P (s).

671
Finally, we can calculate the parameters respectively as 672 H (N i ) = − r P i (r) log 2 P i (r) , where H * s = − r P i (r | s) log 2 P i (r | s) denotes the noise entropy of neural coding for 678 s (measures the variation of neural response that can't be explained by s). 679 Then, we define the coding scope as Based on those definitions, it is easy to know that the coding scope ζ measures the 681 proportion of the stimuli that can better explain the neural responses.