Cognitive Domains Function Complementation by NTNG Gene Paralogs

Gene duplication was proposed by S. Ohno (1) as a key mechanism for the evolution of novel gene functions. A pair of gene paralogs, NTNG1 and NTNG2, sharing identical gene and protein structures and encoding similar proteins, forms a functional complement subfunctionalising (SF) within cognitive domains and forming cognitive endophenotypes, as detected by Intelligence Quotient (IQ) tests (2). Both NTNG paralogs are associated with autism spectrum disorder (ASD), bipolar disorder (BD) and schizophrenia (SCZ), with unique non-overlapping segregation among the other 15 cognitive disorders (CD), emphasizing an evolutionary gain-dependent link between advanced cognitive functions and concomitant neurocognitive pathologies. The complementary expression and human brain transcriptome composition of the paralogs explain the observed phenomena of their functional complementarity. The lowest identity among the NTNGs is found in the middle of the encoded proteins, designated as the unknown (Ukd) domain. NTNG1 contains anthropoid-specific constrained regions, and both genes contain non-coding conserved sequences that underwent accelerated evolution in human. SF of the NTNG paralogs perturbs the “structure drives function” concept at the protein and gene levels. The functional diversification of the paralogs forms a so-called “Cognitive Complement (CC)”, a product of gene duplication and subsequent cognitive subfunction bifurcation among the NTNG gene duplicates.


INTRODUCTION
Complex behaviors arise from a combination of simpler genetic modules that have either evolved separately or co-evolved. Many genes and the proteins they encode have been found to be involved in cognitive information processing, with a single variant or a single gene generally accounting for only a part of the phenotypic variation of a complex trait such as cognition. Cognition, as a quintessence of brain functioning, can be viewed as a product of intricately interlinked networks generated by deeply embedded gene-nodes with specific or partially overlapping functions. The robustness of cognitive processing towards genetic elimination of its single elements (performed to study their function), and its simultaneous fragility expressed in the multiple forms of neurological disorders, manifest the existence of cognitive domains that are interlocked but SF within a unit of cognition formed by the interaction of these domains. Previously, we have described the function of a pair of gene paralogs, NTNG1 and NTNG2, that are involved in human IQ test performance and underwent hominin-specific evolutionary changes (2). Here, we report on the features of these gene paralogs, focusing on the mechanisms underlying their function segregation and complementation within the cognitive domains.

RESULTS
The previously observed phenomenon of functional complementation among the NTNG paralogs within cognitive domains (2) is also manifested in NTNG-associated human pathologies, diagnosed in most cases (if not in all) by a cognitive decline (Figure 1A-1 and A-2). Both genes are associated with BD and SCZ, devastating disorders sharing similar etiology (3) with a genetic correlation by multivariate analysis of 0.590 (4), linked to human creativity (5), and characterized by impulsiveness as a common diagnostic feature (6).
The recently found associations of both paralogs with ASD (7) support the reported genetic correlation of 0.194 for the ASD/SCZ pair (4) and the shared module eigengenes detected by PC1 among these two disorders (8). Twelve NTNG1-linked CDs, ranging from AD to TS, span a broad spectrum of clinical features frequently involving reduced processing speed (PS) and verbal comprehension (VC, Figure 1A-1). As for NTNG2, working memory (WM) deficit and inability "to bind" events (perceptual organization, PO) are the most prominent diagnostic traits for the SLE and TLE patients (Figure 1A-2), with PN also being characterised by indolent behavior in 90% of the cases (9). It is interesting to note that the association of the synapse-expressed NTNG2 with both SCZ and an autoimmune pathology (SLE) correlates with a recent finding that human complement component C4 is involved in synapse elimination and SCZ development (10). Thus, both NTNG paralogs are associated with a variety of CDs, mostly in a non-overlapping manner, except for ASD, BD and SCZ, which are characterized by a shared and wide spectrum of cognitive abnormalities. The clinical etiology of the aforementioned diseases supports the functional complementation among the NTNG paralogs deduced from IQ tests (2), with the (VC/PS) and (WM/PO) endophenotypic deficits being uniquely segregated within the associated cognitive pathologies.
Since both gene paralogs are expected to have identical exon/intron compositions but to differ in their intron lengths (11), we have reconstructed the transcriptomes of both paralogs by re-processing the publicly available RNA-seq dataset (12) from superior temporal gyrus (STG) post-mortem brain tissue of healthy and SCZ human subjects (Supplementary Table 1a = ST1a). A difference is noted instantly at the total expression levels (genes, exons, individual RNA transcripts) when the two gene paralogs are compared (Figure 1B-1 and B-2). The amount of NTNG2 (as a whole gene) is 5 times larger than that of NTNG1; exons (2-5) are 3 times, exons (8-9) are 18 times and exon 10 is 4 times more highly expressed for NTNG2 than for NTNG1. The only two exons falling outside this rule of prevailing NTNG2 mRNA amounts are exons 6 and 7, expressed at nearly the same absolute level as the paralogous NTNG1 exons, making them highly underrepresented within the NTNG2 transcriptome. Next, distinct non-alternating splicing modules are formed by exons (2-5) for NTNG1 (Figure 1B-1), and by exons (4-5) and exons (8-9) for NTNG2 (Figure 1B-2). Two structurally identical RNA transcript paralogs (NTNG1a = G1a and NTNG2a = G2a) have been found to exist in both NTNG transcriptomes, with G2a being expressed at an 8-9 times higher level than G1a. NTNG1 is uniformly represented across all 16 analysed human samples by 2 more protein-coding RNAs (G1c and G1d, detected previously in mouse brain, Nakashiba et al., 2000) and by 2 non-coding intron (9-10)-derived transcripts (Figure 1B-1).
At the same time, the NTNG2 transcriptome is comprised of one extra potentially coding RNA (G2a-like, with exon 2 spliced out but in-frame coding preserved) and 2 assumed non-coding RNAs with exons 6 and 7 retained along with the introns preceding and following them.
It is quite interesting that these two latter transcripts are the only RNA species with NTNG2 exons 6 and 7 retained (Figure 1B-2). Two more coding (G1f and G1n) and 4 more non-coding RNA species for NTNG1, and 9 extra non-coding RNA species for NTNG2, have also been assembled from the available reads, but due to inconsistency in their appearance across all 16 STG samples they are not presented on the figure but are summarized in the table (Figure 1C; for details refer to ST1d). In summary, quantitative and qualitative complementary differences are a prominent feature characterising the brain RNA transcriptome of the human NTNG paralogs. However, no significant changes at the transcription level of whole genes, individual exons, or reconstructed RNA transcripts have been found when samples from SCZ and healthy subjects are compared.
Upon calling the presence of IQ-affecting SNPs (2) across all STG samples (ST1c), it was revealed that 15 out of 16 subjects were positive for the T-allele of rs2149171 (exon 4-nested), shown above to attenuate the WM score in SCZ patients, making a comparison between allele carriers and non-carriers impossible. Four healthy and three SCZ samples carry the T-allele of rs3824574 (exon 3-nested, not affecting IQ), and 1 healthy and 1 SCZ sample each contain the C-allele of rs4915045 (nested in the non-coding part of exon 10, not affecting IQ).
Thus, among the eleven cognitive endophenotype-associated SNPs, only 3 could be called from the available NTNG transcriptome.
The distinctly complementary nature of the NTNG paralogs' segregation within neurological disorders and of their RNA transcriptome usage in STG (Figure 1) has prompted us to analyse the expression of both genes across the entire human brain. We have reconstructed the expression profiles of both genes in human brain areas over the life span from conception (pcw = post-conception week) to mature age (30-40 yrs old) using the RNA-seq data from BrainSpan (www.brainspan.org). Similarities and differences are easily noted when the age-dependent phases of the NTNG1 and NTNG2 expression profiles are matched (Figure 2). Based on visual inspection, three distinct classifiers have been elaborated: 1. predominantly synchronous (Figure 2A(1-4)), characteristic mostly for the cortical areas; 2. predominantly mixed and asynchronous (Figure 2B), characteristic for the cerebellar cortex and subcortical formations; and 3. anti-phasic (complementary, Figure 2C), characteristic for the MD of thalamus and hippocampus. All analysed brain areas demonstrate an elevated level of NTNG2 expression in comparison to NTNG1, except for thalamus (Figure 2C), with the largest difference observed at the time of birth (35-37 pcw) or soon after (4 mo) for the synchronous classifiers (Figure 2A), and with oscillating increment values across the life span for the mixed (Figure 2B) and anti-phasic (Figure 2C) classifiers. It is quite intriguing that essentially all brain areas show a trend towards the expression difference between the paralogs being negated upon reaching the mature age of 30-40 yrs old (near or above the mean age used for the IQ testing), except for MD, where the expression discrepancy is increased. Thus, the observed functional complementation among the NTNG paralogs is supported by the anatomical distribution of the genes in the human brain and by the modality of their expression patterns over the human lifetime.
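The three classifiers above were assigned visually. Purely as an illustration of how such a call could be made reproducible, the sketch below labels a pair of age-matched expression profiles by their Pearson correlation. This is not the procedure used in this study: the function names, the 0.5 threshold and the toy profiles are all hypothetical assumptions.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("a flat profile has no defined correlation")
    return cov / (sx * sy)


def classify_profile(ntng1, ntng2, threshold=0.5):
    """Label two age-matched expression profiles.

    threshold is an arbitrary, hypothetical cut-off (not from the paper).
    """
    r = pearson(ntng1, ntng2)
    if r >= threshold:
        return "synchronous"
    if r <= -threshold:
        return "anti-phasic"
    return "mixed"


# Toy profiles (hypothetical values, one point per sampled age).
print(classify_profile([1, 2, 3, 4], [2, 4, 6, 8]))  # profiles rise together
print(classify_profile([1, 2, 3, 4], [8, 6, 4, 2]))  # profiles move in opposition
```

A correlation-based label ignores absolute expression levels, so it would complement rather than replace the comparison of NTNG1 and NTNG2 amounts described above.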
A direct comparison of the NTNG paralogs shows not only identical intron-exon gene structure (Figure 1B-1, 2B-2) but also closely matched exon sizes (Figure 3A). There are three exons of identical sizes (exons 4, 8 and 9), another three exons differing by one encoded amino acid (exons 3, 5 and 6), and three exons of different sizes (exons 2, 7 and 10). In terms of size, the largest difference among the genes is visually presented by the introns: intron (9-10) of NTNG1 is 52.7 times larger than its paralogous NTNG2 intron, while intron (6-7) of NTNG1 is only 1.43 times larger, pointing towards a non-equilibrium process of non-coding element elaboration as the SF of the gene paralogs proceeded. Nevertheless, it can be generalised that on average all NTNG1 introns are several times larger than their NTNG2 analogs (Figure 3A). We have shown above that exons 6 and 7 are differentially used within the brain NTNG transcriptome (Figure 1B-1 and B-2), and to explore their potential contribution to the paralogs' SF we have built identity matrices with these exons excluded and included (but still producing existing in-frame transcripts; Figure 3B-1, left and right panels, respectively). Exclusion of both exons from the full-length transcripts (thus converting NTNG1m to NTNG1a and NTNG2b to NTNG2a, respectively) increases the DNA identity by 2% (a relatively large effect, since both exons together represent only 7.22 and 9.69% of the total coding part of the full-length RNA transcripts NTNG1m and NTNG2b, respectively). This effect becomes even stronger when the proteins encoded by these transcripts are compared (Figure 3B-2). Splicing out the Ukd protein domain (encoded by exons 6 and 7) increases the protein identity by 3.8%, thus indicating that the middle of both genes (and encoded proteins) is substantially more divergent between the two gene paralogs. To corroborate this observation and to explore the importance of other protein parts, we have directly compared the sequences encoded by the full-length transcripts and producing Netrin-G1m and Netrin-G2b (Figure 3C). Similarly to what is shown in Figure 3B-1 and 3B-2, the lowest identity (17.5%) is represented by the Ukd domain (encoded by exons 6 and 7) and by the preceding exon 5 (the 3'-part of the LE1 domain). Two other areas also show a substantially low identity, namely the N-terminus (which includes the protein secretory signal, indicated by an arrow) and the outmost C-terminus responsible for the unique feature of Netrin-Gs, the GPI attachment. Thus, based on the percent identity comparisons among the Netrin-G paralogs, it can be predicted that there are several protein parts potentially contributing to the paralogs' SF.
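The identity values above depend on how percent identity is defined over an alignment. A minimal sketch of one common definition follows, assuming two pre-aligned sequences of equal length with "-" as the gap symbol and counting every column that is not a gap in both sequences. The alignment tool and the denominator actually used for Figure 3 are not stated here, so the function and the toy sequences are illustrative assumptions only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues between two pre-aligned sequences.

    Assumption: columns where both sequences carry a gap are ignored;
    a gap facing a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # uninformative column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not NTNG sequences).
print(percent_identity("ACGT", "ACGA"))  # 3 of 4 columns match
print(percent_identity("AC-T", "ACGT"))  # the gap column counts as a mismatch
```

Applying the same function to an alignment with and without the columns of a given exon is one way to express the kind of "excluded vs included" comparison described above.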
As reported by Seiradake et al. (13), identical gene and protein domain compositions result in an identical structural motif, with differences only in the spatial arrangement of the loops facing the post-synaptic interacting partners of the Netrin-Gs, NGL-1 and NGL-2, respectively (Figure 3D). Loop I binding surfaces alignment (Figure 3C Involvement of the pre-synaptically expressed axon-localised NTNGs in SCZ diagnosis supports the established view of SCZ as a product of distorted trans-synaptic signaling (14), with a recent study also showing that axonal connectivity-associated genes form a functional network visualisable by fMRI (15), and that brain connectivity predicts the level of fluid intelligence (16,17). Both NTNGs have been found to participate in the brain functional connectivity revealed by the parcellated connectome reconstruction (18). Most of the reported disease associations link NTNG1 to SCZ and a variety of other neurologic pathologies (15 in total, Figure 1A-1), while the pathologic associations of NTNG2 (6 in total, Figure 1A-2) are largely limited to those affecting WM or PO. Among them is SLE, frequently characterized by WM deficit (19) and also known to present schizoid-type abnormalities characteristic of autoimmune pathologies (20,21). Immune activation is known to lead to altered pre-pulse inhibition (a key diagnostic trait for SCZ), which is reversed by antipsychotics (22). The three diseases associated with both paralogs (ASD, BD and SCZ) are also a primary focus of the recently initiated PsychENCODE project (23). It is also worth mentioning the resemblance of the reported disease associations to the behavioral phenotypes of Ntng1 and Ntng2 gene knockout mice (24).
A gene content associated with an attenuated IQ score often relates to numerous diseases, such as SCZ, ASD, depression, and others (see (25) for references; (26)), with several of them having also undergone positive selection during human brain evolution (27). Despite the fact that global network properties of the brain transcriptome are highly conserved among species, there are robust human-specific disease-associated modules (28) and human accelerated regions (HARs), highly conserved parts of the genome that underwent accelerated evolution in humans (29). HARs can serve as genomic markers for human-specific traits underlying the recent acquisition of modern human cognitive abilities by the brain (30), but this acquisition also "might have led to an increase in structural instability… resulted in a higher risk for neurodegeneration in the aging brain" (31), rendering our intellectual abilities genetically fragile (32) and resulting in a variety of CDs. The role that genomic context, or epistasis (33), plays in evolution and pathology is manifested by the frequent finding of disease-causing alleles in animals without obvious pathological symptoms for the host (34). Any CD is characterized by general intellectual disability (GID) plus psychiatric symptoms. A behavioral cognitive deficit (BCD) exerted by a genetic perturbation in an animal model organism is a poor match to a human CD per se, due to the very poor contextual resemblance between the human GID and the animal BCD together with the absence of interpretable psychiatric symptoms.
The usefulness of animals as psychiatric models is also compromised by the fact that transcriptome differences among tissues within a species are smaller than those among the homologous tissues of different species (35,36). It is no wonder that compounds that "cure" mouse models consistently fail in human trials (discussed in (37)).
NTNG paralogs brain transcriptome intrinsic complementarity and possible mechanism for the IQ-affecting mutation alleles effect. There is no global change at the mRNA level between healthy subjects and SCZ patients (Figure 1B). This conclusion is supported by previously published works stating that globally altered mRNA expression of NTNG1 or NTNG2 is unlikely to confer disease susceptibility, at least in the temporal lobe (38) and Brodmann's area (39). However, the original paper that was the source of the STG RNA-seq samples found that NTNG1 (but not NTNG2), along with many other genes (>1,000), falls within the group of genes with significant alternative promoter usage ((12): ST6, p<9.05E-10 at FDR<0.5), and that NTNG2 (but not NTNG1) clusters with genes (>700) with a significant alternative splicing change ((12): ST7, p<6.15E-12 at FDR<0.5) when SCZ and controls are compared.
Such a GWAS observation adds an extra layer of complementary regulation to both NTNG paralogs on top of the complementary usage rule described in the Results section for the exons, the unspliced splicing modules they form, and the resulting transcripts (Figure 1B). Based on the available RNA-seq dataset, it was almost impossible to detect RNA matching the positions of the NTNG SNPs used for the IQ testing (ST2c), except for two SNPs located in coding exons (rs2149171 and rs3824574) and rs4915045, located in the non-coding but transcribed area of exon 10 (in 2 out of 16 samples). This fact points towards an indirect effect of the IQ-affecting mutation alleles, potentially associated with the generation, through aberrant splicing factor binding, of shorter (secretable) isoforms (Prosselkov et al., unpublished) lacking two of the most prominent NTNG features: the GPI-link and the Ukd domain.
The GPI-link is a hallmark of Netrin-G family members (40, 41), and lacking it, the aberrant Netrin-G isoforms are likely to mimic the action of their releasable ancestral molecules, netrins, still being able to bind to their cognate post-synaptic ligand, NGL, but without forming an axonal-postsynaptic contact and with potentially dominant-negative consequences. The Ukd domain of Netrin-G1, despite its so far unknown function, is involved in lateral binding to the pre-synaptically localised LAR, modulating the binding strength between NGL-1 and Netrin-G1 (42). Work is currently underway in search of a similar lateral interaction partner for the Netrin-G2 Ukd domain. Inclusion of the Ukd-encoding exons 6 and 7 is regulated by the Nova splicing factor (43), which affects Netrin-G1 exon 7 but not exon 6 in the cortex, while the paralogous Netrin-G2 exons simultaneously exhibit the opposite pattern. In general, it is tempting to speculate that deregulation of NTNG transcript processing may have a role in brain-controlled cognitive abilities and associated CDs. Supporting this notion, a decreased level of Netrin-G1c mRNA (exons 6-9 excluded, Figure 1B-1) has been reported for BD and SCZ (38), with Netrin-G1d (exons 6 and 7 included but 8-9 excluded, Figure 1B-1) and Netrin-G1f (a secretable short isoform consisting of domain VI only and lacking the Ukd and GPI-link) being increased in BD, but not in SCZ, in the anterior cingulate cortex (44).
Higher Netrin-G1d mRNA expression in the fetal brain, but low expression for the Netrin-G1c isoform in the human adult (38), indicates different functionality of these two splice variants juggling the Ukd domain inclusion/exclusion pattern. And, according to our unpublished data, whereas Netrin-G1 Ukd-containing isoforms are the dominant isoforms in the adult mouse brain, Netrin-G2 Ukd-containing isoforms are present only at trace levels (Prosselkov et al., forthcoming), resembling the transcriptome pattern of the human STG samples (Figure 1B-1 and B-2). A similar "dynamic microexon regulation" associated with misregulation of the protein interactome has been reported to be linked to ASD (45).
Synchronous and complementary expression of NTNG paralogs in the human brain supports the IQ-associated cognitive endophenotypes. The influential parieto-frontal integration theory (P-FIT (46)) states that general intelligence ("g") is dependent on multiple brain cortical areas such as dlPFC, Broca's and Wernicke's areas, and the somatosensory and visual cortices (47). Although "g" is widely accepted as the only correlate of intelligence, its unitary nature was challenged by (48), which claimed to have identified two independent brain networks (for memory and for reasoning) responsible for task performance, an idea later criticised for the data processing approach employed (49). Higher IQ scores (a composite surrogate of "g") have reportedly been associated with fronto-parietal network (FPN) connectivity (50,51).
Higher levels of NTNG paralog expression within the areas of the brain intensively loaded with cognition and the distinct patterns of expression profiles (synchronous, asynchronous/mixed, and complementary, Figure 2A) support the associations of NTNG1 and NTNG2 with the recorded cognitive endophenotypes (2). Based on the expression patterning over the human life span, among the total of 16 analysed brain areas we found two falling under the same "anti-phasic (complementary)" classifier (Figure 2C): HIP and MD. Moreover, MD is the only brain area (out of the 16 presented) where the NTNG1 expression level exceeds that of NTNG2, making it a promising candidate for explaining the phenomenon of NTNG SF. Two other brain areas classified by synchronous expression of the paralogs deserve special attention: dlPFC and mPFC (Figure 2A-4). PFC circuitry has been known as a "hub of the brain's WM system" (52,53), which acts through direct HIP afferents (54) and has many connections with other cortical and subcortical areas (55). mPFC may function as an intelligence-control switchboard, and lPFC, part of the FPN global connectivity, predicts WM performance and fluid intelligence (56). Interactions of the auditory recognition information fed by the vPFC stream with the sequence processing by the dorsal stream are crucial for human language articulation (57, 58). The fact that both NTNG paralogs are extensively expressed across PFC (Figure 2A-2 and A-4) pinpoints this area as a key for future molecular studies of NTNGs and human-unique symbolic communication. Moreover, PFC is not only implicated in many psychiatric disorders, including SCZ ((59); see also (55) for references), but is also the only brain structure unique to primates without known homologs in the animal kingdom (60).
Evolution of the protein paralogs encoded by the NTNGs. Forkhead box P2 (FOXP2), a ubiquitously expressed transcription factor that has reportedly been linked to the evolution of human language through the T303N and N325S substitutions when compared to a primate ortholog (61), is 100% identical to the Neanderthal (Nea) protein (62). FOXP2 regulates the expression of multiple genes in human and chimpanzee (63), and among them is LRRC4C, a gene encoding NGL-1 (a post-synaptic target of Netrin-G1) and a representative of the M3 brain gene module responsible for general fluid cognitive abilities (26). Similarly to FOXP2, Netrin-G1 is a 100% conserved protein among the hominins, with only 1 mutation found in chimpanzee, which is absent in marmoset (and other primates) and mouse proteins (2). On the other hand, Netrin-G2 of extinct hominins contains, relative to modern human, a T346A point mutation (as per the current version of hg19), which is also found in primates and mouse and is known as rs4962173 (a dbSNP missense mutation), representing an ancient substitution from Neanderthal genomes found in modern humans and reflecting a recent acquisition of the novel allele around 5,300 yrs BC. Nothing is known regarding the functional significance of this mutation allele, but biochemically a substitution of alanine (A) by a polar threonine (T) could bring an extra point of post-translational modification, e.g. a phosphorylation or glycosylation (NetPhos2.0 (64) assigns a low probability score for T346 to be phosphorylated, but NetOGlyc4.0 (65) robustly predicts it to be glycosylated, see SM). Another mutation, S371A/V, reflects a selective sweep in the Netrin-G2 protein from primates to hominins within a functional context similar to that of T346A, in which a hydrophobic alanine (A, in chimpanzee) or valine (V, in marmoset) is replaced by a polar serine (S), with strong positive predictions for glycosylation but not phosphorylation (see SM). This poses the question of whether these two human-specific protein substitutions are associated with advanced cognitive traits, as they may represent a hidden layer of the so far poorly studied protein glycosylation-associated regulatome known to affect brain function and diseases (66,67). Furthermore, T346 is nested in exon 5 just 20 nt away from the WM score-affecting rs2274855 mutation allele (2), and, together with S371A/V, both are located within the area of lowest percent identity (exons (5-7)) of the Netrin-Gs (Figure 3C), presumably contributing to the SF of the NTNG duplicates. There are at least three more protein parts potentially contributing to the specialised function subdivision of the gene paralogs (based on the low identity scores, Figure 3C): the secretory peptide, the GPI-link, and the outmost structurally elaborated unstructured loops (I-III) responsible for the reciprocal binding of Netrin-Gs to their cognate post-synaptic partners, NGL-1 or NGL-2, both containing a C-terminal PDZ-binding domain (68). An interesting finding was reported in (69): the presence of an SH3(PSD95) domain binding site (required for phosphatidylinositol-3-kinase recruitment) in mouse Netrin-G2 (100% identical to human) but not in Netrin-G1. The detected SH3 binding site overlaps with the Netrin-G2 loop III responsible for the binding specificity to NGL-2 (12,70,71). A plausible working hypothesis would be that while internalised (and being GPI-link naïve/immature) the pre-synaptic Netrin-G2 is bound to SH3-PSD95 via loop III, but as soon as it is secreted extracellularly (and attached to the membrane) it binds to post-synaptic NGL-2. Corroborating this, in the absence of Netrin-G2 in KO mice, NGL-2 is unstable on the post-synaptic surface and is quickly internalised (24). We can only speculate regarding the potential importance of the PSD-95(SH3)-Netrin-G2-NGL-2 scaffolding loop interaction/competition, but the ability of Netrin-G1 to bind to SH3 has not been reported. Following this logic, Netrin-G1 should have a similar binding partner via loop II.
The overall identical structural scaffold among the Netrin-G paralogs (Figure 3D) is likely to represent an anciently preserved one of the primordial protein (encoded by a single gene in the primitive urochordate C. intestinalis), and its contribution to the process of SF among the NTNG paralogs goes against the "structure drives function" concept. It appears that it is not the "structure" but rather the "evolution" itself that drives a selection for the best structural (or unstructural, in our case) fit out of the available frameworks provided by the gene duplicates to fulfill the functional demand that emerged in a new ecological niche. The intricate variability of phenotype is grounded in the conserved nature of genotype and constrained by the "structure-function" limitations of the coding DNA, and is only possible due to the permissive, continuing evolutionary elaboration of non-coding areas able to absorb the most recently acquired elements (having the potential to become regulatory at some point, e.g. like HAR5 (30)) and carried over by neutral drift. At the same time, the multiple protein substitutions coinciding with the SF labor segregation phenomena among the Netrin-G paralogs call their neutral nature into question. Both paralogs undergo purifying selection from mouse to human through the reduction in size of non-coding DNA (introns) and encoded proteins (the mouse Netrin-G2 is 2 aa longer than its human ortholog), further contributing to the host-specific SF. Thus, while the non-coding sequences are used to explore the evolutionary space in time, the restrictive boundaries of the paralogs' SF are determined by the protein (unstructured) elements.

Molecular evolution of the Cognitive Complement (CC). The appearance of the neural crest (72), an event that "affected the chordate evolution in the unprecedented manner" (73), of multipotent progenitor cells (74), and of neurogenic placodes (suggesting chemosensory and neurosecretory activities (75)) in the first primitive urochordates/tunicates coincides with the presence of the Ntng precursor gene (ENSCING00000024925), which later underwent two rounds of duplication events in fish and was found to affect human cognitive abilities (2). NTNG paralogs are expressed in the human neural crest-forming cells, with NTNG2 being expressed 10 times more strongly than NTNG1 (76); both are differentially expressed in human compared to chimpanzee and rhesus monkey, with the NTNG2 expression model showing a stronger probability than NTNG1 (77); and both are more strongly expressed in the human telencephalon compared to chimpanzee and macaque (78). NTNG1 has been classified as a brain module hub gene "whose pattern fundamentally shifted between species" (18). Belonging to distinct modules of brain expression regulation (78,79), NTNGs are classified as "genes with human-specific expression profiles" (79). The nearby gene MED27 (mediator of RNA polymerase II), located ~260 kbp upstream of NTNG2, has been proposed to be associated with the evolution of human-specific traits (80). NTNG1 has also been reported among the "adaptive plasticity genes" (81) potentiating rapid adaptive evolution in guppies (NTNG2 was not found in the input RNA).
Complementarity among the NTNG paralogs and the proteins they encode has been reported previously: a complementary brain expression pattern (in an almost self-exclusive manner) defined by 5'-UTR-localised cis-regulatory elements (82); complementary distribution within the hippocampal laminar structures (83); axon-dendrite synaptic ending resulting in differential control over neuronal circuit plasticity (84); a mutually exclusive binding pattern to the post-synaptic partners, NGL-1 and NGL-2, dictated by the protein unstructural elements (13); alternative promoter usage vs alternative mRNA splicing (12) and an increased coefficient of variation (CV, ST1d) for NTNG1 expression but not NTNG2 in SCZ patients (also reported in (85)); complementarity of KO mice behavioral phenotypes and subcellular signaling partners (24); "differential stability" brain modules expression (NTNG1 is expressed in the dorsal thalamus (M11) as a hub gene (Pearson's 0.92), while NTNG2 is in the neocortex and claustrum module (M6, Pearson's 0.65)) (18); hypocretin neuron-specific expression of NTNG1 (but not NTNG2) as a sleep modulator (86); gating of top-down vs bottom-up information flows in mice and differential modality (Prosselkov et al., forthcoming); and complementation of the human IQ-compiling cognitive domains (2). The current study reports on the complementarity of the NTNGs in their association with the CDs (Figure 1A); an mRNA splicing pattern complementary at the quantitative and qualitative levels via differential use of the middle-located exons (Figure 1B); complementary oscillatory brain expression over the human life span observed in the intensively cognitively loaded brain areas (Figure 2); AE of the paralog-segregated unique non-coding elements (Figure 3A); and a complementary pattern of protein sequence evolution of the orthologs (mice-to-human). Such multi-level complementation is likely to reflect a shared evolutionary origin from a single gene in a primitive vertebrate organism 700 mln yrs ago and its subsequent functional segregation among the evolution-generated gene duplicates in jawless fish, such as lamprey.
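The coefficient of variation (CV) mentioned in the list above is the standard deviation of expression divided by its mean. A minimal sketch follows, assuming the sample (n-1) standard deviation; whether ST1d uses the sample or the population form is not stated here, and the input values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = sample standard deviation / mean (assumes n >= 2, mean != 0)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return stdev(values) / m


# Hypothetical expression values for one gene across samples.
print(coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9]))
```

Because the CV is dimensionless, it allows the spread of NTNG1 and NTNG2 expression to be compared despite their different absolute levels.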
Occupying independent but intercalating functional niches, NTNG1 and NTNG2 do not compensate but complement each other's function, forming a "functional complement" of genes. Half a billion yrs ago the doubled gene dosage led to the gradual SF, manifested in a function complementation within the cognitive domains, at least in human. We would like to coin the term Cognitive Complement (CC) for such a gene pair.

CONCLUSION
The emerged functional redundancy, as an outcome of gene duplication, leads to function subdivision and its bifurcation among the gene paralogs, resulting in the paralogs SF. Functional compensation is known to exist among evolutionarily unrelated genes but has not been reported among gene paralogs, which are more frequently characterized by function complementation. Gene paralogs structural identity (at both the gene and protein levels) does not provide a substrate for function compensation but rather for complementation, perturbing the "structure drives function" rule. A gene duplication event of a tunicate NTNG primordial gene and the subsequent process of its function specialisation (driven by the appearance and evolution of new ecological niches) among the gene duplicates made them SF into distinct cognitive domains in a complementary manner, forming a CC. In our forthcoming work we describe how the function of the mice Ntng genes resembles that of the human orthologs (Prosselkov et al., forthcoming).

MATERIALS AND METHODS
Human brain NTNG transcriptome reconstruction. Relates to Figure 1B and 1C. The original dataset was produced by (12) (E-MTAB-1030), and the downloaded .bam files used for the re-processing are listed in ST1a. All reconstructed transcripts are presented in the ST1d standalone Excel file. Two samples were excluded from the analysis due to a failed "per base sequence quality" measure and a zero expression level for NTNG1a and NTNG1int(9-10), which were otherwise consistently expressed throughout the other samples (ST1b). SAMtools software was used for SNP calling from the available RNA-seq datasets (ST1c). For details refer to SM.
Human brain expression profiling for NTNGs across the life span. The original source of data was www.brainspan.org. All available samples were initially included in the analysis, but two of them were excluded at a later stage (MD for 12-13 pcw and mPFC for 16-19 pcw) due to high deviation (6-7 times) from the mean of the other replicates. The mean expression values for each brain area, as RPKM, were plotted against the sampling age. Profiles were classified visually, taking the trend over all plotted points as an average.
NTNG1 (NTNG1m) and NTNG2 (NTNG2b) full-length mRNA transcripts assembly. Relates to Figure 3B. The human NTNG1m brain transcript has been reported previously (125), and we have also confirmed the presence of its ortholog in the mice brain via full-length cloning (42). Since NCBI contains only its partial CDS (AY764265), we used the RNA-seq-generated exons (Figure 1B) to reconstruct its full length and to generate an ORF of the encoded Netrin-G1m.
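The reconstruction step described above (joining RNA-seq-derived exons into a transcript and deriving an ORF from it) can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the procedure actually used in this study: the exon strings are short hypothetical placeholders rather than NTNG sequence, the function names are invented for this example, and the ORF is simply read from the first ATG to the first in-frame stop codon.

```python
# Minimal sketch: assemble a transcript from ordered exons and read an ORF.
# Toy sequences only; assumed names, not the pipeline used in the study.

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def assemble_transcript(exons):
    """Concatenate exon sequences (5' to 3' order) into one mRNA-like string."""
    return "".join(seq.upper() for seq in exons)


def first_orf(mrna):
    """Translate from the first ATG to the first in-frame stop codon.

    Returns an empty string if no ATG is present. Codons containing
    unexpected characters (e.g. N) are translated as 'X'.
    """
    start = mrna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE.get(mrna[i:i + 3], "X")
        if aa == "*":  # in-frame stop codon ends the ORF
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical exons; note that codons may span exon-exon junctions.
toy_exons = ["GGATGGC", "TAAATT", "TGGTTAGCC"]
toy_mrna = assemble_transcript(toy_exons)
toy_protein = first_orf(toy_mrna)
```

Because the reading frame is only established after the exons are joined, an exon that is skipped or swapped (as in alternative splicing of the middle-located exons) can change every downstream codon; this is why the exon/intron boundaries need to be deduced precisely before the ORF is generated.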
Similarly, human NTNG2b was reconstructed from the RNA-seq dataset and from Ensembl as follows. The exon 5 sequence was deduced from ENST00000372179; the other exons were from ENST00000467453 (no longer available in the current version of Ensembl), except for exon 6, which was deduced by running three independent alignments against the human genomic DNA with the mice 3'-intron (5-6), exon 6, and 5'-intron (6-7)

DISCUSSION
, blue color) shows a high level of conservation (with at least 5 amino acids 100% conserved) among the Netrin-G paralogs, indicating that it is unlikely to be responsible for the cognate ligand binding specificity. Neither Loop II (Figure 5C, yellow color) nor Loop III (Figure 5C, orange color) display a single conserved amino acid shared among the paralogous binding interfaces (as it has originally been described in (13)). Thus the complementary pattern of the pre-postsynaptic interactions mediated via specific Netrin-G/NGL pairs is reflected in the Complementary contribution of NTNG paralogs into human cognitive pathologies.

Figure 3. Human NTNG paralogs DNA and protein sequence comparisons and "structure-function" rule incongruency. (A) Identical gene structures with different sizes of introns. RNA-seq data from Figure 1B were used to precisely deduce the exon/intron junction boundaries. The sizes of exons 1, 10 and introns (1-2) are not indicated due to the length variability observed among the splice transcripts (see ST1a for details). Arrows indicate the location of CNS = conserved non-coding sequences that underwent accelerated evolution in human compared to mice (mCNS) and chimpanzee (chCNS), as per (124); and ASC = anthropoid-specific constrained regions in human compared to marmoset (maASC), as per (121). (B) Identical exonic composition of the longest NTNG encoded RNA paralog transcripts and the corresponding proteins, with a relatively high percent of identity among them, dependent on the included/excluded Ukd domain (B-2) encoded by exons 6 and 7 (B-1). Notably, the protein sequences differ between the paralogs by a higher percentage than the DNA encoding them. The matrices were obtained by GeneJockey II (Biosoft). (C) Protein alignments for the longest human NTNG encoded proteins, Netrin-G1m and Netrin-G2b, with Loops I-III highlighting the binding sites for their cognate post-synaptic binding partners NGL-1 (Lrrc4c) and NGL-2 (Lrrc4), respectively, as determined by Seiradake et al. (13). The arrow indicates the location of a putative secretory cleavage site, as calculated by SignalP (122); the blue rectangle delineates the area of the lowest identity (3'-domain LE1+Ukd domain); ω denotes a point of putative GPI-attachment, as predicted by Big-PI (123). The PSD-95 interaction site via the SH3-binding domain ((69), as determined for mice Netrin-G2) overlaps with the Loop III NGL-2 binding surface. Two stars indicate the modern human (T346A) and hominin-specific (S371A/V) amino acid substitutions (2). (D) Identical structural motif of the Netrin-G1/NGL1 and Netrin-G2/NGL2 complexes as per (13). The figure's reproduction is covered by the Creative Commons license.
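
The percent identity comparison in panel B can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the source: the sequences are taken as already aligned and of equal length, "-" marks a gap, and identity is counted only over columns where neither sequence has a gap. The exact scoring used by GeneJockey II is not described here, and the sequences below are hypothetical toy strings, not NTNG data.

```python
# Minimal sketch of percent identity between two pre-aligned sequences.
# Assumed convention: gap columns ('-') are skipped, not counted as mismatches.

def percent_identity(seq_a, seq_b, gap="-"):
    """Return the percentage of identical positions among ungapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a, seq_b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical example: a single nucleotide change (C -> T) that alters
# one codon (GCT Ala -> GTT Val).
dna_identity = percent_identity("ATGGCTAAA", "ATGGTTAAA")
protein_identity = percent_identity("MAK", "MVK")
```

In this toy case one substitution out of nine nucleotides changes one residue out of three, so the protein-level identity drops further than the DNA-level identity. The same arithmetic shows how paralogs can differ more at the protein level than at the DNA level when substitutions are predominantly non-synonymous.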