CITATION
FULL PUBMED PDF
Pellionisz AJ. The Principle of Recursive Genome Function. The Cerebellum (Springer). 2008;7(3):348-359. DOI 10.1007/s12311-008-0035-y. PMID: 18566877


The principle of recursive genome function.

Pellionisz AJ.
Abstract
Cerebellum. 2008;7(3):348-59. doi: 10.1007/s12311-008-0035-y.

Responding to an open request, the principle of recursive genome function (PRGF) is put forward, effectively reversing two axioms of genomics as we used to know it, prior to the Encyclopedia of DNA Elements Project (ENCODE). The PRGF is based on the reversal of the interlocking but demonstrably invalid central dogma and "Junk DNA" conjectures that slowed down the advance of sound theory of genome function, as far as information science is concerned, for half a century. PRGF illustrates the utility of the class of recursive algorithms as the intrinsic mathematics of post-ENCODE genomics. A specific recursive algorithmic approach to PRGF governing the growth of the Purkinje neuron is sketched, building the structure in a hierarchical manner, starting from primary genomic information packets and in each recursion using auxiliary genomic information packets, cancelled upon perusal. The predictive power of the principle and its experimental support are indicated. It is argued that genomics is no longer an exceptional instance of the applicability of recursion throughout the sciences.

PMID: 18566877 [PubMed - indexed for MEDLINE]


NOTES ON SIGNIFICANCE

1) REMOVAL OF TWO DOGMAS PREVAILING FOR MORE THAN HALF A CENTURY:
"The Principle of Recursive Genome Function"

(Pellionisz, 2008, Google Tech Talk YouTube [~ 16,500 views, correctly predicting the "deluge" of sequences, threatening the sustainability of Industrialization of Genomics, if Sequencing Industry is not met with Analytics Industry])

2) RECURSIVE, SOFTWARE ENABLING ALGORITHMIC APPROACH SUPPORTED BY INDEPENDENT EXPERIMENTAL PROOF OF CONCEPT PAPERS (2012):
"The Decade of FractoGene: From Discovery to Utility-Proofs of Concept Open Genome-Based Clinical Applications"
(Pellionisz, International Conference on Cybernetics, Systemics, Informatics (Hyderabad, 2012 February 15) .pdf in full Supplementary Material)

3) RECURSIVE, SOFTWARE ENABLING ALGORITHMIC APPROACH UNIFIES NEUROSCIENCE AND GENOMICS:
"Recursive Genome Function of the Cerebellum: Geometric Unification of Neuroscience and Genomics"

(Pellionisz et al, 2013 (full pdf) submitted Oct. 20, accepted Nov. 1, 2011 by Springer for Handbook, supplementary material providing theoretical advances and road-map for Analytics Industry)

See more detailed development of FractoGene theory at Professional Webpage of Dr. Pellionisz


See pre-publication overview of the paper below, with clickable references.



The Cerebellum (Springer Verlag, Heidelberg & New York), 2008;7
Received: December 7, 2007
Accepted: December 18, 2007
Corrected Proof Returned: June 12, 2008
Online Publication Date: June 20, 2008

See Full Free Text (in the author's html rendering, with clickable references below)

Supplementary material (free full text)
Press release on paper, fully hyperlinked
Press release on paper, original on PRWeb
Announcement of paper
General context of the Principle on YouTube
Citation of the paper within 6 months after publication 1
Citation of the paper within 6 months after publication 2


The Principle of Recursive Genome Function

ANDRÁS J. PELLIONISZ

Correspondence:
Dr. András Pellionisz,
holgentech@gmail.com

Keywords: Cerebellar - Circuit - DNA - RNA - Proteins

Introduction: Scope of the Call for Rethinking the Axioms of Genomics

Upon publication of the results of the Encyclopedia of DNA Elements (ENCODE) project, a 4-year research effort led by the US Government, its architect issued a mandate: "the scientific community will need to rethink some long-held views" [1].


What views require our revision? The sequencing of the human genome engendered the idea of genomics as information science [2]. New avenues of scientific research open up once a breakthrough out of a decades-old theoretical cul-de-sac leads to theoretical and experimental advances. "DNA has two types of digital information - the genes that encode proteins, which are the molecular machines of life, and the gene regulatory networks that specify the behavior of the genes" [2]. This paper reviews the dichotomy in genomics arising from historical conflicts over genes and gene regulation, and offers a guiding principle for their synthesis. Introduction of this principle is made possible by a long-delayed but now respectful removal of two pragmatic dogmas, replacing them with a sound information-theoretical axiom.

In response to the open request in the post-ENCODE era that welcomes increasingly rigorous theoretical and mathematical foundations, this note comes forward with a principle learned from data in the course of an attempt to mathematize biology. In early efforts (footnote 1), recursive algorithms came to the fore in explaining the function of neural networks (footnote 2). However, since such an idea ran counter to the central dogma of molecular biology [10] that then prevailed in genetics, wherein only a "forward growth" mindset obtained (see Figs. 1, 2, and 3), this author subdued his claim, stating that "establishing a rigorous relation of these 'code sequences' to the genetic code that underlies the morphogenesis of differentiated neurons may be far in the future" [5].

----
1 Neuronal Modeling (cf. [3]) and Artificial Intelligence to Neural Networks (cf. [4]).
2 See a specific recursion on page 365 of a paradigm of Neural Growth: Structural Manifestation of Repeated Access to Genetic Code ([5], paragraph 3.1.3), and a collection of algorithms [6,7,8,9] in [4].


Fig. 1. Nascence of the "Central Dogma of Molecular Biology"; the original concept diagram by Francis Crick in 1956 (not known to have been published, but acknowledged by Crick [61])


Now, however, with the ENCODE report in hand and with the field of neural networks blossoming, the time seems ripe to advance a recursive principle whereby the genome governs growth of organelles, organs and organisms.

In post-ENCODE genomics, the issue of "genome regulation" takes its long-deserved place. The concept, aimed at controlling elements, began with Barbara McClintock in the 1940s. Before the double helix was revealed [11], she discovered transposable elements in maize. She called them controlling elements because they altered gene expression. She published [12] in the same year (1956) in which Crick's central dogma was conceived, wherein key feedback pathways were arbitrarily excluded. Gene regulation advanced to the operon theory of Jacob and Monod [13], for which they received the Nobel Prize in 1965. Jacob reminisced in his Memoirs about "... one of the oldest problems in biology: in organisms made up of millions, even billions of cells, every cell possesses a complete set of genes: how then, is it that all the genes do not function in the same way in all tissues?" [14]. This profound question is examined below.


By 1969, the field of gene regulation was growing at a healthy rate, resulting in the work of Britten and Davidson [15]. However, in 1970 a wound opened up in genomics that has not yet healed. On the one hand, the first major failure of Crick's doctrine was revealed [16], and both Crick and Watson responded, but in different ways [17, 18]. On the other hand, to keep the establishment together, Ohno declared the same year that all but the genetic DNA was garbage DNA [19], later using the slightly modified term junk DNA [20] - a notion which prevailed for a generation.

Fig. 2. Watson's simplified rendering of Crick's central dogma states what is certainly a fact and strips the dogma of its controversial prohibitions ([62], p. 298)

The objective of this paper is to provide a historical review of the bifurcation and to offer a theoretical synthesis to remedy it. With post-ENCODE genomics now removing obsolete impediments, the principle of recursive genome function (PRGF) is expected to evolve rapidly, especially since some workers have been laboring for years in a clandestine fashion, quietly disregarding obsolete views (Pellionisz, 2002; see Fig. 3 and Ref. #18 in [21]).

In a general sense, the profound impact of changing axioms should be outlined, for example in medicine, bioenergy, nanotechnology, and synthetic biology - and even in philosophy. It is possible, as it was in physics [22], and even in neuroscience (Neurophilosophy, [23]), that the reversal of long-held views may have philosophical implications, giving rise to what might be called genome philosophy [24]. For instance, based on the ENCODE results, a synthesis may be necessary to integrate some goal-directed Lamarckian notions of evolution [25] - not confronting, but rather surpassing, the simplification of original Darwinian notions [26] where, as we all learn in school, natural selection suffices for the emergence of species. So saying, we are led to a few key conclusions concerning algorithmic approaches vis-à-vis the ENCODE study [27].



Fig. 3. Gene Paradigm for Forward Growth as of 2003. The left diagram of the double helix with a "genic" region highlighted is modified from the cover of Scientific American (April, 2003). Diagram on the right side is a brain cell. The diagram depicts the oversimplification as if 1.3% of the DNA could determine, in a forward-growth manner, not only the Purkinje neuron shown but, given enough genes, all phenotypes resulting from separate genotypes. The 98.7% of junk DNA is a no man's land to which there is no recursion.



Core idea: The Principle of Recursive Genome Function Reverses the Double Lock on our Understanding of the Double Helix

The main body of this essay aims at two goals. The first and far lesser task is a brief review of history, which has been attempted, and to different extents attained, numerous times before (see brief reviews below); its purpose is to put to rest a long-standing but increasingly controversial theoretical double lock on the understanding of the function of the "double helix" from the viewpoint of what we might call genome informatics. The second, far more important and difficult goal follows from the maxim that "data never kill theories, only better theory can kill less tenable theories": this note should leave no vacuum after the removal of key dogmas. It should instead replace the two discarded, obsolete conjectures (the central dogma of molecular biology and the notion of junk DNA) and, by means of their reversal, synthesize them into a single principle (PRGF) that is more completely grounded in empirical data and withstands more scrutiny from the viewpoint of information theory.

The recursive genome function is expressed as a process in which already-built proteins iteratively access sets of information packets of DNA, first primary and then auxiliary, to build hierarchies of protein structures.

In abstraction, recursion is meant as a process of defining functions in which the function being defined is applied within its own definition.

Applying these postulates to the genome, the most concise formulation is as follows (see equation 1):

Each of the finite states $Z_1, \dots, Z_m$ of the protein system (e.g., the $(n+1)$-th state, denoted $Z_{n+1}$) relies on the previous state of the protein system (e.g., $Z_n$) through a recursive function $f$. The process is bounded by the maximal number of states ($m$): the function $f$ is executed on the $n$-th state $Z_n$ to yield $Z_{n+1}$:

$$Z_{n+1} = f(Z_n), \qquad n = 1, \dots, m-1 \tag{1}$$

The diagram below (see also Fig. 4) pictures, in simplest terms, the principle of recursion of genomic function. Here we see the cardinal role that the main path of recursive processes plays in the construction of protein systems:

DNA>RNA>PROTEIN>DNA>RNA>PROTEIN >DNA> ...

These recursive feedback processes then snowball into evolving (protein) structures, governed by DNA.
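
As a minimal illustration of the bounded recursion described above, the following Python sketch steps a state through a function f until the maximal number of states m is reached. The names grow and double, and the doubling rule itself, are assumptions made purely for illustration; the text does not specify the form of f or of the states.

```python
from typing import Callable, List, TypeVar

State = TypeVar("State")


def grow(z: State, f: Callable[[State], State], n: int, m: int) -> List[State]:
    """Return the states Z_n, ..., Z_m, where each new state is f(previous state).

    The recursion is bounded: it stops once the m-th state has been reached.
    """
    if n >= m:
        return [z]
    return [z] + grow(f(z), f, n + 1, m)


def double(z: int) -> int:
    """Hypothetical stand-in for the recursive function f (illustration only)."""
    return 2 * z


if __name__ == "__main__":
    # Start from Z_1 = 1 and allow m = 5 states in total.
    print(grow(1, double, 1, 5))
```

The point of the sketch is only that every state is computed from its predecessor and that the process ends after a finite number of steps.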

Purkinje brain cells provide an illuminating example of building a protein structure by means of an L-string replacement recursive algorithm [28]. The application of that algorithm is given elsewhere [5]. Experimental support of the quantitative predictions of the recursive approach is also readily available [21].
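
To make the idea of string-replacement recursion concrete, here is a minimal sketch in Python of an L-system style rewriter. The single branching rule used here (F becomes F[+F][-F]) is a hypothetical textbook-style example chosen for illustration; it is not the rule set applied to the Purkinje neuron in [5], which is not reproduced in this text.

```python
from typing import Dict


def rewrite(axiom: str, rules: Dict[str, str], depth: int) -> str:
    """Recursively replace every symbol that has a rule, `depth` times over."""
    if depth == 0:
        return axiom
    # Symbols without a rule (brackets, turn markers) are copied unchanged.
    expanded = "".join(rules.get(symbol, symbol) for symbol in axiom)
    return rewrite(expanded, rules, depth - 1)


# Hypothetical rule: each segment F sprouts two side branches.
BRANCH_RULES = {"F": "F[+F][-F]"}

if __name__ == "__main__":
    for depth in range(3):
        print(depth, rewrite("F", BRANCH_RULES, depth))
```

With this rule the number of segments triples at each recursion, which shows how a short rule applied repeatedly can specify a large branching structure.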

Readers will note that the PRGF is consistent not only with the recursive algorithms used in neural networks but is also conceptually akin to a particular recursive formula, viz., the iteration that defines the Mandelbrot set (see equation 2 below, from [29]):

$$Z_{n+1} = Z_n^2 + C \tag{2}$$

Further, fractal sets (see also the Julius Ruis set [30]) are representatives of just one class of recursive algorithms.
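
For readers who wish to see the Mandelbrot recursion in executable form, the sketch below iterates the standard rule in which the next value is the square of the current value plus a constant C, starting from zero. The function name escape_time, the iteration cap of 50 and the escape bound of 2 are conventional choices assumed here for illustration; they do not come from the paper.

```python
def escape_time(c: complex, max_iter: int = 50, bound: float = 2.0) -> int:
    """Count iterations of z -> z*z + c (from z = 0) before |z| exceeds bound.

    Returns max_iter if the orbit stays within the bound for every iteration,
    which is the usual numerical test for membership in the Mandelbrot set.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > bound:
            return n
        z = z * z + c  # the recursive step: each value depends on the previous one
    return max_iter


if __name__ == "__main__":
    for c in (0j, -1 + 0j, 2 + 2j):
        print(c, escape_time(c))
```

Points whose orbit never escapes are treated as members of the set; the analogy drawn in the text concerns only the recursive form of the rule, not any specific numerical value.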

The postulated principle of recursive genomic function opens new avenues by way of a class of recursive algorithmic functions. Just one (fractal) example is given to show that the formulation of experimentally testable hypotheses for genomic function is plausible, and supported by experimental results [21, 31].


Deep Background: Recursion is a Well-accepted Process in Science; Why Should it Not Become a Principle of Genome Function?

Recursion is a well-established concept in the sciences, ranging from pure mathematics to biological neural nets (cited in the Introduction). A common linchpin between the two given domains is the least squares algorithm, which minimizes errors; see the recursive mathematical and neural network basics [32]. From the viewpoint of information theory, it is particularly notable that informational recursion from proteins (that are exposed to the external world) drastically alters the conception of genomic function, from a closed system (to which the second principle of thermodynamics applies, with entropy increasing) to a system open to the world. Note that such recursion - involving outside factors - helps resolve the paradox between random mutation and natural selection theory, where it is questioned whether the genome, featured as a closed system, can cope with an out-of-bounds increase of genomic entropy [33], once we consider that entropy can be regularized, given an open system [34]. Genomic function has hitherto been a strange exception to the widespread modeling of living and non-living systems in terms of recursion. This singularity is especially peculiar since great physicists of the last century already predicted that our times would become the century of biology, and their physics-minded thinking, as given in Wiener's Cybernetics [35], explicitly invoked feedback as a primary principle in animal and machine. Schrödinger's What is Life? [36], von Neumann's The Computer and the Brain [37] and Szilárd's A theory of aging [38] argued in unison that information-theoretic aspects would become key to a future understanding of biology. However, biology is a very young science - it is a mere 231 years since its coinage [39]. Genetics, as we knew it in pre-ENCODE genomics, just slightly exceeds a single century [40, 41]. Thus, the mathematical rigor that has characterized physics for over two millennia since Aristotle ([42], ca. 400 B.C.) could not be hastily enforced on unripe subjects who were, moreover, for a long time somewhat unready and occasionally unwilling.

The above does not mean, however, that recursive algorithms have not been applied in conjunction with genomic systems, e.g., for extrinsic description and construction. In fact, the plethora of extrinsic applications makes it curious that the ideas forming PRGF have not heretofore overcome resistance and been declared a breakthrough, a fact attributable to doctrines protected by the bulwark of scientists who underwrote pre-ENCODE genomics.

Recursive algorithms have encircled genomics, and not only in respect of neural networks. Noted representative examples are genetic algorithms [43]; recursive PCR [44]; algorithms using DNA sequences as templates for encryption [45]; construction of DNA structures by recursive algorithms [46, 47]; and reconstruction of the genome by recursive assembly [48, 49, 50, 51].

The above encirclement of genomics by recursive approaches greatly facilitates a dignified removal of both the central dogma and junk DNA conjectures. A breakthrough from Fig. 4A to B completes the liberation of post-ENCODE genomics, to better and more fully embrace the principle of recursion. Significantly, the inherent mathematics of genomic informatics would no longer be perceived as running against the establishment.

Fig. 4. PRGF breaks through the Double Lock of central dogma and junk DNA barriers (shown in A by triple lines), to yield PRGF (shown in B by checkered circle). The background figure in both A and B (from Fig. 2 of Crick, 1970; shown in this paper as the right side of Fig. 5) permits only a "forward growth" from DNA that dead-ends in proteins. By removing the central dogma, which arbitrarily forbids information feedback from proteins to DNA, and by disposing of the "Junk DNA" conjecture, which claimed that (even if there was a path back to DNA) zero information would be found in the "Junk DNA", a main recursive path (PRGF, checkered circle and arrows) is not only available in principle - but it is the principle of recursive genome function.

Obituaries of the central dogma and junk DNA are offered elsewhere. In brief, the demise of the obsolete axioms has been a yearly event in recent times, as they were summarily refuted by leaders [52, 53, 54, 55, 56, 57]. Specific factual anomalies contradicting doctrine have been reported for decades (see in a separate section). One should appreciate that, even at its outset, serious reservations were voiced; see Jacob, e.g., in his Memoirs (p. 288 in [14]). Jacob and Monod [13] provided Nobel Prize-winning evidence, within half a decade of Crick's concept, that operon regulation exerts a demonstrable feedback on DNA gene activity. Likewise, the junk DNA misnomer was summarily voided of its scientific validity as recently as in the suggested formal abandonment of the term as a scientific notion (International PostGenetics Society, 2006; a paper of 20 Founders, rejected without review), and later in the ENCODE report stating that "the DNA is pervasively transcribed" [58]. The conjecture was finally put to rest by Mattick [59].

In all fairness, upholding vague and even controversial axioms in the nascent stage of biology (compared to physics, which is over ten times older and thus much more mathematical) was a necessity dictated by practical constraints. This is perhaps best described by Brenner in his Nobel Lecture [60]:

"In 1985, when the first suggestions were made to sequence the human genome, I thought that the sequencing techniques, even with incremental improvements, would not be equal to the task, and would require a factory scale operation to do it. I had also come to the conclusion that most of the human genome was junk, a form of rubbish which, unlike garbage, is not thrown away. My view at the time was that we should treat the human genome like income tax and find every legitimate way of avoiding sequencing it. ... I was puzzled by the enormous variations in the amounts of DNA in different organisms. Indeed, whereas most physicists thought that organisms did not have enough DNA to specify their complexity, it was clear to me that many organisms had too much. I discovered from Hinegardner that one group of fish, the Tetraodontidae, which included the Japanese pufferfish, Fugu, had very small genomes, with a haploid content of about 400 megabases as opposed to the 3000 megabases of mammalian genomes. Although teleost fish are distant from humans they are still vertebrates, with the same body plans, development and physiological systems as ourselves. Because of these basic similarities it seemed unlikely that Fugu, with a haploid DNA content one eighth that of mammals would have eight times fewer genes, making it much more probable that what was missing in Fugu was junk DNA." [60].

This pragmatic consideration is also explicit in Crick's 1970 revision ([17], cf. his Fig. 1, reproduced on the left side of Fig. 5 of this paper).

"The principal problem could then be stated as the formulation of the general rules for information transfer from one polymer with a defined alphabet to another. This could be compactly presented by the diagram of Fig.1. [of Crick, 1970] (which was actually drawn at that time [1958], though I am not sure that it was ever published) in which all possible simple transfers were represented by arrows. The arrows do not, of course, represent the flow of matter but the directional flow of detailed, residue-by-residue, sequence information from one polymer molecule to another. Now if all possible transfers commonly occurred it would have been almost impossible to construct useful theories. [Emphasis added, Pellionisz]. Nevertheless, such theories were part of our everyday discussions. This was because it was being tacitly assumed that certain transfers could not occur. It occurred to me that it would be wise to state these preconceptions explicitly" [17].

Fig. 5. (From Figs. 1. and 2. of Crick 1970). Left side permits an infinite number of possible recursive paths. Right side arbitrarily prohibits the main path of recursion by "dead-ending" proteins. The DNA>RNA>PROTEIN>DNA recursion is just a single obvious recursive path, and the fractal approach already elaborated to some extent is just one of the possible recursive algorithms.

The saga (life, death and obituary) of junk DNA is treated elsewhere. Suffice it to say here that both the central dogma and the interlocking junk DNA dogma were finally and officially put on hold only by the conclusions of the ENCODE pilot project, on June 14th, 2007 [58].


A Historical Recount: Specific Review of the Nascence and Demise of the Central Dogma and Junk DNA Conjectures

The central dogma in its various renderings held that transfer of information from proteins and RNA back to DNA never happens. The central dogma of molecular biology was put forward in Crick's talks from 1956 (Fig. 1, cf. recollection of Jacob, [14], p. 286), and published two years later [61].

This concept might be called the "don't look back" or "no feedback permitted" postulate. Further, proponents of junk DNA [20] claimed that, even if a process could be found, a recursion from proteins and/or RNA to DNA could not retrieve information from functionless junk DNA (that is, 98.7% of the human DNA). This conjecture may be called the "even if you look back, you find only junk" postulate. Recursion for information was not only forbidden but in addition was pre-judged as useless because of an assumed void of information.

It may be Watson's reduced version of the central dogma [62], wherein he emphasized what was surely found in genomic function and avoided needless and unsupported prohibitions, that helped the central dogma receive common acceptance from 1965 to 1969-1970, skirting sharp personal criticisms that were already present at its conception (e.g., questions as to its dogmatic stance; see Jacob's Memoirs, [14] p. 288).

A factor in the prevalence of the central dogma during this period might be that, since the 1965 Nobel Prize for Jacob and Monod's work on operon regulation [13], it had to be evident to Watson that, given the operon regulation of gene expression (as a function of the level of produced proteins), the prohibition of a protein-to-DNA information channel need not be in Watson's textbooks. His simplified and convenient view (avoiding controversial prohibitions and their refutation by data) is called here the concept of forward growth - which is, without recursion, only half the loop.

This view, backed by Watson, prevailed so strongly that even the 50th anniversary issue of Scientific American, celebrating the discovery of the structure of DNA by Watson and Crick (1953), depicted as the general understanding that gene expression of DNA results, through RNA, in the construction of protein structures (see Fig. 3, a composite illustration of the modified cover page figure of Scientific American of April, 2003, where on the right side the model of a Purkinje brain cell is pictured, from [5]).

Reality was, as might be expected, much more complicated than any simplification. As early as 1969, in the Britten and Davidson theory of gene regulation [15], and by 1970 (cf. philosophical reflection by Darden [63]), the central dogma was squarely confronted by the discovery of reverse transcription from RNA to DNA, in what were later called retroviruses [16, 64].

Both Watson and Crick responded promptly but separately to the challenge posed by the discovery of the new enzyme that flagrantly violated their views. In the June 27, 1970 issue of Nature, the reverse transcriptase discovery was announced, and an anonymous "News and Views" article claimed: "Central Dogma Reversed".

(Quote from Darden [63]): "Watson, in the 1970 second edition of his Molecular Biology of the Gene [18], said: 'The concept of a DNA provirus for an RNA virus is clearly a radical proposal. It overturns the belief that flow of genetic information always goes in the direction of DNA to RNA and never RNA to DNA. [Emphasis added, AJP] On the other hand, it offers an even greater variety of ways for cells to exchange genetic information. Considering the enormous complexity of biological systems, it would not be surprising if this device were uniquely advantageous in some situations.'" ([18], pp. 621-622)

Crick (1970) also responded immediately to the challenge [17] but in a different way. (Unfortunately, the dogma wasn't allowed to gracefully expire.) Crick published a paper in Nature. His version of the central dogma, he contended, had not been reversed, as the anonymous Nature article had claimed. Crick stated, correctly, that in 1958 he had framed the central dogma in terms of the general transfer of information from nucleic acids to protein - but not the reverse (Crick 1958). That abstract claim had not yet been challenged. If it were shown that information could flow from proteins to nucleic acids, he said, then such a finding would "shake the whole intellectual basis of molecular biology" ([17], p. 563). (Quote from Darden [63], emphasis added, Pellionisz).

Thus, by 1970 the intellectual split between Crick, the originator of "The Central Dogma" (1956, [61]) and its promoter Watson (1965, [62]), threatened the collapse of the genomics establishment. The shaky ground of "The Central Dogma" was not really firmed up by the confession: "Dogma was just a catch phrase" (Crick, quoted in [65]).

In the same year as the split, Ohno's junk DNA idea came to the rescue. Ohno first referred to garbage DNA in the human genome ([19], p. 62). The implication was that, even if there were recursive information access to DNA from proteins or from RNA, there was supposedly no information to be found and retrieved in, e.g., the intronic and intergenic regions. Although the term "garbage DNA" was floated in 1970, it did not take hold; by 1972, in his presentation, Ohno began using the more suitable term "junk DNA", which did stick [20]. One should appreciate that, immediately after his presentation, the first person to rise in the discussion vehemently objected to the basis of the junk DNA conjecture (see "Discussion" by Boyer, in [20]): "It thus seems to me that the permissible number of structural loci is - as yet - a somewhat suspect way to arrive at figures of 1% structural utility to 99% junk."

Why is it that Nobel Prize-winning experimental work (cf. [66], Jacob's Nobel Lecture on his Prize with Monod, 1965) was available as early as within five years after Crick's conception of his dogma - yet no theoretical confrontation developed? Their operon regulation [13] clearly demonstrated that the protein level, viewed as a result of genetic activity, did have an information-feedback mechanism on the genes in the DNA, such that down- or up-regulated DNA-RNA activity resulted, in accordance with the amount of protein already generated:

"Experiments on genetic transfer by conjugation not only led to a revision of the concepts on the mechanisms of information transfer which occur in protein synthesis; they also made it possible to analyze the regulation of this synthesis. ... the operator is not transcribed into messenger and repression can be exerted only at the level of DNA. ... Gene expression was then usually believed to consist in the accumulation of stable structures in the cytoplasm, probably the RNA of ribosomes, which were assumed to serve as templates specifying protein structures... Such a scheme, which can be summarized by the aphorism 'one gene-one ribosome-one enzyme', was hardly compatible with an immediate protein synthesis at maximal rate." [13]

In retrospect, and judging from Jacob's Memoirs [14], it seems evident that Jacob was fully aware of the intellectual conflict between the Jacob-Monod finding and Crick's central dogma, even at its birth (not shying away from direct criticism of the label "dogma", however):

"In an acute sense of publicity, to baptize Central Dogma - that is to say, incontestable truth - a hypothesis that was unsupported by any serious argument." ([14], pp. 288)

However, it appears that Jacob did not directly confront Crick on the latter's conception of the dogma [61], since that was prior to the publication of the Jacob-Monod operon concept. Jacob and Monod (1961) published their "operon regulation" work soon after, but did not receive their Prize until 1965, whereas Watson and Crick received their award in 1962, the year following that publication. Thus it was arguably more politic to avoid a direct conflict among the foursome. Almost simultaneously, however, in 1965 Watson [62] distanced himself from the central notion by putting forward his "simplified version" emphasizing what was undeniably true, although he did not dwell upon issues already controversial (see Fig. 2).

The point of this paper, however, is neither iconoclastic (discarding pragmatic doctrines while simply leaving a theoretical void in their place), nor to merely cite evidence for the widely reported means of feedback processes from proteins to DNA (and RNA to DNA). The point is to fill the void. PRGF is proposed to break through the "double ceiling" of the central dogma and junk DNA that impeded theoretical advances for half a century.


Theoretical and Factual Breakdown of the Central Dogma and Junk DNA

Further flogging of two dead horses is avoided as much as possible. This abbreviated section merely supplies evidence, gives credit where most needed and points to the most powerful reviews.

From the theoretical viewpoint of informatics, the "double lock" on recursive information has long been suspect. First, there is no more information for hereditary material than that present in the DNA. In humans, if 98.7% of the DNA is arbitrarily closed to access and in addition its information voided, the remaining information that 1.3% of the (human) genome harbors is deemed simply insufficient to govern development of such advanced organisms as vertebrates. It was painfully experienced, for instance, that when constructing a computer model of 1.68 million brain cells of the frog cerebellum, algorithmic approaches had to be invoked, rather than pretending that an impossible amount of information was available to specify the vast neural network in its every detail [67].

As for factual contradictions, the array of evidence - against both the central dogma and junk DNA conjectures - is staggering; see respective reviews [52, 53, 56].

Beyond the early factual evidence belying the validity of the central dogma ("Operonic Regulation by Feedback from Proteins to DNA"), another large assortment of facts is available, as follows.

For an account of the (forbidden) information-transfer from RNA back to the DNA, see the very recent review by Mattick [59]; with background about the RNA world [68, 69, 70, 71, 72].

Major issues are gene silencing (or "turning genes on and off", e.g., by so-called LINE way stations, and switching via SINE-s). It is noteworthy that the PRGF is fully consistent with the currently vague notion of "turning genes on and off" - but goes further: by invoking recursion, not only are the sign(s) of the "parallel feedback" meaningful, but the much more important information is the set of algorithmic values of the recursive signals [73, 74, 75, 76].

For the "forbidden" protein-to-DNA interaction, see the work on protein binding with the DNA, and methylation of the DNA by proteins - rendering DNA transcription reversibly or permanently impossible [77, 78, 79, 80, 81, 82].

For the (also forbidden) protein-protein interactions see Prions [83] and a detailed and philosophical review [52].


A Specific Process for Purkinje Neuron Growth Governed by a Recursive Genomic Function

In its simplest form, PRGF can be metaphorically described as the manner whereby an assembler employs a user's manual with respect to a streaming supply of parts. First, the assembler looks at Step 1 of the manual, and, operating according to instructions in the primary information packet, the assembler puts together the indicated components, taken from the supply of parts.

Next, the assembler compares the emerging structure to Step 1 as depicted in the instruction manual by referring to the primary source of information (in our case, the DNA, the "genes"). It is noteworthy that it is useful for the assembler to mark as done the just completed instruction step, in order to avoid its repetition by mistake.

Next, the assembler proceeds to Step 2 in the instructions. Accessing the next auxiliary information packet, the assembler puts together the indicated components into the next layer of the hierarchy. Comparing the emerging structure in the second hierarchy with Step 2 by looking back at the manual, the assembler marks Step 2 as done.

The process goes on in a recursive fashion through the manual, until all of the finite number of steps are taken. The assembler then runs out of instructions (all instructions are marked done).
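The assembler metaphor above can be sketched in a few lines of code. This is a minimal illustration only, not a model taken from the paper: the names `Packet` and `assemble`, the list-of-parts representation and the part names are all hypothetical, and the `done` flag merely stands in for marking a step as completed.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """One step of the manual: a hypothetical information packet."""
    name: str
    parts: list
    done: bool = False  # set once the step has been carried out


def assemble(manual, structure=None, step=0):
    """Work through the manual recursively, one packet per recursion."""
    if structure is None:
        structure = []
    # Base case: the assembler has run out of instructions.
    if step == len(manual):
        return structure
    packet = manual[step]
    if not packet.done:
        # Put together the indicated components as the next layer.
        structure.append(list(packet.parts))
        # Compare the emerging structure with the manual, then mark the step done.
        assert structure[-1] == packet.parts
        packet.done = True
    return assemble(manual, structure, step + 1)


# Illustrative manual: one primary packet followed by two auxiliary packets.
manual = [
    Packet("primary", ["stem", "left branch", "right branch"]),
    Packet("auxiliary-1", ["twig", "twig"]),
    Packet("auxiliary-2", ["spine"]),
]
layers = assemble(manual)
print(len(layers), all(p.done for p in manual))
```

Running the sketch a second time on the same manual builds nothing, since every step is already marked done; this is the toy counterpart of avoiding the repetition of a completed instruction.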

For the DNA, RNA and protein (and so on), in the circular chain depicted in Figs. 4 and 6, the term "recursion" means that, by reversing the central dogma, a main path opens for recursive algorithms to be applied as the intrinsic mathematics of genome function. That is, not only is an information transfer back from proteins to DNA allowed, but this feedback mechanism is (again) relied upon as the enabling feature of a circular process, thus:

DNA > RNA > PROTEIN > DNA > RNA > PROTEIN > ...

Further, by reversing the notion of junk DNA, the principle of recursion postulates that the above main recursive path accesses functional (as opposed to junk) DNA, viz., information from intronic and intergenic regions that were formerly regarded as useless.




Fig. 6. Sketch of recursive genomic government of the growth of the Purkinje neuron. Starting from a highlighted primary information packet, a Y-shaped protein template is built by the "forward growth" process in accordance with the simplified (Watson) picture, through transcription of DNA to RNA and, in turn, RNA assembling amino acids that form a structural protein. During the construction of the Y-shaped template, the primary gene is in a "turned on" condition. Thus, the most primitive primary part of the process retains Watson's simplified scheme. In other words, the postulated process does not contradict the process of "DNA makes RNA that makes proteins", but goes beyond it by violating both the prohibition of feedback mechanisms and the notion of junk DNA. In each recursive step, the perused auxiliary information packet (formerly "junk DNA" or "regulatory DNA") is cancelled (methylated) upon perusal.


PRGF claims that the overall function is expressed by a recursive process determined and governed by repetitive access to information packets contained in the DNA through the channel depicted in the schema above. The postulated process is active and bounded. By active, it is meant that information packets accessed may be rendered inaccessible (in a reversible and/or permanent manner, by de novo methylation); by bounded, that the consumption of information governing growth leads to an eventual death of the organism.

Despite tremendous advances, the full genomic government of Purkinje cell assembly still remains largely unknown [84, 85]. This paper illustrates the applicability of PRGF in a sketch of the development of this brain cell (cf. [5, 21]). It is expected that the recursive framework provided will contribute to further advances in revealing genomic government of the development of Purkinje neurons and other structures, making it an eminent platform for post-ENCODE genomics.

To recapitulate and expand, Fig. 6 pictures the recursive process as follows:

Structural proteins are generated by a DNA primary information packet (pre-ENCODE "gene"), growing a Y-shaped template. Completion of proteins of Step 1, however, is not a dead end, as formerly asserted by dogma. Completion may be reported by a completion marker protein that, in its simplest version, binds with the DNA. Specifically, the completion marker protein can "turn off the gene". Completion of this first step shuts down the first stage of growth. Evidence is available for microRNAs (and interfering microRNAs, small inhibitory RNAs (siRNA); see [86, 87]) that can signal completion by turning off the primary information packet (formerly "gene").

The auxiliary packet of information (formerly "junk" DNA or regulatory DNA) is turned off by de novo methylation upon perusal of retrieved information; each such auxiliary information packet, once perused, is rendered temporarily or permanently unreadable. This provides a framework to explain the oldest problem in biology, the fact that the differentiated cells of an organism are no longer totipotent. Their methylation pattern permits specific and limited further growth, as much as permitted by the remainder of unmethylated auxiliary information to be accessed by further recursion. This framework is consistent with the argument of the theory of aging [38], the proposition that genetic damage leads to progressive degradation of the ability to make necessary proteins.
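The bounded character of the process recapitulated above can be illustrated with a toy recursion. The following is a sketch under stated assumptions, not the author's model: the function names `grow` and `count_tips`, the nested-tuple representation of a Y-shaped branching, and the rule that each level of recursion peruses exactly one auxiliary packet (in parallel across all branches of that level) are hypothetical simplifications.

```python
def grow(depth, packets, methylated):
    """Return a nested Y-shaped tree; each recursion level peruses one packet."""
    # Bounded: growth stops once no unread auxiliary packet is left for this level.
    if depth >= len(packets):
        return "tip"
    # The packet is cancelled ("methylated") upon perusal.
    methylated.add(packets[depth])
    # One Y-shaped template: two daughter branches, each grown by further recursion.
    return (grow(depth + 1, packets, methylated),
            grow(depth + 1, packets, methylated))


def count_tips(tree):
    """Count the terminal branches of the nested tree."""
    if tree == "tip":
        return 1
    return sum(count_tips(branch) for branch in tree)


aux_packets = ["aux-1", "aux-2", "aux-3"]  # illustrative packet names
methylated = set()
tree = grow(0, aux_packets, methylated)
print(count_tips(tree), sorted(methylated))
```

With three auxiliary packets the toy tree ends in eight tips, after which every packet is marked as methylated and no further growth is possible; a richer sketch would let the content of each packet shape the branch rather than merely permit it.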


[Cauliflower Romanesca enlarged, in color - not in the paper, Pellionisz]

Fig. 7. An example of a recursive-looking organism (Cauliflower Romanesca). It is possibly grown by a Lindenmayer L-string replacement recursive algorithm, e.g. governed by the DNA > RNA > PROTEIN > DNA ... recursion, a massively parallel process executed repeatedly.
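For readers unfamiliar with Lindenmayer string replacement, a minimal sketch follows. It uses the textbook two-symbol "algae" system (A -> AB, B -> A) commonly attributed to Lindenmayer, not any rule set proposed in this paper for the cauliflower; the function name `lsystem` is an illustrative assumption.

```python
def lsystem(axiom, rules, generations):
    """Recursively rewrite the string; every symbol is replaced in parallel."""
    if generations == 0:
        return axiom
    previous = lsystem(axiom, rules, generations - 1)
    # Symbols without a rule are copied unchanged.
    return "".join(rules.get(symbol, symbol) for symbol in previous)


# Textbook "algae" rules: A -> AB, B -> A.
rules = {"A": "AB", "B": "A"}
for n in range(6):
    print(n, lsystem("A", rules, n))
```

Each generation applies the same two rules to all symbols at once, which is the sense in which such a replacement scheme is both recursive and massively parallel; the string lengths in this example follow the Fibonacci sequence (1, 2, 3, 5, 8, 13).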


Generalizations of the Principle of Recursive Genome Function

PRGF obviously transgresses the once forbidden feedback mechanism and also relies on auxiliary genomic information-packets - previously regarded as junk. It is noteworthy that, prior to the publication of Cybernetics [35], which made feedback mechanisms a most conspicuous aspect of biological processes, most philosophies in biology (including traditional evolution, [26]), assumed a similar, rudimentary forward-growth process via random mutations and natural selection. In view of the interaction of protein structures (organisms) with the environment, recursion would seem to enable non-random development.

Another comment towards generalization is that the depicted recursion, not unlike neural reflex arcs in the early history of neuroscience, at first glance looks like a chain [3]. However, just as with neural networks, genomic recursion is inherently parallel, since the recursion is not limited to a single primary information packet and its auxiliary information packets. In the case of genomic function, the three main layered sets of elements (DNA, RNA and proteins) operate in a massively parallel manner, inviting neural network algorithms (see Introduction).

In this context, it is noteworthy that there is no separate or even separable operating system [55] as the recursive genome is self-governed (unsupervised). In fact, the view here is much in agreement with the concept first touched upon in "What is life?" [36]:

"But the term code-script is, of course, too narrow. The chromosome structures are at the same time instrumental in bringing about the development they foreshadow. They are law-code and executive power - or, to use another simile, they are architect's plan and builder's craft - in one." [36]

Von Neumann [37] also stated his principle that there is no difference between the two kinds of information (code and data) so far as its repository (memory) is concerned. Primary information packets ("genes") and auxiliary information packets are all nucleotide sequences; it is the PRGF recursive process of access that distinguishes them.

Readers familiar with the "fractal approach" [21] may note the applicability of PRGF to that particular paradigm, pursued by this author to describe the fractal development of a Purkinje cell in the "Pre-ENCODE wilderness years". The fractal approach is especially encouraged by the measurements showing that "pyknon-like elements" [88] of the whole genome of Mycoplasma genitalium [89, 90] apparently follow the Zipf-Mandelbrot Parabolic Fractal Distribution (personal communication by Pellionisz in [31, 91]), which also point to a Pareto distribution, that is, a truncated Taylor series of the Zipf-Mandelbrot Parabolic Fractal Distribution.
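The relation between the two distributions mentioned above can be written out explicitly. The following uses the standard textbook form of the parabolic fractal distribution; the symbols f (frequency), r (rank) and the coefficients b and c are notational assumptions introduced here, not taken from the measurements cited.

```latex
% Parabolic fractal distribution: log-frequency is quadratic in log-rank
\log f(r) = \log f(1) - b \log r - c \, (\log r)^2

% Truncating the expansion in \log r after the linear term (c = 0)
% leaves a Pareto (power-law) distribution
\log f(r) = \log f(1) - b \log r
\quad\Longleftrightarrow\quad
f(r) = f(1) \, r^{-b}
```

On a log-log plot the Pareto form is a straight line, while the parabolic fractal form bends away from it as the rank grows.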

Readers will also note the generalization that PRGF opens a way not only for Fractal Recursive Iterations (e.g. a fractal DNA resulting in the fractal structure of the Romanesca vegetable photographed for illustrative purposes in Fig. 7) but to an entire class of recursive algorithms, with the recursion in principle certainly not limited to the main recursion via DNA > RNA > PROTEIN > DNA > (etc.) but possibly involving an infinite number of recursive loops pictured in Fig. 5A (Fig. 1 of [17] reproduced).

Just for one more intricate example:

DNA > RNA > DNA > RNA > PROTEIN > PROTEIN > RNA > DNA > RNA > PROTEIN >

From the viewpoint of an overriding pragmatism of genomics, one may note that the theoretically infinite possible variations of pathways of recursions in Fig. 1 of Crick [17] justifiably frightened workers (including Crick [17]) away from the theoretical problem in 1960-1970, who rushed to reduce genomic function to a 2-step procedure (transcription and translation; [62]) and also put a second lock on recursion by way of the junk DNA conjecture. It has long been known in physics that the two-body problem of two masses interacting can be described exactly, while the three-body problem presents formidable mathematical challenges. Crick, who later ventured into neural networks himself, did not seem to favor heavy use of mathematics [23].
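The growth in the number of conceivable pathways can be made concrete with a short enumeration. This is a sketch only, assuming that every one of the nine transfers among DNA, RNA and protein (including self-transfers) is treated as a permissible single step; the function name `pathways` is hypothetical.

```python
from itertools import product

MOLECULES = ("DNA", "RNA", "PROTEIN")


def pathways(steps, start="DNA"):
    """All conceivable transfer chains with the given number of steps."""
    return [(start,) + tail for tail in product(MOLECULES, repeat=steps)]


# The number of chains triples with every additional step.
for steps in range(1, 6):
    print(steps, len(pathways(steps)))

# The more intricate example given in the text is one of the nine-step chains.
example = ("DNA", "RNA", "DNA", "RNA", "PROTEIN",
           "PROTEIN", "RNA", "DNA", "RNA", "PROTEIN")
print(example in pathways(len(example) - 1))
```

Under this assumption the count is 3 to the power of the number of steps, so it grows without bound as longer chains are admitted.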

Indeed, early in the second century of genetics (now in its post-ENCODE era, i.e. PostGenetics), the rudimentary recursive sketch of the fractal growth of a single Purkinje cell is a far cry from fully defining (and, through experimentation, verifying in every detail) a complete mathematical model of the genomic governance of even one of the best-known single-cell platforms of a familiar multicellular organ, the Purkinje neuron of the cerebellum.

Our best immediate hope is to gather support for revealing first the most rudimentary genomic regulation present in the smallest DNA of a free-living organism (Mycoplasma genitalium, where intergenic sequences total a mere 50,000 nucleotides, and thus not only could actual fractal structures be revealed in the DNA but "fractal defects" were corroborated with glitches in regulatory intergenic sequences; [92]). Next, we can target more complex (multicellular) organisms, starting with Purkinje neurons and later proceeding, for instance, to the quite obviously fractal-looking Cauliflower romanesca, which visibly evolves in a massively parallel manner (Fig. 7).

To emphasize that the class of both massively parallel and recursive algorithms of neural nets led towards this school of thinking [6, 5] and will increasingly be applied in implicit denial of the central dogma, the most recent article from the flagship of NN R&D is cited [93].

The utilization of PRGF is expected to lead to predictable implications - as well as others, unforeseen. As indicated at the outset, new advances in applications of PRGF are likely to include epigenetic medicine (for diagnosis, identification of factors that block PRGF - such as defects in regulatory sequences - and for therapies that block PRGF in order to create new types of antibiotics, etc.). For bioenergy, nanotechnology and synthetic biology it seems essential to first understand genomic function, including genes but also their regulatory mechanisms. As suggested by Fig. 7, our plate is full for the second century of (postmodern) genomics.

References

1. Collins F. New Findings Challenge Established Views on Human Genome, June 14, 2007
2. Hood L, Galas D. The digital code of DNA. Nature. 2003; 421(6921): 444-8.
3. Pellionisz A. Modeling of neurons and neuronal networks. In: Schmitt FO, Worden FG, editors. The Neurosciences, IVth Study Program. MIT Press, 1979. pp 525-46.
4. Anderson J, Pellionisz A, Rosenfeld E. Neurocomputing-2. MIT Press, 1989.
5. Pellionisz AJ. Neural Geometry: Towards a fractal model of neurons. In: Cotterill RMJ, editor. Models of brain function, Cambridge: Cambridge University Press. 1989. pp 453-464.
6. Werbos PJ. The Roots of Backpropagation, NY: John Wiley & Sons:1974/1994, Includes Werbos's Harvard Ph.D. thesis; Beyond Regression, 1974.
7. Hopfield JJ. Neural networks and physical systems with emergent collective computational abilities. PNAS. 1982; 79(8): 2554-58.
8. Kohonen T, Barna G, Chrisley R. Statistical pattern recognition with neural networks: benchmarking studies. Proc. IEEE International Conference on Neural Networks, San Diego, 1988. pp. I-61-I-68. In: Anderson J, Pellionisz A, Rosenfeld E. Neurocomputing-2. MIT Press, 1989.
9. Carpenter G, Grossberg S. ART-2: Self-organization of stable category recognition codes for analog input patterns: Applied Optics, 1990: 26; 4919-4930. In: Anderson J, Pellionisz A, Rosenfeld E. Neurocomputing-2. MIT Press, 1989.
10. Crick FHC. On protein synthesis. Symp. Soc. Exp. Biol. 1958;12:138-163. Early draft version at http://profiles.nlm.nih.gov/SC/B/B/F/T/_/scbbft.pdf (see 62.)
11. Watson JD, Crick FHC. Molecular structure for deoxyribose nucleic acid. Nature. 1953; 171: 737-38.
12. McClintock B. Controlling elements and the gene. Cold Spring Harb. Symp. Quart. Biol. 1956; 21:197-216.
13. Jacob F, Monod J. Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol. 1961; 3: 318-56.
14. Jacob F. The Statue Within - An Autobiography. Cold Spring Harbor Laboratory Press, 1995
15. Britten RJ, Davidson EH. Gene regulation for higher cells: A theory. Science. 1969; 165(3891): 349-357.
16. Baltimore D. RNA-dependent DNA polymerase in virions of RNA tumour viruses. Nature. 1970; 226: 1209-1211.
17. Crick FHC. Central dogma of molecular biology. Nature. 1970;227:561-563.
18. Watson JD. Molecular Biology of the Gene (2nd Ed.), Publisher: Benjamin WA, 1970.
19. Ohno S. Evolution by Gene Duplication. Springer-Verlag, New York, 1970
20. Ohno S. So much junk in the human DNA. Brookhaven Symposia: 1972, pp.366-369.
21. Simons MJ, Pellionisz AJ. Genomics, morphogenesis and biophysics: Triangulation of Purkinje cell development. The Cerebellum. 2006;5(1):27-35.
22. Heisenberg W. Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik, Zeitschrift für Physik, 1927(43), pp. 172-198. English translation: J. A. Wheeler and H. Zurek, Quantum Theory and Measurement, Princeton Univ. Press, 1983, pp. 62-84.
23. Churchland PS. Neurophilosophy: Toward a Unified Science of the Mind-Brain; Cambridge, MA:Bradford Books/MIT Press, 1986.
24. Dupre J. Darwin's Legacy: What Evolution Means Today; Oxford: Oxford University Press, 2005.
25. Lamarck JB. Systeme des animaux sans vertebres, ou tableau general des classes, des ordres et des genres de ces animaux; presentant leurs caractres essentiels et leur distribution, d'apres la consideration de leurs, Paris, Detreville, 1801(8):1-432.
26. Darwin C. On the Origin of Species. London, UK: Murray, 1859.
27. Wang T, Zeng J, Lowe CB, Sellers RG, Salama SR, et al. Species-specific endogenous retroviruses shape the transcriptional network of the human tumor suppressor protein p53. PNAS. 2007; 104: 18613-18.
28. Lindenmayer A. Mathematical models for cellular interaction in development I. Filaments with one-sided inputs. Journal of Theoretical Biology. 1968; 18: 280-289.
29. Mandelbrot BB. The fractal geometry of nature. New York: WH Freeman; 1977.
30. Ruis, J. The Julius Ruis Set; 2004; http://www.fractal.org/Julius-Ruis-Set.pdf
31. Simons MJ, Pellionisz AJ. Implications of fractal organization of DNA on disease risk genomic mapping and immune function analysis, In: Australasian and Southeast Asian Tissue Typing Association 30th Scientific Meeting; 2006; 22-24 November, Chiangmai, Thailand
32. Widrow B., Stearns SD. Adaptive signal processing. Prentice-Hall, Englewood Cliffs, NJ. 1985.
33. Sanford JC. Genetic Entropy, New York, Elim Publishing, 2005
34. Erdogmus D, Yadunandana N, Principe RJC, Fontenla-Romero O, Alonso-Betanzos A. Recursive least squares for an entropy regularized MSE cost function. ESANN-2003 proceedings - European Symposium on Artificial Neural Networks Bruges (Belgium), 23-25 April 2003, pp.451-456.
35. Wiener N. Cybernetics or Control and Communication in the Animal and the Machine, John Wiley & Sons, Inc. New York, 1949
36. Schrödinger, E. (1944) What is life? The Physical Aspect of the Living Cell. Based on lectures delivered under the auspices of the Dublin Institute for Advanced Studies at Trinity College, Dublin, in February 1943.
37. von Neumann, J. The Computer and the Brain New Haven and London, Yale University Press, 1958.
38. Szilárd L. Theory of Ageing: Nature. 1959; 184 (4691):957-8.
39. Hanov MC. Philosophiae naturalis sive physicae dogmaticae: Geologia, biologia, phytologia generalis et dendrologia. 1766
40. Bateson W. Letter to Sedgwick, April 18, 1905. In William Bateson, F.R.S.: His Essays and Addresses (ed. B. Bateson), pp. 93. Cambridge University Press, Cambridge 1928.
41. Bateson W. 1906. A text-book of genetics. Nature. 1906; 74:146-147.
42. Aristotle (cca 400 B.C.) De memoria et reminiscentia (Aristotle on Memory), English translation in Anderson J, Pellionisz A, Rosenfeld E. Neurocomputing-2. MIT Press, 1989, pp. 1-10.
43. Koza JR. Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press, 1992
44. Prodromou C, Pearl LH. Recursive PCR: a novel technique for total gene synthesis. Protein Eng. 1992; 8: 827-9.
45. Barua R, Misra J. Binary Arithmetic for DNA Computers, Lecture Notes, Springer, 2003; 2568:124-132.
46. Carbone A. Seeman NC. Coding and Geometrical Shapes in Nanostructures: A fractal DNA-assembly. In: Natural Computing, Kluwer Academic Publishers. 2003
47. Winfree E. Algorithmic self-assembly of DNA: Theoretical motivations and 2D assembly experiments. J. Biol. Mol. Struct. & Dyns. Conversation. 2000; (11/2):263-270.
48. Fields CA, Soderlund CA. A practical tool for automating DNA sequence analysis. Comput. Appl. Biosci. 1990; 6: 263-270.
49. Dong Q, Wilkerson MD, Brendel V. Tracembler - software for in-silico chromosome walking in unassembled genomes. BMC Bioinformatics; 2007; 8:151.
50. Lander ES, Linton LM, Birren B, Nusbaum C, et al. Initial sequencing and analysis of the human genome. Nature. 2001; 409: 860-921.
51. Venter JC, Adams MD, Myers EW, Li PW, et al. The sequence of the human genome. Science. 2001; 291: 1304-1351.
52. Bussard AE. A scientific revolution? The prion anomaly may challenge the central dogma of molecular biology. EMBO Rep. 2005; 6(8): 691-4.
53. Goodman AF, Bellato CM, Khidr L. The uncertain future of the Central Dogma. The Scientist. 2005; 19(12): 20.
54. Mattick JS. Challenging the dogma: the hidden layer of non-protein-coding RNAs in complex organisms. BioEssays. 2003; 25: 930-939.
55. Mattick JS. The hidden genetic program of complex organisms. Sci. Am. 2004; 291: 60-67.
56. Henikoff S. Beyond the Central Dogma; Bioinformatics. 2002; 18: 223-5
57. Thieffry D. Forty years under the central dogma. Trends Biochem. 1998; 23:312-316.
58. Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE Pilot Project (14 June 2007). Nature. 2007; 447: 799-816.
59. Mattick JS. The human genome: RNA Machine - Contrary to current dogma most of the genome may be functional: The Scientist. 2007; 21(10):61
60. Brenner S. Nobel Lecture 2002: http://nobelprize.org/nobel_prizes/medicine/laureates/2002/brenner-lecture.html
61. Crick FHC. (1956, Unpublished but acknowledged by Crick in 1958, see 10.): http://profiles.nlm.nih.gov/SC/B/B/F/T/_/scbbft.pdf
62. Watson JD. The Molecular Biology of the Gene (1st Edition). W.A. Benjamin. Inc. New York, 1965
63. Darden L. Reasoning in Biological Discoveries. Cambridge: Cambridge University Press, 2006.
64. Temin HM, Mizutani S. RNA-dependent DNA polymerase in virions of Rous sarcoma virus. Nature. 1970; 226: 1211-13.
65. Judson, H.F. The Eighth Day of Creation: Makers of the Revolution in Biology. Cold Spring Harbor Press, 2004.
66. Jacob, F. (1965) Nobel Lecture (on the Prize with Monod) http://nobelprize.org/nobel_prizes/medicine/laureates/1965/jacob-lecture.pdf
67. Pellionisz A, Llinas R, Perkel DH. Computer Model of the Cerebellar Cortex of the Frog. Neuroscience: 1977; 2:19-35.
68. Aravin AA., Naumova NM, Tulin AV, Vagin VV, Rozovsky YM, and Gvozdev VA. Double-stranded RNA-mediated silencing of genomic tandem repeats and transposable elements in the D. melanogaster genome. Curr. Biol. 2001; 11: 1017-27.
69. Carthew RW. Gene silencing by double-stranded RNA. Curr. Op. Cell Biol. 2001; 13: 244-248.
70. Eddy SR. Non-coding RNA genes and the modern RNA world. Nat. Rev. Genet. 2001; 2: 919-29.
71. O'Gorman W, Akoulitchev A. What is so Special About Oskar Wild? Sci STKE. 2006;365:51.
72. Woese CR. Translation: in retrospect and prospect. RNA. 2001; 7:1055-67.
73. Lyon MF. X-chromosome inactivation: a repeat hypothesis. Cytogenetic Cell Genet. 1998; 80:133-37.
74. Bailey JA, Carrel L, Chakravarti A, Eichler EE. Molecular evidence for a relationship between LINE-1 elements and X chromosome inactivation: the Lyon repeat hypothesis. PNAS. 2000; 97: 6634-9.
75. Mlynarczyk SK, Panning B. X inactivation: Tsix and Xist as yin and yang. Curr. Biol. 2000;10: R899-R903.
76. Matzke M, Matzke AJM, Kooter JM. RNA: Guiding gene silencing. Science. 2001; 293:1080-1083.
77. Lunyak VV, Prefontaine GG, Núñez E, Cramer T, Ju BG, Ohgi KA, et al. Developmentally regulated activation of a SINE B2 repeat as a domain boundary in organogenesis. Science. 2007; 317(5835): 248-51.
78. Lilley DMJ. DNA-Protein: Structural Interactions. Oxford: IRL Press at Oxford University Press, 1995
79. Yoder JA, Walsh CP, Bestor TH. Cytosine methylation and the ecology of intragenomic parasites. Trends Genet. 1997; 13: 335-40.
80. Heard E, Rougeulle C, Arnaud D, Avner P, Allis CD, Spector DL. Methylation of histone H3 at Lys-9 is an early mark on the X chromosome during X inactivation. Cell. 2001; 107: 727-738.
81. van Steensel B, Delrow J, Henikoff S. Chromatin profiling using targeted DNA adenine methyltransferase. Nature Gen. 2001; 27:304-08.
82. Weidman JR, Dolinoy DC, Maloney KA, Cheng JF, Jirtle RL. Imprinting of opossum Igf2r in the absence of differential methylation and air. Epigenetics. 2006: 1(1): 49-54.
83. Lindquist S. Mad cows meet psychotic yeast: the expansion of the prion hypothesis. Cell. 1997: 89; 495-498.
84. Barski JJ, Lauth M, Meyer M. Genetic targeting of cerebellar Purkinje cells: History, current status and novel strategies. Cerebellum. 2002; 1(2): 111-18.
85. Oberdick JD, Jankowski J, Holst MI, Liebig C, Baader SL. Engrailed-2 negatively regulates the onset of perinatal Purkinje cell differentiation. J. Comp. Neurol. 2004; 472: 97-99.
86. Fire A, Xu SQ, Montgomery MK, Kostas SA, Driver SE, Mello CC. Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature. 1998; 391: 806-811.
87. Mattick JS. The functional genomics of noncoding RNA. Science. 2005; 309 (5740): 1527-28.
88. Rigoutsos I., Huynh T, Miranda K, Tsirigos A, McHardy A, Platt D. Short blocks from the non-coding parts of the human genome have instances within nearly all known genes and relate to biological processes: PNAS. 2006;103: 6605-10.
89. Fraser CM, Gocayne JD, White O, Adams MD, Clayton RA, Fleischmann RD, et al. The minimal gene complement of Mycoplasma Genitalium. Science. 1995; 270(5235): 397-403.
90. Peterson S, Camella C, Bailey S, Jorgen S, Jensens S, Martin B, et al. Characterization of repetitive DNA in the Mycoplasma genitalium genome: Possible role in the generation of antigenic variation, PNAS. 1995; 92:11829-33.
91. Csürös M, Noe L, Kucherov G. Reconsidering the significance of genomic word frequency, 2006, arXiv:q-bio.GN/0608022v1, http://arxiv.org/PS_cache/q-bio/pdf/0609/0609022v1.pdf
92. Pellionisz A. PostGenetics: Genetics beyond genes. The journey of discovery of the function of Junk DNA. BCII2006 Proceedings, 2006.
93. Xu R, Ganesh K, Venayagamoorthy K, Wunsch DC. Modeling of gene regulatory networks with hybrid differential evolution and particle swarm optimization, Neural Networks. 2007; 20: 917-27.