Keywords: Formal Languages and Automata: Language classes of the Chomsky hierarchy and some refinement: union-free, regular, linear, context-free, permutation, context-sensitive and recursively enumerable languages. Pattern languages, regulated rewriting, controlled grammars, programmed grammars, contextual hypergraph grammars, primitive words and permutations, cyclic words, 2-head automata, automata with translucent letters, CD-systems of restarting automata.
Computability, decidbaility, complexity.
New Computing Paradigms: Interval-valued computing, DNA computing, WK-automata, membrane computing, bio-motivated computing.
Parallelism in computing.
Genetic Algorithms.
Research partners:
Dávid Angyal (University of Debrecen, Hungary) Péter Battyányi (University of Debrecen, Hungary)
Jürgen Dassow (Otto-von-Guericke University, Magdeburg, Germany)
Adrian-Horia Dediu (Rovira i Virgili University, Tarragona, Spain)
Pál Dömösi (University of Debrecen, Hungary)
Ömer Egeciouglu (University of California, Santa Barbara, CA, USA)
Szilárd Fazekas (Kyoto-Sangyo University, Japan)
László Hegedüs (University of Debrecen, Hungary)
Géza Horváth (University of Debrecen, Hungary)
Masami Ito (Kyoto-Sangyo University, Japan)
Galina Jirásková (Slovak Academy of Sciences, Kosice, Slovakia)
Hans-Jörg Kreowski (Bremen University, Germany)
Manfred Kudlek + (Hamburg University, Germany)
Peter Leupold (Rovira i Virgili University, Tarragona, Spain)
Friedrich Otto (Kassel University, Germany)
Sándor Vályi (College of Nyíregyháza, Hungary)
György Vaszil (University of Debrecen, Hungary)
Formal languages and automata.
CD-systems of restarting automata, automata with tranclucent letters.
We have studied cooperating distributed systems (CD-systems) of restarting automata that are very restricted: they are deterministic, they cannot rewrite, but only delete symbols, they restart immediately after performing a delete operation, they are stateless, and they have a read/write window of size 1 only, that is, these are stateless deterministic R(1)-automata. We studied the expressive power of these systems by relating the class of languages that they accept by mode =1 computations to other well-studied language classes, showing in particular that this class only contains semi-linear languages. Our model can be viewed as a nondeterministic finite-state acceptor with translucent letters, that is, it processes its input in a different way than the usual left-to-right order [ic46]. In this way all commutative semi-linear languages, and in fact all rational trace languages, can be accepted [ij46]. In addition, we investigated the closure and non-closure properties of the class of languages accepted by our model and some of its algorithmic properties [ij62]. Although the component automata of a cooperating distributed system (CD-system) of stateless deterministic R(1)-automata are all deterministic, the CD-system itself is not. We have studied CD-systems of stateless deterministic R(1)-automata that are themselves completely deterministic. These CD-systems correspond to deterministic finite-state acceptors with translucent letters. We investigated the expressive power of these systems, studied the closure properties of the class of languages they accept, and showed that the inclusion problem for these systems is undecidable, while their universe problem is decidable [ij54,ij70]. These models were used for linguistic studies in [ic58,ij80].
A characterization of the class of word languages that are linearizations of context-free trace languages is given in terms of CD-systems of stateless deterministic restarting automata with window size 1 that are governed by an external pushdown store [ij52,ij59]. These systems can be viewed as (non-deterministic) pushdown automata with transcucent input (tape) letters. Their deterministic variations were investigazed in [ic47].
Pushdown automata with translucent pushdown symbols were also investigated [ic48,ic51]. Such a device is a pushdown automaton equipped with a `transparency relation'. In a transition step, the automaton does not read (and replace) the topmost symbol on its pushdown store, but the topmost one that is only covered by translucent symbols. It is proven that these automata accept all recursively enumerable languages. Then pushdown automata with translucent pushdown symbols that work in real-time or in quasi-real-time were analysed. The corresponding language classes are compared to other classical complexity classes, and closure properties for these language classes are also studied.
By replacing the basis automata of the original CD-systems of stateless deterministic R(1)-automata to R(2) automata, i.e., obtaining CD-systems of stateless deterministic R(2)-automata the systems proved to be more powerful than the original ones: all linear context-free languages and some non semi-linear languages can be accepted by them [ij60].
Research papers:
Union-free languages and union-free decomposition of regular languages, the union-normal form.
The union-complexity of regular languages was introduced in [ic12]. It is a kind of complexity measure of the regular languages. Regular expressions can be represented by expression tree and by flow diagram (syntax graph) [hc9]. Based on equivalences a possible normal form can be obtained. It has interesting properties in both graphical representations. The union-complexity of languages is defined. The union-complexity is 1 for the union-free languages. These languages can be given by regular expressions without the operation union. Some properties of them are described [ij17,ic44]. There is an analogy with the star-free languages and the star-height problem of regular languages. We show that each regular language has finite union-complexity based on its description by a finite union of union-free languages (union normal-form). For some special language classes, for instance for finite languages the union complexity is easily computed. For union-complexity of any regular languages lower and upper bounds are presented. This measure gives an infinite hierarchy of regular languages. The union-height can be defined and proved to be at most 1 allowing n-ary unions.
The union-free languages are charaterized by special type of finite automata [ij17]. A 1cfpa is a non-deterministic (maybe with transitions by the empty word) automaton with exactly one final state and with a special graph: there is exactly one cycle-free path from every state to the final state. The Parikh sets of these languages are characterized in [ij65]. The deterministic versions of 1cfpa are not so powerful, consequently the class of deterministic union-free languages are defined by them. In [ij65] it is proven that there are regular languages, e.g., the language ((a+b)(a+b))* that cannot be obtained as a finite union of deterministic union-free languages.
Research papers:
Linear languages, 2-head finite automata, special linear languages: k-rated linear and 2det-lin languages, 5' → 3' WK-automata, 5' → 3' WK countermachine.
The class of linear languages is between the classes regular and context-free. They share some properties of the regular languages. They can be accepted by special 2-head finite automata, where the heads starts form the opposite ends of the input and move opposite directions and the accepting process is finished when the heads meet. These automata can be viewed as special Watson-Crick (WK-)automata. We investigated several variants of Watson-Crick automata in which both heads read the doubled DNA strand from 5' to 3' direction and the heads sense when they meet. Some versions of these automata recognize exactly the linear context-free languages [ic26,ij32]. Deterministic automata are not so powerful: the language class of 2-deterministic linear languages is defined by them. It is shown that all fixed-linear, and specially, all even linear languages are in this class. In fixed linear (and k-rated linear) languages the heads read fixed lenght strings in each step but the final (accepting) step. A hierarchy of the accepted languages is presented using the all-final, simple, 1-limited and no-state versions and combined restrictions [ic36,ij69]. By full-reading of both strands some languages that are not context-free can also be accepted [ij69]. One full reading in both directions is called a run. It is proven that the expressive power of these automata increases with every additional run that they can make, both for deterministic and non-deterministic machines. This defines two incomparable infinite hierarchies of language classes between the regular and the context-sensitive languages. These hierarchies are complemented with classes defined by several restricted variants of 5' → 3' WK-automata like stateless automata. Finally we show that several standard problems are undecidable for languages accepted by 5' → 3' WK-automata in only one run, for example the emptiness and the finiteness problems [ic37,ij48].
We considered stateless counter machines which mix the features of one-head counter machines and the special type of two-head Watson-Crick automata (WK-automata). Our Watson-Crick counter machines are biologically motivated. They have two heads that read the input starting from the two extremes. The reading process is finished when there are no more symbols between the heads, i.e., every letter of the input is processed by either head. Depending on whether the heads are required to advance at each move, we distinguish between realtime and non-realtime machines. If every counter makes at most k alternations between nondecreasing and decreasing modes in every computation, then the machine is k-reversal. We have described the properties of both deterministic [ic43,ij61] and nondeterministic stateless WK-automata with counters and also state hierarchy results [ij55,ij58].
Pumping lemmas are created to prove that given languages are not belong to certain language classes. There are several known pumping lemmas for the whole class and some special classes of the context-free languages. In [ic45,ij49] new, interesting pumping lemmas for special linear and context-free language classes are presented. Some of them can be used to pump regular languages in two place simultaneously. Other lemma can be used to pump context-free languages in arbitrary many places.
Research papers:
Permutation languages.
The family of context-free grammars and languages are frequently used. Unfortunately several important languages are not context-free. A possible family of extensions is analysed, the permutation languages [ic33,ij42]. In our derivations branch-interchanging steps are allowed: language families obtained by context-free and permutation rules are analysed. In permutation rules both sides of the rule contain the same symbols (with the same multiplicities). The simplest permutation rules are of the form AB → BA. Various families of permutation languages are defined based on the length of non-context-free productions [bc2]. Only semi-linear languages can be generated in this way, therefore these language families are between the context-free and context-sensitive families. Interchange lemmas are proven for various families [ij42,bc2]. It is shown that the generative power is increasing by allowing permutation rules with length three instead of only two [bc2]. The family of permutation is not closed under intersection with regular sets, therefore the obtained family of languages is also interesting: some of its non-trivial properties are shown. Important languages of mildly context-sensitive classes are shown to be belonging to this class [ij43,bc7]. Closure properties and other properties are detailed. Relations to partial and semi-commutations and to parallel processes are shown.
Research papers:
Context-sensitive languages: derivation-tree and left-most derivation, shadow-pushdown automata, new normal form.
Normal forms play significant role in computer science, and specially, in formal language theory. There are well-known normal forms for context-sensitive grammars, but all of them allows (context-sensitive) recursions, e.g., AB→AC, AC→AB. These recursions may causes arbitrarily long derivations for finite words. Starting with Penttonen's one-sided normal form a new normal form is created for context-sensitive grammars, in which all context-sensitive recursions are removed, and therefore there is no sentential form that can be derived from itself by any positive number of derivation steps [ic35].
One of the main reasons that context-free grammars are widely used is the fact that the concept of derivation trees fits to them very well. The left-most derivations play important role both in theory and practice. Unfortunately with left-most derivations only context-free languages can be derived even if the rules of the grammar are not context-free. In [ic21,bc3] derivation trees are investigated for context-sensitive grammars based on Penttonen's one-sided normal form. The concept of the presented derivation graphs are a kind of extension of the well-known context-free derivation-trees. Moreover it allows to define left-most derivations in context-sensitive grammars without loosing the efficiency of the derivations. These left-most derivations are not derivations in the usual sentential form sense. They are the generalizations of the classical (context-free) left-most derivations, in the way of constructing a derivation tree. Some examples and a new type of ambiguity are also shown [bc3]. The concept of the pushdown automata is generalised, the so-called shadow-pushdown automata are presented which uses additional half-translucent shadow symbols [ic21,ij22,ij47]. The work of the automata is based on the left-most derivation of context-sensitive languages. The class of shadow-pushdown automata characterizes exactly the context-sensitive languages, other (restricted or extended) families of these automata characterizes the language classes of the Chomsky hierarchy [ij47].
Research papers:
Regulated rewriting, controlled and programmed grammars.
Programmed grammars, one of the most important and well investigated classes of grammars with context-free rules and a mechanism controlling the application of the rules, can be described by graphs. We investigated whether or not the restriction to special classes of graphs restricts the generative power of programmed grammars with erasing rules and without appearance checking, too [ij23]. It is obtained that Eulerian, Hamiltonian, planar and bipartite graphs and regular graphs of degree at least three are pr-universal in that sense that any language which can be generated by programmed grammars (with erasing rules and without appearance checking) can be obtained by programmed grammars where the underlying graph belongs to the given special class of graphs, whereas complete graphs, regular graphs of degree 2 and backbone graphs lead to proper subfamilies of the family of programmed languages.
Regulated rewriting can also be viewed as a generalization of finite automata [bc4].
Another way to control a derivations in a grammar when the sequences of the applications of the productions have a restriction: they must belong to a given language. These, controlled grammars, are defined by a grammar and a control language. When, for each word of the control language there must be at least one valid derivation, we talk about exact control. Grmmars with exact control were introduced in [ic69].
Research papers:
Distances of languages.
Various concepts of distances between formal languages are considered [ij27]. Distances based on symmetric differences of languages are shown such as distances based on descriptions of languages. Neighbourhood criteria among languages give the possibility to use distances based on neighbourhood sequences (for related digital distances see the page of R.G. of Digital Geometry and its Applications also). There are well-known distance functions among words, they also can be basis of distances of languages.
Research papers:
Primitive words and permutations.
Primitive words are analysed from the point of view of their scattered subwords [ij15]. The language of primitive words is the language of the words that are not proper powers of another word. We take a look at the Parikh-vectors of these words, that is, we consider the commutative closure of languages formed by primitive words. After first looking at the shortest (one letter) subwords we move on towards giving necessary conditions for a word to be non-primitive in terms of scattered subword multiplicity. Furthermore, we prove that knowing the multiplicity of every word in a fixed set of words as scattered subwords in some word is not enough to decide whether it is primitive or not.
Research papers:
Cyclic words.
Cyclic (or circular) words have no endpoints. They can be seen as sets of the conjugate classes of words. Strong and weak periods are defined according to the fact that the given number is a period of all conjugates or at least one conjugate, respectively. Their periodic properties are studied in [ia19,ic60,ij88]. Based on these properties representations are shown in [ic64].
Research papers:
Pattern languages.
A pattern is a finite string of constant and variable symbols. The language generated by a pattern is the set of all strings of constant symbols which can be obtained from the pattern by substituting (non-empty) strings for variables. There are erasing and non-erasing patterns are used. For non-erasing pattern languages the equivalence of languages is decidable but the inclusion problem for them is undecidable. In extended regular expressions one can use union, concatenation and Kleene star to make more complex patterns. It is also known, that the equivalence problem of extended regular expressions is undecidable. However, the problem, whether the equivalence is decidable for patterns using only concatenation and star was open. There are some interesting results about inclusion properties and equivalences of some kinds of erasing and non-erasing pattern languages [ij15]. It was shown that the equivalence problem of non-erasing patterns in some cases can be reduced to the decidability problem of some very special inclusion properties.
Research papers:
Contextual hypergraph grammars.
Contextual hypergraph grammars have been introduced as generalization of total contextual string grammars. We study the position of the class of languages generated by contextual hypergraph grammars in comparison with graph languages generated by hyperedge replacement grammars and double-pushout hypergraph grammars. Moreover, several examples show the potential of the new class of grammars [ij19]. Various trees are computed and self-assembly is modelled in this framework[ic27,ic29].
Research papers:
Chain codes.
Chain codes are representing drawings on grids. They can be generated by Lindenmayer systems. Picture languages on the triangular grid are introduced and analysed in [ic72]. Research papers:
New computing paradigms.
Interval-valued computing.
A new model for analog computations, namely interval-valued computations was introduced in [ic13]. Interval-valued computing paradigm is close to the classical paradigm, but dropping the bound of the number of bits in a byte. The basic data unit is the interval-value that is a finite union of subintervals (components) over the unit interval [0,1). Logical operations are straightforward generalizations of the classical operations, moreover shift operations is also used to carry some pieces of information to other parts of the unit interval. The product operation allows to raise the density of information. This operation is analogous with the zoom operation of the optical computing. Hard problems, e.g. SAT, tripartitate matching, qSAT (satisfiability of quantified Boolean formulae), can be solved in an efficient way (polinomial number of steps) deploying the massive parallelism of this paradigm [ic13,ic34,ic19,ij33]. It is shown that the validity problem of quantified propositional formulae is decidable by a linear interval-valued computation [ic19,ij33]. As a consequence, all polynomial space problems are decidable by a polynomial interval-valued computation. Furthermore, it is proven that PSPACE coincides with the class of languages which are decidable by a restricted polynomial interval-valued computation. The integer factorization problem can also be solved efficiently: there is an interval-valued algorithm that computes a proper divisor of the input number (or 1 in case the input is a prime) in a quadratic number of steps [ij56], the discrete logarithm problem can also be solved efficiently [ic56]. The complexity class NP was characterized in [ij84]. The visual feature of these computations are highlighted in [ic28,ic30]. Moreover a temporal logic connected to interval-values is shown in [ic30]. Interval-values can also be used in various ways of reasoning [ij20], and to simulate various many-valued and fuzzy logic systems [ic18]. In [ic62] computations with one dimensional cellular automata and the interval-valued paradigm were compared.
Research papers:
DNA computing. WK-automata, WK-countermachine.
DNA and membrane computing have been considered, both as theoretical models and as problem solving devices. The basic motivation behind these models of natural computing is using parallelism to make hard problems tractable. The concept of parallelism is analysed: it is shown that parallelism has very different meanings in these models. The terms 'or-parallelism' and 'and-parallelism' are introduced for these two basic types of parallelism [ic25,ij57].
We investigated several variants of Watson-Crick automata in which both heads read the doubled DNA strand from 5' to 3' direction (reverse WK-automata) and the heads sense when they meet. Some versions of these automata recognize exactly the linear context-free languages [ic26,ij32]. Deterministic automata are not so powerful: the language class of 2-deterministic linear languages is defined by them. It is shown that all fixed-linear, and specially, all even linear languages are in this class. In fixed linear (and k-rated linear) languages the heads read fixed lenght strings in each step but the final (accepting) step [bc6]. A hierarchy of the accepted languages is presented using the all-final, simple, 1-limited and no-state versions and combined restrictions [ic36,ij69]. By full-reading of both strands some languages that are not context-free can also be accepted [ij69]. One full reading in both directions is called a run. It is proven that the expressive power of these automata increases with every additional run that they can make, both for deterministic and non-deterministic machines. This defines two incomparable infinite hierarchies of language classes between the regular and the context-sensitive languages. These hierarchies are complemented with classes defined by several restricted variants of 5' → 3' WK-automata like stateless automata. Finally we show that several standard problems are undecidable for languages accepted by 5' → 3' WK-automata in only one run, for example the emptiness and the finiteness problems [ic37,ij48].
We considered stateless counter machines which mix the features of one-head counter machines and the special type of two-head Watson-Crick automata (WK-automata). Our Watson-Crick counter machines are biologically motivated. They have two heads that read the input starting from the two extremes. The reading process is finished when there are no more symbols between the heads, i.e., every letter of the input is processed by either head. Depending on whether the heads are required to advance at each move, we distinguish between realtime and non-realtime machines. If every counter makes at most k alternations between nondecreasing and decreasing modes in every computation, then the machine is k-reversal. We have described the properties of both deterministic [ic43,ij61] and nondeterministic stateless WK-automata with counters and also state hierarchy results [ij55,ij58].
Research papers:
Self-assembly. DNA-Prolog.
Simulation capabilities for contextual hypergraph grammars are presented in [ic29] showing that self-assembling DNA tiles computation of Boolean tree-like circuits can effectively be described by this approach.
A natural way of executing Prolog programs by DNA computations is presented in [ic41]. The philosophy of the programming language Prolog is very close to the classical idea of DNA computations. Based on Adleman's groundbreaking experiment a system is presented that gives answer to simple Prolog queries.
Research papers:
Membrane computing.
A comparison is provided between the membrane computing systems and the graphical interfaces of operating systems in [ij21]. A membrane computing system is a computing model using massive parallelism inspired by the functioning of living cells. The graphical schemes of these computing devices look like the windows of a graphical operating system representing programs running parallel on the computer. Both similarities and differences of membrane-systems and graphical operating systems are detailed as well as some possible simulation methods.
DNA and membrane computing, both as theoretical models and as problem solving devices are analysed. The basic motivation behind these models of natural computing is using parallelism to make hard problems tractable. The concept of parallelism is analysed: it is shown that parallelism has very different meanings in these models. The terms 'or-parallelism' and 'and-parallelism' are introduced for these two basic types of parallelism [ic25,ij62].
Several methods solving SAT by P-systems are reviewed in [ic54,ij71]. It is also showed that all valid SAT formulae over a fixed number of Boolean variables define a regular language.
Linguistic applications of membrane systems were shown in [].ij76
Research papers:
Parallelism in computations, parallel computing.
The concepts of the graph of generative grammars and Lindenmayer systems are investigated [bc5]. These special and-or graphs represent all information about the grammar. For regular grammars the new concept is equivalent to the finite automata. In context-free case it is an extension of the dependency graph. We also analyze how it is related to programmed grammars (using context-free rules). Moreover the graph of the grammar is more general; it can be defined for generative grammars to any recursive enumerable languages containing all information of the grammar. By the help of tokens the new concept can effectively be used to follow derivations. Moreover parallel derivations can be simulated in non-linear grammars. While in regular and linear case only sequential derivations are going, in context-free case a kind of maximal parallelism can be involved. In context-sensitive case communication/synchronization is also needed in these processes. The Lindenmayer systems are well-known parallel rewriting systems; our concept also can be defined and used for various Lindenmayer systems. Especially the main difference is not in the graph form, but in the simulation process.
The appearance and the role of the parallelism is analysed in some algorithms and approaches of problem solving in Artificial Intelligence and in Computational Intelligence. Well-known that parallel algorithms can solve the problems in a more effective way than the sequential algorithms. Two basic types of parallelism are presented. The so-called 'or'-parallelism uses the parallelism to do brute-force search. In the so-called 'and'-parallelism the solution is built up by several parts given by the parallel branches of the computation [ic22]. DNA and membrane computing have also been considered, both as theoretical models and as problem solving devices. The basic motivation behind these models of natural computing is using parallelism to make hard problems tractable. The concept of parallelism is analysed: it is shown that parallelism has very different meanings in these models: both 'or-parallelism' and 'and-parallelism' appear [ic25,ij57].
A textbook on various approaches of parallel computing has been appeared both in English and Hungarian[be1,bh2].
Text books:
Research papers:
Complexity.
It is shown that the set of valid SAT-formulae and n-SAT formulae (for any fixed n) over finite set of variables are regular languages [ij6]. We also construct a deterministic finite automaton to accept the Boolean tautologies in disjunctive normal form. So the language of tautologies in disjunctive normal form is regular as well. Moreover using arbitrary many variables the language of tautologies is not even context-free, but context sensitive.
Interval-valued computing paradigm is close to the classical paradigm, but dropping the bound of the number of bits in a byte. The basic data unit is the interval-value that is a finite union of subintervals (components) over the unit interval [0,1). Logical operations are straightforward generalizations of the classical operations, moreover shift operations is also used to carry some pieces of information to other parts of the unit interval. The product operation allows to raise the density of information. It is shown that the PSPACE-complete qSAT is decidable by a linear interval-valued computation [ij33]. Furthermore, it is proven that PSPACE coincides with the class of languages which are decidable by a restricted polynomial interval-valued computation. The class NP is also charaterized by interval-valued computing in [ij84].
Research papers:
Genetic and evolutionary algorithms.
Genetic algorithms are frequently used for optimization. In [ic52],the Painting problem is solved by using some specific knowledge about the problem itself and by interval-values to show that they improve the effectiveness of a genetic algorithm.
A memetic algorithm is presented in [ic49] for the reconstruction of binary images represented on a triangular grid using three lanewise directions. Crossover and mutation operators are introduced on this grid. A discrete tomography algorithm is proposed which uses a switch and compactness operators for the triangular grid to improve the quality of the image in each generation. The algorithm was tested on some images for its effectiveness. Another evolutionary algorithm is presented in [ic55]. The reconstruction is done by diamond-chain directions in [ic66], while it is extended also to six directions in [ij97]. Results by genetic algorithm and simulated annealing are compared in [hc14,ic68].
Research papers:
Selected research grants:
Contact:
If you have any problem with downloading any of these papers please email me for a copy.