Omega Experimental Language

Written 1989; being revised May 2006

1   Introduction        

Omega is a series of experiments in language design. The language is designed over a strict mathematical foundation, the idea being that mathematical theorems and results can be applied and reapplied to programs written in Omega. A rigorous form of extensibility should also result.

Historically, when computer languages and programs have been designed, attention has focused on the operation, be it addition or multiplication. But a mathematical function is defined not only by its operation but also by its domain (inputs) and range (outputs). In the past these have been fudged in virtually all languages. This has been caused by a love affair with the LALR parser, which enabled the definition of complex syntactic computational expressions. The rules of the operator symbols were extended wildly, to mix boolean and arithmetic values for instance, with little regard to what an expression might actually mean. These expressions were seen rather as a tool for manipulating anything at hand into as much as possible. Mathematical integrity was the casualty, and this loss undermined the reliable implementation of such notions as inheritance, functional composition and transitive completion. OOP (Object Oriented Programming) was invented to fill this gap. OOP allowed inheritance, but the loss of integrity within the languages themselves continued to prevent functional composition and transitive completion from becoming a reality. These are the building blocks of mathematics and the only way to reliably build up complex structures and then be able to manipulate them.

The various stages of Omega incrementally develop these ideas within the framework of a simple language, basically just complex enough to usefully illustrate the ideas it is exploring. At some points the explorations may branch out in several directions at once.

Because Omega is concerned with symbolic representations, most of its work is concerned with manipulating string objects. These string objects, however, are all representations of mathematical algebraic objects such as sets, integers, lattices, constrained directed graphs, and so on up the scale of complexity. These objects are not an intrinsic part of the language but are included where they are useful to illustrate features of Omega. The core of Omega is a namespace of known objects, the functions defined over these objects and their ranges, and the algebraic properties of the functions (associative, commutative, etc.). Omega includes an engine which determines the transitive completion of existing functions and the properties of functions resulting from composition according to the transitive completions.

Because every function within Omega has functional integrity (that is, its domain, range and operation are all precisely defined), it is a functional language. Such languages have the additional property that they can be manipulated according to the results of function theory. After the basic engine is developed, Omega will be equipped with functions whose ranges and domains are themselves functions; in other words, functions become objects within the language. This should happen at about Stage 5.

Because Omega is concerned with string representations, it uses Tcl, a simple string-oriented scripting language, as a host to provide all the basic string manipulation functions that may be required. Tcl is an interpretive language. Initially Omega will exist as Omega statements embedded in Tcl host programs; Omega statements will be executed by invoking Tcl to provide the underlying functionality. This avoids the need to compile anything. Tcl itself uses the string as its basic and virtually only object type. Tcl recognises a list form in which its own statements are structured and provides functions which manipulate lists, so fundamental Tcl objects could be described as lists of strings. Tcl recognises arrays as a notational convention and has namespace conventions for them, but Tcl arrays are not fundamental to the language. Likewise Omega uses strings as its "universe of discourse" and can literally borrow Tcl procedures, recasting them as functions over a domain of product strings, with a range of string. String is the ultimate object into which all other objects can be devolved.

Omega is so named because it is the last letter in the Greek alphabet. If Omega succeeds in its design goal to faithfully implement mathematical structures then it should in theory be the last computer language that ever needs to be designed.

Omega is not expected to ever be used in general applications. Its basic design will be highly volatile for some time. It will exist in perhaps 10 varieties at any point in time as various avenues of development are explored. Its main use is in exploring which ideas of object inheritance and composition can be developed and what techniques are useful to achieve these properties. The ultimate application of Omega is to explore the field of function theory. Given a system of rules, it should be possible for Omega to go for a walk and discover its own functional identities; in other words, to solve mathematical problems that we have not thought about or conceived of. This leads into the area of artificial intelligence.

2   Functions        

Functions are at the core of mathematics. They describe a relationship between inputs and outputs that guarantees time-invariance. In other words, given the same inputs a function always produces the same outputs. This is of course what we would like computers to do, and it seems so basic that we are left wondering "what can possibly be the problem?". There are five problems, and a failure to account for any of these five aspects is a source of bugs.

2.1   The Domain        

The Domain of a function is the set of input values for which the function is defined. This should be well defined so that somebody using the function knows what inputs to supply in order to get correct and consistent results. This sounds very obvious, but consider the Tcl expr command:
% expr 3 + 4
7
All well and good, but:
% expr "" + 4
4
So this (expr +) function accepts "" in its domain as well as numbers. In theory addition is commutative, that is a + b == b + a, so let's try
% expr 4 + ""
syntax error in expression "4 + "
So apparently "" is not allowed as the second element of the domain. Addition is also a binary operator; that is to say, the Domain should be a cross product of just the one set, but here this is not the case. The Domain for (expr +) is neither binary nor commutative, as one might have expected. What, then, is this function (expr +)? We really don't know.

Is this important? Does it matter if the Domain is fudged? The Domain is a set of values, and a set must have the important property that, given a candidate element, you can tell whether the candidate is a member of the set. You can keep yourself amused for hours trying to second-guess Tcl as to whether {} {0} {{}} {{0}} are acceptable. When you are done you can try exprs of the form expr { a + b }, that is, with the curly braces. It is an interesting question whether one would expect adding the braces to alter the Domain of the (expr +) function, but obviously (expr +) is different from (expr {+}).

In order to compose two functions, the Range of the first must be a subset of the corresponding component of the Domain of the second function. If we aren't sure what that second Domain is, we have big problems.
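The discipline argued for here can be sketched in code. The following is a minimal illustration in Python (the names TypedFunction and compose are invented for this sketch; this is not Omega syntax) of functions that carry explicit membership tests for their Domain and Range, so that composition can be checked rather than fudged:

```python
class TypedFunction:
    """A function bundled with membership tests for its Domain and Range."""

    def __init__(self, operation, in_domain, in_range):
        self.operation = operation    # the mapping itself
        self.in_domain = in_domain    # membership test for the Domain
        self.in_range = in_range      # membership test for the Range

    def __call__(self, x):
        if not self.in_domain(x):
            raise ValueError(f"{x!r} is outside the declared Domain")
        result = self.operation(x)
        if not self.in_range(result):
            raise ValueError(f"{result!r} escaped the declared Range")
        return result

def compose(f, g):
    """g after f. In general 'Range(f) is a subset of Domain(g)' cannot be
    decided from arbitrary predicates, so this sketch enforces it
    point-by-point at call time."""
    return TypedFunction(lambda x: g(f(x)),
                         in_domain=f.in_domain,
                         in_range=g.in_range)

# A toy example: doubling maps integers onto the even integers.
double = TypedFunction(lambda n: n * 2,
                       in_domain=lambda v: isinstance(v, int),
                       in_range=lambda v: isinstance(v, int) and v % 2 == 0)

print(double(4))                    # 8
print(compose(double, double)(2))   # 8
# double("") raises ValueError: '' is outside the declared Domain
```

Unlike (expr +), this function's answer to "is "" in your Domain?" is immediate and consistent.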

2.2   The Range        

Like the Domain, the Range is a set, and we should be able to easily determine whether a candidate is indeed an element of the set. Often we write a program to achieve a particular result, but if we don't even know whether the result is in the Range of the function we are trying to use, how can we construct the function? For instance, is "" in the Range of (expr)? Here are some trivial examples:
% expr ""
syntax error in expression ""
% expr { "" }
The second command prints a blank line, indicating that an empty string was indeed returned. What other expressions returning "" are there?

A quick reference to the Tcl documentation will clarify why all these odd results occur: (expr) submits its arguments to initial processing before evaluating the expression. I have been using a special feature of the Tcl expr command to illustrate the kinds of problems that can occur in almost any programming language, problems which impact our ability to build up complex structures.

Is 1000000000 in the Range of expr? Or 5000000000?

% expr 2000000000 + 3000000000
Maybe 5000000000 is not in the Range; again, it is hard to tell. In theory the Range of all Tcl commands is the set of all Tcl strings (or lists of strings). The subset of this Range that any Tcl command is actually capable of producing is called its Image in the Range, and it does not have to be the whole Range. The implication is that when we compose two functions, we match the Range of one with the Domain of the next. But in Tcl, for example, the Domains of most procedures are not the set of all Tcl strings, and so writing Tcl is an intricate hand-built affair of testing out what works.
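The odd behaviour of large sums stems from 32-bit integer arithmetic in Tcl interpreters of that era. The following Python sketch simulates signed 32-bit wraparound (an assumption about what a 32-bit build would do; the exact output depends on the Tcl version) and shows why 5000000000 may not be in the Image:

```python
def add_int32(a, b):
    """Add two integers the way a 32-bit machine word would: keep only the
    low 32 bits, then reinterpret the top bit as a sign."""
    total = (a + b) & 0xFFFFFFFF            # truncate to 32 bits
    return total - 2**32 if total >= 2**31 else total

print(add_int32(2000000000, 3000000000))    # 705032704, not 5000000000
```

The mathematical sum 5000000000 exceeds 2^32 - 1, so the stored result wraps around, silently placing the answer somewhere unexpected in the Range.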

2.3   Functional Integrity        

A fundamental quality of a function is that the value of the function for any point is unique and consistent. That is, if F(a)->b and F(a)->c then b=c. It is this reliability, that functions always return the same value, that makes them so useful. In the 1950s and early 1960s computer hardware was not absolutely reliable, and sometimes the electronics failed to produce consistent results. Those problems were quickly sorted out, and this issue was not a major problem again until the 1990s, with the development of networked computers. The question becomes "will my Tcl code produce the same results on somebody else's machine as it does on mine?" There are a lot of environmental factors which affect the answer to this question. Obviously this result is desirable, but it is far from clear-cut. For instance, do you get the same result from this expr as I do?
% expr 3000000000

2.4   Looping        

This and the next section lie outside the mathematical notion of function and are concerned with whether a function value can be computed.

It is possible for a function to go into an infinite loop. Most programmers are familiar with this computational bug. It is possible to trap such loops, terminate them, and produce a valid result documenting the nature of the loop. Calculating 1/7 is a simple task, but computing the exact decimal expansion of 1/7 will produce the sequence .142857142857142857142857... . However, this can be trapped and a notation like .142857(142857) used to indicate that the 142857 is an infinitely repeating group of digits.
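The repeating-decimal trap can be sketched as long division with loop detection (in Python; the function name and output notation are illustrative, not part of Omega). A repeated remainder marks the start of the infinitely repeating group:

```python
def decimal_expansion(numerator, denominator):
    """Exact decimal expansion of a positive fraction. When a remainder
    recurs during long division, the digits produced since its first
    appearance form the infinitely repeating group, written (...)."""
    whole = numerator // denominator
    remainder = numerator % denominator
    digits = []
    seen = {}                  # remainder -> index in digits where it appeared
    while remainder and remainder not in seen:
        seen[remainder] = len(digits)
        remainder *= 10
        digits.append(str(remainder // denominator))
        remainder %= denominator
    if remainder == 0:         # expansion terminated: no repeating group
        return f"{whole}." + "".join(digits)
    start = seen[remainder]    # the loop was detected here
    return f"{whole}." + "".join(digits[:start]) + "(" + "".join(digits[start:]) + ")"

print(decimal_expansion(1, 7))   # 0.(142857)
print(decimal_expansion(1, 4))   # 0.25
print(decimal_expansion(1, 6))   # 0.1(6)
```

The loop is guaranteed to be caught because there are only finitely many possible remainders, so the function always terminates with a finite notation for an infinite expansion.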

2.5   Effective Computation        

A different kind of problem is that the function computation may simply take an unacceptably long time or require unavailable resources. And finally we have the result from Logic Theory that some computations can continue interminably without looping but still never produce a result. It is relatively easy to write a function with this property and another result from Logic Theory is that there is no sure test for these "rogue" functions. Fortunately we tend to only want to write effectively computable functions so this problem seldom arises, but some applications may be prone to this problem, for instance applications concerned with DNA combinations.

3   Order of Evaluation        

From the usual point of view, the evaluation of a function resembles a complex lattice or network of trees, viewed in a static manner. Evaluation is viewed as being done from the bottom up, all at once, so to speak. This is the algebraic or mathematical sense of solution. e.g.:
           -b +- sqrt(b*b - 4*a*c)
       x = -----------------------
                    2*a
This equation has a static, time invariant quality. It does not matter whether a or b or c is substituted first. When b is substituted, it is substituted all at once, not one reference at a time.

At present, computers in practice do only one thing at a time. Because of this, matters of efficiency arise, short cuts are taken to aid the evaluation process, and time dependencies arise. Modular style isolates these dependencies, but we are left with a number of situations.

3.1   Consecutive Evaluation

A list of statements is executed in sequence. The time sequence is important, as later statements are dependent on earlier ones.

3.2   Projection

Only one of a list of statements is executed, e.g. the if statement. It is important that the alternative statements are NOT evaluated: an unchosen branch may be undefined for the current inputs, may never terminate, or may have side effects.

3.3   Parameter List

This list of statements is time invariant and may conceivably be evaluated simultaneously, a candidate for parallel processing. However the need to evaluate one parameter can be conditional on the values of others (see next item).

3.4   Operators with Zeroes

Statements are evaluated only so long as a ZERO has not been found; the operation is then evaluated.

This procedure allows certain expressions not to be evaluated if it is already known that their evaluation cannot affect the final value of the statement.

e.g. or expr1 expr2

In this case (boolean or), expr2 is not evaluated if expr1 has already evaluated to "1".

The example of "or" is a special case of an associative binary operator. Such operators may have particular values which act as ZEROs. That is to say:

op expr1 ZERO  ==> ZERO
op ZERO expr2 ==> ZERO
no matter what the other expression is.

Even if one of the expressions cannot be evaluated at the time, there is no reason why a value of ZERO cannot be returned by the function once the other expression evaluates to ZERO. This is called "lazy evaluation" in some circles when it is applied to the (and) or (or) functions, but it is really a "special feature", the product of convenience in generating machine code for boolean expressions. The operators *, / and % also have algebraic zeroes, but few languages define lazy evaluation for them because the underlying machine instructions do not do lazy evaluation.

One may ask: what does lazy evaluation matter in theory? Is it not merely an efficiency convenience? If we are dealing with strictly effectively computable functions, yes, this is the case. However, even among these there are readily computable functions and not-so-readily computable functions. If the readily computable function obviates the need to compute the resource-gobbler functions, then all the better.

As well as ZEROs, it is also possible for a language to recognise UNITs. This means that if a multiplicand, for instance, is 1, then no multiplication needs to take place: the result is identically the other multiplicand. That other multiplicand may not itself be evaluable at the time, but this does not prevent the multiplication from being reduced.
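The ZERO and UNIT treatment above can be sketched as follows (in Python, with invented names; Omega's own mechanism is not specified here). Operands are passed as thunks, zero-argument callables, so an operand whose value cannot affect the result is never evaluated at all:

```python
def lazy_fold(op, zero, unit, thunks):
    """Fold an associative binary op over a list of unevaluated operands,
    stopping the moment a ZERO appears and skipping operands equal to the
    UNIT (for which no application of op is needed)."""
    acc = unit
    for thunk in thunks:
        value = thunk()
        if value == zero:
            return zero          # remaining thunks are never called
        if value != unit:
            acc = op(acc, value)
    return acc

def exploding():
    """A stand-in for a resource-gobbler we hope never to evaluate."""
    raise RuntimeError("this operand should never have been evaluated")

# Multiplication: ZERO is 0, UNIT is 1. The exploding operand after the
# 0 is skipped entirely.
print(lazy_fold(lambda a, b: a * b, 0, 1,
                [lambda: 6, lambda: 0, exploding]))   # 0
```

The same fold handles boolean or, whose ZERO is 1 (true) and UNIT is 0 (false): `lazy_fold(lambda a, b: a or b, 1, 0, ...)` stops at the first true operand, which is exactly the short-circuit behaviour described above.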

4   Recursive Function Theory and Lambda Calculus        

Recursive function theory admits an identity function, a countably infinite set of projection functions, and a recursion function extended over countably infinite domains.

Compare this with the lambda calculus, where the symbolic abstraction provides within its makeup the following projection functions:
$x.x, $x.$y.x, $x.$y.y, $x.$y.$z.x, $x.$y.$z.y, $x.$y.$z.z, ....
and also provides for recursion via the function Y = $h.($y.h(y y))($y.h(y y)).

While recursive functions provide a more limited functional basis for development, that basis is by definition more complicated and less satisfying, especially as it is not finite. The lambda calculus provides a single simple rule which encompasses everything that is required. This is a more satisfactory computational basis, as only one basic operator is required, and it is readily manageable. The recursive definition underlying recursive function theory is not particularly easy to use and not as amenable to algebraic manipulation as the lambda abstraction, which has such virtues as beta-reduction.
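The projection functions and the recursion combinator above can be written directly in a modern language. A sketch in Python follows; since Python evaluates arguments eagerly, the strict-evaluation variant Z of the Y combinator given above is used (Y itself would loop forever under eager evaluation):

```python
# The projection functions from the text, as Python lambdas.
I  = lambda x: x                    # $x.x
K  = lambda x: lambda y: x          # $x.$y.x
KI = lambda x: lambda y: y          # $x.$y.y

# Z is the eta-expanded Y combinator, safe under strict evaluation:
# the inner lambda v delays the self-application until it is needed.
Z = lambda h: (lambda y: h(lambda v: y(y)(v)))(lambda y: h(lambda v: y(y)(v)))

# Recursion without any named self-reference: factorial through Z alone.
fact = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))
print(fact(5))   # 120
```

That a single abstraction rule yields both the projections and general recursion is precisely the economy the text claims for the lambda calculus.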

Lambda calculus is a one-dimensional symbolic solution to what is essentially a graphical problem exceeding one dimension. The problem is treated more naturally in the computer, using pointers, than it can be in the linear form of writing. The lambda abstraction does not require a name for the abstract symbol, but simply a locality reference which binds a symbolic slot in an expression to the locality of that expression.

Thus the expression (x y) is no different from the expression ($x.$y.x y), except that the pointers for the first cannot be drawn in because the environmental contexts of x and y are not explicitly available. The internal representation of these two expressions is otherwise the same: (x y) has symbols and no pointer connections, while ($x.$y.x y) has the symbols replaced by pointers.

The implication here is that the lambda abstraction is not an additional "rule" of the lambda calculus but simply a notation used to represent the natural extension of ordinary functional calculus. Indeed it makes functions "honest" by recognising the "environment".

As a result, the computer version of the lambda calculus is simpler than the written version. Since parameters have no names, no renaming rules are required and no name-duplication confusion arises. However, an I/O system is required. The lack of explicit names means that algebraic manipulation of functions is easier; e.g. I = $x.x has an absolute computer representation which is easily recognised.
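One way to realise such a nameless, pointer-like representation (an illustration only, not necessarily the representation Omega would use) is de Bruijn indexing, in which each variable is an integer counting how many binders up its $ sits:

```python
# A variable is ("var", n), meaning "n binders up"; an abstraction is
# ("lam", body); an application is ("app", f, a). Names vanish entirely,
# so alpha-equivalent terms have literally identical representations.

def lam(body):  return ("lam", body)
def var(n):     return ("var", n)
def app(f, a):  return ("app", f, a)

I  = lam(var(0))         # $x.x
K  = lam(lam(var(1)))    # $x.$y.x  -- the variable points one binder up
KI = lam(lam(var(0)))    # $x.$y.y

# Recognising the identity function is a plain structural equality test;
# it does not matter what the variable was "called" in the source text.
print(lam(var(0)) == I)   # True
```

This is the "absolute computer representation" at work: no renaming rules, and function identities such as I can be detected by simple comparison.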

This opens more doors than the recursive function theory possibly could and so lambda calculus is used as the basis for all further development.

©2000 - 2006 WEBSCOOL This page last updated 29 May 2006. All rights reserved - including copying or distribution of any portion of this document in any form or on any medium without authorisation.