Micro Relational Database Project

written 22 July 2001 being revised May 2006

Google site search

ContentsIntroductionBasicIntermediateAdvancedFuturePolicyInfrastructure

1   Introduction

Rationale, history, project structure, resources required, prerequisite knowledge

1.1   Rationale

Historically computer RAM was at a premium that allowed it to be used purely for processing. Information (data) was held on external media such as hard drives and tapes, and was accessed only when required. Today the amount of RAM available in a computer far exceeds the storage requirements to perform everyday computing tasks using large amounts of data. The emphasis has changed from carefully munipulating small amounts of data in a restricted environment, to the orderly management of large data structures which are constanty changing.

Today the demands of server architecture to be able to preform 1000s of similar tasks at once on the same body of information stretches the abilities of external media. As an example a stock catalogue available on the web may be accessed 1000 times per second from around the globe. Similarly search engines such as Google are hammered for instant retreival of simple searches. To solve these tasks the information is held in memory. RDB structure is still used to access the data, but a different emphasis apllies to the technology of database when the whole database if held in RAM. Basically, a lot of the tricky technical problems vanish. For most tasks today RDB's held in RAM are not only the simplest solution but also the best.

Whisking up a small RDB for an application is very easy once you have the basic tools. It is not necessary to spend tens of thousands of dollars on proprietary software and maintain elaborate database management for most tasks. The result of this change in emphasis is the Micro Relational Database. By using tools which are scaled to fit the complexity of the project, a much thinner more lightweight solution can be produced than can be by using proprietary products. In the case of applications which use RDB concepts this can be a huge difference - enough to make something feasible which might have appeared to be impractical, not because of its fundamental difficulties, but because of the resources which would have had to be harnessed in order to implement it.

This project places RDB concepts in the hands of everyone at the level at which they need to use them.

1.2   History

The problems of organising complex data structures becam apparent in the late 1960's when it was perceived that data had more uses in the environment than for simply tracking (modelling, accounting for ) physical processes. The concept of Managements Information System {MIS} was established when it was realised that accounting data could be used to project future sales trends, trends for future orders, future manufacturing plant requirements and so on. A businesses data structures had up to that point developed ad hoc as a model of the its operations, usually using one of two structures, hierarchical, and bill of materials. Bill of materials is a computationally complex data structure that is expensive to operate and can only be justified in high-value, mission critical or huge volume situations. The hierarchical structure was being found to be difficult to manage as computer systems were constantly enlarging. Changing an hierarchical database was a complex job. This prompted the search for a kind of data structure which had a solid theoretical foundation and which minimised the costs of data structure change.

In 1970 E. F. Codd peroposed a data structure based on a set of simple relations (tables). He also proposed conditions on the way the relations related to each other , called Normalization. These conditions guaranteed minimal upheaval to the structure when change was required. A data structure of relations which conform to these conditions are said to be in normal form.

This was a simple idea with a solid theoretical bvase, but not so easy to implement in practice in its day. it was about 1975 before the first commercially available RDB software was available. Two major products came forward, Informix and Oracle, which are today the heavyweights in this field. Informix was developed for the new Unix platforms which included midis and even minis and some of these original systems still operate today. Oracle addressed the IBM mainframe platform and went on to develop others by way of this liason and so ended up with the largest market share today.

The increase in power of PC's and the need by even small web-based applications to manage large amounts of data easily means that there is now a need for light-weight RDB's. There are a number of offerings mainly on Linux platforms such as MySQL The heavyweight products are now too large and expensive to be used for these appilactions.

1.3   Project Structure

The project develops along the theoretical path. This is explained largely because this is an educational environment in which knowledge precedes application ( but also conjoins it). Therefore the project deals firstly with relations which are specially structured lists. The idea of realtion is developed as an object, in order to guarantee structural integrity for inclusion of these fundamental building blocks into more complex appliaction such as an RDB.

Relation is an important mathematical object and a direct application of set theory.

Having developed the procedures required to handle relations, the database structure is developed introducing the ideas of table and keys.

Entities and their properties are then developed from the primitive idea of domain. User access to the RDB is then developed allowing users to interact with the database using two techniques, webpage application and tcl application.

Finally the structure of a collection of relations is developed.

1.4   Resources Required

A PC type computer and the freely available Tcl interpreter are required. A small slow computer is adequate for this project for a small group of students. Any of the UNIX, Windows or Mac OS's can be used and the product will port and run on all these systems.

Students may wish to extend the project using some of the megawidgets written for Tcl. These are avauilable in the Active Tcl standard "batteries included" release. [incr tcl] is also included in this system and may be used as a means of learning about the [incr tcl] extension.

1.5   Prerequisite knowledge

The students should already be familiar with the basic concepts of Tcl. They should have completed a small number of Tcl projects at both basic and intermediate levels. In particular they should have done the Application Base Project in the Advanced Section in order to familiarise themselves with concepts such as encapsulation.

Students must also have a good working knowledge of RDB, not only in how to use one to query information, but also how to structure, build and populate their own database using readily available RDB's such as Microsoft Access ( which is in fact the JetStream RDB). They should have successfully completed a project in which they create their own RDB and use it in an application.

Corequisites

This project can be developed to include many kinds of objects in the database including Lambda objects. It can also be extended cater for distributed tables, both over a computer file structure and also over a network. These aspects can be developed as the project proceeds.

1.6   Meta-Programming

This project involves writing Tcl code that writes other Tcl programs. For example, in the first Lesson the student writes a simple procedure and then generalises the concept and writes a generic program that writes a specific procedure for a specific table. Writing programs that create other programs or manipulate other programs is called meta-programming and involves a higher level of abstraction in ones thinking. It is quite possible that many students may not be able to readily comprehend this level of abstraction. Although it is no different to any other kind of programming, the idea that one creates a programs instead of an immediate result is a major springboard for intelectual development, and for new concepts in tool building. Basically programming is no different to any other tool building technology, eventually you find yourself building machines that themselves create a range of tools. Other examples are robots and mass production plants.

1.7   The Vehicle

Athough this is a meta-project which can be applied to many situations it is helpful to have a particular application as a test case to watch the result develop. For this project the e-shop is to be used. This means that the successful project will not only produce a suite of RDB tools but also an e-shop if the students pursue the project that far.

The e-shop is now an important application because the world marketplace is changing. Within 10 years all products will be distributed via global distribution networks. Anyone with a product to sell will hook into one of the major international distribution networks which supply retail stores throughout the world as well as providing delivery service for all goods purchased over the net. Of course, if your product can be delivered directly over the net then your e-shop is still required but you do not need to nook into a global distribution network: the net does that for you anayway.

The result of this change in the global marketplace is that selling products from your home over the net is a reality. In fact it is a reality that is being embraced by many people. These people set up their own website and supply a range of products, which may be a mix of cyber products which they have created themselves or which they sell on commision, plus a range of real product which they may make themselves or which they sell on commision, and which is delivered to customers worldwide via the global distribution network. Needless to say, products must be manufactured in large volumes to meet the demands of the global distribution network. Items in small or limited quantities will tend to be sold via systems such as e-bay.

It is likely that virtually everyone in the future will have their own personal e-store at some time in their life, and so it is appropriate that students get acquainted with this change in the global marketplace which is already becoming a ubiquitous part of their lives.

You may wonder why or even how so many e-stores could ever operate. Surely if Amazon.com can sell all the books published in the world, you only ever need one e-store selling books, namely Amazom.com. All sorts of factors determine what may or may not be sold on a particulare e-store site. Delivery issues, copyright, license, censorship, political and a whole host of issues determine what may be able to be sold in any particular e-store. For instance, it would be pointless to attempt to sell Taiwanese product on a mainland Chinese e-store.

For this reason there will always be many e-stores differentiated by their product range. But their areother reasons for an e-store establishing credentials for its existence. Many of these qualities apply to retailing, such as the "feel" of the shop, the quality of the products presented, the completeness of the product range in a specialty area, the uniqueness of the product, the way the product range ties together into a "style". All these qualities can be amplified in an e-store to provide a service which targets the perceived customer base for the store.

1.8   Program

2   Lesson 1 - Creating table references and a simple Table object

The e-store

What is an e-store?

An e-store is a shop on the web. It consists of pages displaying products for sale which may be real or cyber product. If the products are real you have to consider how you are going to deliver products to your perceived customer base. If the products can be delivered over the net that problem is trivial.

An e-store keeps track of each customers interaction with the store usually by means of a cookie kept of the customers computer, with a copy kept on the e-store's computer to match against in oprder to guard against corruption of the customers data. This data includes the "shopping basket", which is the list of products that the customer has selected to purchase. When the customer wishes to complete purchase he accesses the purchase page in which the total costs of the products including shipping costs are listed and payment details are entered. Each product may have its own individual product page as well as an entry in a page (shelf) of products.

The core of the e-store is the product table. Each product will have some basic fields:

This information is used to create the product pages on the e-store, either dynamically, or by resident pages which have been generated dynamically from the product table. The purchasing page is also created dynamically. The storefront may even be a dynamic page, as many e-stores feel that including current newsfeeds, weather forecasts and other services help to draw customers to their e-store.

Entertainment will become a more important factor in drawing customers to an e-store as will state-of-the-art imagery such as virtual worlds, and a sense of life, such as seeing other people shopping and creating a social hub within the e-store.

Creating the Table Object

The product table resides in memory as a Tcl array indexed by some useful unique identifier such as the product code used to uniquely identify each product for shipping. This same code might also be used for reordering. Entries in the product table will look something like:
set product(1001) {{Wicker Cage} {This wicker cage is just the right size to hold the bird.} 
                        cage.jpg 1 10 }
set product(1002) { Lamp {This lamp is very useful in dark caves.} lamp.jpg 1 20 }
set product(1003) { Flask {This flask holds 1 litre of water.} flask.jpg 1 15 }
How do we reference the fields in this array by name? We need to provide a procedure which returns values according to field names rather than using [lindex $product($code) $i] references.

A reference like [product image $code] is better. To create this kind of reference we need a product procedure:

 proc product {field code} {
    global product
    switch $field {
        code { return $code }
        name { return [lindex $product($code) 0] }
        description { return [lindex $product($code) 1] }
        image { return [lindex $product($code) 2] }
        stock { return [lindex $product($code) 3] }
        price { return [lindex $product($code) 4] }
    }
    return ""
 }

We can of course just write this code for this table, but a table is a generic object and we can write a program that generates the code we want for any table that we may wish to use. For this we will use a simple form of a table command. Simple because the paramteres to the procedure will be kept positional. The definition for the table command is:
table name keyfield {list of field names}

To invoke the table command for the product table we say:

table product code {name description image stock price}
The exercise for this lesson is to write the Tcl table procedure which will produce the correct code.

The "Answer"

This may appear to be a trivial exercise, but it is a trivial exercise in metaprogramming and meta-programming is never a trivial exercise. THere are always two realms of operation to keep in mind simultaneously, the variable space of the end result and the variable space used to get the end result. The answer to this exercise is not given as it is imperative that teachers perform this exercise themselves in order to get a good understanding of the processes and techniques involved in metaprogramming. The real trick with this lesson is not to smash your keyboard in frustration before you have completed it.

Using the test data provided above (obviously derived from an the original game of "adventure") the following lines should provide the expected results.

table product code {name description image stock price}
info body product
product code 1001
product name 1003

Assessment

This exercise is a pons asinorum (an asses bridge), meaning that if the student can successfully complete this exercise then he/she is capable of meeting all the future challenges along the way. Students who cannot complete this exercise should be directed to work in the areas of this project which do not require metaprogramming. For instance they can write test suites, do the basic program and widget designs, and other tasks.

This is a short simple exercise that can be done in about 30 mins if plenty of mistakes are made. If the student returns simply what has been asked for, that is a table procedure which produces the code just as it is asked for, then a good pass (B-) should be awarded. Some students however will not be able to work out how to even get started. This should be expected. For example they will not be able to work out what the command is that causes the product procedure to be executed. If the class has a good competitive spirit, cribbing may be avoided. It should be made clear that the exercise will be assessed on the individuals own ability to solve the problem and that each hint that is required in order to complete the exercise will result in a mark reduction.

Completeing this exercise using ones own resources to solve the problem will cause the students to review much of their knowledge of the basics of Tcl and stand them in good stead to contribute to the completion of the rest of the project. Students who cheat their way through this exercise will be a liability to the rest of the class for the rest of the project.

Some students will be unsatisfied with the source code provided to metaprogram, and quite justifiably so. It provides no error conditions for missing keys or wrong field names. Students who return code which includes either or both of these features should be awarded higher marks.

In addition some students may be unhappy with the specification and want to insert a get method after the table name. This would bring the object character of the table more strongly into view allowing for table add ..., table delete ... and other object methods to be defined.

3   Lesson 2 - Extending the Table object with update methods

Relational Databases provide the internationally recognised SQL (Statement Query Language) in order to manipulate redords in the database. This is a full and very complex language, particularly with regard to the logic of retrieving data from the table.

It is not the scope of this project to replicate that degree of functionality of SQL, but rather to create a table object which behaves as such within the context of Tcl and which provides enough functionality to enable the Tk widget functions which will be introduced as the project develops.

To begin with we want to be able to extend the table procedure to reference object methods. The code in exercise 1 is extended to insert a method ference so that it conforms to other object definitions used in Tcl, Tk objects in particular. So instead of table field key to get the value of a field, we use table get field-list key.

To add records to the table we use table add field-value-list

To delete records by key we use table delete key

The exercise for Lesson 2 is to implement these three extensions. The sample code we want to generate is:

proc <b>product</b> {<b>method args</b>} {
    global <b>product</b>
    switch $<b>method</b> {
        get {
            if { [llength $<b>args</b>] != 2 } { return {}}
            set fieldlist [lindex $<b>args</b> 0]
            set code [lindex $<b>args</b> 1]
            foreach i $fieldlist { 
                set vlist \"\"
                switch $field {
                    code { append vlist $code }
                    name { append vlist [lindex $product($code) 0] }
                    description { append vlist [lindex $product($code) 1] }
                    image { append vlist [lindex $product($code) 2] }
                    stock { append vlist [lindex $product($code) 3] }
                    price { append vlist [lindex $product($code) 4] }
                }
            }
            return  $vlist
        }
        add {
            if {[llength $args] != 1 } { return {}}
            set valuelist [lindex $args 0]
            set product([lindex $valuelist 0]) [lrange $valuelist 1 end]
            return \"\"
        }
        delete {
            if { [llength $args] != 1 } { return {} }
            set keyvalue [lindex $args 0]
            unset product($keyvalue)
            return \"\"
        }
    }
}

3.1   Assessment

Assessment is as for Lesson 1. Hints should not be necessary, but adequate error dedection added into the design should earn extra credit.

4   Lesson 3 - Creating a simple selection function statement for a simple Table object

In this lesson we create a command that will be used to action a table record widget. This command returns a list if keys of records which satisfy the constraints. In this case the constraints are applied simply, field by field. They may be of several types, relative operators (=, <, >, <=, >=) folowed by a value for comparison, or tilda (~) followed by a glob-style pattern to match against. If the type is blank then no comparison is made.

The format of the command is:

tablename find { {type0 value0} {type1 value1} ...}
The section of code within the switch statement required for this is:"
proc product {method args } {
     global product condition keylist
 #     find {}
          if { [llength $args] != 1 } {return {} }
          set j -1 ; set fieldlist [lindex $args 0]; set condition 1
          foreach i $fieldlist {
              if { [llength $i] != 2 } continue
              set op [lindex $i 0]
              set value [lindex $i 1]
              if { $op == \"=\" } { 
                  if { $j == -1 } { 
                      append condition \" && \$key == $value\"
                  } else { append condition \" && (\[lindex \$product(\$key) $j]\ == $value)\" }
              }
              if { [regexp {<|>|==|<=|>=} $op] } { 
                  if { $j == -1 } { 
                      append condition \" && ( \$key $op $value )\"
                  } else { append condition \" && (\[lindex \$product(\$key) $j]\ $op $value)\" }
              }
              if { $op == \"~\" } {
                  if { $j == -1 } {
                      append condition \" && \[string match $value \$key \]\"
                  } else { append condition \" && \[string match $value \[lindex \$product(\$key) $j\]\]\" }
              }
              incr j
          }
          # now have the search criteria in condition
          set keylist {}
          foreach key [ array names product] {
              if { [expr $condition] }  { lappend keylist $key}
          }
          return $keylist
       }

5   Lesson 4 - Creating a simple RDB object

6   Lesson 5 - Creating Data Types

7   Lesson 6 - Extending the select function

8   Domains

Within any database certain fields are used in a bnumber of tables. The field represents a special kind of thing such as a product (product code), a client (client code), phone number and so on. These notions of what a feild represents are called entities. So we talk about the product entity, which is represented by all the product code fields in all the tables and the way that they are used. Added to this notion of entity is the table which represents the entity. This is a table in which the entity field is a primary key.

Most databases revolve around a small number of key entities, such as client, supplier, product, job. Other entities which are commonly understood are attached to these such as phone number or address. So that the client's telephone number is considered to be a part of the clients entity. Entity is as much a database concept as a particular piece of code that you cam point to.

Because an entity field, and other kinds of fields too, appear in different tables throughout a database. it makes sense to centralise all the attributes associated with the field for all the table. This is done using a domain. The domain is the record which holds all the attributes for a feild hwereever it may appear in the tables. There is a domain table which has an entry for every domain type that you wish to create which contains information such as:

There are a large number of things that we wish to do with a field in a general application and the domain can contain most of this information.

Most key entities in a database will have their own domain, because they tend to have special behaviour, for validation, update etc. Many other fields are sort of generic, and we create some generic domains to start with.

The behaviour of a domain is defined in the database at three levels.

Any domain that is unspecified (not found in the domain table) deafults to the char domain.

  1. The char, text and image domains are hard coded with default attributes (WHAT ARE THESE ?) There typically default to the defaults for the Tk widget attributes. (the tk widget attributes have default hard coded attributes which can be overridden by entries in the attribute database using [ option add ] )
  2. The domain(char) table record can have different attributes applied, such as widget width.
  3. The dbfield record for the field in the table can have overriding attributes put in its attrs field.
At present a domain relies on having just type (comparison) and widget ( display) fields set.

8.1   Domain type

Domain type is used to determine how field values are compared. You can only compare fields which have the same domain type. i.e. they use the same algorithm for comparison. Only char type is programmed at present.

8.2   Domain widget

The domain widget is the widget or html equivalent used to display the field for input purposes. This will typically be a combination of widgets placed in a frame with the name .table$table.$field

To provide your own widget type for input you need to provide procedure for the widget type with following functions:

Programmed widget types are entry, text and image.

To have a different type enter the name of a procedure which takes two arguments { a b } which it compares. The procedure must return a value < 0 if arg a < arg b, 0 if arg a == arg b, a value > 0 if arg a > arg b.

©2000 - 2006 WEBSCOOL This page last updated 11 May 2006. All rights reserved - including copying or distribution of any portion of this document in any form or on any medium without authorisation. For more regarding the copyright.