CGI Programming with Tcl

written August 2000

1   The Server

A CGI program is a separate program that is invoked by the web server. Different platforms and servers require different ways of starting the CGI program. It has its own memory space, independent of the server; the only information it shares with the server is the environment discussed in the next section.

Under Windows it is easiest to write your Tcl CGI scripts as .tcl files so that they are automatically executed by wish, although you usually do not use Tk features in CGI scripts.

Under Linux, write your scripts as .cgi files with an invocation to exec tclsh. If they are written as .tcl they will likely be interpreted as plugin tclets. (Tclets expect to be displayed in a window and need to use Tk widgets to display information instead of [puts].) Put the script in the server's cgi directory, usually "cgi" or, in the case of the tclhttpd server, "cgi-bin". Look at any samples already in this directory for examples of how to start your script and access the environment. (First check that the provided scripts work.)
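As an illustration of the "invocation to exec tclsh" mentioned above, here is a minimal sketch of a .cgi script. It assumes tclsh is on the server's PATH; the procedure name hello_page and the page text are made up for this example.

```tcl
#!/bin/sh
# The next line is a comment to Tcl but a command to sh \
exec tclsh "$0" "$@"

# Build the complete CGI response: header line, blank line, then the page.
# hello_page is a hypothetical name used only in this sketch.
proc hello_page {} {
    set body "<html><head><title>Hello</title></head>"
    append body "<body><p>Hello from Tcl</p></body></html>"
    return "Content-type: text/html\n\n$body\n"
}

puts -nonewline [hello_page]
```

The shell starts the script, and the exec line hands it over to tclsh. Tcl treats the exec line as part of the preceding comment because of the trailing backslash. On Linux the file must also be made executable.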

2   The Environment

When a CGI program starts it has data provided to it from a number of sources. What happens is that the client has done one of several things:

In the last two cases data will be associated with the names of the INPUT fields that were used. In all three cases a cookie resident on the client's computer may also be returned. However the cookie will have to have been created by one of your CGI scripts to start with.

The data is provided to the Tcl script through the env array. In particular we have the following array elements:
Environment Variable   Description
SERVER_SOFTWARE is the name and version of the Web Server answering the request.
SERVER_NAME is the server's hostname, DNS alias, or IP address as it would appear in self-referencing URLs.
GATEWAY_INTERFACE is the revision of the CGI specification to which the server complies.
SERVER_PROTOCOL is the name and revision of the protocol this request came in with.
SERVER_PORT is the port to which the request was sent.
REQUEST_METHOD is the method with which the request was made: "GET", "POST" etc.
QUERY_STRING is defined as anything following the first '?' in the URL. Typically this data is the encoded results from your GET form. The string is encoded in the standard URL format changing spaces to +, and encoding special characters with %xx hexadecimal encoding.
PATH_INFO is the extra path information, as given by the client.
PATH_TRANSLATED is a translated version of PATH_INFO provided by the server, which applies a virtual-to-physical mapping to the path.
SCRIPT_NAME is a virtual path to the script being executed.
REMOTE_HOST is the host name making the request. If DNS lookup is turned off, the REMOTE_ADDR is set and this variable is unset.
REMOTE_ADDR is the IP address of the remote host making the request.
CONTENT_LENGTH is the length of any attached information from an HTTP POST.
CONTENT_TYPE is the media type of the posted data (usually application/x-www-form-urlencoded).

The URL 'CGITcl/CGITcl1.cgi?name=webs cool' lists the values of these variables. Note that a parameter name with a value of webs%20cool appears in the location bar of your browser.
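The source of CGITcl1.cgi is not shown here, so the following is only a sketch of how a script might list these variables. The procedure name env_listing is hypothetical, and the output is sent as plain text so that no HTML escaping is needed.

```tcl
# Return one "NAME = value" line for every element of the env array,
# sorted by name. env_listing is a hypothetical name for this sketch.
proc env_listing {} {
    global env
    set out ""
    foreach name [lsort [array names env]] {
        append out "$name = $env($name)\n"
    }
    return $out
}

puts "Content-type: text/plain\n"
puts -nonewline [env_listing]
```

When run by a web server, the listing includes the CGI variables described above alongside the ordinary process environment.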

There are a lot of environment variables and not all of them are to do explicitly with the HTTP request for a web page. The particular ones of interest are:

When a form is submitted using the post rather than the get method, the parameter data arrives via a different variable. In this example the same program CGITcl1.cgi is being invoked, and it will do the same job. In this case the parameter name with the value webs cool has been hidden in an input field of type='hidden' so it does not appear on this page. All that you can see is the submit button.

Here is the xhtml source for the form.
<form action='CGITcl/CGITcl1.cgi' method='post'>
<input type='hidden' name='name' value='webs cool' />
<input type='submit' value='SUBMIT THIS FORM' />
</form>
Notice this time that QUERY_STRING is empty, REQUEST_METHOD is POST and CONTENT_LENGTH is 14. CONTENT_TYPE is now present, which was not the case with the GET method. To get the posted data it is necessary to read it from stdin. Read the number of bytes given by CONTENT_LENGTH; if that variable is not present, read just one line from stdin. Here is a code fragment from a CGI package which covers all the bases and which you can copy if you ever need to do this kind of thing by hand.
if {[info exists env(QUERY_STRING)] && [set query $env(QUERY_STRING)] != ""} {
    # This comes from GET-style forms
} elseif {[info exists env(CONTENT_LENGTH)]} {
    # This comes from POST-style forms
    set query [read stdin $env(CONTENT_LENGTH)]
} else {
    # No content-length so it is only safe to read one line
    gets stdin query
}
Notice that in this case webs cool has been passed to the program as webs+cool. The information has been encoded, and one of the things that gets encoded is the space character, which is replaced with "+".
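To make the encoding concrete, here is a hand-written sketch of decoding a query string such as name=webs+cool. The procedure names url_decode and parse_query are invented for this example and are not taken from any CGI package. The sketch assumes single-byte characters and does not handle multi-byte encodings.

```tcl
# Decode one URL-encoded value: "+" becomes a space and each %xx
# becomes the character with that hexadecimal code.
proc url_decode {str} {
    set str [string map {+ " "} $str]
    set out ""
    set i 0
    set n [string length $str]
    while {$i < $n} {
        set c [string index $str $i]
        set hex [string range $str [expr {$i + 1}] [expr {$i + 2}]]
        if {$c eq "%" && [regexp {^[0-9a-fA-F]{2}$} $hex]} {
            append out [format %c [scan $hex %x]]
            incr i 3
        } else {
            append out $c
            incr i
        }
    }
    return $out
}

# Split a query string into a flat list: name value name value ...
proc parse_query {query} {
    set result {}
    foreach pair [split $query &] {
        if {$pair eq ""} { continue }
        set idx [string first = $pair]
        if {$idx < 0} {
            lappend result [url_decode $pair] ""
        } else {
            lappend result \
                [url_decode [string range $pair 0 [expr {$idx - 1}]]] \
                [url_decode [string range $pair [expr {$idx + 1}] end]]
        }
    }
    return $result
}

puts [parse_query "name=webs+cool"]   ;# prints: name {webs cool}
```

The "+" characters are converted before the %xx sequences, so a literal plus sign sent as %2b still decodes to "+". The returned list can be loaded into an array with [array set].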

3   Using the cgi package

There is so much detail in checking the environment options and decoding the data that programmers use code already written to do this work. One such package of procedures is cgi, written by Don Libes.

Decoding the query data is best done by using a CGI package such as cgi. You will probably have a procedure [Url_Decode] in your Tcl CGI package to decode the query values to their original form. Both encodings do similar things. The GET method needs to remove all special characters from the URL string, and so changes them to hexadecimal equivalents. The POST method also does a couple of special things. If you are using the GET method to send a string with special characters, you should encode the string first using [Url_Encode]. Here is the result of an [info body Url_Encode] command:


    global UrlEncodeMap 
    regsub -all \[^a-zA-Z0-9\] $string {$UrlEncodeMap(&)} string
    regsub -all \n $string {\\n} string
    regsub -all \t $string {\\t} string
    regsub -all {[][{})\\]\)} $string {\\&} string
    return [subst $string]

To see what UrlEncodeMap is about try [global UrlEncodeMap; array get UrlEncodeMap].
 %01  %02  %03  %04  %05  %06  %07  %08 {	} %09 {
} %0d%0a {} %0b {} %0c {
} %0d  %0e  %0f  %10  %11  %12  %13  %14  %15  %16  %17  %18  %19  %1a  %1b  %1c  %1d  %1e  %1f { } + ! %21 {"} %22 # %23 {$} %24 % %25 & %26 ' %27 ( %28 ) %29 * %2a + %2b , %2c - %2d . %2e / %2f : %3a {;} %3b < %3c = %3d > %3e ? %3f @ %40 ? %80  %81 ? %82 ? %83 ? %84 ? %85 ? %86 ? %87 ? %88 {[} %5b ? %89  %c0 \\ %5c ? %8a  %c1 \] %5d ? %8b  %c2 ^ %5e ? %8c  %c3 _ %5f  %8d  %c4 ` %60 ? %8e  %c5  %8f  %c6  %90  %c7 ? %91  %c8 ? %100 ? %92  %c9 ? %93  %ca ? %94  %cb ? %95  %cc ? %96  %cd ? %97  %ce ? %98  %cf ? %99  %d0 ? %9a  %d1 ? %9b  %d2 ? %9c  %d3  %9d  %d4 ? %9e  %d5 ? %9f  %d6  %a0  %d7  %a1  %d8  %a2  %d9  %a3  %da  %a4  %db  %a5  %dc  %a6  %dd  %a7  %de  %a8  %df \{ %7b  %a9  %e0 | %7c  %aa  %e1 \} %7d  %ab  %e2 ~ %7e  %ac  %e3  %7f  %ad  %e4  %ae  %e5  %af  %e6  %b0  %e7  %b1  %e8  %b2  %e9  %b3  %ea  %b4  %eb  %b5  %ec  %b6  %ed  %b7  %ee  %b8  %ef  %b9  %f0  %ba  %f1  %bb  %f2  %bc  %f3  %bd  %f4  %be  %f5  %bf  %f6  %f7  %f8  %f9  %fa  %fb  %fc  %fd  %fe  %ff
In this map tab is encoded as %09, newline as %0d%0a and carriage return as %0d.
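For comparison, here is a loop-based sketch of the same mapping. It is not the package's own code: the name url_encode is hypothetical, and the sketch assumes single-byte characters. It leaves letters and digits alone, turns a space into "+", a newline into %0d%0a, and everything else into %xx.

```tcl
# Encode a string character by character, following the same rules
# as the UrlEncodeMap shown above (an illustrative reimplementation).
proc url_encode {str} {
    set out ""
    foreach c [split $str ""] {
        if {[string match {[a-zA-Z0-9]} $c]} {
            append out $c
        } elseif {$c eq " "} {
            append out +
        } elseif {$c eq "\n"} {
            append out %0d%0a
        } else {
            scan $c %c code
            append out [format %%%02x $code]
        }
    }
    return $out
}

puts [url_encode "webs cool"]   ;# prints: webs+cool
```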

Notice that your webbrowser still fits all the normal text into the width of the browser window even though the <pre> block has a long line in it that does not fit.

4   Cookies

A cookie is a message which resides on the web user's computer. These messages can be created, read and updated by web servers. Cookies are used to contain information that the user may need to "remember" between web browsing sessions, or to accumulate information during a session. Examples are:

A cookie is created by including a "Set-Cookie:" line in the HTTP header of a web page. This line is generated by the CGI program along with the "Content-type:" line. A cookie has several important attributes which all need to be specified for the cookie to work properly.
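Here is a sketch of how a CGI script might send and read back a cookie. The procedure names cookie_header and read_cookies are hypothetical, and the choice of attributes (expires and path) is an assumption made for this example, not a list taken from this page. Browsers return cookies in the Cookie request header, which CGI servers pass to the script as env(HTTP_COOKIE).

```tcl
# Build a Set-Cookie header line. expires_secs is a Tcl clock value.
proc cookie_header {name value expires_secs {path /}} {
    set when [clock format $expires_secs \
        -format "%a, %d-%b-%Y %H:%M:%S GMT" -gmt 1]
    return "Set-Cookie: $name=$value; expires=$when; path=$path"
}

# Return the cookies sent by the browser as a flat name/value list.
proc read_cookies {} {
    global env
    set result {}
    if {![info exists env(HTTP_COOKIE)]} {
        return $result
    }
    foreach pair [split $env(HTTP_COOKIE) ";"] {
        set pair [string trim $pair]
        set idx [string first = $pair]
        if {$idx > 0} {
            lappend result \
                [string range $pair 0 [expr {$idx - 1}]] \
                [string range $pair [expr {$idx + 1}] end]
        }
    }
    return $result
}

# The cookie line goes out before the blank line that ends the header.
puts [cookie_header visits 1 [expr {[clock seconds] + 86400}]]
puts "Content-type: text/html\n"
puts "<html><body><p>Cookie sent.</p></body></html>"
```

In this sketch the cookie expires one day after it is set. Values containing spaces or semicolons would need to be encoded first.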

5   Uploading Files

The basic HTML to upload files is:
<form action='CGITcl/upload.tcl' method='post' enctype='multipart/form-data'>
<input type='file'  name='file' />
<input type='submit' />
</form>
The enctype='multipart/form-data' must be included, and the form must be submitted with the post method. The input type=file can have an accept attribute which can be used to restrict the file type, e.g.
accept='application/msword, application/rtf'

Unfortunately Tcl 8.4.4 has a bug (now reported) in ::ncgi::import_file. The bug is fixed in the file upload.tcl, which is listed here.

puts "Content-type:  text/html; \n

<html><head><title>File upload</title></head>
<body><h3>File upload Example</h3>"

puts "[package require ncgi]"


###################
proc ::ncgi::import_file { cmd  var filename } {
    set vlist [ncgi::valueList $var]
   # pre - $vlist ; bug line
    set vlist [ lindex $vlist 0 ] ;# bug fix
    array set fileinfo [lindex $vlist 0]
    set contents [lindex $vlist 1]

    switch -exact -- $cmd {
        -server {
            ## take care not to write it out more than once
            global ncgi::_tmpfiles
            if {$filename != {}} {
                ## use supplied filename 
                set ncgi::_tmpfiles($var) $filename
            } elseif {![info exists ncgi::_tmpfiles($var)]} {
                ## create a tmp file 
                set tmpfile [::fileutil::tempfile ncgi]
                if {[catch {open $tmpfile w} h]} {
                    error "Can't open temporary file in ncgi::import_file"
                } 
                fconfigure $h -translation binary -encoding binary
                puts -nonewline $h $contents 
                close $h
                set ncgi::_tmpfiles($var) $tmpfile
            }
            return $ncgi::_tmpfiles($var)
        }
        -client {
            return $fileinfo(filename)
        }
        -type {
            return $fileinfo(content-type)
        }
        -data {
            return $contents
        }
        default {
            error "Unknown subcommand to ncgi::import_file: $cmd"
        }
    }

}

###########


::ncgi::parse

if { [catch {  ::ncgi::import_file -client file  "" } client ]} {
    puts "error:$client</body></html>"
    exit
}
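set fo [open [ file tail $client ] w ]
# Write the upload byte for byte: no newline translation, no extra newline
fconfigure $fo -translation binary
puts -nonewline $fo [::ncgi::import_file -data file "" ]
close $fo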

    puts "File $client saved in CGITcl directory."

puts "</body></html>
"

exit

©2000 - 2006 WEBSCOOL This page last updated 11 May 2006. All rights reserved - including copying or distribution of any portion of this document in any form or on any medium without authorisation.