Low-tech approaches to information sharing - general

(draft 17 May 2006)

 

 

Introduction

 

General theme of the advantages of simple, inexpensive, universal approaches to be elaborated here.

 

 

Spreadsheets for basic data storage and exchange

 

In a number of other items in the CS section of my menu, I have extolled the advantages of spreadsheets for storing, using, transmitting and sharing tabulated information in the electronic equivalent of a huge easily edited filing index card.

 

To recap:

 

● They are universally available and sufficient for data storage and exchange and for carrying out a lot of analysis and interpretation. Almost everybody will already have spreadsheet software as part of a general “entry-level” software package whereas “relational databases” (see Relational databases [forthcoming]) that do cleverer things in more elaborate ways come as more expensive add-ons that fewer other people can use.

 

● Reading spreadsheets involves essentially only the application of the knowledge of the text-processing command menus that someone who writes manuscripts in any Windows-based text software already has.

 

● Simple data entry in spreadsheets involves very little more on top of transferring across what you know from the text processing software.

 

For the “letter writer’s guide to spreadsheets” see Spreadsheets for people who don’t already know that they already know how to use them.

 

 

The clincher is that the spreadsheets can be easily imported into relational database software that does cleverer things but is more expensive, held by and known to only a proportion of potential collaborators and which involves a bigger learning curve than the transition from text processing to spreadsheets. Thus spreadsheets have all the advantages of universality, cheapness and simplicity but close off no options and can be easily incorporated into more complex applications.

 

It works both ways. The relational database software can export information to spreadsheets. Spreadsheets, available to everybody, can therefore be imported to relational databases to carry out operations that spreadsheets cannot do (or do as efficiently) and the results exported back into spreadsheets that can be used by everybody. Through this combination of software, several people who have only spreadsheet software and spreadsheet operating knowledge can collaborate with someone who is geared up for relational databases (perhaps the librarian or archivist at your local maritime museum with whom you want to share your work in any case, or maybe even me as the girth of the world is no great barrier with available inexpensive exchange media).

 

Accordingly, I recommend that people who have relational database software and are comfortable with it still develop and keep their basic data in spreadsheets in order to maximise information sharing opportunities. Alternatively, conduct experiments and keep your database file formats within the limits of what will work when exported to spreadsheets. It is always easy to make stuff more complicated and to add bells and whistles, bright primary colours and flashing lights; much harder to reverse the process. You don’t hear the KISS – Keep It Simple Stupid – mantra the way you once did but it is just as relevant as ever - arguably more so in these days of computer software that tries to be user-friendly by double-guessing what it thinks you probably want it to do to a degree that can be actually counter-productive.

 

In a later item in this section of my menu I provide a guide to using spreadsheets in conjunction with relational database software. (forthcoming)

 

 

The downside of Excel (Bill Gates please note)

 

A “fly in the ointment” with Microsoft Excel spreadsheets is that Microsoft apparently believes that history started in 1900. Many a genealogist and historian has complained bitterly because Excel does not permit you to write earlier dates as “datevalues” that you can use in computations. You can enter them as text but then they won’t sort as dates and you can’t subtract one from another (eg departure date from arrival date to calculate duration of passage). You can enter day, month and year in separate columns and sort by year/month/day but that isn’t very convenient especially as Excel only offers you a three-way sort (Quattro Pro gives you five). (A desperate expedient is to add 100 years to 19th century dates with a global warning to the reader to subtract 100 years which will only matter where leap years affect the use but one should not be driven to such expedients.) For most maritime applications the year alone is sufficient but it will be a problem with shipping arrivals and departures and port registration dates.

 

The problem has implications for copying to and fro between spreadsheets and relational databases and limits the use of all Microsoft products as a combined package for any historical purpose. Microsoft could surely fix this overnight if it wanted to. If you ever meet Bill Gates explain this to him nicely.

 

Quattro Pro

 

Quattro Pro is another spreadsheet package with the advantages of accepting pre-1900 dates as datevalues, permitting five-way sorts and Revealing rather than “Unhiding” hidden columns for those who speak English. My old QP copy will only accommodate 8,192 rows which rules it out for the 20-50,000 row files I now use routinely. The universality of Microsoft products makes it hard to exchange information as readily if you don’t speak Microsoft.

 

Excel and Quattro Pro will import from and export to each other but only in old formats that will not accommodate fancy formatting – not necessarily a bad thing. (Quattro Pro pre 1900 datevalues will only translate in Excel as “#VALUE” which is useless of course but Excel’s fault not Quattro’s.) If Quattro Pro works for you by all means stick with it. It will export to Excel at the v5/v7 level at least but Excel will only export to it at the DOS level. You can do it but sharing information this way is a bit like trying to persist with speaking and writing your dialect of English in another country where most other people speak another. For me, the clincher is the number of rows. There may well be a later version of QP that gives you as many. The last time I checked QP was sold bundled with WordPerfect by Corel.

 

It may be possible to export datevalues from Quattro Pro to Paradox relational databases (once if not now sold bundled with Quattro Pro) and to exchange datevalues between Paradox and Microsoft Access.

 

 

Designing your spreadsheets to be importable to relational databases

 

A few simple guidelines will suffice to ensure that spreadsheets can be incorporated in relational database applications without bother. The principal thing is to keep numerical data columns strictly numerical (apart from the column heading). In other words, put only numbers in year, tonnage, dimensions, horsepower and similar columns. If you need to qualify a number with some words – for example that you put the year in which it was rebuilt in the built column as that is the only build information you have – set up one or more dedicated “Notes” columns for such qualifications to segregate text from numbers. This isn’t only for the convenience of database software as it is equally important to numerical procedures within the spreadsheet itself. You cannot add 900(net) to other numbers or use it in calculating averages or subtract 1883(rebuilt) from a FATEYEAR of 1903 to calculate its life as 20 years. It is not a problem including numbers as well as text in columns like ship names although they will sort as text rather than numbers (thus FRED 10 will sort between FRED 1 and FRED 2 rather than after FRED 9).

 

 

CD-ROM’s for inexpensive information exchange

 

A following item in similar “low-tech” vein discusses this option. Click here to go to it directly.

 

to return to main maritime menu click here or use back arrow to return