Driving 3000 Information Over the DataBridge
Client-server utility offers plenty of possibilities with powerful data moving
Review by Shawn Gordon
The product known for years as Warehouse
from Taurus Software has been slowly changing its name to DataBridge,
which refers to the Warehouse/Forklift combination. Forklift is
included with Warehouse, so there is no cost associated with the
change.
DataBridge's purpose in life is to
allow you to easily move data around and reformat it. This can range
from the very simple to the very complex. It uses an easy-to-learn
scripting language to facilitate this process. With the addition of
the Forklift piece, you get a client-server front end that will allow
you to graphically build your maps and then generate scripts. See Figure 1 for a sample of a Forklift
mapping process.
By either writing scripts or using
Forklift to generate scripts you can easily (or not so easily)
generate code that will perform data movement and transformation.
There are plenty of built-in functions to do things like type
conversions, type checking and data parsing.
How does it work?
You can approach DataBridge in one of two
fashions. You can make use of the Forklift GUI client-server
interface to visually build your script (see Figure 2 for an example of the main
screen). This method is really nice if you are doing something like
moving from one database to another because Forklift will pull the
record layout for tables and you can just draw lines between the
fields.
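Conceptually, drawing a line between two fields declares a source-to-target mapping. The Python sketch below only illustrates that idea; it is not Forklift output or Warehouse syntax. The field names are borrowed from the sample script later in this review, while the `apply_mapping` helper and the sample values are hypothetical.

```python
# Illustrative only: models "drawing lines between fields" as a dictionary
# of target-field -> source-field pairs. This is not Warehouse syntax.

# Each entry corresponds to one line drawn in the map.
FIELD_MAP = {
    "MAIL-NAME": "MAIL-NAME",
    "TERMINAL-NO": "USER-PASS",
    "PHONE-EXT": "TERMINAL-NO",
    "NODE": "NODE",
    "USER-NAME": "PRINTER",
    "TIME-ON": "TIME-ON",
    "DATE-ON": "DATE-ON",
}


def apply_mapping(source_record, field_map):
    """Build a target record by copying each mapped source field."""
    return {target: source_record[source] for target, source in field_map.items()}


# Made-up sample record standing in for one entry of the source data set.
sample = {
    "MAIL-NAME": "SMG",
    "TERMINAL-NO": 12,
    "PRINTER": "LP",
    "PHONE-EXT": 101,
    "NODE": "ALPHA",
    "USER-PASS": 7,
    "TIME-ON": 830,
    "DATE-ON": 45,
}

target = apply_mapping(sample, FIELD_MAP)
print(target["TERMINAL-NO"])  # value copied from the source USER-PASS field
```

Keeping the mapping as data rather than as a series of assignments is what makes a point-and-click front end practical: the tool only has to edit the table and regenerate the script.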
Forklift will generate a Warehouse script
that you can either execute on the PC and make remote connections
(which is slow), or upload to the HP 3000 and execute locally (much
faster). You can also write the scripts yourself. The scripting
language is interpreted, but Warehouse makes a prepass at it first to
verify the syntax, so it seems to run faster than something that is
interpreting every line as it goes. The language is a kind of
hybrid of features from C, Pascal and Basic. Don't let that put
you off, though; it's pretty straightforward. Take a look at the
script generated from our mapping in Figure 1:
open FYIDB IMAGE fyidb PASS=READ MODE=5
open TEMP FIXED TEMP mode=w
format TEMP_FMT
  MAIL-NAME : IMAGE X12
  TERMINAL-NO : IMAGE I1
  PRINTER : IMAGE X8
  PHONE-EXT : IMAGE I1
  DEPT : IMAGE X26
  NODE : IMAGE X8
  FLAGS : IMAGE X16
  USER-PASS : IMAGE I1
  USER-NAME : IMAGE X30
  TIME-ON : IMAGE I2
  DATE-ON : IMAGE I1
  FLAGS2 : IMAGE X80
end
define
  TEMP_REC : format TEMP_FMT
read USER-M_FLOW_R = FYIDB.USER-M for MAIL-NAME = "SMG"
  setvar TEMP_REC.MAIL-NAME = MAIL-NAME
  setvar TEMP_REC.TERMINAL-NO = USER-PASS
  setvar TEMP_REC.PHONE-EXT = TERMINAL-NO
  setvar TEMP_REC.NODE = NODE
  setvar TEMP_REC.USER-NAME = PRINTER
  setvar TEMP_REC.TIME-ON = TIME-ON
  setvar TEMP_REC.DATE-ON = DATE-ON
  copy TEMP_REC to TEMP
endread
The only reason we are naming fields in
the script is because we are not doing a straight map of all fields.
If you wanted to do something a little more complex with the
information, like total up a variable for all the records, you could
do something like:
setvar total-bill = total-bill +
(numeric(str(MY-BUFF[3],26,9)) * 100)
What we are doing here is using the
str function to pull out nine bytes, starting from byte
26. MY-BUFF is an array of four items, so we are getting the third
item from the array. Finally, we convert the string to a numeric and
multiply by 100 to get rid of the decimal.
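For readers more at home in a general-purpose language, the same expression can be approximated in Python. This is a sketch under assumptions, not a statement of how Warehouse evaluates it: it assumes byte positions and array subscripts both start at 1, as the description above suggests, and that the amount carries two decimal places. The `substr` helper and the buffer contents are made up for illustration.

```python
from decimal import Decimal


def substr(text, start, length):
    """Return `length` characters starting at 1-based position `start`."""
    return text[start - 1:start - 1 + length]


# Hypothetical buffer: four items, each with an amount in bytes 26-34.
my_buff = [
    " " * 25 + "000012.50",
    " " * 25 + "000003.25",
    " " * 25 + "000100.75",
    " " * 25 + "000000.10",
]

total_bill = 0

# MY-BUFF[3] is the third item, which is index 2 in zero-based Python.
amount_text = substr(my_buff[2], 26, 9)

# Decimal avoids binary floating-point surprises when scaling by 100.
total_bill += int(Decimal(amount_text) * 100)

print(total_bill)
```

Multiplying by 100 before accumulating keeps the running total as a whole number of cents, which is the point of "getting rid of the decimal."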
Features
Forklift provides a very simple
point-and-click function for building maps. You need to know the name
and location of the data structure, so you will supply a logon (in
the case of the HP 3000, the IP address, the base name and
password). Forklift will then pull in the data source and you can pick
data sets and items from there. You can have a single source going to
multiple locations as well. Some of the terminology is odd at first,
but once you use Forklift for a bit, it makes sense. I gave it to a
contractor with the manual and after a day he knew almost as much
about it as I did, so it's pretty easy to learn.
All the Warehouse functions are available
in Forklift from a picklist, so you can build up as nasty or complex
a set of nested functions as you wish, just by clicking. It's
pretty fun, actually. I was able to knock out useful scripts within
just a few minutes without having to type anything, and I like
that.
If you choose to write scripts directly,
then you should grab your favorite editor and start typing, then
execute it from Warehouse to run it. I like to have small scripts
lying around to test concepts, so I can make sure I'm on the
right track. I like to do this with COBOL as well. Since Warehouse
can execute a script at any point in its processing, we like to
create global routines and record layouts as external files and then
include them in our script, like you would in COBOL with the $INCLUDE
statement. This allows us the obvious advantage of code and record
layout reuse.
Warehouse also lets you create
your own functions that can be used in expressions. While I
haven't built any yet, I can see how powerful this would be.
I'm a big fan of COBOL macros, and I can see creating custom
Warehouse functions in the same way.
Overall, Warehouse makes for nice
shorthand. One of the neat features is the way you can create record
structures, and this is one of its similarities to the C language.
You can define a record type and then declare a variable that is of
that type. So if you have a record structure that you need to reuse,
you can declare it multiple times under different names. This is one of
those features that is really handy at times, but isn't so well
documented.
Installation and Documentation
Installation of the HP software is cute.
You restore one huge executable, which you then run. It's a
self-extracting archive that creates the correct accounting structure
and puts the files in place. I've never seen anyone build a
self-extracting archive file on the HP 3000 before, so I found it
fascinating.
Documentation is probably the only weak
part of the product. There is a tutorial guide that does an admirable
job of giving you the basics, but the reference guide is rather oddly
arranged, and the examples, especially on the functions and how to
create user-defined functions, are not so great. Some of the
functions are explained in great detail, with plenty of examples, but
a host of others are hard to figure out from what is explained.
This is my only real source of complaint with the product.
The Test Drive
I've actually used Warehouse for
several years now for different types of projects. My first project
allowed the developers to copy client-specific data off our main
system and database into a test database on a test system. I set up a
little MPE file where they could enter a range and/or list of client
numbers and a date range for history information. Using the network
interface of DataBridge, it would read the data across the network
and populate the local database. The extent of the code for each
table basically looked like this:
open TRACE REMOTE SMGANET user=mgr.smga IMAGE TRACE PASS=READ MODE=5
open TRACEL IMAGE TRACE.DB PASS=READ MODE=5
read RUNM = TRACE.RUN-M for CLIENT-ID = PG
  copy RUNM to TRACEL.RUN-M
endread
Now you've got to admit that's
easy. A neat feature that I ran into by accident allows you to change
datatypes in one of the data sets and do a copy. DataBridge will
figure it out, tell you, and take care of the conversion. I had
converted all I2 fields to I4 in the target base and forgotten that
we had made the change. I went to copy the data, got the message from
DataBridge, and everything copied nicely.
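To give a feel for what that kind of conversion involves at the byte level, here is a small Python sketch. It is not how DataBridge does it internally, which the documentation does not spell out. It assumes that IMAGE I2 and I4 items are 32-bit and 64-bit big-endian signed integers respectively; the `widen_i2_to_i4` helper and the sample value are hypothetical.

```python
import struct


def widen_i2_to_i4(raw):
    """Convert a 4-byte big-endian signed integer to its 8-byte form.

    Assumes I2 = 32 bits and I4 = 64 bits, which is an assumption about
    the IMAGE item types rather than something stated in this review.
    """
    (value,) = struct.unpack(">i", raw)
    return struct.pack(">q", value)


source_field = struct.pack(">i", -1234)      # 4 bytes as held in the source
target_field = widen_i2_to_i4(source_field)  # 8 bytes for the target field

print(len(source_field), len(target_field))
print(struct.unpack(">q", target_field)[0])
```

The value survives the round trip, including its sign, even though the field doubles in size. Doing this by hand for every changed item is exactly the chore the automatic conversion removes.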
A new way that I'm using DataBridge
is in manipulating data that is electronically transferred to us.
Everyone sends data in different formats, but we have to put it into
a single format. So we create custom scripts for each layout to put
it into a standardized format, then run a COBOL program to do the
load. We use a COBOL program at this point because we have some
serious edit-checking and reporting to do. Sometimes we go directly
to the database, depending on the data.
We also go the other direction on this and
have a COBOL program extract the data to a standard format, then use
client-specific scripts to put the data into the requested format. By
putting the script names in the database and writing a loop in the
job with MPEX, we are able to create processes that don't ever
need to be updated if clients come or go.
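The idea of driving the job from script names stored in the database can be sketched in Python as follows. This is only an illustration of the table-driven pattern, not our MPEX job: the client IDs, script names, file name and the `plan_runs` and `run_all` helpers are all hypothetical, and the runner merely records what would be executed rather than invoking Warehouse.

```python
# Hypothetical rows standing in for the database entries that associate
# each client with its formatting script.
CLIENT_SCRIPTS = [
    {"client_id": "ACME", "script": "ACMEOUT"},
    {"client_id": "GLOBEX", "script": "GLOBXOUT"},
]


def plan_runs(rows, extract_file):
    """Return one (script, input file, client) job step per client row."""
    return [(row["script"], extract_file, row["client_id"]) for row in rows]


def run_all(rows, extract_file, runner):
    """Call `runner` once per client; the loop itself never names a client."""
    for script, infile, client in plan_runs(rows, extract_file):
        runner(script, infile, client)


# Stand-in runner that only records what would be executed.
executed = []
run_all(CLIENT_SCRIPTS, "STDEXTR", lambda s, f, c: executed.append((c, s, f)))
print(executed)
```

Adding or dropping a client then means adding or deleting a row, while the loop stays untouched.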
We've come up with a variety of
really bizarre scripts, but we've been able to do pretty much
everything we've wanted to. I did find that using Forklift was
great for getting started, but in general you can move a lot faster
writing scripts by hand once you know what you're doing. I still go
back to Forklift every once in a while; it's a pretty handy tool and
a decent way to learn.
Conclusions
There are tremendous possibilities opened
up by using a tool like DataBridge. Initially you might look at it
and think that you can just write programs to do what it does, and
you would be right. The beauty of DataBridge is its ability to easily
move data between different types of storage, and how quickly you can
generate code. You could consider it a batch-oriented 4GL, but it
doesn't really carry the baggage of a conventional 4GL, in that
its performance is excellent.
There is a learning curve involved, but if
you are familiar with C, Pascal, or even CI programming, it will
help. What throws most COBOL programmers off is the nested functions.
Once they get over that hurdle, the light comes on and they start
smoking through it.
Hopefully I've shown you some
non-data-warehouse uses for DataBridge. It's certainly useful
for creating and maintaining data warehouses, and the BridgeWare
component for real-time data warehousing is quite a technical feat
that will make your historical data that much more useful. If you are
looking into data movement at all, or some of the examples I've
described here, I would recommend that you check out DataBridge.
It's a powerful tool with lots of possibilities.
Shawn Gordon, whose S.M. Gordon &
Associates firm supplies HP 3000 utilities, has worked with 3000s
since 1983.