January 2003
An MPE Journey Down the Linux Road
QSS pilots its education app onto Open Source
By John Burke
Second of two parts
Last month we looked at the project plan, goals,
objectives, and initial decisions of HP 3000 software vendor
Quintessential School Systems (QSS) as it moves its
software from MPE/iX to Linux. QSS provides software and consulting
services to K-12 school districts and community college systems.
School systems are understandably cost-conscious,
so for competitive reasons QSS had already started
investigating a migration of its software to Linux before HP
announced in late 2001 its intention to leave the HP 3000 market.
This put QSS ahead of most ISVs and non-ISVs in determining how to
migrate traditional HP 3000 COBOL and IMAGE applications.
At HP World 2002, QSS principal Duane Percox talked
on "Migrating COBOL and IMAGE/SQL to Linux with Open
Source." I have combined material from Percox's
presentation with an interview to create this article. This month, we
will look at how well the vendor's test migration worked and
what QSS discovered along the way.
Moving to SQL
The company's key first step was to
translate IMAGE calls to SQL. Figure 1 above shows the typical IMAGE
Access Model. A COBOL program opens a database, does a find by some
key value, and then, in a loop, gets records, selecting the desired
subset of records by testing values. The database access is tightly
coupled with the program and operating system.
Figure 2 (at left) shows SQL access from a
COBOL program on the HP 3000 to a relational database on a Linux
server. QSS took this step as an intermediate proof of concept. All
the programming remained on the HP 3000, but the relational database
(PostgreSQL) was on a Linux server.
QSS took out the IMAGE calls and replaced them with
calls to its SQL interface library (QDBLIB, written in GNU
C). This was all possible because PostgreSQL has been
ported to MPE and the database's native library routines run on
MPE. This allowed QSS to run its code just as if it were running its
IMAGE version. The only difference was that the programs were now
accessing a PostgreSQL database on a Linux server.
The high number of connections needed to attach to the networked
database drove down performance. "Performance was reasonable for
testing," Percox said, "but not stellar due primarily to
the overhead of the SQL access versus IMAGE access and the TCP/IP
network transport." The technical topology was HP COBOL II to QDBLIB
to libpq to network to the PostgreSQL server process.
Figure 3 (at left) shows the final step that
moved the application and its databases onto Linux. QSS ported QDBLIB
to Linux, took sample COBOL code to the Linux box and used TinyCOBOL
(www.tinycobol.org) to compile and run the same basic programs used
in the previous example. Moving everything to one environment made
the application much faster.
"Performance in this model was very good,"
Percox said, "especially considering the Linux server was a
fairly basic IDE machine with little memory compared to what you
would use for a production system. Also, when you access the
relational database in this fashion, you are actually using shared
memory and not TCP/IP transport. This is transparent to your
application and handled completely by the libpq routines. This makes
it trivial to move our database to another server, since no software
modifications are required."
Porting the pilot
QSS took an existing database from its Fixed Assets
system and ported it to PostgreSQL. The primary focus of the pilot
project was the processing of a detail set with 70 fields and two
paths, an organization number and an organization number concatenated
with a 10-character asset number.
In order to simulate lots of database activity, QSS
wrote a test program that would prompt for 10 organization number
values. The test program then, in a loop for each entered
organization number, would select the set of records that matched
that organization number. It then wrote those records to a flat
file.
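A test driver along these lines can be sketched in a few lines of Python. This is only an illustration, not the QSS program, which was COBOL calling QDBLIB. The standard-library sqlite3 module stands in for PostgreSQL so the sketch is self-contained, and the table name assets, its three columns, the fixed-width record layout, and the function names are all hypothetical.

```python
import sqlite3


def dump_orgs(conn, org_numbers, out_path):
    """For each organization number, select its rows and write them to a flat file.

    Returns the number of records written.
    """
    written = 0
    with open(out_path, "w") as out:
        for org in org_numbers:
            cur = conn.execute(
                "SELECT org_no, asset_no, description FROM assets "
                "WHERE org_no = ? ORDER BY asset_no",
                (org,),
            )
            for org_no, asset_no, description in cur:
                # Hypothetical fixed-width record: 4 + 10 + 30 characters.
                out.write(f"{org_no:<4}{asset_no:<10}{description:<30}\n")
                written += 1
    return written


def make_demo_db():
    """Build a tiny in-memory table so the sketch can be run as-is."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assets (org_no TEXT, asset_no TEXT, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO assets VALUES (?, ?, ?)",
        [
            ("0001", "A000000001", "Desk"),
            ("0001", "A000000002", "Chair"),
            ("0002", "B000000001", "Projector"),
        ],
    )
    return conn
```

Calling dump_orgs(make_demo_db(), ["0001", "0002"], "assets.txt") would write one fixed-width line per matching record. The real test prompted for 10 organization numbers rather than taking a list.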
One issue that is not immediately obvious when you
consider migrating from MPE, COBOL, and IMAGE to Linux, COBOL, and a
relational database is the difference in the way data is stored.
Percox said typing accounts for much of that difference.
"Relational databases are strongly typed, and
IMAGE is not," he said. "This meant we ended up
standardizing on these basic data types: CHAR[??], DATE, TIME,
NUMERIC. When you get a record from IMAGE, you pass a memory address
and IMAGE sends you a string of bytes starting at that memory
location. These bytes typically overlay a record definition so the
access is very fast. This is possible because of the weak data
typing, and the fact that IMAGE just treats the information as bits
and doesn't really care."
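The overlay idea can be illustrated with Python's struct module, which maps a run of raw bytes onto a fixed layout without parsing any text. This is a rough analogy only. The layout below (a 4-byte organization number, a 10-byte asset number, and a 4-byte binary integer) and the names RECORD and overlay are hypothetical, and nothing here reflects IMAGE's actual storage format.

```python
import struct

# Hypothetical record definition: 4-byte org number, 10-byte asset number,
# and a 4-byte big-endian signed integer holding a cost in cents.
RECORD = struct.Struct(">4s10si")


def overlay(buffer, offset=0):
    """Map the bytes at `offset` onto the record layout; no text parsing occurs."""
    org, asset, cents = RECORD.unpack_from(buffer, offset)
    return org.decode("ascii"), asset.decode("ascii"), cents


# Build one sample record the way a database might hand it back: as raw bytes.
raw = RECORD.pack(b"0001", b"A000000001", 12345)
```

Here overlay(raw) yields the three fields directly from their byte positions, which is why this style of access costs so little per record.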
"In contrast," he said, "Linux relational
databases like PostgreSQL will return a 2-dimensional table of
'c' terminated strings. The row of the table is equivalent
to a record of the database, and the column is the same as a field.
These strings are in human-readable format and cannot be used by your
COBOL program unless they are converted to standard COBOL
variables.
"For example, a number will be returned as
'123.45'. You will want to convert this to your record
buffer with the variable having a standard COBOL picture. This
conversion of result data is what consumes the additional CPU cycles
that cause slow database access. Consider a table of 10,000 rows and
70 columns. This would be 700,000 'c' strings that would
have to be converted. No wonder IMAGE is faster."
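The per-field conversion and the arithmetic behind the 700,000 figure can be sketched as follows. The article does not describe how QDBLIB performs the conversion, so this is an assumption-laden illustration: the target picture PIC 9(5)V99 (seven display digits with an implied decimal point) and the function names are hypothetical.

```python
from decimal import Decimal


def to_pic_9v99(text, int_digits=5):
    """Convert text such as '123.45' to unsigned display digits for PIC 9(n)V99.

    The decimal point is implied, so the value is scaled by 100 and zero-padded.
    """
    value = Decimal(text)
    if value < 0:
        raise ValueError("this sketch handles unsigned fields only")
    scaled = int((value * 100).to_integral_value())
    digits = f"{scaled:0{int_digits + 2}d}"
    if len(digits) > int_digits + 2:
        raise OverflowError(f"{text} does not fit in PIC 9({int_digits})V99")
    return digits


def conversions_needed(rows, columns):
    """Every returned field is a separate string that must be converted."""
    return rows * columns
```

With these definitions, to_pic_9v99("123.45") gives "0012345", and conversions_needed(10_000, 70) gives 700000, matching the figure in the quote.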
"The conversion overhead shows up while using a
wrapper library," Percox explained. "This is why using a dumb
IMAGE wrapper library can lead you into performance nightmare-land.
By not adjusting your source code in any way you have no way to take
advantage of the differences in the SQL engine. For example, instead
of returning all 10,000 rows, it might be better to include selection
criteria in the original SQL SELECT to reduce the number of rows
returned, thereby reducing the number of conversions required. Also,
do not return all columns, only the columns you need, thus
significantly reducing the number of field conversions."
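The effect of narrowing the SELECT can be shown with a small experiment. As a hedged sketch, the standard-library sqlite3 module again stands in for PostgreSQL, and the table name assets, the column names org_no and f1 through f69, and both function names are hypothetical. The point is only to count how many field values come back and would therefore need converting.

```python
import sqlite3


def count_fields(conn, sql, params=()):
    """Run a query and return how many individual field values it returns."""
    return sum(len(row) for row in conn.execute(sql, params))


def make_wide_table(rows=100, columns=70):
    """Create an in-memory table with `columns` text columns and `rows` rows."""
    conn = sqlite3.connect(":memory:")
    names = ["org_no"] + [f"f{i}" for i in range(1, columns)]
    conn.execute(
        f"CREATE TABLE assets ({', '.join(n + ' TEXT' for n in names)})"
    )
    placeholders = ", ".join("?" * columns)
    conn.executemany(
        f"INSERT INTO assets VALUES ({placeholders})",
        # Spread the rows evenly over ten organization numbers, 0000 to 0009.
        ([f"{r % 10:04d}"] + ["x"] * (columns - 1) for r in range(rows)),
    )
    return conn


conn = make_wide_table()

# Wrapper-style access: fetch everything and filter in the program.
everything = count_fields(conn, "SELECT * FROM assets")

# SQL-aware access: let the engine select the rows and columns needed.
needed = count_fields(
    conn,
    "SELECT org_no, f1, f2 FROM assets WHERE org_no = ?",
    ("0003",),
)
```

On this 100-row, 70-column sample, the first query returns 7,000 field values and the second returns 30, so the conversion work shrinks in proportion to both the rows and the columns left out.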
Open source pros and cons
QSS set out to use low-cost, vendor-independent Open
Source solutions wherever possible. In the case of COBOL, there were
only two options: GNU COBOL and TinyCOBOL. Percox said that GNU COBOL
was clearly not ready and is moving very slowly.
QSS chose to use TinyCOBOL for the pilot. However,
Percox noted, "We discovered that TinyCOBOL does not have very
good compiler error/warnings. We found it basically not usable for a
production development shop. Copylib members have to be separated out
into separate files in a directory. Compiler error messages report
the line number, but the line number is not accurate when you have
any copy members. Since our code is full of copy books this makes
finding source code errors nearly impossible. In summary, we found
TinyCOBOL was okay for test or demonstration projects, but not a
long-term choice for our migration."
QSS started out using Whisper Technology's
Programmer Studio and TinyCOBOL. Much of the QSS source had been
developed with Robelle's Qedit and contained Qedit line tags
that Percox said TinyCOBOL mistreated. I asked what QSS
was going to use for its actual migration effort.
"We are currently debating this entire
issue," he said, "and the debate centers around using CVS
for source code management. If we switch to CVS then line-tags become
a thing of the past and we use the built-in difference engine of
CVS."
Percox also pointed out that commercial COBOL
compilers on open systems seem to let the programmer
define the source format: fixed column, which would allow line tags,
or free format, which gets confused by line tags. "We have decided to
use Fujitsu NetCOBOL (www.netcobol.com) for the actual
migration," he said.
I asked whether it was fair to call the work QSS had
done so far a feasibility study that turned into a migration plan.
Percox replied, "I wouldn't put it like that. We
established a migration plan framework with various components. One
of the components of the plan was migrating to a SQL-92 compliant
relational database. A sub-part to that plan was evaluating the
ability of open source relational databases to do work effectively. We
did various investigative works prior to this pilot that validated
the migration plan. The pilot project was used to establish
additional validation for the plan with an actual QSS database."
New developments and lessons
Finally, I asked if there were any developments since
Percox prepared his talk for HP World 2002 that QSS would like to
share. He responded, "The primary development is the SAP DB Open
Source relational database, which is looking very promising.
"With regard to Linux, as we continue to learn
more and more about the community developing products for Linux, we
are finding many great solutions for establishing high availability
deployments with reasonably low cost. By low cost I am speaking of
pricing in the single digit thousands for file systems and such that
provide for high availability automatic fail-over and the sharing of
resources in a clustered environment."
Percox said the savings in moving to Linux are
substantial over using a vendor-supplied Unix. "What would cost
tens of thousands of dollars, if not hundreds of thousands of dollars
extra in the HP-UX world, is available in the Linux world for
significantly less. A number of these vendors had booths at HP World
and they were very knowledgeable about creating high availability
solutions using Linux on clustered commodity priced IA-32 based
servers."
I think the bottom line from the QSS experience is
that the combination of an Open Source operating system (Linux) and
an Open Source DBMS (PostgreSQL or SAP DB) is both a viable and
attractive migration target if you are migrating. If your source code
is COBOL, then you will probably have to purchase a commercial
compiler since the Open Source COBOL compilers are not ready for
prime time. Note that this experiment did not address interactive
screens; that is the subject for another project.
My thanks go to QSS and Duane Percox for sharing the
QSS migration experience and lessons learned. QSS's focus
differs from that of 3000 customers looking to migrate or replace home-grown
applications. For QSS, buying a new package is not an option. Since
its customers tend to be very cost sensitive, an Open Source cost
structure has a lot of appeal for an ISV like QSS providing a
packaged solution. Customers looking to migrate homegrown
applications to other platforms might want to stay with commercial
operating systems, databases and compilers for the vendor support.
[Note: Percox's slides are available to Interex
members at www.interex.org/conference/hpworld2002/proceedings.]
John Burke is the founder of Burke Consulting and
Technology Solutions (www.burke-consulting.com),
which specializes in system management, consulting and outsourcing.
John has over 25 years' experience in systems, operations and
development, is co-chair of SIGMPE, and has been writing regularly
about HP e3000 issues for over 10 years. You can reach him at john@burke-consulting.com.