An MPE Journey Down the Linux Road

January 2003

QSS pilots its education app onto Open Source

By John Burke

Second of two parts

Last month we looked at the project plan, goals, objectives, and initial decisions behind the effort by HP 3000 software vendor Quintessential School Systems (QSS) to move its software from MPE/iX to Linux. QSS provides software and consulting services to K-12 school districts and community college systems.

School systems are understandably cost-conscious — so for competitive reasons QSS had already started investigating migrating its software to Linux before HP even announced its intention in late 2001 to leave the HP 3000 market. This put QSS ahead of most ISVs and non-ISVs in determining how to migrate traditional HP 3000 COBOL and IMAGE applications.

At HP World 2002, QSS principal Duane Percox talked on “Migrating COBOL and IMAGE/SQL to Linux with Open Source.” I have combined material from Percox’s presentation with an interview to create this article. This month, we will look at how well the vendor’s test migration worked and what QSS discovered along the way.

Moving to SQL

The company’s key first step was to translate IMAGE calls to SQL. Figure 1 above shows the typical IMAGE Access Model. A COBOL program opens a database, does a find by some key-value, and then in a loop, gets records, selecting the desired subset of records by testing values. The database access is tightly coupled with the program and operating system.
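The access pattern just described can be sketched in a few lines. This is only an illustration, assuming a hypothetical in-memory `DetailSet` class, a made-up `status` field and invented sample records; none of it is the real IMAGE intrinsic interface or QSS code. It simply mirrors the find-by-key, get-in-a-loop, test-the-values flow.

```python
# Hypothetical in-memory stand-in for an IMAGE detail set with one search item.
# It mirrors only the flow described above: find by key, then get records in a
# loop and let the program test values to pick the subset it wants.
from collections import defaultdict


class DetailSet:
    def __init__(self):
        self._chains = defaultdict(list)  # key value -> chain of records

    def put(self, key, record):
        self._chains[key].append(record)

    def find(self, key):
        """Locate the chain for a key value and return an iterator over it."""
        return iter(self._chains.get(key, []))


def select_active(dataset, org):
    selected = []
    for record in dataset.find(org):      # one "get" per pass through the loop
        if record["status"] == "A":       # the program, not the database, filters
            selected.append(record)
    return selected


assets = DetailSet()
assets.put("0100", {"asset": "A000000001", "status": "A"})
assets.put("0100", {"asset": "A000000002", "status": "D"})
assets.put("0200", {"asset": "B000000001", "status": "A"})
print(select_active(assets, "0100"))
```

Note that every record on the chain is fetched, even the ones the program then discards.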

Figure 2 (at left) shows SQL access from a COBOL program on the HP 3000 to a relational database on a Linux server. QSS took this step as an intermediate proof of concept. All the programming remained on the HP 3000, but the relational database (PostgreSQL) was on a Linux server.

QSS took out the IMAGE calls and replaced them with calls to its SQL interface library (QDBLIB – written in gnu ‘c’). This was all possible because PostgreSQL has been ported to MPE and the database’s native library routines run on MPE. This allowed QSS to run its code just as if it were running its IMAGE version. The only difference was that the programs were now accessing a PostgreSQL database on a Linux server.

The many hops needed to reach the networked database drove down performance. “Performance was reasonable for testing,” Percox said, “but not stellar due primarily to the overhead of the SQL access versus IMAGE access and the TCP/IP network transport. The technical topology was HP COBOL II to QDBLIB to libpq to network to the PostgreSQL server process.”

Figure 3 (at left) shows the final step that moved the application and its databases onto Linux. QSS ported QDBLIB to Linux, took sample COBOL code to the Linux box and used TinyCOBOL (www.tinycobol.org) to compile and run the same basic programs used in the previous example. Moving everything to one environment made the application much faster.

“Performance in this model was very good,” Percox said, “especially considering the Linux server was a fairly basic IDE machine with little memory compared to what you would use for a production system. Also, when you access the relational database in this fashion, you are actually using shared memory and not TCP/IP transport. This is transparent to your application and handled completely by the libpq routines. This makes it trivial to move our database to another server, since no software modifications are required.”

Porting the pilot

QSS took an existing database from its Fixed Assets system and ported it to PostgreSQL. The primary focus of the pilot project was the processing of a detail set with 70 fields and two paths, an organization number and an organization number concatenated with a 10-character asset number.

In order to simulate lots of database activity, QSS wrote a test program that would prompt for 10 organization number values. The test program then, in a loop for each entered organization number, would select the set of records that matched that organization number. It then wrote those records to a flat file.
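The shape of that test program can be sketched as follows. This is a rough illustration rather than the QSS program: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, a hypothetical three-column `fixed_assets` table instead of the real 70-field set, and a hard-coded list in place of the prompt for 10 organization numbers.

```python
import os
import sqlite3
import tempfile

# Stand-in database: sqlite3 instead of PostgreSQL, with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fixed_assets (org TEXT, asset TEXT, descr TEXT)")
conn.executemany(
    "INSERT INTO fixed_assets VALUES (?, ?, ?)",
    [
        ("0100", "A000000001", "desk"),
        ("0100", "A000000002", "chair"),
        ("0200", "B000000001", "projector"),
    ],
)


def dump_orgs(conn, org_numbers, path):
    """For each organization number, select its records and write them
    to a fixed-width flat file. Returns the number of records written."""
    written = 0
    with open(path, "w") as out:
        for org in org_numbers:
            cursor = conn.execute(
                "SELECT org, asset, descr FROM fixed_assets "
                "WHERE org = ? ORDER BY asset",
                (org,),
            )
            for org_no, asset, descr in cursor:
                out.write(f"{org_no:<4}{asset:<10}{descr:<20}\n")
                written += 1
    return written


path = os.path.join(tempfile.mkdtemp(), "assets.dat")
count = dump_orgs(conn, ["0100", "0200", "0300"], path)
print(count, "records written to", path)
```

An organization number with no matching records, such as "0300" here, simply contributes nothing to the file.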

One issue that is not immediately obvious when you consider migrating from MPE, COBOL, and IMAGE to Linux, COBOL, and a relational database is the difference in the way data is stored. Percox said typing accounts for much of that difference.

“Relational databases are strongly typed, and IMAGE is not,” he said. “This meant we ended up standardizing on these basic data types: CHAR[??], DATE, TIME, NUMERIC. When you get a record from IMAGE, you pass a memory address and IMAGE sends you a string of bytes starting at that memory location. These bytes typically overlay a record definition so the access is very fast. This is possible because of the weak data typing, and the fact that IMAGE just treats the information as bits and doesn’t really care.”

In contrast, he said, “Linux relational databases like PostgreSQL will return a 2-dimensional table of ‘c’ terminated strings. The row of the table is equivalent to a record of the database, and the column is the same as a field. These strings are in human-readable format and cannot be used by your COBOL program unless they are converted to standard COBOL variables.

“For example, a number will be returned as ‘123.45’. You will want to convert this to your record buffer with the variable having a standard COBOL picture. This conversion of result data is what consumes the additional CPU cycles that cause slow database access. Consider a table of 10,000 rows and 70 columns. This would be 700,000 ‘c’ strings that would have to be converted. No wonder IMAGE is faster.”
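To make the conversion step concrete, here is a minimal sketch of turning a returned string into the digits an unsigned DISPLAY item would hold. The function name `to_display_numeric`, the choice of a PIC 9(5)V99 target and the truncation of extra decimal places are all assumptions made for illustration; this is not how QDBLIB does it. The last lines repeat Percox's arithmetic for the 10,000-row, 70-column table.

```python
from decimal import Decimal, ROUND_DOWN


def to_display_numeric(text, int_digits, frac_digits):
    """Convert a string such as '123.45' into zero-padded digits with an
    implied decimal point, the way an unsigned PIC 9(n)V9(m) DISPLAY item
    would hold them. Extra decimal places are truncated (an assumption)."""
    value = Decimal(text)
    if value < 0:
        raise ValueError("this sketch handles unsigned values only")
    scaled = value.scaleb(frac_digits).to_integral_value(rounding=ROUND_DOWN)
    digits = str(int(scaled))
    width = int_digits + frac_digits
    if len(digits) > width:
        raise OverflowError(f"{text} does not fit in {width} digits")
    return digits.rjust(width, "0")


ROWS, COLUMNS = 10_000, 70
print(to_display_numeric("123.45", 5, 2))  # 0012345
print(ROWS * COLUMNS)                      # 700000
```

Each of those 700,000 strings needs a call of this kind before the COBOL program can use the value.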

The conversion overhead shows up while using a wrapper library, Percox explained. “This is why using a dumb IMAGE wrapper library can lead you into performance nightmare-land. By not adjusting your source code in any way you have no way to take advantage of the differences in the SQL engine. For example, instead of returning all 10,000 rows, it might be better to include selection criteria in the original SQL SELECT to reduce the number of rows returned, thereby reducing the number of conversions required. Also, do not return all columns, only the columns you need, thus significantly reducing the number of field conversions.”
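The effect Percox describes is easy to demonstrate on a small scale. The sketch below again uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, with a hypothetical four-column `assets` table and invented data. It counts how many fields come back, and so would need converting, under a wrapper-style "fetch everything" query versus a query that pushes the selection and the column list into the SQL.

```python
import sqlite3

# Stand-in database with a hypothetical 4-column table and 1,000 rows
# spread evenly over 10 organization numbers ("0000" through "0009").
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assets (org TEXT, asset TEXT, descr TEXT, cost NUMERIC)")
rows = [(f"{n % 10:04d}", f"A{n:09d}", "desk", 100 + n) for n in range(1000)]
conn.executemany("INSERT INTO assets VALUES (?, ?, ?, ?)", rows)


def count_fields(cursor):
    """Count every field the program would have to convert."""
    return sum(len(row) for row in cursor)


# Wrapper style: pull back every row and column, filter in the program.
wrapper_style = count_fields(conn.execute("SELECT * FROM assets"))

# Tuned: let the SQL engine select the rows and return only needed columns.
tuned = count_fields(
    conn.execute("SELECT asset, cost FROM assets WHERE org = ?", ("0003",))
)

print(wrapper_style, tuned)  # 4000 200
```

With this made-up data the tuned query returns one-twentieth of the fields; the actual ratio depends on how selective the criteria are and how many columns the program needs.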

Open source pros and cons

QSS set out to use low-cost, vendor-independent Open Source solutions wherever possible. In the case of COBOL, there were only two options, gnu COBOL and TinyCOBOL. Percox said that gnu COBOL was clearly not ready and “is moving very slowly.”

QSS chose to use TinyCOBOL for the pilot. However, Percox noted, “We discovered that TinyCOBOL does not have very good compiler error/warnings. We found it basically not usable for a production development shop. Copylib members have to be separated out into separate files in a directory. Compiler error messages report the line number, but the line number is not accurate when you have any copy members. Since our code is full of copy books this makes finding source code errors nearly impossible. In summary, we found TinyCOBOL was okay for test or demonstration projects, but not a long-term choice for our migration.”

QSS started out using Whisper Technology’s Programmer Studio and TinyCOBOL. Much of the QSS source had been developed with Robelle’s Qedit and contained Qedit line tags that Percox said TinyCOBOL “mis-treated.” I asked what QSS was going to use for its actual migration effort.

“We are currently debating this entire issue,” he said, “and the debate centers around using CVS for source code management. If we switch to CVS then line-tags become a thing of the past and we use the built-in difference engine of CVS.”

Percox also pointed out that “commercial COBOL compilers on open systems seem to have the ability for the programmer to define the source format, fixed column which would allow line tags or free format which gets confused by line-tags. We have decided to use Fujitsu NetCOBOL (www.netcobol.com) for the actual migration.”

I asked whether it was fair to call the work QSS had done so far a feasibility study that turned into a migration plan. Percox replied, “I wouldn’t put it like that. We established a migration plan framework with various components. One of the components of the plan was migrating to a SQL-92 compliant relational database. A sub-part to that plan was evaluating the ability of open source relational database to do work effectively. We did various investigative works prior to this pilot that validated the migration plan. The pilot project was used to establish additional validation for the plan with an actual QSS database.”

New developments and lessons

Finally, I asked if there were any developments since Percox prepared his talk for HP World 2002 that QSS would like to share. He responded, “The primary development is the SAP DB Open Source relational database, which is looking very promising.

“With regard to Linux, as we continue to learn more and more about the community developing products for Linux, we are finding many great solutions for establishing high availability deployments with reasonably low cost. By low cost I am speaking of pricing in the single digit thousands for file systems and such that provide for high availability automatic fail-over and the sharing of resources in a clustered environment.”

Percox said the savings in moving to Linux are substantial over using a vendor-supplied Unix. “What would cost tens of thousands of dollars, if not hundreds of thousands of dollars extra in the HP-UX world, is available in the Linux world for significantly less. A number of these vendors had booths at HP World and they were very knowledgeable about creating high availability solutions using Linux on clustered commodity priced IA-32 based servers.”

I think the bottom line from the QSS experience is that the combination of an Open Source operating system (Linux) and an Open Source DBMS (PostgreSQL or SAP DB) is a viable and attractive migration target. If your source code is COBOL, then you will probably have to purchase a commercial compiler since the Open Source COBOL compilers are not ready for prime time. Note that this experiment did not address interactive screens – that is the subject for another project.

My thanks go to QSS and Duane Percox for sharing the QSS migration experience and lessons learned. QSS’s focus differs from that of 3000 customers looking to migrate or replace homegrown applications. For QSS, buying a new package is not an option. Since its customers tend to be very cost sensitive, an Open Source cost structure has a lot of appeal for an ISV like QSS providing a packaged solution. Customers looking to migrate homegrown applications to other platforms might want to stay with commercial operating systems, databases and compilers for the vendor support.

[Note: Percox’s slides are available to Interex members at www.interex.org/conference/hpworld2002/proceedings.]

John Burke is the founder of Burke Consulting and Technology Solutions (www.burke-consulting.com), which specializes in system management, consulting and outsourcing. John has over 25 years’ experience in systems, operations and development, is co-chair of SIGMPE, and has been writing regularly about HP e3000 issues for over 10 years. You can reach him at john@burke-consulting.com.