|
October 2004 Data Manipulation Off the HP 3000
Migration Blues Verse 3: Migrating or presenting, manipulate with Microsoft tools By Roy Brown Whether you are migrating data, or just wanting to present it in a more portable format, be aware of how you can manipulate it using those all-pervasive Microsoft tools. When your consulting role takes you across a wide range of HP 3000 sites, you rapidly learn that not everybody has all the add-on tools you might like to see Qedit, MPEX, Adager, Suprtool, and so on. You can rely on whats in FOS, but there are a bunch of things you are brought up short by, that are not so easy without the armory above. So, when I needed to extract and massage data from a bare or nearly bare HP 3000, I pretty soon learned to rely on what I could bring to bear from my laptop, equipped with Reflection and the MS Office suite. Actually, the product I really missed isnt one I listed above its DataExpress, now MBF-UDALink, from MB Foster. Perhaps because Ive never quite mastered its rather quirky interface, I find its often easier to rewrite a query than to modify one. But they are so quick to write that it really doesnt matter especially for multi-set, multi-key extracts. And as it can make your data extract, put it in the format of your choice, and transfer it to your PC via your termulator, all in one go, it lets you skip a whole bunch of what I describe below; stuff you need to do only if all you have is FOS in this area. Extraction Mostly, when grabbing stuff on an ad-hoc basis, I like to list it out in Query, and watch it scroll by in Reflection, with logging to a PC file turned on. I know that I could file equate the output to QSLIST with DEV=DISC, make a file and copy it that way, if I wanted. But this way, I get to see problems as it runs. And if it runs okay, its already on the PC for me. I use Query because its always there. I figure I dont need to do a Query tutorial here though you can email me at roy@kelmscott.co.uk if youd like a copy of one but suffice to say that you can usually walk the paths you need, and pick up the data you want. I generally set LINES=0 or NOPAGE, and I pay attention to numeric field formatting with Edit masks where needed, but I only output Detail lines. Dates I leave in CCYYMMDD format, just as they come. And I dont need to do any math I can save that until Im in Office. But I do hit the 80-character line limit, which is where the first neat WinWord trick kicks in. I use multiple lines, and I mark the end of each line except the last with a string like ### - something that I know wont ever occur naturally in the data ending at position 80. Formatting in WinWord, Part 1 Then I use WinWord to open the .txt file that Reflection has built for me on the PC, and Edit/Replace to change ###^p to one space throughout. Bingo! One long line per returned entry. How does this work? Well, ^p (caret p) is WinWords code for a paragraph mark, which is how each line in the data is terminated. So Im saying Find each line ending in ###, and chop off not just these characters, but also the line ending itself. And then put a space in, to make sure that doesnt cause two fields to run together. If you open Edit/Replace, choose the More tab and then the Special tab, you will be able to see the list of formatting characters you can search for and replace. Paragraph Mark is at the top, and right below it is Tab Character; click this, and youll see caret-t appear in the Find what: window. There are 20 options there in all, but ^p and ^t cover pretty much 99% of what I need to do. I top and tail my output file to remove the original query lines, and the >end at the end. Sometimes, I might then sort it, with Table/Sort and the default options there, to get the detail records in order. But generally, WinWords sort runs out of steam with a file that is more than a few megabytes, so I wait until Im in Excel. So I Save As on the WinWord file, taking care to keep it as a .txt file, and ignoring WinWords warnings about losing formatting if I do. WinWords formatting is exactly what I dont need; .txt is the most versatile format for use here. This excursion into WinWord has really just been to de-block the detail lines, a task which is straightforward here, but nigh-on impossible in Excel. But well find, in turn, that there are things which Excel can easily do for us, while in WinWord they would be nigh-on impossible. Formatting in Excel Next, I open up Excel, and File/Open my .txt file, setting Files of type to All Files so I can see it. Excel comes up with its Text Import Wizard, a most powerful tool that lets me break my file up into individual fields. The option to use here is Fixed Width, and Excel guesses where the field breaks are. It usually does this very well, providing the fields have spaces between them. But if not, I can add, remove or move the suggested breaks. Moving on with Next, I can set the format for each field. Again it usually guesses these right, except for those CCYYMMDD dates. But for those, I just choose Date format, and the YMD option, and it will convert them to Excel dates. On completing the Wizard, I have my data, field by field, neatly arranged in Excel columns. By the way, I could have just Copied from my WinWord file, and Pasted into Column A of the Excel sheet, and then split that. The wizard is available under Data/Text to Columns, for just that purpose. Part 2: Tips for formatting in Excel and Word
Copyright The 3000 NewsWire. All rights reserved.
|