First of a series
We are often ruled by the "tyranny of the urgent," which means we never have an opportunity to control our work agenda, but are pulled to-and-fro by panic situations. The ability to understand -- and actually control -- system performance is unexplored territory when we live and work this way. These articles will show you how you can wrest your HP 3000 performance from that tyrant of urgency and reclaim control. Situations that would normally catch you unprepared can be studied and understood before they occur.
I will discuss how to analyze your three basic resources -- CPU, Memory, and Disk -- and identify bottlenecks that hamper your ability to get maximum performance. I will also show you how to understand your own applications, deal with crisis situations, and plan for future growth (capacity planning).
You will learn how to fine-tune your system so that it performs well for as long as possible. Most importantly, I will help you form an overriding political strategy to support your need for training and skill development in the area of performance.
So you're the system manager, in charge of one or perhaps several systems. These systems each have a defined set of purposes, all aimed at meeting some business goal. Your responsibility is to keep the availability and utility of each of these systems as high as possible. Yet you have received little training in what steps should be taken to ensure success in these endeavors. It seems an assumption has been made that computers are boundless in their abilities and that they are exempt from natural laws. On top of this, you often have other responsibilities that take you away from the task of system maintenance.
Many such obstacles stand in the way of the responsible system manager who wants to take a proactive stance and actually manage the system under his or her care. How can you overcome these obstacles, and what knowledge, tools and skills are needed once a proactive stance is decided upon?
The first step in defeating the forces of urgency is to gain the support of management. Without this foothold, the battle will be difficult, if not impossible. You will still find temporary relief by using some of the strategies and gaining the understanding provided in this series, but like any effective soldier you need the support of your superiors. It may be that you are in a position that allows you to establish guidelines and get management support. If so, put aside time to study and define the goals and objectives for the Data Processing Department. If you are not in a position of authority that allows you to set goals and objectives, then approach your superior and ask to be given time to work on goals and objectives for your department. Of course, you will want to be well versed in why this should be done, so keep reading.
Goals and Objectives
Any good explanation of system performance must begin with some mention of
goals and objectives. Without them, there is no measure of success or
failure, and no real rationale to spend money on tools to measure
performance. Goals and objectives have been expressed best in a discussion
of System Level Objectives (SLO), System Level Agreements (SLA), and System
Level Management (SLM).
Setting System Level Objectives is determining goals and expectations for what the data processing department will provide its users. The first step in establishing these is to survey your user group.
Reaching System Level Agreements is designing an agreement that binds the DP department and the users to the objectives.
System Level Management provides an action plan for administering the objectives and agreements. What happens when the agreements to meet the objectives are not met? What is the reporting function? Who makes the evaluation and how often?
You may have to start small with your goals and objectives and work toward an all-encompassing strategy.
Understanding Your System Makeup
The next important issue is to gain an overall understanding of your
system, its resources and inner workings. This understanding is essential
in taking control of your data processing shop. Control is the key to
conquering the tyranny of the urgent.
The CPU
The Central Processing Unit (CPU) is the first element of the HP 3000 we
will discuss. It is important to realize that the CPU is a finite resource.
It can be used up. Since it is a finite resource with many concurrent
demands made upon it, the key to using it successfully is found in
prioritizing current demands. To do this, the HP 3000 assigns each process
to a queue. Each of the five MPE queues (A through E) has been
assigned priority numbers within the range 1 to 255. When a process is
initiated, it begins at the top of its respective queue (called the base)
and moves down in priority in even intervals until the process's priority
reaches the bottom (which is called the limit).
The MPE Dispatcher is responsible for monitoring processes, assigning them a priority and periodically moving a process down in priority if it continues to require CPU. The key principle used by the dispatcher is to penalize lengthy processes and reward short ones. A value called the System Average Quantum (SAQ) is used to decide how much CPU time a normal "short" transaction receives before it is complete. When a process continues to require CPU time past the length of the SAQ, its priority is lowered.
Other processes entering the Ready Queue can then steal CPU from the previous, lower priority process, effectively pre-empting the current process. Of course processes in lower queues will rarely be given any attention by the CPU if there is an extremely busy queue above that process.
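The decay from base to limit described above can be pictured with a short Python sketch. It is a simplified model for illustration, not the actual MPE dispatcher algorithm. The names (SubQueue, Process, charge_cpu), the base and limit values, the SAQ length and the step size are all assumptions chosen for the example. The sketch also assumes that a larger priority number means a lower priority, and that a process drops one step for every full SAQ of CPU time it consumes.

```python
from dataclasses import dataclass


@dataclass
class SubQueue:
    """A queue whose processes decay in priority (hypothetical model)."""
    name: str
    base: int   # priority a process starts at; assumed to be the best value
    limit: int  # worst priority a process in this queue can decay to


@dataclass
class Process:
    queue: SubQueue
    priority: int
    cpu_ms_used: float = 0.0  # CPU consumed during the current transaction


def start_transaction(queue: SubQueue) -> Process:
    """A process begins each transaction at the base of its queue."""
    return Process(queue=queue, priority=queue.base)


def charge_cpu(proc: Process, cpu_ms: float, saq_ms: float, step: int) -> None:
    """Charge CPU time to a process and lower its priority accordingly.

    Assumption: the priority drops by `step` for every full SAQ of CPU
    time used, and never falls past the queue's limit.
    """
    proc.cpu_ms_used += cpu_ms
    saqs_used = int(proc.cpu_ms_used // saq_ms)
    proc.priority = min(proc.queue.base + saqs_used * step, proc.queue.limit)


if __name__ == "__main__":
    # Illustrative values only; check the actual settings on your own system.
    cq = SubQueue("CQ", base=152, limit=200)
    proc = start_transaction(cq)
    for burst in range(1, 6):
        charge_cpu(proc, cpu_ms=30, saq_ms=20, step=10)
        print(f"after burst {burst}: priority {proc.priority}")
```

In this model a short transaction finishes before it uses a full SAQ and never leaves the base, while a long-running one slides steadily toward the limit, which is the "reward short, penalize lengthy" behavior in miniature.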
There are many measurable indicators that can be used to examine the health of your computer system. I will begin by examining and explaining the three key indicators noted below because I have found that they are generally the best.
CPU Total Busy: This is the indicator you should begin with when evaluating the state of your CPU. The CPU Total Busy shows you how much of your CPU is currently and historically in use. The difference between this indicator and 100 percent is the amount of CPU you have available for future use. Your system could be experiencing a high amount of CPU usage and yet still be fine operationally. To check further, you need to consider the following two indicators.
CPU Queue Length: This indicator shows you how many processes are awaiting service by the CPU. When the CPU cannot keep up with the demand, it holds those processes awaiting the CPU in the ready queue. The higher the CPU queue length, the greater the indication of unmet demand.
CPU SubQueue makeup: CPU activity is divided among the A through E queues. The AQ and BQ are assigned to various system processes that must have immediate CPU attention. The AQ is reserved exclusively for system processes; the BQ also serves system processes but can be used for certain, well-placed user processes. The CQ is used for interactive users, while the DQ and EQ SubQueues are used for batch jobs.
Whenever you evaluate the activity of the CPU, its makeup must be considered. A high level of activity does not by itself mean that you are nearing the time for an upgrade. In fact, you want to have as full a utilization of your machine as possible. It is only when the SubQueue activity is dominated by the AQ, BQ, and CQ (plus Memory Manager percentage, Dispatcher percentage and Overhead) that you need to be concerned.
One caveat should be mentioned in conjunction with an analysis of the SubQueue makeup -- the impact of tuning your queues to something other than the default. This can be done with great benefit in certain situations, but remember that when you do so you may be causing more competition among the different queues.
Each of the three main CPU indicators must be evaluated in conjunction with the other two. One indicator pressed to an extreme does not necessarily mean the CPU is in short supply or becoming a bottleneck. For instance, CPU Total Busy may be near 100 percent, but if the CPU SubQueue makeup shows a high percentage of DQ and/or EQ activity, no CPU shortage exists and the CPU Queue Length will be acceptable.
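To make the idea of reading the three indicators together more concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the CpuSample structure, the function names and all three threshold values are hypothetical, not figures from any measurement tool. For simplicity it counts only AQ, BQ and CQ as high-priority work and leaves out the Memory Manager, Dispatcher and Overhead percentages.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds for illustration only; tune them for your own site.
BUSY_THRESHOLD_PCT = 85.0       # CPU Total Busy considered "high"
QUEUE_LENGTH_THRESHOLD = 5.0    # CPU Queue Length considered "high"
HIGH_PRIORITY_SHARE = 0.70      # share of busy time in AQ + BQ + CQ


@dataclass
class CpuSample:
    total_busy_pct: float   # CPU Total Busy, 0 to 100
    queue_length: float     # processes waiting for the CPU
    # Percent of total CPU used by each subqueue, e.g. {"CQ": 40.0}
    subqueue_pct: dict = field(default_factory=dict)


def headroom_pct(sample: CpuSample) -> float:
    """CPU still available: the gap between Total Busy and 100 percent."""
    return max(0.0, 100.0 - sample.total_busy_pct)


def high_priority_share(sample: CpuSample) -> float:
    """Fraction of busy CPU time spent in the AQ, BQ and CQ."""
    if sample.total_busy_pct <= 0:
        return 0.0
    high = sum(sample.subqueue_pct.get(q, 0.0) for q in ("AQ", "BQ", "CQ"))
    return high / sample.total_busy_pct


def assess(sample: CpuSample) -> str:
    """Combine the three indicators instead of judging any one alone."""
    busy = sample.total_busy_pct >= BUSY_THRESHOLD_PCT
    queued = sample.queue_length >= QUEUE_LENGTH_THRESHOLD
    dominated = high_priority_share(sample) >= HIGH_PRIORITY_SHARE
    if busy and queued and dominated:
        return "possible CPU bottleneck"
    if busy and not dominated:
        return "busy, but mostly batch work; no shortage indicated"
    if busy:
        return "busy; keep watching the queue length"
    return "CPU has headroom"


if __name__ == "__main__":
    batch_heavy = CpuSample(98.0, 2.0, {"AQ": 1, "BQ": 4, "CQ": 20, "DQ": 60, "EQ": 13})
    print(assess(batch_heavy), f"({headroom_pct(batch_heavy):.0f}% headroom)")
```

The batch-heavy sample is the case from the paragraph above: the CPU is nearly saturated, yet most of the work is in the DQ and EQ, so the sketch does not flag a shortage.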
Next issue: HP 3000 main memory considerations and the disk environment.
Jeff Kubler is a performance expert working for Lund Performance Solutions whose experience with the HP 3000 dates from 1982. He can be reached at jeff@lund.com