Table of Contents

Commonly Used Macros
Process Management Structures
Job/Session Management Structures
File System Structures
Finding The GUFD of an opened or closed file
Virtual Space Management Structures
Memory Management Structures
Dispatcher Structures
Table Management
System Globals
PA-RISC General Registers
PA-RISC Space Registers
Short vs. Long Pointers
Procedure Calling Convention
Procedure Calling Convention: Registers
Procedure Calling Convention: Stack Frame
Procedure Calling Convention: SP & PSP
Case Study: SA663
Hangs
Before Memory Dump
Case Study: Hang Memory Dump
Case Study: Hang Conclusion

Notes:
Introduction

 

This paper is being presented at the West Coast HP3000 Solution Symposium in San Jose, 25 April 2003.

 

The purpose of this paper is to provide basic information on how to diagnose system aborts and hangs.

 

As the HP3000 winds down, it will be advantageous for owners of this system to be able to perform as much troubleshooting as possible. The amount of troubleshooting will be limited because source code for the OS is not available outside HP.

 

It is assumed that readers are already familiar with the tools DEBUG, DAT and SAT. The documentation for these tools may be found online at:

 

http://docs.hp.com/mpeix/onlinedocs/32650-90901/32650-90901.html

 

 

 


 

 



Commonly Used Macros

 

This is by no means a comprehensive list of the macros available in the OS macro set, but these are some of the more commonly used ones.

 

The MACLIST (MACL) command can be used to list all current macros once they have been restored. Many of the macros listed will be second-level macros, that is, macros called by other macros, and so will be of limited value on their own. Use the HELP command to see the source for a given macro, e.g. HELP PM_FPIB.

 

Most macros are prefaced with a designator to indicate what area of the OS they are meant to be used for. Here’s a list of some of the designators:

 

pm     = process management

fs     = file system

mm     = memory management

vsm    = virtual space management

rm     = resource management (sirs and semaphores)

xm     = transaction management

ui     = user interface (CI commands)

io     = i/o subsystem

config = hardware configuration


 

 



Process Management Structures

 

These are the fundamental process management structures and their types. DEBUG, DAT and SAT provide functions that return pointers to these structures. These functions are:

 

PIB   - returns a pointer to the PIB for a given pin

        Example: fv pib(5) 'pib_type'

PIBX  - returns a pointer to the PIBX for a given pin

        Example: fv pibx(pin) 'pibx_type'

PCB   - returns a pointer to the PCB for a given pin

        Example: fv pcb(200) 'pcb_type'

PCBX  - returns a pointer to the PCBX for a given pin

        Example: fv pcbx(10) 'pcbx_type'

 

The PIB contains information about a given process. The type for the PIB is divided into functional areas such as:

 

DISPATCH_INFO which contains linkages to the dispatcher run queues.

 

IO_AREA which contains information about outstanding non-memory management I/O requests for the process.

 

PIB_ERROR_STACK is the area that holds the status of errors or warnings. The values in this stack are of type “HPE_STATUS” and are pushed onto the stack by the procedure HPERRPUSH. The PM_ERRORS macro dumps only the active part of the error stack, so it is often useful to dump the stack raw, e.g. DV PIB(PIN)+350,20, to see all the errors, even those that are no longer current.

 

HPE_STATUS errors are decoded using the ERRMSG function in DAT and DEBUG. For example, given an HPE_STATUS of fffd008f, the decoding would be:

 

$1d5 ($21d) nmdat > wl errmsg(S16(fffd), 8f)

Intrinsic layer; an access violation occurred.

 

The “S16” function is used so that “fffd” is treated as a signed quantity rather than as the low 16 bits of a 32 bit quantity.
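
The split that the ERRMSG call performs by hand can be sketched outside DAT as follows. This is only an illustration in Python, not part of the tool set; the function name and the labels "high" and "low" are ours, and the only assumption is the one visible in the example above: the upper 16 bits are treated as a signed value and the lower 16 bits are passed as the second argument.

```python
def split_hpe_status(status):
    """Split a 32-bit HPE_STATUS word into (signed high half, low half).

    Mirrors the example above, where fffd008f is passed to ERRMSG as
    S16(fffd) and 8f. The function name is hypothetical.
    """
    high = (status >> 16) & 0xFFFF
    low = status & 0xFFFF
    # Emulate S16: reinterpret the 16-bit pattern as two's complement.
    if high & 0x8000:
        high -= 0x10000
    return high, low


high, low = split_hpe_status(0xFFFD008F)
print(high, hex(low))  # -3 0x8f
```

Without the sign conversion the upper half would be read as 65533 rather than -3, which is why the example wraps “fffd” in S16.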

 


Process Management Structures: continued

 

Two other fields in the PIB worth noting are PIB_TRAP_PC and PIB_TRAP_ISM. These two fields are used for certain types of process traps. The PC (program counter) of the trap and the interrupt stack marker (ISM) active at the time of the trap are loaded into these fields. If the system should fail as the result of a process trap, it may be possible to use the command “INITNM”, supplying the ISM pointer in PIB_TRAP_ISM, to restore the stack as it was at the time of the trap. Unfortunately, the old stack location has often been overwritten by activity that took place between the trap and the abort. Still, it is always worth a shot to see if something meaningful can be retrieved. At the very least, PIB_TRAP_PC can tell you what piece of code caused the trap, e.g. DCS [the value of PIB_TRAP_PC].

 

A useful process management macro is PM_PTREE, a more full-featured version of the built-in DPTREE. Unlike DPTREE, the PM_PTREE macro displays the job or session number, which can be used as input to some of the UI macros.

 

PM_FAMILY provides output similar to PM_PTREE but for the whole process family. Note that you get a more complete list of the family tree by using the JSMAIN pin: this gives you the JSMAIN, the CI under it and any descendants under that. The UI_SHOWJOB macro lists the JSMAIN pin for each job or session.

 

The PM_FPIB macro provides an almost complete formatting of the PIB structure and can be useful in describing the overall state of the process. The input to this macro is, oddly enough, a string, so the macro is called like this:

 

$1d6 ($21d) nmdat > pm_fpib('pin')

 

Or

 

$1d7 ($21d) nmdat > pm_fpib('21d')

 

 


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 



Job/Session Management Structures

 

 

The JMAT, or job master table, is what is displayed by the SHOWJOB CI command, and there is an equivalent OS macro, UI_SHOWJOB. Like its CI counterpart, the macro displays all jobs and sessions, or a specific job or session when a string with the “#Jnnn” or “#Snnn” value is supplied.

 

The JIT and JDT are compatibility mode data segments (DST) but all CM DST’s are objects and have NM virtual addresses. The DSTVA function translates a CM data segment number to its NM virtual address equivalent.

 

The JIT and JDT DST numbers are kept in the CM stack in the “PXGLOBAL” area, which is more easily remembered as being the first 12 (decimal) 16-bit words. So the quickest way to find the JIT and JDT is to dump the CM stack of the process you want them for.

 

$1de ($21d) nmdat > cm

 

%737 (%1035) cmdat > dd sdst.0,#12

DST %40346.0
%0      % 000450 000600 137677 005700 003461 000000 020400 000000
%10     % 040001 040000 040332 040330


Job/Session Management Structures: continued

 

The same thing can be accomplished using the native mode types:

 

$1e1 ($21d) nmdat > fv pcbx(pin) 'pcbx_type.pxglob,true'

 

CRUNCHED RECORD
      DL_MINUS_A      : 128
      DB_MINUS_A      : 180
      USER_ATT        : bfbf
      JMAT_INDEX      : bc0
      JPCNTINDEX      : 731
      JCUTINDEX       : 0
      STUNBIT         : FALSE
      RESTART         : FALSE
      JOBTYPE         : 2
      DUPLICATIVE     : FALSE
      INTERACTIVE     : FALSE
      ALLOWMASK       : FALSE
      JSMSTATE        : TRUE
      JSMCHANGE       : FALSE
      FILLER1         : 0
      STACKDUMP_FLAGS :
            STACKDUMP_INT : 0
      FILLER2         : 0
      NATIVE_LANG     : 0
      JOB_INPUT_LDN   : 4001
      JOB_OUTPUT_LDN  : 4000
      JDTDST          : 40da
      JITDST          : 40d8
END
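
As a cross-check, the octal words in the compatibility mode dump and the hexadecimal fields in the native mode display are the same values in two notations. The short Python sketch below (an illustration only, not a DAT feature) converts the twelve octal words from the CM dump so they can be compared with the formatted record. The pairing of the last two words with JDTDST and JITDST is read off by comparing the two displays in this paper.

```python
# The twelve 16-bit words from the "dd sdst.0,#12" output (octal).
cm_words_octal = """
000450 000600 137677 005700 003461 000000 020400 000000
040001 040000 040332 040330
""".split()

words = [int(w, 8) for w in cm_words_octal]

for index, value in enumerate(words):
    # Show each word in CM (octal, %) and NM (hex, $) notation.
    print(f"word {index:2d}: %{cm_words_octal[index]} = ${value:x}")

# Words 10 and 11 match the JDTDST and JITDST fields of the NM display.
jdtdst, jitdst = words[10], words[11]
print(f"JDTDST = ${jdtdst:x}, JITDST = ${jitdst:x}")  # JDTDST = $40da, JITDST = $40d8
```

The first five words come out as $128, $180, $bfbf, $bc0 and $731, matching DL_MINUS_A, DB_MINUS_A, USER_ATT, JMAT_INDEX and JPCNTINDEX in the formatted record.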

 

Do an FT on PCBX_TYPE to see where the “.PXGLOB” came from. You will also see that the field PXGLOB is of type PXGLOB_TYPE. A format type on that shows that the less useful record variant appears first: a crunched array of 12 BIT16’s. That variant is used unless another is explicitly specified, which is why “,TRUE” was added to the format virtual command.

 

Now that we know the JIT and JDT DST numbers we can use the DSTVA function to translate that to a virtual address and finally format the type:

 

$1e3 ($21d) nmdat > fv dstva(40da.0) 'jdt_header_type'

 

See HELP DSTVA for additional details.


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 



File System Structures

 

These structures are but the tip of the iceberg when it comes to the file system!

 

The PLFD is a file handle. Whenever a process has a file, socket or pipe opened, that entity will occupy a slot in the PLFD table.

 

The PLFD structure will contain pointers to the GDPD for the file and to the GUFD for the file, if there is one. The PLFD is also where we keep the “type manager control block” which is an area used by the type manager bound to the file at open-time. The type manager’s “PLABEL” (code address) is also kept in the PLFD. Note that this field is usually stored as a short pointer and as a result may be represented for example as “eaca68.0”. This is really “a.eaca68” and can be displayed via “dcs eaca68”. 

 

The macro FS_PLFD can be used to return the PLFD pointer for a given file number associated with a particular pin. For example, to format the PLFD for file #11 ($b) for the current pin:

 

$1e5 ($21d) nmdat > fv fs_plfd(,b) 'plfd_t'

 

The FS_FILE macro is quite useful for formatting all of the more important areas of the PLFD structure. Like the FS_PLFD macro, it takes both a PIN and a file number as input.

 

The GDPD is where we keep the current pointers for a file and it is also where we keep the “storage management control block”. The tail end of the GDPD has an SM_CB which is used by storage management to know how to prefetch information from a disk file and where to write information back to the file. Software updates the SM_CB prior to initiating a read from disk or a write to disk.

 

Files that are not opened MULTI or GMULTI will have their own unique GDPD. Files opened MULTI or GMULTI will, of course, share one. The linkage will be through the NEXT_PLFD field in the PLFD.

 

The GUFD structure exists only for disk files, so it is normal to find files that do not have a GUFD. All disk files had better have one!

 

Technically the GUFD is not a file system structure; it is actually part of storage management. Additionally, the GUFD is kept immediately adjacent to the “VSOD” structure in the VSM “VSOD/GUFD Table”, which will be discussed a bit later.

 


File System Structures: continued

 

The GUFD structure is also retained in most cases when a process closes a file. In other words, if a process is the last accessor of a disk file and closes it, we do not release the GUFD; rather, it is appended to a least recently used (LRU) list. If the file is re-opened, chances are the GUFD will still be on that list and we can simply pull it off the LRU and reuse it, making the file open quicker.

 

The GUFD structure contains the virtual address of the file. There’s also the GDPD pointer which is the end of a linked list of GDPD’s associated with the file.

 

If the file is attached to XM, that will be tracked in the GUFD.

 

Finally, the GUFD contains information taken from the file label, things such as the EOF offset and number of records, the number of readers and writers. The GUFD also contains the pointer to the file label. (Technically the file label is not a file system structure, it is part of label management.)

 

The file label address ends in $20 because the FLAB_T type is part of a slightly larger structure, “T_FILE_LABEL_ENTRY”. This larger structure contains components of what will become the UFID or Unique File Identifier of a file (type is “UFID_TYPE”), and it also contains an offset to the extent block for the file. Replacing the $20 in a file label pointer with $00 allows it to be formatted using the “T_FILE_LABEL_ENTRY” type. This type is a boolean variant and it has the less-than-useful variant first, so proper formatting requires specifying the TRUE variant, for example:

 

$1f8 ($70) nmdat > fv 15f.fc600 't_file_label_entry,TRUE'

 

Extent blocks migrate away from the file label as the file grows and more extents are added. The most recent extent block is always kept adjacent to the file label. Extent blocks are formatted with the type “T_EXTENT_BLOCK_ENTRY” which also suffers from the less-than-useful-variant-first problem so formatting with this type also requires the use of the TRUE variant. Each extent block will contain a pointer to the next extent block, if there is one. And it’s worth noting too that all references to disks are volume ID’s and not LDEV’s.

 


File System Structures: continued

 

Finding The GUFD of an opened or closed file

 

GUFD’s for opened files are kept on a HASH_LINK from Storage Management Globals (KSO #210, type “SM_GLOBAL_REC”) and GUFD’s for closed files are kept on a least recently used (LRU) list also from SM Globals. The top portion of the GUFD_T shows these links:

 

GUFD_T =
   RECORD
   HASH_LINK                         : GUFD_PTR_TYPE;
   LRU_LINK                          : GUFD_PTR_TYPE;
   PREV_LRU_LINK                     : GUFD_PTR_TYPE;

 

When a process opens a disk file, these lists are searched to see if the file is already opened or has been recently closed. Files do not remain on the LRU indefinitely: the list can be no more than 1500 entries long and, if we should run short of GUFD entries for files being opened, the oldest file on the LRU will be pulled off, mapped out and its GUFD given over to a new file open request.
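
The behaviour described above can be pictured with a toy model. The Python sketch below is not MPE/iX code and does not reflect the real GUFD layout; the class and method names are hypothetical. It models only the length cap on the LRU (1500 in the text), not the second trigger of running short of GUFD entries.

```python
from collections import OrderedDict

LRU_LIMIT = 1500  # maximum LRU length quoted in the text


class ClosedFileLru:
    """Toy model of a bounded LRU of closed-file entries (hypothetical names)."""

    def __init__(self, limit=LRU_LIMIT):
        self.limit = limit
        self.entries = OrderedDict()  # oldest entry first

    def close_file(self, file_id, entry):
        """Last accessor closed the file: park its entry on the LRU."""
        self.entries[file_id] = entry
        self.entries.move_to_end(file_id)
        evicted = []
        while len(self.entries) > self.limit:
            # The oldest entry is dropped so its slot can serve a new open.
            evicted.append(self.entries.popitem(last=False)[0])
        return evicted

    def open_file(self, file_id):
        """Return the parked entry if the file was recently closed, else None."""
        return self.entries.pop(file_id, None)


lru = ClosedFileLru(limit=2)
lru.close_file("A", "gufd-A")
lru.close_file("B", "gufd-B")
print(lru.close_file("C", "gufd-C"))  # ['A']
print(lru.open_file("B"))             # gufd-B
print(lru.open_file("A"))             # None
```

A re-open that finds its entry still parked skips the work of building a new one, which is the speed-up the text refers to.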

 

With MPE/iX 6.5 onward we also try to hold larger files, those over 1GB in size, on the LRU as long as possible, because performance can suffer if a very large file is mapped out all at once. These files are rotated around the LRU up to 16 times, and at each rotation a 16th of the file is mapped out, from the bottom up. We map out from the bottom up so that if the file is removed from the LRU because it has been re-opened, chances are the top portion of the file object will be referenced first, and that will minimize the need to page-fault the data in from disk.

 

The FS_FIND_GUFD_ENTRY macro can be used to locate a GUFD for an opened or recently closed file. The macro takes as input the interval timer for the file you want to locate. The interval timer is kept in the extended file label, so that is a good place to get it. The value never changes, so if you were looking at a memory dump and wanted, for example, to locate the GUFD for XL.PUB.SYS, you could use the interval timer from the live system (assuming, of course, you’re logged on to the system whose memory dump you’re looking at!).

 

A LISTFILE,-3 will display the file label pointer, for example:

 


File System Structures: continued

 

:listfile XL.PUB.SYS,-3
********************
FILE: XL.PUB.SYS

FILE CODE : 1032                FOPTIONS: BINARY,FIXED,NOCCTL,STD
BLK FACTOR: 1                   CREATOR : MANAGER.SYS
REC SIZE: 256(BYTES)            LOCKWORD:
BLK SIZE: 256(BYTES)            SECURITY--READ    : ANY
EXT SIZE: 0(SECT)                         WRITE   : ANY
NUM REC: 77293                            APPEND  : ANY
NUM SEC: 77824                            LOCK    : ANY
NUM EXT: 41                               EXECUTE : ANY
MAX REC: 4096000                        **SECURITY IS ON
                                FLAGS   : 1 ACCESSOR,SHARED,1 R
NUM LABELS: 0                   CREATED : TUE, MAR  4, 2003, 10:43 AM
MAX LABELS: 0                   MODIFIED: TUE, MAR  4, 2003, 10:44 AM
DISC DEV #: 1                   ACCESSED: MON, MAR 17, 2003,  4:09 AM
SEC OFFSET: 0                   LABEL ADDR: $00000013.$00204020
VOLNAME   : MPEXL_SYSTEM_VOLUME_SET:MEMBER1

 

The file label address is 13.204020. Since the interval timer is found at offset $10 in the extended label structure, replacing the $20 with $10 points us right at it.
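
The address arithmetic is simple enough to do by eye, but it can be sketched as follows for anyone scripting around LISTFILE output. This is an illustrative Python fragment with hypothetical helper names. It assumes that the “extended label structure” is the enclosing T_FILE_LABEL_ENTRY described earlier, with the file label at offset $20 and the interval timer at offset $10. The second call uses 15f.fc620, which is our inference of the label address behind the earlier 15f.fc600 example.

```python
LABEL_OFFSET = 0x20  # offset of the file label inside the larger entry (from the text)
TIMER_OFFSET = 0x10  # offset of the interval timer (from the text)


def parse_va(text):
    """Parse 'space.offset' hex notation such as '$00000013.$00204020'."""
    space, offset = text.replace("$", "").split(".")
    return int(space, 16), int(offset, 16)


def format_va(space, offset):
    """Format an address the way DAT/DEBUG examples in this paper show it."""
    return f"{space:x}.{offset:x}"


def entry_base(label_va):
    """Address of the enclosing entry for a given file label address."""
    space, offset = parse_va(label_va)
    return format_va(space, offset - LABEL_OFFSET)


def interval_timer_va(label_va):
    """Address of the interval timer for a given file label address."""
    space, offset = parse_va(label_va)
    return format_va(space, offset - LABEL_OFFSET + TIMER_OFFSET)


print(interval_timer_va("$00000013.$00204020"))  # 13.204010
print(entry_base("15f.fc620"))                   # 15f.fc600
```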

 

Now, using DAT/DEBUG indirection, we can pass that value to the FS_FIND_GUFD_ENTRY macro:

 

$115 ($31) nmdebug > fs_find_gufd_entry([13.204010])
gufd_record pointer  : $ca016e60
File virtual address : $f8.0
End of file offset   : $12ded00
File name            : XL.PUB.SYS

 

And as this example shows, this macro works quite well in DEBUG too.
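
One quick sanity check that the macro found the right file is to compare its “End of file offset” with the LISTFILE output. The Python fragment below is a sketch that assumes, for this fixed-length file, that the end-of-file byte offset equals the number of records times the record size; the numbers shown in this paper agree with that.

```python
num_rec = 77293         # NUM REC from the LISTFILE output
rec_size = 256          # REC SIZE in bytes from the LISTFILE output
eof_offset = 0x12DED00  # "End of file offset" reported by the macro

print(hex(num_rec * rec_size))  # 0x12ded00
assert num_rec * rec_size == eof_offset
```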

 


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


 


Virtual Space Management Structures

 

There are two VSOD tables, one for files and one for everything else. Everything that has a virtual address has an entry in one of the two VSOD tables. These tables are:

 

VSOD/GUFD table, KSO #201

VSOD table, KSO #53

 

Both tables consist of entries whose type is “VS_OD_TYPE”. The difference is that the VSOD/GUFD table, KSO #201, contains both VSOD entries and GUFD entries adjacent to one another.

 

That means that if you know the address of a GUFD all you need to do is subtract the length of VS_OD_TYPE from it to get a pointer to the VSOD. Of course you can also use VAINFO to return that value by providing the virtual address of the file, for example:

 

$20b ($70) nmdat > fv ca12cda8 'gufd_t.file_vir_addr'

 

2e4.0

 

$20c ($70) nmdat > wl vainfo(2e4.0, 'vs_od_ptr')

$ca12cd48

 

$20d ($70) nmdat > wl ca12cda8-symlen('VS_OD_TYPE')

$ca12cd48
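
The same subtraction can be written out in Python to make the layout explicit. This is only a sketch: the length of $60 is not taken from any type definition, it is simply the difference between the two addresses in this example, and on a real system symlen('VS_OD_TYPE') should be used instead. The function name is hypothetical.

```python
gufd_ptr = 0xCA12CDA8  # GUFD address used in the example
vsod_ptr = 0xCA12CD48  # value returned by vainfo(2e4.0, 'vs_od_ptr')

# Length implied by the two addresses above; do not hard-code this on a
# real system, ask symlen('VS_OD_TYPE') instead.
vs_od_len = gufd_ptr - vsod_ptr
print(hex(vs_od_len))  # 0x60


def vsod_from_gufd(gufd, length=vs_od_len):
    """The VSOD entry sits immediately before its GUFD in the VSOD/GUFD table."""
    return gufd - length


assert vsod_from_gufd(gufd_ptr) == vsod_ptr
```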

 

It makes sense to keep the VSOD and GUFD adjacent each other for file objects beca