UNIX Unleashed, Internet Edition

- 33 -

IRIX FAQs

by Jim Scarborough

This chapter addresses questions about IRIX. Questions are answered for IRIX 6 unless otherwise specified, though most of the answers apply to IRIX 5 as well.

Installation Questions

Software installation can bring about a number of questions whether the installation is on the network or local. A good understanding of the workings of the software installation system is helpful to IRIX system administration.

What do the `inst` files contain?

inst files, the files used by the software installation manager inst(1M), contain the files to install on the hard drive, along with instructions on how to install them and any scripts that must be run to set them up once they're installed. For example, the patchSG0000466 product contains the executable itself (/sbin/diskpatch) and instructions on how to modify the crontab file so that the diskpatch command runs every Thursday evening.

From both the graphical software installation manager and the text-based inst program, you can view products, the subsystems they contain, and the files within.

Level 1 shows the product name; level 2 breaks the product down into sections grouped according to the sort of information in each section. Level 3 consists of the subsystems that directly contain files, and the files themselves are on level 4. showprods(1) can show levels 1 through 3, as selected by the -D option, and showfiles(1) shows the files contained in the subsystem(s) specified on the command line. See the man pages for showprods(1) and showfiles(1) for more details about command line options.

The level 2 product breakdown is by part, usually one of man, books, sw, and dev, for man pages, online books, software, and development tools, respectively.

These so-called "short product names" are available by default from the text-based installation manager, and are available by selecting "short product names" from the "software" menu on the software installation utility.

What version is this software I have on my hard drive?

To find the version of software installed on your system, you can see the "About" menu or type showprods -n to give you a listing with the revision number. inst compares these revision numbers when it determines software compatibility.

To find the version (and particular name) of an inst image on your hard drive, examine the file with the product name (and no dot or extension). For example, to find the version of the InPerson subsystem, you might do the following:

escher 37% strings InPerson | head -4
pd001V630P00
InPerson
"InPerson Desktop Conferencing, 2.2
P2>dV

Why doesn't `inst` work over the network?

If you are trying to install software from the miniroot and load the miniroot over the network, try verifying the IP address of your system in nvram. It often is not set along with the IP address on the booted system, and a wrong IP address will cause an unreachable host or other similar problem.

Be sure the network cable is plugged in and functioning. One of the easiest problems to fix is a loose cable.

Next, on the system from which you're trying to install software, check that /etc/inetd.conf has been properly edited. It should contain a line for tftp, managed by tftpd, either without the -s option and the arguments that follow it, or with the directory from which you are trying to install listed after -s. See the man page for tftpd(1M) for more information about the command line. Once you have made the changes, signal inetd to reread its configuration file with killall -HUP inetd.
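As a rough aid for the check described above, the following shell sketch prints any active (uncommented) tftp entry in an inetd.conf-style file and lists whatever arguments follow -s. The function name tftp_entry is made up for illustration, and the sketch assumes that disabled entries begin with #, as they conventionally do in inetd.conf.

```shell
# tftp_entry: hypothetical helper, not part of IRIX.
# Prints each active tftp line in the given file (default /etc/inetd.conf)
# and every argument that follows -s on that line.
tftp_entry() {
    awk '$1 == "tftp" {
        print "entry: " $0
        after_s = 0
        for (i = 1; i <= NF; i++) {
            if (after_s) print "directory: " $i
            if ($i == "-s") after_s = 1
        }
    }' "${1:-/etc/inetd.conf}"
}
```

Running tftp_entry with no argument reads /etc/inetd.conf. If nothing is printed, the tftp line is probably commented out or missing; if directories are printed, the directory you are installing from should be among them.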

If you still have trouble, check to see that the network works on the server machine, and make certain that you have specified the correct directory on the client from which you want to install software.

Alternatively, if you have a secondary network interface (such as FDDI), note that the system loads the miniroot through the system's built-in interface, then configures the network according to what's on the hard drive. In the process, it brings up any secondary interface. You may have to reconfigure your interface (via hardware or software) to install software over the network.

What subsystem contains a particular file?

You can search subsystems that inst has seen using the showfiles(1M) command in conjunction with grep. For example, if you are searching for the subsystem that contains clogin, you could try the technique shown in Figure 33.1.

Figure 33.1 showfiles can be used in combination with grep to reveal the origin of files.

You can see that the file called clogin lives in /usr/Cadmin/bin, and it comes from the subsystem sysadmdesktop.sw.base. If, for some reason, you needed to re-install clogin, you could reinstall that subsystem instead of the entire operating system.
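A search of this kind might look like the following sketch. It assumes that showfiles with no arguments lists the files of every installed subsystem and that each output line names both the subsystem and the file's path; the wrapper name which_subsystem is invented here.

```shell
# which_subsystem: hypothetical wrapper around showfiles.
# Prints every showfiles line that matches the given pattern, so the
# subsystem name can be read from the matching line.
which_subsystem() {
    showfiles | grep -- "$1"
}
```

For example, which_subsystem clogin would be expected to print a line that mentions both sysadmdesktop.sw.base and the path to clogin.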

What patches do I need?

Within a month or two after the release of an operating system, patches become available, and SGI's practice of late has been to recommend certain patches for use with the system. For customers with support, a visit to http://www.sgi.com will solve a number of problems.

http://www.sgi.com/Support/patch intro.html is the top-level SGI patch page with links to registering for SurfZone http://www.sgi.com/Support/adv start.html, getting the latest security patches available to all customers http://www.sgi.com/Support/Secur/security.html, and getting the latest recommended patches http://www.sgi.com/SurfZone/Support/recpatch/. It's well worth visiting the site to download the recommended patches because they fix a number of problems with the system.

Network Questions

These are some of the more common questions relating to the network.

What's my ethernet address?

The output from netstat -ia reveals the ethernet address. Figure 33.2 shows the results of issuing the netstat -ia command.

Figure 33.2 netstat -ia shows the ethernet address as the last line of an entry.

The ethernet address for this machine is 08:00:69:0c:16:b0. All SGI ethernet addresses begin with 08:00, and the ethernet address is associated with software licensing and, in many cases, the system serial number.

Why doesn't the network work?

A number of things can cause the network to malfunction. If you're unfamiliar with networking, you should check them in the order of simplest to most difficult to fix. If you are more familiar with networking, you may be able to diagnose the problem more directly.

See the network checklist at http://fly.hiwaay.net/~jimes/checklist.txt for more information.

How can I watch network traffic?

For rudimentary network watching, you can use gr_osview(1) to monitor network traffic on the interface, TCP, and UDP levels. gr_osview has a number of configuration options to specify whether you want a bar graph over time or a single bar that shows current activity, if you want numbers on the display, and how you want the system to accommodate changing scale values. See the question about monitoring system usage for more information.

How do I explicitly add a route?

For IRIX 6 and later, add a file called /etc/config/static-route.options with your route commands. For IRIX before 6, edit the /etc/init.d/network file and add routes in a new else clause to continue the if clause that checks if ...$IS_ON routed....

How do I configure multiple network interfaces?

A second interface needs to be configured through the /etc/config/netif.options file. Use hinv to determine the name of your network interface (like ec3, et1, and so on), then edit the file to reflect your new interface as interface 2, like this:

if2name=ec3
if2addr=gate-$HOSTNAME

Note the absence of the leading colon on the lines. Restart the network, and use your new interface.

TIP: You can set the if2addr name to whatever you want. It doesn't have to be gate-$HOSTNAME. Set it to something convenient for you to remember or type.

Resource Management

Administrators should monitor memory, CPU, and disk usage to see if processes have run amok or to see if upgrades are in order. Ideally, you should monitor the system, adjust the system, and monitor again to see the results. If you are adjusting for performance, be sure to find the bottleneck first.

How can I see which processes take up CPU time?

top(1M) is the standard UNIX utility for watching CPU time, and it is available on SGI. Also available is gr_top(1M), which shows the same information in a smaller, more convenient graphical window. Figure 33.3 shows a gr_top window.

Figure 33.3 gr_top shows CPU usage of the top processor hogs on the system.

How can I see which processes use too much memory?

On IRIX 6 and later, gmemusage(1), found in eoe.sw.perf, provides a thorough display of how much real memory is consumed by various processes. Clicking on a process can break down memory usage by thread. Figure 33.4 shows an example gmemusage screen.

Figure 33.4 gmemusage shows a breakdown of physical memory usage.

The memory usage in Figure 33.4 is acceptable. Often, though, Netscape will grow with usage, or some other program may exceed its regular size. gmemusage in conjunction with gr_osview(1) will help you to analyze usage patterns and determine when a system needs more memory.

How can I monitor system usage?

gr_osview(1) provides a flexible window to view a number of system statistics. gr_osview can be configured to show different styles of graphs and statistics on different aspects of the system. The man page for gr_osview(1) offers an in-depth explanation of the features available. Figure 33.5 shows a custom configuration of the gr_osview parameters.

Figure 33.5 gr_osview has configurable parameters and can show a variety of status bars in varying styles.

The following file was used to configure gr_osview for Figure 33.5:

cpu
intr
wait strip
rmem max tracksum
swp strip numeric tracksum
wait strip
sysact max tracksum creepscale
gfx max tracksum creepscale
netif(ec0) tracksum(15) max
disk(/)

osview(1) is a comprehensive text-based program for monitoring the system. Figure 33.6 shows osview monitoring an idle system.

Figure 33.6 osview is a useful diagnostic tool for observing system statistics.

Where did all of my disk space go?

Check /var/adm/crash for unusually large files. When your system panics, it puts core dump data in /var/adm/crash.

You can search the hard drive for core files or other files that might be taking an inordinate amount of space. Figure 33.7 shows how one might perform such a search.

Figure 33.7 Removing core files can save several megabytes apiece.

Simple searches can save plenty of disk space.
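A search along these lines can be sketched with find. The function names below are made up for this example, and the default size threshold is an arbitrary choice; the find options themselves (-type, -name, -size) are standard.

```shell
# find_cores: list regular files named "core" under a directory (default /).
find_cores() {
    find "${1:-/}" -type f -name core -print
}

# big_files: list regular files larger than a given number of 512-byte
# blocks (default 20480 blocks, about 10 megabytes) under a directory.
big_files() {
    find "${1:-/}" -type f -size +"${2:-20480}" -print
}
```

Review the list before deleting anything. A file that happens to be named core is not necessarily a core dump, and a large file is not necessarily an unwanted one.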

How can I increase swap space?

If you don't already have your disk full of data, you can increase the size of the swap partition (partition 1, usually), or add another disk with a swap partition.

You can add a swap partition by specifying the name of a blank disk partition (or one that you don't care about) on a line in the /etc/fstab file like this:

/dev/dks0d2s0 swap swap pri=3 0 0

The next time you reboot, the swap space will be automatically configured. You can also use swap -a to add the swap space immediately.

TIP: Swap partitions perform faster than swap files because the data for swap files has to be handled through the filesystem, and data for swap partitions is written directly to the disk.

If it is inconvenient to add a swap partition, you can add a swap file. First, make a file the size of your desired swap space (the example is 80 megabytes):

mkfile 80m /swap/swap

Then add a reference to the file in the /etc/fstab file like this:

/swap/swap swap swap pri=3 0 0

The next time you reboot, the swap space will be added automatically. swap -a will add the swap space immediately.

How can I change partition sizes?

WARNING: Changing partition sizes causes all of the data on the disk to be lost. Copy any data you want to keep somewhere else and make certain it's a good copy.

There are a number of ways to change partition size on an SGI machine. The easiest for a one-time task is using fx(1M) to arrange the partitions. It provides a convenient text menu interface for setting up your partitions. fx can be started from the miniroot if it is loaded on the volume header. (See the question on making a system disk for information about how to change the volume header.)

TIP: Keep a record of your partition sizes in a safe place. You may be able to recover data if you accidentally change partition size and change it right back to what it was.

For larger-scale implementations where a number of identical hard disks need to be partitioned the same way, you can easily write a script to provide input for the dvhtool(1M) program. dvhtool allows the user to specify partition numbers, types, and sizes directly, but unlike fx, dvhtool does not check your arithmetic to verify that there are no overlapping partitions.

CAUTION: Changing partition sizes using dvhtool can cause peculiar problems if your arithmetic is wrong. Overlapping partitions may work well for an extended period and later fail because they both try to use a particular location on the disk. Be careful that your partitions do not overlap.
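Because dvhtool does no such checking, it may be worth testing a planned layout before applying it. The sketch below is one way to do that in the shell. The input format (partition number, first block, and block count on each line) and the function name check_overlap are assumptions made for this example; they are not anything dvhtool itself reads or writes.

```shell
# check_overlap: hypothetical helper that reads a planned layout on
# standard input, one partition per line:
#     partition_number  first_block  block_count
# It sorts the partitions by first block and reports every partition that
# starts before an earlier partition has ended. The exit status is 1 if
# any overlap was found and 0 otherwise.
check_overlap() {
    sort -k2,2n | awk '
        NR > 1 && $2 < prev_end {
            printf "partitions %s and %s overlap\n", prev_num, $1
            bad = 1
        }
        $2 + $3 > prev_end {
            prev_end = $2 + $3
            prev_num = $1
        }
        END { exit bad + 0 }
    '
}
```

Leave out of the list any partition that is meant to span others, such as the one that covers the entire volume, since it overlaps the rest by design.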

How do I add partitions besides 0, 1, 6, and 7?

You can have partitions numbered 0-9 (10 is the entire volume). Set the partitions in fx or dvhtool as you normally would, then edit /dev/MAKEDEV.d/DKS_base (/dev/MAKEDEV in IRIX 5.3) to include routines for the partitions. Partition numbers up to 15 may work in IRIX 6 or later, but I have not tested them. Figure 33.8 shows the changes implemented in the beginning of the /dev/MAKEDEV.d/DKS_base file.

Figure 33.8 Add the numbers 2 through 5 to create device files for those partitions.

How can I copy a disk to another disk locally?

bru, tar, and the dump and restore pair can be used to copy data from one place to another while keeping permissions and ownership. For example, to copy the contents of the root directory to a /mnt directory without crossing mounted filesystems, try the following as root:

escher 3# (cd / ; bru -cmFf - ) | (cd /mnt ; bru -xFf - )

The man page for bru(1) offers another, simpler example.

How can I copy a disk to another disk over the network?

bru tends not to work well over the network, but tar works nicely so long as crossing mount points is not a problem:

escher 3# rsh escher '(cd / ; tar -cBf - . )' | (cd / ; tar -xBf -)

The -B option tells tar to ignore blocking information from the stream, which is necessary because the network can obscure some of the original blocking information.

How can I copy the system partition to another system?

You can use the tar command above for the copy itself, but if the systems are of different architectures (as reported by hinv), the copy will not work immediately.

One solution to the problem is to copy the data using tar, then re-install the system-dependent subsystems (they are noted in the listings from inst(1M) with a d).

If you want to keep a common directory for a number of systems and save disk space, you can put files for all the different architectures in the /usr/cpu/sysgen and /usr/gfx/arch directories on the server. The clients have symbolic links in /var to point to those directories as necessary (for example /var/sysgen/boot and /var/arch/X11), and all the architectures can share a common mount point for their data.

NOTE: /usr/lib contains a number of files required for building the kernel with autoconfig, and autoconfig starts System V Release 4 networking, which must be started before /etc/init.d/network for System V Release 4 networking to work. For that reason, you must either duplicate /usr/lib on the local hard drive or avoid NFS mounting /usr.

How do I make a system disk?

A system disk needs boot information on the volume header. dvhtool(1M) is a utility to edit the volume header information. With dvhtool, you can configure any disk to be a system disk. Figure 33.9 shows how to copy data to and from the volume header.

Figure 33.9 dvhtool can be used to manage disk volume header information.

First, I started dvhtool with the instruction to look at the volume header for the disk on controller 0, SCSI ID 1. Next, I told it to go to the volume header directory, vd. Typing l (ell) from the vd menu shows the files loaded on the volume label. g sash /stand/sash instructs dvhtool to copy sash from the volume header to the file /stand/sash in the regular filesystem. Next, c /stand/sash sash copies the file /stand/sash to sash on the volume header.

To make a bootable disk, see that you have a copy of sash in your regular filesystem somewhere (perhaps by copying it from the disk you used to boot). Next, copy that sash to the volume header of the disk you want to make bootable, and the disk is bootable provided everything else is already loaded.

I then quit the vd section of the dvhtool utility by entering a blank line. Quitting dvhtool is a matter of entering q, and, if necessary, confirming a write to the volume header.

How do I boot from a different disk?

Nonvolatile environment variables tell the system where to find the kernel. Use the nvram(1M) command to view and edit the variables. Change the SystemPartition and OSLoadPartition variables to indicate your boot partition.

TIP: See the man page for prom(1M) for insight into the format of nonvolatile memory variables that you can set and view with nvram(1M).
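As an illustration of setting these variables, the sketch below builds device names in the scsi()disk()rdisk()partition() style and prints, without running, the nvram commands that would use them. It assumes a PROM that accepts that style of name and the usual convention that partition 8 is the volume header and partition 0 is the root partition; check prom(1M) for the format your machine expects. The function name arcs_path is invented for this example.

```shell
# arcs_path: hypothetical helper that formats a PROM-style device name.
#   $1  SCSI controller number
#   $2  SCSI ID of the disk
#   $3  partition number
arcs_path() {
    printf 'scsi(%s)disk(%s)rdisk(0)partition(%s)\n' "$1" "$2" "$3"
}

# Print (do not run) the commands for booting from controller 0, SCSI ID 2.
echo "nvram SystemPartition '$(arcs_path 0 2 8)'"
echo "nvram OSLoadPartition '$(arcs_path 0 2 0)'"
```

Printing the commands first gives you a chance to compare them with the current values shown by nvram before changing anything.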

How can I view space consumed by files and directories?

The ordinary methods such as df and du work well. If you want a graphical representation of your free space, you can use fsn, available from ftp://ftp.sgi.com/sgi/fsn. Figure 33.10 shows fsn in action.

Figure 33.10 File System Navigator provides a unique way to browse the filesystem.

fsn was featured in Jurassic Park and has been used to store a wine catalogue for easy hierarchical browsing. It could just as easily be used to create a fly-through diagram of language history or just about any other tree-structure diagram by creating data files and directories to represent the structure.

fsn uses the system's filetype rules (ftr) to determine what to do with each file. See the man page for ftr(1M) for more information about those rules.

Tape Questions

Often, people take tapes or tape drives from one system to another and have troubles. These questions address those problems.

Why doesn't my Sun tape work? Or which tape device should I use?

Tape device names follow a naming scheme that is convenient once you are familiar with it, and you can use whichever tape device fits your situation.

The first component of a tape device name is simply its path, which is self-explanatory. The second component indicates the type of device, be it dk for disk, tp for tape, or any number of other prefixes for other, less common devices. The third component specifies that it is a SCSI tape drive. The fourth part specifies which SCSI bus to use and the fifth specifies your drive's SCSI ID. You can use hinv to determine the SCSI bus and SCSI ID (a.k.a. unit number) of your tape drive.

The sixth component is an option field that can be one of s, nrs, ns, nrns, nr, sv, nrsv, nsv, v, nrnsv, nrv, or not specified. The default is nrns. Table 33.1 shows the options, their effects, and the order in which to assemble the options for a complete field.

Table 33.1. Tape device options

Order Option Effect

1 nr no-rewind device, such as a DAT tape

2 s byte-swapped

2 ns not byte-swapped

3 v variable block length

Most tape problems occur as the result of byte swapping. IRIX machines are big endian; that is, they store the more significant of a pair of bytes first on tape. Other machines, including Suns, write the less significant of a pair of bytes first. If you have trouble reading a tape, try byte swapping first.
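The naming scheme can be sketched as a small shell function. The /dev/rmt directory and the letter d in front of the SCSI ID are assumptions here, made by analogy with the disk name /dev/dks0d2s0 used earlier in this chapter; the function name tape_device is made up. Compare the results with the names actually present on your system before relying on them.

```shell
# tape_device: hypothetical helper that assembles a tape device name.
#   $1  SCSI bus (controller) number
#   $2  SCSI ID (unit number) of the drive
#   $3  option field, already assembled in table order (for example
#       nrns or nrnsv); may be omitted
tape_device() {
    printf '/dev/rmt/tps%sd%s%s\n' "$1" "$2" "${3:-}"
}

tape_device 0 3 nrns    # prints /dev/rmt/tps0d3nrns
tape_device 0 3 nrs     # same drive, byte-swapped variant
```

To try the other byte order on a troublesome tape, keep the bus and ID the same and change only the s or ns part of the option field.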

Naming and Remembering Endianness
The name endianness is a reference to Jonathan Swift's Gulliver's Travels, wherein a hastily written law served to divide the Lilliputians and their peaceful neighbors. It had long been the practice that people would crack their eggs on the large end, but when the son of the emperor of Lilliput cut his own finger on an egg while cracking it according to custom, the emperor decreed that all eggs henceforth should be cracked at the little end. Citizens rebelled against the change; some even fled to the neighboring empire of Blefuscu. Great wars ensued because of the schism introduced by the edict. The Lilliputians referred to their neighbors as Big-Endians because the people of Blefuscu continued to crack their eggs from the big end.

Big-endian data storage keeps the high-order byte of a two-byte pair first, and little-endian storage keeps the low-order byte first, and it's all easy to remember with the egg story.
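The effect of byte swapping is easy to see on a short stream. The sketch below uses the standard dd conv=swab conversion, which exchanges the two bytes of each pair; the function name swap_pairs is made up here, and the example feeds it text rather than a tape.

```shell
# swap_pairs: exchange the two bytes of every pair read from standard
# input. Intended for input with an even number of bytes.
swap_pairs() {
    dd conv=swab 2>/dev/null
}

printf 'abcdef' | swap_pairs    # prints badcfe
echo
```

Swapping twice restores the original order, which is why reading a swapped tape through a swapping device (or through a filter like this one) yields usable data.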

You might also try variable block length to solve the problem.

Tape devices other than DATs are mostly capable of rewinding (a DAT drive can rewind the tape, but only as a means of getting to the beginning, not as a means of navigating the data).

How do I set up my third party tape drive?

NOTE: This procedure is not for the faint-hearted. It calls for an understanding of C programming and Bourne shell scripting.

Use hinv to see what the computer thinks your tape drive is. It should indicate that the device is unknown and indicate an ID string. Note the ID string.

WARNING: Save backup copies of your /var/sysgen/master.d/scsi and /dev/MAKEDEV.d/TPS_base before you change them.

Edit /var/sysgen/master.d/scsi to include an entry for your particular tape drive. It's best to find an entry for a similar tape drive, copy it, and edit it. The /var/sysgen/master.d/scsi file is a configuration file for the kernel, and you are declaring the values for a struct tpsc_types. struct tpsc_types is defined in /usr/include/sys/tpsc.h, and the comments offer guidance as to what each of the values represents.

WARNING: Save a backup copy of your kernel (/unix) before you change it. If your new kernel does not work, you can boot from the backup copy through the PROM monitor.

Next, rebuild the kernel with autoconfig, reboot, then try hinv to see if it properly recognizes the tape drive. If it doesn't, go back and edit /var/sysgen/master.d/scsi, autoconfig, and reboot until hinv recognizes it.

Once hinv recognizes your tape drive, you need to build the appropriate device files for it. cd /dev and try ./MAKEDEV. If MAKEDEV doesn't create the appropriate devices on the first try, you need to edit the /dev/MAKEDEV.d/TPS_base file (/dev/MAKEDEV for 5.3). MAKEDEV uses the hinv string to figure out which device files to make. See that MAKEDEV is looking for a line like your device line from hinv.

Security Questions

Use this section to help you get to your system or protect it from others. Refer to Part IV of this volume for more information on UNIX security.

How do I circumvent the root password?

NOTE: These instructions are included so that you can safeguard your system from attacks and so that you can regain access to your system if you have legitimately lost the root password.

To circumvent the root password in IRIX versions 5.3 and later, you must load the miniroot from an OS CD or other source.

Once the miniroot is loaded, follow these steps (menus have been omitted for brevity):

Inst> admin
Admin> shroot
chrooting to /
escher 21# passwd -d root

With those steps completed, you have removed the root user's password and can set it to whatever you like.

WARNING: Guard your bootable CD-ROMs with care (lock and key if need be). They hold the key to your system. Likewise, if you are running an older version of IRIX (before 5.0), you should either upgrade or guard your system with care.

How do I erase the PROM password I forgot?

The PROM password is stored (encrypted) in the passwd_key nonvolatile variable. You can reset the password by typing nvram passwd_key "" as root. See the man pages for nvram(1M) and prom(1M) for more information.

What do I need to do to secure my system?

UNIX is notoriously lax about security, and IRIX systems are no exception. There are a few key angles from which you can approach an SGI system to make it more secure:

Use chkconfig to turn everything you don't need off.

Edit /etc/services to remove all of the services you don't need.

Edit /etc/passwd and block logins to accounts that have no password.

Edit /var/Cadmin/clogin.conf (or /etc/passwd.sgi on pre-IRIX 6.0 systems) to conceal login icons for accounts that shouldn't be visible. Better yet, you could run chkconfig noiconlogin on to turn off the login icons altogether.
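To find the accounts that have no password, a one-line awk filter is usually enough. The sketch below assumes the traditional colon-separated passwd format with the password in the second field; on a system that keeps passwords in a separate shadow file, that field holds a placeholder and this check will not find anything. The function name empty_passwords is invented for this example.

```shell
# empty_passwords: hypothetical helper that prints the login name of
# every account whose password field (field 2) is empty.
# Reads /etc/passwd unless another file is named.
empty_passwords() {
    awk -F: '$2 == "" { print $1 }' "${1:-/etc/passwd}"
}
```

Each name printed is an account that anyone can log in to without a password, and so a candidate for blocking as described in the list above.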

WARNING: This is by no means a complete list of steps necessary for securing your system. Do not rely on these steps alone to make your system secure. See Part IV of this volume for more information on UNIX security.

Miscellaneous Questions

These questions address more general issues or issues relating more to using the system than administration, though some tools can be very helpful for administration.

How do I know what hardware I have?

The hinv(1M) command will give an overview of your system's configuration. Figure 33.11 shows the output from the hinv command on an O2.

Figure 33.11 hinv shows most system configuration information.

You can tell what sort of machine you have from the hinv command and with a bit of deduction. hinv shows that this system has an O2cam, so it must be an O2. There are other clues such as the IP32 processor. All O2s have an IP32.

Extra information about the graphics can be gleaned from the /usr/gfx/gfxinfo command:

Graphics board 0 is "CRM" graphics.
        Managed (":0.0") 1280x1024
        32 bitplanes
        board revision 2, CRM revision C, GBE revision B
        Display 1280x1024 @ 60Hz, monitor type: SGX 1

gfxinfo shows if there is texture memory on applicable systems, and it shows what other graphics options might be important, such as the number of Raster Managers on a Reality Engine machine.

How do I change the resolution or frequency of my display?

Use the /usr/gfx/setmon(1G) command to specify resolution and refresh rate. For example:

escher 23# /usr/gfx/setmon 72Hz
Make new format the power-on default? <n> n

See the man page for more resolutions and examples.

CAUTION: If you change the resolution of your display to something your monitor cannot display, you should be able to reboot or log in over the network to fix it. Do not make the new format the power-on default if you are not certain it will work.

What's my sysinfo number?

The sysinfo -s number is usually, though not always, related to the ethernet address of the system: convert the sysinfo -s number to hexadecimal, then add 0x08 x 16¹⁰ (in effect, put 08:00 in front of it) to get the ethernet address. You can use this technique in reverse: take the last 8 hexadecimal digits of the ethernet address and convert them to decimal to come up with the sysinfo -s number.
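The arithmetic can be sketched in the shell as follows. The function names are made up for this example, and the relationship itself holds only in the usual case described above. The sample address is the one shown in the network section of this chapter.

```shell
# sysinfo_to_ether: hypothetical helper. Takes a decimal sysinfo -s number
# and prints the corresponding ethernet address, assuming the usual
# 08:00 prefix.
sysinfo_to_ether() {
    n=$1
    printf '08:00:%02x:%02x:%02x:%02x\n' \
        $(( (n >> 24) & 255 )) $(( (n >> 16) & 255 )) \
        $(( (n >> 8) & 255 ))  $(( n & 255 ))
}

# ether_to_sysinfo: the reverse. Drops the first two octets of a
# colon-separated address and prints the remaining 8 hexadecimal digits
# as a decimal number.
ether_to_sysinfo() {
    hex=$(printf '%s' "$1" | cut -d: -f3-6 | tr -d ':')
    printf '%d\n' "0x$hex"
}

sysinfo_to_ether 1762399920           # prints 08:00:69:0c:16:b0
ether_to_sysinfo 08:00:69:0c:16:b0    # prints 1762399920
```

Both functions expect the address with two hexadecimal digits per octet, so write 0c rather than c.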

NOTE: Many people incorrectly call the output of the sysinfo -s command the "sys ID" or some similar misnomer. /etc/sys_id is the name of the file that specifies a system's hostname, and sysinfo -s is the command that yields a unique hardware serial number for a system.

Why does my `rsh` command fail?

There are a number of reasons an rsh command might fail, but two are most common provided that the network is functioning properly.

Try rsh with only the hostname as an argument. If you are prompted for a password, check your .rhosts file on the distant system: the system name in the .rhosts file must match the system name in the $REMOTEHOST environment variable when you log on, and likewise the user name in the .rhosts file must match the $REMOTEUSER environment variable.

If you see some output besides the prompt when you perform your rsh command, check that whatever produces the output in your .cshrc file is inside an if statement, so that it runs only for interactive shells, like this one from the standard .cshrc file:

if ( (! $?ENVONLY) && $?prompt ) then
   set prompt="`hostname -s` \!% "
endif

How do I compare or merge text files?

SGI provides a fantastic utility for viewing differences and merging text files. gdiff(1) provides a two-pane view of two files, highlighting differences in the files. You can merge the files by selecting which differences you want and saving the merged version. gdiff(1) uses the same matching algorithm as diff(1).

How do I get a vt100 window?

There are a number of ways; xwsh -vt100 is probably the easiest. See the man page for xwsh(1) for more information.

How do I display a custom icon on the login screen?

Instead of having the standard person icon associated with your login name on the login screen, you can have any 100 by 100 or smaller picture you want. You can use imgworks(1) to convert your picture to the appropriate resolution and .sgi (a.k.a. .rgb) format. GIF images will also work with IRIX 6.

Once you've prepared your image, you can put it in your home directory in the file .icons/login.icon. Figure 33.12 shows a custom login icon for the jimes user.

Figure 33.12 You can customize your login icon by creating a ~/.icons/login.icon file with a picture of your choosing.

For more places to put the icon and other information about the login program, see the man page for clogin(1).

For more information about converting and manipulating images with imgworks(1) see the man page.

How do I do away with the login icons and use a picture instead?

Use chkconfig(1M) to turn noiconlogin on (chkconfig noiconlogin on). You can also customize the image that appears by changing the file /usr/Cadmin/images/cloginlogo.rgb. Figure 33.13 shows a custom image in the login screen.

Figure 33.13 chkconfig noiconlogin on causes the login icons to be replaced with the contents of /usr/Cadmin/images/cloginlogo.rgb.

Summary

There are a number of questions at various levels that arise among IRIX users. Software installation is covered well in the Iris Software Installation Guide, and some of the most common installation questions are listed here.

Network questions arise from time to time. Troubleshooting is covered at http://fly.hiwaay.net/~jimes/checklist.txt. Other questions are covered in this chapter.

It's been said that the one thing in every UNIX system's message of the day is "Please delete any extra files." UNIX systems often get heavily used, and resources run scarce. A section here on resource management can help the reader to manage disk space, CPU usage, and memory usage.

The number one question asked of me during my term at SGI was "Why doesn't this tape work?" The answer is covered within, along with another popular one, "How do I make this tape drive work on my SGI?"

Occasionally, people, invariably through some bizarre set of circumstances, would find themselves locked out of their own system. Questions on security cover how to get into the system, and how to protect against someone else using the same techniques. Space does not permit complete coverage of all security issues.

Other questions in these pages address some interesting configuration issues, how to find out about your system, and other miscellaneous topics.