Chapter 14 Study Guide

From ITCwiki



Revision as of 03:32, 25 April 2012

Troubleshooting Methodology

  • Proactive Maintenance

Taking the necessary steps to minimize future problems.
This includes performing system backups and identifying potential problem areas.

  • Reactive Maintenance

Correcting problems when they arise.
Always document the solution to help resolve future problems quickly.

Troubleshooting Procedures

  • Gather as much information as possible

Check system log files.
Run information utilities such as ps or mount.
“tail -f /path/to/logfile” opens a log file for continuous viewing.

  • Isolate the problem

Determine whether the problem is persistent or intermittent, and how many users are affected.

  • List possible causes and solutions

Google is your best friend: search for the exact error message to find known causes and fixes.

  • Implement and test solution
  • Document your solution and process
  • Prioritize problems

Solve the most severe problems first.
Spending too much time on small problems can result in reduced productivity.

  • Try to solve the root of the problem

A short-term solution might fail in the long term because of an underlying problem.
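
For the information-gathering step above, a log file can also be scanned for likely problem messages instead of being read line by line. The following is a minimal Python sketch, not a definitive tool: the keyword list and the sample log lines are assumptions made for illustration, and real log formats vary between services.

```python
import re

# Keywords are an assumption for illustration; adjust them to the
# messages your own services actually write.
PROBLEM_PATTERN = re.compile(r"\b(error|fail(ed|ure)?|warn(ing)?)\b", re.IGNORECASE)


def problem_lines(log_text):
    """Return the log lines that mention an error, failure, or warning."""
    return [line for line in log_text.splitlines() if PROBLEM_PATTERN.search(line)]


# Hypothetical sample in the style of /var/log/messages.
sample_log = """\
Apr 25 03:10:01 server1 crond[2041]: (root) CMD (run-parts /etc/cron.hourly)
Apr 25 03:12:44 server1 kernel: ata1: failed command: READ DMA
Apr 25 03:12:45 server1 kernel: end_request: I/O error, dev sda, sector 1234
Apr 25 03:15:00 server1 sshd[2210]: Accepted password for user1
"""

for line in problem_lines(sample_log):
    print(line)
```

With the sample above, only the two kernel lines are printed. Saving the matching lines alongside the fix you applied is one way to document the solution for later.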

Hardware Related Problems

  • Can come from damaged hardware or improper hardware or software configuration
  • Using the dmesg command or viewing the /var/log/boot.log and /var/log/messages files can isolate most hardware problems
  • Missing or incorrect drivers prevent the OS from using the associated hardware

Use lsusb to view only USB devices.
Use lspci to view only PCI devices.

  • The lsmod command lists the modules (drivers) loaded into the kernel

By comparing the output of dmesg, lsusb, and lspci with the lsmod output, you can determine if a driver is missing

  • Hard drives are the most common hardware component to fail
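
The comparison of detected hardware against the lsmod output can be sketched in Python as follows. This is only an illustration: the sample lsmod text and the list of expected driver names are hypothetical, and in practice the expected names would come from what dmesg, lsusb, and lspci report for your hardware.

```python
def loaded_modules(lsmod_output):
    """Return the set of module names from lsmod output (header line skipped)."""
    lines = lsmod_output.strip().splitlines()
    return {line.split()[0] for line in lines[1:] if line.split()}


def missing_drivers(expected, lsmod_output):
    """Return the expected driver names that are not currently loaded, sorted."""
    return sorted(set(expected) - loaded_modules(lsmod_output))


# Hypothetical lsmod output; the first line is the header lsmod prints.
sample_lsmod = """\
Module                  Size  Used by
e1000                 134863  0
usb_storage            56690  1
ext4                  371331  2
"""

# Hypothetical list of drivers the detected hardware is assumed to need.
expected = ["e1000", "usb_storage", "snd_hda_intel"]
print(missing_drivers(expected, sample_lsmod))
```

A name reported as missing is a lead rather than proof of a fault, because a driver compiled directly into the kernel does not appear in the lsmod output.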

Software Related Problems

  • Can be application or OS related
  • Application Related Problems

Applications can fail during execution due to missing program libraries and files, process restrictions, or conflicting applications.
Identify missing files in a package by using the -V option with the rpm command.
Use the ldd command to identify which shared libraries are required by certain programs.
It is good practice to run the ldconfig command to ensure the shared library directories are updated.
The ulimit command can be used to increase the number of processes the user can start in a shell.

  • Operating System Related Problems

These typically include problems with X Windows, boot loaders, and filesystems.
Use the xwininfo or xdpyinfo commands to attempt to isolate problems with X Windows.
Adding the word “linear” to the /etc/lilo.conf file and removing the word “compact” from it often fixes LILO boot loader problems.
GRUB boot loader errors are typically the result of a missing file in the /boot directory.
Filesystems can become corrupted through heavy hard drive use.
Corrupted filesystems can be identified by very slow write requests, errors printed to the console, or failure to mount.
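
As an illustration of the ldd check for application problems, the Python sketch below picks out libraries that ldd reports as “not found”. The sample output is hypothetical (libexample.so.2 is an invented name); with a real program you would capture the output of ldd for that program and pass it to the function.

```python
def missing_libraries(ldd_output):
    """Return the names of shared libraries that ldd reported as "not found"."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # The library name is the text before the "=>" arrow.
            missing.append(line.split("=>")[0].strip())
    return missing


# Hypothetical ldd output for a program with one unresolved library.
sample_ldd = """\
    linux-vdso.so.1 =>  (0x00007fff5c1ff000)
    libexample.so.2 => not found
    libc.so.6 => /lib64/libc.so.6 (0x00007f3c2a000000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f3c2a400000)
"""

print(missing_libraries(sample_ldd))
```

If a library is reported missing even though it is installed, running ldconfig to refresh the shared library cache is a reasonable next step.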

User Interface Related Problems

  • Users need to understand how to use their desktop environment, but often will not
  • Assistive technologies are tools you can use to modify your desktop experience

Access them by opening the System menu and navigating to Preferences, then Assistive Technologies.


Linux Performance

Monitor system performance using the command-line utilities included in the sysstat package. To make it easier to identify performance problems, a network administrator should run performance utilities on healthy Linux systems to develop a baseline.
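
One way to put a baseline to use is to compare current readings against it and flag large deviations. The Python sketch below assumes you have already recorded a few metrics from a healthy system; the metric names, the sample values, and the 20% tolerance are all hypothetical choices for illustration.

```python
def deviations(baseline, current, tolerance=0.20):
    """Return metrics whose relative change from the baseline exceeds tolerance."""
    flagged = {}
    for name, base_value in baseline.items():
        if name not in current or base_value == 0:
            continue  # nothing to compare, or a relative change is undefined
        change = (current[name] - base_value) / base_value
        if abs(change) > tolerance:
            flagged[name] = round(change, 2)
    return flagged


# Hypothetical readings: a healthy baseline and a later measurement.
baseline = {"cpu_idle_pct": 80.0, "disk_tps": 50.0, "mem_used_kb": 400000}
current = {"cpu_idle_pct": 40.0, "disk_tps": 55.0, "mem_used_kb": 410000}

print(deviations(baseline, current))
```

Here only the CPU idle time is flagged, because it fell by half while the other two metrics stayed within the tolerance.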

Performance Problems:

  • Software
  • Hardware
  • Combination of the two

Software Problems

Software that requires too many system resources may monopolize the CPU, memory, and peripheral devices, creating poor performance.
Too many processes may be running, or rogue processes may be present.

Hardware Problems

  • Improperly configured hardware (may still work)
  • Old hardware (most companies retire computer equipment after two to five years of use)
  • Jabbering: hardware sending large amounts of information to the CPU when not in use

Resolutions

Software problems can sometimes be resolved by changing hardware.

  • Move or remove the software
  • Upgrade the CPU or add another CPU
  • Use bus mastering peripheral components (devices that can perform processes normally performed by the CPU)
  • Add RAM to increase system speed
  • Replace slower disk drives with faster ones
  • Use disk striping RAID
  • Keep CD and DVD drives on a separate hard disk controller

Monitoring Performance with sysstat Utilities

The System Statistics (sysstat) package contains utilities that monitor the system using information from the /proc directory and system devices. To install the latest version of sysstat on a Linux system that uses the yum package manager, run:

yum install sysstat

Three of the performance monitoring utilities in the System Statistics (sysstat) package are:

  • mpstat (multiple processor statistics)
  • iostat (input/output statistics)
  • sar (system activity reporter)

mpstat (multiple processor statistics)

Used to monitor CPU performance for all processors on the system since the system was started or rebooted.

To monitor a single CPU, use the -P option followed by the processor number.

Example: mpstat -P 0 displays statistics for the first processor on the system. mpstat is limited in its abilities.


Examining the Output of the mpstat command

  • %user= % of time the processor spent executing user programs and daemons
  • %nice= % of time the processor spent executing programs and daemons that had nondefault nice values
  • %sys= % of time the processor spent executing kernel (system-level) code
  • %iowait= % of time the CPU was idle when an outstanding disk I/O request existed.
  • %irq= % of time the CPU spent servicing hardware interrupts
  • %soft= % of time the CPU spent servicing software interrupts
  • %steal= % of time a virtual CPU spent waiting while the hypervisor serviced another virtual CPU
  • %guest= % of time the CPU is executing another virtual CPU
  • %idle= % of time the CPU did not spend executing tasks. Should be greater than 25% over a long period of time.
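
The 25% idle guideline can be checked against a line of mpstat output with a short script. The Python sketch below is illustrative only: the header and data row are a hypothetical sample, and column names differ slightly between sysstat versions, so the parser reads the names from the header instead of assuming fixed positions.

```python
def parse_mpstat_row(header, row):
    """Map the % column names in an mpstat header to the values in one data row."""
    names = [name for name in header.split() if name.startswith("%")]
    # The percentage columns are the last len(names) fields of the row.
    values = row.split()[-len(names):]
    return dict(zip(names, (float(value) for value in values)))


def cpu_is_overloaded(stats, min_idle=25.0):
    """Return True when %idle is below the guideline value."""
    return stats["%idle"] < min_idle


# Hypothetical mpstat output: one header line and one data row.
sample_header = "03:32:01 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle"
sample_row = "03:32:01 AM  all   62.10    0.00   18.40    4.50    0.20    0.30    0.00    0.00   14.50"

stats = parse_mpstat_row(sample_header, sample_row)
print(stats["%idle"], cpu_is_overloaded(stats))
```

A single low reading is not conclusive, since the guideline applies to %idle over a long period of time; several samples should be averaged before drawing a conclusion.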

iostat (input/output statistics)

Measures the flow of information to and from disk devices.
Displays CPU statistics similar to mpstat.
Limited in abilities.
Adds transfers per second (tps) and block read/write statistics.

sar (system activity reporter)

Displays more information than the mpstat or iostat command.
Displays CPU statistics by default.
Most widely used performance monitoring tool on UNIX and Linux systems.
Scheduled using the cron daemon to run every 10 minutes for the current day.

The output is logged to a file in the /var/log/sa directory called sa#, where # represents the day of the month.

Only one month of records is kept, but this can be changed by editing the cron table located at /etc/cron.d/sysstat.
sar can display different statistics when options are specified.
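
Because the log file name depends only on the day of the month, the file for a given date can be computed. The Python sketch below assumes the default /var/log/sa directory and the two-digit sa# naming (for example, sa05 for the 5th); the function name is hypothetical.

```python
import datetime


def sa_file_for(day, directory="/var/log/sa"):
    """Return the sar log file path for a date, such as sa25 for the 25th."""
    return f"{directory}/sa{day.day:02d}"


print(sa_file_for(datetime.date(2012, 4, 25)))
```

A path built this way can be passed to sar with its -f option to read the statistics recorded on that day. Since only the day number is used, last month's file for the same day is overwritten, which is why only about one month of records is available.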

Common options with the sar command

  • sar -u: CPU usage of all CPUs (default)
  • sar -P: CPU usage of an individual CPU or core
  • sar -r: free and used memory
  • sar -W: swapping statistics
  • sar -q: run queue and load average
  • sar -u 1 3: real-time CPU usage, sampled every 1 second, 3 times

Other Performance Monitoring Utilities

  • top utility (discussed in Chapter 9)
  • The free command displays the total amounts of physical and swap memory (in kilobytes) and their utilization.
  • The vmstat command displays more information than the free command and helps indicate whether more physical memory is required.
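
To turn the totals printed by free into utilization percentages, the output can be parsed as in the Python sketch below. The sample output is hypothetical, and the sketch assumes only that the Mem: and Swap: lines list the total and used amounts as their first two numbers.

```python
def memory_utilization(free_output):
    """Return the percentage used for Mem and Swap from free command output."""
    result = {}
    for line in free_output.splitlines():
        fields = line.split()
        if fields and fields[0] in ("Mem:", "Swap:"):
            total, used = int(fields[1]), int(fields[2])
            percent = round(100.0 * used / total, 1) if total else 0.0
            result[fields[0].rstrip(":")] = percent
    return result


# Hypothetical free output, values in kilobytes.
sample_free = """\
             total       used       free     shared    buffers     cached
Mem:       1000000     750000     250000          0      50000     300000
-/+ buffers/cache:     400000     600000
Swap:      2000000     100000    1900000
"""

print(memory_utilization(sample_free))
```

The Mem figure counts buffers and cache as used memory, so a high value is not by itself a sign that more physical memory is required. Steadily growing swap usage is the more telling signal.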