Chapter 14 Study Guide: Difference between revisions

Revision as of 03:32, 25 April 2012

Troubleshooting Methodology

Proactive Maintenance

Taking the necessary steps to minimize future problems
Includes performing system backups and identifying potential problem areas

Reactive Maintenance

Correcting problems when they arise
Always document the solution to help quickly resolve future problems

Troubleshooting Procedures

Gather as much information as possible

System log files
Run information utilities such as ps or mount
“tail -f /path/to/logfile” opens a log file for continuous viewing

Isolate the problem

Determine whether the problem is persistent or intermittent, and how many users are affected

List possible causes and solutions

Search engines such as Google are often the quickest way to find known causes and solutions

Implement and test solution
Document your solution and process

Prioritize problems

Solve the most severe problems first
Spending too much time on small problems can result in reduced productivity

Try to solve the root of the problem

A short-term solution might fail in the long term because of an underlying problem
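The information-gathering step above can be sketched as a small shell function. This is only a minimal sketch: the function name scan_log and the search pattern are assumptions chosen for illustration, and the log path in the comments varies by distribution.

```shell
#!/bin/sh
# scan_log FILE [COUNT]: print the last COUNT lines (default 20) of FILE that
# mention an error, a failure, or a warning.
# The function name and the pattern are illustrative assumptions.
scan_log() {
    file=$1
    count=${2:-20}
    grep -i -E 'error|fail|warn' "$file" | tail -n "$count"
}

# Example use while gathering information:
#   scan_log /var/log/messages 10   # recent suspicious log entries
#   ps -ef                          # list running processes
#   mount                           # list mounted filesystems
#   tail -f /var/log/messages       # follow the log as new entries arrive
```

Saving the output of these commands alongside the written solution also helps with the documentation step.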

Hardware Related Problems

Can come from damaged hardware or improper hardware or software configuration
Using the dmesg command or viewing the /var/log/boot.log and /var/log/messages files can isolate most hardware problems
Missing or incorrect drivers prevent the OS from using the associated hardware

Use lsusb to view only USB devices
Use lspci to view only PCI devices

The lsmod command lists the drivers (kernel modules) loaded into the kernel

By comparing the output of dmesg, lsusb, and lspci with the lsmod output, you can determine if a driver is missing

Hard drives are the most common hardware component to fail
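The comparison of detected devices against loaded drivers can be partly automated. The sketch below assumes you already know which module names a device needs (for example from dmesg output or from lspci -k, which shows the kernel driver in use for each PCI device). The function name missing_modules and the module names in the usage comment are illustrative assumptions.

```shell
#!/bin/sh
# missing_modules NAME...: read lsmod-style output on standard input and
# print each NAME that does not appear in the list of loaded modules.
# The function name is an illustrative assumption.
missing_modules() {
    # Skip the header line and keep only the first column (module name).
    loaded=$(awk 'NR > 1 { print $1 }')
    for name in "$@"; do
        printf '%s\n' "$loaded" | grep -q -x -F -- "$name" || printf '%s\n' "$name"
    done
}

# Example use (module names here are placeholders):
#   lsmod | missing_modules e1000 usbhid
```

An empty result means every named module is loaded. Any name printed is a candidate for a missing driver.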

Software Related Problems

Can be application or OS related
Application Related Problems

Applications can fail during execution due to missing program libraries and files, process restrictions, or conflicting applications
Identify missing files in a package by using the -V option with the rpm command
Use the ldd command to identify which shared libraries a program requires
It is good practice to run the ldconfig command to ensure the shared library cache is up to date after libraries are added or removed
The ulimit command can be used to increase the number of processes the user can start in a shell
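A missing shared library shows up in ldd output as a line ending in "not found". The following minimal sketch filters for those lines. The function name missing_libs is an assumption, and the program path and package name in the comments are placeholders.

```shell
#!/bin/sh
# missing_libs: read ldd output on standard input and print the name of
# every shared library that ldd could not resolve.
# The function name is an illustrative assumption.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example use (placeholders, adjust to the failing application):
#   ldd /path/to/program | missing_libs
#   rpm -V packagename      # verify the files installed by a package
#   ulimit -u               # in bash: show the per-user process limit
```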

Operating System Related Problems

Typically include problems with X Windows, boot loaders, and filesystems
Use the xwininfo or xdpyinfo commands to attempt to isolate problems with X Windows
Adding the word “linear” to the /etc/lilo.conf file and removing the word “compact” from it often fixes LILO boot loader problems
GRUB boot loader errors are typically the result of a missing file in the /boot directory
Filesystems can become corrupted from heavy hard drive use
Corrupted filesystems can be identified by very slow write requests, errors printed to the console, or failure to mount

User Interface Related Problems

Users need to understand how to use their desktop environment, but often will not
Assistive technologies are tools you can use to modify your desktop experience

Accessed by opening the System menu and navigating to Preferences, Assistive Technologies

Linux Performance

Monitor system performance using the command-line utilities included in the sysstat package. To make performance problems easier to identify, a network administrator should run performance utilities on healthy Linux systems to develop a baseline.

Performance Problems:

Software
Hardware
Combination of the two

Software Problems

Software that requires too many system resources may monopolize the CPU, memory, and peripheral devices, creating poor performance.
Too many running processes, or rogue processes, can have the same effect.

Hardware Problems

Improperly configured hardware (may still work)
Old hardware (most companies retire computer equipment after two to five years of use)
Jabbering: a hardware device sends large amounts of information to the CPU when it is not in use

Resolutions

Software problems can sometimes be resolved by changing hardware.
Move or remove the software
Upgrade the CPU or add another CPU
Use bus mastering peripheral components (devices that can perform processes normally performed by the CPU)
Add RAM to increase system speed
Replace slower disk drives with faster ones
Use disk striping (RAID)
Keep CD and DVD drives on a separate hard disk controller

Monitoring Performance with sysstat Utilities

The System Statistics (sysstat) package contains utilities that monitor the system using information from the /proc directory and system devices. To install the latest version of sysstat on a Linux system that uses yum, run:

yum install sysstat

Three of the performance monitoring utilities in the System Statistics (sysstat) package are:

mpstat (multiple processor statistics) command
iostat (input/output statistics) command
sar (system activity reporter) command

mpstat (multiple processor statistics)

Used to monitor CPU performance. By default, it reports statistics for all processors on the system since the system was started or rebooted.

To monitor a single CPU, use the -P option followed by the processor number.

Example: mpstat -P 0 would display statistics for the first processor on the system.
Limited in abilities

Examining the Output of the mpstat command

%user = % of time the processor spent executing user programs and daemons (shown as %usr in newer versions)
%nice = % of time the processor spent executing programs and daemons that had nondefault nice values
%sys = % of time the processor spent executing at the kernel (system) level
%iowait = % of time the CPU was idle while an outstanding disk I/O request existed
%irq = % of time the CPU spent servicing hardware interrupts
%soft = % of time the CPU spent servicing software interrupts
%steal = % of time a virtual CPU spent waiting while the hypervisor serviced another virtual CPU
%guest = % of time the CPU spent running a virtual CPU
%idle = % of time the CPU did not spend executing tasks. Should be greater than 25% over a long period of time.
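The 25% idle guideline can be checked with a short filter over mpstat output. This is a sketch under stated assumptions: the function name low_idle is hypothetical, it locates the CPU and %idle columns from the header line rather than by fixed position, and it expects a period as the decimal separator (running mpstat with LC_ALL=C is one way to get that).

```shell
#!/bin/sh
# low_idle [LIMIT]: read mpstat output on standard input and print the CPU
# label and %idle value of every row whose %idle is below LIMIT (default 25).
# The function name is an illustrative assumption.
low_idle() {
    threshold=${1:-25}
    awk -v limit="$threshold" '
        # Header row: remember which columns hold the CPU label and %idle.
        /%idle/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU")   cpu = i
                if ($i == "%idle") idle = i
            }
            next
        }
        # Data rows: report those below the limit.
        idle && NF >= idle && ($idle + 0) < limit { print $cpu, $idle }
    '
}

# Example use: five one-minute samples for every processor
#   LC_ALL=C mpstat -P ALL 60 5 | low_idle 25
```

A single low reading is not a problem by itself. The guideline refers to a long period, so sampling at intervals over time gives a more meaningful result.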

iostat (input/output statistics)

Measures the flow of information to and from disk devices
Displays CPU statistics similar to mpstat
Limited in abilities
Adds transfers per second (tps) and block read and write statistics

sar (system activity reporter)

Displays more information than the mpstat or iostat command
Displays CPU statistics by default
Most widely used performance monitoring tool on UNIX and Linux systems
Scheduled using the cron daemon to run every 10 minutes for the current day

Output is logged to a file in the /var/log/sa directory called sa#, where # represents the day of the month

Only one month of records is kept by default, but this can be changed by editing the cron table located at /etc/cron.d/sysstat
Can display different statistics by specifying options
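Because each day is logged to its own file, earlier statistics can be read back with the -f option of sar. The helper below is a minimal sketch that builds the file name for a given day of the month. The function name sa_file is an assumption, and the sketch assumes the day number in the file name is zero-padded to two digits (sa05 rather than sa5).

```shell
#!/bin/sh
# sa_file [DAY]: print the path of the sar data file for DAY of the current
# month (default: today). The function name is an illustrative assumption.
sa_file() {
    day=${1:-$(date +%d)}
    # Strip one leading zero so that values such as 08 are not read as octal.
    printf '/var/log/sa/sa%02d\n' "${day#0}"
}

# Example use: CPU statistics recorded on the 15th
#   sar -u -f "$(sa_file 15)"
```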

Common options with the sar command

CPU usage of all CPUs (sar -u), the default
CPU usage of an individual CPU or core (sar -P followed by the processor number, or sar -P ALL for every processor)
Memory free and used (sar -r)
Swapping statistics (sar -W)
Run queue and load average (sar -q)
Example: sar -u 1 3 displays real-time CPU usage every 1 second, 3 times.

Other Performance Monitoring Utilities

top utility (discussed in Chapter 9)
The free command can be used to display the total amounts of physical and swap memory (in kilobytes) and their utilization.
The vmstat command displays more information than the free command, which helps indicate whether more physical memory is required.
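The totals reported by free can be turned into a utilization percentage with a short filter. This is a sketch only: the function name mem_usage is an assumption, and it divides the "used" column by the "total" column exactly as free prints them. Note that what counts as "used" on the Mem: line differs between versions of free (older versions include buffers and cache).

```shell
#!/bin/sh
# mem_usage: read the output of free on standard input and print the
# percentage of physical memory and swap in use.
# The function name is an illustrative assumption.
mem_usage() {
    awk '$1 == "Mem:" || $1 == "Swap:" {
        # $2 is the total and $3 is the used amount; skip rows with no total
        # (for example, a system without swap).
        if ($2 > 0) printf "%s %.0f%%\n", $1, $3 * 100 / $2
    }'
}

# Example use:
#   free -k | mem_usage
#   vmstat 5 3      # three samples, five seconds apart
```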
