KDUMP TESTS AUTOMATION SUITE
----------------------------

The kdump test automation suite helps run the kdump tests and report results.
The test scripts cycle through a series of crash scenarios. Each test cycle does
the following:

1.  Sets up a crash scenario.
2.  Forces a crash.
3.  Kdump kernel boots and saves a vmcore.
4.  System reboots to 1st kernel.
5.  vmcore is validated and results are saved.
6.  After a 1 to 2 minute delay, the next crash scenario is set up and run.

The scripts make use of the crasher module for basic testing of kdump and the 
new Linux Kernel Dump Test Module (LKDTM) for more involved testing. LKDTM makes
use of the kprobes infrastructure for inserting crashpoints into the kernel at 
run-time. Thus the kernel need not be patched and rebuilt.

KDUMP TEST INSTRUCTIONS
-----------------------

Follow these steps to set up the kdump test automation suite.

The tests are written for SUSE Linux Enterprise Server 10 (and later releases)
as well as Red Hat Enterprise Linux 5. Since kdump is supported by these
distros, the tests were written and tested on them. Contributions towards
supporting more distributions are welcome.

1. Install these additional packages:

For the SLES10 distro:
   
   * kernel-kdump
   * kernel-source
   * kexec-tools
   * zlib-64bit-<xxx> (ppc64 only)

For the RHEL5 distro:

   * kexec-tools
   * kernel-debuginfo rpm
   * kernel-kdump (ppc64 only)

2.  Make sure the partition where the tests are running has space for the test
results and one vmcore file (the size of physical memory). Currently the test
suite supports copying the dump to the local disk only.
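
As a rough pre-check for this step, the sketch below compares total physical
memory with the free space on the current partition. It is not part of the test
suite: the function name fits_vmcore is made up for illustration, and the check
ignores the extra space needed for the test results.

```shell
# Succeed if avail_kb is large enough to hold one vmcore of mem_kb.
# (fits_vmcore is a hypothetical helper, not part of the test suite.)
fits_vmcore() {
    mem_kb=$1
    avail_kb=$2
    [ "$avail_kb" -gt "$mem_kb" ] 2>/dev/null
}

# Total physical memory in kB, as reported by the kernel.
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)
# Free space in kB on the partition holding the current directory.
free_kb=$(df -Pk . | awk 'NR==2 {print $4}')

if fits_vmcore "$mem_total_kb" "$free_kb"; then
    echo "enough space for one vmcore"
else
    echo "not enough space for one vmcore (or sizes could not be read)"
fi
```

Run it from the test suite directory so that "." refers to the partition where
the results and the vmcore will be stored.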

3.  Check if kernel has been booted with 'crashkernel=' parameter. If not, edit
the appropriate bootloader configuration file and make the following changes:

  * On i386 & x86_64, edit /boot/grub/menu.lst and add "crashkernel=128M@16M" to
    the kernel boot parameters. This reserves 128MB of memory for the kdump 
    kernel starting at address 0x1000000.
  * On PPC64, edit /etc/yaboot.conf and add "crashkernel=128M@32M xmon=off" to 
    the kernel boot parameters. This reserves 128MB of memory for the kdump 
    kernel starting at address 0x2000000 and makes sure xmon is off.
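
The following sketch shows one way to check whether the running kernel was
booted with 'crashkernel=' and how the offsets above map to addresses. The
function names crashkernel_value and offset_to_hex are invented for this
example, and offset_to_hex only handles offsets written with an "M" suffix.

```shell
# Print the value of crashkernel= from a kernel command line string.
# Returns non-zero if the parameter is absent. (Hypothetical helper.)
crashkernel_value() {
    for word in $1; do
        case $word in
            crashkernel=*)
                echo "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Convert an offset given in megabytes (e.g. 16M) to a hex address.
offset_to_hex() {
    printf '0x%x\n' $(( ${1%M} * 1024 * 1024 ))
}

cmdline=$(cat /proc/cmdline 2>/dev/null)
if value=$(crashkernel_value "$cmdline"); then
    echo "booted with crashkernel=$value"
else
    echo "crashkernel= is not set; edit the bootloader configuration"
fi

offset_to_hex 16M    # 0x1000000, the i386/x86_64 offset above
offset_to_hex 32M    # 0x2000000, the PPC64 offset above
```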

4.  Edit the bootloader configuration file to also add the appropriate
'nmi_watchdog=' parameter.
 
5.  Reboot. On i386 & x86_64, verify that /proc/iomem shows that space has been
reserved for the crash kernel.
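
A minimal sketch of that check, assuming the reservation appears in /proc/iomem
as a region labelled "Crash kernel" (the helper name has_crash_region is
hypothetical):

```shell
# Succeed if the given iomem-style file lists a "Crash kernel" region.
has_crash_region() {
    grep -qi 'crash kernel' "$1"
}

if has_crash_region /proc/iomem 2>/dev/null; then
    echo "crash kernel memory is reserved"
else
    echo "no crash kernel reservation found"
fi
```

With "crashkernel=128M@16M" the matching line would be expected to cover
01000000-08ffffff. Address ranges may read as zeros when /proc/iomem is viewed
without root privileges.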

6.  'cd' to the test suite directory and run "make". Carefully check for any 
errors. If at some point you need to restart the tests from the beginning, 
simply run "make clean" followed by "make" again.

7.  To use the 'crash' utility to check the integrity of the captured dump,
provide the path to a vmlinux with debug info at the time of setup (the path is
generally /usr/lib/debug/lib/modules/<kernel version>/vmlinux).

8.  Run "./master run". Make sure no other copy of master is running before 
launching the master script.

A few important points to remember:
      
* If you need to stop the tests before all tests have run, run "crontab -r" and 
"killall master" within 1 minute after the 1st kernel reboots.

* A failure is likely to occur when booting the kdump kernel. If this happens,
you'll need to manually reset the system so it reboots back to the 1st kernel
and continues on to the next test. For this reason, it's best to monitor the
tests from a console. A serial console is preferable if available, but any type
of console will do. If using minicom, press Ctrl-A and then L on the console to
save the kernel messages displayed in minicom to a file. Otherwise, when you
see that the kdump kernel has failed to boot, manually copy the boot messages
into a file to help debug the cause of the hang.

* The results are saved in <kdump-test-dir>/results/<hostname>.<date-time>. The
"status" file in that directory shows where you are in the test run. When the 
"Test run complete" entry appears in that file, you're done.

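For unattended runs, a small helper along these lines could be used to check
the "status" file for the completion entry. This is only a sketch: the function
name run_complete is made up, and the path to the status file must be supplied
by the caller since it depends on the hostname and start time of the run.

```shell
# Succeed once the status file contains the completion entry.
# (run_complete is a hypothetical helper, not part of the test suite.)
run_complete() {
    grep -q 'Test run complete' "$1"
}

# Pass the status file of the current run as the first argument;
# falls back to ./status if none is given.
status_file=${1:-status}
if run_complete "$status_file" 2>/dev/null; then
    echo "test run finished"
else
    echo "test run still in progress (or status file not found)"
fi
```
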
* The test machine is unavailable for any other work for the duration of the
test run.

