About kernel debugging: if you are not doing device driver development, you may want to try User-Mode Linux (UML), which runs the Linux kernel as an ordinary user-space process so it can be debugged with plain gdb.
http://user-mode-linux.sourceforge.net/
Could anybody try it and share some experience here?
Thanks a lot.
-------------- Here is an example from its site; it looks great ------------------
A debugging session
The following is the beginning of a gdb session with the kernel under gdb from the beginning. It starts at the top of start_kernel() and goes one line at a time through the initial kernel startup.
GNU gdb 4.17.0.11 with Linux support
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) att 1
Attaching to program `/home/dike/linux/2.3.26/um/linux', Pid 1
0x1009f791 in __kill ()
(gdb) b start_kernel
Breakpoint 1 at 0x100ddf83: file init/main.c, line 515.
(gdb) c
Continuing.
Breakpoint 1, start_kernel () at init/main.c:515
515 printk(linux_banner);
(gdb) n
516 setup_arch(&command_line);
(gdb)
517 printk("Kernel command line: %s\n", saved_command_line);
(gdb)
518 parse_options(command_line);
(gdb)
519 trap_init();
(gdb)
520 init_IRQ();
(gdb)
521 sched_init();
(gdb)
522 time_init();
(gdb)
523 softirq_init();
(gdb)
530 console_init();
This is tiring, so I just let it continue booting.
(gdb) c
Continuing.
It's booted, so I ^C it to see what it thinks is up.
Program received signal SIGINT, Interrupt.
0x100a4bc1 in __libc_nanosleep ()
(gdb) bt
#0 0x100a4bc1 in __libc_nanosleep ()
#1 0x100a4b7d in __sleep (seconds=10) at ../sysdeps/unix/sysv/linux/sleep.c:78
#2 0x10095fbf in do_idle () at process_kern.c:424
#3 0x10096052 in cpu_idle () at process_kern.c:450
#4 0x100de0a4 in start_kernel () at init/main.c:593
#5 0x10098df2 in start_kernel_proc (unused=0x0) at um_arch.c:72
#6 0x1009858f in signal_tramp (arg=0x10098db at trap_user.c:50
(gdb)
It's busy sleeping in the idle loop. I'll set a breakpoint in the scheduler and pick it up on the next context switch.
(gdb) b schedule
Breakpoint 2 at 0x10004acd: file sched.c, line 496.
(gdb) c
Continuing.
Breakpoint 2, schedule () at sched.c:496
496 if (!current->active_mm) BUG();
(gdb) bt
#0 schedule () at sched.c:496
#1 0x10095fb3 in do_idle () at process_kern.c:421
#2 0x10096052 in cpu_idle () at process_kern.c:450
#3 0x100de0a4 in start_kernel () at init/main.c:593
#4 0x10098df2 in start_kernel_proc (unused=0x0) at um_arch.c:72
#5 0x1009858f in signal_tramp (arg=0x10098db at trap_user.c:50
Here we are in the scheduler. I'll 'next' through the first few lines of the scheduler, get bored, and set a breakpoint in the SIGIO interrupt handler.
(gdb) n
497 if (tq_scheduler)
(gdb)
501 prev = current;
(gdb)
502 this_cpu = prev->processor;
(gdb)
504 if (in_interrupt())
(gdb)
510 if (softirq_state[this_cpu].active & softirq_state[this_cpu].mask)
(gdb)
518 sched_data = & aligned_data[this_cpu].schedule_data;
(gdb)
520 spin_lock_irq(&runqueue_lock);
(gdb) b sigio_handler
Breakpoint 3 at 0x10094fdc: file irq_user.c, line 36.
(gdb) c
Continuing.
Breakpoint 2, schedule () at sched.c:496
496 if (!current->active_mm) BUG();
Oops, that process scheduled back to the idle thread. Get rid of that breakpoint and continue again.
(gdb) d 2
(gdb) c
Well, the SIGIO handler is waiting for something and nothing is happening by itself. So, I'll type something at one of the login prompts to wake it up.
Breakpoint 3, sigio_handler (sig=29) at irq_user.c:36
36 user_mode = set_user_thread(NULL, 0, 0);
That did the trick. I'll climb down the call chain into the actual driver interrupt handler, starting with a breakpoint in do_IRQ.
(gdb) l
31 struct irq_fd *irq_fd;
32 struct timeval tv;
33 fd_set fds;
34 int i, n, user_mode;
35
36 user_mode = set_user_thread(NULL, 0, 0);
37 if(user_mode){
38 fill_in_regs(process_state(NULL, NULL, NULL), &sig + 1);
39 change_sig(SIGUSR1, 1);
40 }
(gdb) l
41 fds = active_fd_mask;
42 tv.tv_sec = 0;
43 tv.tv_usec = 0;
44 if((n = select(max_fd + 1, &fds, NULL, NULL, &tv)) < 0){
45 printk("sigio_handler : select returned %d, "
46 "errno = %d\n", n, errno);
47 return;
48 }
49 for(i=0;i<=max_fd;i++){
50 if(FD_ISSET(i, &fds)) FD_CLR(i, &active_fd_mask);
(gdb) l
51 }
52 for(irq_fd=active_fds;irq_fd != NULL;irq_fd = irq_fd->next){
53 if(FD_ISSET(irq_fd->fd, &fds)) do_IRQ(irq_fd->irq, user_mode);
54 }
55 if(user_mode){
56 interrupt_end();
57 change_sig(SIGUSR1, 0);
58 }
59 set_user_thread(NULL, user_mode, 0);
60 }
(gdb) b do_IRQ
Breakpoint 4 at 0x10094960: file irq.c, line 266.
(gdb) c
Continuing.
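An aside on the sigio_handler listing above: lines 41 to 53 poll every watched descriptor with a zero-timeout select() and call do_IRQ for each one that is readable. Below is a minimal user-space sketch of that pattern. It is not UML source: service_ready_fds is a hypothetical name, and reading one byte stands in for the do_IRQ call.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical sketch, not UML source: poll the descriptors in `watched`
 * once without blocking and read one byte from each readable one.
 * Returns the number of descriptors serviced, or -1 if select() fails. */
int service_ready_fds(const fd_set *watched, int max_fd)
{
    struct timeval tv = { 0, 0 };  /* zero timeout: select() returns at once */
    fd_set ready = *watched;       /* select() overwrites the set it is given */
    int fd, serviced = 0;

    assert(watched != NULL && max_fd < FD_SETSIZE);
    if (select(max_fd + 1, &ready, NULL, NULL, &tv) < 0)
        return -1;

    for (fd = 0; fd <= max_fd; fd++) {
        char byte;
        /* Stand-in for do_IRQ(): consume one byte of pending input. */
        if (FD_ISSET(fd, &ready) && read(fd, &byte, 1) == 1)
            serviced++;
    }
    return serviced;
}
```

Copying the set before calling select() matters because select() rewrites its argument to hold only the ready descriptors. The listing does the same thing with its local fds variable.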
From here, I'll go into handle_IRQ_event.
Breakpoint 4, do_IRQ (irq=2, user_mode=0) at irq.c:266
266 irq_desc_t *desc = irq_desc + irq;
(gdb) n
271 regs.user_mode = user_mode;
(gdb)
272 kstat.irqs[cpu][irq]++;
(gdb)
274 desc->handler->ack(irq);
(gdb)
279 status = desc->status & ~(IRQ_REPLAY | IRQ_WAITING);
(gdb)
280 status |= IRQ_PENDING; /* we _want_ to handle it */
(gdb)
286 action = NULL;
(gdb)
287 if (!(status & (IRQ_DISABLED | IRQ_INPROGRESS))) {
(gdb) l
282 /*
283 * If the IRQ is disabled for whatever reason, we cannot
284 * use the action we have.
285 */
286 action = NULL;
287 if (!(status & (IRQ_DISABLED | IRQ_INPROGRESS))) {
288 action = desc->action;
289 status &= ~IRQ_PENDING; /* we commit to handling */
290 status |= IRQ_INPROGRESS; /* we are handling it */
291 }
(gdb) l
292 desc->status = status;
293
294 /*
295 * If there is no IRQ handler or it was disabled, exit early.
296 * Since we set PENDING, if another processor is handling
297 * a different instance of this same irq, the other processor
298 * will take care of it.
299 */
300 if (!action)
301 goto out;
(gdb) l
302
303 /*
304 * Edge triggered interrupts need to remember
305 * pending events.
306 * This applies to any hw interrupts that allow a second
307 * instance of the same irq to arrive while we are in do_IRQ
308 * or in the handler. But the code here only handles the _second_
309 * instance of the irq, not the third or fourth. So it is mostly
310 * useful for irq hardware that does not mask cleanly in an
311 * SMP environment.
(gdb) l
312 */
313 for (;;) {
314 spin_unlock(&desc->lock);
315 handle_IRQ_event(irq, &regs, action);
316 spin_lock(&desc->lock);
317
318 if (!(desc->status & IRQ_PENDING))
319 break;
320 desc->status &= ~IRQ_PENDING;
321 }
(gdb) b 315
Breakpoint 5 at 0x100949b7: file irq.c, line 315.
(gdb) c
Continuing.
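An aside on the do_IRQ listing above: the PENDING and INPROGRESS bits let a second instance of an interrupt that arrives mid-handler be replayed by whoever is already handling the first. Here is a rough single-threaded sketch of that bookkeeping. The flag values, struct irq_line, deliver_irq and raise_again_once are all made up for illustration, locking is left out, and clearing INPROGRESS after the loop is an assumption because the listing stops before that point.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values are illustrative assumptions, not the kernel's definitions. */
enum { IRQ_INPROGRESS = 1, IRQ_DISABLED = 2, IRQ_PENDING = 4 };

struct irq_line {
    unsigned status;
    void (*handler)(struct irq_line *line);
    int handled;                      /* how many times the handler ran */
};

/* Hypothetical sketch of the status logic shown at irq.c:279-321. */
void deliver_irq(struct irq_line *line)
{
    unsigned status;
    void (*handler)(struct irq_line *) = NULL;

    assert(line != NULL && line->handler != NULL);
    status = line->status | IRQ_PENDING;      /* we want to handle it */

    if (!(status & (IRQ_DISABLED | IRQ_INPROGRESS))) {
        handler = line->handler;
        status &= ~IRQ_PENDING;               /* we commit to handling */
        status |= IRQ_INPROGRESS;             /* we are handling it */
    }
    line->status = status;
    if (!handler)
        return;   /* PENDING stays set for whoever is already handling it */

    for (;;) {
        line->handled++;
        handler(line);
        if (!(line->status & IRQ_PENDING))
            break;
        line->status &= ~IRQ_PENDING;         /* replay the second instance */
    }
    line->status &= ~IRQ_INPROGRESS;          /* assumption: cleared on exit */
}

/* Example handler: simulates a second interrupt arriving mid-handler. */
void raise_again_once(struct irq_line *line)
{
    if (line->handled == 1)
        deliver_irq(line);
}
```

With raise_again_once installed, a single deliver_irq call should run the handler twice: the nested call only sets PENDING and returns, and the outer loop picks it up.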
Next, I'll step into handle_IRQ_event and stop just before entering the driver.
Breakpoint 5, do_IRQ (irq=2, user_mode=0) at irq.c:315
315 handle_IRQ_event(irq, &regs, action);
(gdb) s
handle_IRQ_event (irq=2, regs=0x10113c40, action=0x50fef380) at irq.c:141
141 irq_enter(cpu, irq);
(gdb) l
136 struct irqaction * action)
137 {
138 int status;
139 int cpu = smp_processor_id();
140
141 irq_enter(cpu, irq);
142
143 status = 1; /* Force the "do bottom halves" bit */
144
145 if (!(action->flags & SA_INTERRUPT))
(gdb) l
146 __sti();
147
148 do {
149 status |= action->flags;
150 action->handler(irq, action->dev_id, regs);
151 action = action->next;
152 } while (action);
153 if (status & SA_SAMPLE_RANDOM)
154 add_interrupt_randomness(irq);
155 __cli();
(gdb) l
156
157 irq_exit(cpu, irq);
158
159 return status;
160 }
161
162 /*
163 * Generic enable/disable code: this just calls
164 * down into the PIC-specific version for the actual
165 * hardware disable after having gotten the irq
(gdb) b 150
Breakpoint 6 at 0x10094813: file irq.c, line 150.
(gdb) c
Continuing.
Breakpoint 6, handle_IRQ_event (irq=2, regs=0x10113c40, action=0x50fef380)
at irq.c:150
150 action->handler(irq, action->dev_id, regs);
So, here we are in the console driver. I think I've made whatever point I was making, so I'll just delete all the breakpoints, and let the kernel run so I can log in and halt it.
(gdb) s
con_handler (irq=2, dev=0x10120000, unused=0x10113c40) at stdio_console.c:41
41 stdio_rcv_proc(term->fd);
(gdb)
(gdb) d
Delete all breakpoints? (y or n) y
(gdb) c
Continuing.
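One more note on the handle_IRQ_event listing (irq.c:148-152 in the session): the do/while loop walks a linked list of actions, so several handlers can share one IRQ, and it ORs their flags together along the way. Below is a small stand-alone sketch of that loop. struct action, run_actions and count_call are hypothetical names, the handler signature is simplified, and status starts at 0 here rather than at the forced 1 in the listing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct irqaction. */
struct action {
    unsigned flags;
    void (*handler)(int irq, void *dev_id);
    void *dev_id;
    struct action *next;              /* next handler sharing this IRQ */
};

/* Call every handler on the chain and return the OR of their flags. */
unsigned run_actions(int irq, struct action *action)
{
    unsigned status = 0;

    assert(action != NULL);           /* do/while needs at least one entry */
    do {
        status |= action->flags;
        action->handler(irq, action->dev_id);
        action = action->next;
    } while (action);
    return status;
}

/* Example handler: counts its calls through the dev_id pointer. */
void count_call(int irq, void *dev_id)
{
    (void)irq;
    ++*(int *)dev_id;
}
```

In the session above the chain for irq 2 leads to con_handler, which is why stepping at line 150 lands in stdio_console.c.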