compilation failure on v850-elf target. #1
This issue seems to be caused by Clang. When I changed the compiler's optimization level to -O0, as-new crashed.
It is not only Clang but also Apple-gcc42: Segmentation fault (signal 11) on Apple-gcc42.
monaka pushed a commit that referenced this issue on May 15, 2014

PR tui/14880 shows a reproducer that triggers this assertion:

    int
    value_available_contents_eq (const struct value *val1, int offset1,
                                 const struct value *val2, int offset2,
                                 int length)
    {
      int idx1 = 0, idx2 = 0;

      /* This routine is used by printing routines, where we should
         already have read the value.  Note that we only know whether a
         value chunk is available if we've tried to read it.  */
      gdb_assert (!val1->lazy && !val2->lazy);

    (top-gdb) bt
    #0  internal_error (file=0x88a26c "../../src/gdb/value.c", line=549, string=0x88a220 "%s: Assertion `%s' failed.") at ../../src/gdb/utils.c:844
    #1  0x000000000057b9cd in value_available_contents_eq (val1=0x10fa900, offset1=0, val2=0x10f9e10, offset2=0, length=8) at ../../src/gdb/value.c:549
    #2  0x00000000004fd756 in tui_get_register (frame=0xd5c430, data=0x109a548, regnum=0, changedp=0x109a560) at ../../src/gdb/tui/tui-regs.c:736
    #3  0x00000000004fd111 in tui_check_register_values (frame=0xd5c430) at ../../src/gdb/tui/tui-regs.c:521
    #4  0x0000000000501884 in tui_check_data_values (frame=0xd5c430) at ../../src/gdb/tui/tui-windata.c:234
    #5  0x00000000004f976f in tui_selected_frame_level_changed_hook (level=1) at ../../src/gdb/tui/tui-hooks.c:222
    #6  0x00000000006f0681 in select_frame (fi=0xd5c430) at ../../src/gdb/frame.c:1490
    #7  0x00000000005dd94b in up_silently_base (count_exp=0x0) at ../../src/gdb/stack.c:2268
    #8  0x00000000005dd985 in up_command (count_exp=0x0, from_tty=1) at ../../src/gdb/stack.c:2280
    #9  0x00000000004dc5cf in do_cfunc (c=0xd3f720, args=0x0, from_tty=1) at ../../src/gdb/cli/cli-decode.c:113
    #10 0x00000000004df664 in cmd_func (cmd=0xd3f720, args=0x0, from_tty=1) at ../../src/gdb/cli/cli-decode.c:1888
    #11 0x00000000006e43e1 in execute_command (p=0xc7e6c2 "", from_tty=1) at ../../src/gdb/top.c:489

The fix is to fetch the value before comparing the contents. The comment additions to value.h explain why it can't be value_available_contents_eq itself that fetches the contents.

Tested on x86_64 Fedora 17.

gdb/
2013-06-28  Pedro Alves  <[email protected]>

    PR tui/14880
    * tui/tui-regs.c (tui_get_register): Fetch register value contents before checking whether they're available.
    * value.c (value_available_contents_eq): Change comment.
    * value.h (value_available_contents_eq): Expand comment.
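The "fetch before comparing" fix from the commit above can be illustrated with a toy model. This is only a minimal sketch and not GDB's code: `toy_value`, `toy_fetch_lazy`, `toy_contents_eq` and `toy_register_changed` are hypothetical names, and the target read is simulated by copying from a backing buffer.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for a debugger value whose contents are read on
   demand.  All names here are hypothetical, not GDB's.  */
struct toy_value
{
  int lazy;                      /* Nonzero until contents are fetched.  */
  const unsigned char *source;   /* Simulated target memory.  */
  unsigned char contents[8];
};

/* Read the contents from the simulated target, if not done yet.  */
static void
toy_fetch_lazy (struct toy_value *val)
{
  if (val->lazy)
    {
      memcpy (val->contents, val->source, sizeof val->contents);
      val->lazy = 0;
    }
}

/* Compare contents.  Both values must already have been fetched.  In
   this sketch the comparison takes const pointers, so it cannot do
   the fetch itself.  */
static int
toy_contents_eq (const struct toy_value *a, const struct toy_value *b)
{
  assert (!a->lazy && !b->lazy);
  return memcmp (a->contents, b->contents, sizeof a->contents) == 0;
}

/* Caller-side fix: fetch first, then compare.  Returns nonzero if the
   two values differ.  */
static int
toy_register_changed (struct toy_value *old_val, struct toy_value *new_val)
{
  toy_fetch_lazy (old_val);
  toy_fetch_lazy (new_val);
  return !toy_contents_eq (old_val, new_val);
}
```

Calling `toy_contents_eq` directly on a value that is still lazy trips the assertion, which mirrors the failure in the backtrace; going through `toy_register_changed` does not.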
monaka pushed a commit that referenced this issue on May 15, 2014

In entry-values.exp, we have a test where the entry value of 'j' is unavailable, so it is expected that printing j@entry yields "<unavailable>". However, the actual output is:

    (gdb) frame
    #0  0x0000000000400540 in foo (i=0, i@entry=2, j=2, j@entry=<error reading variable: Cannot access memory at address 0x6009e8>)

The error is thrown here:

    #0  throw_it (reason=RETURN_ERROR, error=MEMORY_ERROR, fmt=0x8cd550 "Cannot access memory at address %s", ap=0x7fffffffc8e8) at ../../src/gdb/exceptions.c:373
    #1  0x00000000005e2f9c in throw_error (error=MEMORY_ERROR, fmt=0x8cd550 "Cannot access memory at address %s") at ../../src/gdb/exceptions.c:422
    #2  0x0000000000673a5f in memory_error (status=5, memaddr=6293992) at ../../src/gdb/corefile.c:204
    #3  0x0000000000673aea in read_memory (memaddr=6293992, myaddr=0x7fffffffca60 "\200\316\377\377\377\177", len=4) at ../../src/gdb/corefile.c:223
    #4  0x00000000006784d1 in dwarf_expr_read_mem (baton=0x7fffffffcd50, buf=0x7fffffffca60 "\200\316\377\377\377\177", addr=6293992, len=4) at ../../src/gdb/dwarf2loc.c:334
    #5  0x000000000067645e in execute_stack_op (ctx=0x1409480, op_ptr=0x7fffffffce87 "\237<\005@", op_end=0x7fffffffce88 "<\005@") at ../../src/gdb/dwarf2expr.c:1045
    #6  0x0000000000674e29 in dwarf_expr_eval (ctx=0x1409480, addr=0x7fffffffce80 "\003\350\t`", len=8) at ../../src/gdb/dwarf2expr.c:364
    #7  0x000000000067c5b2 in dwarf2_evaluate_loc_desc_full (type=0x10876d0, frame=0xd8ecc0, data=0x7fffffffce80 "\003\350\t`", size=8, per_cu=0xf24c40, byte_offset=0) at ../../src/gdb/dwarf2loc.c:2236
    #8  0x000000000067cc65 in dwarf2_evaluate_loc_desc (type=0x10876d0, frame=0xd8ecc0, data=0x7fffffffce80 "\003\350\t`", size=8, per_cu=0xf24c40) at ../../src/gdb/dwarf2loc.c:2407
    #9  0x000000000067a5d4 in dwarf_entry_parameter_to_value (parameter=0x13a7960, deref_size=18446744073709551615, type=0x10876d0, caller_frame=0xd8ecc0, per_cu=0xf24c40) at ../../src/gdb/dwarf2loc.c:1160
    #10 0x000000000067a962 in value_of_dwarf_reg_entry (type=0x10876d0, frame=0xd8de70, kind=CALL_SITE_PARAMETER_DWARF_REG, kind_u=...) at ../../src/gdb/dwarf2loc.c:1310
    #11 0x000000000067aaca in value_of_dwarf_block_entry (type=0x10876d0, frame=0xd8de70, block=0xf1c2d4 "Q", block_len=1) at ../../src/gdb/dwarf2loc.c:1363
    #12 0x000000000067e7c9 in locexpr_read_variable_at_entry (symbol=0x13a7540, frame=0xd8de70) at ../../src/gdb/dwarf2loc.c:3326
    #13 0x00000000005daab6 in read_frame_arg (sym=0x13a7540, frame=0xd8de70, argp=0x7fffffffd0e0, entryargp=0x7fffffffd100) at ../../src/gdb/stack.c:362
    #14 0x00000000005db384 in print_frame_args (func=0x13a7470, frame=0xd8de70, num=-1, stream=0xea3890) at ../../src/gdb/stack.c:669
    #15 0x00000000005dc338 in print_frame (frame=0xd8de70, print_level=1, print_what=SRC_AND_LOC, print_args=1, sal=...) at ../../src/gdb/stack.c:1199
    #16 0x00000000005db8ee in print_frame_info (frame=0xd8de70, print_level=1, print_what=SRC_AND_LOC, print_args=1) at ../../src/gdb/stack.c:851
    #17 0x00000000005da2bb in print_stack_frame (frame=0xd8de70, print_level=1, print_what=SRC_AND_LOC) at ../../src/gdb/stack.c:169
    #18 0x00000000005de236 in frame_command (level_exp=0x0, from_tty=1) at ../../src/gdb/stack.c:2265

dwarf2_evaluate_loc_desc_full (frame #7) knows to handle NOT_AVAILABLE_ERROR errors, but read_memory always throws a generic error. Presently, only the value machinery knows to handle unavailable memory. We need to push the awareness down to the target_xfer layer, making it return a finer grained error indication. We can only return a generic -1 nowadays, which leaves the upper layers with no clue on why the xfer failed.

Use target_xfer_partial directly, rather than propagating the error through target_read_memory so as to get a better address to display in the error message.

(target_read_memory & friends build on top of target_read (thus the target_xfer machinery), but turn all errors to EIO, an errno value. I think this is a mistake, and we'd better convert all these to return a target_xfer_error too, but that can be done separately. I looked around a bit over memory_error calls, and the need to handle random errno values, other than the EIOs gdb itself hardcodes, probably comes (only) from deprecated_xfer_memory, which uses errno for error indication, but I didn't look exhaustively. We should really get rid of deprecated_xfer_memory and of passing down errno values as error indication in target_read & friends methods).

Tested on x86_64 Fedora 17, native and gdbserver. Fixes the test in the PR, which will be added to the testsuite later.

gdb/
2013-08-22  Pedro Alves  <[email protected]>

    PR gdb/15871
    * corefile.c (target_xfer_memory_error): New function.
    (memory_error): Defer EIO to target_memory_error.
    (read_memory): Use target_xfer_partial, and handle finer-grained target xfer errors.
    * target.c (target_xfer_error_to_string): New function.
    (memory_xfer_partial_1): If memory is known to be unavailable, return TARGET_XFER_E_UNAVAILABLE instead of -1.
    (target_xfer_partial): Make extern.
    * target.h (enum target_xfer_error): New enum.
    (target_xfer_error_to_string): Declare function.
    (target_xfer_partial): Declare function.
    (struct target_ops) <xfer_partial>: Adjust describing comment.
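The idea in the commit above, returning a finer-grained error from the transfer layer and calling the partial-transfer routine directly so the failing address is known, can be sketched as follows. This is a toy model under stated assumptions, not GDB's implementation: every `toy_` name, the 16-byte simulated memory with an unavailable range, and the message strings are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes for a partial-transfer routine.  Keeping
   the generic error at -1 lets callers that test for -1 keep
   working.  */
enum toy_xfer_error
{
  TOY_XFER_E_IO = -1,           /* Generic I/O error.  */
  TOY_XFER_E_UNAVAILABLE = -2   /* Requested bytes are unavailable.  */
};

/* Map an error code to a message (strings are illustrative).  */
static const char *
toy_xfer_error_to_string (int err)
{
  switch (err)
    {
    case TOY_XFER_E_IO:
      return "Input/output error";
    case TOY_XFER_E_UNAVAILABLE:
      return "Memory unavailable";
    }
  return "Unknown error";
}

/* Simulated target: 16 bytes of memory, of which [8, 12) is known to
   be unavailable.  Returns the number of bytes transferred, which may
   be fewer than LEN, or a negative toy_xfer_error.  */
static long
toy_xfer_partial (unsigned char *buf, unsigned long addr, size_t len)
{
  static const unsigned char memory[16] =
    { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
  unsigned long limit;

  if (len == 0 || addr >= sizeof memory)
    return TOY_XFER_E_IO;
  if (addr >= 8 && addr < 12)
    return TOY_XFER_E_UNAVAILABLE;

  /* Transfer up to the next boundary only; the caller retries.  */
  limit = addr < 8 ? 8 : sizeof memory;
  if (len > limit - addr)
    len = limit - addr;
  memcpy (buf, memory + addr, len);
  return (long) len;
}

/* Read LEN bytes at ADDR.  Returns 0 on success.  On failure, returns
   the error code and stores the first address that could not be read
   in *FAIL_ADDR.  */
static int
toy_read_memory (unsigned long addr, unsigned char *buf, size_t len,
                 unsigned long *fail_addr)
{
  size_t done = 0;

  assert (buf != NULL && fail_addr != NULL);
  while (done < len)
    {
      long n = toy_xfer_partial (buf + done, addr + done, len - done);

      if (n < 0)
        {
          *fail_addr = addr + done;
          return (int) n;
        }
      done += (size_t) n;
    }
  return 0;
}
```

Because the loop sees each partial transfer, a read that starts in readable memory and runs into the unavailable range reports the address where the failure actually happened rather than the start of the request.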
monaka pushed a commit that referenced this issue on May 15, 2014

… "break", "list")

"info threads" changes the default source for "break" and "list", to whatever the location of the first/bottom thread in the thread list is...

    (gdb) b start
    (gdb) c
    ...
    (gdb) list
    *lists "start"*
    (gdb) b 23
    Breakpoint 3 at 0x400614: file test.c, line 23.
    (gdb) info threads
      Id   Target Id         Frame
    * 2    Thread 0x7ffff7fcb700 (LWP 1760) "test" start (arg=0x0) at test.c:23
      1    Thread 0x7ffff7fcc740 (LWP 1748) "test" 0x000000323dc08e60 in pthread_join (threadid=140737353922304, thread_return=0x0) at pthread_join.c:93
    (gdb) b 23
    Breakpoint 4 at 0x323dc08d90: file pthread_join.c, line 23.
                                       ^^^^^^^^^^^^^^^
    (gdb) list
    93        lll_wait_tid (pd->tid);
    94
    95
    96        /* Restore cancellation mode.  */
    97        CANCEL_RESET (oldtype);
    98
    99        /* Remove the handler.  */
    100       pthread_cleanup_pop (0);
    101
    102

The issue is that print_stack_frame always sets the current sal to the frame's sal. print_frame_info (which print_stack_frame calls to do most of the work) also sets the last displayed sal, but only if print_what isn't LOCATION. Now the call in question, from within thread.c:print_thread_info, does pass in LOCATION as print_what, but print_stack_frame doesn't have the same check print_frame_info has. We could consider adding it, but setting these globals depending on print_what isn't very clean, IMO. What we have is two logically distinct operations mixed in the same function(s):

    #1 - print frame, in the format specified by {print_what, print_level and print_args}.

    #2 - We're displaying a frame to the user, and I want the default sal to point here, because the program stopped here, or the user did some context-changing command (up, down, etc.).

So I added a new parameter to print_stack_frame & friends for point #2, and went through all calls in the tree adjusting as necessary.

Tested on x86_64 Fedora 17.

gdb/
2013-09-17  Pedro Alves  <[email protected]>

    PR gdb/15911
    * ada-tasks.c (task_command_1): Adjust call to print_stack_frame.
    * bsd-kvm.c (bsd_kvm_open, bsd_kvm_proc_cmd, bsd_kvm_pcb_cmd):
    * corelow.c (core_open):
    * frame.h (print_stack_frame, print_frame_info): New 'set_current_sal' parameter.
    * infcmd.c (finish_command, kill_command): Adjust call to print_stack_frame.
    * inferior.c (inferior_command): Likewise.
    * infrun.c (normal_stop): Likewise.
    * linux-fork.c (linux_fork_context): Likewise.
    * record-full.c (record_full_goto_entry, record_full_restore): Likewise.
    * remote-mips.c (common_open): Likewise.
    * stack.c (print_stack_frame): New 'set_current_sal' parameter. Use it.
    (print_frame_info): New 'set_current_sal' parameter. Set the last displayed sal depending on the new parameter instead of looking at print_what.
    (backtrace_command_1, select_and_print_frame, frame_command)
    (current_frame_command, up_command, down_command): Adjust call to print_stack_frame.
    * thread.c (print_thread_info, restore_selected_frame)
    (do_captured_thread_select): Adjust call to print_stack_frame.
    * tracepoint.c (tfind_1): Likewise.
    * mi/mi-cmd-stack.c (mi_cmd_stack_list_frames)
    (mi_cmd_stack_info_frame): Likewise.
    * mi/mi-interp.c (mi_on_normal_stop): Likewise.
    * mi/mi-main.c (mi_cmd_exec_return, mi_cmd_trace_find): Likewise.

gdb/testsuite/

    * gdb.threads/info-threads-cur-sal-2.c: New file.
    * gdb.threads/info-threads-cur-sal.c: New file.
    * gdb.threads/info-threads-cur-sal.exp: New file.
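The separation of the two operations described in the commit above (printing a frame versus moving the default source location) can be sketched with a toy model. This is an illustration only: the `toy_` types, functions and the global are hypothetical, and only the `set_current_sal` parameter name is taken from the ChangeLog.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical "symtab and line" record and frame.  */
struct toy_sal
{
  const char *file;
  int line;
};

struct toy_frame
{
  const char *func;
  struct toy_sal sal;
};

/* Default location used by commands such as "list" and "break".  */
static struct toy_sal toy_current_sal;

/* Print FRAME.  The default location is only updated when the caller
   explicitly asks for it through SET_CURRENT_SAL, independently of
   the output format.  */
static void
toy_print_stack_frame (const struct toy_frame *frame, int set_current_sal)
{
  assert (frame != NULL);
  printf ("%s () at %s:%d\n", frame->func, frame->sal.file,
          frame->sal.line);
  if (set_current_sal)
    toy_current_sal = frame->sal;
}

/* Listing threads only displays frames; it must not move the default
   location.  */
static void
toy_info_threads (const struct toy_frame *frames, int count)
{
  int i;

  for (i = 0; i < count; i++)
    toy_print_stack_frame (&frames[i], 0);
}

/* A stop or a context-changing command ("up", "down") does move the
   default location.  */
static void
toy_select_frame (const struct toy_frame *frame)
{
  toy_print_stack_frame (frame, 1);
}
```

With this split, calling `toy_select_frame` on the `start` frame and then `toy_info_threads` over all threads leaves `toy_current_sal` pointing at `test.c:23`, which is the behavior the transcript above expects from "b 23".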
monaka pushed a commit that referenced this issue on May 15, 2014

Hi,
This FIXME caught my eye when I was about to modify something here:

    /* Name is allocated by name_of_child.  */
    /* FIXME: xstrdup should not be here.  */

This FIXME was introduced in the python pretty-printer patches.

    Python pretty-printing [6/6]
    https://sourceware.org/ml/gdb-patches/2009-05/msg00467.html

create_child_with_value is called in two paths,

    1. varobj_list_children -> create_child -> create_child_with_value,
    2. update_dynamic_varobj_children -> install_dynamic_child -> varobj_add_child -> create_child_with_value

In path #1, 'name' is allocated by name_of_child, as the original comment said, so we don't have to duplicate NAME in create_child_with_value. In path #2, 'name' is obtained from PyArg_ParseTuple, and we have to duplicate NAME.

This patch removes the call to xstrdup in create_child_with_value and calls xstrdup in update_dynamic_varobj_children (path #2).

gdb:
2013-10-04  Yao Qi  <[email protected]>

    * varobj.c (create_child_with_value): Remove 'const' from the type of parameter 'name'.
    (varobj_add_child): Likewise.
    (install_dynamic_child): Remove 'const' from the type of parameter 'name'.
    (varobj_add_child): Likewise.
    (create_child_with_value): Likewise. Update comments. Don't duplicate 'name'.
    (update_dynamic_varobj_children): Duplicate 'name' and pass it to install_dynamic_child.
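The ownership rule behind this patch (the constructor takes ownership of the name, so only the caller holding a borrowed string duplicates it) can be sketched as below. This is a minimal illustration with hypothetical `toy_` names, using plain `malloc` rather than GDB's allocation helpers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct toy_child
{
  char *name;   /* Owned by the child, released in toy_free_child.  */
};

/* Create a child.  Takes ownership of NAME, which must be
   heap-allocated; no copy is made here.  */
static struct toy_child *
toy_create_child (char *name)
{
  struct toy_child *child = malloc (sizeof *child);

  assert (child != NULL);
  child->name = name;
  return child;
}

static void
toy_free_child (struct toy_child *child)
{
  free (child->name);
  free (child);
}

/* Path 1: the name is freshly allocated here, so it is handed over
   directly.  */
static struct toy_child *
toy_child_from_index (int index)
{
  char *name = malloc (32);

  assert (name != NULL);
  snprintf (name, 32, "%d", index);
  return toy_create_child (name);
}

/* Path 2: the name is borrowed (someone else owns the buffer), so it
   is duplicated before being handed over.  */
static struct toy_child *
toy_child_from_borrowed_name (const char *borrowed)
{
  size_t len = strlen (borrowed) + 1;
  char *copy = malloc (len);

  assert (copy != NULL);
  memcpy (copy, borrowed, len);
  return toy_create_child (copy);
}
```

Putting the duplication in the path that holds the borrowed string avoids the extra copy (and the leak of the original allocation) that would occur in path 1 if the constructor always duplicated.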
monaka pushed a commit that referenced this issue on May 15, 2014

The UNWIND_SAME_ID check is done between THIS_FRAME and the next frame when we go try to unwind the previous frame. But at this point, it's already too late -- we ended up with two frames with the same ID in the frame chain. Each frame having its own ID is an invariant assumed throughout GDB. This patch applies the UNWIND_SAME_ID detection earlier, right after the previous frame is unwound, discarding the dup frame if a cycle is detected.

The patch includes a new test that fails before the change. Before the patch, the test causes an infinite loop in GDB, after the patch, the UNWIND_SAME_ID logic kicks in and makes the backtrace stop with:

    Backtrace stopped: previous frame identical to this frame (corrupt stack?)

The test uses dwarf CFI to emulate a corrupted stack with a cycle. It has a function with registers marked DW_CFA_same_value (most importantly RSP/RIP), so that GDB computes the same ID for that frame and its caller. IOW, something like this:

    #0 - frame_id_1
    #1 - frame_id_2
    #2 - frame_id_3
    #3 - frame_id_4
    #4 - frame_id_4  <<<< outermost (UNWIND_SAME_ID).

(The test's code is just a copy of dw2-reg-undefined.S / dw2-reg-undefined.c, adjusted to use DW_CFA_same_value instead of DW_CFA_undefined, and to mark a different set of registers.)

The infinite loop is here, in value_fetch_lazy:

    while (VALUE_LVAL (new_val) == lval_register && value_lazy (new_val))
      {
        frame = frame_find_by_id (VALUE_FRAME_ID (new_val));
        ...
        new_val = get_frame_register_value (frame, regnum);
      }

get_frame_register_value can return a lazy register value pointing to the next frame. This means that the register wasn't clobbered by FRAME; the debugger should therefore retrieve its value from the next frame.

To be clear, get_frame_register_value unwinds the value in question from the next frame:

    struct value *
    get_frame_register_value (struct frame_info *frame, int regnum)
    {
      return frame_unwind_register_value (frame->next, regnum);
                                          ^^^^^^^^^^^
    }

In other words, if we get a lazy lval_register, it should have the frame ID of the _next_ frame, never of FRAME.

At this point in value_fetch_lazy, the whole relevant chunk of the stack up to frame #4 has already been unwound. The loop always "unlazies" lval_registers in the "next/innermost" direction, not in the "prev/unwind further/outermost" direction.

So say we're looking at frame #4. get_frame_register_value in frame #4 can return a lazy register value of frame #3. So the next iteration, frame_find_by_id tries to read the register from frame #3. But, since frame #4 happens to have same id as frame #3, frame_find_by_id returns frame #4 instead. Rinse, repeat, and we have an infinite loop.

This is an old latent problem, exposed by the recent addition of the frame stash. Before we had a stash, frame_find_by_id(frame_id_4) would walk over all frames starting at the current frame, and would always find #3 first. The stash happens to return #4 instead:

    struct frame_info *
    frame_find_by_id (struct frame_id id)
    {
      struct frame_info *frame, *prev_frame;

      ...

      /* Try using the frame stash first.  Finding it there removes the need
         to perform the search by looping over all frames, which can be very
         CPU-intensive if the number of frames is very high (the loop is O(n)
         and get_prev_frame performs a series of checks that are relatively
         expensive).  This optimization is particularly useful when this function
         is called from another function (such as value_fetch_lazy, case
         VALUE_LVAL (val) == lval_register) which already loops over all frames,
         making the overall behavior O(n^2).  */
      frame = frame_stash_find (id);
      if (frame)
        return frame;

      for (frame = get_current_frame (); ; frame = prev_frame)
        {

gdb/
2013-11-22  Pedro Alves  <[email protected]>

    PR 16155
    * frame.c (get_prev_frame_1): Do the UNWIND_SAME_ID check between this frame and the new previous frame, not between this frame and the next frame.

gdb/testsuite/
2013-11-22  Pedro Alves  <[email protected]>

    PR 16155
    * gdb.dwarf2/dw2-dup-frame.S: New file.
    * gdb.dwarf2/dw2-dup-frame.c: New file.
    * gdb.dwarf2/dw2-dup-frame.exp: New file.
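The earlier duplicate check described in this commit can be sketched with a toy frame chain. This is a simplified model, not GDB's frame.c: frame IDs are plain integers, the "unwinder" just reads the next ID from an array, and all `toy_`/`TOY_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FRAMES 16

enum toy_stop_reason
{
  TOY_UNWIND_NO_REASON,
  TOY_UNWIND_OUTERMOST,
  TOY_UNWIND_SAME_ID
};

struct toy_frame
{
  int id;
  enum toy_stop_reason stop_reason;
};

struct toy_chain
{
  struct toy_frame frames[TOY_MAX_FRAMES];  /* Innermost frame first.  */
  size_t count;                             /* Frames unwound so far.  */
  const int *raw_ids;    /* IDs the simulated unwinder computes.  */
  size_t raw_count;
};

/* Unwind one more frame from the outermost frame known so far.
   Returns the new frame, or NULL if unwinding stops.  The duplicate
   check runs right after the candidate's ID is computed, so a frame
   repeating its neighbour's ID never enters the chain.  */
static struct toy_frame *
toy_get_prev_frame (struct toy_chain *chain)
{
  struct toy_frame *this_frame;
  int prev_id;

  assert (chain->count > 0);
  this_frame = &chain->frames[chain->count - 1];
  if (this_frame->stop_reason != TOY_UNWIND_NO_REASON)
    return NULL;
  if (chain->count >= chain->raw_count || chain->count >= TOY_MAX_FRAMES)
    {
      this_frame->stop_reason = TOY_UNWIND_OUTERMOST;
      return NULL;
    }

  prev_id = chain->raw_ids[chain->count];
  if (prev_id == this_frame->id)
    {
      /* Discard the duplicate and record why unwinding stopped.  */
      this_frame->stop_reason = TOY_UNWIND_SAME_ID;
      return NULL;
    }

  chain->frames[chain->count].id = prev_id;
  chain->frames[chain->count].stop_reason = TOY_UNWIND_NO_REASON;
  return &chain->frames[chain->count++];
}

/* Unwind the whole stack and return the number of frames kept.  */
static size_t
toy_backtrace (struct toy_chain *chain, const int *raw_ids,
               size_t raw_count)
{
  assert (raw_count > 0);
  chain->raw_ids = raw_ids;
  chain->raw_count = raw_count;
  chain->frames[0].id = raw_ids[0];
  chain->frames[0].stop_reason = TOY_UNWIND_NO_REASON;
  chain->count = 1;
  while (toy_get_prev_frame (chain) != NULL)
    ;
  return chain->count;
}
```

For the ID sequence 1, 2, 3, 4, 4 from the commit message, the chain ends at the first frame with ID 4, whose stop reason becomes `TOY_UNWIND_SAME_ID`. Since every ID in the chain is then unique among neighbours, a lookup by ID cannot return the wrong one of two identical frames. Note that the sketch, like the check described above, only compares a frame with its immediate neighbour.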
monaka pushed a commit that referenced this issue on May 15, 2014

…sniffer (move dwarf2_tailcall_sniffer_first elsewhere).

Two rationales, same patch.

TL;DR 1: dwarf2_frame_cache recursion is evil. dwarf2_frame_cache calls dwarf2_tailcall_sniffer_first which then recurses into dwarf2_frame_cache.

TL;DR 2: An unwinder trying to unwind is evil. dwarf2_frame_sniffer calls dwarf2_frame_cache which calls dwarf2_tailcall_sniffer_first which then tries to unwind the PC of the previous frame.

Avoid all that by deferring dwarf2_tailcall_sniffer_first until it's really necessary.

Rationale 1
===========

A frame sniffer should not try to unwind, because that bypasses all the validation checks done by get_prev_frame. The UNWIND_SAME_ID scenario is one such case where GDB is currently broken because (in part) of this (the next patch adds a test that would fail without this).

GDB goes into an infinite loop in value_fetch_lazy, here:

    while (VALUE_LVAL (new_val) == lval_register && value_lazy (new_val))
      {
        frame = frame_find_by_id (VALUE_FRAME_ID (new_val));
        ...
        new_val = get_frame_register_value (frame, regnum);
      }

    (top-gdb) bt
    #0  value_fetch_lazy (val=0x11516d0) at ../../src/gdb/value.c:3510
    #1  0x0000000000584bd8 in value_optimized_out (value=0x11516d0) at ../../src/gdb/value.c:1096
    #2  0x00000000006fe7a1 in frame_register_unwind (frame=0x1492600, regnum=16, optimizedp=0x7fffffffcdec, unavailablep=0x7fffffffcde8, lvalp=0x7fffffffcdd8, addrp=0x7fffffffcde0, realnump=0x7fffffffcddc, bufferp=0x7fffffffce10 "@\316\377\377\377\177") at ../../src/gdb/frame.c:940
    #3  0x00000000006fea3a in frame_unwind_register (frame=0x1492600, regnum=16, buf=0x7fffffffce10 "@\316\377\377\377\177") at ../../src/gdb/frame.c:990
    #4  0x0000000000473b9b in i386_unwind_pc (gdbarch=0xf54660, next_frame=0x1492600) at ../../src/gdb/i386-tdep.c:1771
    #5  0x0000000000601dfa in gdbarch_unwind_pc (gdbarch=0xf54660, next_frame=0x1492600) at ../../src/gdb/gdbarch.c:2870
    #6  0x0000000000693db5 in dwarf2_tailcall_sniffer_first (this_frame=0x1492600, tailcall_cachep=0x14926f0, entry_cfa_sp_offsetp=0x7fffffffcf00) at ../../src/gdb/dwarf2-frame-tailcall.c:389
    #7  0x0000000000690928 in dwarf2_frame_cache (this_frame=0x1492600, this_cache=0x1492618) at ../../src/gdb/dwarf2-frame.c:1245
    #8  0x0000000000690f46 in dwarf2_frame_sniffer (self=0x8e4980, this_frame=0x1492600, this_cache=0x1492618) at ../../src/gdb/dwarf2-frame.c:1423
    #9  0x000000000070203b in frame_unwind_find_by_frame (this_frame=0x1492600, this_cache=0x1492618) at ../../src/gdb/frame-unwind.c:112
    #10 0x00000000006fd681 in get_frame_id (fi=0x1492600) at ../../src/gdb/frame.c:408
    #11 0x00000000007006c2 in get_prev_frame_1 (this_frame=0xdc1860) at ../../src/gdb/frame.c:1826
    #12 0x0000000000700b7a in get_prev_frame (this_frame=0xdc1860) at ../../src/gdb/frame.c:2056
    #13 0x0000000000514588 in frame_info_to_frame_object (frame=0xdc1860) at ../../src/gdb/python/py-frame.c:322
    #14 0x000000000051784c in bootstrap_python_frame_filters (frame=0xdc1860, frame_low=0, frame_high=-1) at ../../src/gdb/python/py-framefilter.c:1396
    #15 0x0000000000517a6f in apply_frame_filter (frame=0xdc1860, flags=7, args_type=CLI_SCALAR_VALUES, out=0xed7a90, frame_low=0, frame_high=-1) at ../../src/gdb/python/py-framefilter.c:1492
    #16 0x00000000005e77b0 in backtrace_command_1 (count_exp=0x0, show_locals=0, no_filters=0, from_tty=1) at ../../src/gdb/stack.c:1777
    #17 0x00000000005e7c0f in backtrace_command (arg=0x0, from_tty=1) at ../../src/gdb/stack.c:1891
    #18 0x00000000004e37a7 in do_cfunc (c=0xda4fa0, args=0x0, from_tty=1) at ../../src/gdb/cli/cli-decode.c:107
    #19 0x00000000004e683c in cmd_func (cmd=0xda4fa0, args=0x0, from_tty=1) at ../../src/gdb/cli/cli-decode.c:1882
    #20 0x00000000006f35ed in execute_command (p=0xcc66c2 "", from_tty=1) at ../../src/gdb/top.c:468
    #21 0x00000000005f8853 in command_handler (command=0xcc66c0 "bt") at ../../src/gdb/event-top.c:435
    #22 0x00000000005f8e12 in command_line_handler (rl=0xfe05f0 "@") at ../../src/gdb/event-top.c:632
    #23 0x000000000074d2c6 in rl_callback_read_char () at ../../src/readline/callback.c:220
    #24 0x00000000005f8375 in rl_callback_read_char_wrapper (client_data=0x0) at ../../src/gdb/event-top.c:164
    #25 0x00000000005f876a in stdin_event_handler (error=0, client_data=0x0) at ../../src/gdb/event-top.c:375
    #26 0x00000000005f72fa in handle_file_event (data=...) at ../../src/gdb/event-loop.c:768
    #27 0x00000000005f67a3 in process_event () at ../../src/gdb/event-loop.c:342
    #28 0x00000000005f686a in gdb_do_one_event () at ../../src/gdb/event-loop.c:406
    #29 0x00000000005f68bb in start_event_loop () at ../../src/gdb/event-loop.c:431
    #30 0x00000000005f83a7 in cli_command_loop (data=0x0) at ../../src/gdb/event-top.c:179
    #31 0x00000000005eeed3 in current_interp_command_loop () at ../../src/gdb/interps.c:327
    #32 0x00000000005ef8ff in captured_command_loop (data=0x0) at ../../src/gdb/main.c:267
    #33 0x00000000005ed2f6 in catch_errors (func=0x5ef8e4 <captured_command_loop>, func_args=0x0, errstring=0x8b6554 "", mask=RETURN_MASK_ALL) at ../../src/gdb/exceptions.c:524
    #34 0x00000000005f0d21 in captured_main (data=0x7fffffffd9e0) at ../../src/gdb/main.c:1067
    #35 0x00000000005ed2f6 in catch_errors (func=0x5efb9b <captured_main>, func_args=0x7fffffffd9e0, errstring=0x8b6554 "", mask=RETURN_MASK_ALL) at ../../src/gdb/exceptions.c:524
    #36 0x00000000005f0d57 in gdb_main (args=0x7fffffffd9e0) at ../../src/gdb/main.c:1076
    #37 0x000000000045bb6a in main (argc=4, argv=0x7fffffffdae8) at ../../src/gdb/gdb.c:34
    (top-gdb)

GDB is trying to unwind the PC register of the previous frame (frame #5 above), starting from the frame being sniffed (the THIS frame). But the THIS frame's unwinder says the PC of the previous frame is actually the same as the previous frame's next frame (which is the same frame we started with, the THIS frame), therefore it returns an lval_register lazy value with frame set to THIS frame. And so the value_fetch_lazy loop never ends.

Rationale 2
===========

As an experiment, I tried making dwarf2-frame.c:read_addr_from_reg use address_from_register. That caused a bunch of regressions, but it actually took me a long while to figure out what was going on. Turns out dwarf2-frame.c:read_addr_from_reg is called while computing the frame's CFA, from within dwarf2_frame_cache. address_from_register wants to create a register with frame_id set to the frame being constructed. To create the frame id, we again call dwarf2_frame_cache, which given:

    static struct dwarf2_frame_cache *
    dwarf2_frame_cache (struct frame_info *this_frame, void **this_cache)
    {
      ...
      if (*this_cache)
        return *this_cache;

returns an incomplete object to the caller:

    static void
    dwarf2_frame_this_id (struct frame_info *this_frame, void **this_cache,
                          struct frame_id *this_id)
    {
      struct dwarf2_frame_cache *cache = dwarf2_frame_cache (this_frame, this_cache);
      ...
      (*this_id) = frame_id_build (cache->cfa, get_frame_func (this_frame));
    }

As cache->cfa is still 0 (we were trying to compute it!), and get_frame_id recalls this id from here on, we end up with a broken frame id recorded for this frame. Later, when inspecting locals, the dwarf machinery needs to know the selected frame's base, which calls get_frame_base:

    CORE_ADDR
    get_frame_base (struct frame_info *fi)
    {
      return get_frame_id (fi).stack_addr;
    }

which as seen above then returns 0 ...

So I gave up using address_from_register.

But, the pain of investigating this made me want to have GDB itself assert that recursion never happens here. So I wrote a patch to do that. But, it triggers on current mainline, because dwarf2_tailcall_sniffer_first, called from dwarf2_frame_cache, unwinds the this_frame.

A sniffer shouldn't be trying to unwind, exactly because of this sort of tricky issue. The patch defers calling dwarf2_tailcall_sniffer_first until it's really necessary, in dwarf2_frame_prev_register (thus actually outside the sniffer path). As this makes the call to dwarf2_frame_sniffer in dwarf2_frame_cache unnecessary again, the patch removes that too.

Tested on x86_64 Fedora 17.

gdb/
2013-11-22  Pedro Alves  <[email protected]>

    PR 16155
    * dwarf2-frame.c (struct dwarf2_frame_cache) <checked_tailcall_bottom, entry_cfa_sp_offset, entry_cfa_sp_offset_p>: New fields.
    (dwarf2_frame_cache): Adjust to use the new cache fields instead of locals. Don't call dwarf2_tailcall_sniffer_first here.
    (dwarf2_frame_prev_register): Call it here, but only once.
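The two ideas in this commit, keeping the cache constructor free of unwinding (with an assertion against re-entry) and running the deferred step at most once from the register-unwinding path, can be sketched as follows. This is a toy model with hypothetical `toy_` names and fields; the "tail call check" is a placeholder computation, not what GDB's dwarf2_tailcall_sniffer_first does.

```c
#include <assert.h>

struct toy_frame_cache
{
  int initialized;        /* Set only once the cache is complete.  */
  int in_progress;        /* Guards against recursive construction.  */
  long cfa;
  int checked_tailcall;   /* Nonzero once the deferred step has run.  */
  int tailcall_result;
};

/* Counts how often the deferred step runs (for demonstration).  */
static int toy_tailcall_calls;

/* The deferred step.  In this sketch it stands in for work that needs
   a complete cache, so it must not run during construction.  */
static int
toy_tailcall_check (const struct toy_frame_cache *cache)
{
  toy_tailcall_calls++;
  return cache->cfa != 0;
}

/* Build the cache on first use.  It never calls the deferred step,
   and asserts that it is not re-entered while half-built, so callers
   can never observe an incomplete object.  */
static struct toy_frame_cache *
toy_frame_cache_get (struct toy_frame_cache *cache, long cfa)
{
  if (cache->initialized)
    return cache;

  assert (!cache->in_progress);
  cache->in_progress = 1;
  cache->cfa = cfa;
  cache->checked_tailcall = 0;
  cache->in_progress = 0;
  cache->initialized = 1;
  return cache;
}

/* Called when a previous frame's register is actually needed.  Runs
   the deferred step here, but only once per cache.  */
static int
toy_prev_register (struct toy_frame_cache *cache, long cfa)
{
  toy_frame_cache_get (cache, cfa);
  if (!cache->checked_tailcall)
    {
      cache->tailcall_result = toy_tailcall_check (cache);
      cache->checked_tailcall = 1;
    }
  return cache->tailcall_result;
}
```

Starting from a zero-initialized `struct toy_frame_cache`, two calls to `toy_prev_register` run the deferred step a single time, and the step always sees a fully built cache with a nonzero `cfa`.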
monaka
pushed a commit
that referenced
this issue
May 15, 2014
…sniffer (move dwarf2_tailcall_sniffer_first elsewhere). Two rationales, same patch. TL;DR 1: dwarf2_frame_cache recursion is evil. dwarf2_frame_cache calls dwarf2_tailcall_sniffer_first which then recurses into dwarf2_frame_cache. TL;DR 2: An unwinder trying to unwind is evil. dwarf2_frame_sniffer calls dwarf2_frame_cache which calls dwarf2_tailcall_sniffer_first which then tries to unwind the PC of the previous frame. Avoid all that by deferring dwarf2_tailcall_sniffer_first until it's really necessary. Rationale 1 =========== A frame sniffer should not try to unwind, because that bypasses all the validation checks done by get_prev_frame. The UNWIND_SAME_ID scenario is one such case where GDB is currently broken because (in part) of this (the next patch adds a test that would fail without this). GDB goes into an infinite loop in value_fetch_lazy, here: while (VALUE_LVAL (new_val) == lval_register && value_lazy (new_val)) { frame = frame_find_by_id (VALUE_FRAME_ID (new_val)); ... new_val = get_frame_register_value (frame, regnum); } (top-gdb) bt #0 value_fetch_lazy (val=0x11516d0) at ../../src/gdb/value.c:3510 #1 0x0000000000584bd8 in value_optimized_out (value=0x11516d0) at ../../src/gdb/value.c:1096 #2 0x00000000006fe7a1 in frame_register_unwind (frame=0x1492600, regnum=16, optimizedp=0x7fffffffcdec, unavailablep=0x7fffffffcde8, lvalp=0x7fffffffcdd8, addrp= 0x7fffffffcde0, realnump=0x7fffffffcddc, bufferp=0x7fffffffce10 "@\316\377\377\377\177") at ../../src/gdb/frame.c:940 #3 0x00000000006fea3a in frame_unwind_register (frame=0x1492600, regnum=16, buf=0x7fffffffce10 "@\316\377\377\377\177") at ../../src/gdb/frame.c:990 #4 0x0000000000473b9b in i386_unwind_pc (gdbarch=0xf54660, next_frame=0x1492600) at ../../src/gdb/i386-tdep.c:1771 #5 0x0000000000601dfa in gdbarch_unwind_pc (gdbarch=0xf54660, next_frame=0x1492600) at ../../src/gdb/gdbarch.c:2870 #6 0x0000000000693db5 in dwarf2_tailcall_sniffer_first (this_frame=0x1492600, tailcall_cachep=0x14926f0, 
entry_cfa_sp_offsetp=0x7fffffffcf00) at ../../src/gdb/dwarf2-frame-tailcall.c:389 #7 0x0000000000690928 in dwarf2_frame_cache (this_frame=0x1492600, this_cache=0x1492618) at ../../src/gdb/dwarf2-frame.c:1245 #8 0x0000000000690f46 in dwarf2_frame_sniffer (self=0x8e4980, this_frame=0x1492600, this_cache=0x1492618) at ../../src/gdb/dwarf2-frame.c:1423 #9 0x000000000070203b in frame_unwind_find_by_frame (this_frame=0x1492600, this_cache=0x1492618) at ../../src/gdb/frame-unwind.c:112 #10 0x00000000006fd681 in get_frame_id (fi=0x1492600) at ../../src/gdb/frame.c:408 #11 0x00000000007006c2 in get_prev_frame_1 (this_frame=0xdc1860) at ../../src/gdb/frame.c:1826 #12 0x0000000000700b7a in get_prev_frame (this_frame=0xdc1860) at ../../src/gdb/frame.c:2056 #13 0x0000000000514588 in frame_info_to_frame_object (frame=0xdc1860) at ../../src/gdb/python/py-frame.c:322 #14 0x000000000051784c in bootstrap_python_frame_filters (frame=0xdc1860, frame_low=0, frame_high=-1) at ../../src/gdb/python/py-framefilter.c:1396 #15 0x0000000000517a6f in apply_frame_filter (frame=0xdc1860, flags=7, args_type=CLI_SCALAR_VALUES, out=0xed7a90, frame_low=0, frame_high=-1) at ../../src/gdb/python/py-framefilter.c:1492 #16 0x00000000005e77b0 in backtrace_command_1 (count_exp=0x0, show_locals=0, no_filters=0, from_tty=1) at ../../src/gdb/stack.c:1777 #17 0x00000000005e7c0f in backtrace_command (arg=0x0, from_tty=1) at ../../src/gdb/stack.c:1891 #18 0x00000000004e37a7 in do_cfunc (c=0xda4fa0, args=0x0, from_tty=1) at ../../src/gdb/cli/cli-decode.c:107 #19 0x00000000004e683c in cmd_func (cmd=0xda4fa0, args=0x0, from_tty=1) at ../../src/gdb/cli/cli-decode.c:1882 #20 0x00000000006f35ed in execute_command (p=0xcc66c2 "", from_tty=1) at ../../src/gdb/top.c:468 #21 0x00000000005f8853 in command_handler (command=0xcc66c0 "bt") at ../../src/gdb/event-top.c:435 #22 0x00000000005f8e12 in command_line_handler (rl=0xfe05f0 "@") at ../../src/gdb/event-top.c:632 #23 0x000000000074d2c6 in rl_callback_read_char () at 
../../src/readline/callback.c:220 #24 0x00000000005f8375 in rl_callback_read_char_wrapper (client_data=0x0) at ../../src/gdb/event-top.c:164 #25 0x00000000005f876a in stdin_event_handler (error=0, client_data=0x0) at ../../src/gdb/event-top.c:375 #26 0x00000000005f72fa in handle_file_event (data=...) at ../../src/gdb/event-loop.c:768 #27 0x00000000005f67a3 in process_event () at ../../src/gdb/event-loop.c:342 #28 0x00000000005f686a in gdb_do_one_event () at ../../src/gdb/event-loop.c:406 #29 0x00000000005f68bb in start_event_loop () at ../../src/gdb/event-loop.c:431 #30 0x00000000005f83a7 in cli_command_loop (data=0x0) at ../../src/gdb/event-top.c:179 #31 0x00000000005eeed3 in current_interp_command_loop () at ../../src/gdb/interps.c:327 #32 0x00000000005ef8ff in captured_command_loop (data=0x0) at ../../src/gdb/main.c:267 #33 0x00000000005ed2f6 in catch_errors (func=0x5ef8e4 <captured_command_loop>, func_args=0x0, errstring=0x8b6554 "", mask=RETURN_MASK_ALL) at ../../src/gdb/exceptions.c:524 #34 0x00000000005f0d21 in captured_main (data=0x7fffffffd9e0) at ../../src/gdb/main.c:1067 #35 0x00000000005ed2f6 in catch_errors (func=0x5efb9b <captured_main>, func_args=0x7fffffffd9e0, errstring=0x8b6554 "", mask=RETURN_MASK_ALL) at ../../src/gdb/exceptions.c:524 #36 0x00000000005f0d57 in gdb_main (args=0x7fffffffd9e0) at ../../src/gdb/main.c:1076 #37 0x000000000045bb6a in main (argc=4, argv=0x7fffffffdae8) at ../../src/gdb/gdb.c:34 (top-gdb) GDB is trying to unwind the PC register of the previous frame (frame #5 above), starting from the frame being sniffed (the THIS frame). But the THIS frame's unwinder says the PC of the previous frame is actually the same as the previous's frame's next frame (which is the same frame we started with, the THIS frame), therefore it returns an lval_register lazy value with frame set to THIS frame. And so the value_fetch_lazy loop never ends. 
Rationale 2 =========== As an experiment, I tried making dwarf2-frame.c:read_addr_from_reg use address_from_register. That caused a bunch of regressions, but it actually took me a long while to figure out what was going on. Turns out dwarf2-frame.c:read_addr_from_reg is called while computing the frame's CFA, from within dwarf2_frame_cache. address_from_register wants to create a register with frame_id set to the frame being constructed. To create the frame id, we again call dwarf2_frame_cache, which given: static struct dwarf2_frame_cache * dwarf2_frame_cache (struct frame_info *this_frame, void **this_cache) { ... if (*this_cache) return *this_cache; returns an incomplete object to the caller: static void dwarf2_frame_this_id (struct frame_info *this_frame, void **this_cache, struct frame_id *this_id) { struct dwarf2_frame_cache *cache = dwarf2_frame_cache (this_frame, this_cache); ... (*this_id) = frame_id_build (cache->cfa, get_frame_func (this_frame)); } As cache->cfa is still 0 (we were trying to compute it!), and get_frame_id recalls this id from here on, we end up with a broken frame id in recorded for this frame. Later, when inspecting locals, the dwarf machinery needs to know the selected frame's base, which calls get_frame_base: CORE_ADDR get_frame_base (struct frame_info *fi) { return get_frame_id (fi).stack_addr; } which as seen above then returns 0 ... So I gave up using address_from_register. But, the pain of investigating this made me want to have GDB itself assert that recursion never happens here. So I wrote a patch to do that. But, it triggers on current mainline, because dwarf2_tailcall_sniffer_first, called from dwarf2_frame_cache, unwinds the this_frame. A sniffer shouldn't be trying to unwind, exactly because of this sort of tricky issue. The patch defers calling dwarf2_tailcall_sniffer_first until it's really necessary, in dwarf2_frame_prev_register (thus actually outside the sniffer path). 
As this makes the call to dwarf2_frame_sniffer in dwarf2_frame_cache unnecessary again, the patch removes that too.

Tested on x86_64 Fedora 17.

gdb/
2013-11-22  Pedro Alves  <[email protected]>

    PR 16155
    * dwarf2-frame.c (struct dwarf2_frame_cache)
    <checked_tailcall_bottom, entry_cfa_sp_offset,
    entry_cfa_sp_offset_p>: New fields.
    (dwarf2_frame_cache): Adjust to use the new cache fields instead
    of locals.  Don't call dwarf2_tailcall_sniffer_first here.
    (dwarf2_frame_prev_register): Call it here, but only once.
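The recursion hazard described in this commit message (a cache getter that is re-entered while it is still filling in the cache, and hands out a half-built object) can be illustrated with an in-progress flag. This is a minimal sketch, not GDB's code: `struct frame_cache_sketch`, `cache_get`, and `compute_cfa` are hypothetical names, and where GDB would assert, this version returns NULL on re-entry so the behavior is easy to observe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-frame unwind cache.  */
struct frame_cache_sketch
{
  bool in_progress;   /* Set while the cache is being filled in.  */
  bool valid;         /* Set once every field below is final.  */
  unsigned long cfa;  /* Stays 0 until it has been computed.  */
};

/* Hypothetical CFA computation.  A real unwinder would read registers
   here, which is exactly where re-entry could sneak in.  */
static unsigned long
compute_cfa (struct frame_cache_sketch *cache)
{
  (void) cache;
  return 0x7fff0000UL;
}

/* Return the completed cache, or NULL if called again while an outer
   call is still computing it.  Returning the incomplete object instead
   would let callers see cfa == 0.  */
static struct frame_cache_sketch *
cache_get (struct frame_cache_sketch *cache)
{
  if (cache->valid)
    return cache;
  if (cache->in_progress)
    return NULL;

  cache->in_progress = true;
  cache->cfa = compute_cfa (cache);
  cache->in_progress = false;
  cache->valid = true;
  return cache;
}
```

The point of the separate `valid` flag is that "the cache pointer exists" and "the cache is complete" are different states; the bug described above comes from treating the first as the second.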
monaka pushed a commit that referenced this issue May 15, 2014
The UNWIND_SAME_ID check is done between THIS_FRAME and the next frame when we go try to unwind the previous frame. But at this point, it's already too late -- we ended up with two frames with the same ID in the frame chain. Each frame having its own ID is an invariant assumed throughout GDB. This patch applies the UNWIND_SAME_ID detection earlier, right after the previous frame is unwound, discarding the dup frame if a cycle is detected.

The patch includes a new test that fails before the change. Before the patch, the test causes an infinite loop in GDB; after the patch, the UNWIND_SAME_ID logic kicks in and makes the backtrace stop with:

  Backtrace stopped: previous frame identical to this frame (corrupt stack?)

The test uses dwarf CFI to emulate a corrupted stack with a cycle. It has a function with registers marked DW_CFA_same_value (most importantly RSP/RIP), so that GDB computes the same ID for that frame and its caller. IOW, something like this:

  #0 - frame_id_1
  #1 - frame_id_2
  #2 - frame_id_3
  #3 - frame_id_4
  #4 - frame_id_4  <<<< outermost (UNWIND_SAME_ID).

(The test's code is just a copy of dw2-reg-undefined.S / dw2-reg-undefined.c, adjusted to use DW_CFA_same_value instead of DW_CFA_undefined, and to mark a different set of registers.)

The infinite loop is here, in value_fetch_lazy:

  while (VALUE_LVAL (new_val) == lval_register && value_lazy (new_val))
    {
      frame = frame_find_by_id (VALUE_FRAME_ID (new_val));
      ...
      new_val = get_frame_register_value (frame, regnum);
    }

get_frame_register_value can return a lazy register value pointing to the next frame. This means that the register wasn't clobbered by FRAME; the debugger should therefore retrieve its value from the next frame.

To be clear, get_frame_register_value unwinds the value in question from the next frame:

  struct value *
  get_frame_register_value (struct frame_info *frame, int regnum)
  {
    return frame_unwind_register_value (frame->next, regnum);
                                        ^^^^^^^^^^^
  }

In other words, if we get a lazy lval_register, it should have the frame ID of the _next_ frame, never of FRAME.

At this point in value_fetch_lazy, the whole relevant chunk of the stack up to frame #4 has already been unwound. The loop always "unlazies" lval_registers in the "next/innermost" direction, not in the "prev/unwind further/outermost" direction.

So say we're looking at frame #4. get_frame_register_value in frame #4 can return a lazy register value of frame #3. So the next iteration, frame_find_by_id tries to read the register from frame #3. But, since frame #4 happens to have same id as frame #3, frame_find_by_id returns frame #4 instead. Rinse, repeat, and we have an infinite loop.

This is an old latent problem, exposed by the recent addition of the frame stash. Before we had a stash, frame_find_by_id(frame_id_4) would walk over all frames starting at the current frame, and would always find #3 first. The stash happens to return #4 instead:

  struct frame_info *
  frame_find_by_id (struct frame_id id)
  {
    struct frame_info *frame, *prev_frame;

    ...

    /* Try using the frame stash first.  Finding it there removes the need
       to perform the search by looping over all frames, which can be very
       CPU-intensive if the number of frames is very high (the loop is O(n)
       and get_prev_frame performs a series of checks that are relatively
       expensive).  This optimization is particularly useful when this function
       is called from another function (such as value_fetch_lazy, case
       VALUE_LVAL (val) == lval_register) which already loops over all frames,
       making the overall behavior O(n^2).  */
    frame = frame_stash_find (id);
    if (frame)
      return frame;

    for (frame = get_current_frame (); ; frame = prev_frame)
      {

gdb/
2013-11-22  Pedro Alves  <[email protected]>

    PR 16155
    * frame.c (get_prev_frame_1): Do the UNWIND_SAME_ID check between
    this frame and the new previous frame, not between this frame and
    the next frame.

gdb/testsuite/
2013-11-22  Pedro Alves  <[email protected]>

    PR 16155
    * gdb.dwarf2/dw2-dup-frame.S: New file.
    * gdb.dwarf2/dw2-dup-frame.c: New file.
    * gdb.dwarf2/dw2-dup-frame.exp: New file.
monaka pushed a commit that referenced this issue May 15, 2014
Given we already have the frame id stash, which holds the ids of all frames in the chain, detecting corrupted stacks with wide stack cycles with non-consecutive dup frame ids is just as cheap as just detecting cycles in consecutive frames:

  #0 frame_id1
  #1 frame_id2
  #2 frame_id3
  #3 frame_id1
  #4 frame_id2
  #5 frame_id3
  #6 frame_id1
  ... forever ...

We just need to check whether the stash already knows about a given frame id instead of comparing the ids of the previous/this frames.

Tested on x86_64 Fedora 17.

gdb/
2013-11-22  Pedro Alves  <[email protected]>
            Tom Tromey  <[email protected]>

    * frame.c (frame_stash_add): Now returns whether a frame with the
    same ID was already known.
    (compute_frame_id): New function, factored out from get_frame_id.
    (get_frame_id): No longer lazily compute the frame id here.
    (get_prev_frame_if_no_cycle): New function.  Detects wider stack
    cycles.
    (get_prev_frame_1): Use it instead of get_prev_frame_raw directly,
    and checking for stack cycles here.
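The idea of asking the stash "have I seen this id before?" instead of comparing neighbouring frames can be sketched as follows. This is an illustration under stated assumptions, not the code in frame.c: `struct frame_id_sketch`, `stash_add`, and `unwind_until_cycle` are hypothetical names, and the stash here is a small linear array for brevity (with a real hash table, as the commit message assumes, each lookup is cheap regardless of stack depth).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STASH_MAX 64

/* Hypothetical, simplified frame id: stack address plus code address.  */
struct frame_id_sketch
{
  unsigned long stack_addr;
  unsigned long code_addr;
};

/* Hypothetical stash of every frame id accepted so far.  */
struct id_stash
{
  struct frame_id_sketch ids[STASH_MAX];
  size_t count;
};

static bool
frame_id_equal (struct frame_id_sketch a, struct frame_id_sketch b)
{
  return a.stack_addr == b.stack_addr && a.code_addr == b.code_addr;
}

/* Add ID to STASH.  Return false if the id was already known.  For
   simplicity a full stash is also reported as false.  */
static bool
stash_add (struct id_stash *stash, struct frame_id_sketch id)
{
  for (size_t i = 0; i < stash->count; i++)
    if (frame_id_equal (stash->ids[i], id))
      return false;
  if (stash->count == STASH_MAX)
    return false;
  stash->ids[stash->count++] = id;
  return true;
}

/* Walk the N candidate frame ids in unwind order and return how many
   frames are accepted before the first repeated id.  */
static size_t
unwind_until_cycle (const struct frame_id_sketch *ids, size_t n)
{
  struct id_stash stash = { .count = 0 };
  size_t accepted = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (!stash_add (&stash, ids[i]))
        break;  /* Cycle detected: discard this frame and stop.  */
      accepted++;
    }
  return accepted;
}
```

With the three-frame cycle from the diagram above, the walk stops after three frames; with two identical consecutive ids at the end of a chain, it stops just before the duplicate, which is the earlier UNWIND_SAME_ID case.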
monaka pushed a commit that referenced this issue May 15, 2014
Debugging PR 16155 further, I found that the DWARF unwinder found the function in question, but thought it had no registers saved (fs->regs.num_regs == 0).

It seems to me that if a frame does not specify the return address column, or if the return address column is explicitly marked as DWARF2_FRAME_REG_UNSPECIFIED, then we should set the "undefined_retaddr" flag and let the DWARF unwinder gracefully stop.

This patch implements that idea.

With this patch the backtrace works properly:

  (gdb) bt
  #0  0x0000007fb7ed485c in nanosleep () from /lib64/libc.so.6
  #1  0x0000007fb7ed4508 in sleep () from /lib64/libc.so.6
  #2  0x00000000004008bc in thread_function (arg=0x4) at threadapply.c:73
  #3  0x0000007fb7fad950 in start_thread () from /lib64/libpthread.so.0
  #4  0x0000007fb7f0956c in clone () from /lib64/libc.so.6

2013-11-22  Tom Tromey  <[email protected]>

    PR backtrace/16155:
    * dwarf2-frame.c (dwarf2_frame_cache): Set undefined_retaddr if
    the return address column is unspecified.

2013-11-22  Tom Tromey  <[email protected]>

    * gdb.dwarf2/dw2-bad-cfi.c: New file.
    * gdb.dwarf2/dw2-bad-cfi.exp: New file.
    * gdb.dwarf2/dw2-bad-cfi.S: New file.
monaka pushed a commit that referenced this issue May 15, 2014
Remote servers may cut the connection abruptly since they are not required to reply to a 'k' (Kill) packet sent from GDB. This patch addresses any issues arising from such scenario, which leads to a GDB internal error due to an attempt to pop the target more than once. With the patch, this failure is handled gracefully. Here's the GDB backtrace Maciej got running the testsuite against QEMU. Full paths edited out for brevity. #0 0x55573430 in __kernel_vsyscall () #1 0x557a2951 in raise () from /lib32/libc.so.6 #2 0x557a5d82 in abort () from /lib32/libc.so.6 #3 0x0826e2e4 in dump_core () at .../gdb/utils.c:635 #4 0x0826e5b6 in internal_vproblem (problem=0x85200c0, file=0x8416be8 ".../gdb/target.c", line=2861, fmt=0x84174ac "could not find a target to follow mourn inferior", ap=0xffa4796c "\f") at .../gdb/utils.c:804 #5 0x0826e5fb in internal_verror ( file=0x8416be8 ".../gdb/target.c", line=2861, fmt=0x84174ac "could not find a target to follow mourn inferior", ap=0xffa4796c "\f") at .../gdb/utils.c:820 #6 0x0826e633 in internal_error ( file=0x8416be8 ".../gdb/target.c", line=2861, string=0x84174ac "could not find a target to follow mourn inferior") at .../gdb/utils.c:830 #7 0x081b4ad0 in target_mourn_inferior () at .../gdb/target.c:2861 #8 0x08082283 in remote_kill (ops=0x85245e0) at .../gdb/remote.c:7840 #9 0x081b06d1 in target_kill () at .../gdb/target.c:486 #10 0x081b42f6 in dispose_inferior (inf=0xa501c60, args=0x0) at .../gdb/target.c:2570 #11 0x08290cfc in iterate_over_inferiors ( callback=0x81b42af <dispose_inferior>, data=0x0) at .../gdb/inferior.c:396 #12 0x081b435a in target_preopen (from_tty=1) at .../gdb/target.c:2591 #13 0x0807c2c6 in remote_open_1 (name=0xa5538b6 "localhost:1237", from_tty=1, target=0x85245e0, extended_p=0) at .../gdb/remote.c:4292 #14 0x0807b7a8 in remote_open (name=0xa5538b6 "localhost:1237", from_tty=1) at .../gdb/remote.c:3655 #15 0x080a23d4 in do_cfunc (c=0xa464f30, args=0xa5538b6 "localhost:1237", from_tty=1) at 
.../gdb/cli/cli-decode.c:107 #16 0x080a4c3b in cmd_func (cmd=0xa464f30, args=0xa5538b6 "localhost:1237", from_tty=1) at .../gdb/cli/cli-decode.c:1882 #17 0x0826bebf in execute_command (p=0xa5538c3 "7", from_tty=1) at .../gdb/top.c:467 #18 0x08193f2d in command_handler (command=0xa5538a8 "") at .../gdb/event-top.c:435 #19 0x08194463 in command_line_handler ( rl=0xa778198 "target remote localhost:1237") at .../gdb/event-top.c:633 #20 0x082ba92b in rl_callback_read_char () at .../readline/callback.c:220 #21 0x08193adf in rl_callback_read_char_wrapper (client_data=0x0) at .../gdb/event-top.c:164 #22 0x08193e57 in stdin_event_handler (error=0, client_data=0x0) at .../gdb/event-top.c:375 #23 0x08192f29 in handle_file_event (data=...) at .../gdb/event-loop.c:768 #24 0x0819266a in process_event () at .../gdb/event-loop.c:342 #25 0x08192708 in gdb_do_one_event () at .../gdb/event-loop.c:394 #26 0x08192781 in start_event_loop () at .../gdb/event-loop.c:431 #27 0x08193b08 in cli_command_loop (data=0x0) at .../gdb/event-top.c:179 #28 0x0818bc26 in current_interp_command_loop () at .../gdb/interps.c:327 #29 0x0818c4e5 in captured_command_loop (data=0x0) at .../gdb/main.c:267 #30 0x0818a37f in catch_errors (func=0x818c4d0 <captured_command_loop>, func_args=0x0, errstring=0x8402108 "", mask=RETURN_MASK_ALL) at .../gdb/exceptions.c:524 #31 0x0818d736 in captured_main (data=0xffa47f10) at .../gdb/main.c:1067 #32 0x0818a37f in catch_errors (func=0x818c723 <captured_main>, func_args=0xffa47f10, errstring=0x8402108 "", mask=RETURN_MASK_ALL) at .../gdb/exceptions.c:524 #33 0x0818d76c in gdb_main (args=0xffa47f10) at .../gdb/main.c:1076 #34 0x0804dd1b in main (argc=5, argv=0xffa47fd4) at .../gdb/gdb.c:34 The corresponding gdb.log excerpt: (gdb) PASS: gdb.base/bitfields.exp: bitfield uniqueness (u9) cont Continuing. 
Breakpoint 1, break1 () at .../gdb/testsuite/gdb.base/bitfields.c:44 44 } (gdb) PASS: gdb.base/bitfields.exp: continuing to break1 #9 print flags $10 = {uc = 0 '\000', s1 = 0, u1 = 0, s2 = 0, u2 = 0, s3 = 0, u3 = 0, s9 = 0, u9 = 0, sc = 1 '\001'} (gdb) PASS: gdb.base/bitfields.exp: bitfield uniqueness (sc) delete breakpoints Delete all breakpoints? (y or n) y (gdb) info breakpoints No breakpoints or watchpoints. (gdb) delete breakpoints (gdb) info breakpoints No breakpoints or watchpoints. (gdb) break break2 Breakpoint 2 at 0x85f8: file .../gdb/testsuite/gdb.base/bitfields.c, line 48. (gdb) entering gdb_reload target remote localhost:1235 A program is being debugged already. Kill it? (y or n) y Remote connection closed .../gdb/target.c:2861: internal-error: could not find a target to follow mourn inferior A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) ^Ccontinue Please answer y or n. .../gdb/target.c:2861: internal-error: could not find a target to follow mourn inferior A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) Resyncing due to internal error. n .../gdb/target.c:2861: internal-error: could not find a target to follow mourn inferior A problem internal to GDB has been detected, further debugging may prove unreliable. Create a core file of GDB? (y or n) y Command aborted. (gdb) print/x flags $11 = {uc = 0x0, s1 = 0x0, u1 = 0x0, s2 = 0x0, u2 = 0x0, s3 = 0x0, u3 = 0x0, s9 = 0x0, u9 = 0x0, sc = 0x0} (gdb) FAIL: gdb.base/bitfields.exp: bitfield containment #1 cont The program is not being run. (gdb) FAIL: gdb.base/bitfields.exp: continuing to break2 (the program is no longer running) print/x flags $12 = {uc = 0x0, s1 = 0x0, u1 = 0x0, s2 = 0x0, u2 = 0x0, s3 = 0x0, u3 = 0x0, s9 = 0x0, u9 = 0x0, sc = 0x0} (gdb) FAIL: gdb.base/bitfields.exp: bitfield containment #2 delete breakpoints Delete all breakpoints? 
(y or n) y (gdb) info breakpoints No breakpoints or watchpoints. (gdb) delete breakpoints (gdb) info breakpoints No breakpoints or watchpoints. (gdb) break break3 Breakpoint 3 at 0x8604: file .../gdb/testsuite/gdb.base/bitfields.c, line 52. (gdb) entering gdb_reload target remote localhost:1236 Remote debugging using localhost:1236 Reading symbols from .../lib/ld-linux.so.3...done. Loaded symbols for .../lib/ld-linux.so.3 0x41001b80 in _start () from .../lib/ld-linux.so.3 (gdb) continue Continuing. Breakpoint 3, break3 () at .../gdb/testsuite/gdb.base/bitfields.c:52 52 } (gdb) print flags $13 = {uc = 0 '\000', s1 = 0, u1 = 1, s2 = 0, u2 = 3, s3 = 0, u3 = 7, s9 = 0, u9 = 511, sc = 0 '\000'} (gdb) PASS: gdb.base/bitfields.exp: unsigned bitfield ranges gdb/ 2013-12-02 Pedro Alves <[email protected]> Maciej W. Rozycki <[email protected]> * remote.c (putpkt_for_catch_errors): Remove function. (remote_kill): Handle TARGET_CLOSE_ERROR from the kill packet gracefully.
monaka pushed a commit that referenced this issue May 15, 2014
With a simple Ada program where I have 3 functions, one just calling the next, the backtrace is currently broken when GDB is compiled at -O2:

  #0  hello.first () at hello.adb:5
  #1  0x0000000100001475 in hello.second () at hello.adb:10
  Backtrace stopped: previous frame inner to this frame (corrupt stack?)

It turns out that a recent patch deleted the assignment of variable this_id, making it an uninitialized variable:

    * frame-unwind.c (default_frame_unwind_stop_reason): Return
    UNWIND_OUTERMOST if the frame's ID is outer_frame_id.
    * frame.c (get_prev_frame_1): Remove outer_frame_id check.

The hunk in question starts with:

  -  /* Check that this frame is not the outermost.  If it is, don't try
  -     to unwind to the prev frame.  */
  -  this_id = get_frame_id (this_frame);
  -  if (frame_id_eq (this_id, outer_frame_id))

(the code was removed as redundant, but removing the assignment was in fact not intentional).

There is no other code in this function that sets the variable. Instead of re-adding the statement in the lone section where it is actually used, I inlined it, and then got rid of the variable altogether. This way, and until we start needing this frame ID in another location within that function, we don't have to worry about the variable's validity/lifetime.

gdb/ChangeLog:

    * frame.c (get_prev_frame_1): Delete variable "this_id".
    Replace its use by a call to get_frame_id.
monaka pushed a commit that referenced this issue May 15, 2014
Doing "info frame" in the outermost frame, when that was indicated by the next frame saying the unwound PC is undefined/not saved, results in error and incomplete output:

  (gdb) bt
  #0  thread_function0 (arg=0x0) at threads.c:63
  #1  0x00000034cf407d14 in start_thread (arg=0x7ffff7fcb700) at pthread_create.c:309
  #2  0x000000323d4f168d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
  (gdb) frame 2
  #2  0x000000323d4f168d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
  115             call    *%rax
  (gdb) info frame
  Stack level 2, frame at 0x0:
   rip = 0x323d4f168d in clone (../sysdeps/unix/sysv/linux/x86_64/clone.S:115); saved rip Register 16 was not saved
  (gdb)

Not saved register values are treated as optimized out values internally throughout. stack.c:frame_info is handling unavailable values, but not optimized out ones. The patch deletes the frame_unwind_caller_pc_if_available wrapper function and instead lets errors propagate to frame_info (its only user).

As frame_unwind_pc now needs to be able to handle and cache two different error scenarios, the prev_pc.p variable is replaced with an enumeration.

(FWIW, I looked into making gdbarch_unwind_pc or a variant return struct value's instead, but it results in lots of boxing and unboxing for no real gain -- e.g., the mips and arm implementations need to do computation on the unboxed PC value. Might as well throw an error on first attempt to get at invalid contents.)

After the patch, we get:

  (gdb) info frame
  Stack level 2, frame at 0x0:
   rip = 0x323d4f168d in clone (../sysdeps/unix/sysv/linux/x86_64/clone.S:115); saved rip = <not saved>
   Outermost frame: outermost
   caller of frame at 0x7ffff7fcafc0
   source language asm.
   Arglist at 0x7ffff7fcafb8, args:
   Locals at 0x7ffff7fcafb8, Previous frame's sp is 0x7ffff7fcafc8
  (gdb)

A new test is added. It's based off dw2-reg-undefined.exp, and tweaked to mark the return address (rip) of "stop_frame" as undefined.

Tested on x86_64 Fedora 17.

gdb/
2013-12-06  Pedro Alves  <[email protected]>

    * frame.c (enum cached_copy_status): New enum.
    (struct frame_info) <prev_pc.p>: Change type to enum
    cached_copy_status.
    (fprint_frame): Handle not saved and unavailable prev_pc values.
    (frame_unwind_pc_if_available): Delete and merge contents into ...
    (frame_unwind_pc): ... here.  Handle OPTIMIZED_OUT_ERROR.  Adjust
    to use enum cached_copy_status.
    (frame_unwind_caller_pc_if_available): Delete.
    (create_new_frame): Adjust.
    * frame.h (frame_unwind_caller_pc_if_available): Delete
    declaration.
    * stack.c (frame_info): Use frame_unwind_caller_pc instead of
    frame_unwind_caller_pc_if_available, and handle
    NOT_AVAILABLE_ERROR and OPTIMIZED_OUT_ERROR errors.
    * valprint.c (val_print_optimized_out): Use val_print_not_saved.
    (val_print_not_saved): New function.
    * valprint.h (val_print_not_saved): Declare.

gdb/testsuite/
2013-12-06  Pedro Alves  <[email protected]>

    * gdb.dwarf2/dw2-undefined-ret-addr.S: New file.
    * gdb.dwarf2/dw2-undefined-ret-addr.c: New file.
    * gdb.dwarf2/dw2-undefined-ret-addr.exp: New file.
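Replacing a boolean "is cached" flag with an enumeration lets a cache remember failures as well as values, so each distinct outcome is computed once and reported consistently afterwards. The following is a rough sketch of that pattern only; the enumerator names, `struct cached_pc`, `get_pc`, and the reader callback are all hypothetical and are not taken from frame.c.

```c
#include <assert.h>

/* Hypothetical status values for a cached unwound PC.  */
enum cached_status_sketch
{
  CACHE_UNKNOWN = 0,   /* Nothing computed yet.  */
  CACHE_VALUE,         /* VALUE below is usable.  */
  CACHE_NOT_SAVED,     /* The register was not saved (optimized out).  */
  CACHE_UNAVAILABLE    /* The contents could not be read.  */
};

struct cached_pc
{
  enum cached_status_sketch status;
  unsigned long value;
};

/* Hypothetical reader: stores the PC in *OUT and reports the outcome.
   It must never return CACHE_UNKNOWN.  */
typedef enum cached_status_sketch (*pc_reader) (unsigned long *out);

static int reader_calls;

/* Example reader that always fails with "not saved".  */
static enum cached_status_sketch
read_not_saved (unsigned long *out)
{
  (void) out;
  reader_calls++;
  return CACHE_NOT_SAVED;
}

/* Return the cached status, calling READER only on the first request.
   *OUT is written only when the status is CACHE_VALUE.  */
static enum cached_status_sketch
get_pc (struct cached_pc *cache, pc_reader reader, unsigned long *out)
{
  if (cache->status == CACHE_UNKNOWN)
    {
      cache->status = reader (&cache->value);
      assert (cache->status != CACHE_UNKNOWN);
    }
  if (cache->status == CACHE_VALUE)
    *out = cache->value;
  return cache->status;
}
```

A caller such as a frame printer can then switch on the returned status and print something like `<not saved>` or `<unavailable>` instead of aborting partway through its output.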
monaka pushed a commit that referenced this issue May 15, 2014
We observed on Windows 2012 that we were unable to unwind past exception handlers. For instance, with any Ada program raising an exception that does not get handled:

    % gnatmake -g a -bargs -shared
    % gdb a
    (gdb) start
    (gdb) catch exception unhandled
    Catchpoint 2: unhandled Ada exceptions
    (gdb) c
    Catchpoint 2, unhandled CONSTRAINT_ERROR at <__gnat_unhandled_exception> (
        e=0x645ff820 <constraint_error>) at s-excdeb.adb:53
    53      s-excdeb.adb: No such file or directory.

At this point, we can already see that something went wrong, since the frame selected by the debugger corresponds to a runtime function rather than the function in the user code that caused the exception to be raised (in our case procedure A).

This is further confirmed by the fact that we are unable to unwind all the way to procedure A:

    (gdb) bt
    #0  <__gnat_unhandled_exception> (e=0x645ff820 <constraint_error>) at s-excdeb.adb:53
    #1  0x000000006444e9a3 in <__gnat_notify_unhandled_exception> (excep=0x284d2 +0) at a-exextr.adb:144
    #2  0x00000000645f106a in __gnat_personality_imp () from C:\[...]\libgnat-7.3.dll
    #3  0x000000006144d1b7 in _GCC_specific_handler (ms_exc=0x242fab0, this_frame=0x242fe60, ms_orig_context=0x242f5c0, ms_disp=0x242ef70, gcc_per=0x645f0960 <__gnat_personality_imp>) at ../../../src/libgcc/unwind-seh.c:289
    #4  0x00000000645f1211 in __gnat_personality_seh0 () from C:\[...]\libgnat-7.3.dll
    #5  0x000007fad3879f4d in ?? ()
    Backtrace stopped: previous frame inner to this frame (corrupt stack?)

It turns out that the unwinder has been doing its job flawlessly up until frame #5. The address in frame #5 is correct, but GDB is not able to associate it with any symbol or unwind record.

And this is because this address is inside ntdll.dll, and when we received the LOAD_DLL_DEBUG_EVENT for that DLL, the system was not able to tell us the name of the library, thus causing us to silently ignore the event. Because GDB does not know about ntdll.dll, it is unable to access the unwind information from it. And because the function at that address does not use a frame pointer, the unwinding becomes impossible.

This patch helps recover ntdll.dll at the end of the "run/attach" phase, simply by trying to locate that specific DLL again.

In terms of our medium to long term planning, it seems to me that we should be able to simplify the code by ignoring LOAD_DLL_DEBUG_EVENT during the startup phase, and modify windows_ensure_ntdll_loaded to then detect and report all shared libraries after we've finished inferior creation. But for a change just before 7.7 branch creation, I thought it was safest to just handle ntdll.dll specifically. This is less intrusive, and ntdll is the only DLL affected by the problem that I know of so far.

gdb/ChangeLog:

    * windows-nat.c (handle_load_dll): Add comments.
    (windows_ensure_ntdll_loaded): New function.
    (do_initial_windows_stuff): Use windows_ensure_ntdll_loaded.
    Add FIXME comment.
monaka pushed a commit that referenced this issue May 15, 2014
abfd->section_count unexpectedly changes between 218 and 248 in:

  150 bfd_simple_get_relocated_section_contents (bfd *abfd,
  [...]
  218   saved_offsets = malloc (sizeof (struct saved_output_info)
  219                           * abfd->section_count);
  [...]
  230   _bfd_generic_link_add_symbols (abfd, &link_info);
  [...]
  248   bfd_map_over_sections (abfd, simple_restore_output_info, saved_offsets);

_bfd_generic_link_add_symbols increases section_count and simple_restore_output_info later reads unallocated part of saved_offsets.

  READ of size 8 at 0x601c0000c5c0 thread T0
      #0 0x1124770 in simple_restore_output_info (.../gdb/gdb+0x1124770)
      #1 0x10ecd51 in bfd_map_over_sections (.../gdb/gdb+0x10ecd51)
      #2 0x1125150 in bfd_simple_get_relocated_section_contents (.../gdb/gdb+0x1125150)

bfd/
2014-02-17  Jan Kratochvil  <[email protected]>

    PR binutils/16595
    * simple.c (struct saved_offsets): New.
    (simple_save_output_info): Use it for ptr.
    (simple_restore_output_info): Use it for ptr.  Check section_count.
    (bfd_simple_get_relocated_section_contents): Use it for
    saved_offsets.
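The general shape of this kind of fix is to record the element count alongside the array at allocation time, and to bounds-check against that recorded count (not the live one) when reading back. A minimal sketch under that assumption follows; `struct saved_table`, `save_init`, and `restore_one` are hypothetical names and do not reproduce the actual simple.c change.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical snapshot: the saved values plus the number of sections
   that existed when the snapshot was taken.  */
struct saved_table
{
  unsigned int section_count;
  long *offsets;
};

/* Allocate room for COUNT sections.  Return 0 on success, -1 on
   allocation failure.  */
static int
save_init (struct saved_table *table, unsigned int count)
{
  table->offsets = calloc (count ? count : 1, sizeof *table->offsets);
  if (table->offsets == NULL)
    {
      table->section_count = 0;
      return -1;
    }
  table->section_count = count;
  return 0;
}

/* Fetch the saved offset for section INDEX into *OUT.  Return -1 without
   touching memory if the section was created after the snapshot, which
   is the case that caused the out-of-bounds read.  */
static int
restore_one (const struct saved_table *table, unsigned int index, long *out)
{
  if (index >= table->section_count)
    return -1;
  *out = table->offsets[index];
  return 0;
}
```

Sections added between save and restore simply have nothing to restore, so skipping them is the natural behavior.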
monaka pushed a commit that referenced this issue May 15, 2014
binutils readelf -wi:

 <4><a2>: Abbrev Number: 26 (DW_TAG_inlined_subroutine)
    <a3>   DW_AT_abstract_origin: <0x5a>
    <a7>   DW_AT_low_pc      : 0x400590
    <ab>   DW_AT_high_pc     : 0x4
    <af>   DW_AT_call_file   : 1
    <b0>   DW_AT_call_line   : 20
    <b1>   DW_AT_sibling     : <0xb8>
 <2><b8>: Abbrev Number: 35 (DW_TAG_inlined_subroutine)
    <b9>   DW_AT_abstract_origin: <0x5a>
    <bd>   DW_AT_low_pc      : 0x400590
    <c1>   DW_AT_high_pc     : 0x4
    <c5>   DW_AT_call_file   : 1
    <c6>   DW_AT_call_line   : 29

<b1> DW_AT_sibling points to the next DIE - but that DIE is 2 levels upwards - definitely not a sibling. This confuses GDB up to a crash:

==32143== ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6024000198ac at pc 0xb4d104 bp 0x7fff63e96e70 sp 0x7fff63e96e60
READ of size 1 at 0x6024000198ac thread T0
    #0 0xb4d103 in read_unsigned_leb128 (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xb4d103)
    #1 0xb15f3c in peek_die_abbrev (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xb15f3c)
    #2 0xb46185 in load_partial_dies (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xb46185)
    #3 0xb103fb in process_psymtab_comp_unit_reader (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xb103fb)
    #4 0xb0d2a9 in init_cutu_and_read_dies (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xb0d2a9)
    #5 0xb1115f in process_psymtab_comp_unit (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xb1115f)
    #6 0xb1235f in dwarf2_build_psymtabs_hard (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xb1235f)
    #7 0xb05536 in dwarf2_build_psymtabs (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xb05536)
    #8 0x86d5a5 in read_psyms (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0x86d5a5)
    #9 0x9b1c37 in require_partial_symbols (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0x9b1c37)
    #10 0x9bf2d0 in read_symbols (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0x9bf2d0)
    #11 0x9c014c in syms_from_objfile_1 (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0x9c014c)

gdb/testsuite/
2014-02-25  Jan Kratochvil  <[email protected]>

    Fix dw2-icycle.exp -fsanitize=address GDB crash.
    * gdb.dwarf2/dw2-icycle.S: Remove all DW_AT_sibling.

Message-ID: <[email protected]>
monaka pushed a commit that referenced this issue May 15, 2014
…T_WAITKIND_NO_RESUMED).

GDBserver currently hangs forever in waitpid if the leader thread exits before other threads, or if all resumed threads exit - e.g., next over a thread exit with sched-locking on. This is exposed by leader-exit.exp. leader-exit.exp is part of a series of tests for a set of related problems. See <http://www.sourceware.org/ml/gdb-patches/2011-10/msg00704.html>:

 "
 To recap, on the Linux kernel, ptrace/waitpid don't allow reaping the leader thread until all other threads in the group are reaped. When the leader exits, it goes zombie, but waitpid will not return an exit status until the other threads are gone. This is presently exercised by the gdb.threads/leader-exit.exp test. The fix for that test, in linux-nat.c:wait_lwp, handles the case where we see the leader gone when we're stopping all threads to report an event to some other thread to the core.

 (...)

 The latter bit about not blocking if there no resumed threads in the process also applies to some other thread exiting, not just the main thread. E.g., this test starts a thread, and runs to a breakpoint in that thread:

 ...
 (gdb) c
 Continuing.
 [New Thread 0x7ffff75a4700 (LWP 23397)]
 [Switching to Thread 0x7ffff75a4700 (LWP 23397)]

 Breakpoint 2, thread_a (arg=0x0) at ../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:28
 28        return 0; /* break-here */
 (gdb) info threads
 * 2 Thread 0x7ffff75a4700 (LWP 23397) thread_a (arg=0x0) at ../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:28
   1 Thread 0x7ffff7fcb720 (LWP 23391) 0x00007ffff7bc606d in pthread_join (threadid=140737343276800, thread_return=0x0) at pthread_join.c:89

 The thread will exit as soon as we resume it. But if we only resume that thread, leaving the rest of the threads stopped:

 (gdb) set scheduler-locking on
 (gdb) c
 Continuing.
 ^C^C^C^C^C^C^C^C
 "

This patch fixes the issues by implementing TARGET_WAITKIND_NO_RESUMED on GDBserver, similarly to what the patch above did for native Linux GDB.

gdb.threads/leader-exit.exp now passes. gdb.threads/no-unwaited-for-left.exp now at least errors out instead of hanging:

 continue
 Continuing.
 warning: Remote failure reply: E.No unwaited-for children left.

 [Thread 15454] #1 stopped.
 0x00000034cf408e60 in pthread_join (threadid=140737353922368, thread_return=0x0) at pthread_join.c:93
 93          lll_wait_tid (pd->tid);
 (gdb) FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits

The gdb.threads/non-ldr-exc-*.exp tests are skipped because GDBserver unfortunately doesn't support fork/exec yet, but I'm confident this fixes the related issues.

I'm leaving modeling TARGET_WAITKIND_NO_RESUMED in the RSP for a separate pass.

(BTW, in case of error in response to a vCont, it would be better for GDB to query the target for the current thread, or re-select one, instead of assuming current inferior_ptid is still the selected thread.)

This implementation is a little different from GDB's, because I'm avoiding bringing in more of this broken use of waitpid(PID) into GDBserver. Specifically, this avoids waitpid(PID) when stopping all threads. There's really no need for wait_for_sigstop to wait for each LWP in turn. Instead, with some refactoring, we make it reuse linux_wait_for_event.

gdb/gdbserver/
2014-02-27  Pedro Alves  <[email protected]>

    PR 12702
    * inferiors.h (A_I_NEXT, ALL_INFERIORS_TYPE, ALL_PROCESSES): New
    macros.
    * linux-low.c (delete_lwp, handle_extended_wait): Add debug
    output.
    (last_thread_of_process_p): Take a PID argument instead of a
    thread pointer.
    (linux_wait_for_lwp): Delete.
    (num_lwps, check_zombie_leaders, not_stopped_callback): New
    functions.
    (linux_low_filter_event): New function, partly factored out from
    linux_wait_for_event.
    (linux_wait_for_event): Rename to ...
    (linux_wait_for_event_filtered): ... this.  Add new filter ptid
    argument.  Partly rewrite.  Always use waitpid(-1, WNOHANG) and
    sigsuspend.  Check for zombie leaders.
    (linux_wait_for_event): Reimplement as wrapper around
    linux_wait_for_event_filtered.
    (linux_wait_1): Handle TARGET_WAITKIND_NO_RESUMED.  Assume that if
    a normal or signal exit is seen, it's the whole process exiting.
    (wait_for_sigstop): No longer a for_each_inferior callback.
    Rewrite on top of linux_wait_for_event_filtered.
    (stop_all_lwps): Call wait_for_sigstop directly.
    * server.c (resume, handle_target_event): Handle
    TARGET_WAITKIND_NO_RESUMED.
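The ChangeLog above mentions always using waitpid(-1, WNOHANG) rather than blocking on a specific PID. The following sketch shows, with plain POSIX calls, how a non-blocking poll can distinguish "a child reported an event", "nothing yet", and "there is nothing left to wait for". It is only an illustration: `enum poll_result` and `poll_any_child` are hypothetical names, and it omits everything Linux-specific that GDBserver needs (such as clone-thread wait flags and the sigsuspend step).

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical outcome of one non-blocking poll.  */
enum poll_result
{
  POLL_EVENT,        /* A child changed state; see *PID_OUT/*STATUS_OUT.  */
  POLL_NOTHING_YET,  /* Children exist, but none has an event pending.  */
  POLL_NO_CHILDREN,  /* No unwaited-for children left (ECHILD).  */
  POLL_ERROR         /* Some other waitpid failure; errno is set.  */
};

/* Poll every child once without blocking.  */
static enum poll_result
poll_any_child (pid_t *pid_out, int *status_out)
{
  pid_t pid;

  /* Retry if a signal interrupts the call.  */
  do
    pid = waitpid (-1, status_out, WNOHANG);
  while (pid == -1 && errno == EINTR);

  if (pid > 0)
    {
      *pid_out = pid;
      return POLL_EVENT;
    }
  if (pid == 0)
    return POLL_NOTHING_YET;
  return errno == ECHILD ? POLL_NO_CHILDREN : POLL_ERROR;
}
```

Because the call never blocks, a caller can report the "no children" case to its client (the role TARGET_WAITKIND_NO_RESUMED plays in the patch) instead of hanging forever in waitpid.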
monaka pushed a commit that referenced this issue May 15, 2014
runtest gdb.base/corefile.exp

==23174== ERROR: AddressSanitizer: heap-use-after-free on address 0x604400008c88 at pc 0x68f0be bp 0x7fffae9d7490 sp 0x7fffae9d7480
READ of size 8 at 0x604400008c88 thread T0
    #0 0x68f0bd in svr4_read_so_list (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0x68f0bd)
    #1 0x68f64e in svr4_current_sos_direct (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0x68f64e)
    #2 0x68f757 in svr4_current_sos (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0x68f757)
    #3 0xcebbff in update_solib_list (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xcebbff)
0x604400008c88 is located 8 bytes inside of 1104-byte region [0x604400008c80,0x6044000090d0)
freed by thread T0 here:
    #0 0x7f52677500f9 (/lib64/libasan.so.0+0x160f9)
    #1 0xd2c68a in xfree (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xd2c68a)
    #2 0xceb364 in free_so (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xceb364)
    #3 0xca59f8 in do_free_so (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0xca59f8)
    #4 0x93432a in do_my_cleanups (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0x93432a)
    #5 0x934406 in do_cleanups (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0x934406)
    #6 0x68efa9 in svr4_read_so_list (/home/jkratoch/redhat/gdb-clean/gdb/gdb+0x68efa9)

I did not notice it during my review in:

    Re: [PATCH v2] Skip vDSO when reading SO list (PR 8882)
    https://sourceware.org/ml/gdb-patches/2013-09/msg00888.html

gdb/
2014-02-27  Jan Kratochvil  <[email protected]>

    Additional PR 8882 fix.
    * solib-svr4.c (svr4_read_so_list): Change first to first_l_name.

Message-ID: <[email protected]>
monaka pushed a commit that referenced this issue May 15, 2014
With target async enabled, py-finish-breakpoint.exp triggers an assertion failure.

The failure occurs because execute_command re-enters the event loop in some circumstances, and in this case resets the sync_execution flag. Then later GDB reaches this assertion in normal_stop:

      gdb_assert (sync_execution || !target_can_async_p ());

In detail:

#1 - A synchronous execution command is run. sync_execution is set.

#2 - A python breakpoint is hit (TARGET_WAITKIND_STOPPED), and the corresponding Python breakpoint's stop method is executed. When and while python commands are executed, interpreter_async is forced to 0.

#3 - The Python stop method happens to execute a not-execution-related gdb command. In this case, "where 1".

#4 - Seeing that sync_execution is set, execute_command nests a new event loop (although that wasn't necessary; this is the problem).

#5 - The linux-nat target's pipe in the event loop happens to be marked. That's normal, due to this in linux_nat_wait:

      /* If we requested any event, and something came out, assume there
         may be more.  If we requested a specific lwp or process, also
         assume there may be more.  */

     The nested event loop thus immediately wakes up and calls target_wait. No thread is actually executing in the inferior, so the target returns TARGET_WAITKIND_NO_RESUMED.

#6 - normal_stop is reached. GDB prints "No unwaited-for children left.", and resets the sync_execution flag (IOW, there are no resumed threads left, so the synchronous command is considered completed.) This is already bogus. We were handling a breakpoint!

#7 - the nested event loop unwinds/ends. GDB is now back to handling the python stop method (TARGET_WAITKIND_STOPPED), which decides the breakpoint should stop. normal_stop is called for this event. However, normal_stop actually works with the _last_ reported target status:

      void
      normal_stop (void)
      {
        struct target_waitstatus last;
        ptid_t last_ptid;
        struct cleanup *old_chain = make_cleanup (null_cleanup, NULL);

        ...
        get_last_target_status (&last_ptid, &last);
        ...
        if (last.kind == TARGET_WAITKIND_NO_RESUMED)
          {
            gdb_assert (sync_execution || !target_can_async_p ());

            target_terminal_ours_for_output ();
            printf_filtered (_("No unwaited-for children left.\n"));
          }

     And due to the nesting in execute command, the last event is now TARGET_WAITKIND_NO_RESUMED, not the actual breakpoint event being handled. This could be seen to be broken in itself, but we can leave fixing that for another pass. The assertion is reached, and fails.

execute_command has a comment explaining when it should synchronously wait for events:

      /* If the interpreter is in sync mode (we're running a user
         command's list, running command hooks or similars), and we
         just ran a synchronous command that started the target, wait
         for that command to end.  */

However, the code did not follow this comment -- it didn't check to see if the command actually started the target, just whether the target was executing a sync command at this point.

This patch fixes the problem by noting whether the target was executing in sync_execution mode before running the command, and then augmenting the condition to test this as well.

2014-03-20  Tom Tromey  <[email protected]>

    PR gdb/14135
    * top.c (execute_command): Only dispatch events if the command
    started the target.
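The fix described here boils down to sampling a flag before running a command and waiting only if that command is what set it. A minimal sketch of that "did this command start the target?" check follows. The flag name `sync_execution` comes from the commit message, but `run_command`, the two command functions, and `wait_for_command_to_end` are hypothetical, and the real condition in execute_command also takes the interpreter's sync mode into account, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* True while a synchronous execution command is in flight.  */
static bool sync_execution;

/* Counts how many times a nested wait was entered (for illustration).  */
static int nested_waits;

typedef void (*command_fn) (void);

/* Hypothetical command that only inspects state, like "where 1".  */
static void
cmd_where (void)
{
}

/* Hypothetical command that resumes the target synchronously.  */
static void
cmd_continue (void)
{
  sync_execution = true;
}

/* Stand-in for nesting the event loop until the command finishes.  */
static void
wait_for_command_to_end (void)
{
  nested_waits++;
  sync_execution = false;
}

/* Run CMD, and wait for it only if CMD itself started the target.
   Checking sync_execution alone would also fire when an outer
   synchronous command was already in progress.  */
static void
run_command (command_fn cmd)
{
  bool was_sync = sync_execution;

  cmd ();

  if (!was_sync && sync_execution)
    wait_for_command_to_end ();
}
```

In the scenario from the numbered steps, the outer command has already set the flag when the Python stop method runs "where 1", so `was_sync` is true and no nested wait happens.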