GDB Debugging

From RidgeRun Developer Connection

Introduction

In a software developer's daily work, errors are bound to appear. Some are easy to solve, while others require more effort. This is where debugging is useful: it helps developers find and fix issues more efficiently. This wiki is a short guide to debugging concepts, the tools that can be used, and some debugging use cases.

Debugging importance

Debugging is the process of finding and fixing errors in one or more software programs. This process requires that the developer has some understanding of the error that needs to be fixed. Debugging is important because it allows software engineers and developers to fix errors in a program before releasing it to the public. It's a complementary process to testing, which involves learning how an error affects a program overall [1].

Error types

There are different types of software errors, and some require deeper debugging than others. These are the error types that a software developer might face:

Syntax Errors

Syntax errors are grammatical interruptions in a line of code. For example, an extra bracket or period might cause a syntax error to occur [1]. These errors are usually easy to fix because the compiler reports them, so they do not require deep debugging.

Logic Errors

Logic errors are issues in the code's algorithms. They can occur when a program's code produces an unexpected output or causes the program to stop working [1]. These errors are more difficult to resolve than syntax errors and require a deeper level of debugging. They can often be fixed by carefully reviewing the code to find the issue.

Run-time Errors

These errors occur when a person uses the program and they're detected by the computer executing it. They can still appear after you finish an initial debugging process because a computer might interpret the program's code in an unexpected way [1]. For this type of error, a debugging tool is very useful: it can help you find the line of code where the error happens, or give you more information about the state of the program's memory at the moment of the crash.

Debugging tools

There are several debugging tools that software developers can use to find and fix errors in their programs. Some run in the system terminal and others have a graphical user interface. They also differ in the kinds of errors they detect and in the information they give about the program being debugged. Some of the most popular debugging tools are the following:

Valgrind

Valgrind is an instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail [2]. Some of the tools that Valgrind offers are the following:

  • Memory Error detector
  • Thread Error detector
  • Cache and branch prediction profiler
  • Heap profiler

Valgrind can be used on the following operating systems:

  • Linux
  • Solaris
  • Android
  • MacOS
  • FreeBSD

Valgrind takes control of the program being debugged before it starts and reads its debug information and symbols. It then runs the program on a simulated CPU provided by Valgrind. Because Valgrind simulates every instruction that the program executes, its tools can monitor different kinds of problems, such as memory leaks, segmentation faults, thread issues, and even issues with linked libraries. Valgrind is mainly used to debug programs written in C and C++.

GDB

GDB, the GNU Project debugger, allows you to see what is going on 'inside' another program while it executes, or what another program was doing at the moment it crashed [3]. GDB runs on most Unix-like operating systems and on Microsoft Windows variants, and can be used to debug the following programming languages:

  • C and C++
  • Assembly
  • Rust
  • Go
  • OpenCL

GDB lets the software developer start the program being debugged while specifying anything that might affect its behavior, make the program stop under certain conditions, examine what has happened when the program has stopped, monitor variables and memory addresses, and switch between the program's threads. It is mainly used through terminal commands, but there are graphical front ends that can be used with GDB. It also offers remote debugging, which is very useful when working with embedded systems.

This wiki will be focused on the usage of this tool.

KDevelop

At the core of KDevelop lies the combination of an advanced editor with semantic code analysis, which delivers an enriched programming experience thanks to a deep understanding of your project [4]. KDevelop can be used to debug programs written in:

  • C/C++
  • Python
  • QML
  • JavaScript
  • PHP

It supports different platforms such as:

  • Linux
  • Windows
  • Solaris
  • FreeBSD
  • MacOS

KDevelop provides a graphical code editor built on the KDE and Qt libraries, which offers developers an easy debugging experience. It analyzes the source code of the program being debugged and can retrieve different kinds of information about it, such as the member functions of its classes and which variables have been defined and their types.

GDB concepts

Exception

This is an event that alters the expected flow of a program's code. In some circumstances, the program can handle the exception and continue to run, but if the exception interrupts the program, a software developer would typically start a debugging process. [1]

Value

This word represents the specific data assigned to a variable after creating it. [1]

Break point

A breakpoint is a debugging command that stops the code from running at a chosen location, typically in a section that contains a previously identified error. [1]

Debugging tool

Describes software that allows pinpointing the area in a code where the error occurs and making any necessary adjustments. [1]

Compiler

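A compiler is a program that translates source code written in a programming language into a usable format, such as an executable. When asked to, it also emits the debugging information that makes it possible to debug the program. [1]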

Debugging symbols

A debug symbol is a set of special characters generated when a program is compiled, containing information about the location of variables and functions in the resulting binary file, plus other service information [5]. A debugger needs these symbols to get full information about the program being run. They are added by compiling the program with the "-g" flag.

Because they contain extra information about the program, such as variable names, function names, and source line information, the generated binary is bigger, which can be a problem on devices with low storage capacity. This can be addressed with the strip utility on Linux, which removes the information that is not needed to run the program, including the debugging symbols, and so reduces the size of the binary.

GDB commands

Miscellaneous

Run a program

With this command, gdb runs the program until it finishes, hits a breakpoint, or is stopped by a signal. It is run by this command

 (gdb) r

Print program variable

With this command, gdb prints the current value of a variable defined in the program. It is run by this command

 (gdb) p <variable name>

Continue code execution

With this command, gdb resumes the code execution after the program has stopped, for example at a breakpoint. It is run by this command

 (gdb) c

Check source code

With this command, gdb shows the source code of the program that is being debugged. It is run by this command

 (gdb) list

Quit the debugger

With this command, gdb ends the debug session. It is run by this command

 (gdb) q

Functions

Run until function is finished

With this command, gdb runs the current function until it returns and then stops in its caller. It is run by this command

 (gdb) finish

Print stack trace

This command prints the stack trace (backtrace) of the current thread, that is, the chain of function calls that led to the point where the program is stopped. It is run by this command

 (gdb) bt

Execute next line

GDB allows the user to execute the program line by line. It also allows the user to step into a function called on the current line of the source code. To execute the next line of the program, stepping over any function calls, run

 (gdb) n

To step into the function called on the current line, run

 (gdb) s

Breakpoints

Set breakpoints

GDB allows the user to set breakpoints to stop the program execution when it reaches a specific line or function. To set a breakpoint on a specific line, run

 (gdb) b <line number>

To set it on a specific function, run

 (gdb) b <function name>

Delete breakpoints

GDB also allows deleting the breakpoints that were set. To delete a specific breakpoint, run

 (gdb) d <breakpoint number>

To delete all the breakpoints that were set, run

 (gdb) d

List all the breakpoints

To list all the breakpoints that are set, together with their breakpoint numbers, run

 (gdb) info break

Multi Threads

Change between threads

To switch to one of the threads that are currently running in a debugging session, run

 (gdb) thread <thread number>

List all the threads

To list all the threads that are currently running in a debugging session, run

 (gdb) info threads

Set thread-specific breakpoint

To set a breakpoint that only stops a specific thread, run

 (gdb) b <function> thread <thread number>
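
As a small practice target for the thread commands above, the following sketch starts two threads that run the same function. The names shared_total, step, worker and run_workers are hypothetical and not part of the original example. The GDB commands in the comments assume a build with "-g", and the thread numbers GDB assigns may differ on your system.

```cpp
#include <atomic>
#include <thread>

// Hypothetical shared counter; atomic so both threads can update it safely.
std::atomic<int> shared_total{0};

// Both threads call this function, so a plain "b step" stops in either one.
// A thread-specific breakpoint narrows it down, for example:
//   (gdb) b worker          stop when the first thread enters worker()
//   (gdb) r
//   (gdb) info threads      find the number GDB gave each thread
//   (gdb) b step thread 2   this breakpoint only stops thread 2 in step()
void step(int amount)
{
    shared_total += amount;
}

// Each thread calls step() three times with its own amount.
void worker(int amount)
{
    for (int i = 0; i < 3; ++i)
    {
        step(amount);
    }
}

// Starts both threads, waits for them, and returns the final counter value.
int run_workers()
{
    shared_total = 0;
    std::thread first(worker, 1);
    std::thread second(worker, 10);
    first.join();
    second.join();
    return shared_total.load();
}
```

Calling run_workers() from main gives a binary on which "thread <thread number>" and "info threads" can also be tried while the program is stopped.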

Remote debugging

In some cases, the debug session must be run on an external machine, for example when working with embedded systems. For those cases, GDB offers a way to connect to a server running on that external machine and debug the code running there.

This tool is called gdbserver. It supports different communication channels, such as serial and TCP. It runs on the target machine and creates a server to which GDB can connect to start the debugging session.

GDB Server

To run gdbserver, a copy of the program to be debugged is needed on the target machine. This binary can be stripped, because gdbserver does not care about symbols; all symbol handling is done by the GDB instance running on the host system. This is useful when debugging programs on embedded systems with low storage capacity [6]. To start a remote debug session, follow these steps

On the target machine run

 gdbserver localhost:<port> <program>

Start gdb on the host machine

 arm-none-eabi-gdb <copy of program to debug>

To connect via serial port run the following command inside gdb

 (gdb) target remote <serial interface>

To connect via TCP port run the following command inside gdb

 (gdb) target remote <target ip>:<port>

arm-none-eabi-gdb

This tool is part of the GNU ARM toolchain, which is a ready-to-use, open-source suite of tools for C, C++ and assembly programming. The GNU Arm Embedded Toolchain targets the 32-bit Arm Cortex-A, Arm Cortex-M, and Arm Cortex-R processor families [7].

This toolchain allows the programmer to debug ARM binaries from a non-ARM host machine. It is useful when working with ARM embedded systems on which remote debug sessions must be run.

GDB Cheat sheet

For a cheat sheet covering the GDB commands explained above, and more information, see the following link: GDBRefenceCard

Advanced GDB

Advanced GDB Commands

Check register variables

GDB lets the user inspect the CPU registers while the debugging session is running. Registers are referenced with a $ prefix. For example, to print the program counter, run

 (gdb) p $pc

View assembly code

GDB lets the user see the assembly code for the functions of the program being debugged. To check it, run

 (gdb) disassemble <function name>

Watchpoints

Watchpoints are a way to detect when a program variable changes. When a watchpoint is set on a variable or on the contents of a memory address, GDB stops the program whenever that value changes and shows both the old and the new value. To set a watchpoint, run

 (gdb) watch <variable>

Artificial arrays

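It is often useful to print a contiguous region of memory as if it were an array [8]. This is done with the @ operator inside a print expression: the left operand is the first element and the right operand is the number of elements to show. To create an artificial array, run

 (gdb) p <array first element>@<array size>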
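
A typical case is a function that only receives a pointer and a length, where printing the pointer shows just an address. The sketch below uses a hypothetical helper, fill_squares, to illustrate this; the GDB command in the comment assumes the program was compiled with "-g".

```cpp
#include <cstddef>

// Hypothetical helper: writes 0, 1, 4, 9, ... into the caller's buffer.
// Inside this function GDB only knows `data` as an int*, so "p data"
// prints an address. With the program stopped inside the function
// (for example via "b fill_squares"), an artificial array shows the contents:
//   (gdb) p *data@count
void fill_squares(int *data, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        data[i] = static_cast<int>(i * i);
    }
}
```

Note that the left operand is the first element (*data), not the pointer itself.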

Conditional breakpoints

GDB lets the user set breakpoints that are only triggered if a certain condition is true. To set a conditional breakpoint, run

 (gdb) b <location> if <condition>
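
Conditional breakpoints are most helpful inside loops, where an unconditional breakpoint would stop on every iteration. The following sketch uses a hypothetical function, sum_to, and leaves the file name and line number as placeholders because they depend on where the code is saved.

```cpp
#include <cstddef>

// Hypothetical function used only to illustrate a conditional breakpoint.
long sum_to(std::size_t limit)
{
    long total = 0;
    for (std::size_t i = 1; i <= limit; ++i)
    {
        // A plain breakpoint on the next statement stops on every iteration.
        // A conditional one stops only when i reaches 900, for example:
        //   (gdb) b <file>:<line> if i == 900
        total += static_cast<long>(i);
    }
    return total;
}
```

Once the breakpoint triggers, "p total" shows the partial sum accumulated so far.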

Run shell commands

GDB lets the user run shell commands without leaving the debugging session. To run a shell command, prefix it with the shell command of GDB

 (gdb) shell <shell command>

Core dumps

A core dump is a file containing a process's address space (memory) when the process terminates unexpectedly. Core dumps may be produced on-demand (such as by a debugger), or automatically upon termination. Core dumps are triggered by the kernel in response to program crashes, and may be passed to a helper program (such as systemd-coredump) for further processing [9]. They are useful for debugging errors that only appear after long periods of time, without the need to keep a debug session running. They work like an airplane's black box, because the programmer can retrieve the state of the program's memory at the moment of the crash.

Core dump files are usually disabled by default, but they can be enabled by running the command:

 ulimit -c unlimited

This raises the maximum size of the core files that the kernel is allowed to write. When the limit is 0 bytes, as it is by default on many systems, no core file is produced. With the default kernel settings, the core dump file is stored in the current working directory of the crashed process under the name "core". To use it, run the following command:

 gdb <program> <core dump file>

A core dump file can also be created while a GDB debug session is running, by using the command

 (gdb) generate-core-file
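
The ulimit command only affects the shell in which it is run and the processes started from it. As an alternative sketch, assuming a POSIX system, a program can raise its own core file limit at startup with getrlimit and setrlimit from <sys/resource.h>. The helper name enable_core_dumps is hypothetical.

```cpp
#include <sys/resource.h>

// Hypothetical helper: raises the soft core-file size limit of the current
// process to its hard limit, similar in effect to "ulimit -c" for this
// process only. Returns true on success.
bool enable_core_dumps()
{
    struct rlimit limit{};
    if (getrlimit(RLIMIT_CORE, &limit) != 0)
    {
        return false;
    }
    // An unprivileged process may raise its soft limit up to the hard limit.
    limit.rlim_cur = limit.rlim_max;
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

If the hard limit itself is 0, this call cannot enable core dumps, and the limit has to be raised by the administrator or the parent shell instead.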

Enable core dumps on Ubuntu 20.04

On Ubuntu 20.04, core dumps are handled by a service called apport. That service is in charge of collecting and registering crash information for the system, but it does not process crashes of programs that were not installed from a package, so core dump files are not generated for them. To generate them, run

 sudo systemctl disable apport
 sudo sysctl -w kernel.core_pattern=core.%u.%p.%t
 ulimit -c unlimited

The command sysctl -w kernel.core_pattern is used to specify the name pattern for the generated core dump files. More information can be added to the core dump name with the following specifiers

 %p: pid
 %<NUL>: '%' is dropped
 %%: output one '%'
 %u: uid
 %g: gid
 %s: signal number
 %t: UNIX time of dump
 %h: hostname
 %e: executable filename
 %<OTHER>: both are dropped

Debugging symbols vs optimizations

When the compiler optimizes code, it repositions and reorganizes instructions. This results in more efficient compiled code. Because of this rearrangement, the debugger cannot always identify the source code that corresponds to a set of instructions [10]. Optimizations can affect different elements of the source code, including:

  • Local variables
  • Positions inside the functions
  • Function names

This can cause problems such as breakpoints that are never triggered or variables that cannot be inspected. For this reason, optimizations should be turned off when debugging by compiling with the flag "-O0".

If you need some level of optimization, you can use the flag "-Og", which offers a reasonable level of optimization while maintaining fast compilation and a good debugging experience. Higher optimization levels can degrade the debugging experience in some cases.
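
To observe the effect, a small function like the one below can be built twice, once with "-g -O0" and once with "-g -O2", and inspected in GDB. The function scaled_sum is hypothetical, and what the optimized build shows depends on the compiler and its version, so the comments describe what typically happens rather than guaranteed output.

```cpp
// Hypothetical function with two short-lived local variables.
// Built with -O0, both locals live in memory and "p sum" prints a value.
// Built with -O2, the compiler may keep them only in registers or fold
// them away, and GDB may then report "<optimized out>" for them.
int scaled_sum(int a, int b)
{
    int sum = a + b;
    int scaled = sum * 3;
    return scaled;
}
```

Comparing "disassemble scaled_sum" in both builds also shows how much shorter the optimized code is.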

Attach GDB to running application

If a program that contains debug symbols is already running, GDB can be attached to it to check its state at a specific moment. When attaching, GDB loads the symbols and information for the program, including those of distribution-supplied packages, and interrupts the program so that you can interact with it. This is especially useful when the application being debugged is stuck and its state at that time must be known.

To attach GDB to a running application, run the following command:

 gdb --pid <program pid>
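
The only input the command needs is the process ID. As a minimal sketch, assuming a POSIX system, a long-running program could log the exact attach command at startup so that it is at hand when the program gets stuck. The helper attach_hint is hypothetical and not part of GDB.

```cpp
#include <string>
#include <unistd.h>

// Hypothetical helper: builds the GDB attach command for the calling process
// using its own process ID.
std::string attach_hint()
{
    return "gdb --pid " + std::to_string(getpid());
}
```

Printing the returned string once, for example to the program's log, is enough; the PID does not change while the process is alive.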

Init files

At startup, GDB reads its initialization file. This is a file of commands, such as option settings, that you tell GDB to run every time it starts up. The initialization file is named .gdbinit on Unix (BSD, Linux, etc.) systems [11]. The .gdbinit file can have different scopes depending on its location. It can be stored in the following paths:

  • If it is stored in $HOME/.gdbinit, it acts as a global initialization file. It is applied to all of the user's debug sessions.
  • If it is stored in ./.gdbinit, it acts as a local initialization file. It is applied only to the gdb sessions started from the current working directory.

The .gdbinit file can contain any gdb commands, including shell commands, and they are executed before the debug session starts.

GDB on VSCode

To configure GDB on VSCode, follow these steps

  • Go to Debug / “Create launch.json file”
  • Select a C/C++ debugger extension or install a new one
  • Edit the following values on the auto-generated launch.json file
    • “program”, set the path to the program binary
    • “stopAtEntry”, to automatically stop the program on the main function
    • “cwd”, set the path to the source code of the program

At the end, your "launch.json" file should look like this

{
   "version": "0.2.0",
   "configurations": [
       {
           "name": "TrainingDebugger",
           "type": "cppdbg",
           "request": "launch",
           "program": "path to binary",
           "stopAtEntry": true,
           "cwd": "."
       }
   ]
}

Practical Example

The following example shows how to use gdb to debug a multithreaded C++ program that implements the classic producer-consumer problem. The code being debugged is the following

  #include <iostream>
  #include <thread>
  #include <mutex>
  #include <condition_variable>
  #include <deque>
  #include <vector>

  std::mutex mu;
  std::condition_variable cond;
  std::deque<int> buffer;

  const unsigned int maxBufferSize = 50;

  void producer(int val)
  {
	  while (val)
	  {
		  buffer.push_back(val);
		  std::cout << "Produced: " << val << "\n";
		  val--;
	  }
  }

  void consumer()
  {
	  int accumulator = 0;
	  while (true)
	  {
		int val = buffer.back();
		buffer.pop_back();
		std::cout << "Consumed: " << val << "\n";
		accumulator += val;
		std::cout << "Accumulator is " << accumulator << "\n";
	  }
  }

  int main()
  {
	  std::thread t1(producer, 10);
	  std::thread t2(consumer);

	  t1.join();
	  t2.join();

	  return 0;
  }

When this program is run, the output is the following

 Segmentation fault (core dumped)

Running the program with GDB we can see the following output

 Thread 3 "producerConsume" received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 0x7ffff7211700 (LWP 416041)]
 0x0000555555556431 in consumer () at producerConsumerMkII.cpp:35
 35			int val = buffer.back();

We can see that the segmentation fault occurs when the consumer thread tries to consume a value. The deque data structure used in this example is not thread-safe, so we need a mutex to add thread safety to the program. Adding the mutex results in the following code

  #include <iostream>
  #include <thread>
  #include <mutex>
  #include <condition_variable>
  #include <deque>
  #include <vector>

  std::mutex mu;
  std::condition_variable cond;
  std::deque<int> buffer;

  const unsigned int maxBufferSize = 50;

  void producer(int val)
  {
	  while (val)
	  {
		  std::unique_lock<std::mutex> locker(mu);
		  buffer.push_back(val);
		  std::cout << "Produced: " << val << "\n";
		  val--;
		  locker.unlock();
	  }
  }

  void consumer()
  {
	  int accumulator = 0;
	  while (true)
	  {
		  std::unique_lock<std::mutex> locker(mu);
		  int val = buffer.back();
		  buffer.pop_back();
		  std::cout << "Consumed: " << val << "\n";
		  accumulator =+ val;
		  std::cout << "Accumulator is " << accumulator << "\n";
		  locker.unlock();
	  }
  }

  int main()
  {
	  std::thread t1(producer, 10);
	  std::thread t2(consumer);
	  t1.join();
	  t2.join();
	  return 0;
  }

After running the program again, we can see that the segmentation fault is still happening

Thread 3 "producerConsume" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7211700 (LWP 416692)]
0x000055555555654c in consumer () at producerConsumerMkII.cpp:35
35			int val = buffer.back();

The segmentation fault still happens on the same line. The mutex prevents the two threads from modifying the deque at the same time, but nothing stops the consumer from calling back() on an empty buffer, either before the producer has added any data or after everything has been consumed. We therefore need another mechanism to synchronize the threads. The C++ standard library provides a class called std::condition_variable for this purpose; its two main methods are wait() and notify_one().

The wait() method blocks the calling thread until a given condition becomes true, and notify_one() wakes up one of the threads that are waiting on the condition variable. Adding this class to the code results in the following

#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <deque>
#include <vector>

std::mutex mu;
std::condition_variable cond;
std::deque<int> buffer;

const unsigned int maxBufferSize = 50;

void producer(int val)
{
	while (val)
	{
		std::unique_lock<std::mutex> locker(mu);
		cond.wait(locker, []() { return buffer.size() < maxBufferSize; });
		buffer.push_back(val);
		std::cout << "Produced: " << val << "\n";
		val--;
		locker.unlock();
		cond.notify_one();
	}
}

void consumer()
{
	int accumulator = 0;
	while (true)
	{
		std::unique_lock<std::mutex> locker(mu);
		cond.wait(locker, []() { return buffer.size() > 0; });
		int val = buffer.back();
		buffer.pop_back();
		std::cout << "Consumed: " << val << "\n";
		accumulator =+ val;
		std::cout << "Accumulator is " << accumulator << "\n";
		locker.unlock();
		cond.notify_one();
	}
}

int main()
{
	std::thread t1(producer, 10);
	std::thread t2(consumer);

	t1.join();
	t2.join();

	return 0;
}

And its result is the following

Produced: 10
Produced: 9
Produced: 8
Produced: 7
Produced: 6
Produced: 5
Produced: 4
Produced: 3
Produced: 2
Produced: 1
[New Thread 0x7ffff7211700 (LWP 416786)]
[Thread 0x7ffff7a12700 (LWP 416785) exited]
Consumed: 1
Accumulator is 1
Consumed: 2
Accumulator is 2
Consumed: 3
Accumulator is 3
Consumed: 4
Accumulator is 4
Consumed: 5
Accumulator is 5
Consumed: 6
Accumulator is 6
Consumed: 7
Accumulator is 7
Consumed: 8
Accumulator is 8
Consumed: 9
Accumulator is 9
Consumed: 10
Accumulator is 10

Now the program runs without crashing and the consumer gets the data generated by the producer, but there is still one issue: the accumulator in the consumer does not hold the expected result. Running gdb again, setting a breakpoint on the consumer function, and setting a watchpoint on the accumulator variable results in the following

(gdb) b consumer
Breakpoint 1 at 0x25b9: file producerConsumerMkII.cpp, line 29.
(gdb) r
Starting program: /home/dev/Programs/gdb_examples/producerConsumerMkII 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7a12700 (LWP 416842)]
Produced: 10
Produced: 9
Produced: 8
Produced: 7
Produced: 6
Produced: 5
Produced: 4
Produced: 3
Produced: 2
Produced: 1
[New Thread 0x7ffff7211700 (LWP 416843)]
[Thread 0x7ffff7a12700 (LWP 416842) exited]
[Switching to Thread 0x7ffff7211700 (LWP 416843)]

Thread 3 "producerConsume" hit Breakpoint 1, consumer () at producerConsumerMkII.cpp:29
29	{
(gdb) n
30		int accumulator = 0;
(gdb) n
33			std::unique_lock<std::mutex> locker(mu);
(gdb) watch accumulator 
Hardware watchpoint 2: accumulator
(gdb) n
34			cond.wait(locker, []() { return buffer.size() > 0; });
(gdb) n
35			int val = buffer.back();
(gdb) n
36			buffer.pop_back();
(gdb) n
37			std::cout << "Consumed: " << val << "\n";
(gdb) n
Consumed: 1
38			accumulator =+ val;
(gdb) n

Thread 3 "producerConsume" hit Hardware watchpoint 2: accumulator

Old value = 0
New value = 1
consumer () at producerConsumerMkII.cpp:39
39			std::cout << "Accumulator is " << accumulator << "\n";
(gdb) n
Accumulator is 1
40			locker.unlock();
(gdb) n
41			cond.notify_one();
(gdb) n
33			std::unique_lock<std::mutex> locker(mu);
(gdb) n
42		}
(gdb) n
33			std::unique_lock<std::mutex> locker(mu);
(gdb) n
34			cond.wait(locker, []() { return buffer.size() > 0; });
(gdb) n
35			int val = buffer.back();
(gdb) n
36			buffer.pop_back();
(gdb) n
37			std::cout << "Consumed: " << val << "\n";
(gdb) n
Consumed: 2
38			accumulator =+ val;
(gdb) n

Thread 3 "producerConsume" hit Hardware watchpoint 2: accumulator

Old value = 1
New value = 2
consumer () at producerConsumerMkII.cpp:39
39			std::cout << "Accumulator is " << accumulator << "\n";
(gdb) 

We can see that the accumulator variable is set to the last value consumed instead of accumulating the results. Examining the source code, we can see that line 38 uses "=+" instead of the "+=" operator, so the statement is parsed as accumulator = +val, which is a plain assignment. After changing the line we have the following code

#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <deque>
#include <vector>

std::mutex mu;
std::condition_variable cond;
std::deque<int> buffer;

const unsigned int maxBufferSize = 50;

void producer(int val)
{
	while (val)
	{
		std::unique_lock<std::mutex> locker(mu);
		cond.wait(locker, []() { return buffer.size() < maxBufferSize; });
		buffer.push_back(val);
		std::cout << "Produced: " << val << "\n";
		val--;
		locker.unlock();
		cond.notify_one();
	}
}

void consumer()
{
	int accumulator = 0;
	while (true)
	{
		std::unique_lock<std::mutex> locker(mu);
		cond.wait(locker, []() { return buffer.size() > 0; });
		int val = buffer.back();
		buffer.pop_back();
		std::cout << "Consumed: " << val << "\n";
		accumulator += val;
		std::cout << "Accumulator is " << accumulator << "\n";
		locker.unlock();
		cond.notify_one();
	}
}

int main()
{
	std::thread t1(producer, 10);
	std::thread t2(consumer);

	t1.join();
	t2.join();

	return 0;
}

And the result is the following

Produced: 10
Produced: 9
Produced: 8
Produced: 7
Produced: 6
Produced: 5
Produced: 4
Produced: 3
Produced: 2
Produced: 1
[New Thread 0x7ffff7211700 (LWP 416901)]
[Thread 0x7ffff7a12700 (LWP 416900) exited]
Consumed: 1
Accumulator is 1
Consumed: 2
Accumulator is 3
Consumed: 3
Accumulator is 6
Consumed: 4
Accumulator is 10
Consumed: 5
Accumulator is 15
Consumed: 6
Accumulator is 21
Consumed: 7
Accumulator is 28
Consumed: 8
Accumulator is 36
Consumed: 9
Accumulator is 45
Consumed: 10
Accumulator is 55

As we can see, the output is as expected, so we have fixed all the bugs in our program using gdb.
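
One thing the final program does not do is terminate: the consumer loops forever, so it goes back to waiting once the buffer is empty and the process has to be interrupted. As a minimal sketch of one way to address this, and not part of the original example, the variant below keeps the same wait() and notify_one() pattern and adds a hypothetical done flag that tells the consumer when to leave its loop. The function name run_producer_consumer is also hypothetical, and the maxBufferSize check and the printing are left out for brevity.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical terminating variant: returns the sum of all consumed values.
int run_producer_consumer(int count)
{
    std::mutex mu;
    std::condition_variable cond;
    std::deque<int> buffer;
    bool done = false;   // guarded by mu
    int accumulator = 0; // written only by the consumer, read after join()

    std::thread producer([&]() {
        for (int val = count; val > 0; --val)
        {
            std::lock_guard<std::mutex> locker(mu);
            buffer.push_back(val);
            cond.notify_one();
        }
        // Tell the consumer that no more values will arrive.
        std::lock_guard<std::mutex> locker(mu);
        done = true;
        cond.notify_one();
    });

    std::thread consumer([&]() {
        while (true)
        {
            std::unique_lock<std::mutex> locker(mu);
            // Sleep until there is data to consume or the producer finished.
            cond.wait(locker, [&]() { return !buffer.empty() || done; });
            if (buffer.empty())
            {
                return; // producer finished and the buffer is drained
            }
            accumulator += buffer.back();
            buffer.pop_back();
        }
    });

    producer.join();
    consumer.join();
    return accumulator;
}
```

With a count of 10 this returns 55, the same final accumulator value as in the output above, and both threads exit so join() returns.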

References