Difference between revisions of "How to enable A/B redundancy in TX2"

From RidgeRun Developer Connection

Revision as of 16:00, 7 February 2020

Introduction

A/B redundancy lets a system recover from a failure in one of its system partitions. Each partition has a mirror, and if the main partition fails, the system falls back to the mirror (or recovery) partition.

Nvidia Jetson TX2 has this capability. The main partitions do not have any suffix, whereas the recovery partitions have a _b suffix. For example:

kernel-dtb: Main partition
kernel-dtb_b: Recovery partition

However, the Jetson TX2 ships with A/B redundancy disabled by default, so the partitions with the _b suffix are not used when a main partition fails.

This article describes how to enable A/B redundancy and how to recover a TX2 after a partition failure. Recovery only works if A/B redundancy was enabled beforehand, which requires flashing the new C-Boot configuration using flash.sh.

General information

Relevant directories

Jetpack is organized into directories. The following names are used throughout this article to refer to them:

1. Jetpack installation directory: $JETPACKDIR

2. Bootloader directory: JETSON_BOOTLOADER=$JETPACKDIR/64_TX2/Linux_for_Tegra/bootloader

Each directory holds the following files:

1. $JETPACKDIR: It contains all the Jetpack files, the Nvidia flashing binaries, and the files that Tegra boards need to work properly. These files live in a folder named 64_TX2 inside it.

2. $JETSON_BOOTLOADER: It contains the bootable image with the bootloader (./boot.img), the filesystem packed in an image file (./system.img), and the device tree that is transferred to the encrypted partition (if you do not define the device tree in the boot script configuration, the TX2 uses the DTB from this partition). It also contains the binaries used for signing, encrypting, enabling A/B redundancy, and writing files to the Jetson Tegra board.
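
Before going further, it can help to confirm that the bootloader directory really holds the files used in the rest of this article. The following is only a minimal sketch: check_bootloader_dir is a hypothetical helper name, not an Nvidia tool, and it only looks for smd_info.cfg and nv_smd_generator.

```shell
#!/bin/sh
# Sketch: verify that a bootloader directory holds the files used later on.
# check_bootloader_dir is a hypothetical helper name, not an Nvidia tool.
check_bootloader_dir() {
    dir="$1"
    # smd_info.cfg is the A/B configuration file edited in this article.
    [ -f "$dir/smd_info.cfg" ] || { echo "missing: $dir/smd_info.cfg" >&2; return 1; }
    # nv_smd_generator must be executable to build slot_metadata.bin.
    [ -x "$dir/nv_smd_generator" ] || { echo "missing: $dir/nv_smd_generator" >&2; return 1; }
    echo "ok: $dir"
}
```

A typical call would be check_bootloader_dir "$JETSON_BOOTLOADER" after exporting both variables.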

Test environment

All the steps and procedures below were tested on a Jetson Tegra X2 with the following versions of Jetpack:

Jetpack 3.2.1
Jetpack 3.3

Enabling the A/B redundancy

A/B redundancy is enabled or disabled through a configuration file, $JETSON_BOOTLOADER/smd_info.cfg, which is likely to have the following settings:

...
# SMD metadata information
< VERSION 3 >

#
# Config 1: Disable A/B support (Default)
#

# slot info order is important!
# <priority>    <suffix>     <retry_count>  <boot_successful>
15                  _a          7               1

#
# Config 2: Enable redundancy support (by removing comments ##)
#
##< REDUNDANCY_USER 1 >

# slot info order is important!
# <priority>    <suffix>     <retry_count>  <boot_successful>
##15                  _a          7               1
##14                  _b          7               1

Config 1 disables the redundancy and config 2 enables it. To enable the redundancy, uncomment the lines that start with ## in config 2 and comment out the slot line of config 1. The resulting settings look like the following:

# SMD metadata information
< VERSION 3 >

#
# Config 1: Disable A/B support (Default)
#

# slot info order is important!
# <priority>    <suffix>     <retry_count>  <boot_successful>
##15                  _a          7               1

#
# Config 2: Enable redundancy support (by removing comments ##)
#
< REDUNDANCY_USER 1 >

# slot info order is important!
# <priority>    <suffix>     <retry_count>  <boot_successful>
15                  _a          7               1
14                  _b          7               1
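
The same edit can be scripted. This is a sketch that assumes the file is still in its default state as listed above; enable_ab_config is a hypothetical helper name. The first sed rule comments out any active slot line, the t command skips the remaining rules for a line that was just commented, and the last rule uncomments the ## lines of config 2.

```shell
#!/bin/sh
# Sketch: switch smd_info.cfg text from Config 1 to Config 2.
# Reads the file on stdin and writes the edited text to stdout.
# enable_ab_config is a hypothetical helper name.
enable_ab_config() {
    sed -E \
        -e 's/^([0-9])/##\1/' \
        -e 't' \
        -e 's/^##([<0-9])/\1/'
}
```

For example, enable_ab_config < smd_info.cfg > smd_info.cfg.new writes the edited copy to a new file so the original can be kept as a backup. Run it only once on the default file: a second pass would comment out the slot lines again.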

After modifying smd_info.cfg, the next step is to generate the slot metadata file, slot_metadata.bin (C-Boot uses this file to control the boot partitions). To do so, run the following from $JETSON_BOOTLOADER:

cd $JETSON_BOOTLOADER
sudo ./nv_smd_generator smd_info.cfg slot_metadata.bin

Finally, run the Nvidia flashing tool from the Linux_for_Tegra directory (the parent of $JETSON_BOOTLOADER):

cd $JETSON_BOOTLOADER/..
sudo ./flash.sh jetson-tx2 mmcblk0p1

To check whether the process succeeded, run on the TX2:

sudo nvbootctrl dump-slots-info

It should show the settings that you selected in smd_info.cfg. For example:

slot: 0,             priority: 15,             suffix: _a,             retry_count: 7,             boot_successful: 1
slot: 1,             priority: 14,             suffix: _b,             retry_count: 7,             boot_successful: 1
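
This check can also be automated. The sketch below assumes the dump-slots-info line format shown above and only tests whether both the _a and the _b slots are listed; has_both_slots is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: succeed only if both the _a and _b slots are listed on stdin.
# has_both_slots is a hypothetical helper name; the line format is assumed
# to match the dump-slots-info output shown in this article.
has_both_slots() {
    # Fields after splitting on ":" or "," plus spaces:
    # $1=slot $2=<n> $3=priority $4=<p> $5=suffix $6=<suffix> ...
    awk -F'[:,] *' '
        /^slot:/ && $6 == "_a" { a = 1 }
        /^slot:/ && $6 == "_b" { b = 1 }
        END { exit !(a && b) }
    '
}
```

On the board it could be used as sudo nvbootctrl dump-slots-info | has_both_slots && echo "A/B slots present".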

TX2 A/B redundancy works on groups of partitions called "slots". A slot contains all the partitions that the TX2 needs to boot properly. The TX2 has two slots: slot 0 for the principal partitions and slot 1 for the recovery partitions. The principal and recovery partitions differ only in their suffix. For example, the principal partition that stores the DTB is kernel-dtb and its recovery partition is kernel-dtb_b. The _b suffix indicates a recovery partition.

The priority order defines which slot is used for booting. In the example above, slot 0 has the higher priority (15), so C-Boot uses it during the boot process.

Recovering the system after a partition failure

If A/B redundancy is enabled and a principal partition (for example, kernel-dtb) gets broken, the TX2 falls back to the recovery partition (kernel-dtb_b), which holds the same content that the main partition originally had. However, this fallback disables the principal partition indefinitely. To go back to the principal partition after fixing it, re-enable it on the target board (TX2) using:

SLOT=0
nvbootctrl set-active-boot-slot $SLOT

nvbootctrl set-active-boot-slot re-enables a slot. SLOT=0 selects the principal partitions and SLOT=1 selects the redundant (recovery) partitions.
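
The two steps (re-enable, then inspect) can be wrapped in a small function. This is a sketch, not an Nvidia-provided script: reenable_slot and the NVBOOTCTRL variable are hypothetical names, and only the two nvbootctrl subcommands are taken from this article.

```shell
#!/bin/sh
# Sketch: re-enable a slot and show the resulting slot table.
# reenable_slot and the NVBOOTCTRL variable are hypothetical names;
# only the two nvbootctrl subcommands come from this article.
NVBOOTCTRL="${NVBOOTCTRL:-sudo nvbootctrl}"

reenable_slot() {
    slot="$1"
    # The TX2 has exactly two slots: 0 (principal) and 1 (recovery).
    case "$slot" in
        0|1) ;;
        *) echo "usage: reenable_slot 0|1" >&2; return 2 ;;
    esac
    # Unquoted on purpose so "sudo nvbootctrl" splits into two words.
    $NVBOOTCTRL set-active-boot-slot "$slot" || return 1
    $NVBOOTCTRL dump-slots-info
}
```

Calling reenable_slot 0 on the board re-enables the principal slot and prints the slot table so the new retry_count can be checked right away.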

Use case

Suppose that kernel-dtb gets broken after an upgrade attempt. When the TX2 reboots, it falls back to kernel-dtb_b and the system is usable again. After fixing kernel-dtb, the user reboots the TX2, but it keeps falling back to kernel-dtb_b.

To see what happened:

sudo nvbootctrl dump-slots-info

Giving:

slot: 0,             priority: 15,             suffix: _a,             retry_count: 0,             boot_successful: 0
slot: 1,             priority: 14,             suffix: _b,             retry_count: 7,             boot_successful: 1

Note that retry_count and boot_successful are both zero for slot 0, so slot 0 will not be used for booting.
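
A disabled slot can be spotted automatically by looking for this pattern. The sketch below assumes the dump-slots-info line format shown above; exhausted_slots is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the number of every slot whose retry_count and
# boot_successful are both zero. exhausted_slots is a hypothetical name;
# the line format is assumed to match the dump-slots-info output above.
exhausted_slots() {
    # After splitting on ":" or "," plus spaces:
    # $2 = slot number, $8 = retry_count, $10 = boot_successful
    awk -F'[:,] *' '/^slot:/ && $8 + 0 == 0 && $10 + 0 == 0 { print $2 }'
}
```

On the board, sudo nvbootctrl dump-slots-info | exhausted_slots prints nothing when every slot is still eligible for booting.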

To solve this issue, the user has to enable slot 0 and mark it bootable once more:

SLOT=0
nvbootctrl set-active-boot-slot $SLOT

This leads to:

slot: 0,             priority: 15,             suffix: _a,             retry_count: 7,             boot_successful: 0
slot: 1,             priority: 14,             suffix: _b,             retry_count: 7,             boot_successful: 1

A nonzero retry_count gives slot 0 another try on the next boot. If it boots properly, the TX2 keeps booting from slot 0 and uses kernel-dtb again.

Conclusion

A/B redundancy is disabled by default on the TX2. To enable it, modify smd_info.cfg, run nv_smd_generator, and flash the TX2. If you do not want to rebuild the filesystem, you can pass the -r parameter to flash.sh.

The following commands are useful on the TX2 to verify the status of the A/B redundancy:

nvbootctrl get-current-slot: It shows the current slot
nvbootctrl set-active-boot-slot $SLOT: It chooses the $SLOT for the next boot.
nvbootctrl dump-slots-info: It shows all slots details.

Links

Gstreamer pipelines for Tegra X2
Tegra X2 or TX2
