As discussed in the earlier blog, it is becoming very important, in an embedded system, to ensure authenticity of the firmware before running it. Also, the system has to be made tamper proof against further hacking, especially for remotely managed internet connected IoT applications.

To prevent breach of security, the software can be strengthened with various techniques based on the underlying MCU and peripheral set. This blog discusses in particular how it can be done for iMx RT1020 based devices using the High Assurance Boot (HAB) mechanism as recommended by NXP.

Secure Boot Concepts

NXP’s HAB uses the mechanism of asymmetric encryption to protect its firmware. To give a quick introduction to asymmetric encryption, it is essentially creating a pair of keys in a way that one of the keys can encrypt the message and other can decrypt the message (and vice versa). It is mathematically impossible to use the same key used for encryption to decrypt the message. Also, with increased key sizes, it will be highly resource consuming to decrypt a message without the other pair.

Thus, with asymmetric encryption, it is enough to protect one of the keys (private key) and other can be shared (public key). The message encrypted by the private key can only be decrypted by the public key. Further if the public key (or at least its hash) is stored in a location that can not be modified such as On Time Programmable Flash, it will be impossible for any one to compromise the system. An attempt to modify the public key will be nullified because of the check with the OTP memory.

The high level of the above sequence can be capture in the below sequence diagram.

Secure Boot iMX RT 1020 HAB process

During the device provisioning process, the public and private key pairs are generated and private key is secured in the provisioning system. Hash for the public key is generated and stored in the device OTP area, which prevents further modification.

In the code signing sequence, the firmware image is hashed and encrypted using the private key. The final image generated is comprised of the firmware image, its encrypted image along with the public key and the same is programmed on to the boot memory.

During the bootup sequence, the HAB logic extracts the individual components of the signed image and validates to authenticity of the public key by comparing the computed hash and that stored in the OTP fuses. It is impossible to create public key such that the hash is same there by preventing any attempt of overriding the public key by external parties. Then it proceeds to calculate the hash of the firmware. It is compared with a hash generated by decrypting the encrypted hash using the public key. If it is a match, it proceeds to boot. If it fails in any of the place, the boot is aborted.

Code Signing for i.Mx RT1020

NXP provides all the tools necessary for generating public-private key pairs, code signing and blowing boot flashes such as MfgTool, elftosb, cst etc.

The device can be programmed using two methods: Device Boot and Secure Boot. The Device boot mode can be used during development purposes and secure boot for final programming. If the device is once programmed in Secure boot mode, it is not possible to revert back to Dev Boot mode and all further firmware has to be signed properly. The programming process is carried out by Flashloader tools such as- elftosb tool for boot image creation, Mfg tool for boot image programming.

Dev Boot Mode

To program the device, use the Mfgtool.

  • Create unsigned boot_image.sb using elftosb tool from SREC format of the application image (app.s19 file).
  • Make sure the file inside the Mfg tool is available in the name – cfg.ini
  • The content inside the file should be in the following format : chip → MXRT102X, name → MXRT102X-DevBoot
  • Import the boot_image.sb file to …/Tools/mfgtools-rel/Profiles/MXRT102X/OS Firmware from …/Tools/elftosb/linux/amd64/
  • After generating the boot_image.sb and placing it in the following directory …/Tools/mfgtools-rel/Profiles/MXRT102X/OS Firmware
  • Change the device’s boot mode into serial downloader mode and connect it to the host PC
  • Run the MfgTool and press the start button to program the target.
  • To exit MfgTool, click “Stop” and “Exit” in turn

Secure Boot Mode

To program the OTP flash once with hash of the public key, use the Mfgtool as follows

  • Check whether the device is in serial downloader mode
  • Generate the private/public keys using CST tool and create fuse.bin and fuse.table files.
  • Make sure the file inside the Mfg tool is available in the name – cfg.ini
  • The content inside the file should be in the following format : chip → MXRT102X, name → MXRT102X-Burnfuse
  • Create and import the enable_hab.sb file to the following directory …/Tools/mfgtools-rel/Profiles/MXRT102X/OS Firmware from the directory …/Flashloader_RT1020_1.0_GA/Tools/elftosb/linux/amd64/
  • After programming the above mentioned enable_hab.sb file successfully, the device will be ready for secure boot.

The above process of programming the fuse has to be executed only once. Further mode to program the device with signed image, use the Mfgtool as follows

  • Create signed boot_image.sb using elftosb tool from SREC format of the application image (app.s19 file).
  • Check whether the file inside the Mfg tool is available in the name – cfg.ini
  • The content inside the file should be in the following format : chip → MXRT102X, name → MXRT102X-SecureBoot
  • Import the signed boot_image.sb file to the following directory …/Tools/mfgtools-rel/Profiles/MXRT102X/OS Firmware from the directory …/Flashloader_RT1020_1.0_GA/Tools/elftosb/linux/amd64/

The details of the process can be obtained from NXP i.Mx1020 product page. Once secured, it will be impossible to run unauthorized software.

Same concepts can be extended to OTA updates so that the new firmware can be authenticated even before programming.

About Embien :

Embien has been actively developing IoT devices that form important part of a larger network with huge ramifications on security. We have been using advanced tools and techniques to prevent unauthorized access and tampering of the device. Get in touch with us to get your design unprecedented security.

The number of IoT devices connected to the internet are increasing rapidly and is expected to reach 75.44 billion by the end of 2025, which clearly states a fivefold increase within a decade. IoT devices are expected to become the major target for malware attacks. Even in 2019, of the overall cyber-attacks, it is estimated that IoT related breaches is about 26%.

Ranging from automobiles, medical implants to casinos – IoT related attacks have become prominent and it is essential to incorporate security features within IoT devices to help lower the increasing menace.

This blog covers a few steps that can be employed in securing Embedded Firmware for IoT Applications at a high level. While these guidelines help you understand the overall picture, it is also necessary to analyze the threats that are specific to your product and integrate them with due diligence. Some of the considerations discussed here include

  • Securing Boot Firmware
  • Secure firmware update
  • Secure data storage
  • IoT Product Physical Security

Securing Boot Firmware

The First and foremost step in securing the firmware is to protect the processor which is the heart of the system. Almost all the modern variants of the MCUs/MPUs etc., are equipped with features that are specific to security such as HAB (high assurance boo) and TPM (Trusted Platform Module). In some cases, the processor offers a Secure Boot mechanism in which it validates the sanity of the boot code before it starts executing it.

Some silicon vendors provide mechanisms where the hash of the boot code can be stored in a One Time Programmable (OTP) memory. Once the boot code is prepared, the hash is created and stored in these flashes. Upon bootup, the processor reads the complete boot code, calculates the hash and compares it with the pre-programmed value. Only if there is a match, the code will be executed.

On a more powerful system, it is possible to encrypt the hashes with a private key. The public keys can be programmed along with the boot code. The system calculates the hash and compares it with the hash that was decrypted with the public key. To authenticate the veracity of the public key, its hash is stored in the OTP flashes.

On simpler devices, when the reprogramming of boot code is envisaged, microcontrollers offer a mechanism that tend to blow fuses that will permanently lockout the code area in flash and prevent it from further writes and reads by external sources. Such mechanisms lock out the binaries and increase the security as it is difficult to reverse engineer the firmware and identify potential weaknesses.

Once the boot code is authenticated by the system, it validates the next level of firmware/applications by maintaining the chain of trust.

Secure firmware update

In many cases, during the life cycle of a product, it is required to update the firmware for specific reasons like: adding new features, fixing defects in the present release etc., Such products offer secure firmware update feature where by the new firware can be sent to the device either remotely Over The Air (OTA) or physically using memory cards. Before usage, the firmware is must be authenticated and updated in a fail-safe manner.

Modern firmware typically take advantage of asymmetric cryptographic algorithms with the private keys stored securely at the vendor facility. The public key is stored usually in a secured location within the end-product. The firmware to be updated will be signed by the private key and shared over the internet in a secured manner. The device validates the downloaded image with the public key that is available with itself.

While the above mechanism only validates the authenticity of the image, sometimes it is also essential to protect the firmware against reverse engineering. In such scenarios, the whole image is encrypted with the help of the private key. Only the end device will be able to decrypt it and program itself.

When such designs tend to improve security, it is of prime necessity to consider the flexibility of the update process as well. For example, if there is a security breach at the server, these designs should be able to revoke the present keys and replace them with another set.

Secure data storage

This section emphasizes on the importance of safe guarding the data content of the end device. These devices might be collecting sensitive information that should be protected against un-authorized access. In such cases, mechanisms such as encrypting the disk drives can be used to render the disk content unusable, unless it is decrypted by the same set of keys. The key storage should be in a safe place that is protected against cyber-attacks. Cryptographic chips are also available for storing the keys in a secured manner, that even when probing at wafer level, the data cannot be retrieved. Though the cost might go up by a few tens of cents, the amount of security these devices offer is really worth it.

Across product lines, the OEM must ensure that different keys are used for each of them. In this case, even if one of the products is compromised, it will lose effect on the others.

Also, communication with external devices, if possible, must be encrypted to prevent hacking of the system easily.

IoT Product Physical Security

Securing the IoT device physically is also a major consideration that is supposed to be taken into account while improving the security outlook. They include:

Closing Debug Ports – Debug serial ports, USB drivers etc., are used during the course of development. These features should be removed in the final release to reduce the surface of attack. In boot loaders such as Linux Uboot, break-on-user-input feature must be disabled to avoid manipulation of boot images.

JTAG/SWD connectors – Debugging tools such as debuggers and emulators that come in handy during board bring up must also be prevented from access in the field.

Firewall and closing ports – Network enabled products must run at least a minimal version of firewall. Also, closing of TCP/IP/UDP ports that are not in use is highly recommended. Using non-standard port numbers add a small level of security than that of using the standard ports.

Passwords – Many Linux systems have ubiquitous passwords such as root, admin, password, root123. These login credentials must be hardened and validated before product launch.

Mechanicals – Physical product designs can also include mechanisms that prevent easy device access as well as tamper detection mechanisms.

As an English phrase states – “a chain is as strong as its weakest link”, it is essential to ensure whether all the steps in the device operation is carefully reviewed and scrutinized. Security audits can be performed to profile the effectiveness of the design. Standards such as ISO/IEC 27001 can be a base to start with. But as mentioned earlier, these are all just outlines that aid in Securing Embedded Firmware for IoT Applications. Other device specific and domain related improvements must be built over it.

About Embien:

Embien is in the field of embedded product design since a decade and has helped customers realize products with all the features needed for today’s world.

For developing a secured embedded system for your business or to perform a security audit of your present design, get in touch with our team.

Saravana Pandian Annamalai
25. April 2019 · Write a comment · Categories: ARM, Embedded Software · Tags: , , ,

In our earlier blogs on ARM Interrupt architectures, we explored the ARM exception models and registers. Also, we went through different kinds of Interrupt controllers being used. In the upcoming blogs, we will primarily see ARM Interrupt handling from the firmware/software perspective including operating systems like FreeRTOS, Linux and WinCE. To being with, this blog will discuss interrupt handling in ARM Cortex M MCUs.

Cortex M Vector Table

As discussed earlier, the ARM Cortex M series of MCUs typically carters to lower end application with the core running between a few MHz to a maximum 150MHz. To target low cost tools and ease of development, the interrupt architecture is designed to be simpler and straight forward. The vector table in ARM Cortex M series looks like:

Cortex M Vector Table

Typically, on power-on reset, the Vector table base address is defined to be at 0. The ARM core, up on boot up, loads the stack pointer with the value stored at offset 0. And then it loads the Program counter with the address available at offset 4 and starts executing the same.

Thus, the vector table design is such that the stack is operational before the core starts executing thereby eliminating the need for an assembly code to set things up for calling functions. In the firmware perspective, this is a major advantage. There is no need to write any assembly code.

Following these 2 words, the table should hold the addresses of the exception handlers. The first 14 of them are pre-defined ad reserved for handling specific to the core and its execution. From offset, 0x40, the SoC specific interrupt handlers are defined and can be customized by the silicon vendor.

It can also be noted that there is a priority associated with each of these exceptions. Lower the number the higher the priority. Some of the major exceptions defined by ARM are

Reset Handler

At the highest priority of -3, this is the entry point of execution. Loaded to PC on power on reset, this is responsible for initializing the system peripherals and start executing the firmware/OS.

NMI Handler

At a priority only next to Reset Handler (-2), as the name suggests this cannot be masked by software. It is typically triggered by a specialized peripheral unit that can be connected to a critical functionality.

HardFault Handler

This exception is caused when there is an error during exception processing. At a higher priority of -1 than other exception, this can be used to recover from issues during exception handling.

MemManage Handler

Caused due to memory protection faults, the priority level can be configured by the firmware.

BusFault Handler

Caused during to memory access – either during instruction fetch or data access, the priority level can be configured by the firmware

UsageFault Handler

Caused during to instruction executing, the priority level can be configured by the firmware. The handler is called when one of the following errors occurs

  • Presence of an undefined instruction
  • Performing an illegal unaligned access
  • Core in invalid state on instruction execution
  • An error occurring on exception return.
  • Doing an unaligned address on word and halfword memory access
  • Performing division by zero

SVCall Handler

Known as the Supervisor Call, this handler is called up on the core executing a SVC instruction. This is typically used in OS environments to execute system services.

PendSV Handler

This is typically used in OS environments to perform context switching.

SysTick Handler

The ARM Cortex M core defines a specialize timer module to keep track of the System time. This handler is executed once this timer value reaches 0.

With this understanding of Cortex M vector table, now we will see how the firmware handles exceptions in software.

Cortex M Vector Table

To practically understand Cortex M Interrupt handling, we will take an example of software implementation of FreeRTOS running on NXP K66 MCU compiled using GCC tool chain. The first and foremost step is to define the vector table and place it in the Vector base location. The vector table for the device looks like this:

__attribute__ ((used, section(“.isr_vector”)))

void (* const g_pfnVectors[])(void) = {

// Core Level – CM4

&_vStackTop,                            // The initial stack pointer

ResetISR,                                   // The reset handler

NMI_Handler,                           // The NMI handler

HardFault_Handler,                   // The hard fault handler

MemManage_Handler,              // The MPU fault handler

BusFault_Handler,                     // The bus fault handler

UsageFault_Handler,                 // The usage fault handler

0,                                                // Reserved

0,                                                // Reserved

0,                                                // Reserved

0,                                                // Reserved

SVC_Handler,                           // SVCall handler

DebugMon_Handler,                // Debug monitor handler

0,                                               // Reserved

PendSV_Handler,                     // The PendSV handler

SysTick_Handler,                     // The SysTick handler

 

// Chip Level – MK66F18

DMA0_DMA16_IRQHandler,       // 16 : DMA Channel 0, 16 Transfer Complete

DMA1_DMA17_IRQHandler,       // 17 : DMA Channel 1, 17 Transfer Complete

DMA2_DMA18_IRQHandler,       // 18 : DMA Channel 2, 18 Transfer Complete

:

:

UART0_RX_TX_IRQHandler,      // 47 : UART0 Receive/Transmit interrupt

:

:

CAN1_Tx_Warning_IRQHandler,  // 113: CAN1 Tx warning interrupt

CAN1_Rx_Warning_IRQHandler,  // 114: CAN1 Rx warning interrupt

CAN1_Wake_Up_IRQHandler,      // 115: CAN1 wake up interrupt

 

}; /* End of g_pfnVectors */

As it can be seen, the interrupt handlers are clubbed together as an array with the address to the top of the stack (lowest address as it grows towards higher address) as the first element. This array is placed in address 0, via linker script mechanism. In this example, using the section (.isr_vector) keyword, the location of the vector table is set to 0.

Also, it can be noted that there are hundreds of interrupt handlers supported. Even this example has close to 115 handlers. Obviously, a firmware system will not have all these implemented, hardly 20-30 interrupts will be used in a system. Instead of asking the user to define handlers, the start-up code provided by the silicon vendors, will have dummy functions (a while (1) loop) defined with weak reference.

weak void NMI_Handler(void)

{ while(1) {}

}

WEAK_AV void SVC_Handler(void)

{ while(1) {}

}

So, when the developer defines an interrupt handler with the same name in his code, this weak function will be discarded and user-defined function linked against instead.

Cortex M Interrupt Handling

With the vector table installed, the functions that are needed can be added one by one in the firmware. As a first step, the Reset Handler has to be created. At typical, reset handler will look like as below.

void ResetISR(void) {

// Disable interrupts

__asm volatile (cpsid i”);

// If __USE_CMSIS defined, then call CMSIS SystemInit code

SystemInit();

//

// Copy the data sections from flash to SRAM.

//

unsigned int LoadAddr, ExeAddr, SectionLen;

unsigned int *SectionTableAddr;

// Load base address of Global Section Table

SectionTableAddr = &__data_section_table;

// Copy the data sections from flash to SRAM.

while (SectionTableAddr < &__data_section_table_end) {

LoadAddr = *SectionTableAddr++;

ExeAddr = *SectionTableAddr++;

SectionLen = *SectionTableAddr++;

data_init(LoadAddr, ExeAddr, SectionLen);

}

// At this point, SectionTableAddr = &__bss_section_table;

// Zero fill the bss segment

while (SectionTableAddr < &__bss_section_table_end) {

ExeAddr = *SectionTableAddr++;

SectionLen = *SectionTableAddr++;

bss_init(ExeAddr, SectionLen);

}

// Reenable interrupts

__asm volatile (cpsie i”);

main();

//

// main() shouldn’t return, but if it does, we’ll just enter an infinite loop

//

while (1) {

;

}

}

This minimalistic handler, disables all interrupts up on entry, configures the core and major peripherals via SystemInit function. Then it initializes the data and bss sections. Finally, it enables the interrupts before jumping to the main function.

Now we will see how a peripheral is configured for interrupt operation based on the Systick unit.

uint32_t SysTick_Config(uint32_t ticks)

{

if ((ticks – 1UL) > SysTick_LOAD_RELOAD_Msk)

{

return (1UL); /* Reload value impossible */

}

SysTick->LOAD = (uint32_t)(ticks – 1UL); /* set reload register */

NVIC_SetPriority (SysTick_IRQn, (1UL << __NVIC_PRIO_BITS) – 1UL); /* set Priority for Systick Interrupt */

SysTick->VAL = 0UL; /* Load the SysTick Counter Value */

SysTick->CTRL = SysTick_CTRL_CLKSOURCE_Msk |

SysTick_CTRL_TICKINT_Msk |

SysTick_CTRL_ENABLE_Msk; /* Enable SysTick IRQ and SysTick Timer */

return (0UL); /* Function successful */

}

As the code listing shows, the function configures various registers needed for proper operation of the Systick interrupt such as LOAD, VAL, CTRL etc. IT=t also configures the priority of the interrupt. As mentioned earlier, in Cortex M architecture, each of the interrupts has an associated priority. Depending on the implementation, there could be n number of bits corresponding to each interrupt number. Lower the number, higher the priority/urgency and can be set via the NVIC_SetPriority CMSIS API. With this mechanism, it is possible for the interrupts to be prioritized and only the higher priority one will be serviced if more than one interrupt is pending at the same time.

Typical handler for SysTick looks like:

void SysTickHandler( void )

{

/* The SysTick runs at the lowest interrupt priority, so when this interrupt

executes all interrupts must be unmasked. There is therefore no need to

save and then restore the interrupt mask value as its value is already

known. */

         portDISABLE_INTERRUPTS();

         {

                    /* Increment the RTOS tick. */

                    if( xTaskIncrementTick() != pdFALSE )

                   {

                                  /* A context switch is required. Context switching is performed in the PendSV interrupt. Pend the PendSV interrupt. */

                            portNVIC_INT_CTRL_REG = portNVIC_PENDSVSET_BIT;

                  }

         }

     portENABLE_INTERRUPTS();

}

Cortex M Interrupt Handling via Assembly

So far, we have not writing a single line of assembly code and still able to do all the required functionality in software. In some cases, there will be a need to do assembly code. In these cases, the handler can be defined as naked function (Which will not generate any stack push/pop and return codes) and implement them in assembly.

For example, in FreeRTOS, the SVC Handler implementation is as below:

void vPortSVCHandler( void ) __attribute__ (( naked ));

void vPortSVCHandler( void )

{

     __asm volatile (

     ldr r3, pxCurrentTCBConst2 \n” /* Restore the context. */

     ldr r1, [r3] \n” /* Use pxCurrentTCBConst to get the pxCurrentTCB address. */

     ldr r0, [r1] \n” /* The first item in pxCurrentTCB is the task top of stack. */

     ldmia r0!, {r4-r11, r14} \n” /* Pop the registers that are not automatically saved on exception entry and the critical nesting count. */

     msr psp, r0 \n” /* Restore the task stack pointer. */

     isb \n”

     mov r0, #0 \n”

     msr basepri, r0 \n”

     bx r14 \n”

     ” \n”

     ” .align 4 \n”

     “pxCurrentTCBConst2: .word pxCurrentTCB \n”

     );

}

Thus with compiler extensions, it is possible to include assembly based interrupt handling as well.

Switching Vector tables

In the above examples, we have noted that the vector table is located at address 0. But there are cases, where we will need to place it at a different location. For example, the flash location could be at address 0, and the RAM, where the user wants to place the vector table could be elsewhere. Or it could be a dual A/B firmware or a bootloader/application firmware each sitting at a different location. So, based on the current code being executed, the vector table location differs.

ARM provides a simple mechanism to switch the base address of the vector table. It provides a Vector table offset register at address 0xE000ED08 in the NVIC group, where the base address can be programmed. For example, to switch the vector location to address 0x20000000, following lines will suffice.