In consumer products, the consequences of a failed over-the-air update are low. But can such updates also be used in a high-reliability, safety-critical environment such as automotive ECUs?
‘Software Update is available for your device and is ready to install.’ Whether it appears on your smartphone, computer or TV, we all know this type of notification. It is common practice to extend and supplement the functionality of our smart devices after the product has been shipped and put into service. This practice is generally referred to as over-the-air (OTA) updates, as the updates are delivered remotely to the device over a communications interface. Whereas apps are designed as packaged programs, software updates fundamentally alter the product's functionality. If an app causes the product to stop functioning, a reboot will typically restore it to full working order. But if a software update is corrupted or contains a bug, it could stop the product from working altogether. In this case, a reboot is unlikely to fix the issue, which is why the correct implementation of OTA updates is crucial.
Many electronic consumer products, such as cell phones, tablets, smart TVs and set-top boxes, can now run standalone apps. However, a much larger proportion of connected devices, not equipped to run standalone apps, will be designed to accept an OTA update to their software. These include products designed for the industrial, medical and automotive sectors. Such high-reliability applications must balance the convenience of OTA updates against the risk of permanently disabling a device. Commonly referred to as 'bricking', this failure turns what was once a useful electronic product into something that simply takes up space.
Necessity for Strict Processes
To avoid bricking a device with a failed OTA update, engineers need to identify potential points of failure and create a system-level architecture that mitigates them. The key components of this architecture are a telematics unit, a gateway or manager, and a client. The telematics unit is alerted to a new software update. It then validates the source and initiates a secure connection to the server to download the file. Once the file is received, the telematics unit passes the software to the gateway/manager, which prepares it for the client. In most cases, the client, such as an ECU in an automotive application, is equipped to install the update itself by overwriting or replacing the software stored in its non-volatile memory.
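The hand-off between these three components can be sketched in a few lines of Python. This is only an illustration of the flow described above, not an actual automotive implementation: the class names, the `TRUSTED_SOURCES` allowlist, the `ota.example.com` host and the `download` callback are all hypothetical, and the secure connection is abstracted away.

```python
from dataclasses import dataclass, field

# Hypothetical allowlist of update servers the telematics unit will accept.
TRUSTED_SOURCES = {"ota.example.com"}


@dataclass
class Client:
    """Stands in for an ECU with its own non-volatile program memory."""
    name: str
    nvm: bytes = b"old-firmware"

    def install(self, image: bytes) -> None:
        # Overwrite the stored software with the new image.
        self.nvm = image


@dataclass
class Gateway:
    """OTA manager: routes an image to the client it is intended for."""
    clients: dict = field(default_factory=dict)

    def deliver(self, target: str, image: bytes) -> None:
        self.clients[target].install(image)


@dataclass
class TelematicsUnit:
    gateway: Gateway

    def on_update_available(self, source: str, target: str, download) -> bool:
        # Step 1: validate the source before opening any connection.
        if source not in TRUSTED_SOURCES:
            return False
        # Step 2: download the file (the secure channel is not modeled here).
        image = download(source)
        # Step 3: hand the image to the gateway/manager for the client.
        self.gateway.deliver(target, image)
        return True
```

An update announced by an unknown source is rejected before any download takes place, so the client's stored software is left untouched.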
This can be the most crucial part of the OTA process for safety-related applications. Often, the update must be installed without stopping the process the client is controlling. In other cases, the client can schedule the update for a time when it is not operating. Either way, the process needs careful consideration.
The automotive industry is a great example of a high-reliability application that must adhere to a strict process when implementing OTA updates. Under no circumstances can an ECU be allowed to brick, either during or after an OTA update. Security is critical at every stage of the process. This typically means using encryption and authentication, with keys and certificates stored in a secure, tamper-proof way.
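As a minimal sketch of the authentication step, the snippet below checks an image against a tag before it is accepted. It makes a simplifying assumption: a shared-key HMAC-SHA256 from the Python standard library stands in for the certificate-based signatures a production system would more likely use, and the key is passed around as a plain value, whereas a real ECU would keep it in secure hardware. The function names `sign_image` and `verify_image` are hypothetical.

```python
import hashlib
import hmac


def sign_image(key: bytes, image: bytes) -> bytes:
    """Server side: compute an authentication tag over the image."""
    return hmac.new(key, image, hashlib.sha256).digest()


def verify_image(key: bytes, image: bytes, tag: bytes) -> bool:
    """Client side: recompute the tag and compare it in constant time."""
    expected = hmac.new(key, image, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

A client would call `verify_image` on the complete download and refuse to install anything for which it returns `False`, whether the cause is corruption in transit or an image from an untrusted party.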
Methods to Implement OTA Updates
There are several ways to execute an OTA update, depending on the size of the software image, the processing elements available and the available non-volatile program memory. Whichever method is used, the product (such as an ECU) will need to start executing the new software image after the update. This typically means starting from a known point and, in the majority of cases, a restart.
Some microcontrollers (MCUs) now include features that support these various options.
The table below provides an overview of the different approaches and their relative advantages.
Table 1: The comparative features and benefits of OTA approaches
Enabling Over-the-Air Updates
The NXP S32K3 family of automotive-qualified MCUs has been designed to enable OTA functionality in demanding applications. The architecture features include:
- Dual-bank on-chip Flash memory that supports read-while-write, enabling both the in-place and A/B swap methods of software updates
- Multicore architecture, allowing one Arm® Cortex®-M7 core to continue executing from one part of the Flash memory while the second Arm Cortex-M7 core handles the new software download, authentication and storage in the other part of the Flash memory
- A hardware security engine (HSE), providing encryption and decryption through both symmetrical and asymmetrical ciphers, including AES-128/256 and RSA, to maintain secure communications between the client and the OTA manager
- LIN, CAN and Ethernet support
- Hardware features dedicated to OTA support: OTA indicators, a monitor to detect and record any loss of communication during a software exchange, and the ability to roll back the software based on the status of the OTA indicators
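To make the A/B swap and roll-back idea concrete, the following Python model simulates a dual-bank memory with a single status flag. It is a conceptual sketch only and does not represent the S32K3 hardware or any NXP API: the `DualBankFlash` class, the `update_complete` indicator, and the `link_ok` and `boots_ok` callbacks are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class DualBankFlash:
    banks: Dict[str, Optional[bytes]] = field(
        default_factory=lambda: {"A": b"v1", "B": None}
    )
    active: str = "A"
    update_complete: bool = False  # hypothetical OTA indicator

    @property
    def passive(self) -> str:
        return "B" if self.active == "A" else "A"

    def stage(self, image: bytes, link_ok: Callable[[], bool]) -> None:
        """Write the new image to the passive bank; the active bank keeps running."""
        self.update_complete = False
        if not link_ok():
            # Communication was lost: leave the indicator cleared.
            return
        self.banks[self.passive] = image
        self.update_complete = True

    def restart(self, boots_ok: Callable[[bytes], bool]) -> str:
        """Swap banks on restart, then roll back if the new image fails."""
        if not self.update_complete:
            return self.active  # nothing valid was staged
        previous = self.active
        self.active = self.passive  # tentative swap to the new image
        if not boots_ok(self.banks[self.active]):
            self.active = previous  # roll back to the known-good bank
        self.update_complete = False
        return self.active
```

Because the previous image is never overwritten until a later update targets its bank, an interrupted transfer or a failed start leaves a working image to return to, which is the property that guards against bricking.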
In Summary
Pervasive connectivity has made OTA updates standard practice. Yet in safety-critical applications, particularly automotive, they need to be applied with great care. There are many challenges involved, such as maintaining reliable operation during an update, ensuring end-to-end security throughout the process and providing a robust way to prevent the client from bricking.
These challenges can be overcome by using the most appropriate hardware. Managing OTA becomes simpler when hardware and software are designed to work together. This means the benefits of OTA updates can be extended to any application.
To learn more, visit our contributed article on the eeNews website.