Embedded Software Development
Introduction
Embedded software runs on resource-constrained devices and is tightly coupled to the hardware it controls. This article covers bare-metal programming, real-time operating systems (RTOS), hardware abstraction layers, bootloaders, firmware updates, cross-compilation, and debugging techniques.
Related content: Embedded Systems
1. Bare-Metal Programming
1.1 Register-Level Operations
Embedded programming often involves direct manipulation of hardware registers:
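// STM32 GPIO register-level operations (F4-series addresses)
// Assumes the GPIOA peripheral clock has already been enabled in RCC
#include <stdint.h>
#define GPIOA_BASE 0x40020000U
#define GPIOA_MODER (*(volatile uint32_t *)(GPIOA_BASE + 0x00))
#define GPIOA_ODR (*(volatile uint32_t *)(GPIOA_BASE + 0x14))
// Set PA5 as output mode
GPIOA_MODER &= ~(3U << (5 * 2)); // clear the 2-bit mode field for pin 5
GPIOA_MODER |= (1U << (5 * 2)); // set to general-purpose output (01)
// Drive the LED on PA5 (assuming it is wired active-high)
GPIOA_ODR |= (1U << 5); // set high: LED on
GPIOA_ODR &= ~(1U << 5); // set low: LED off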
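The volatile keyword tells the compiler that a value may change outside the program's visible control flow (for example, through hardware or an interrupt handler). Every read and write in the source must therefore be performed as written, rather than cached in a CPU register, merged, or optimized away. Note that volatile does not make an access atomic.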
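The read-modify-write pattern above can be wrapped in small helpers. The following is a minimal sketch, assuming a MODER-style register with one 2-bit field per pin; the names `gpio_set_mode` and `gpio_write_pin` are hypothetical and not part of any vendor library. The helpers take a pointer to the register, so the same code can be exercised on a PC against an ordinary variable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of any vendor library. */

/* Set the 2-bit mode field that belongs to `pin` in a MODER-style register. */
static inline void gpio_set_mode(volatile uint32_t *moder, unsigned pin, uint32_t mode)
{
    assert(pin < 16u && mode <= 3u);
    uint32_t value = *moder;                  /* exactly one read */
    value &= ~(UINT32_C(3) << (pin * 2u));    /* clear the field */
    value |= mode << (pin * 2u);              /* insert the new mode */
    *moder = value;                           /* exactly one write */
}

/* Drive one pin of an ODR-style register high (level != 0) or low. */
static inline void gpio_write_pin(volatile uint32_t *odr, unsigned pin, int level)
{
    assert(pin < 16u);
    if (level) {
        *odr |= UINT32_C(1) << pin;
    } else {
        *odr &= ~(UINT32_C(1) << pin);
    }
}
```

On the target, a call would look like `gpio_set_mode(&GPIOA_MODER, 5, 1)`. Building the new value in a local variable keeps the hardware access to a single read and a single write, which is usually what is wanted for a configuration register.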
1.2 Startup Code
The first code executed after MCU power-on:
1. Vector Table
- Initial stack pointer (SP)
- Reset Handler address
- Interrupt handler addresses
2. Reset Handler:
a. Initialize .data section (copy from Flash to RAM)
b. Zero-fill .bss section
c. Initialize system clock
d. Call main()
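// Simplified startup code
#include <stdint.h>
extern uint32_t _sdata, _edata, _sidata; // symbols defined by the linker script
extern uint32_t _sbss, _ebss;
extern void SystemInit(void); // clock setup, typically supplied by the vendor's device support files
extern int main(void);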
void Reset_Handler(void) {
// Copy .data section
uint32_t *src = &_sidata;
uint32_t *dst = &_sdata;
while (dst < &_edata) *dst++ = *src++;
// Zero-fill .bss section
dst = &_sbss;
while (dst < &_ebss) *dst++ = 0;
// Initialize system
SystemInit();
// Enter main
main();
// main should not return
while (1);
}
1.3 Interrupt Handling
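// Flag shared between the ISR and the main loop
static volatile uint8_t button_pressed;
// Interrupt Service Routine (ISR)
void EXTI0_IRQHandler(void) {
if (EXTI->PR & (1U << 0)) { // check pending flag for line 0
// Handle interrupt: only record the event, process it in the main loop
button_pressed = 1;
EXTI->PR = (1U << 0); // clear pending flag (write 1 to clear)
}
}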
// Interrupt priority configuration (NVIC)
NVIC_SetPriority(EXTI0_IRQn, 2); // priority 2
NVIC_EnableIRQ(EXTI0_IRQn); // enable interrupt
ISR writing principles:
- Keep it as short as possible: only set flags; defer complex processing to the main loop
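- No blocking operations: do not call printf, malloc, or sleep
- Be aware of reentrancy and shared data: declare variables shared with the main loop as volatile, and protect multi-word updates (for example, by briefly disabling interrupts), because volatile alone does not make an access atomic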
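Deferring work to the main loop means the ISR needs somewhere to hand its data over. One common option is a single-producer, single-consumer ring buffer. The sketch below uses hypothetical names and assumes a single-core MCU where aligned 32-bit loads and stores are atomic (as on Cortex-M), with exactly one ISR calling `rb_put` and only the main loop calling `rb_get`. A multi-core system would need proper atomics or memory barriers instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RB_SIZE 16u   /* must be a power of two so the index wrap stays consistent */

typedef struct {
    volatile uint8_t  data[RB_SIZE];
    volatile uint32_t head;   /* written only by the ISR (producer) */
    volatile uint32_t tail;   /* written only by the main loop (consumer) */
} ring_buffer_t;

/* Call from the ISR: stores one byte, or returns false if the buffer is full. */
static bool rb_put(ring_buffer_t *rb, uint8_t byte)
{
    uint32_t head = rb->head;
    if (head - rb->tail == RB_SIZE) {
        return false;                    /* full: caller decides whether to drop */
    }
    rb->data[head % RB_SIZE] = byte;
    rb->head = head + 1u;                /* publish only after the data is written */
    return true;
}

/* Call from the main loop: fetches one byte, or returns false if empty. */
static bool rb_get(ring_buffer_t *rb, uint8_t *out)
{
    assert(out != NULL);
    uint32_t tail = rb->tail;
    if (tail == rb->head) {
        return false;                    /* empty */
    }
    *out = rb->data[tail % RB_SIZE];
    rb->tail = tail + 1u;                /* release the slot after reading it */
    return true;
}
```

Because each index has a single writer, neither side needs to disable interrupts. The ISR stays short (one comparison, one store, one increment), and parsing or logging happens later in the main loop.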
2. Real-Time Operating Systems (RTOS)
2.1 FreeRTOS
FreeRTOS is one of the most widely used embedded RTOSes. It provides task scheduling, synchronization, and inter-task communication primitives.
Task Creation:
void vLEDTask(void *pvParameters) {
while (1) {
HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_5);
vTaskDelay(pdMS_TO_TICKS(500)); // delay 500ms
}
}
void vSensorTask(void *pvParameters) {
while (1) {
float temp = read_temperature();
xQueueSend(xTempQueue, &temp, portMAX_DELAY);
vTaskDelay(pdMS_TO_TICKS(1000));
}
}
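int main(void) {
// xTempQueue must be created before the scheduler starts (see Queues below)
xTaskCreate(vLEDTask, "LED", 128, NULL, 1, NULL); // stack depth is given in words, not bytes
xTaskCreate(vSensorTask, "Sensor", 256, NULL, 2, NULL);
vTaskStartScheduler(); // start scheduler; returns only if it could not be started
for (;;); // should never be reached
}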
Queues:
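// Create a queue holding up to 10 floats (returns NULL if allocation fails)
QueueHandle_t xTempQueue = xQueueCreate(10, sizeof(float));
// Send (gives up after 100 ms if the queue is still full)
float temp = 25.5f;
xQueueSend(xTempQueue, &temp, pdMS_TO_TICKS(100));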
// Receive
float received;
if (xQueueReceive(xTempQueue, &received, pdMS_TO_TICKS(1000)) == pdTRUE) {
process_temperature(received);
}
Semaphores:
// Binary semaphore: for interrupt-to-task synchronization
SemaphoreHandle_t xBinarySem = xSemaphoreCreateBinary();
// Release in ISR
void UART_IRQHandler(void) {
BaseType_t xHigherPriorityTaskWoken = pdFALSE;
xSemaphoreGiveFromISR(xBinarySem, &xHigherPriorityTaskWoken);
portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
}
// Wait in task
void vUARTTask(void *pvParameters) {
while (1) {
xSemaphoreTake(xBinarySem, portMAX_DELAY);
process_uart_data();
}
}
Mutexes:
SemaphoreHandle_t xMutex = xSemaphoreCreateMutex();
void safe_write(const char *msg) {
xSemaphoreTake(xMutex, portMAX_DELAY);
uart_send(msg); // critical section
xSemaphoreGive(xMutex);
}
2.2 Zephyr RTOS
Zephyr is a modern embedded RTOS hosted by the Linux Foundation:
- Supports hundreds of development boards
- Integrated connectivity stacks (Bluetooth LE, Wi-Fi, Thread, LTE)
- Devicetree for hardware description
- Kconfig configuration system
- Built-in POSIX compatibility layer
3. Hardware Abstraction Layer (HAL)
HAL encapsulates hardware operations behind a unified interface, improving code portability:
// HAL interface definition
typedef struct {
void (*init)(void);
void (*write)(uint8_t *data, uint16_t len);
uint16_t (*read)(uint8_t *buf, uint16_t max_len);
} uart_driver_t;
// STM32 implementation
void stm32_uart_init(void) { /* ... */ }
void stm32_uart_write(uint8_t *data, uint16_t len) { /* ... */ }
uint16_t stm32_uart_read(uint8_t *buf, uint16_t max_len) { /* ... */ }
uart_driver_t stm32_uart = {
.init = stm32_uart_init,
.write = stm32_uart_write,
.read = stm32_uart_read,
};
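// Application layer uses the unified interface (strlen requires <string.h>)
void app_send_message(uart_driver_t *uart, const char *msg) {
// The cast drops const because write() takes a non-const pointer
uart->write((uint8_t *)msg, (uint16_t)strlen(msg));
}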
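One practical benefit of this indirection is that application code can be tested on a PC by swapping in a mock driver. The sketch below is a hypothetical host-side mock (`mock_uart` and its buffer are assumptions made for illustration); it repeats the interface definition so the block compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same interface as above, repeated so this block is self-contained. */
typedef struct {
    void     (*init)(void);
    void     (*write)(uint8_t *data, uint16_t len);
    uint16_t (*read)(uint8_t *buf, uint16_t max_len);
} uart_driver_t;

/* Hypothetical mock: "transmitted" bytes are captured in a RAM buffer. */
static uint8_t  mock_tx[64];
static uint16_t mock_tx_len;

static void mock_uart_init(void) { mock_tx_len = 0; }

static void mock_uart_write(uint8_t *data, uint16_t len)
{
    assert(data != NULL);
    size_t space = sizeof mock_tx - mock_tx_len;
    if (len > space) {
        len = (uint16_t)space;            /* truncate instead of overflowing */
    }
    memcpy(&mock_tx[mock_tx_len], data, len);
    mock_tx_len = (uint16_t)(mock_tx_len + len);
}

static uint16_t mock_uart_read(uint8_t *buf, uint16_t max_len)
{
    (void)buf;
    (void)max_len;
    return 0;                             /* this mock never receives data */
}

static uart_driver_t mock_uart = {
    .init  = mock_uart_init,
    .write = mock_uart_write,
    .read  = mock_uart_read,
};

/* Application code under test: depends only on the interface. */
static void app_send_message(uart_driver_t *uart, const char *msg)
{
    uart->write((uint8_t *)msg, (uint16_t)strlen(msg));
}
```

A host test can call `app_send_message(&mock_uart, "hello")` and then inspect `mock_tx` and `mock_tx_len`, with no hardware attached.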
4. Bootloader
The bootloader is the first firmware to run after reset. It performs minimal hardware initialization, installs a pending update if one is requested, and then verifies and starts the application firmware.
Flash Layout:
┌─────────────────┐ 0x08000000
│ Bootloader │ 16-64KB
├─────────────────┤
│ App Header │ Metadata (version, CRC, size)
├─────────────────┤
│ Application │ Main application firmware
├─────────────────┤
│ OTA Staging │ New firmware staging area (optional)
└─────────────────┘
Bootloader startup flow:
1. Hardware initialization (clock, GPIO)
2. Check for firmware update request
├── Yes → Verify the staged image, copy it from staging to the app area, then verify the copy (CRC)
└── No → Continue
3. Verify application firmware integrity (CRC/signature)
├── Pass → Continue to step 4
└── Fail → Stay in the bootloader and wait for re-flashing
4. Load the stack pointer from the application's vector table, then jump to the application's Reset Handler
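The integrity check in step 3 can be sketched as follows. The header layout, the `APP_MAGIC` value, and the function names are assumptions made for illustration; real projects define their own header format. The CRC is the common CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit to avoid a lookup table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical app header; the fields mirror "version, CRC, size" from the layout. */
typedef struct {
    uint32_t magic;     /* marks a programmed header */
    uint32_t version;
    uint32_t size;      /* payload size in bytes */
    uint32_t crc32;     /* CRC-32 of the payload */
} app_header_t;

#define APP_MAGIC 0x41505031u   /* arbitrary value chosen for this sketch */

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320. */
static uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* XOR in the polynomial only when the low bit is set */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Returns true only if the header is sane and the payload CRC matches. */
static bool app_image_is_valid(const app_header_t *hdr, const uint8_t *payload,
                               size_t max_size)
{
    assert(hdr != NULL && payload != NULL);
    if (hdr->magic != APP_MAGIC) {
        return false;                       /* erased or foreign flash */
    }
    if (hdr->size == 0u || hdr->size > max_size) {
        return false;                       /* never read past the app region */
    }
    return crc32_compute(payload, hdr->size) == hdr->crc32;
}
```

Checking the size before computing the CRC matters: a corrupted header could otherwise make the bootloader read far outside the application region. A CRC only detects accidental corruption; it does not replace signature verification. The jump itself (step 4) is target-specific and is not shown here.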
5. Firmware Updates (OTA)
OTA (Over-The-Air) allows remote firmware updates.
5.1 Update Strategies
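| Strategy | Description | Reliability and trade-offs |
|---|---|---|
| A/B Partitioning | Two firmware partitions used alternately | High (rollback possible); needs roughly twice the flash |
| Delta Updates | Only transmit the changed portions | Medium; saves bandwidth, but the patch must match the installed version |
| Compressed Updates | Compress the full firmware | Medium; smaller download, but the device must decompress it |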
5.2 A/B Partition Update Flow
1. Device runs current firmware from Partition A
2. Download new firmware to Partition B
3. Verify Partition B integrity (CRC + signature)
4. Mark Partition B as pending boot
5. Reboot → Bootloader loads Partition B
6. New firmware self-test
├── Success → Confirm, Partition B becomes active
└── Failure → Rollback to Partition A
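The slot-selection logic behind steps 4 to 6 can be sketched as a small state machine. The `boot_state_t` structure and all function names are hypothetical; the sketch assumes the state is kept in a small persistent region that both the bootloader and the application can update, and that one unconfirmed trial boot is allowed before rolling back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SLOT_A = 0, SLOT_B = 1 } slot_t;

/* Hypothetical boot metadata, assumed to survive resets. */
typedef struct {
    slot_t  active;        /* last confirmed-good slot */
    bool    pending;       /* the other slot is waiting for a trial boot */
    uint8_t trial_boots;   /* trial boots attempted so far */
} boot_state_t;

#define MAX_TRIAL_BOOTS 1u

static slot_t other_slot(slot_t s) { return s == SLOT_A ? SLOT_B : SLOT_A; }

/* Step 4: running firmware marks the freshly verified slot as pending. */
static void ab_mark_pending(boot_state_t *st)
{
    st->pending = true;
    st->trial_boots = 0;
}

/* Step 5: bootloader calls this on every reset to choose the slot to boot. */
static slot_t ab_select_slot(boot_state_t *st)
{
    assert(st != NULL);
    if (st->pending) {
        if (st->trial_boots < MAX_TRIAL_BOOTS) {
            st->trial_boots++;
            return other_slot(st->active);   /* trial boot of the new image */
        }
        st->pending = false;                 /* never confirmed: roll back */
    }
    return st->active;
}

/* Step 6: new firmware calls this once its self-test has passed. */
static void ab_confirm(boot_state_t *st)
{
    if (st->pending) {
        st->active = other_slot(st->active);
        st->pending = false;
    }
}
```

The key property is that the new slot only becomes active after an explicit confirmation. If the new firmware crashes or hangs before calling `ab_confirm` (for example, and the watchdog resets the device), the next boot falls back to the previous slot automatically.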
5.3 Security Considerations
- Signature verification: verify an RSA or ECDSA signature over the firmware image to confirm its origin and integrity
- Encrypted transport: TLS-encrypted download channel
- Rollback protection: monotonically increasing version numbers to prevent downgrade attacks
- Secure boot chain: verify each stage from bootloader to application
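Rollback protection reduces to a comparison against a stored minimum version that is only ever raised. The sketch below uses hypothetical names and keeps the minimum in an ordinary struct; on real hardware it would typically live in storage that cannot be decremented, such as OTP fuses or a monotonic counter. Re-installing the current minimum version is allowed here, which is a design choice rather than a requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical anti-rollback state: the lowest version still accepted. */
typedef struct {
    uint32_t min_version;
} rollback_state_t;

/* Reject any image older than the stored minimum. */
static bool update_is_allowed(const rollback_state_t *rs, uint32_t candidate)
{
    assert(rs != NULL);
    return candidate >= rs->min_version;
}

/* Call only after the new firmware has been confirmed as working. */
static void rollback_commit(rollback_state_t *rs, uint32_t confirmed)
{
    assert(rs != NULL);
    if (confirmed > rs->min_version) {
        rs->min_version = confirmed;   /* the minimum never decreases */
    }
}
```

Raising the minimum only after confirmation keeps A/B rollback usable: until the new image has proven itself, the previous version is still an acceptable boot target. The version field must be covered by the firmware signature, otherwise an attacker could simply edit it.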
6. Cross-Compilation
Embedded development typically compiles on a PC and runs on target hardware.
# ARM cross-compilation toolchain
arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 \
-Os -Wall -ffunction-sections -fdata-sections \
-c main.c -o main.o
# Linking (use the same CPU and float-ABI flags as for compilation)
arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 \
-T linker_script.ld -Wl,--gc-sections \
main.o -o firmware.elf
# Generate bin/hex files
arm-none-eabi-objcopy -O binary firmware.elf firmware.bin
arm-none-eabi-objcopy -O ihex firmware.elf firmware.hex
# Check size
arm-none-eabi-size firmware.elf
# text data bss dec hex filename
# 12456 128 2048 14632 3928 firmware.elf
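The columns of the size output map onto flash and RAM usage as follows. This is a small worked calculation using hypothetical helper names; the 512 KiB flash capacity in the usage note is an assumption, not a value from the build above.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t text;   /* code and read-only constants */
    uint32_t data;   /* initialized variables */
    uint32_t bss;    /* zero-initialized variables */
} section_sizes_t;

/* Flash stores .text plus the initial values of .data. */
static uint32_t flash_bytes(section_sizes_t s) { return s.text + s.data; }

/* RAM holds .data (copied at startup) and .bss (zeroed at startup). */
static uint32_t ram_bytes(section_sizes_t s) { return s.data + s.bss; }

/* Integer percentage, rounded down. */
static uint32_t percent_used(uint32_t used, uint32_t capacity)
{
    assert(capacity > 0u);
    return (uint32_t)(((uint64_t)used * 100u) / capacity);
}
```

For the output above, flash usage is 12456 + 128 = 12584 bytes and static RAM usage is 128 + 2048 = 2176 bytes. On a part with 512 KiB of flash, `percent_used(12584, 512 * 1024)` gives 2. The RAM figure excludes the stack and any heap, which are sized separately, usually in the linker script.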
CMake cross-compilation configuration:
# toolchain.cmake
set(CMAKE_SYSTEM_NAME Generic)
set(CMAKE_SYSTEM_PROCESSOR arm)
set(CMAKE_C_COMPILER arm-none-eabi-gcc)
set(CMAKE_CXX_COMPILER arm-none-eabi-g++)
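set(CMAKE_ASM_COMPILER arm-none-eabi-gcc)
# CMake's compiler checks cannot link a bare-metal executable, so build a static library instead
set(CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY)
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)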
7. Debugging
7.1 JTAG/SWD
| Interface | Pin Count | Speed | Description |
|---|---|---|---|
| JTAG | 4-5 (TCK, TMS, TDI, TDO, optional TRST) | Fast | Traditional interface, full-featured |
| SWD | 2 (SWDIO, SWCLK) | Fast | ARM-specific, fewer pins |
| UART | 2 (TX, RX) | Slow | Debug printing only, simple |
7.2 GDB Remote Debugging
# Start OpenOCD (connect debugger)
openocd -f interface/stlink.cfg -f target/stm32f4x.cfg
# In another terminal, start GDB
arm-none-eabi-gdb firmware.elf
(gdb) target remote :3333 # connect to OpenOCD
(gdb) monitor reset halt # reset and halt
(gdb) load # download firmware
(gdb) break main # set breakpoint
(gdb) continue # run
(gdb) print sensor_value # inspect variable
(gdb) x/16xw 0x40020000 # examine memory/registers
7.3 Common Debugging Techniques
| Technique | Description | Use Case |
|---|---|---|
| Breakpoint debugging | JTAG/SWD + GDB | Logic issues |
| Serial printing | UART printf | Quick debugging |
| Logic analyzer | Capture digital signal waveforms | Timing issues |
| Oscilloscope | Observe analog/digital signals | Electrical issues |
| GPIO toggling | Measure code execution time with oscilloscope | Performance analysis |
| Hardware watchpoints | On-chip debug unit (DWT on Cortex-M) halts on a read or write to a chosen address | Data access monitoring |
8. Embedded Development Best Practices
| Practice | Description |
|---|---|
| Defensive programming | Validate all inputs, return values, and pointer validity |
| Watchdog | Use a watchdog timer (WDT) to reset the device automatically if the firmware hangs |
| Memory management | Avoid dynamic allocation; use static allocation or memory pools |
| Low power | Use sleep modes, wake peripherals on demand |
| Code review | Embedded bugs are costly to fix; prevention is key |
| Unit testing | Test hardware-independent logic on PC |
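The memory-pool recommendation in the table above can be illustrated with a fixed-block allocator. This is a minimal sketch with hypothetical names and sizes; it is not thread-safe or ISR-safe as written, so callers sharing the pool across contexts would need to add a critical section around `pool_alloc` and `pool_free`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE  32u   /* bytes per block (assumed for this sketch) */
#define POOL_BLOCK_COUNT 4u    /* number of blocks (assumed for this sketch) */

/* A free block reuses its own storage to hold the link to the next free block. */
typedef union pool_block {
    union pool_block *next;
    uint8_t           bytes[POOL_BLOCK_SIZE];
} pool_block_t;

static pool_block_t  pool_storage[POOL_BLOCK_COUNT];   /* statically allocated */
static pool_block_t *pool_free_list;

/* Thread every block onto the free list. */
static void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* O(1) allocation; returns NULL when the pool is exhausted. */
static void *pool_alloc(void)
{
    pool_block_t *block = pool_free_list;
    if (block != NULL) {
        pool_free_list = block->next;
    }
    return block;
}

/* O(1) release; the pointer must have come from pool_alloc(). */
static void pool_free(void *ptr)
{
    pool_block_t *block = ptr;
    assert(block >= &pool_storage[0] && block < &pool_storage[POOL_BLOCK_COUNT]);
    block->next = pool_free_list;
    pool_free_list = block;
}
```

Because every block has the same size, the pool cannot fragment, and the worst-case memory use is known at link time. Exhaustion shows up as a NULL return that the caller must handle, which fits the defensive-programming row in the same table.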
Relations to Other Topics
- See IoT Systems for how firmware, device access, and cloud-side management form an end-to-end chain
- See Operating Systems for scheduling, interrupts, and resource management under real-time constraints
- See Architecture Overview for the hardware background behind MCUs, memory hierarchy, and peripherals
- See Testing and Quality Assurance for validation, debugging, and regression control in embedded contexts
References
- "Making Embedded Systems" - Elecia White
- "Real-Time Operating Systems for ARM Cortex-M" - Jonathan Valvano
- FreeRTOS Official Documentation: https://freertos.org
- Zephyr Official Documentation: https://docs.zephyrproject.org