Introduction
In the past, typical complex avionics system architectures used multiple processors to host various applications, ensuring that each application received the necessary processing time.
As processors evolved and their capabilities increased, powerful single-core or multi-core processors replaced multi-processor architectures. This enabled system architects to design systems in which multiple applications are hosted on a single processor. However, this new architecture introduced a significant challenge in ensuring the safety of avionics systems.
In avionics systems, each application is assessed for its impact on flight safety and assigned a Design Assurance Level (DAL) from Level A to Level E. A DAL-A avionics system or application, such as the Flight Control System, is the most safety-critical. A failure in such a system can lead to catastrophic aircraft failure and result in the loss of many lives.
In contrast, a DAL-E system or application, such as the In-Flight Entertainment System, has no impact on flight safety. When two or more applications of different DALs are hosted on a single processor, the DO-178 standard requires that shared computing resources, such as memory space and processing time, be managed in such a way that a lower-DAL application cannot interfere with or halt the execution of a higher-DAL application.
The ARINC 653 software architecture for space and time partitioning provides a solution here. ARINC 653 defines a schedule made up of major frames, each of which comprises minor frames, and gives each partition an execution window of fixed duration within that schedule. In this way it enforces the separation of safety-critical applications that execute in individual partitions.
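To make the idea of a fixed, repeating schedule concrete, here is a minimal sketch of a schedule table and a consistency check. The `Window` structure, the partition names, and all durations are hypothetical illustrations, not taken from the ARINC 653 specification or from any particular RTOS configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """One fixed-duration slot in the major frame (hypothetical structure)."""
    partition: str
    offset_us: int    # start of the window relative to the major frame start
    duration_us: int  # fixed execution time granted to the partition


def validate_major_frame(windows, major_frame_us):
    """Return a list of problems; an empty list means the schedule is consistent."""
    problems = []
    cursor = 0  # end of the latest window seen so far
    for w in sorted(windows, key=lambda w: w.offset_us):
        if w.duration_us <= 0:
            problems.append(f"{w.partition}: non-positive duration")
        if w.offset_us < cursor:
            problems.append(f"{w.partition}: overlaps the previous window")
        cursor = max(cursor, w.offset_us + w.duration_us)
    if cursor > major_frame_us:
        problems.append("windows exceed the major frame")
    return problems


# Illustrative 100 ms major frame with four partitions (values are made up).
SCHEDULE = [
    Window("P1_flight_control", 0, 40_000),
    Window("P2_navigation", 40_000, 30_000),
    Window("P3_display", 70_000, 20_000),
    Window("P4_maintenance", 90_000, 10_000),
]

print(validate_major_frame(SCHEDULE, 100_000))  # prints []
```

Because the table is static, a check of this kind can run offline at configuration time, before the schedule is ever loaded onto the target.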
Partitions or Virtual Machines
In the context of embedded systems, partitioning means dividing the hardware and software resources into multiple independent regions called partitions. Each partition operates as if it were an independent system, with its own memory space, processing time, and peripherals assigned to it.
Figure-1 Theoretical Partition Execution
Each partition hosts a separate application that runs for a predefined time configured in the system. When the allocated time expires, the real-time operating system halts the execution of the current partition and switches to the next partition in the schedule, which hosts a different application.
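The time-driven switching described above can be sketched as a lookup into a repeating schedule. This is only an illustration of the theoretical (zero switch time) case; the tuple layout, the partition names, and the 100 ms major frame are assumptions made for the example.

```python
def active_partition(schedule, major_frame_us, t_us):
    """Return the partition whose window contains time t_us, or None if idle."""
    phase = t_us % major_frame_us  # the schedule repeats every major frame
    for name, offset_us, duration_us in schedule:
        if offset_us <= phase < offset_us + duration_us:
            return name
    return None


# (name, offset, duration) in microseconds; hypothetical values.
SCHEDULE = [
    ("P1", 0, 40_000),
    ("P2", 40_000, 30_000),
    ("P3", 70_000, 20_000),
    ("P4", 90_000, 10_000),
]

print(active_partition(SCHEDULE, 100_000, 45_000))   # prints P2
print(active_partition(SCHEDULE, 100_000, 199_999))  # prints P4
```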
Partition Switch Time
The partition switch time is the maximum time required to transition from the currently executing partition to the next one in the schedule. This duration is taken out of the time available for the execution of the scheduled partition. Therefore, an upper limit must be established for the partition switch time so that as little time as possible is spent switching partitions.
Figure-2 Practical Partition Execution
The following need to be considered when calculating the overall partition switch time:
● I/O Interrupt Latency
● Interrupt Disable Time
● Exception Handling Time
● Processor Cache Reload Time
The partition switch time is measured from the scheduling clock interrupt that ends the current partition's window until the next partition begins execution.
The interrupt disable time is the longest duration for which interrupts and preemption are disabled.
Exception or interrupt handling time is the time it takes to complete the execution of an exception handler. There can be multiple exception handlers in the system. The execution time of the longest-running handler is taken as the exception handling time.
While designing the software architecture, the maximum allowed partition switch time is captured as a design requirement to ensure the deterministic behavior of the system. Every platform software component implementation must adhere to this partition switch time requirement.
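As a rough illustration of checking a design against such a requirement, the sketch below adds up worst-case figures for the contributors listed above and compares the total with the allowed maximum. The additive model is a deliberately conservative assumption (every contributor is taken to hit its worst case in the same switch), and all numbers, including the 100 µs requirement, are hypothetical.

```python
# Hypothetical worst-case figures in microseconds; real values come from
# measurement or analysis on the target platform.
CONTRIBUTORS_US = {
    "io_interrupt_latency": 5,
    "interrupt_disable_time": 12,
    "exception_handling_time": 20,
    "cache_reload_time": 30,
}
SWITCH_TIME_REQUIREMENT_US = 100  # assumed design requirement


def worst_case_switch_time_us(contributors):
    """Conservative bound: every contributor at its maximum in the same switch."""
    return sum(contributors.values())


def usable_execution_time_us(window_us, switch_time_us):
    """Time left for the application once the switch is charged to its window."""
    return window_us - switch_time_us


bound_us = worst_case_switch_time_us(CONTRIBUTORS_US)
print(bound_us)                                # prints 67
print(bound_us <= SWITCH_TIME_REQUIREMENT_US)  # prints True
print(usable_execution_time_us(40_000, SWITCH_TIME_REQUIREMENT_US))  # prints 39900
```

Budgeting each window against the requirement rather than against the measured bound keeps the application's time allocation stable even if later platform changes use up some of the remaining margin.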
Partition Switch Jitter Time
If a critical section is being executed, or an exception occurs that must be handled, at the moment a partition switch is due, the switch is delayed until the critical section or the exception handler completes. This delay is called the partition switch jitter.
Let’s understand this with an example.
Figure-3 Partition Switch Jitter Time
As shown in Figure-3, the critical section/exception handler of Partition 1 completes within the execution time given to that partition.
The critical section/exception handler of Partition 2 starts at the end of the time allocated to that partition. It takes longer than the partition switch time to complete, which delays the start of Partition 3.
The critical section/exception handler of Partition 3 also starts at the end of the time allocated to that partition. However, it completes within the partition switch time, which allows Partition 4 to start on time.
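The three cases in this example can be reproduced with a small model. It is a simplification that mirrors the figure: the next partition is assumed to start late only by the amount by which the critical section or exception handler overruns the partition switch time allowance. The function name and all timing values are hypothetical.

```python
def switch_jitter_us(window_end_us, switch_allowance_us,
                     section_start_us, section_len_us):
    """Delay to the next partition's start caused by a non-preemptible section."""
    section_end_us = section_start_us + section_len_us
    nominal_next_start_us = window_end_us + switch_allowance_us
    return max(0, section_end_us - nominal_next_start_us)


ALLOWANCE_US = 50  # assumed partition switch time allowance

# Partition 1: section finishes well inside its own window.
print(switch_jitter_us(1_000, ALLOWANCE_US, 600, 200))    # prints 0
# Partition 2: section starts near the window end and overruns the allowance.
print(switch_jitter_us(2_000, ALLOWANCE_US, 1_990, 120))  # prints 60
# Partition 3: section starts near the window end but fits in the allowance.
print(switch_jitter_us(3_000, ALLOWANCE_US, 2_990, 40))   # prints 0
```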
Partition Switch Jitter Analysis
Partition switch jitter analysis is a process used to evaluate the delay in switching between different partitions in a system. It is an important part of the verification procedure to make sure the system design adheres to the DO-178 safety guidelines. It is mandatory to perform partition switch jitter analysis for DAL-A platform software components of an avionics system.
Steps to Perform Partition Switch Jitter Analysis
A typical procedure is as follows:
1. Identify the number of partitions configured in the system.
2. Understand the scheduling policy used to allocate tasks or processes to different partitions. This may include priority-based scheduling, time-based scheduling, or other algorithms.
3. Study the partition scheduling mechanism and identify the conditions or events that trigger a switch from one partition to another.
4. Measure the time taken to switch from one partition to another. This involves capturing timestamps or using hardware performance counters to measure the switch time.
5. Analyze the collected partition switch time data and identify the variation or deviation from the expected partition switch time.
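Steps 4 and 5 can be sketched as a small post-processing routine over captured timestamps. It assumes each sample is a pair of (scheduling tick time, time at which the next partition started executing) in microseconds; how those timestamps are captured is platform-specific and not shown. The sample values and the 50 µs expected switch time are made up for illustration.

```python
from statistics import mean


def analyze_switch_times(samples, expected_switch_us):
    """Summarize measured switch times and their deviation from the expected value."""
    durations = [start_us - tick_us for tick_us, start_us in samples]
    deviations = [d - expected_switch_us for d in durations]
    return {
        "min_us": min(durations),
        "max_us": max(durations),
        "mean_us": mean(durations),
        "max_jitter_us": max(deviations),  # worst lateness versus the expectation
        "peak_to_peak_us": max(durations) - min(durations),
    }


# (tick, next partition start) pairs; hypothetical measurements.
SAMPLES = [(0, 48), (40_000, 40_052), (70_000, 70_110), (90_000, 90_050)]

report = analyze_switch_times(SAMPLES, expected_switch_us=50)
print(report["max_us"], report["max_jitter_us"], report["peak_to_peak_us"])
# prints 110 60 62
```

In practice the sample set would cover many major frames and the operating scenarios most likely to trigger long critical sections, since a worst case that never occurs during measurement cannot appear in the statistics.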
Partition switch jitter analysis helps system architects and developers ensure that critical real-time tasks or processes meet their timing requirements and operate reliably within the system. By bounding the partition switch jitter time, the system can achieve the deterministic behavior and performance required of DAL-A systems or applications.
How to keep partition switch jitter time within the acceptable limit?
Partition switch jitter time can be limited by designing and implementing the system or application with factors such as preemption, critical sections, and race conditions in mind. The following procedure helps identify the areas of the source code or design that contribute to the partition switch jitter time, so that they can be optimized.
1. Determine the factors contributing to the jitter. This can include interrupt latency, scheduling overhead, contention for shared resources, or other system-level effects.
2. Based on the analysis, identify opportunities to reduce jitter. This may involve optimizing scheduling algorithms, fine-tuning partition configurations, adjusting interrupt priorities, or allocating resources more efficiently.
3. Validate the analysis by running real-time scenarios or workload simulations to ensure that the observed jitter aligns with the predicted values.
4. If necessary, iterate the analysis process by adjusting the system configuration, scheduling policies, or other parameters to further optimize and reduce partition switch jitter.
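The validation in step 3 can be sketched as a comparison of the observed jitter with both the predicted worst case and the acceptable limit. The function name, field names, and numbers are hypothetical. An observation above the prediction suggests that the analysis in step 1 missed a contributor, which is the cue to iterate as in step 4.

```python
def validate_jitter(observed_us, predicted_worst_case_us, limit_us):
    """Compare measured jitter with the analytical prediction and the limit."""
    worst_us = max(observed_us)
    return {
        "worst_observed_us": worst_us,
        "within_prediction": worst_us <= predicted_worst_case_us,
        "within_limit": worst_us <= limit_us,
        "margin_us": limit_us - worst_us,  # negative when the limit is violated
    }


# Hypothetical jitter measurements from a workload simulation, in microseconds.
result = validate_jitter([3, 0, 58, 12], predicted_worst_case_us=60, limit_us=75)
print(result["within_prediction"], result["within_limit"], result["margin_us"])
# prints True True 17
```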
Summary
Bounding the partition switch jitter time is critical for avionics DAL-A systems to maintain deterministic behavior and ensure safety. Partition switch jitter analysis, along with other analyses such as memory analysis, stack usage analysis, link map analysis, structural coverage analysis, and timing analysis, is an important aspect of the verification of DAL-A systems. Conducting partition switch jitter analysis requires expertise and in-depth knowledge of the platform.