CPU IP Designed for Safety Critical Systems in an Autonomous Age
Artificial Intelligence (AI) techniques are increasingly used to give products greater awareness of their environment and, from this, a greater ability to control and automate functionality. We are seeing the emergence of artificial neural networks across a wide range of systems, from consumer products to ADAS and autonomous driving to industrial applications and beyond.
With the adoption of AI techniques, the required level of computing performance is much higher, and this is being addressed by a combination of dedicated accelerators and general purpose CPU-based compute capability. High-performance multiprocessor systems are a must.
In areas such as the automotive and industrial markets, functional safety is also critical. These systems must be designed from the system level with a high degree of redundancy. The first step is to ensure that the product implements the correct behavior by design, which is managed through rigorous quality management system (QMS) processes. Safety-critical products must also detect and respond to errors that occur during operation. Systems must adhere to industry standards for functional safety, including ISO 26262 for automotive and IEC 61508 for industrial applications.
The MIPS I6500-F is the newest IP core in the MIPS CPU product line, extending the variety and scalability of “off-the-shelf” licensable cores based on the proven and respected MIPS64 architecture to address the functional safety and performance requirements of emerging autonomous applications.
The MIPS I6500-F SEooC (Safety Element out of Context) package is designed for systems requiring the highest level of functional safety: ASIL D. To achieve this level, not every IP core needs to reach ASIL D itself. Instead, the system is decomposed into IP blocks that each reach a level similar to ASIL B but with enhanced fault detection time and fault coverage.
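The decomposition idea can be illustrated with a small lookup. The sketch below encodes the ASIL decomposition pairs as they are commonly summarized from ISO 26262-9; the table and the helper names (ASIL_DECOMPOSITIONS, is_valid_decomposition, label) are illustrative assumptions, not part of any MIPS deliverable, and the standard itself remains the authority on when decomposition is permitted.

```python
# Illustrative sketch only: decomposition pairs as commonly summarized from
# ISO 26262-9. Names and structure are assumptions made for this example.
ASIL_DECOMPOSITIONS = {
    "D": [("C", "A"), ("B", "B"), ("D", "QM")],
    "C": [("B", "A"), ("C", "QM")],
    "B": [("A", "A"), ("B", "QM")],
    "A": [("A", "QM")],
}


def is_valid_decomposition(target, first, second):
    """Return True if (first, second) is a listed decomposition of `target`.

    The order of the two elements does not matter.
    """
    pairs = ASIL_DECOMPOSITIONS.get(target, [])
    return (first, second) in pairs or (second, first) in pairs


def label(level, target):
    """Format a decomposed requirement, e.g. 'ASIL B(D)' or 'QM(D)'."""
    if level == "QM":
        return "QM({})".format(target)
    return "ASIL {}({})".format(level, target)


if __name__ == "__main__":
    # Two redundant elements, each developed to ASIL B(D), for an ASIL D goal.
    print(is_valid_decomposition("D", "B", "B"))  # True
    print(label("B", "D"))                        # ASIL B(D)
```

In this notation, the level in parentheses records the original safety goal, which is why a package for ASIL D systems is labeled ASIL B(D).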
The additional functional redundancy built into the MIPS I6500-F includes:
- ECC across memories
- Parity protection for buses
- Time-out protection for interfaces
- Support for logic BIST (LBIST) operation on reset and periodically during operation
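As a rough illustration of the first two mechanisms in the list above, the sketch below shows even parity (error detection on a bus word) and a textbook Hamming(7,4) code (single-bit error correction, the principle behind memory ECC). It is a generic teaching example under assumed word sizes; the actual parity and ECC schemes used in the I6500-F are not described here, and all function names are hypothetical.

```python
# Generic textbook sketch; not the I6500-F's actual parity or ECC scheme.

def even_parity_bit(word):
    """Return the bit that makes the total count of 1s (word + parity) even."""
    return bin(word).count("1") & 1


def parity_ok(word, parity):
    """Detect (but not locate) an odd number of flipped bits."""
    return even_parity_bit(word) == parity


def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit codeword.

    Codeword positions 1..7 hold: p1, p2, d0, p4, d1, d2, d3.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]  # index 0 is position 1
    return sum(b << i for i, b in enumerate(bits))


def hamming74_decode(code):
    """Return (corrected nibble, error position); position 0 means no error."""
    bits = [(code >> i) & 1 for i in range(7)]
    s1 = bits[0] ^ bits[2] ^ bits[4] ^ bits[6]  # positions 1, 3, 5, 7
    s2 = bits[1] ^ bits[2] ^ bits[5] ^ bits[6]  # positions 2, 3, 6, 7
    s4 = bits[3] ^ bits[4] ^ bits[5] ^ bits[6]  # positions 4, 5, 6, 7
    syndrome = s1 | (s2 << 1) | (s4 << 2)
    if syndrome:
        bits[syndrome - 1] ^= 1  # the syndrome is the position of the flipped bit
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return nibble, syndrome


if __name__ == "__main__":
    stored = hamming74_encode(0b1011)
    corrupted = stored ^ (1 << 4)       # flip the bit at codeword position 5
    print(hamming74_decode(corrupted))  # (11, 5)
```

Parity only reports that something went wrong, which suits a bus transfer that can be retried; ECC also repairs the data, which suits memories where the original value cannot be fetched again.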
Heterogeneous Inside and Out
In embedded applications the computational workload is naturally multi-threaded but each thread will require a specific performance level. With the emergence of AI, many of these computational threads are sufficiently specialized to benefit from dedicated computational elements or accelerators. The MIPS I6500-F is well positioned to enable these complex systems.
Multi-threading
The performance of a CPU depends on minimizing the latency to the system memory. Even with a cache hierarchy, the CPU still stalls while waiting for data, and this is where multi-threading offers significant performance improvements by running additional threads during these stalls.
Multi-threading is a more area-efficient alternative to the use of additional cores, and typically offers a 40% performance boost when two threads execute simultaneously rather than sequentially.
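To make the 40% figure concrete, the short calculation below treats it as a 1.4x throughput gain for two threads sharing one core. This reading of the figure, the function name smt_elapsed_time, and the example run times are assumptions for illustration; real gains depend on the workload.

```python
# Back-of-the-envelope sketch; the 0.40 gain is the typical figure quoted in
# the text, interpreted here as a throughput multiplier of 1.4.

def smt_elapsed_time(thread_times, smt_gain=0.40):
    """Estimate elapsed time when the threads run simultaneously on one core.

    thread_times: run time of each thread when executed alone, in any unit.
    smt_gain: fractional throughput improvement versus sequential execution.
    """
    sequential = sum(thread_times)
    return sequential / (1.0 + smt_gain)


if __name__ == "__main__":
    # Two hypothetical threads of 10 ms each: 20 ms back to back.
    print(round(smt_elapsed_time([10.0, 10.0]), 2))  # 14.29
```

Under these assumptions, the pair of threads finishes in about 14.3 ms instead of 20 ms, without the area cost of a second core.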
Heterogeneous Combinations of Cores
Threads with high performance requirements can each run alone on a core, and that core can be optimized for single-threaded performance with increased resources such as level 1 cache size and FPU/SIMD capabilities. Other threads can share cores for greater efficiency, while lower-performance threads can run on cores optimized for low power consumption, with independently controlled clock frequency and voltage.
The I6500-F allows any combination of core configurations within a single cluster to match the system's needs.
Accelerator Integration
Specialist computational tasks such as artificial neural networks for AI achieve the highest performance and the greatest efficiency when implemented as dedicated accelerators. These accelerators need to work closely with the general purpose CPU cores to combine the flexibility of the CPU cores with the efficient performance of the accelerators.
To enable this, the I6500-F provides very low communication latency through:
- Dedicated AXI auxiliary ports for direct communication to the accelerator control registers
- Shared Virtual Memory (SVM) with the accelerators so that data can be passed as pointers rather than through copying
- Low latency coherent access to memory via the I6500-F cluster level 2 cache using the IOCU ports
- Hardware cache coherency at the system level, which lets high bandwidth traffic from the accelerators access the system bus directly, maintaining the performance of the CPU cores while retaining the benefits of SVM
- Multi-threading, which allows threads to be dedicated to managing the operation of accelerators, offering high efficiency with zero context switch overhead
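The benefit of passing pointers instead of copying data, as in the SVM item above, can be shown with a software analogy. In the sketch below a Python memoryview stands in for a shared pointer and a plain function stands in for the accelerator; both "accelerator" functions are hypothetical, and nothing here models the I6500-F hardware or its IOCU ports.

```python
# Software analogy only: a memoryview plays the role of a pointer into memory
# that both the CPU and the accelerator can see.

def accelerator_scale_copy(data):
    """Copy-based hand-off: the input is read and a new buffer is returned."""
    return bytes((b * 2) & 0xFF for b in data)


def accelerator_scale_shared(view):
    """Pointer-style hand-off: operate in place on the caller's buffer."""
    for i in range(len(view)):
        view[i] = (view[i] * 2) & 0xFF


if __name__ == "__main__":
    buffer = bytearray([1, 2, 3, 4])

    # Copy path: the result lives in a separate buffer that must be moved back.
    result = accelerator_scale_copy(bytes(buffer))
    print(list(result), list(buffer))  # [2, 4, 6, 8] [1, 2, 3, 4]

    # Shared path: only a reference is passed and the data never moves.
    accelerator_scale_shared(memoryview(buffer))
    print(list(buffer))                # [2, 4, 6, 8]
```

With large tensors or image frames the copy path pays for two transfers per call, which is the overhead that shared virtual memory is meant to remove.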
Performance Scaling
Hardware cache coherency at the system level allows combinations of heterogeneous I6500-F clusters and accelerators or other specialist computational IP to be integrated together to achieve whatever performance level is necessary for each system.
MIPS I-Class I6500-F Series Key Features/Benefits
- SEooC ASIL B(D) package: Rigorous QMS processes addressing systematic errors with optimized functional redundancy to meet system level ASIL D functional safety standards.
- Additional functional safety packages: Support for other functional safety market segments, with IEC 61508 for industrial applications to follow.
- Heterogeneous Inside: In a single cluster, designers can optimize power consumption with the ability to configure each CPU with different combinations of threads, different cache sizes, different frequencies, and even different voltage levels. Optimized, low-latency shared virtual memory (SVM) operations with accelerators can be implemented by connecting through the IOCU ports.
- Heterogeneous Outside: The latest MIPS Coherence Manager, with an AMBA® ACE interface to popular ACE coherent fabric solutions such as those from Arteris and NetSpeed, lets designers mix configurations of processing clusters on a chip, including high bandwidth accelerator ports, for high system efficiency.
- Simultaneous Multi-threading (SMT): Based on a superscalar dual-issue design implemented across generations of MIPS CPUs, this proven feature enables execution of multiple instructions from multiple threads every clock cycle, providing higher utilization and CPU efficiency.
- Hardware virtualization (VZ): The I6500-F builds on the real-time hardware virtualization capability pioneered in the MIPS I6400 core. Designers can save costs by safely and securely consolidating multiple CPU cores into a single core, save power where multiple cores are required, and dynamically and deterministically allocate CPU bandwidth per application.
- SMT + VZ: The combination of SMT with VZ in the I6500 offers “zero context switching” for applications requiring real-time response; alongside the provision of scratchpad memory, this makes the I6500 ideal for applications which require deterministic code execution.
- Ideal for compute intensive, data processing and networking applications: The I6500 is designed for high-performance/high-efficiency data transfers to localized compute resources with data scratchpad memories per CPU, and features for fast path message/data passing between threads and cores.
- Trusted: MIPS multi-domain security technology enables isolation of applications in trusted environments, providing a foundation for security by separation.
- Straightforward software development: The I6500-F is based on the mature MIPS ISA, which is broadly supported in the development ecosystem by multiple vendors, including a wide choice of compilers, debuggers, operating systems, hypervisors and application software, all optimized for the MIPS ISA.
MIPS I-Class I6500 Base Core Features
- 64-bit MIPS64® Release 6 Instruction Set Architecture
- Balanced, 9-stage, dual-issue pipeline with Simultaneous Multi-Threading (SMT)
- High-performance dual-issue FPU/SIMD Unit – optional
- IEEE 754-2008 compliant
- Full hardware virtualization
- L1 cache
- Data ScratchPad RAM (D-SPRAM)
- Programmable Memory Management Unit (MMU)
MIPS I-Class I6500 Series Multi-Core & Multi-Cluster Features
- Coherent multi-core and multi-cluster platform, providing extensible implementations in support of both homogeneous and heterogeneous computing applications
- Single cluster IP deliverable for use in combination with coherent fabric alternatives (ACE-compatible) for multi-cluster scalability, or
- Complete multi-cluster sub-system deliverable
- Per cluster multi-core system designed for maximum cluster-level bandwidth
- Integrated L2 cache (L2$): 16-way set associative, up to 8MB of memory
- Up to four auxiliary AXI ports provide for enabling features such as:
- Inter-Thread Communication (ITC)
- Global interrupt controller (GIC) with 256 interrupts per cluster
- Advanced power management
- Virtualization support at system and SoC level
- Advanced debug capabilities – Debug and Trace