MIPS Multi-Threading

MIPS delivers hardware multi-threading in several families of our licensable CPU IP products, providing a differentiated and highly efficient mechanism to achieve higher performance, low-latency context switching, or both. These benefits are not limited to the primary application processor (AP): they also apply to embedded and subsystem processing functions across a wide range of market segments, including smartphones, DTVs, set-top boxes, networking, communications, automotive, storage, and microcontrollers (MCUs).

Complex SoC designs

System-on-Chip (SoC) designers constantly strive to deliver more performance in each successive generation of the processors used in their silicon designs. The demand is driven by increasing system complexity in support of greater intelligence, more sophisticated human-machine interfaces, and the ever-increasing bandwidth requirements of an electronically interactive and connected world.

The demand for increasing performance is not new; however, simply increasing the operating frequency of the processor no longer provides easy gains. Multi-core processing has been widely adopted as a result, and MIPS processors are well suited to it. MIPS processors are fully synthesizable, so customers can implement them in the process node of their choice and target their own frequency and operating specifications. Coherent multi-core and multi-cluster versions are available for many of the product families.

However, multiple cores mean additional cost, sometimes substantial. Because what counts as “increased performance” is application-specific and varies widely, multiple cores aren’t necessarily the best option. MIPS offers hardware multi-threading (MT) in several MIPS CPU core product families. The technology has been proven in a large number of designs over more than a decade, and it continues to evolve to provide even greater value. MIPS MT provides additional options for tackling performance challenges, whether the goal is simply to increase total performance or to improve responsiveness in applications that are sensitive to context switching, real-time operation, or multi-software domain execution environments.

At its heart, the primary goal of hardware MT is to achieve higher utilization of a processor core. MT enhances a core’s ability to execute different software threads concurrently on a shared execution pipeline and shared computational elements. This is achieved by replicating a subset of a core’s resources, at a fraction of the area and power cost of implementing a full multi-core system.

Single-threaded microprocessors today waste many cycles while waiting to access memory, and also on events like software branch mispredictions, interrupt servicing, etc. These wasted cycles impact system performance. Multi-threading can mask the effect of memory latency by utilizing the processor to accomplish other tasks while it’s waiting on memory. As one thread stalls, additional threads are instantly fed into the pipeline and executed, resulting in a significant gain in overall application throughput.
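The latency-masking argument can be illustrated with a deliberately simplified model. The Python sketch below assumes each thread alternates between a fixed burst of useful cycles and a fixed memory stall, and that switching between threads costs nothing. The function name and the cycle counts are hypothetical values chosen for illustration, not measurements of any MIPS core.

```python
def pipeline_utilization(threads: int, compute_cycles: int, stall_cycles: int) -> float:
    """Toy estimate of the fraction of cycles a shared pipeline does useful work.

    Assumptions (not from any MIPS specification): every thread repeats
    `compute_cycles` of work followed by `stall_cycles` of memory wait,
    and another ready thread can fill a stalled slot at zero cost.
    """
    if threads < 1 or compute_cycles < 1 or stall_cycles < 0:
        raise ValueError("threads and compute_cycles must be >= 1, stall_cycles >= 0")
    # Share of time a single thread keeps the pipeline busy on its own.
    per_thread = compute_cycles / (compute_cycles + stall_cycles)
    # Extra threads fill the stall slots until the pipeline is saturated.
    return min(1.0, threads * per_thread)


# Hypothetical workload: 20 useful cycles, then a 60-cycle memory stall.
for n in (1, 2, 4):
    print(n, pipeline_utilization(n, compute_cycles=20, stall_cycles=60))
# prints:
# 1 0.25
# 2 0.5
# 4 1.0
```

In this model a single thread keeps the pipeline busy only a quarter of the time, and four threads are enough to hide the stall entirely. Real cores see smaller gains because threads contend for caches and execution resources.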

Multi-threading in MIPS CPUs

The implementation of MT features and capabilities varies among the MIPS cores that support it. The MIPS32-based interAptiv™ and the MIPS64-based I6400 and I6500 processors support MT at the level of a virtual processing element (VPE) or virtual processor (VP), in which each VPE/VP appears to a symmetric multiprocessing (SMP) operating system (OS) as a separate core. Alternatively, a separate OS can run on each VPE/VP, such as Linux on one and an RTOS on another. Because each VPE includes a complete copy of the processor state as seen by software, including interrupts, register set, and MMU, it looks like a complete standalone processor to an SMP Linux operating system. The interAptiv processor family supports up to two VPEs per processor core, while the superscalar, dual-issue I6400 and I6500 processor families extend this to four VPs per processor core.

For more fine-grained thread processing, the interAptiv processor family includes a second level of hardware MT support called a Thread Context (TC). A TC is “lighter weight” than a VPE, is programmable at the user/application software level, and is complemented by a set of instructions in the MIPS ISA for using the threads. An interAptiv core supports up to nine TCs, which can be allocated across two VPEs. The TCs share a common execution unit, but each has its own program counter and core register file, so each can handle a separate software thread.
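The relationship between TCs and VPEs can be pictured as a small data model. The Python sketch below is only an illustration of the limits quoted above (up to nine TCs spread across up to two VPEs); the class and function names are hypothetical, and the rule that every VPE needs at least one TC is an assumption of this sketch. It does not represent how the hardware is actually configured.

```python
from dataclasses import dataclass

MAX_VPES = 2  # from the text: up to two VPEs per interAptiv core
MAX_TCS = 9   # from the text: up to nine TCs per interAptiv core


@dataclass
class ThreadContext:
    """Hypothetical record of the per-thread state a TC replicates."""
    tc_id: int
    vpe_id: int
    program_counter: int = 0  # each TC has its own program counter


def allocate_tcs(tcs_per_vpe: list[int]) -> list[ThreadContext]:
    """Check a requested TC split against the limits and build the contexts."""
    if not 1 <= len(tcs_per_vpe) <= MAX_VPES:
        raise ValueError(f"expected 1 to {MAX_VPES} VPEs")
    if any(count < 1 for count in tcs_per_vpe):
        # Assumption of this sketch: a VPE needs at least one TC to run code.
        raise ValueError("each VPE needs at least one TC")
    if sum(tcs_per_vpe) > MAX_TCS:
        raise ValueError(f"at most {MAX_TCS} TCs per core")
    contexts: list[ThreadContext] = []
    for vpe_id, count in enumerate(tcs_per_vpe):
        for _ in range(count):
            contexts.append(ThreadContext(tc_id=len(contexts), vpe_id=vpe_id))
    return contexts


# Example split: one TC on VPE 0 (say, for an RTOS), eight on VPE 1.
contexts = allocate_tcs([1, 8])
print(len(contexts), [tc.vpe_id for tc in contexts])
# prints: 9 [0, 1, 1, 1, 1, 1, 1, 1, 1]
```

An uneven split like this one reflects the kind of choice a designer might make when one VPE runs a small real-time task and the other runs a heavily threaded workload.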

All of this allows for higher utilization and more efficient use of an existing processor core, which translates to higher overall system performance. While more performance is broadly a good thing, some embedded applications have additional requirements that depend on a processor’s real-time responsiveness. MIPS MT enables very fast context switching, making it well suited to this challenge. The replication of General Purpose Registers (GPRs) and interrupt handling resources provides a natural mechanism for delivering very low latency on tasks requiring a context switch, such as servicing interrupts, real-time sensitive operations, or systems running multiple operating systems in a multi-guest virtualized software environment. These characteristics can be very important in a variety of applications such as LTE/5G modems, advanced driver assistance systems (ADAS), engine control, or other functional safety and/or security critical embedded applications.

Building on this benefit, MIPS MT cores can be leveraged further to provide best-in-class solutions in more deterministic system designs, using complementary features available in the interAptiv processor family. Executing out of local ScratchPad RAMs (SPRAMs), using the simpler memory mapping option with a memory protection unit, and using the Quality of Service (QoS) thread prioritization features supported in the MIPS Multi-Threading architecture all facilitate low-latency, highly deterministic system design.