
MIPS32® 1004K™

The MIPS32® 1004K™ Coherent Processing System (CPS) is the industry's first multi-threaded multiprocessor IP core. Incorporating multi-threading in each core of a coherent multi-core architecture enables the 1004K™ multiprocessor to surpass the performance of multi-core systems based on single-threaded processor cores. This performance boost is essentially "free" in both hardware and software: the additional hardware threads in the cores are minimal in size relative to a typical SoC design, and multi-threading uses the same Symmetric Multiprocessing (SMP) versions of operating systems and software programming models as coherent multi-core platforms.

The 1004K coherent processing system consists of one to four multi-threaded cores, connected via a coherence management unit that maintains coherency between the L1 caches in each CPU. The system includes an optional block that provides coherency on data transfers from I/O peripherals, improving performance by offloading I/O coherency schemes that are typically run in software as part of the operating system.

The coherent processing system also includes a global interrupt controller that accepts up to 256 interrupts and distributes them to the cores, or even to hardware threads within each core. The whole system can be used with the MIPS® L2 cache controller (available separately), which connects to the coherence management unit via an extended 256-bit wide interface for optimized throughput between the coherent system and the L2 cache. An EJTAG block and a "coherence-aware" PDtrace (program and data trace) block round out the system, providing synchronized visibility into each of the CPU cores and the coherency units in the system via development tools.

Initially, the 1004K CPS is available in two versions: the 1004Kc™ using integer cores, and the 1004Kf™ with a floating point unit in each core.


  • A coherent multiprocessor system using multi-threading to extend performance beyond traditional multiprocessor solutions
    • Up to four multi-threaded CPU cores, with two hardware threads/core
    • Multi-threading complements multi-core – leverages SMP operating systems and programming models, with minimal silicon cost adder
  • Hardware I/O coherency – offloads CPU software I/O coherency overhead
  • Configuration and scalability at core and system levels, addressing a broad range of price/performance points
  • Licensable IP core – enables broad industry adoption

A complete system for coherent multiprocessing, including:

  • 1 to 4 1004K multi-threaded “base” cores (up to 8 hardware threads)
  • Coherence Management (CM) unit – the system “glue” for managing coherent operation between cores and I/O
  • I/O Coherence Unit (IOCU) – hardware block for offloading I/O coherence from software implementation on CPUs
  • Global Interrupt Controller (GIC) – system and inter-processor interrupt controller
  • Extended 256-bit interface to L2 cache controller (available separately)
  • EJTAG/PDtrace™ block for advanced debug/trace of complete coherent system

1004K Base Core

  • 9-stage pipeline delivering more than 1.5 DMIPS/MHz per core
  • Supports single- or dual-threaded operation per core
  • Uses Virtual Processing Elements (VPEs) for hardware multi-threading
  • Integer (1004Kc™) and floating point (1004Kf™) versions
  • Support for Revision 1 of MIPS32 DSP ASE
  • Coherency port has duplicate data cache tags for background coherency checks
  • Design-time configurability for inclusion and sizing of instruction and data TLBs, caches, scratchpad RAM and other options

Floating Point Unit (FPU)

  • IEEE 754-compliant FPU, compliant with the MIPS® 64-bit FPU architecture (1004Kf version only)
  • Supports single- and double-precision data types
  • Separate in-order, dual-issue pipeline decoupled from integer pipeline

Coherence Management (CM) Unit

  • Manages coherency using the MESI protocol
  • Operates at same clock (1:1) as CPUs for maximum performance
  • 256-bit extended interface for maximum throughput to (optional) L2 cache controller
  • Supports performance enhancements via L1 cache-to-cache transfers, speculative reads to external memory, and globalized cache operations
  • Global Configuration Registers (GCRs) for configuring/controlling CM scheme
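
As an illustration of the MESI protocol named above, the following is a minimal textbook sketch of how a single cache line changes state across the L1 caches of several cores. It is not the CM unit's actual implementation; the names `State` and `CoherentLine` are hypothetical, and details such as write-back timing and cache-to-cache data transfer are omitted.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"   # only copy, dirty with respect to memory
    EXCLUSIVE = "E"  # only copy, clean
    SHARED = "S"     # one of possibly several clean copies
    INVALID = "I"    # no valid copy in this cache


class CoherentLine:
    """Hypothetical model of one cache line tracked across per-core L1 caches."""

    def __init__(self, num_cores):
        self.states = [State.INVALID] * num_cores

    def read(self, core):
        """Core reads the line; a miss fetches it and may downgrade other copies."""
        if self.states[core] is not State.INVALID:
            return  # M, E and S all satisfy a read locally
        holders = [i for i, s in enumerate(self.states)
                   if i != core and s is not State.INVALID]
        if holders:
            # Any M or E copy elsewhere is downgraded to S (an M copy would
            # first supply or write back its data; not modeled here).
            for i in holders:
                self.states[i] = State.SHARED
            self.states[core] = State.SHARED
        else:
            self.states[core] = State.EXCLUSIVE

    def write(self, core):
        """Core writes the line; every other copy is invalidated."""
        for i in range(len(self.states)):
            if i != core:
                self.states[i] = State.INVALID
        self.states[core] = State.MODIFIED


# Usage: two cores of a four-core system touch the same line.
line = CoherentLine(num_cores=4)
line.read(0)    # core 0 holds the only copy: EXCLUSIVE
line.read(1)    # both copies become SHARED
line.write(1)   # core 1 becomes MODIFIED, core 0 INVALID
print([s.value for s in line.states])
```

The model keeps one state per core for a single line, which is enough to show the read-miss, sharing, and invalidate-on-write transitions that a coherence manager has to enforce.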

I/O Coherence Unit (IOCU) – optional use

  • Bridges non-coherent I/O peripheral transfers and makes their transactions coherent
  • Supports per-transaction attributes for snooping L1 caches, L1+L2 caches, or non-coherent transactions, plus I/O prioritization

Global Interrupt Controller (GIC) – optional use

  • Supports system-level and inter-processor interrupts
  • Routes interrupts to a particular core or VPE
  • Configurable number of system interrupts (up to 256)
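
The routing behavior described above can be pictured as a table that maps each interrupt source to a (core, VPE) target. The sketch below is purely illustrative: `InterruptRouter` and its methods are hypothetical names, not the GIC's register interface, and only the limits stated on this page (up to 256 interrupts, up to 4 cores, 2 hardware threads per core) are taken from the source.

```python
MAX_INTERRUPTS = 256  # upper limit stated for the GIC
MAX_CORES = 4         # upper limit stated for the 1004K CPS
VPES_PER_CORE = 2     # two hardware threads per core


class InterruptRouter:
    """Hypothetical software model of interrupt-to-(core, VPE) routing."""

    def __init__(self, num_interrupts=MAX_INTERRUPTS, num_cores=MAX_CORES):
        if not 1 <= num_interrupts <= MAX_INTERRUPTS:
            raise ValueError("num_interrupts must be between 1 and 256")
        if not 1 <= num_cores <= MAX_CORES:
            raise ValueError("num_cores must be between 1 and 4")
        self.num_interrupts = num_interrupts
        self.num_cores = num_cores
        self._routes = {}  # interrupt number -> (core, vpe)

    def route(self, irq, core, vpe=0):
        """Direct interrupt `irq` to the given core and VPE."""
        if not 0 <= irq < self.num_interrupts:
            raise ValueError("interrupt number out of range")
        if not 0 <= core < self.num_cores:
            raise ValueError("core index out of range")
        if not 0 <= vpe < VPES_PER_CORE:
            raise ValueError("VPE index out of range")
        self._routes[irq] = (core, vpe)

    def target(self, irq):
        """Return the (core, vpe) target for `irq`, or None if unrouted."""
        return self._routes.get(irq)


# Usage: a two-core configuration with 64 interrupt sources.
router = InterruptRouter(num_interrupts=64, num_cores=2)
router.route(5, core=1, vpe=1)
print(router.target(5))
```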

Development Tools

  • MIPS Navigator™ ICS - IDE, software toolkit, MIPSsim™, EJTAG and PDtrace probes
  • CodeSourcery - SG++ toolchains for MIPS

Frequency (MHz): >800 (worst case)
Total Area *: ~4.2 mm²
Performance (DMIPS/MHz): 1.5 per core
Process: TSMC 65GP

Note: Frequency, power consumption and size depend upon configuration options, synthesis, silicon vendor, process and cell libraries.

Quoted speeds are PTSI and do not include OCV, clock jitter or design margin.

* Configuration: 2 cores, each with 2 threads and 32KB Inst/Data caches, Coherence Management (CM) unit, and Global Interrupt Controller (GIC)
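
For a rough sense of scale, the quoted per-core figures can be multiplied out. The sketch below assumes throughput scales linearly with core count, which this page does not claim, and it uses the quoted 800 MHz and 1.5 DMIPS/MHz as nominal values; the function name `nominal_dmips` is hypothetical.

```python
DMIPS_PER_MHZ_PER_CORE = 1.5  # quoted per-core performance
FREQUENCY_MHZ = 800           # quoted worst-case frequency (">800")


def nominal_dmips(cores, freq_mhz=FREQUENCY_MHZ,
                  dmips_per_mhz=DMIPS_PER_MHZ_PER_CORE):
    """Nominal aggregate DMIPS, assuming linear scaling across cores."""
    if not 1 <= cores <= 4:
        raise ValueError("the 1004K CPS supports 1 to 4 cores")
    return cores * freq_mhz * dmips_per_mhz


print(nominal_dmips(1))  # 1200.0 for a single core
print(nominal_dmips(2))  # 2400.0 for the 2-core configuration quoted above
```

Real figures depend on the configuration options, synthesis, and process noted above, so these numbers are only a starting point for comparison.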


MIPS32® 1004K™ Core - Simplified Overview
