Computer Architecture Lab/Winter2006/JeitMossFrühRamb/ISA

Introduction

MatPro is a pipelined coprocessor that handles 4x4 matrix operations such as addition and multiplication. It is equipped with a SimpCon interface so that it can be connected easily to existing SoC solutions.

Key data

  • 16-bit coprocessor for matrix arithmetic
  • Matrix dimension is 4x4
  • Basic data type is a 16-bit signed integer
  • 3 matrix registers
  • 16 general-purpose 16-bit registers
  • SIMD (Single Instruction, Multiple Data) architecture

Instruction formats

To keep things simple, every instruction is encoded in 16 bits, and every opcode occupies 4 bits.

We distinguish between the following instruction formats:

  • 3-operand instructions
    Bits     15-12   11-8     7-4      3-0
    Content  OPCODE  DESTREG  SRCREG1  SRCREG2
  • 2-operand instructions
    Bits     15-12   11-8     7-4      3-0
    Content  OPCODE  DESTREG  SRCREG   0000
  • 1-operand instructions
    Bits     15-12   11-0
    Content  OPCODE  Address
  • Load/store word instructions
    Bits     15-12   11-8     7-0
    Content  OPCODE  DSTREG   Address
  • Load/store matrix instructions
    Bits     15-12   11-10    9-0
    Content  OPCODE  DSTREG   Address
  • Conditional instructions
    Bits     15-12   11-8     7-0
    Content  OPCODE  DSTREG   Address
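The bit layouts above can be sketched as small packing functions. This is only an illustration: the function names are hypothetical and not taken from the project's assembler, and the range checks are an added assumption about how out-of-range fields should be treated.

```python
def _field(value, width, name):
    """Return value unchanged if it fits in `width` bits, else raise."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{name}={value} does not fit in {width} bits")
    return value


def encode_3op(opcode, dest, src1, src2):
    """OPCODE[15:12] DESTREG[11:8] SRCREG1[7:4] SRCREG2[3:0]"""
    return (_field(opcode, 4, "opcode") << 12
            | _field(dest, 4, "dest") << 8
            | _field(src1, 4, "src1") << 4
            | _field(src2, 4, "src2"))


def encode_2op(opcode, dest, src):
    """Same layout as the 3-operand format with bits 3-0 fixed to 0000."""
    return encode_3op(opcode, dest, src, 0)


def encode_1op(opcode, address):
    """OPCODE[15:12] Address[11:0]"""
    return _field(opcode, 4, "opcode") << 12 | _field(address, 12, "address")


def encode_word(opcode, reg, address):
    """Load/store word and conditional: OPCODE[15:12] REG[11:8] Address[7:0]"""
    return (_field(opcode, 4, "opcode") << 12
            | _field(reg, 4, "reg") << 8
            | _field(address, 8, "address"))


def encode_matrix(opcode, reg, address):
    """Load/store matrix: OPCODE[15:12] REG[11:10] Address[9:0]"""
    return (_field(opcode, 4, "opcode") << 12
            | _field(reg, 2, "reg") << 10
            | _field(address, 10, "address"))
```

For example, `encode_3op(0b1000, 2, 0, 1)` packs opcode 1000 with register fields 2, 0 and 1 into the word `0x8201`.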

Instructions

At the moment, 12 instructions are defined; the remaining four opcodes are still free:

Instruction  Opcode  Description
nop          0000    does nothing
jmp          0001    sets the PC to the given address
brz          0010    branch if zero: if the source register is zero, execution continues with the next instruction; otherwise it jumps to the given address
sub          0011    subtracts one 16-bit integer from another and stores the result in the destination register
loadm        0100    loads a matrix from memory, starting at the given address, and stores it in the destination register
loadw        0101    loads a 16-bit integer from memory at the given address and stores it in the destination register
storem       0110    stores a matrix from the source register to memory at the given address
storew       0111    stores a 16-bit integer from the source register to memory at the given address
mulm         1000    multiplies two matrices and stores the result in the destination register
addm         1001    adds two matrices and stores the result in the destination register
subm         1010    subtracts one matrix from another and stores the result in the destination register
(free)       1011    still free
(free)       1100    still free
(free)       1101    still free
mulw         1110    multiplies a matrix by a scalar and stores the result in the destination register
(free)       1111    still free
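As a rough software reference for the four matrix instructions, the following sketch models 4x4 matrices of 16-bit signed integers as nested lists. The overflow behaviour is not specified in this document; wrapping results to 16-bit two's complement (`wrap16`) is an assumption made here, and the hardware may behave differently.

```python
N = 4  # matrix dimension, as given in the key data


def wrap16(x):
    """Wrap an integer to signed 16-bit two's complement (assumed overflow rule)."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def addm(a, b):
    """Element-wise sum of two 4x4 matrices."""
    return [[wrap16(a[r][c] + b[r][c]) for c in range(N)] for r in range(N)]


def subm(a, b):
    """Element-wise difference a - b of two 4x4 matrices."""
    return [[wrap16(a[r][c] - b[r][c]) for c in range(N)] for r in range(N)]


def mulm(a, b):
    """Matrix product a * b of two 4x4 matrices."""
    return [[wrap16(sum(a[r][k] * b[k][c] for k in range(N)))
             for c in range(N)] for r in range(N)]


def mulw(a, scalar):
    """Product of a 4x4 matrix and a 16-bit scalar."""
    return [[wrap16(a[r][c] * scalar) for c in range(N)] for r in range(N)]
```

A model like this can serve as a golden reference when checking simulation results, provided the wraparound assumption matches the actual design.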

Special purpose of the opcode

Since a register is encoded in 4 bits, a single register field can address only 16 registers. The design, however, uses 32 registers (16 matrix registers and 16 "normal" registers). So how do we know which register is meant?

This is where the opcode comes into play.

The two most significant bits of the opcode determine how the register fields are interpreted:

  • 00xx: operations that do not involve a matrix
  • 01xx: load or store instructions
  • 10xx: the source registers are matrix registers
  • 11xx: the first source register is a matrix register and the second source register is a 16-bit value
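The grouping above can be sketched as a small decoder that maps an opcode to the register file of each operand. The function names are hypothetical. For the 01xx group, the sketch relies on an observation from the opcode table rather than a stated rule: bit 0 is 0 for `loadm`/`storem` and 1 for `loadw`/`storew`. The destination of the 10xx and 11xx groups is taken to be a matrix register, as in the assembler examples `mulm m2,m0,m1` and `mulw m1,m0,i0`.

```python
def operand_files(opcode):
    """Return the register file ('m' or 'i') of each register field of an opcode."""
    if not 0 <= opcode <= 0b1111:
        raise ValueError("opcode must fit in 4 bits")
    group = opcode >> 2  # the two most significant opcode bits
    if group == 0b00:
        return ("i", "i", "i")   # no matrix involved
    if group == 0b10:
        return ("m", "m", "m")   # matrix operands
    if group == 0b11:
        return ("m", "m", "i")   # matrix and 16-bit value
    # 01xx: load/store with one register field; bit 0 selects word vs matrix
    return ("i",) if opcode & 1 else ("m",)


def register_names(opcode, *fields):
    """Format register field values as names such as 'm1' or 'iF'."""
    return [f"{kind}{index:X}" for kind, index in zip(operand_files(opcode), fields)]
```

For instance, `register_names(0b1110, 1, 0, 0)` yields `['m1', 'm0', 'i0']`, matching the operands of `mulw m1,m0,i0`.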

Assembler

Assembler          Operation
nop                nothing
jmp addr12         PC <- addr12
brz i0,imm8        i0 = 0: PC <- PC+1; otherwise: PC <- imm8
sub i2,i0,i1       i2 <- i0-i1
loadm m0,addr10    m0 <- (addr10)
loadw i0,addr8     i0 <- (addr8)
storem m0,addr10   (addr10) <- m0
storew i0,addr8    (addr8) <- i0
mulm m2,m0,m1      m2 <- m0*m1
addm m2,m0,m1      m2 <- m0+m1
subm m2,m0,m1      m2 <- m0-m1
mulw m1,m0,i0      m1 <- m0*i0


Legend:

m0,m1,...,mF ... matrix registers

i0,i1,...,iF ... 16-bit integer registers

addr8 ... 8-bit address

addr10 ... 10-bit address

addr12 ... 12-bit address

imm8 ... 8-bit signed immediate
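To illustrate how the mnemonics and the instruction formats fit together, here is a minimal single-line assembler sketch. It is not the assembler from Assembler.zip: the function names, the error handling and the number syntax (Python-style literals such as `0x10` or `-1`) are assumptions made for this example.

```python
OPCODES = {
    "nop": 0b0000, "jmp": 0b0001, "brz": 0b0010, "sub": 0b0011,
    "loadm": 0b0100, "loadw": 0b0101, "storem": 0b0110, "storew": 0b0111,
    "mulm": 0b1000, "addm": 0b1001, "subm": 0b1010, "mulw": 0b1110,
}


def _reg(token, kind, width=4):
    """Parse a register name such as 'm2' or 'iF' and return its index."""
    if token[:1].lower() != kind:
        raise ValueError(f"expected {kind}-register, got {token!r}")
    index = int(token[1:], 16)
    if not 0 <= index < (1 << width):
        raise ValueError(f"register {token!r} does not fit in {width} bits")
    return index


def _addr(token, width):
    """Parse an unsigned address and check that it fits in `width` bits."""
    value = int(token, 0)
    if not 0 <= value < (1 << width):
        raise ValueError(f"address {token!r} does not fit in {width} bits")
    return value


def assemble(line):
    """Translate one assembler line into a 16-bit instruction word."""
    mnemonic, _, rest = line.strip().partition(" ")
    mnemonic = mnemonic.lower()
    args = [a.strip() for a in rest.split(",")] if rest.strip() else []
    word = OPCODES[mnemonic] << 12
    if mnemonic == "nop":
        return word
    if mnemonic == "jmp":
        return word | _addr(args[0], 12)
    if mnemonic == "brz":
        # imm8 is signed, so it is masked to its 8-bit two's complement form
        return word | _reg(args[0], "i") << 8 | (int(args[1], 0) & 0xFF)
    if mnemonic in ("loadw", "storew"):
        return word | _reg(args[0], "i") << 8 | _addr(args[1], 8)
    if mnemonic in ("loadm", "storem"):
        # the matrix load/store format has only a 2-bit register field
        return word | _reg(args[0], "m", width=2) << 10 | _addr(args[1], 10)
    # remaining mnemonics use the 3-operand format
    kinds = {"sub": "iii", "mulw": "mmi"}.get(mnemonic, "mmm")
    dest, src1, src2 = (_reg(a, k) for a, k in zip(args, kinds))
    return word | dest << 8 | src1 << 4 | src2
```

With these assumptions, `assemble("mulm m2,m0,m1")` returns `0x8201` and `assemble("loadw i0,0x10")` returns `0x5010`.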


Assembler:

Assembler.zip

Block Diagram

The matrix processor (MatPro) is designed as a coprocessor and communicates with the main processor (JOP in this case) through the SimpCon interface. The memory is split into a data cache and an instruction cache to keep memory access simple. The data cache is double buffered to allow more efficient data transfer between the processors. The main processor writes all data and instructions into the caches and sets RUN; MatPro then starts its program and sets READY when program execution is done.
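The RUN/READY sequence described above can be sketched as a toy software model. Everything beyond the two flag names is an assumption: the class and method names are hypothetical, the data cache size is arbitrary, and the double buffer is modelled as two banks that swap when RUN is set, which may not match the real hardware.

```python
class MatProHandshake:
    """Toy model of the RUN/READY protocol; not cycle accurate."""

    def __init__(self, data_words=256):
        self.instr_cache = []
        # Two data banks (assumed): one for the host, one for the coprocessor.
        self.banks = [[0] * data_words, [0] * data_words]
        self.host_bank = 0
        self.run = False
        self.ready = False

    def host_write(self, program, data):
        """Host side: fill the instruction cache and the host-visible data bank."""
        self.instr_cache = list(program)
        bank = self.banks[self.host_bank]
        for addr, value in data.items():
            bank[addr] = value

    def host_set_run(self):
        """Host side: hand the filled bank to the coprocessor and set RUN."""
        self.host_bank ^= 1
        self.ready = False
        self.run = True

    def coprocessor_execute(self, execute):
        """Coprocessor side: run the program on its bank, then set READY."""
        if not self.run:
            return
        cop_bank = self.banks[self.host_bank ^ 1]
        for word in self.instr_cache:
            execute(word, cop_bank)  # caller-supplied instruction semantics
        self.run = False
        self.ready = True
```

In this model the host can already fill the other bank while the coprocessor works on its own bank, which is the benefit the double buffering is meant to provide.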

MatPro Schematic