What is the Alpha 21064 Processor?
The Alpha 21064 is a microprocessor introduced in 1992 by Digital Equipment Corporation. It is also known by its code name, EV4. Its predecessor, the EV3, was fabricated using Digital’s 1.0-micrometer CMOS-3 process, whereas the EV4 was fabricated using a 0.75-micrometer CMOS-4 process. This fabrication technology was considered a breakthrough and helped make the 21064 the fastest microprocessor of its time, until IBM launched the multi-chip POWER2, which then took that title.
Alpha AXP Architecture:
The Alpha AXP architecture provided a large, 64-bit linear address space, and a fully 64-bit operating system was available for it in DEC OSF/1. Because it was a 64-bit architecture, it avoided hardware baggage such as orphaned 32-bit instructions and other compatibility issues. The architecture also avoided condition codes, special registers, suppressed instructions, and branch delay slots. Similarly, it avoided direct hardware support for features that would limit the performance of the anticipated systems by restricting cycle time; where such features are needed, the design relies on software assistance to provide full functionality. It is a load-and-store architecture: data moves between registers and memory without computation, and all computation is performed on values held in registers.
- Addressing: The AXP employed little-endian byte addressing, as on Intel x86 and VAX computers. Using the byte manipulation instructions, with a single instruction modification to the sequence, systems could access both big-endian and little-endian data. The AXP also performs virtual-to-physical address mapping on a per-page basis, and its pages are 8 Kbytes.
- Data Types: The architecture’s basic unit of data was the 64-bit quadword, but it also supported 32-bit longwords. The floating-point data types included both IEEE and VAX formats, each in 32-bit single-precision and 64-bit double-precision forms. Byte and word data types were not supported by direct load-and-store instructions; they were handled by a short sequence of instructions instead.
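The addressing and data-type points above can be illustrated with a small Python sketch. It assumes the 8-Kbyte page size stated above and models a byte access as an aligned quadword load followed by a shift and mask, which is roughly what a short extract sequence does. The function names are hypothetical and are not Alpha instruction mnemonics.

```python
PAGE_SIZE = 8 * 1024                      # 8-Kbyte pages, as stated above
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # 13 bits of offset within a page


def split_virtual_address(va):
    """Return (virtual page number, byte offset within the page)."""
    return va >> OFFSET_BITS, va & (PAGE_SIZE - 1)


def load_quadword(memory, address):
    """Load the aligned 64-bit little-endian quadword that contains address."""
    base = address & ~7  # clear the low three bits to align to 8 bytes
    return int.from_bytes(memory[base:base + 8], "little")


def load_byte(memory, address):
    """Emulate a byte load without a byte-load instruction.

    Assumption: a quadword load plus shift-and-mask stands in for the
    short instruction sequence mentioned in the text.
    """
    quad = load_quadword(memory, address)
    shift = (address & 7) * 8  # bit position of the byte inside the quadword
    return (quad >> shift) & 0xFF


if __name__ == "__main__":
    memory = bytes(range(16))  # toy memory: byte i holds the value i
    print(hex(load_quadword(memory, 0)))
    print(load_byte(memory, 10))
    print(split_virtual_address(0x2345))
```

Because the quadword is interpreted as little-endian, the byte at the lowest address ends up in the least significant bits, so the shift amount is simply the byte's position within the quadword times eight.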
The Alpha 21064:
The Alpha 21064 was the first implementation of the Alpha AXP architecture. It had 1.68 million transistors. It achieved high performance through superscalar operation combined with an exceptionally high-frequency internal clock. It also had an on-chip programmable system clock that made it easy to accommodate a range of system designs: a system could run the CPU at two to eight times the system clock frequency. Cycle time and the number of instructions completed per cycle are the two performance factors under the microprocessor designer's control.
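The relationship between the system clock, the CPU clock multiplier, and the two designer-controlled factors can be sketched as a few lines of arithmetic. This is a minimal illustration with hypothetical function names; it assumes any multiplier from two to eight is acceptable, and the 25 MHz system clock in the example is an illustrative value rather than a figure from a specific system.

```python
def cpu_clock_mhz(system_clock_mhz, multiplier):
    """CPU clock derived from the system clock (multiplier of 2 to 8)."""
    if not 2 <= multiplier <= 8:
        raise ValueError("multiplier must be between 2 and 8")
    return system_clock_mhz * multiplier


def cycle_time_ns(cpu_mhz):
    """Length of one CPU cycle in nanoseconds."""
    return 1000.0 / cpu_mhz


def peak_rate_mips(cpu_mhz, instructions_per_cycle):
    """Upper bound on millions of instructions per second.

    Real throughput is lower whenever stalls or dependencies
    prevent an instruction from issuing.
    """
    return cpu_mhz * instructions_per_cycle


if __name__ == "__main__":
    cpu = cpu_clock_mhz(25, 8)      # 200 MHz CPU clock
    print(cycle_time_ns(cpu))       # 5.0 ns per cycle
    print(peak_rate_mips(cpu, 2))   # 400 with two instructions per cycle
```

The peak figure is only an upper bound: it multiplies the two factors the designer controls and ignores everything that keeps the pipeline from issuing at full width.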
- Pipeline: There are two pipelines: an integer pipeline with 7 stages and a floating-point pipeline with 10 stages. The first four stages are shared by both, and each stage can process up to two instructions in parallel. In the Instruction Fetch (IF) stage, the processor fetches a pair of instructions each cycle from the 8-Kbyte instruction cache. The swap stage handles instruction prefetching, branch prediction, and cache index calculation. The issue-zero (I0) stage checks intrafetch dependencies and completes the decoding and setup for the issue-one (I1) stage. The integer and floating-point register files are read in the issue-one stage and supply data for the integer, floating-point, load, and branch calculations.
- Integer unit: The integer unit consists of the Integer Register File (IRF) and the Ebox. The IRF contains thirty-two 64-bit general-purpose registers. It has six ports in total, four read ports and two write ports, to allow integer calculations to execute in parallel with load, store, and branch operations. The data path includes dedicated adder, shifter, multiplier, and logic units. The adder and logic units produce results in one cycle, while the shifter requires two cycles. The shifter is fully pipelined, but the multiplier is not pipelined, to save area.
- Floating-point unit: The floating-point unit consists of the Fbox and the Floating-point Register File (FRF). The unit combines short latencies with maximum throughput. It contains a 32-entry by 64-bit register file with two write ports and three read ports. A new instruction can be initiated every cycle, with dependent operations incurring a six-cycle latency. The fast cycle-time goal translates into longer total latency as measured in cycles.
- Address unit: The address unit, also known as the Abox, performs all the load-and-store operations. A dedicated displacement adder lets it calculate addresses in parallel with the other units. It also has a 32-entry data translation look-aside buffer, in which each entry can map a range of 8 Kbytes, 64 Kbytes, 512 Kbytes, or 4 Mbytes. The address unit can also block independent instructions. The write buffer merges data from adjacent stores to reduce off-chip bandwidth requirements. It also allows early service for critical load data by temporarily delaying stores that would otherwise have occupied the data bus. By accessing the current store tag with the last store data in separate cache tag and data arrays, the address unit allows back-to-back load-and-store operations in any order.
- Caches: The Alpha 21064 has two on-die primary caches, the I-Cache and the D-Cache. The I-Cache is an 8 KB instruction cache, while the D-Cache is an 8 KB data cache. Both are built from six-transistor static random access memory (SRAM) cells. The B-Cache was an optional external secondary cache with a capacity of 128 KB to 16 MB. It operated at one-third to one-sixteenth of the internal clock frequency, which is 12.5 to 66.67 MHz for a 200 MHz processor. All three caches are direct-mapped; the I-Cache and D-Cache have a 32-byte line size, while the B-Cache has a 128-byte line size by default.
- Interface: The interface is highly flexible in order to accommodate a range of system designs. Although the chip operates from a 3.3-volt power supply, it can also interface with 5-volt components. The external data bus is 128 bits wide by default, and its width is configurable, so a system can use either the full 128-bit interface or a 64-bit interface.
- Fabrication: Lessons learned from the EV3 helped in the fabrication of the EV4. The EV3 was fabricated using Digital’s 1.0-micrometer CMOS-3 process. The Alpha 21064 was fabricated using a 0.75-micrometer CMOS-4 process with three levels of aluminum interconnect. The die contained 1.68 million transistors and measured 13.9 mm by 16.8 mm, for an area of 233.52 mm². Later parts were fabricated using the CMOS-4S process, which had a 0.675-micrometer feature size, instead of the regular CMOS-4. This shrank the die from 233.52 mm² to 186 mm².
- Upgraded Versions: Development continued beyond the original Alpha 21064, and several versions followed: the Alpha 21064A, Alpha 21066, Alpha 21066A, Alpha 21068, and Alpha 21068A.
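The direct-mapped organization described in the Caches item can be sketched in Python as follows. This is a generic illustration of how a direct-mapped cache splits an address into tag, index, and offset, using the 8 KB capacity and 32-byte line size given for the primary caches; it is not a description of the chip's actual circuitry, and the function names are hypothetical.

```python
def cache_geometry(capacity_bytes, line_bytes):
    """Return (number of lines, offset bits, index bits).

    Assumes both sizes are powers of two.
    """
    lines = capacity_bytes // line_bytes
    offset_bits = line_bytes.bit_length() - 1
    index_bits = lines.bit_length() - 1
    return lines, offset_bits, index_bits


def split_address(address, capacity_bytes, line_bytes):
    """Split an address into (tag, line index, byte offset within the line)."""
    lines, offset_bits, index_bits = cache_geometry(capacity_bytes, line_bytes)
    offset = address & (line_bytes - 1)
    index = (address >> offset_bits) & (lines - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset


if __name__ == "__main__":
    # Primary cache figures from the text: 8 KB capacity, 32-byte lines.
    print(cache_geometry(8 * 1024, 32))         # (256, 5, 8)
    # Two addresses exactly 8 KB apart select the same line with
    # different tags, so in a direct-mapped cache they evict each other.
    print(split_address(0x0345, 8 * 1024, 32))  # (0, 26, 5)
    print(split_address(0x2345, 8 * 1024, 32))  # (1, 26, 5)
```

With one possible location per address, a lookup needs only a single tag comparison, which suits a design that prioritizes a short cycle time. The cost is that addresses sharing an index conflict with each other, as the last two calls show.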