Pentium Pro

The Pentium Pro pipeline had extra decode stages to dynamically translate IA-32 instructions into buffered micro-operation sequences, which could then be analysed, reordered, and renamed in order to detect parallelizable operations that may be issued to more than one execution unit at once. The Pentium Pro was the first processor in the x86 family to support upgradeable microcode under BIOS and/or operating system (OS) control.[5] Micro-ops exit the re-order buffer (ROB) and enter a reservation station (RS), where they await dispatch to the execution units. Of the two integer units, only the one that shares a path with the FPU on port 0 has the full complement of functions: a barrel shifter, multiplier, divider, and support for LEA instructions. The second integer unit, connected to port 1, lacks these facilities and is limited to simple operations such as add, subtract, and the calculation of branch target addresses. Division and square root can run concurrently with adds and multiplies; they block those operations only when the result has to be stored in the ROB.[9]: 3

The Pentium Pro's integer performance lead disappeared rapidly: it was overtaken first by the MIPS Technologies R10000 in January 1996, and then by Digital Equipment Corporation's EV56 variant of the Alpha 21164.[10] Reviewers quickly identified very slow writes to video memory as the weak spot of the P6 platform, with performance as low as 10% of an identically clocked Pentium system in benchmarks such as VIDSPEED.

Because the processor die and the L2 cache die were bonded into one package before testing was possible, a single, tiny flaw in either die made it necessary to discard the entire assembly. This was one of the reasons for the Pentium Pro's relatively low production yield and high cost. The chip was popular in symmetric multiprocessing configurations, with dual and quad SMP server and workstation setups being commonplace. Intel did not offer a mobile version of the original Pentium Pro because of power draw and heat concerns. As Slot 1 motherboards became prevalent, several manufacturers released slotket (or slocket) adapters, such as the Tyan M2020, Asus C-P6S1, Tekram P6SL1, and the Abit KP6.[23]

Futurebus had been intended as an advanced bus to replace the VMEbus used with the Motorola 68000 from the late 1970s, but it stagnated in its standardization committee for more than a decade.[23] Intel's iAPX 432 initiative was also a commercial failure, but in the process Intel learned how to build a split-transaction bus to support a cacheless multiprocessor system. The i960 further developed the split-transaction iAPX 432 bus to include a cache coherency protocol, ending up with a feature set highly reminiscent of the original Futurebus ambitions.[23] The Pentium Pro was designed to include the 4-way SMP split-transaction cache-coherent bus as a mandatory feature of every chip produced.[23] The Pentium Pro was not successful as a machine for the masses because of its poor 16-bit performance under Windows 95 and many other 16-bit and mixed 16/32-bit operating systems. It did, however, see significant success in the file server space thanks to its advanced, integrated bus design,[23] which brought many features formerly available only in the pricey workstation segment into the commodity marketplace.
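
The split between the two integer units described above (a full-featured unit on port 0, a simple unit on port 1) can be illustrated with a toy dispatch model. This is only a sketch under stated assumptions, not Intel's actual scheduling logic: the operation names, the `eligible_ports` and `dispatch` helpers, and the policy of preferring port 1 for simple operations are all hypothetical, and the model ignores data dependencies, latencies, and every other execution port.

```python
# Toy model of the integer-unit split described in the text.
# Assumption: names and scheduling policy are illustrative, not Intel's design.

# Operations that need the barrel shifter, multiplier, divider, or LEA support
# can only go to the full-featured integer unit on port 0.
PORT0_ONLY = {"shift", "mul", "div", "lea"}
# Simple operations can go to either integer unit.
SIMPLE = {"add", "sub", "branch_target"}


def eligible_ports(op):
    """Return the list of ports (0 and/or 1) that can execute `op`."""
    if op in PORT0_ONLY:
        return [0]
    if op in SIMPLE:
        return [0, 1]
    raise ValueError(f"unknown operation: {op}")


def dispatch(ops):
    """Greedily assign waiting ops to ports, at most one op per port per cycle.

    Ops are considered oldest first. Simple ops prefer port 1 (hypothetical
    policy) so that port 0 stays free for ops that can run nowhere else.
    Returns a list of {port: op} dicts, one per cycle.
    """
    waiting = list(ops)
    schedule = []
    while waiting:
        cycle = {}
        remaining = []
        for op in waiting:
            free = [p for p in sorted(eligible_ports(op), reverse=True)
                    if p not in cycle]
            if free:
                cycle[free[0]] = op
            else:
                remaining.append(op)  # no free port this cycle, keep waiting
        schedule.append(cycle)
        waiting = remaining
    return schedule


if __name__ == "__main__":
    for n, cycle in enumerate(dispatch(["mul", "shift", "add"])):
        print(n, cycle)
```

In this model, two consecutive port-0-only operations such as `mul` and `shift` serialize across two cycles, while a younger `add` can slip onto port 1 in the first cycle, which is the kind of reordering the reservation station makes possible.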
Block Diagram of the Pentium Pro's Microarchitecture
200 MHz Pentium Pro with a 512 KB L2 cache in PGA package
200 MHz Pentium Pro with a 1 MB L2 cache in PPGA package
Decapped Pentium Pro 256 KB
Pentium II Overdrive with heatsink removed. Flip-chip Deschutes core is on the left. 512 KB cache is on the right.[20]