ILLIAC IV
A formal design did not start until 1960, when Slotnick was working at Westinghouse Electric and arranged development funding under a US Air Force contract. Political tension over the funding from the US Department of Defense led ARPA and the university to fear for the machine's safety. After three years of thorough modification to fix various flaws, ILLIAC IV was connected to the ARPANET for distributed use in November 1975, becoming the first network-available supercomputer, beating the Cray-1 by nearly 12 months. Running at half its design speed, the one-quadrant ILLIAC IV delivered 50 MFLOPS peak,[4] making it the fastest computer in the world at that time.[8] Originally equipped with Williams tube memory, a magnetic drum from Engineering Research Associates was later added. Nevertheless, further consideration showed that parallel machines could still offer significant performance in some applications; Slotnick and a colleague, John Cocke (better known as the inventor of RISC), wrote a paper on the concept in 1958.[10] After a short time at IBM and then another at Aeronca Aircraft, Slotnick ended up at Westinghouse's Air Arm division, which worked on radar and similar systems.[11] Under a contract from the US Air Force's RADC, Slotnick was able to build a team to design a system with 1,024 bit-serial ALUs, known as "processing elements" or PEs. As the design work continued, the primary sponsor within the US Department of Defense was killed in an accident and no further funding was forthcoming. Illinois had been designing and building large computers for the U.S.
Department of Defense and the Advanced Research Projects Agency (ARPA) since 1949.[18] In contrast to the bit-serial concept of SOLOMON, in ILLIAC IV the PEs were upgraded to be full 64-bit (bit-parallel) processors, using 12,000 gates and 2,048 words of thin-film memory.[19] Based on a 25 MHz clock, with all 256 PEs running on a single program, the machine was designed to deliver 1 billion floating point operations per second, or in today's terminology, 1 GFLOPS.[23][24] Sample work at the university was primarily aimed at ways to efficiently fill the PEs with data, thus conducting the first "stress test" in computer development. In order to make this as easy as possible, several new computer languages were created; IVTRAN and TRANQUIL were parallelized versions of FORTRAN, and Glypnir was a similar conversion of ALGOL. They would also provide a Burroughs B6500 mainframe to act as a front-end controller, loading data from secondary storage and performing other housekeeping tasks. Connected to the B6500 was a third-party laser optical recording medium, a write-once system that stored up to 1 Tbit on thin metal film coated on a strip of polyester sheet carried by a rotating drum. Attempts to increase the size of the cabinets to make room for the memory caused serious problems with signal propagation.[19] In 1969, these problems, combined with the resulting cost overruns from the delays, led to the decision to build only a single 64-PE quadrant,[19] thereby limiting the machine's speed to about 200 MFLOPS. This unusual arrangement was due to the constraint that no government employee could be paid more than a member of Congress, and many ILLIAC IV personnel made more than that limit. It suffered from all sorts of problems, from cracking PCBs to bad resistors to the packaging of the TI ICs being highly sensitive to humidity. Starting in June 1975, a concerted four-month effort began that required, among other changes, replacing 110,000 resistors, rewiring parts to fix propagation
delay issues, improving filtering in the power supplies, and a further reduction in clock speed to 13 MHz.[37] Over time this improved, notably after Ames programmers wrote their own version of FORTRAN, CFD, and learned how to parallelize I/O into the limited PEMs. One control unit and one processing element chassis from the machine are now on display at the Computer History Museum in Mountain View, less than a mile from its operational site.[40] In terms of project management it is widely regarded as a failure, running over its cost estimates by four times and requiring years of remedial efforts to make it work. Slotnick received a lot of criticism when he chose Fairchild Semiconductor to produce the memory ICs, as at the time the production line was an empty room and the design existed only on paper. At the original 25 MHz design speed, impedance in the ground wiring proved to be a serious problem, demanding that the PCBs be as small as possible.[50] It included four 64-bit registers, using an accumulator A, an operand buffer B and a secondary scratchpad S. The fourth, R, was used to broadcast or receive data from the other PEs. For this reason, the CU could not be used to coordinate actions; instead, the entire system was clock-synchronous, with all operations in the PEs guaranteed to take the same amount of time no matter what the operands were. This process is interrupted by branches, which cause the PC to jump to one of two locations depending on a test, like whether a given memory address holds a non-zero value. Logical tests did not change the PC; instead, they set "mode bits" that told the PE whether or not to run the next arithmetic instruction.
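The mode-bit mechanism described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a model of the real instruction set: the class `PE` and the functions `set_mode`, `broadcast_add_b` and `run_demo` are hypothetical names chosen for illustration, and Python floats stand in for the 64-bit registers. It shows a single instruction stream issued to all 64 PEs of one quadrant, with a logical test setting each PE's mode bit rather than changing a program counter.

```python
from dataclasses import dataclass

NUM_PES = 64  # one quadrant, as built


@dataclass
class PE:
    """Hypothetical model of one processing element (names are illustrative)."""
    a: float = 0.0     # accumulator A
    b: float = 0.0     # operand buffer B
    s: float = 0.0     # scratchpad S
    r: float = 0.0     # register R, used to exchange data with other PEs
    mode: bool = True  # mode bit: True means "run the next arithmetic instruction"


def set_mode(pes, predicate):
    """Logical test: every PE evaluates it locally and sets its own mode bit.

    No program counter is changed; all PEs stay on the same instruction.
    """
    for pe in pes:
        pe.mode = predicate(pe)


def broadcast_add_b(pes):
    """One arithmetic instruction issued to all PEs in lockstep.

    PEs whose mode bit is off simply leave their accumulator untouched.
    """
    for pe in pes:
        if pe.mode:
            pe.a += pe.b


def run_demo():
    # Each PE starts with its own index in A and the constant 10 in B.
    pes = [PE(a=float(i), b=10.0) for i in range(NUM_PES)]
    # Enable only the PEs holding an even value in A.
    set_mode(pes, lambda pe: pe.a % 2 == 0)
    # Every PE receives the same add; only the enabled ones apply it.
    broadcast_add_b(pes)
    return [pe.a for pe in pes]


if __name__ == "__main__":
    print(run_demo()[:4])
```

In this sketch the conditional behaviour lives entirely in the per-PE `mode` flag, so the issuing loop never branches on data. That mirrors the lockstep arrangement in the text, where every PE takes the same time for every operation.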