A look at the hidden powers of a CPU - how it works and a little bit of history

Dec 6, 2006 15:35 GMT  ·  By

The microprocessor (some people prefer to call it the Central Processing Unit, or CPU for short) is undoubtedly the most important piece of hardware inside a PC. It is, basically, the brains of the entire system. That's why I think it's appropriate to open this series of articles about hardware components with it. The series will present historical facts about every computer hardware component and peripheral, as well as basic insights into how each of them works.

Those hidden powers I mentioned in the title are in fact the monsters lurking inside the surprisingly complex piece of hardware that is the CPU. Don't worry, I won't get too obscure about the whole issue. I only want to present the most important features, and for this I intend to split the article about CPUs into two parts, so be sure to check the second part too.

Let's begin with a little bit of history. As you may remember from the previous articles that dealt with the computer time-line, the computers of the early 1970s were still gargantuan machines stored in heavily air-conditioned halls and attended to by technicians who had to handle them very carefully. Back then, computers were called mainframes, and they featured a sort of Central Processing Unit that resembled a steel cabinet bigger than a refrigerator, full of circuit boards crowded with transistors. Earlier computers had relied on vacuum tubes, and only the latest machines included primitive integrated circuits, in which a few transistors could form a small, compact package. In those days the CPU was proudly presented as a big pile of equipment, and nobody imagined that it could ever be reduced to a chip of silicon the size of your fingernail.

1971 was an important year for a small and obscure Silicon Valley company contracted to design an integrated circuit for a Busicom business calculator. Instead of performing hardwired calculations like other computer chips of the day, this one was designed as a relatively small CPU that could be programmed to carry out almost any calculation. That obscure company was Intel, and its first microprocessor was the 4004. This "primitive" CPU eliminated the expensive and time-consuming work of designing a custom wired chip, as the new technology allowed instructions to be stored in a separate ROM (Read Only Memory) chip. Thus the concept of a general-purpose CPU was born, and it evolved into the microprocessors that power our PCs.

You are probably curious to know how this wonder CPU actually works, that is, if you don't already know. If you have read the first article of my time-line, you may have stumbled over the concepts of programmability and automation. These are, after all, what really define computers and make them so versatile. As I mentioned before, in the early stages of computer evolution, back in the 1930s and 1940s, the so-called computers had to be programmed by rewiring their circuits in order to perform a certain calculation or group of calculations in a loop. Von Neumann, alongside a bunch of other bright scientists, grew tired of this complicated procedure and came up with the concept of the stored-instruction digital computer. The new concept was made possible by memories that could store sets of instructions to be performed over and over. Of course, some sort of logical language had to be implemented to vary the instruction path. And there you have it: the programmable CPU in its infancy.

A basic definition informs us that the CPU is the component of the computer that fetches instructions and data from memory and carries out the instructions in the form of data manipulation and numerical calculations. But why is it called CENTRAL? Well, all the memory and the input/output devices (peripherals) must somehow connect to the CPU, so it's only natural to position the CPU somewhere in the middle. How about the Processing Unit part? This is quite intuitive: just remember that all instruction execution and mathematical computation is PROCESSED by the microprocessor. Furthermore, the CPU integrates a program counter that points to the next instruction to be executed. It goes through a cycle in which it fetches the instruction stored at the address held in the program counter, retrieves the required data from memory, performs the calculation indicated by the instruction and stores the result. The program counter is then incremented to point to the next instruction, and the cycle starts over.
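To make this cycle more concrete, here is a minimal sketch of a fetch-decode-execute loop for a made-up three-instruction machine. The opcodes, the accumulator design and the memory layout are entirely hypothetical, invented just for illustration:

```c
#include <stdio.h>

/* A toy fetch-decode-execute loop. The three opcodes and the memory
   layout are hypothetical, chosen only to illustrate the cycle. */
enum { LOAD = 1, ADD = 2, HALT = 3 };

int main(void)
{
    /* program: LOAD from cell 10, ADD from cell 11, HALT */
    int memory[16] = { LOAD, 10, ADD, 11, HALT, 0, 0, 0, 0, 0, 40, 2 };
    int pc = 0;          /* program counter: address of the next instruction */
    int acc = 0;         /* accumulator register */

    for (;;) {
        int instruction = memory[pc++];   /* fetch, then advance the counter */
        switch (instruction) {            /* decode */
        case LOAD: acc = memory[memory[pc++]]; break;   /* execute */
        case ADD:  acc += memory[memory[pc++]]; break;
        case HALT: printf("result: %d\n", acc); return 0;
        }
    }
}
```

Run it and the toy program loads 40, adds 2 and prints 42; real CPUs do essentially the same dance, only billions of times per second.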

Let us now see why CPUs had to evolve into more powerful ones. The main reason is the need for increased processing speed to handle more complex instructions. This processing speed depends heavily on the width of the registers a CPU can handle, as well as on the clock speed, measured in MHz.

Intel's 4004 microprocessor chip worked with 4-bit registers. Four bits give you sixteen possible values, which is enough to handle standard decimal arithmetic for a calculator (remember not to confuse calculators with computers). The problem is that there is another kind of calculation a CPU needs to perform: it has to figure out where exactly in memory its instructions are. Put more clearly, it has to calculate memory locations in order to process program branch instructions or to index data tables.

As 4-bit registers offer only sixteen possibilities, even the 4004 needed to address 640 bytes of memory to handle its calculator functions. For comparison, a modern 32-bit chip such as the Intel Pentium 4 can address 4,294,967,296 bytes (4 GB) of physical RAM and, through segmentation, a staggering 64 TB of virtual memory, while the latest 64-bit designs can theoretically address 18,446,744,073,709,551,616 bytes, although motherboards support only a small fraction of that. The larger the addressing space, the larger the programs a CPU can hold and process. This idea drove the push for more bits in microprocessors, and present-day CPUs already integrate 64-bit registers, though there isn't much support from the software side yet. Microsoft's Windows Vista, for example, comes in both 32-bit and 64-bit versions, a sign that Microsoft intends to support the adoption of the wider registers.
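The arithmetic behind these figures is simple: n address bits yield 2^n distinct addresses. Here is a minimal sketch that prints the address space for a few widths (the list of widths is just an example):

```c
#include <stdio.h>

/* The number of addressable bytes doubles with every extra address
   bit: n bits give 2^n distinct addresses. */
int main(void)
{
    int widths[] = { 4, 8, 16, 32, 64 };
    for (int i = 0; i < 5; i++) {
        int n = widths[i];
        /* shifting by 64 is undefined in C, so 2^64 is a special case */
        if (n < 64)
            printf("%2d-bit addresses: %llu bytes\n", n, 1ULL << n);
        else
            printf("%2d-bit addresses: 18446744073709551616 bytes (2^64)\n", n);
    }
    return 0;
}
```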

With a total memory address space of 640 bytes, the Intel 4004 chip was only good for calculator integration. In 1972, Intel's 8008 was the first of many 8-bit microprocessors to power the personal computer revolution. This improved model was limited to 16 Kilobytes of address space, but in those days no one could afford more RAM.

Then a gradual improvement began: Intel introduced the 8080 microprocessor, which could address 64 Kilobytes of memory and executed instructions about ten times faster than the 8008. Around the same period, Motorola brought out the 6800, with similar performance. The 8080 became the core of serious microcomputers, and its descendant, the Intel 8088, was the undisputed choice for IBM's PC. Apple, on the other hand, sided with the rival camp, from the 6800-inspired MOS 6502 in its early machines to Motorola's 68000 family later on.

The major advantage of the 8088 was its ability to address as much as 1 Megabyte of memory. With this improvement, large spreadsheets or large documents could be read in from the disk and held in RAM for fast access and manipulation.

Intel and Motorola were not the only players in the CPU arena. Companies such as AMD, Cyrix and Transmeta tried to keep up and launched competitive CPUs of their own, but only AMD managed to remain a serious competitor for Intel. Nowadays Intel continues its supremacy, closely followed by AMD.

Back to our beloved CPU now. Another important aspect of the evolution of CPUs is the way the overall architecture has been implemented over the generations and the innovations that have been introduced. The first serious issue to arise was the slow speed of main memory. Memory capacities kept expanding and microprocessor cores ran faster and faster, but the physical memory couldn't keep pace, slowing the CPU down with it.

Large, low-power memories cannot run as fast as smaller, higher-power RAM chips, so CPU engineers had to come up with a solution to keep CPUs running at full speed. At first, the architecture was changed so that a small amount of fast memory was placed between the large main RAM and the microprocessor. The purpose of this smaller memory was to hold instructions that get executed repeatedly, or data that is accessed often.

This smaller memory was dubbed cache RAM, and it allows the microprocessor to execute at full speed. Clearly, with larger cache RAM capacities, a greater percentage of accesses are cache hits, and the microprocessor can keep running at full speed for longer stretches. You certainly don't want program execution to lead to instructions that are not stored in the cache, because when this happens - a cache miss - the instructions must be fetched from main memory while the microprocessor stops and waits. That waiting time is known as latency.
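To see how hits and misses come about, here is a toy sketch of a direct-mapped cache; the line count, the indexing scheme and the access pattern are all made up for the sake of the example:

```c
#include <stdio.h>

/* A toy direct-mapped cache: 8 lines, each holding one address.
   The cache index is taken from the low bits of the address.
   All values are hypothetical, chosen for illustration only. */
#define LINES 8

int main(void)
{
    long tags[LINES];
    for (int i = 0; i < LINES; i++) tags[i] = -1;   /* start empty */

    /* a loop re-reading the same few addresses hits the cache often */
    long accesses[] = { 0x100, 0x104, 0x100, 0x104, 0x200, 0x100 };
    int hits = 0, misses = 0;

    for (int i = 0; i < 6; i++) {
        long addr = accesses[i];
        int line = (addr >> 2) % LINES;   /* pick a line from address bits */
        if (tags[line] == addr) {
            hits++;                       /* fast path: data already cached */
        } else {
            misses++;                     /* slow path: fetch from main RAM */
            tags[line] = addr;            /* fill (or evict and refill) the line */
        }
    }
    printf("hits: %d, misses: %d\n", hits, misses);
    return 0;
}
```

Note how the access to 0x200 evicts 0x100 from its line, so the final read of 0x100 misses again - exactly the kind of stall a bigger cache helps avoid.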

CPU complexity evolved over the years and die sizes shrank more and more. It was obvious that cache RAM had to grow along with the cores in order to feed the ever faster CPUs. The next important architecture change occurred when the cache memory was placed right inside the microprocessor itself (the on-die cache). This improvement removed the slowdown caused by the electronic circuits between chips on the motherboard. A few inches really make a difference: electrical signals move through PC hardware at a sizable fraction of the speed of light, and at modern clock rates even that is no longer instantaneous.
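A quick back-of-the-envelope sketch shows why distance matters: assuming signals propagate at roughly half the speed of light along board traces (a common rule of thumb) and a 3 GHz clock, a signal covers only about 5 centimeters per clock cycle:

```c
#include <stdio.h>

/* Back-of-the-envelope: how far can a signal travel in one clock
   cycle? The propagation speed and clock rate are assumptions. */
int main(void)
{
    double c = 3.0e8;            /* speed of light in m/s */
    double signal = 0.5 * c;     /* rough signal speed on copper traces */
    double clock_hz = 3.0e9;     /* a 3 GHz CPU */
    double cycle_s = 1.0 / clock_hz;

    printf("Distance per cycle: %.1f cm\n", signal * cycle_s * 100.0);
    return 0;
}
```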

Further on, CPU engineers saw that even the physical size of an on-die cache can slow the CPU down, so it was decided that the cache memory should be given a cache memory of its own. Present-day CPUs have what is known as Level 1 and Level 2 cache. The larger and slower of the two is L2, and its size is the one usually quoted in specifications for cache capacity. Server-specific CPUs such as the Intel Itanium, as well as a couple of upcoming designs, even include an L3 cache. For a comparison between old and present-day CPUs, just consider that a high-end Intel Core 2 Duo has 4 Megabytes of L2 cache RAM built into the chip. That's four times the entire memory address space of the original 8088 chip used in the first PC and the clones that followed.
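The whole point of the hierarchy is that each level trades capacity for speed. A tiny sketch with hypothetical cycle counts (the real figures vary from one design to another):

```c
#include <stdio.h>

/* Illustrative access costs for a three-level memory hierarchy.
   The cycle counts are hypothetical, not real chip specifications. */
int access_cost(int in_l1, int in_l2)
{
    if (in_l1) return 3;    /* L1 hit: smallest, closest, fastest */
    if (in_l2) return 14;   /* L2 hit: larger and slower, still on-die */
    return 200;             /* miss everywhere: go out to main memory */
}

int main(void)
{
    printf("L1 hit: %d cycles\n", access_cost(1, 0));
    printf("L2 hit: %d cycles\n", access_cost(0, 1));
    printf("RAM:    %d cycles\n", access_cost(0, 0));
    return 0;
}
```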

Keep in mind that neither the sheer size of cache RAM nor the number of levels is a reliable indication of cache performance. The different microprocessor architectures of Intel and AMD make it especially hard to compare their cache specifications. Just as Intel's super-high clock rates don't translate into proportionately more performance, doubling the cache size certainly doesn't double the performance of a microprocessor.

The second part of this article will focus on the CPU's "monstrous" enhancements - superscalar execution, hyper-threading and dual-core architectures - as well as reveal other interesting facts about the workings of the CPU, so don't forget to visit our site tomorrow.