By Terry Stratoudakis, Executive Director, Wall Street FPGA LLC
FPGAs are customisable chips that are fast, deterministic, parallel, and just plain awesome. And in case you were wondering, they have nothing to do with golf, the PGA, or Pringles potato chips! FPGA is a four-letter F-word that you can say on television but won’t get a response unless you’re speaking to electrical engineers and, these days, high frequency traders. FPGAs conjure terms such as reconfigurable computing, reconfigurable hardware, hardware acceleration, floor planning, black magic, synthesis, schematic capture, multiple personalities, RTL, VHDL, Verilog, high level synthesis, place and route, and mapping.
Let’s try that again. Introduced in the 1980s, the Field Programmable Gate Array (FPGA) is a customisable integrated circuit (IC). ICs are everywhere, from pacemakers to automobiles to mobile phones to audio greeting cards. Some ICs are very application specific, such as those in the orbit stabilisers of a spacecraft, while others are general purpose, such as the central processing unit (CPU) of your computer. Another well-known IC is the graphics processing unit (GPU). The GPU is a special kind of chip that can process … you guessed it, graphics data.
FPGAs are the basis of a $4 billion industry that permeates many applications. College courses, books, and entire conferences focus on their usage and applications. And while people from all walks of life come into contact with devices that use FPGAs, a deep understanding of them can take many years to develop. Below we review what’s inside an FPGA and the benefits FPGAs offer.
Inside an FPGA
ICs contain various logic blocks connected and arranged in a certain way that is fixed during the manufacturing phase. An example of such an IC is the microprocessor that is the CPU of your computer. CPUs have functions that can be run via software commands called instructions. By contrast, FPGAs allow you to change the actual functions (or personality) of an IC in the field, after it has been manufactured, hence the Field Programmable aspect of FPGAs.
At the lower levels, just above the transistors, the hardware of an IC consists of logic gates such as OR, AND, and NOT, to name a few. Among the basic building blocks of an FPGA are Configurable Logic Blocks (CLBs), Input/Output Blocks (IOBs), and the programmable interconnect that wires them together; see Figure 1. A CLB contains several logic resources and can be configured to implement one or more logic gates, or even a multi-gate circuit with multiple inputs and outputs. The FPGA consists of an array of these CLBs, hence the Gate Array aspect of FPGAs. When the FPGA is programmed, each CLB is configured to perform its assigned function and the interconnect makes the appropriate connections from one CLB to the next, forming a user-specified electrical circuit.
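To make this concrete, the configurable element inside a CLB is essentially a lookup table (LUT): a small truth table whose contents determine which logic function the block performs. The Python sketch below is an illustrative model of that idea, not any vendor’s actual architecture; “configuring the hardware” amounts to nothing more than filling in the table.

```python
# Illustrative model of a 2-input LUT: the configuration bits ARE the function.
# Reprogramming the FPGA corresponds to loading a different truth table into
# the same physical block.

def make_lut(truth_table):
    """Return a 2-input 'gate' whose behaviour is defined by 4 config bits."""
    def lut(a, b):
        # The inputs form an index into the configuration bits.
        return truth_table[(a << 1) | b]
    return lut

and_gate = make_lut([0, 0, 0, 1])  # configuration bits for AND
or_gate  = make_lut([0, 1, 1, 1])  # same "hardware", new bits: now it is OR

print(and_gate(1, 1), or_gate(0, 1))  # -> 1 1
```

The same trick scales: real LUTs typically have four to six inputs, so one table can stand in for any logic function of that many variables.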
The pins of the FPGA are the data entry and exit points; the IOBs connect the CLB array to these pins. Data can be passed into and out of the FPGA in parallel.
At some point, if we wish to change the electrical circuit, we can send a new configuration to the FPGA and, voilà, a new IC has been formed. The CLBs change function, the IOBs make new connections, and the pins are configured to accept different forms of data.
Today’s FPGAs have CLBs with more complex logic blocks and more inputs/outputs per CLB. CLBs can be configured to be memory, triggers, clocks, fixed-point math, floating-point math, digital signal processors, etc. Since the CLBs are arranged in a grid, calculations run through the FPGA circuit with true parallelism.
The first commercially viable FPGA had 64 CLBs. Today there are many kinds of FPGAs with specialised CLBs that can be configured to form a circuit with the equivalent of millions of logic gates. The latest FPGAs have over 1,000 pins. Their pins can accept or generate data as fast as 28 gigabits per second. FPGAs benefit from advances in semiconductor technology, keep pace with Moore’s Law, and currently utilise 28 nanometer technology.
Benefits of an FPGA
At the most basic level, an FPGA is a means of computing or performing a task. Being a semiconductor, many try to understand FPGAs through their knowledge of CPUs, GPUs, and other processors. Simply put, coming from the CPU world, the best way to understand FPGAs is to take your world and turn it upside-down. FPGAs have no notion of instructions, interrupts, or threads, and have no operating system. One way to compare processors is by the number of cores, an indication of the number of tasks that can be processed in parallel; in an FPGA, you, the designer, define the cores. Most software developers think in terms of sequential programs, where one task runs after the other. On an FPGA, if you do not think parallel, the game is lost!
When designing a system, speed is the goal, but we need to consider the context of that speed. What are the conditions? Does the system run fast only under low load or specific conditions? Say you are travelling from downtown to uptown Manhattan. An FPGA is akin to taking the subway, and software on a CPU is like driving. With no traffic, which just about never happens in the city that never sleeps, the car will get you there in minutes, but during rush hour it could take forever. The subway, like it or not, will get you there much faster regardless of whether it is rush hour or not.
Software and operating systems are notorious for their jitter. An application may be benchmarked to take, say, 15 microseconds 99.9% of the time, but what about the other 0.1%? FPGAs compute “on the clock cycle” as an electrical circuit, so there is a level of determinism and speed that even real-time operating systems (RTOS) cannot achieve. FPGAs run in a very predictable way; an FPGA circuit designed to complete its task in, say, 2,100 nanoseconds will do so every time.
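You can observe this jitter for yourself. The sketch below is a minimal illustration, not a rigorous benchmark: it times the same fixed workload many times on a CPU under an operating system, where scheduling, caches, and interrupts make each run take a slightly different amount of wall-clock time. A clocked FPGA circuit has no analogue of this spread.

```python
import time

# Time an identical, fixed workload repeatedly and look at the spread.
# On a CPU under an OS the samples will vary run to run; an FPGA circuit
# completes in a fixed number of clock cycles every time.

def workload():
    s = 0
    for i in range(10_000):
        s += i * i
    return s

samples = []
for _ in range(200):
    t0 = time.perf_counter()
    workload()
    samples.append(time.perf_counter() - t0)

samples.sort()
print(f"median: {samples[100] * 1e6:.1f} us, worst: {samples[-1] * 1e6:.1f} us")
```

The gap between the median and the worst case is exactly the tail-latency problem that makes software-only systems hard to bound.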
The spatially arranged CLBs can be considered “islands of logic” which, when wired together, form circuits that can interact with each other or be completely independent of each other. When more logic is added to a partially utilised FPGA, the new logic will not affect the existing logic or the way it operates. More data can come into the FPGA via unused pins, and processed data or results can come out of other unused pins.
The number of cores can be thought of as the number of parallel tasks that an IC can run at the same time. The notion of cores on a CPU can be convenient, but it can also be restricting. What if we have 1,000 parallel operations that are all relatively small? I think of cores on a CPU as cubicles. They are great if you want to have worker bees, but what if your work environment needs to be adjusted so that one day you have a set of conference rooms with some cubicles, and the next a large hall for a speaker?
We could configure an FPGA where the cores are sized to these operations. In another instance, an FPGA could be configured to have 500 of the aforementioned cores while the rest of the FPGA runs a different kind of operation, where we can fit 100 instances, resulting in 100 cores. The resulting 600-core FPGA would have two kinds of custom cores.
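The arithmetic behind that 500-plus-100 split can be sketched as a simple resource budget. The logic-cell counts below are hypothetical, chosen only to illustrate the idea; in practice, vendor place-and-route tools determine how many instances of a core actually fit on a given device.

```python
# Hedged sketch of sizing custom cores to an FPGA's logic budget.
# All resource numbers are hypothetical, for illustration only.

def cores_that_fit(cells_per_core, cell_budget):
    """How many instances of a core fit in a given slice of the fabric."""
    return cell_budget // cells_per_core

# Split a hypothetical 100,000-cell fabric in half between two core designs.
small_cores = cores_that_fit(100, 50_000)  # 500 instances of a small core
large_cores = cores_that_fit(500, 50_000)  # 100 instances of a larger core

print(small_cores, large_cores, small_cores + large_cores)  # -> 500 100 600
```

The point is that the designer, not the chip vendor, decides how many cores exist and how big each one is.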
Besides comparing cores, some compare processors based on their maximum clock speed, measured in gigahertz or megahertz, when in reality many other factors come into play. Ask any car buff: maximum speed is just one way of comparing two automobiles.
FPGAs are clocked in the hundreds of megahertz (MHz), whereas CPUs are clocked in gigahertz (GHz). Even so, various algorithms can run up to 1,000 times faster on an FPGA. The main reason is that CPUs spend a lot of time fetching and decoding instructions, whereas FPGAs focus solely on the problem. Read more: http://research.microsoft.com/apps/pubs/default.aspx?id=70636
Systems are made of hardware and software, yet typically only the software is optimised while it runs on commodity hardware. If you think about it, why limit optimisations to the software when we can optimise the hardware too? FPGAs allow the designer to customise the hardware.
I like to think of FPGAs as the Dr. Jekyll and Mr. Hyde of ICs. If ICs went to a therapist, the FPGA would be diagnosed with multiple personalities. The FPGA can be a CPU in the morning, a GPU during lunch, a DSP in the afternoon, or a combination of all three, plus custom logic, without going back to fabrication. FPGAs are multilingual and can talk PCI, PCIe, TCP, IP, UDP, I2C, and other custom protocols or interfaces.
Application Specific Integrated Circuits (ASICs) are that really smart person who can solve triple integrals no matter what time of day it is, and nothing else. ASICs are the traditional way of making dedicated, custom hardware; they are the specialists of ICs. ASICs can take one to two years to develop, and if errors are found, lots of time can be lost in remaking the ASIC.
CPUs are that well-rounded person who knows a little about everything and could be considered the jack of all trades but master of none. CPUs are the most flexible and general purpose of all hardware. Since CPUs are generic enough to handle word processing, spreadsheets, web surfing, and other tasks, there is a lot of overhead involved.
FPGAs are hardware that can be reconfigured, sometimes on the fly, to perform multiple functions in a specialised way that only dedicated hardware can.
We go back to our subway-versus-automobile analogy of that poor soul who wishes to go from downtown to uptown Manhattan during rush hour and has to be there in less than 20 minutes. Not only will mass transit do the job more reliably, but it also favours the environment and, in turn, our pockets by consuming less energy.
Hardware dedicated to a specific function results in a more energy-efficient system. In other words, we are configuring both hardware and software tuned to our needs. Computing efficiency is measured in terms of calculations per watt, and energy savings can be 10-30 times when using an FPGA, making it green computing.
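As a back-of-the-envelope illustration of calculations per watt, consider the sketch below. All the numbers are hypothetical, chosen only to show the arithmetic; real figures depend on the device, the algorithm, and how well each maps to the hardware.

```python
# Hypothetical calculations-per-watt comparison. The throughput and power
# figures are illustrative placeholders, not measured data.

def calcs_per_watt(calcs_per_second, watts):
    return calcs_per_second / watts

cpu_efficiency  = calcs_per_watt(10e9, 100.0)  # e.g. 10 billion calcs/s at 100 W
fpga_efficiency = calcs_per_watt(10e9, 10.0)   # same throughput at 10 W

print(f"FPGA advantage: {fpga_efficiency / cpu_efficiency:.0f}x")  # -> 10x
```

With illustrative numbers like these, matching the CPU’s throughput at a tenth of the power lands squarely in the 10-30x savings range cited above.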
Further reading and videos