Here you will find some technical details and examples for our FPGA-based solutions.
An insight into how an FPGA is programmed
Suppose you are tasked with creating a DSP application. Most DSP code first sees the light of day in C. As horrendous as it seems, once the algorithms have been verified at this level of abstraction, a large proportion of the code gets rewritten and "tweaked" in assembly in a desperate attempt to achieve the required performance. This manual translation is, of course, both painful and time-consuming. The bigger problem is that the DSP code will eventually run on a general-purpose microprocessor or a dedicated DSP device. Both of these realizations are inherently slow for this kind of work, because they are based on a classical von Neumann architecture, which requires them to:
- Fetch an instruction
- Fetch a piece of data
- Fetch another piece of data
- Perform an operation
- Store the result
- Do the same thing all over again
Now, consider a typical DSP-like function along these lines:
y = (a * b) + (c * d) + (e * f) + (g * h);
If you run this on a DSP chip, it takes a substantial number of operations and clock cycles to execute, because every multiplication and addition has to be fetched, decoded, and executed in turn. Now consider an equivalent dedicated hardware implementation in an FPGA, in which all four multiplications are performed in parallel without the need to fetch and decode instructions. For workloads that parallelize this well, the speed improvement can reach orders of magnitude.
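The difference can be sketched in plain C. This is only a software model, not FPGA code: the function names, `N_PAIRS`, and the step and stage counters are hypothetical and introduced here for illustration, and the counts are an idealized tally rather than measured cycle figures for any real device. The first function walks through the products one at a time; the second mirrors the hardware layout, with one stage of multipliers followed by a small adder tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_PAIRS 4 /* the four products a*b, c*d, e*f, g*h */

/* Sequential model: one multiply-accumulate per loop iteration,
   the way a single arithmetic unit walks through the terms. */
int64_t sum_of_products_sequential(const int16_t a[N_PAIRS],
                                   const int16_t b[N_PAIRS],
                                   unsigned *steps)
{
    assert(steps != NULL);
    int64_t acc = 0;
    *steps = 0;
    for (int i = 0; i < N_PAIRS; i++) {
        acc += (int64_t)a[i] * b[i];
        (*steps)++; /* one arithmetic step per product */
    }
    return acc;
}

/* Parallel model: written in stages to mirror dedicated hardware.
   The loop in stage 1 stands in for four multipliers working side
   by side; C itself still executes it sequentially. */
int64_t sum_of_products_parallel(const int16_t a[N_PAIRS],
                                 const int16_t b[N_PAIRS],
                                 unsigned *stages)
{
    assert(stages != NULL);
    int64_t p[N_PAIRS];

    /* Stage 1: all four products (concurrent in hardware). */
    for (int i = 0; i < N_PAIRS; i++) {
        p[i] = (int64_t)a[i] * b[i];
    }

    /* Stage 2: two additions (concurrent in hardware). */
    int64_t s0 = p[0] + p[1];
    int64_t s1 = p[2] + p[3];

    /* Stage 3: final addition. */
    int64_t result = s0 + s1;

    *stages = 3; /* 1 multiply stage + 2 adder-tree levels */
    return result;
}
```

With a = {1, 3, 5, 7} and b = {2, 4, 6, 8}, both functions return 100. The sequential model reports 4 arithmetic steps, each of which also implies instruction and data fetches on a processor, while the parallel model reports 3 stages regardless of how the operands are fetched. The adder tree grows with the logarithm of the number of products, so the gap widens as the expression gets longer.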
What can an FPGA do for you?
- Calculate a non-trivial mathematical function (sin, cos, atan) on 10 numbers simultaneously - in one cycle
- Calculate monthly mortgage repayment with a number of input parameters - in one cycle
- Calculate accrued interest and other popular financial functions for many inputs - simultaneously
- Calculate common statistical functions for many inputs - simultaneously
- Calculate a stock exchange index (FTSE 100, Dow) in one cycle
- Calculate a complicated formula for 10 independent numbers in only a few cycles
- Perform simultaneous complex vector and matrix operations in under 10 system clock cycles
- Generate many random numbers simultaneously, bringing results directly into the host's RAM (useful for fast Monte Carlo implementations, cryptography, etc.)
- Run many Monte Carlo instances in parallel, in a few cycles
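To give a feel for the random-number item above, here is a minimal C sketch of a bank of independent generators that all advance on the same "clock". It assumes linear-feedback shift registers (LFSRs), a generator that is cheap to build in logic; the article does not say which generator is actually used, so treat this choice, and the names `lfsr16_step` and `lfsr_bank_clock`, as illustrative assumptions. The taps (16, 14, 13, 11) are a standard maximal-length choice for a 16-bit register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Advance one 16-bit Fibonacci LFSR by a single step.
   Taps 16, 14, 13, 11 give the maximal period of 65535
   for any nonzero starting state. */
uint16_t lfsr16_step(uint16_t state)
{
    uint16_t bit = (uint16_t)(((state >> 0) ^ (state >> 2) ^
                               (state >> 3) ^ (state >> 5)) & 1u);
    return (uint16_t)((state >> 1) | (bit << 15));
}

/* Model one clock tick for a bank of generators: every lane
   advances once. In hardware the lanes would update at the same
   time; this loop only imitates that behaviour sequentially. */
void lfsr_bank_clock(uint16_t lanes[], size_t n)
{
    assert(lanes != NULL);
    for (size_t i = 0; i < n; i++) {
        assert(lanes[i] != 0); /* the all-zero state never changes */
        lanes[i] = lfsr16_step(lanes[i]);
    }
}
```

Each lane needs its own nonzero seed so the lanes do not produce identical streams. A plain LFSR is predictable from a short run of its output, so it may suit Monte Carlo style sampling but is not adequate on its own for cryptographic use.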
What can an FPGA not do for you?
- Access databases or consume Web services (a regular CPU does this faster)
- Easily adapt to changing requirements and new algorithms - once again, you're usually much better off with a conventional CPU