Development of an FPGA based DSP System
Low cost FPGAs have now reached the point where £15 will buy you a device with over 10,000 registers, 64 Kbytes of RAM and 16 hardware multipliers, and the difficulty is now knowing how to utilise all of this resource.
Over the last decade BCD Audio have developed VHDL hardware modules and support software to allow them to produce real audio and video products that use the Xilinx Spartan devices. I would like to share some of these experiences with you.
Comparison of DSP Processors with FPGA solutions.
DSP processors have evolved to become good at 24-bit processing, with fast hardware multiply-accumulate units, internal memory and sometimes floating point support. They also have dedicated integrated development environment (IDE) software, which always supports assembler and sometimes C. The devices have also evolved to become more like microcontrollers, with on-board SPI, I2C, timers and serial ports.
Where they fall down, however, is in the ability to connect to many audio and video components at the same time, and it is often difficult to establish a good communication path to the control processor system. Processors are also only capable of doing one thing at a time, although interrupts and multitasking are good at hiding this fact. Often, to connect the DSP processor to the audio and video components, an FPGA would be required, which gave us the idea of using the FPGA as the complete system.
FPGA systems are difficult to get started with, as you are faced with essentially a clean sheet of paper, with unlimited possibilities and no clear idea of the best way to implement the requirement.
BCD has spent the last three years developing useful modules that allow us to produce these systems.
FPGA systems are able to run many modules in parallel, and these may run at many different clock rates. Although each processing module might not run as fast as the DSP solution, they can all run simultaneously.
FPGA systems can implement hardware that is just not possible with DSPs; direct conversion of SDI video, AES-3 inputs and outputs, MADI and Ethernet are all possible.
FPGA systems can be hard to debug. Hardware logic analysers are often required, and software may be buried and inaccessible. BCD have put much thought into debugging, and have established a real time debugger for the embedded DSP modules we use.
FPGA systems are slow to develop, as the compile time for a medium to large scale design can be measured in minutes. BCD has established a method of real-time code development that doesn't involve recompiling the whole design.
FPGA systems can easily include multiple processors, so that interrupts are just not required at all, giving the system higher reliability. Dual and multi-port memory and hardware FIFO systems eliminate the need for interrupt systems.
FPGA systems can be more easily adapted to different system requirements, and the hardware can be reprogrammed in the field if necessary.
FPGA modules can, with care, be reused in future designs, as most of them are written in VHDL.
Short list of useful modules used.
Eight bit control processor
This is based upon the Xilinx Picoblaze, but expanded to 4K code space and 64K IO space. All IO can also support 16-bit variables. The embedded processor is developed using a specially written IDE, which allows real time debugging and software download. The hardware interface modules listed below are supported, and any number can be implemented in a given design. Up to 16 separate processors can be supported and debugged inside one FPGA; there is no limit to the number of processors that could be embedded in the FPGA without the debug support.
The serial module implements RS232, RS422 and RS485 protocols at any rate, and uses hardware-based FIFOs so that interrupts are not required. Any number of baud rates can be used, as each can be generated by a simple hardware timer running from a common system clock.
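As an illustration of the baud rate point, the divider for such a hardware timer can be worked out from the system clock. This is a minimal sketch, assuming a hypothetical 50 MHz system clock and a UART that samples each bit 16 times; neither figure comes from the BCD design, and the function name is invented for this example.

```python
def baud_divider(clock_hz, baud, oversample=16):
    """Return (divider, actual_baud, error_percent) for an integer clock divider.

    clock_hz and oversample are assumptions for illustration only.
    """
    tick_hz = baud * oversample                 # rate the timer must tick at
    divider = max(1, round(clock_hz / tick_hz)) # nearest integer divider
    actual = clock_hz / (divider * oversample)  # baud rate actually produced
    error = 100.0 * (actual - baud) / baud
    return divider, actual, error


for baud in (9600, 38400, 115200):
    div, actual, err = baud_divider(50_000_000, baud)
    print(f"{baud:>6} baud: divide by {div:>3}, actual {actual:9.1f}, error {err:+.2f}%")
```

Because each serial port can have its own divider running from the same clock, different ports can use unrelated baud rates without sharing a timer.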
SPI modules can be included, with 8 and 16 bit support. The 16 bit mode is ideal for communicating with Gennum SPI chips. Multiple SPI modules and internal address decoding give no limit to the complexity of systems. The module uses internal baud rate timing.
The hardware I2C module allows communication with external I2C type peripherals, at slow (100 kHz), medium (400 kHz) and fast modes. The module uses internal baud rate timing.
Any number of timers can be included, and arranged to achieve whatever is required. The timers can be designed latency free, as they do not need to run from one common clock, and really fast or really slow timers are easily achieved.
Internal Communication and Memory
As the embedded processor now has 64K of internal IO space, internal block memory is easy to implement, and often this can be dual-port with another internal processor or a direct memory-mapped module. As all IO is memory mapped, the specially written IDE can easily inspect and update the IO hardware and memory.
18/32 bit DSP processor.
This module allows us to implement complex DSP based audio systems. It is based on the 8bit Xilinx Picoblaze, but expanded to 18bit operation, with 4K code space and 64K IO space.
The data space is usable with 18-bit and 36-bit variables. 18/36 bit was chosen because the FPGA block memory is 18/36 bits wide.
18-bit variables can be signed or unsigned, and either integer or fixed point with a 2-bit integer part and a 16-bit fractional part. This notation allows filter coefficients and gain control coefficients to reach +12dB, still with 0.0002 dB resolution.
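The +12dB and resolution figures can be checked with a little arithmetic. The sketch below assumes the coefficient is held as an unsigned 18-bit value with 2 integer bits and 16 fractional bits (a signed value of the same width would top out near +6dB); the function names are invented for this example.

```python
import math

FRAC_BITS = 16
SCALE = 1 << FRAC_BITS        # code for a gain of exactly 1.0
MAX_CODE = (1 << 18) - 1      # largest unsigned 18-bit code


def gain_db_to_code(db):
    """Convert a gain in dB to the nearest 2.16 code, clamped to 18 bits."""
    linear = 10 ** (db / 20)
    return min(MAX_CODE, max(0, round(linear * SCALE)))


def code_to_gain_db(code):
    """Convert a 2.16 code back to a gain in dB."""
    if code == 0:
        return float("-inf")
    return 20 * math.log10(code / SCALE)


max_db = code_to_gain_db(MAX_CODE)                                # top of the range
step_db = code_to_gain_db(SCALE + 1) - code_to_gain_db(SCALE)     # one LSB at unity
print(f"maximum gain {max_db:.3f} dB, step at unity gain {step_db:.6f} dB")
```

The largest code is just under 4.0, which is a little over +12dB. One step at unity gain works out at a little over 0.0001 dB, the same order as the resolution quoted above; the step in dB grows as the gain is reduced.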
36 bit variables and data are useful in 24 bit digital audio systems as a gain structure with 4bit overload, 24bit main section and 8bit noise floor can be established. For most purposes this avoids the need for floating point arithmetic.
Extensions to the processor include a fractional 18 * 36 multiplier and multiply accumulator, with automatic overload limiting and support for signed and unsigned variables.
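The 36-bit gain structure and the limiting multiplier can be modelled in a few lines. This is a sketch under stated assumptions, not the BCD hardware: the 36-bit word is taken to be signed two's complement, the 24-bit sample sits 8 bits up from the bottom, the coefficient is an unsigned 2.16 value, and all names are invented for this example.

```python
FRAC_BITS = 16                              # fractional bits in the coefficient
FOOT_BITS = 8                               # noise floor bits below the sample
MAX36, MIN36 = (1 << 35) - 1, -(1 << 35)    # signed 36-bit limits
MAX24, MIN24 = (1 << 23) - 1, -(1 << 23)    # signed 24-bit limits


def saturate(value, lo, hi):
    """Clamp instead of wrapping round, as overload limiting would."""
    return max(lo, min(hi, value))


def to_internal(sample24):
    """Place a 24-bit sample in the 36-bit word: 4 bits above, 8 bits below."""
    return sample24 << FOOT_BITS


def frac_multiply(value36, coeff_2_16):
    """Fractional multiply of a 36-bit value by a 2.16 coefficient."""
    product = (value36 * coeff_2_16) >> FRAC_BITS
    return saturate(product, MIN36, MAX36)


def to_output(value36):
    """Return to 24 bits, limiting anything still in the overload bits."""
    return saturate(value36 >> FOOT_BITS, MIN24, MAX24)


full_scale = to_internal(MAX24)
boosted = frac_multiply(full_scale, 0x20000)     # gain of 2.0
restored = frac_multiply(boosted, 0x08000)       # gain of 0.5
print(to_output(boosted) == MAX24, to_output(restored) == MAX24)
```

In this model a full-scale signal boosted by 2.0 survives in the overload bits and is recovered intact by a later gain of 0.5, while sending the boosted value straight to a 24-bit output is limited rather than wrapped.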
All hardware is IO mapped into a 64K data space, which can be debugged using the IDE.
A simple dual-port memory allows simple control from further internal control processors.
Each DSP processor is arranged to restart its program at the leading wordclock, so that it is almost impossible to crash.
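Restarting on every wordclock means the whole program must finish within one sample period, so the instruction budget is worth estimating. The sketch below assumes a hypothetical 100 MHz processor clock and two clock cycles per instruction, which is the figure for the standard Picoblaze and is only assumed to carry over to the 18-bit derivative.

```python
def instructions_per_sample(clock_hz, sample_rate_hz, clocks_per_instruction=2):
    """Whole instructions that fit between two wordclock restarts."""
    clocks_per_sample = clock_hz // sample_rate_hz
    return clocks_per_sample // clocks_per_instruction


for rate in (48_000, 96_000, 192_000):
    budget = instructions_per_sample(100_000_000, rate)
    print(f"{rate} Hz: about {budget} instructions per sample")
```

If a task does not fit in the budget, the parallel nature of the FPGA allows the work to be split across further processors rather than raising the clock.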
I2S and I8S interface modules.
These are memory mapped IO to the DSP processor, and automatically send and receive 16 or 24bit data to the connected hardware. Automatic signal scaling and overload protection are possible.
Any number of modules can be used within a system.
Versions exist that can communicate 8 stereo signals to Video hardware, 8 programmable input or output lines, and new versions can be written to support other devices.
The AES-3 transmitter module is memory mapped as IO to the DSP processor, and directly synthesises 4 AES-3 transmitters. Channel status and user bit control occur from a second memory mapped area, which normally connects internally to the control processor in the FPGA. Multi rate and multichannel versions are possible.
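AES-3 data is sent using biphase-mark coding, which is what an AES-3 transmitter has to synthesise for each bit. The sketch below shows the coding rule only; it leaves out the preambles, parity and subframe layout, and the function name is invented for this example.

```python
def biphase_mark_encode(bits, level=0):
    """Encode data bits as line levels, two half-cells per bit."""
    out = []
    for bit in bits:
        level ^= 1          # the line always changes at the start of a bit cell
        out.append(level)
        if bit:
            level ^= 1      # a 1 adds a second change in the middle of the cell
        out.append(level)
    return out


print(biphase_mark_encode([1, 0, 1, 1]))
```

Because the information is carried by transitions rather than levels, the signal has no DC component and is insensitive to polarity inversion.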
The AES-3 receiver module will be completed in 2012. It uses oversampling techniques to directly decode AES-3 signals with just RS422 receiver hardware. The audio signals are left in memory mapped IO for easy connection to the internal DSP processor. Channel status and user bits are transferred to a second memory mapped area, typically for use with an internal control processor.
Multi rate and multichannel versions are possible.
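One way oversampling can recover biphase-mark data is to measure the time between transitions: a bit cell with no mid-cell change is a 0, and two short intervals make a 1. The sketch below illustrates that idea only and is not the BCD receiver. It assumes a clean signal that starts on a bit cell boundary, ignores preambles, and uses invented names; `make_waveform` exists only to generate test input.

```python
def make_waveform(bits, samples_per_half_cell, level=0):
    """Build an oversampled biphase-mark waveform for testing the decoder."""
    wave = []
    for bit in bits:
        level ^= 1                                   # change at start of cell
        wave += [level] * samples_per_half_cell
        if bit:
            level ^= 1                               # mid-cell change for a 1
        wave += [level] * samples_per_half_cell
    return wave


def decode_oversampled(samples, samples_per_half_cell):
    """Recover data bits from the lengths of runs between transitions."""
    runs, count = [], 1
    for prev, cur in zip(samples, samples[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)

    threshold = 1.5 * samples_per_half_cell          # between short and long
    bits, i = [], 0
    while i < len(runs):
        if runs[i] >= threshold:
            bits.append(0)                           # one long run: a 0
            i += 1
        else:
            bits.append(1)                           # two short runs: a 1
            i += 2
    return bits


data = [1, 0, 0, 1, 1, 0, 1]
print(decode_oversampled(make_waveform(data, 4), 4) == data)
```

A practical receiver would also have to detect the preambles, which deliberately break the coding rule, and tolerate jitter in the run lengths.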
Audio sample rate conversion.
This module will be completed in 2012. Based upon a published design, the BCD version requires less than half of the FPGA resources, and extends the number of audio channels possible from 4 to 16. These channels may be split into two input frequency groups with the same internal hardware, and can support sample rate ratios from 8x up-conversion to 8x down-conversion.
As 8 channels may be converted at the same rate, the module is ideal for connection to Audio from Video streams, and Multichannel AES-3 signals.
Modules exist to pass video SDI, and embed and de-embed audio signals from the ANC control packets. Direct conversion SD SDI output modules have been used.
Products produced to date.