Application-Specific COM Express Modules using System-on-FPGAs

Kishan Jainandunsing
(May 2006)

Summary

Vertical embedded applications are all about application-specific functions and I/O. The COM Express standard, a PICMG open specification that has found considerable traction in the vertical embedded industry, offers OEMs a great deal of flexibility in functionality and I/O. It does not dictate a specific processor architecture and offers considerable freedom in I/O configurations. COM Express modules can therefore be implemented with application-specific processors and chipsets. Using an ASIC makes sense from a customization perspective, but volumes are typically not high enough to make this approach economical. Using a general purpose processor and companion chipset makes sense from a cost perspective, but such a combination may lack the required application-specific functions and I/O. Implementing these functions through external components is possible, but may drive costs too high. High-density FPGAs strike a balance between the flexibility of an ASIC and the affordability of a general purpose processor and chipset. In this article we illustrate how this can be accomplished, using the Virtex-4 FX FPGAs from Xilinx as an example.

Inside The Xilinx Virtex-4 FX

The Xilinx Virtex-4 FX FPGA integrates up to two PowerPC 405 (PPC405) cores, together with field-programmable logic, DSP slices and high-speed I/O blocks. This makes it a veritable system-on-FPGA (SoF), which yields the kind of balance that was alluded to in the Summary between an ASIC and standard off-the-shelf general purpose processor and chipset. Figure 1 illustrates the Virtex-4 FX SoF concept.

Figure 1. Xilinx Virtex-4 FX system-on-FPGA concept

PPC405 Core
The PPC405 core provides general purpose computing services. Virtex-4 FX devices with two PPC405 cores can implement either an SMP or a distributed software programming model. These cores can run at a maximum speed of 450MHz. What really makes these cores unique is the Auxiliary Processor Unit (APU) interface. This interface creates a direct connection between the PPC405 cores and coprocessors embedded in the programmable logic fabric, as shown in the block diagram in Figure 2.

Figure 2. Virtex-4 FX PPC405 core with APU interface to embedded coprocessor

CLBs and RAM
Configurable Logic Blocks (CLBs) form the programmable fabric of FPGAs. They provide combinatorial and synchronous logic, as well as distributed memory and shift register capability. Block RAM modules on the Virtex-4 FX provide up to 10Mbit of flexible, true dual-port RAM and can be cascaded to form larger memory blocks. The Virtex-4 FX devices can contain up to 200,000 logic cells, which include lookup tables (LUTs) and registers.

XtremeDSP Slices
The available XtremeDSP slices in the Virtex-4 FX devices can be configured with any combination of an 18x18 bit multiplier, a 48-bit accumulator for multiply-accumulate operations and registers to support pipelining between and inside XtremeDSP slices. Figure 3 shows a fully pipelined butterfly configuration with four XtremeDSP slices forming the pipeline sections. The Virtex-4 FX is available with up to 192 XtremeDSP slices, operating at 500MHz.

Figure 3. Pipeline of two XtremeDSP slices with internal pipeline support

Parameterized DSP IP cores, which combine XtremeDSP slices, combinatorial logic blocks and memory, are also available from Xilinx and third parties. Examples are FEC (Forward Error Correction) with interleaver/de-interleaver, Reed-Solomon codecs and Viterbi decoders.
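The multiply-accumulate arrangement described above can be illustrated with a small behavioral model. The following Python sketch is not Xilinx code and does not model pipelining or timing. It only mimics the arithmetic of a single slice, assuming two's-complement signed 18-bit operands and a 48-bit accumulator that wraps on overflow. The function names `to_signed` and `mac` are made up for this illustration.

```python
def to_signed(value: int, bits: int) -> int:
    """Wrap an integer into the two's-complement range of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def mac(samples, coeffs, operand_bits: int = 18, acc_bits: int = 48) -> int:
    """Multiply-accumulate over paired samples and coefficients.

    Assumption: operands are truncated to 18-bit signed values and the
    running sum wraps at 48 bits, mirroring the widths named in the text.
    """
    acc = 0
    for x, c in zip(samples, coeffs):
        product = to_signed(x, operand_bits) * to_signed(c, operand_bits)
        acc = to_signed(acc + product, acc_bits)
    return acc


if __name__ == "__main__":
    # A three-tap dot product, as one slice would compute it over three cycles.
    print(mac([1, 2, 3], [4, 5, 6]))
```

A FIR filter tap or one arm of a butterfly reduces to this kind of operation, which is why chaining several slices with pipeline registers between them maps well onto such algorithms.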

SelectIO Blocks
The Virtex-4 FX allows instantiation of soft I/O blocks for a wide range of low, medium and high-speed I/O. These include both single-ended and differential interfaces. Examples of available SelectIO blocks include PCI, PCI Express, SATA, USB 2.0, and memory interfaces for DDR and DDR-2 DRAM, SDRAM, QDR-II SRAM and RLDRAM-II memory devices.

RocketIO Blocks
Virtex-4 FX devices integrate up to 12 hard-wired, high-speed, full-duplex serial transceiver blocks with speeds between 622Mbps and 10Gbps and with integrated CRC (cyclic redundancy check) generation and checking. These blocks also integrate transmitter and receiver equalization.
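As an illustration of what CRC generation and checking involve, the sketch below computes a 32-bit CRC one bit at a time, much as a serial shift register would. It assumes the common IEEE 802.3 CRC-32 polynomial (reflected form 0xEDB88320). The article does not state which polynomial or options the RocketIO blocks use, so treat that choice, and the names `crc32_bitwise` and `frame_is_intact`, as assumptions made for this example.

```python
import zlib

# Assumption: IEEE 802.3 CRC-32 polynomial in reflected (LSB-first) form.
CRC32_POLY_REFLECTED = 0xEDB88320


def crc32_bitwise(data: bytes) -> int:
    """Compute CRC-32 bit by bit, processing the least significant bit first."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ CRC32_POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


def frame_is_intact(payload: bytes, received_crc: int) -> bool:
    """Receiver-side check: recompute the CRC and compare with the one sent."""
    return crc32_bitwise(payload) == received_crc


if __name__ == "__main__":
    payload = b"COM Express"
    crc = crc32_bitwise(payload)
    # Cross-check against the standard library implementation.
    assert crc == zlib.crc32(payload)
    print(frame_is_intact(payload, crc))
    print(frame_is_intact(b"COM Expresz", crc))
```

Doing this in hard-wired logic next to the transceiver means the fabric and the processor cores do not have to spend cycles on it.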

Ethernet MACs
Virtex-4 FX devices integrate up to 4 hard-wired tri-mode 10/100/1000 Base-X media access control (MAC) cores. Combined with the RocketIO multi-gigabit media independent interfaces, they form a complete on-chip solution for gigabit Ethernet I/O.

Implementing COM Express Modules With System-on-FPGAs

FPGAs continue to evolve rapidly in terms of functional density, and FPGA manufacturers offer different combinations of integrated functions: number of general purpose processor cores, DSP slices, hard-wired I/O blocks, number of CLBs and amounts of memory. Hence, it makes sense to modularize the SoF subsystem, and there is no better choice than the COM Express standard to do so.

Because the Virtex-4 FX FPGAs support the COM Express I/O interfaces, it is realistic to implement COM Express modules with them. Table 1 shows the interfaces of COM Express pin-out Types 3 and 5 side by side with the available SelectIO and RocketIO blocks on the Virtex-4 FX FPGAs. I/O blocks not readily available via SelectIO and RocketIO can be implemented using the CLBs in these FPGAs.

Port          Type 3 Pin-out   Type 5 Pin-out   Virtex-4 FX
PCI Express   2-32             2-32             SelectIO/RocketIO
SATA          2-4              2-4              SelectIO/RocketIO
LAN           1-3              1-3              SelectIO/RocketIO
PCI           1                0                SelectIO
USB           4-8              4-8              SelectIO
LPC           1                1                CLB
SMBus         1                1                SelectIO
I2C           1                1                SelectIO
GPI/GPO       4/4              4/4              CLB

Table 1. COM Express I/O implementation through Virtex-4 FX FPGA
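For readers who want to work with this mapping programmatically, for example in a design checklist script, Table 1 can be captured as a small data structure. The sketch below transcribes the table, using the SelectIO and RocketIO names from the text. The names `TABLE_1`, `ports_implemented_in` and `ports_for_pinout` are hypothetical helpers, not part of any Xilinx or PICMG tool.

```python
# Each entry: port -> (Type 3 count, Type 5 count, FPGA resources used).
# Counts are kept as strings because the table gives ranges such as "2-32".
TABLE_1 = {
    "PCI Express": ("2-32", "2-32", ("SelectIO", "RocketIO")),
    "SATA":        ("2-4",  "2-4",  ("SelectIO", "RocketIO")),
    "LAN":         ("1-3",  "1-3",  ("SelectIO", "RocketIO")),
    "PCI":         ("1",    "0",    ("SelectIO",)),
    "USB":         ("4-8",  "4-8",  ("SelectIO",)),
    "LPC":         ("1",    "1",    ("CLB",)),
    "SMBus":       ("1",    "1",    ("SelectIO",)),
    "I2C":         ("1",    "1",    ("SelectIO",)),
    "GPI/GPO":     ("4/4",  "4/4",  ("CLB",)),
}


def ports_implemented_in(resource: str) -> list:
    """List the ports whose implementation uses the given FPGA resource."""
    return [port for port, (_, _, res) in TABLE_1.items() if resource in res]


def ports_for_pinout(pinout_type: int) -> list:
    """List the ports present (count other than "0") for pin-out Type 3 or 5."""
    column = {3: 0, 5: 1}[pinout_type]
    return [port for port, row in TABLE_1.items() if row[column] != "0"]


if __name__ == "__main__":
    print(ports_implemented_in("CLB"))
    print(ports_for_pinout(5))
```

Listing the CLB-implemented ports this way gives a quick view of which interfaces consume programmable fabric rather than dedicated I/O blocks.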

The COM Express standard requires that system memory be integrated on the module itself; the spec defines no system memory interfaces other than the position of a memory socket. The HSTL and SSTL memory interfaces on the Virtex-4 FX allow for memory sockets or memory soldered directly on the board, for DDR and DDR-2 DRAM, SDRAM, QDR-II SRAM and RLDRAM-II types of memory.

Advanced graphics capabilities can be implemented by connecting a high-performance graphics chip via an x16 or x8 PCI Express link. This functionality can reside either on the COM Express module or on the carrier board. Given parts obsolescence issues, OEMs in general prefer the graphics to reside on the module itself. When a part becomes obsolete, they can then simply swap out the module instead of redesigning the carrier board.
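To give a feel for what an x16 or x8 link offers a graphics chip, the following back-of-the-envelope sketch computes raw per-direction bandwidth. It assumes first-generation PCI Express signaling, 2.5 GT/s per lane with 8b/10b encoding, which the article does not state explicitly. Protocol overhead (packet headers, flow control) is ignored, and the function name is made up for this example.

```python
# Assumptions: PCI Express 1.x signaling rate and 8b/10b line coding.
TRANSFERS_PER_SECOND_PER_LANE = 2_500_000_000  # 2.5 GT/s
DATA_BITS_PER_10_LINE_BITS = 8                 # 8b/10b encoding


def pcie_bandwidth_bytes_per_s(lanes: int) -> int:
    """Raw data bandwidth in one direction, before protocol overhead."""
    data_bits_per_lane = (
        TRANSFERS_PER_SECOND_PER_LANE * DATA_BITS_PER_10_LINE_BITS // 10
    )
    return lanes * data_bits_per_lane // 8


if __name__ == "__main__":
    for width in (8, 16):
        megabytes = pcie_bandwidth_bytes_per_s(width) // 1_000_000
        print(f"x{width}: {megabytes} MB/s per direction")
```

Under these assumptions each lane carries 250 MB/s in each direction, so the link width is the main knob for matching the graphics chip's bandwidth needs.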

Target Applications

SoF COM Express modules are applicable to various signal-processing intensive vertical embedded applications. These include medical and industrial imaging, remote sensing, packet and spectrum analyzers, and radar. The large degree of parallelism and pipelining that is inherent in the signal processing algorithms in these applications is a perfect match for the parallelism and pipelining that can be created inside an FPGA. Exploiting parallelism allows a given throughput to be achieved at a lower operating clock frequency, which in turn lowers power dissipation.
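The trade-off between parallelism and clock frequency can be made concrete with some simple arithmetic. The sketch below uses the figures quoted earlier for the Virtex-4 FX (up to 192 XtremeDSP slices at 500MHz) and assumes, as a simplification, that each slice completes one multiply-accumulate per clock cycle. The 9.6 GMAC/s workload is a hypothetical target chosen only for illustration.

```python
def peak_mac_rate(slices: int, clock_hz: int) -> int:
    """Peak multiply-accumulates per second, assuming one MAC per slice per cycle."""
    return slices * clock_hz


def required_clock_hz(target_macs_per_s: int, slices: int) -> float:
    """Clock frequency needed to hit a target MAC rate with parallel slices."""
    return target_macs_per_s / slices


if __name__ == "__main__":
    # Peak rate with the device figures quoted in the article.
    print(peak_mac_rate(192, 500_000_000))

    # Hypothetical workload: the same throughput at different degrees of parallelism.
    target = 9_600_000_000
    for slices in (24, 96, 192):
        mhz = required_clock_hz(target, slices) / 1e6
        print(f"{slices} slices -> {mhz:.0f} MHz")
```

The more slices work in parallel, the lower the clock each one needs to run at for the same throughput.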

Of special interest is the trend toward 4D (space and time) imaging in medical diagnostic ultrasound. The burgeoning market for these new-generation tools, and the signal processing complexity involved, make this an especially attractive market for SoF COM Express modules, particularly since computer-on-modules are already found in high volume in this application today.

The combination of high-speed interfaces, on-chip FIFOs and dual-port SRAM, and support for QDR-II SRAM and RLDRAM-II memory also makes such modules very desirable in networking applications, such as intrusion detectors, VPNs, routers, firewalls and virus walls, as well as in telecom applications, such as ATM, SONET/SDH and IP framers.

In applications that require the application software to run off an x86 host platform, the system can be designed with separate x86 and SoF COM Express modules. Alternatively, the COM Express module itself may combine an x86 and a SoF solution in a single module. In either case, communication between the two subsystems occurs via a PCI Express link, where the SoF is configured as a PCI Express endpoint and the x86 as the PCI Express root complex.

Conclusions

The flexibility in I/O specification of the COM Express standard makes perfect sense for a system-on-FPGA implementation, such as the Xilinx Virtex-4 FX. The high logic and functional density of today’s FPGAs, combined with their software and logic programmability, offers OEMs an excellent platform for the implementation of high-performance, vertical embedded systems.

Using COM Express modules, device variability in functionality and functional density can be isolated, and application-specific I/O requirements not satisfied by standard off-the-shelf parts can be captured. This gives OEMs the security that they will be able to modify their products to meet changing market requirements with minimal development cost.

As such, the COM Express standard is easily capable of fulfilling its objective of being a processor- and platform-independent specification. And by including SoF implementations, it proves to be the most flexible standard for modular computing components in the industry.

© 2006-2012 COM Express Source. All rights reserved.