DNBFC_S12_PCIe

(Bigger, Faster, Cheaper)
Spartan-6 FPGA Algorithm Acceleration System
For High Performance Computing (HPC)
12 Low-cost FPGAs with DDR3 memory
Hosted via 4-lane PCI Express (GEN1/GEN2)

  • PCI Express (4-lane) FPGA-based algorithm acceleration peripheral
    • 12 of the largest Xilinx Spartan-6 FPGAs: 6SLX150-2
      • 12, 128M x 16 (2Gb) DDR3-800 memories (1 per FPGA)
    • 1 Xilinx Spartan-6 FPGA: 6SLX150T-2
      • 2, 128M x 16 (2Gb) DDR3-800 memories
  • Fixed 4-lane PCIe interface and controller
    • PCIe GEN1/GEN2
    • Full mastering DMA
      • 2 transmit (host memory -> card)
      • 2 receive (card -> host memory)
  • Xilinx FPGA Spartan-6 LX150-2 -> 12 total
    • 184,464 flip-flops per FPGA
      • 92K flip-flops with 6-input LUT
    • 182, 18×18 multipliers + 48-bit accumulator per FPGA
    • 268, 18 Kbit block RAM (603 Kbytes) per FPGA
      • Fully dual-ported
      • Each block RAM configurable as:
        • 16Kx1, 8Kx2, 4Kx4, 2Kx8/9, 1Kx16/18 or 512 x 32/36
    • Options for LX150-1L (lower power) or LX150-3 (higher frequency)
      • Also LX150T-3,-4
    • Options for LX100, LX75, LX45, LX25 (lower cost)
  • FPGA to FPGA interconnect single-ended
    • Source synchronous FPGA -> FPGA frequency: 150MHz
      • 300 Mb/s per pin when using DDR
  • Main Bus (MB) connects all Spartan-6 FPGAs (8 signals)
    • 90MHz
  • 128M x 16 fixed external DDR3 memory dedicated to each FPGA (12 total)
  • 2 – 128M x 16 fixed external DDR3 memories dedicated to USER Dataflow Manager FPGA
    • DDR3-800 is stuffed. Vccint is set to ‘Extended Performance’ operating range:
      • With LX150-2, LX150T-2: 667 Mbps (333.5 MHz)
        • 10.67 Gb/s maximum data rate
      • With LX150-3, LX150T-3, LX150T-4: 800 Mbps (400 MHz)
        • 12.8 Gb/s maximum data rate
    • Full support for FPGA memory controller block (MCB)
      • Up to 8 open banks
      • Configurable multi-port interface to FPGA fabric
        • 32-, 64-, or 128-bit data bus
      • Easy implementation with Xilinx CORE® Generator™
  • Expansion via high speed, low-power GTP transceivers (LX150T)
    • 3.125 Gb/s per lane, each direction with -3 or -4 (TX and RX)
    • 2.7 Gb/s per lane, each direction with -2 (TX and RX)
    • 4 lanes (4 RX and 4 TX) for daisy chain left
    • 4 lanes (4 RX and 4 TX) for daisy chain right
    • Board to board data communication
      • >1 GB/s per connector TX
      • >1 GB/s per connector RX
      • Non-proprietary, off-the-shelf Samtec cable assembly
    • Off-board daughter cards
  • Three independent low-skew global clock networks distributed differentially and balanced
    • G0: programmable in 1 MHz increments (ICS84314 clock synthesizer)
      • 32 MHz to 350 MHz
    • G1: 100MHz PCIe reference
    • G2: Main Bus (MB) clock
  • Fast and Painless FPGA configuration via PCIe
    • On-board battery for AES bitstream encryption
    • Unique Device DNA identifier for design authentication
  • Full support for embedded logic analyzers via JTAG interface
    • ChipScope, and other third-party debug solutions
  • FPGA-controlled LEDs
    • Enough for emergency lighting in a small parking structure

Small Rackmount Servers the board will work with:

  • 1U
    • SuperMicro X8DTG-D
  • 2U
    • HP DL380 Gen8 2U
      The DNBFC_S12_PCIe requires a full-length, full-height (FL/FH) PCIe slot.
  • 4U
    • Any HP, IBM, or Dell server should work, provided it supports FL/FH PCIe cards.

Overview

Designed for High Performance Computing (HPC) applications, the DNBFC_S12_PCIe is an FPGA-based peripheral that allows algorithm developers to employ hardware-in-the-loop acceleration using cost-effective Xilinx Spartan-6 FPGAs. Data movement to/from the FPGA grid is accomplished via a fixed 4-lane, GEN1/GEN2 PCIe bridge. Each Spartan-6 FPGA has its own 128M x 16 DDR3 memory capable of clock speeds up to 400 MHz (800 Mb/s per data pin). Two additional 128M x 16 DDR3 memories are connected to the USER Dataflow Manager FPGA (LX150T) for bulk data storage.

Dedicated PCIe, 4-lane controller (GEN1 or GEN2)

We ship the DNBFC_S12_PCIe with a fixed, full function, 4-lane master/target PCIe controller. Drivers with ‘C’ source for several operating systems are included at no cost.
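
As a rough sizing aid, the sketch below estimates the upper bound on host transfer rate for the 4-lane link. It assumes the standard PCI Express line rates (2.5 GT/s per lane for GEN1, 5 GT/s for GEN2) and 8b/10b encoding, neither of which is stated on this page. The function name is illustrative, and real DMA throughput will be lower once packet overhead is included.

```python
# Assumption: standard PCIe per-lane line rates, not taken from this page.
LINE_RATE_GT_S = {"GEN1": 2.5, "GEN2": 5.0}
# Assumption: 8b/10b encoding, as used by PCIe GEN1 and GEN2.
ENCODING_EFFICIENCY = 8 / 10
LANES = 4  # the board's fixed 4-lane interface


def pcie_payload_gbytes_per_s(generation: str, lanes: int = LANES) -> float:
    """Upper bound on one-direction throughput in GB/s, before packet overhead."""
    gbits_per_s = LINE_RATE_GT_S[generation] * ENCODING_EFFICIENCY * lanes
    return gbits_per_s / 8  # bits to bytes


for gen in ("GEN1", "GEN2"):
    print(f"{gen} x{LANES}: {pcie_payload_gbytes_per_s(gen):.1f} GB/s per direction")
```

Because the link is full duplex, the same bound applies separately to the transmit and receive DMA directions.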

Spartan-6 FPGAs from Xilinx

The Xilinx Spartan-6 LX150 (and LX150T), a 45 nm FPGA, is the largest member of this cost-effective (read: CHEAP) family. The Spartan-6 FPGA family has an impressive price/performance ratio for hardware-in-the-loop accelerators, with device power consumption much lower than the higher performance FPGA families.

Features of Spartan-6 include the efficient, dual-register 6-input look-up table (LUT) logic, 18 Kb (2 x 9 Kb) block RAMs, second generation DSP48A1 slices (includes 18 x 18 multipliers), and DDR3 memory controllers. Enhanced IP security with AES and Device DNA protection is a new addition to this family and helps keep your proprietary IP secret.

We use the largest device from this family, the LX150, in the FF484 package. 100% of the FPGA resources are dedicated to your application. All FPGAs, excluding the PCIe controller, are configured via PCIe. The PCIe FPGA can be updated in the field.
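
Since all of the resources in the 12 LX150 FPGAs are available to the application, it can be useful to total them when sizing an algorithm. The sketch below multiplies out the per-FPGA figures quoted in the feature list above. The function and constant names are illustrative, and the USER Dataflow Manager FPGA (LX150T) is deliberately left out of the totals.

```python
# Per-FPGA figures as quoted in the feature list for the Spartan-6 LX150.
FPGA_COUNT = 12
FLIP_FLOPS_PER_FPGA = 184_464
MULTIPLIERS_PER_FPGA = 182
BLOCK_RAMS_PER_FPGA = 268
BLOCK_RAM_KBITS = 18  # size of one block RAM


def block_ram_kbytes(blocks: int = BLOCK_RAMS_PER_FPGA) -> float:
    """Block RAM capacity of one FPGA in Kbytes."""
    return blocks * BLOCK_RAM_KBITS / 8  # Kbits to Kbytes


def board_totals(fpgas: int = FPGA_COUNT) -> dict:
    """Aggregate resources across the grid of LX150 FPGAs."""
    return {
        "flip_flops": fpgas * FLIP_FLOPS_PER_FPGA,
        "multipliers": fpgas * MULTIPLIERS_PER_FPGA,
        "block_ram_kbytes": fpgas * block_ram_kbytes(),
    }


for name, value in board_totals().items():
    print(f"{name}: {value:,}")
```

The per-FPGA block RAM result reproduces the 603 Kbytes given in the feature list.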

Memory

Each of the 12 FPGAs has a dedicated 2Gb DDR3 memory. We test the FPGA-to-memory interface at the fastest frequency allowed by the speed grade of the FPGA stuffed. If the FPGAs are stuffed with a -3 speed grade or faster, we test this interface at 400 MHz. DDR3 is double data rate, so this works out to 800 Mb/s per pin. The configuration is 128M x 16, yielding a maximum data rate of 12.8 Gb/s per DDR3 memory. The power for the FPGAs is set to the ‘Extended Performance’ Vccint operating range. The -2 speed grade FPGAs are tested at 333 MHz (667 Mb/s).
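
The arithmetic behind these figures can be sketched as follows. The per-pin rates and the 16-bit data width come from the text above; the function name and the aggregate across all 12 memories are illustrative additions, and the results are peak rates that ignore refresh and bank-management overhead.

```python
DATA_WIDTH_BITS = 16  # each memory is organized as 128M x 16
MEMORIES = 12         # one dedicated memory per LX150 FPGA

# Tested per-pin data rates in Mb/s for each FPGA speed grade.
PER_PIN_MBPS = {"-2": 667, "-3": 800}


def ddr3_peak_gbps(per_pin_mbps: float, width_bits: int = DATA_WIDTH_BITS) -> float:
    """Peak data rate of one DDR3 memory in Gb/s."""
    return per_pin_mbps * width_bits / 1000  # Mb/s to Gb/s


for grade, rate in PER_PIN_MBPS.items():
    single = ddr3_peak_gbps(rate)
    print(f"{grade}: {single:.2f} Gb/s per memory, "
          f"{single * MEMORIES:.1f} Gb/s across {MEMORIES} memories")
```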

The Xilinx Spartan-6 family has integrated hard IP for controlling this dedicated DDR3. The fixed memory controller block (MCB) significantly eases the implementation of high performance dataflow. The MCB can have up to 6 ports, and each port can be configured to have a 32-bit, 64-bit, or 128-bit bus interface. Configurable arbitration is included and up to 8 memory banks can be open simultaneously.

The User FPGA Dataflow Manager has two of its own 128M x 16 DDR3 memories and these memories are useful for bulk memory storage.

As always, we provide examples and reference designs to help you with all of your memory interface issues. Please check with us to make sure that what we ship for no charge meets your requirements.

Expansion via High speed GTP serial transceivers

The DNBFC_S12_PCIe has expansion capabilities using the gigabit transceivers on the LX150T, labeled on the block diagram as the USER DATAFLOW MANAGER FPGA. Assuming an LX150T-3 or faster, a total of eight 3.125 Gb/s transceivers are available for data movement independent of the host computer. Two non-proprietary Samtec connectors, one on the top and one on the bottom, contain 4 GTP lanes each. Eight general purpose FPGA I/Os are also included. A standard cable can be used to chain two or more DNBFC_S12_PCIe boards together. Four GTP lanes running at 3.125 Gb/s can transmit and receive more than 2 GB/s of data in total (>1 GB/s each for independent TX and RX). Future functions include A/Ds, D/As, 10G Ethernet, general purpose I/Os, and others. Contact the factory for the latest list of expansion features. This feature enables data movement between boards wholly under FPGA algorithmic control, bypassing the host processor. Whole classes of FPGA accelerated algorithms get significant performance gains from this feature by eliminating the high-latency host processor.
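
One way to arrive at the ">1 GB/s" per-connector figure is sketched below. It assumes 8b/10b line encoding on the GTP links, which this page does not state; the line rates for each speed grade and the 4 lanes per connector are taken from the feature list, and the function name is illustrative.

```python
LANES_PER_CONNECTOR = 4

# GTP line rates in Gb/s per lane, by FPGA speed grade (from the feature list).
LINE_RATE_GBPS = {"-2": 2.7, "-3/-4": 3.125}

# Assumption: 8b/10b encoding, so 8 payload bits per 10 line bits.
ENCODING_EFFICIENCY = 8 / 10


def connector_gbytes_per_s(line_rate_gbps: float,
                           lanes: int = LANES_PER_CONNECTOR) -> float:
    """Payload bandwidth of one connector in one direction, in GB/s."""
    return line_rate_gbps * ENCODING_EFFICIENCY * lanes / 8  # bits to bytes


for grade, rate in LINE_RATE_GBPS.items():
    print(f"{grade}: {connector_gbytes_per_s(rate):.2f} GB/s per connector, each direction")
```

Under that assumption both speed grades clear 1 GB/s per direction, and TX and RX together exceed 2 GB/s per connector.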

Power Consumption

The PCI Express specification limits slot power to 25 watts. The DNBFC_S12_PCIe is capable of consuming power significantly beyond that. In addition to the PCIe fingers, a separate HDD connector adds a second path for power. This product is shipped with adequate heatsinking to consume 75 watts, but airflow is required in the chassis to dissipate the heat. Contact the factory if you require high reliability, no-fan heatsinks.

Status LEDs, Debug

Although no specific testing was performed, sophisticated statistical finite element models and back-of-the-envelope calculations show the status LEDs to be bright enough to provide emergency illumination for a small parking structure. These LEDs are user controllable from the FPGAs, so they can be used as visual feedback in addition to emergency lighting. A JTAG connector provides an interface to ChipScope and other third-party debug tools.

Related Documents

Product Brief [HiRes - LoRes]

Block Diagram
Hardware Manual
Software Manual
Errata
2Gb DDR3 SDRAM
Dini Buses User FPGA Design Manual
PCIe DMA (ConfigFPGA design) User Manual

Related Resources

Emu Software
Emu Manual
Virtex6 Overview
Virtex6 Product Table