TCP Offload Engine IP (TOE)
For Latency Critical, FPGA-based Embedded Networking Applications
Overview

TCP Offload Engine (TOE) is FPGA-based IP that receives and transmits Ethernet/IP/TCP packets on Ethernet networks. TOE delivers payload data, in order, to the user's application. Its purpose is to offload the TCP/IP function from the CPU and perform it directly in FPGA-based hardware. TOE dramatically reduces input-to-output response time and jitter by eliminating the need for host processor intervention when analyzing data packets.

This IP is designed for FPGA-based, high-frequency, low-latency Wall Street trading applications. Input-to-output packet latency of less than 1 μs can be achieved. Assuming a 100-byte payload (164-byte packet), the theoretical minimum input-to-output latency is about 500 ns. The TOE works at the full 10 GbE line rate and was developed internally at DINI Group.

TCP offload is a required function in low-latency networking applications. Data-critical functions are executed directly in the FPGA. Infrequent, non-data TCP/IP functions (such as setup/teardown, ARP, ping, DHCP, et al.) are passed through to a standard Linux driver. Other, software-based TCP sessions run normally with no changes required. At the intended target frequency of 156.25 MHz, the TOE operates at the full 10 GbE line rate, generating no Ethernet pause frames.

10 GbE Media Access Controller from Xilinx

In minimum-latency designs, external PHYs must be avoided because they add significant latency. This IP assumes an FPGA PHY is used. The Xilinx PHY needs a MAC, and the Xilinx 10 Gigabit Ethernet Media Access Controller (10GEMAC) is required to use this TOE IP. The 10GEMAC is not included; you purchase it separately from Xilinx. Note that Xilinx has a free version that disables itself after a few hours. This free version contains all of the functionality of the full version and can be used for evaluation. The TOE IP can also connect to the slower 1 GbE MAC, and we can make modifications here in La Jolla to interface the TOE to different MACs.
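The line-rate and packet-size figures quoted in the Overview can be sanity-checked with a little arithmetic. The Python sketch below is illustrative only. It assumes a 64-bit datapath, which the text does not state, and the function names are made up for this example. It computes the raw datapath throughput at 156.25 MHz and the time a 164-byte packet occupies the wire at 10 Gb/s. It does not attempt to reproduce the 500 ns figure, whose breakdown the text does not give.

```python
import math

CLOCK_HZ = 156.25e6    # TOE/MAC clock frequency quoted in the text
DATAPATH_BITS = 64     # ASSUMPTION: 64-bit datapath (not stated in the text)
LINE_RATE_BPS = 10e9   # 10 GbE line rate


def datapath_throughput_bps(clock_hz=CLOCK_HZ, width_bits=DATAPATH_BITS):
    """Raw bits per second moved by a datapath of the given width and clock."""
    return clock_hz * width_bits


def wire_time_ns(packet_bytes, line_rate_bps=LINE_RATE_BPS):
    """Time in nanoseconds for a packet to serialize onto the wire."""
    return packet_bytes * 8 / line_rate_bps * 1e9


def datapath_cycles(packet_bytes, width_bits=DATAPATH_BITS):
    """Clock cycles needed to stream a packet through the datapath."""
    return math.ceil(packet_bytes * 8 / width_bits)


def cycle_time_ns(clock_hz=CLOCK_HZ):
    """Duration of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


if __name__ == "__main__":
    packet = 164  # 100-byte payload plus 64 bytes of overhead, per the text
    print("datapath throughput (Gb/s):", datapath_throughput_bps() / 1e9)
    print("wire time (ns):", wire_time_ns(packet))
    print("datapath cycles:", datapath_cycles(packet))
    print("cycle time (ns):", cycle_time_ns())
```

Under the 64-bit assumption, the datapath moves exactly 10 Gb/s at 156.25 MHz, which is consistent with the claim that the TOE keeps up with the line without pause frames. Serializing the 164-byte packet takes roughly 131 ns in each direction, so receive plus transmit serialization alone accounts for a little over half of the quoted 500 ns minimum.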
Contact Applistar sales for more information.

This Xilinx core is compatible with the Virtex-6 HXT FPGAs and works fine on both Virtex-7 and Kintex-7. We use the AXI4-S (AXI4-Stream) bus interface option. Our testing and debug were performed using the unrestricted version of this core.

PCIe Bridge (GEN1/GEN2)

A host interface is required to handle a number of functions related to the TOE, with configuration being the most important. A PCIe bridge is supplied in encrypted netlist format (.ngc) for this purpose. The PCIe bridge has 4 lanes of GEN1/GEN2 and is a full-function PCIe core. Configuration, BARs (base address registers), and bus-mastering DMA engines are included. Drivers with 'C' source for Linux are included.

TOE

The TOE implements the TCP function directly in FPGA gates. No external FPGA memory is required. TOEs can be cascaded to support multiple sessions. The TOE IP is intended to be clocked at the standard Ethernet interface frequency of 156.25 MHz, allowing fully synchronous, lowest-latency data exchange with the MAC. At 156.25 MHz, the TOE operates at the full 10 GbE line rate, generating no Ethernet pause frames.

The IP is supplied either as an encrypted .ngc netlist for implementation in Xilinx-based FPGAs or as Verilog source to do with as you see fit. Altera Stratix-5 and ASIC versions will follow shortly. A host interface is required, and this IP package includes an integrated 4-lane GEN1/GEN2 PCIe bridge. Simulation models and test fixtures are included.

This IP is optimized for low latency: the host CPU is NOT involved in payload data transfer. Not all TCP functions are handled in the IP. High-complexity, low-importance network features such as setup/teardown, ARP, ping, DHCP, et al. are passed to a Linux driver via the PCIe interface. 'C' source for this driver is included, allowing customization. All of the functions associated with OSI layers 2, 3, 4, and 5 (data link, network, transport, and session) are implemented.
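The split between the FPGA fast path and the Linux driver can be pictured as a per-packet classification. The Python sketch below is a hypothetical model, not the TOE's actual logic: the `PacketInfo` type, the `classify` function, and the rule that SYN/FIN/RST segments go to the driver are assumptions drawn from the statement that setup/teardown, ARP, ping, and DHCP are passed to software.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806
IPPROTO_ICMP = 1
IPPROTO_TCP = 6
IPPROTO_UDP = 17
TCP_FIN, TCP_SYN, TCP_RST = 0x01, 0x02, 0x04

# (source IP, source port, destination IP, destination port)
FourTuple = Tuple[str, int, str, int]


@dataclass(frozen=True)
class PacketInfo:
    """Hypothetical parsed header summary for one received frame."""
    ethertype: int
    ip_proto: Optional[int] = None
    four_tuple: Optional[FourTuple] = None
    tcp_flags: int = 0


def classify(pkt: PacketInfo, offloaded: Set[FourTuple]) -> str:
    """Return 'fpga' for fast-path data segments, else 'driver'."""
    if pkt.ethertype != ETHERTYPE_IPV4:
        return "driver"  # e.g. ARP
    if pkt.ip_proto != IPPROTO_TCP:
        return "driver"  # e.g. ICMP ping, DHCP over UDP
    if pkt.four_tuple not in offloaded:
        return "driver"  # software-based TCP sessions run unchanged
    if pkt.tcp_flags & (TCP_SYN | TCP_FIN | TCP_RST):
        return "driver"  # ASSUMPTION: setup/teardown handled in software
    return "fpga"        # established-session data stays in hardware


if __name__ == "__main__":
    session = ("10.0.0.1", 5000, "10.0.0.2", 6000)
    data = PacketInfo(ETHERTYPE_IPV4, IPPROTO_TCP, session, tcp_flags=0x18)
    print(classify(data, {session}))
    print(classify(PacketInfo(ETHERTYPE_ARP), {session}))
```

The point of the sketch is that only data segments belonging to an offloaded session take the hardware path; everything else falls through to the standard Linux stack.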
The user is responsible for presentation layer 6 and application layer 7, which can be implemented in the FPGA or elsewhere. The maximum transmission unit (MTU) is 1536 bytes. CRC validation, checksum validation, and reordering of out-of-order packets are done directly in the FPGA, along with packet retransmission when packets are received in error, lost, or out of order. The TX and RX replay buffers are configurable from 4 KB to 64 KB. Protection against wrapped sequence numbers (PAWS) is handled in the FPGA.

FPGA/ASIC Resources Required

TOE IP Distribution Model

The TOE IP is distributed in two different ways:

Model 1: Xilinx .ngc file

An .ngc file enables integration at the place-and-route stage in the Xilinx FPGA tools. Source is not provided, but full simulation libraries are supplied. You get this version when you get the FIX support package for our FPGA boards: DN_FBSP. Required operating system driver functions and APIs are supplied, with source, in 'C' for Linux. The TOE IP, supplied as part of the DN_FBSP, is restricted to DINI products and will not operate on other FPGA-based boards. You are welcome to deploy this IP free of royalties or restrictions on DINI Group products. A single DN_FBSP license is required for your company and allows your company to use it worldwide in any number of DINI boards and any number of applications.

Model 2: Verilog Source

Verilog is our native language. This second distribution option gets you the complete source. You are not allowed to redistribute the source. The license agreement has all the details, and the information in the license agreement supersedes what is written here. Under extreme duress, and only under extreme duress, we will convert to VHDL. Should we do this conversion, please note that new features and bug fixes will be available first in Verilog. We really don't like VHDL, and all reputable synthesis tools accept mixed-language RTL anyway.

A maintenance contract for bug fixes and feature enhancements is probably a good idea. One year is required at the time of purchase, with optional extensions sold on a yearly basis. Contact Applistar sales for more details.
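For readers building test fixtures around the checksum validation mentioned in the TOE section, the following Python sketch shows the standard Internet checksum (the 16-bit ones' complement sum described in RFC 1071) used by IPv4 and TCP headers. It is a software reference model, not the FPGA implementation, and the function names are chosen for this example. Note that the real TCP checksum also covers a pseudo-header, which is omitted here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def checksum_ok(data_with_checksum: bytes) -> bool:
    """A block that includes a correct checksum field sums to zero."""
    return internet_checksum(data_with_checksum) == 0


if __name__ == "__main__":
    # Sample 20-byte IPv4 header with the checksum field zeroed
    header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
    print(hex(internet_checksum(header)))
```

A receiver validates a header by running the same sum over the bytes with the checksum field left in place; a result of zero indicates the header arrived intact.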
What basic functions are required?
Features of the Xilinx 10GEMAC include: