TITLE:


ACCELERATING COMPUTATIONS IN VERY LARGE APPLICATIONS USING DATA FLOW BASED ACCELERATORS


DATE:


Friday, February 11th, 2011


TIME:


3:30 PM


LOCATION:


GMCS 214


SPEAKER:


Mike Flynn, Chairman, Maxeler Corp., www.maxeler.com


ABSTRACT:


For many high-performance computing applications, the alternative to the multicore rack is to attach an accelerator to each multicore node. There are several classes of such accelerators: GPGPUs, specialized processors, and FPGAs.

At Maxeler we’ve found that FPGA array technology wins on performance for most relevant applications. Given the initial area-time-power disadvantage of an FPGA compared to, say, a custom-designed adder, this is a surprising result. The sheer magnitude of the available FPGA parallelism overcomes that initial disadvantage.
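
To see why, a rough break-even calculation helps. The Python sketch below asks how many slow FPGA operators it takes to match one fast custom unit; both clock rates are our illustrative assumptions, not figures from the talk.

    # When does sheer parallelism overcome the per-operator disadvantage?
    # Break-even: fpga_operators * fpga_clock_hz > custom_units * custom_clock_hz.
    # Both clock rates below are illustrative assumptions.

    custom_clock_hz = 3e9    # assumed clock of a custom-designed adder
    fpga_clock_hz = 150e6    # assumed clock of one FPGA operator (~20x slower)

    break_even = custom_clock_hz / fpga_clock_hz
    print(f"array pulls ahead beyond {break_even:.0f} parallel operators")  # 20

Since an FPGA can host thousands of such operators, the array sits well past break-even, which is the surprise described above.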

For very large applications (more than 10^20 operations per run, or continuously running), we first identify the locus of dynamic activity (loosely termed the “kernels”). This work is assigned to the accelerator. Next, where possible, the relevant program is configured as a streaming computation: a static instruction graph activated by data streams. Using FPGA technology, this data flow graph can be configured as a synchronous data flow machine that executes the computation. The array is synchronized to accept a new set of input arguments each cycle, across a pipeline of up to 500 stages.
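
For readers unfamiliar with the model, the following minimal Python sketch simulates the synchronous data flow idea: a static chain of pipelined operators that accepts one new input each cycle and, once full, emits one result each cycle. The three-stage graph and the software simulation are ours for illustration only; Maxeler’s actual kernels are compiled to FPGA hardware, not run in Python.

    # Simulate a synchronous data flow pipeline: a static chain of operators
    # with one register after each stage; a new input enters every cycle.
    def run_pipeline(stages, inputs):
        depth = len(stages)
        regs = [None] * depth          # pipeline registers, initially empty
        outputs = []
        # Clock once per input, then `depth` extra cycles to drain the pipe.
        for x in list(inputs) + [None] * depth:
            if regs[-1] is not None:   # a finished result leaves the array
                outputs.append(regs[-1])
            for i in range(depth - 1, 0, -1):  # advance stages back to front
                regs[i] = stages[i](regs[i - 1]) if regs[i - 1] is not None else None
            regs[0] = stages[0](x) if x is not None else None
        return outputs

    # Illustrative 3-stage graph computing (x^2 + 1) * 2, one value per cycle.
    stages = [lambda x: x * x, lambda x: x + 1, lambda x: x * 2]
    print(run_pipeline(stages, [1, 2, 3]))   # [4, 10, 20]

In hardware the same principle holds at hundreds of stages: latency is the pipeline depth, but throughput is one full set of operations per cycle.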

As an example, consider modeling problems in geophysics. In a typical problem we realize a 2,000-node data flow array on two FPGAs, with a resulting 50-100x speedup over a conventional multicore server.
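
As a sanity check on that range, the short calculation below combines the 2,000-operator figure from the abstract with a data flow clock and server rates that are our illustrative assumptions:

    # Estimated speedup of the data flow array over a multicore server.
    # Array size is from the abstract; clock and server rates are assumptions.
    array_ops_per_cycle = 2000
    fpga_clock_hz = 150e6
    array_throughput = array_ops_per_cycle * fpga_clock_hz   # 3e11 ops/s

    for server_throughput in (3e9, 6e9):  # assumed sustained multicore rates
        print(f"{array_throughput / server_throughput:.0f}x")  # 100x, 50x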


HOST:


Jose Castillo

