nbody benchmark on the GTX Titan

To start with, here are the results from nbody.

Single precision (float)

[t_azu@linux]$ ./nbody -benchmark
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure perfomance.
	-fullscreen       (run n-body simulation in fullscreen mode)
	-fp64             (use double precision floating point values for simulation)
	-hostmem          (stores simulation data in host memory)
	-benchmark        (run benchmark to measure performance) 
	-numbodies=<N>    (number of bodies (>= 1) to run in simulation) 
	-device=<d>       (where d=0,1,2.... for the CUDA device to use)
	-numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
	-compare          (compares simulation results running once on the default GPU and once on the CPU)
	-cpu              (run n-body simulation on the CPU)
	-tipsy=<file.bin> (load a tipsy model file for simulation)

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "GeForce GTX TITAN" with compute capability 3.5

> Compute 3.5 CUDA device: [GeForce GTX TITAN]
14336 bodies, total time for 10 iterations: 30.587 ms
= 67.193 billion interactions per second
= 1343.858 single-precision GFLOP/s at 20 flops per interaction

Double precision (double, -fp64)

[t_azu@linux]$ ./nbody --benchmark -fp64
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure perfomance.
	-fullscreen       (run n-body simulation in fullscreen mode)
	-fp64             (use double precision floating point values for simulation)
	-hostmem          (stores simulation data in host memory)
	-benchmark        (run benchmark to measure performance) 
	-numbodies=<N>    (number of bodies (>= 1) to run in simulation) 
	-device=<d>       (where d=0,1,2.... for the CUDA device to use)
	-numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
	-compare          (compares simulation results running once on the default GPU and once on the CPU)
	-cpu              (run n-body simulation on the CPU)
	-tipsy=<file.bin> (load a tipsy model file for simulation)

> Windowed mode
> Simulation data stored in video memory
> Double precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "GeForce GTX TITAN" with compute capability 3.5

> Compute 3.5 CUDA device: [GeForce GTX TITAN]
14336 bodies, total time for 10 iterations: 103.493 ms
= 19.858 billion interactions per second
= 595.751 double-precision GFLOP/s at 30 flops per interaction
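
As a sanity check, the figures printed above can be re-derived from the body count and the elapsed time. The sketch below assumes the sample counts one interaction per ordered pair of bodies (N x N per iteration), which is consistent with the printed numbers but is an assumption made here; the function name `nbody_rates` is hypothetical. The flops-per-interaction values (20 and 30) are taken from the output itself.

```python
def nbody_rates(num_bodies, iterations, total_ms, flops_per_interaction):
    """Return (billion interactions/s, GFLOP/s) for an all-pairs n-body run.

    Assumes N * N interactions per iteration (every body against every body).
    """
    interactions = num_bodies * num_bodies * iterations
    seconds = total_ms / 1000.0
    interactions_per_sec = interactions / seconds
    gflops = interactions_per_sec * flops_per_interaction / 1e9
    return interactions_per_sec / 1e9, gflops


# Values copied from the two runs above.
single = nbody_rates(14336, 10, 30.587, 20)
double = nbody_rates(14336, 10, 103.493, 30)

print("single: %.3f G interactions/s, %.3f GFLOP/s" % single)
print("double: %.3f G interactions/s, %.3f GFLOP/s" % double)
print("time ratio double/single: %.2f" % (103.493 / 30.587))
```

The recomputed values agree with the printed ones up to rounding of the millisecond timing. In terms of elapsed time, the double-precision run takes roughly 3.4 times as long as the single-precision run; the GFLOP/s gap looks smaller (about 2.3x) only because the sample counts 30 flops per interaction for double versus 20 for single.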