deviceQuery on the GT240 - t

These are the deviceQuery results for the GT240 under the 195.30 BETA driver.

Device 0: "GeForce GT 240"
  CUDA Driver Version:                           3.0
  CUDA Runtime Version:                          2.30
  CUDA Capability Major revision number:         1
  CUDA Capability Minor revision number:         2
  Total amount of global memory:                 536674304 bytes
  Number of multiprocessors:                     12 (SMs, the units that thread blocks are scheduled onto; the GTX260 has 27)
  Number of cores:                               96 (multiprocessors * 8 = number of cores that threads run on)
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 16384
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          262144 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    1.34 GHz
  Concurrent copy and execution:                 Yes
  Run time limit on kernels:                     No
  Integrated:                                    No
  Support host page-locked memory mapping:       Yes
  Compute mode:                                  Default (multiple host threads can use this device simultaneously)

While I'm at it, here is the GeForce 8400 GS that I use for display output.

Device 1: "GeForce 8400 GS"
  CUDA Driver Version:                           3.0
  CUDA Runtime Version:                          2.30
  CUDA Capability Major revision number:         1
  CUDA Capability Minor revision number:         1
  Total amount of global memory:                 267714560 bytes
  Number of multiprocessors:                     1
  Number of cores:                               8
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          262144 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    1.40 GHz
  Concurrent copy and execution:                 No
  Run time limit on kernels:                     Yes
  Integrated:                                    No
  Support host page-locked memory mapping:       No
  Compute mode:                                  Default (multiple host threads can use this device simultaneously)

And here is the GTX260 as well (as of http://d.hatena.ne.jp/t_azu/20100109; note that the driver was not 195.30 BETA). Its CUDA Driver Version is older than that of the two above.

 CUDA Driver Version:                           2.30
 CUDA Runtime Version:                          2.30
 CUDA Capability Major revision number:         1
 CUDA Capability Minor revision number:         3
 Total amount of global memory:                 939261952 bytes (the amount of memory on the graphics card)
 Number of multiprocessors:                     27 (SMs, the units that thread blocks are scheduled onto)
 Number of cores:                               216 (number of cores that threads run on)
 Total amount of constant memory:               65536 bytes
 Total amount of shared memory per block:       16384 bytes
 Total number of registers available per block: 16384
 Warp size:                                     32
 Maximum number of threads per block:           512
 Maximum sizes of each dimension of a block:    512 x 512 x 64
 Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
 Maximum memory pitch:                          262144 bytes
 Texture alignment:                             256 bytes
 Clock rate:                                    1.35 GHz
 Concurrent copy and execution:                 Yes
 Run time limit on kernels:                     No
 Integrated:                                    No
 Support host page-locked memory mapping:       Yes
 Compute mode:                                  Default (multiple host threads can use this device simultaneously)