GPU Performance Details: Tesla M2075

System Configuration

Note that this is previously stored data and does not reflect your system configuration.

MATLAB Release: R2016a

Host

Name: Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
Clock: 2201 MHz
Cache: 2048 KB
NumProcessors: 16
OSType: Windows
OSVersion: Microsoft Windows 7 Enterprise

GPU

Name: Tesla M2075
Clock: 1147 MHz
NumProcessors: 14
ComputeCapability: 2.0
TotalMemory: 5.25 GB
CUDAVersion: 7.5
DriverVersion: 8.17.13.5390 (353.90)

Results for MTimes (double)

These results show the performance of the GPU or host PC when calculating a matrix multiplication of two NxN real matrices. The number of operations is assumed to be 2*N^3 - N^2.

This calculation is usually compute-bound, i.e. the performance depends mainly on how fast the GPU or host PC can perform floating-point operations.
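As a rough illustration of how the GigaFLOPS figures relate to the operation count and the measured time, here is a minimal Python sketch. The function names `mtimes_ops` and `gigaflops` are hypothetical helpers introduced for this example and are not part of gpuBench; the sketch assumes the reported rate is simply the operation count divided by the elapsed time in seconds, scaled by 10^9.

```python
def mtimes_ops(n):
    """Assumed operation count for multiplying two n-by-n real matrices: 2*n^3 - n^2."""
    return 2 * n**3 - n**2


def gigaflops(num_ops, time_ms):
    """Convert an operation count and a time in milliseconds to GigaFLOPS."""
    seconds = time_ms * 1e-3
    return num_ops / seconds / 1e9


if __name__ == "__main__":
    n = 8192  # 8192 * 8192 = 67,108,864 array elements
    ops = mtimes_ops(n)
    # 3505.54 ms is the double-precision time reported for this array size
    print(ops, round(gigaflops(ops, 3505.54), 2))
```

Because the times in the report are rounded to two decimals, rates recomputed this way for the smallest arrays can differ noticeably from the reported GigaFLOPS values.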

Raw data for Tesla M2075 - MTimes (double)

Array size (elements)     Num Operations  Time (ms)  GigaFLOPS
                1,024             64,512       0.11       0.57
                4,096            520,192       0.10       5.13
               16,384          4,177,920       0.11      39.37
               65,536         33,488,896       0.21     156.41
              262,144        268,173,312       1.21     221.35
            1,048,576      2,146,435,072       7.06     304.08
            4,194,304     17,175,674,880      52.34     328.12
           16,777,216    137,422,176,256     414.74     331.34
           67,108,864  1,099,444,518,912    3505.54     313.63

(N gigaflops = N x 10^9 operations per second)

Results for Backslash (double)

These results show the performance of the GPU or host PC when calculating the matrix left division of an NxN matrix with an Nx1 vector. The number of operations is assumed to be 2/3*N^3 + 3/2*N^2.

This calculation is usually compute-bound, i.e. the performance depends mainly on how fast the GPU or host PC can perform floating-point operations.
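The operation-count formula for left division can be checked against the raw data with a short Python sketch. The helper name `backslash_ops` is hypothetical, and rounding to the nearest integer is an assumption made here because the formula does not yield whole numbers for these sizes; it happens to reproduce the counts listed in the report.

```python
def backslash_ops(n):
    """Assumed operation count for solving an n-by-n system: 2/3*n^3 + 3/2*n^2.

    The result is rounded to the nearest integer (an assumption of this sketch).
    """
    return round(2 / 3 * n**3 + 3 / 2 * n**2)


if __name__ == "__main__":
    # Matrix dimensions N whose element counts N*N are 1,024, 4,096 and 67,108,864
    for n in (32, 64, 8192):
        print(n * n, backslash_ops(n))
```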

Raw data for Tesla M2075 - Backslash (double)

Array size (elements)     Num Operations  Time (ms)  GigaFLOPS
                1,024             23,381       0.23       0.10
                4,096            180,907       0.31       0.58
               16,384          1,422,677       0.60       2.38
               65,536         11,283,115       2.24       5.04
              262,144         89,871,701       4.25      21.13
            1,048,576        717,400,747      10.10      71.02
            4,194,304      5,732,914,517      37.61     152.43
           16,777,216     45,838,150,315     197.26     232.38
           67,108,864    366,604,539,221    1321.71     277.37

(N gigaflops = N x 10^9 operations per second)

Results for FFT (double)

These results show the performance of the GPU or host PC when calculating the fast Fourier transform (FFT) of a vector of complex numbers. The number of operations for a vector of length N is assumed to be 5*N*log2(N).

This calculation is usually memory-bound, i.e. the performance depends mainly on how fast the GPU or host PC can read and write data.
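The assumed FFT operation count can be reproduced with a few lines of Python. The helper name `fft_ops` is hypothetical and introduced only for this sketch; the vector lengths used are powers of two, matching the array sizes in the raw data.

```python
import math


def fft_ops(n):
    """Assumed operation count for an FFT of a length-n complex vector: 5*n*log2(n)."""
    return round(5 * n * math.log2(n))


if __name__ == "__main__":
    # Vector lengths 2^10 and 2^24 (1,024 and 16,777,216 elements)
    for exponent in (10, 24):
        n = 2**exponent
        print(n, fft_ops(n))
```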

Raw data for Tesla M2075 - FFT (double)

Array size (elements)     Num Operations  Time (ms)  GigaFLOPS
                1,024             51,200       0.12       0.43
                4,096            245,760       0.15       1.67
               16,384          1,146,880       0.15       7.77
               65,536          5,242,880       0.20      26.63
              262,144         23,592,960       0.49      47.82
            1,048,576        104,857,600       1.72      60.87
            4,194,304        461,373,440       7.84      58.86
           16,777,216      2,013,265,920      30.38      66.27

(N gigaflops = N x 10^9 operations per second)

Results for MTimes (single)

These results show the performance of the GPU or host PC when calculating a matrix multiplication of two NxN real matrices. The number of operations is assumed to be 2*N^3 - N^2.

This calculation is usually compute-bound, i.e. the performance depends mainly on how fast the GPU or host PC can perform floating-point operations.

Raw data for Tesla M2075 - MTimes (single)

Array size (elements)     Num Operations  Time (ms)  GigaFLOPS
                1,024             64,512       0.10       0.62
                4,096            520,192       0.10       5.30
               16,384          4,177,920       0.09      45.33
               65,536         33,488,896       0.20     171.72
              262,144        268,173,312       0.72     374.87
            1,048,576      2,146,435,072       3.36     639.35
            4,194,304     17,175,674,880      25.21     681.40
           16,777,216    137,422,176,256     195.26     703.80
           67,108,864  1,099,444,518,912    1616.43     680.17
          268,435,456  8,795,824,586,752   13035.71     674.75

(N gigaflops = N x 10^9 operations per second)

Results for Backslash (single)

These results show the performance of the GPU or host PC when calculating the matrix left division of an NxN matrix with an Nx1 vector. The number of operations is assumed to be 2/3*N^3 + 3/2*N^2.

This calculation is usually compute-bound, i.e. the performance depends mainly on how fast the GPU or host PC can perform floating-point operations.

Raw data for Tesla M2075 - Backslash (single)

Array size (elements)     Num Operations  Time (ms)  GigaFLOPS
                1,024             23,381       0.23       0.10
                4,096            180,907       0.28       0.65
               16,384          1,422,677       0.59       2.42
               65,536         11,283,115       2.22       5.09
              262,144         89,871,701       3.49      25.73
            1,048,576        717,400,747       8.23      87.15
            4,194,304      5,732,914,517      24.45     234.48
           16,777,216     45,838,150,315     109.72     417.77
           67,108,864    366,604,539,221     704.08     520.69

(N gigaflops = N x 10^9 operations per second)

Results for FFT (single)

These results show the performance of the GPU or host PC when calculating the fast Fourier transform (FFT) of a vector of complex numbers. The number of operations for a vector of length N is assumed to be 5*N*log2(N).

This calculation is usually memory-bound, i.e. the performance depends mainly on how fast the GPU or host PC can read and write data.

Raw data for Tesla M2075 - FFT (single)

Array size (elements)     Num Operations  Time (ms)  GigaFLOPS
                1,024             51,200       0.10       0.53
                4,096            245,760       0.10       2.55
               16,384          1,146,880       0.15       7.90
               65,536          5,242,880       0.15      35.09
              262,144         23,592,960       0.33      70.70
            1,048,576        104,857,600       0.84     124.33
            4,194,304        461,373,440       3.20     144.18
           16,777,216      2,013,265,920      12.30     163.71
           67,108,864      8,724,152,320      87.85      99.31

(N gigaflops = N x 10^9 operations per second)
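To put the single- and double-precision results side by side, the following Python sketch computes the ratio of the highest GigaFLOPS value in each single-precision table to the highest value in the corresponding double-precision table. The names `PEAK_GFLOPS` and `single_to_double_ratio` are hypothetical, and taking the per-table maximum as the point of comparison is a choice made for this illustration, not something the report itself does.

```python
# Highest GigaFLOPS value from each raw-data table in this report,
# stored as (double precision, single precision).
PEAK_GFLOPS = {
    "MTimes": (331.34, 703.80),
    "Backslash": (277.37, 520.69),
    "FFT": (66.27, 163.71),
}


def single_to_double_ratio(benchmark):
    """Return peak single-precision rate divided by peak double-precision rate."""
    double_peak, single_peak = PEAK_GFLOPS[benchmark]
    return single_peak / double_peak


if __name__ == "__main__":
    for name in PEAK_GFLOPS:
        print(name, round(single_to_double_ratio(name), 2))
```

Note that the peaks do not always occur at the same array size in both precisions, so these ratios are only a coarse summary.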


Generated by gpuBench v1.7: 2017-02-19 15:03:01