GPU inference performance
This page compares the performance and estimated running costs of transcription running on standard Azure VMs.
Operating Point | CPU, Standard | CPU, Enhanced | GPU (T4), Standard | GPU (T4), Enhanced |
---|---|---|---|---|
Lowest Processing Cost (US ¢ per hour) | 1.7 | 3.8 | 0.64 | 2.24 |
Cost vs CPU (%) | - | - | 38% | 59% |
Maximum throughput ¹ | 53.2 | 23.7 | 117.7 | 33.54 |
Minimum Real-Time Factor (RTF) ² | 0.14 | 0.33 | 0.043 | 0.088 |
Transcriber count | 20 | 20 | 50 | 13 |
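The "Cost vs CPU (%)" row can be reproduced from the cost row, assuming it is simply the GPU cost divided by the CPU cost at the same operating point. The sketch below uses only figures from the table; the function name `cost_vs_cpu_percent` is illustrative and does not come from any product API.

```python
# Lowest processing cost in US cents per hour, copied from the table above.
costs = {
    "Standard": {"cpu": 1.7, "gpu": 0.64},
    "Enhanced": {"cpu": 3.8, "gpu": 2.24},
}


def cost_vs_cpu_percent(gpu_cost: float, cpu_cost: float) -> float:
    """Return the GPU cost as a percentage of the CPU cost.

    Assumption: the table's "Cost vs CPU (%)" row is this plain ratio.
    """
    return 100.0 * gpu_cost / cpu_cost


for operating_point, cost in costs.items():
    pct = cost_vs_cpu_percent(cost["gpu"], cost["cpu"])
    print(f"{operating_point}: GPU cost is {pct:.0f}% of CPU cost")
```

Under that assumption the ratios round to 38% for Standard and 59% for Enhanced, matching the table.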
The benchmark used the following configuration:
Benchmark details | |
---|---|
CPU | Standard_D16ds_v5 |
GPU | Standard_NC8as_T4_v3 |
Price basis | Azure PAYG East US, Linux, Standard |
² An RTF of 1 means that a one-hour file takes one hour to transcribe. An RTF of 0.1 means that a one-hour file takes six minutes to transcribe.
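The RTF definition amounts to multiplying the audio duration by the RTF. As a rough sketch, the helper below (the name `processing_minutes` is hypothetical) applies that to a one-hour file for each minimum RTF in the results table. It assumes the minimum RTF applies to the whole file and ignores any start-up or queuing overhead.

```python
def processing_minutes(audio_minutes: float, rtf: float) -> float:
    """Estimated transcription time: audio duration multiplied by the RTF."""
    return audio_minutes * rtf


# Minimum RTF values copied from the results table.
minimum_rtf = {
    "CPU, Standard": 0.14,
    "CPU, Enhanced": 0.33,
    "GPU (T4), Standard": 0.043,
    "GPU (T4), Enhanced": 0.088,
}

one_hour = 60.0  # length of the audio file in minutes
for configuration, rtf in minimum_rtf.items():
    minutes = processing_minutes(one_hour, rtf)
    print(f"{configuration}: about {minutes:.1f} minutes for a one-hour file")
```

With an RTF of 0.1 the same helper gives six minutes for a one-hour file, in line with the footnote.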