GPU inference performance

Transcription: Batch | Deployments: Container

This page compares the performance and estimated running costs of transcription on standard Azure VMs, using a CPU instance and a GPU (T4) instance.

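| | CPU | CPU | GPU (T4) | GPU (T4) |
| --- | --- | --- | --- | --- |
| Operating Point | Standard | Enhanced | Standard | Enhanced |
| Lowest Processing Cost (US ¢ per hour) | 1.7 | 3.8 | 0.64 | 2.24 |
| Cost vs CPU (%) | - | - | 38% | 59% |
| Maximum throughput ¹ | 53.2 | 23.7 | 117.7 | 33.54 |
| Minimum Real-Time Factor (RTF) ² | 0.14 | 0.33 | 0.043 | 0.088 |
| Transcriber count | 20 | 20 | 50 | 13 |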
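
The "Cost vs CPU" row follows directly from the cost row: it is the GPU cost divided by the CPU cost at the same operating point. The following is a minimal sketch of that arithmetic; the dictionary and function names are illustrative assumptions and are not part of any product API.

```python
# Lowest processing cost figures (US cents per hour), copied from the table above.
# COST_CENTS and gpu_cost_vs_cpu are hypothetical names used only for this sketch.
COST_CENTS = {
    ("CPU", "Standard"): 1.7,
    ("CPU", "Enhanced"): 3.8,
    ("GPU", "Standard"): 0.64,
    ("GPU", "Enhanced"): 2.24,
}


def gpu_cost_vs_cpu(operating_point: str) -> float:
    """Return the GPU cost as a percentage of the CPU cost for one operating point."""
    gpu = COST_CENTS[("GPU", operating_point)]
    cpu = COST_CENTS[("CPU", operating_point)]
    return 100.0 * gpu / cpu


for point in ("Standard", "Enhanced"):
    # Rounded to whole percent, matching the precision used in the table.
    print(f"{point}: GPU costs {gpu_cost_vs_cpu(point):.0f}% of CPU")
```

Rounded to whole percentages, this reproduces the 38% and 59% figures in the table.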

The benchmark used the following configuration:

| Benchmark details | |
| --- | --- |
| CPU | D16ds_v5 |
| GPU | Standard_NC8as_T4_v3 |
| Price basis | Azure PAYG East US, Linux, Standard |

¹ Throughput is measured as hours of audio per hour of system runtime. A throughput of 50 means that in one hour, the system as a whole can transcribe 50 hours of audio.

² An RTF of 1 means that a one-hour file takes one hour to transcribe. An RTF of 0.1 means that a one-hour file takes six minutes to transcribe.
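
The two definitions above reduce to simple multiplications. As a rough sketch (the function names are hypothetical, chosen only for illustration), they can be written as:

```python
def processing_minutes(audio_hours: float, rtf: float) -> float:
    """Minutes needed to transcribe a file of the given length at the given RTF."""
    return audio_hours * rtf * 60.0


def audio_hours_transcribed(throughput: float, runtime_hours: float) -> float:
    """Hours of audio the whole system transcribes in the given runtime."""
    return throughput * runtime_hours


# The examples from the notes above: a one-hour file at RTF 1 and RTF 0.1,
# and a system with a throughput of 50 running for one hour.
print(f"RTF 1.0: {processing_minutes(1.0, 1.0):.0f} minutes")
print(f"RTF 0.1: {processing_minutes(1.0, 0.1):.0f} minutes")
print(f"Throughput 50: {audio_hours_transcribed(50, 1):.0f} hours of audio")
```

Note that RTF describes how quickly a single file is processed, while throughput describes the system as a whole with all transcribers running, so the two are not simple reciprocals of each other.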