Customer application framework:play audio files

From TBwiki

Revision as of 08:06, 14 March 2013

Overview

The CAF (customer application framework) API offers functions to play or record audio files:

On call legs (CTBCAFCallLeg):

  • PlayStream()
  • RecordStream()

On audio mixers (CTBCAFMixer):

  • MixerPlayStream()
  • MixerRecordStream()

It also offers callbacks on the call flow and call behavior classes (based on CTBCAFCallFlow or CTBCAFCallBehavior) to notify the application when files have started or stopped playing or recording:

  • OnStreamPlayingStarted()
  • OnStreamPlayingDone()
  • OnStreamRecordingStarted()
  • OnStreamRecordingDone()
  • OnMixerStreamPlayingStarted()
  • OnMixerStreamPlayingDone()
  • OnMixerStreamRecordingStarted()
  • OnMixerStreamRecordingDone()

API parameters

PlayStream / MixerPlayStream

Class CTBCMC_PLAY_ATTRIBUTE is used to build play attributes. Available attributes are:

  • AddPlayFilePath: Function to add a file to the list of files to play. It takes the following parameters:
    • File path (or URI) (see "Play path format" below for details)
    • Start/end offsets (optional)
    • Repeat count (optional) specific for this file in the sequence of files
  • fAllowBargeInInterruption: Allow barge-in interruption (automatic stopping of playback upon detected DTMF)
  • fPrepareForRecording: Prepare hardware resources for recording (pre-reserve recording resources). Provides performance gain when both playing/recording on a call leg or mixer.
  • un32RepeatCount: Number of times to play the whole sequence of files
  • fNotifyStartOfNewFile: Asks for OnStreamPlayingStarted (or OnMixerStreamPlayingStarted) events for each file in the sequence of files (otherwise, one for the whole sequence)
  • s8AudioGainDB: Gain (or loss) of audio level
  • fAllowMixing: Allow this playing sequence to be mixed with another file sequence simultaneously playing to the same call leg (or mixer)
  • un8PlayIndex: Sequence "index" (0 to 3) to assign to this playing sequence. Will replace a previous playing sequence using the same index, but will be mixed with simultaneously playing sequences with other indexes on the same call leg (or mixer).
  • fPaused: Prepare the playback, but start paused until further notice.

Play path format

The format of the path used for playing files can be relative, absolute, or a URI of a file on an HTTP server.

In a sequence of files to play, a mix of relative path, absolute path or URI can be used without restriction.

A relative path

Paths are relative to the tbstreamserver application's working directory:

  • /lib/tb/toolpack/setup/12358/2.7/apps/tbstreamserver/

(in the path above, replace 12358 by your System Id, and 2.7 by your current Toolpack major version)

Examples of relative paths:

 prompts/my_prompt.wav
 ../../../audio_files/hello_world.alaw

An absolute path

Absolute paths can also be used. Examples of absolute paths:

 /lib/tb/toolpack/pkg/prompts/welcome.vox
 c:/audio/test.pcm

The URI of a file on an HTTP server

The Stream Server application supports playing files that are located on a remote HTTP server. In that case, use a standard HTTP URI. Examples of HTTP URIs for files on remote servers:

 http://www.my_files_server.com/ring_back_tone/user_1441.wav
 http://10.0.0.10:8080/dir/subdir/file.g723
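
The three path forms described above can be told apart with a simple check. The following is a minimal sketch, not part of the CAF API: the function name classify_play_path is hypothetical, and the rules are an assumption based only on the examples shown in this section (Unix-style root, Windows-style drive letter, "http://" prefix).

```python
import re


def classify_play_path(path: str) -> str:
    """Return 'uri', 'absolute' or 'relative' for a play path (illustrative only)."""
    # Assumption: only plain HTTP URIs are considered, as in the examples above.
    if path.lower().startswith("http://"):
        return "uri"
    # Unix-style root ("/lib/...") or Windows-style drive letter ("c:/audio/...").
    if path.startswith("/") or re.match(r"^[A-Za-z]:[/\\]", path):
        return "absolute"
    # Anything else is resolved against the tbstreamserver working directory.
    return "relative"
```

Such a check can be handy in an application to reject URIs before calling RecordStream(), since URIs are not supported when recording.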

Using file choice

The Stream Server application is able to choose the first available file among a list of files to choose from.

Choices can be a mix of relative paths, absolute paths or URIs, without restriction.

Syntax of file choice

A file choice is provided by creating a comma-separated list of file choices, enclosed in parentheses:

(first_path,second_path,third_path)

This allows the PlayStream API to play a chain of files in which each file can itself be a choice between multiple files (local, or on a remote HTTP server).

Examples

  • User custom file, with fall-back to default: (prompts/user_1441/ring_back_tone.wav,prompts/default/ring_back_tone.wav)
  • User language, with default language: (prompts/fr/welcome.wav,prompts/en/welcome.wav,prompts/default/welcome.wav)
  • Per day of the week prompts: (prompts/monday/menu.wav,prompts/week_day/menu.wav,prompts/default/menu.wav)
  • HTTP server redundancy: (http://primary_server/prompts/welcome.wav,http://secondary_server/prompts/welcome.wav,prompts/service_unavailable.wav)
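
The file choice syntax above is easy to generate from application code. The following is a minimal sketch under stated assumptions: the helper names (build_file_choice, parse_file_choice, first_available) are hypothetical and not part of the CAF API, and because this page does not document any escaping rule, the sketch simply rejects paths containing commas or parentheses. The first_available function only illustrates the "first available file" rule; the real selection is done by the Stream Server.

```python
from typing import Callable, Iterable, List, Optional


def build_file_choice(paths: Iterable[str]) -> str:
    """Build a "(first,second,third)" file choice string."""
    paths = list(paths)
    if not paths:
        raise ValueError("at least one path is required")
    for p in paths:
        # Assumption: no escaping is available, so separators are rejected.
        if any(c in p for c in "(),"):
            raise ValueError("path contains a reserved character: " + p)
    return "(" + ",".join(paths) + ")"


def parse_file_choice(choice: str) -> List[str]:
    """Split a file choice string back into its candidate paths."""
    if choice.startswith("(") and choice.endswith(")"):
        return choice[1:-1].split(",")
    return [choice]  # a plain path is a choice with one candidate


def first_available(choice: str, exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate for which exists() is true, else None."""
    for path in parse_file_choice(choice):
        if exists(path):
            return path
    return None
```

For example, build_file_choice(["prompts/fr/welcome.wav", "prompts/en/welcome.wav"]) produces a string that can be passed as the file path of one entry in the play sequence.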

RecordStream / MixerRecordStream

Class CTBCMC_RECORD_ATTRIBUTE is used to build record attributes. Available attributes are:

  • AddRecFilePath: Function to set the path of the file to record (Note: URI not supported when recording)
  • fRecordTones: Indicates if tones (DTMF) must be recorded or suppressed from the recording
  • fPrepareForPlaying: Prepare hardware resources for playing (pre-reserve playing resources). Provides performance gain when both playing/recording on a call leg or mixer.
  • fPaused: Prepare the recording, but start paused until further notice.

Record path format

The format of the path used for recording files can be relative or absolute. It cannot be a URI of a file on an HTTP server (not supported).

Features for playing files

Here is a list of features supported when playing files with PlayStream() or MixerPlayStream() API calls:

Load sharing

When a file play is requested, Toolpack will try to do load sharing among available Stream Server applications (on various Toolpack hosts):

  • See which server has the most known files among the sequence of files to play
  • If more than one server has the files, the least loaded Stream Server is chosen
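
The two selection rules above can be sketched as follows. This is only an illustration of the described behavior, not Toolpack's actual implementation: the StreamServer class, its load field (for example, a count of active streams) and the choose_server function are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class StreamServer:
    name: str
    load: int  # hypothetical load metric, e.g. number of active streams
    known_files: Set[str] = field(default_factory=set)


def choose_server(servers: List[StreamServer], files: Iterable[str]) -> StreamServer:
    """Prefer the server knowing the most requested files, then the least loaded."""
    if not servers:
        raise ValueError("no Stream Server available")
    wanted = set(files)
    # Sort key: more known files first (hence the minus sign), then lower load.
    return min(servers, key=lambda s: (-len(wanted & s.known_files), s.load))
```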

Playing a chain of multiple files

When calling the PlayStream() API, a sequence of multiple files can be provided. For each file in the sequence, a start offset, an end offset and a repeat count can be provided.

Providing choice between multiple files

As explained in "Using file choice" above, it's possible to provide multiple choices for a file to play, and the Stream Server application will play the first file found among the choices.

Start/end offsets, repeat count

For each file in a sequence of files to play, a start offset, an end offset and a repeat count can be provided. A global repeat count (for the whole sequence) can also be provided.

Pause/Resume

Playing streams can be paused or resumed at any time during a call flow, using the PauseStream(), ResumeStream(), MixerPauseStream() or MixerResumeStream() API calls. Pausing or resuming a stream does not modify any hardware resources, and thus has a minimal cost and impact on system performance.

Live transcoding

Stream server supports playing files from various formats. When a playing file is not already in TDM format (aLaw, 8Khz, mono), it will be transcoded "live" to TDM format.

Supported file formats are:

  • ".alaw": Raw alaw encoded audio file (8khz, 8 bits, mono, alaw)
  • ".ulaw": Raw ulaw encoded audio file (8khz, 8 bits, mono, ulaw)
  • ".pcm" : Linear 16 bits PCM audio file (8khz, 16 bits, mono)
  • ".g721": ADPCM 32Kbps, G721 encoding (Note: heavy CPU usage when transcoding)
  • ".g723": ADPCM 24Kbps, G723 encoding (Note: heavy CPU usage when transcoding)
  • ".g726": ADPCM 16Kbps, G726 encoding (Note: heavy CPU usage when transcoding)
  • ".vox" : ADPCM 32Kbps encoding
  • ".vox6": ADPCM 24Kbps (6Khz sampling rate) encoding
  • ".wav" : Microsoft Wave file format using any of the following options:
    • Encodings:
      • aLaw / uLaw
      • PCM 8 bits / 16 bits
      • ADPCM (G721 32Kbps, G723 24Kbps or G726 16Kbps, VOX)
    • Channels:
      • Mono / Stereo
    • Sample rate:
      • 8Khz, 11.025Khz, 16Khz, 22.05Khz, 44.1Khz or 48Khz

Audio mixing

The CAF PlayStream() and MixerPlayStream() API calls both support managing multiple simultaneously playing sequences of files (using the un8PlayIndex attribute):

  • Each play sequence is assigned an "index" (un8PlayIndex)
  • Each playing "index" can be controlled in an independent manner (started, stopped, paused, resumed)
  • Each playing "index" can have its own independent audio gain
  • Audio output of all simultaneously playing sequences (all "indexes") is mixed by the Stream Server application, and sent to TMedia units as one audio stream (and thus does not require increased Ethernet bandwidth on the LAN between Stream Server and TMedia)
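
The replace-or-mix semantics of the play index can be modeled with a small table of slots. This is a conceptual sketch only: the PlaySlots class is hypothetical and does not exist in the CAF API. It only illustrates that a new sequence replaces the previous one on the same index (0 to 3), while sequences on other indexes keep playing and are mixed together.

```python
from typing import Dict, List, Optional

MAX_PLAY_INDEX = 3  # un8PlayIndex ranges from 0 to 3


class PlaySlots:
    """Hypothetical model of the play sequences active on one call leg or mixer."""

    def __init__(self) -> None:
        self._slots: Dict[int, List[str]] = {}

    def play(self, index: int, sequence: List[str]) -> Optional[List[str]]:
        """Start a sequence on an index; return the sequence it replaced, if any."""
        if not 0 <= index <= MAX_PLAY_INDEX:
            raise ValueError("play index must be between 0 and 3")
        replaced = self._slots.get(index)
        self._slots[index] = list(sequence)
        return replaced

    def stop(self, index: int) -> None:
        """Stop one index without touching the others."""
        self._slots.pop(index, None)

    def active(self) -> Dict[int, List[str]]:
        """Sequences currently mixed into the single output stream."""
        return dict(sorted(self._slots.items()))
```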

Audio gain

An audio gain (or loss) can be applied to a playing file, in case some audio files on disk are known to have an inappropriate level. This is also very useful when "mixing" multiple playing streams to the same call leg (or same audio mixer).
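
As an illustration of what a gain in dB means for the audio samples, here is a minimal sketch. It assumes 16-bit linear samples and saturating arithmetic; how the Stream Server actually applies gain and mixes streams internally is not documented here, and the function names are hypothetical. The only fixed fact used is the standard conversion from dB to an amplitude factor, 10^(dB/20).

```python
from itertools import zip_longest
from typing import List

PCM_MIN, PCM_MAX = -32768, 32767  # range of 16-bit linear samples


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def clip(value: int) -> int:
    """Saturate a sample to the 16-bit range instead of wrapping around."""
    return max(PCM_MIN, min(PCM_MAX, value))


def apply_gain(samples: List[int], gain_db: float) -> List[int]:
    """Scale every sample by the gain, with saturation."""
    factor = db_to_linear(gain_db)
    return [clip(int(round(s * factor))) for s in samples]


def mix(a: List[int], b: List[int]) -> List[int]:
    """Mix two streams by summing samples; the shorter one is padded with silence."""
    return [clip(x + y) for x, y in zip_longest(a, b, fillvalue=0)]
```

A negative gain on each stream before mixing (for example -6 dB) reduces the risk of saturation when both streams are loud at the same time.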

Playing remote (HTTP) files

The Stream Server is able to play files that are located on remote HTTP servers.

  • Files are loaded from the server chunk by chunk, when required (to avoid loading big files entirely when only the beginning is being played)
  • HTTP server download chunk size can be adjusted (smaller chunks more suitable for short call durations, longer chunks more efficient due to reduced number of HTTP download requests per second)
  • A local disk cache is used on the Stream Server host to avoid loading recently accessed files multiple times from HTTP servers.
  • File modification date checks are performed once in a while (can be configured) so Stream Server can detect if a cached file has been modified on the server
  • Multiple download connections per server are used (can be configured) for increased performance, in particular when servers have multiple hard drives that can simultaneously perform multiple IO operations, or when network latency is high compared to available bandwidth

Caching

For performance reasons, different levels of caching are used when playing files:

  • Caching of remote HTTP files on local disk (size can be configured, we recommend 10GB to 1TB of local disk cache)
  • Caching of recently played files into RAM (size can be configured, we recommend 1GB to 10GB of RAM cache)
  • Caching of known files (so that Toolpack does not always have to search all Stream Servers to know which server(s) have the requested file)

Features for recording files

Load sharing

When a record file request is made, Toolpack attempts to do load sharing among available Stream Server applications (on various Toolpack hosts):

  • First, see if a file with the same path already exists on one of the Stream Servers. Overwrite this file if found (to avoid two recorded files with the same path!)
  • If file does not exist anywhere, the least loaded Stream Server is chosen

Live transcoding

The same way Stream Server can perform live transcoding for playing files, it can also perform live transcoding when recording files. This happens automatically whenever the record file extension is neither ".wav" nor ".alaw".

The file formats that we recommend the most for recording are:

  • ".alaw" (no transcoding required, no file header write required)
  • ".vox" (light-weight transcoding required, saves 2x disk space compared to .alaw files)
  • ".wav" (no transcoding required, easier to open with audio editors due to standard "wav" file format)
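
To compare the disk space used by each raw format, the bit rates listed under "Supported file formats" can be converted to bytes per second. The sketch below is an estimate under stated assumptions: it covers only headerless formats (".wav" is left out because its size depends on the encoding and header), it assumes the same extensions apply to recording, and the function names are hypothetical.

```python
import os

# Bytes per second of audio, derived from the bit rates listed on this page
# (for example 8 kHz x 8 bits = 64 kbit/s = 8000 bytes/s for aLaw).
BYTES_PER_SECOND = {
    ".alaw": 8000,
    ".ulaw": 8000,
    ".pcm": 16000,
    ".g721": 4000,
    ".g723": 3000,
    ".g726": 2000,
    ".vox": 4000,
    ".vox6": 3000,
}


def recording_size_bytes(extension: str, seconds: int) -> int:
    """Estimate the size of a raw recording of the given duration."""
    return BYTES_PER_SECOND[extension.lower()] * seconds


def record_needs_transcoding(path: str) -> bool:
    """Per this page, recording transcodes unless the file is .wav or .alaw."""
    extension = os.path.splitext(path)[1].lower()
    return extension not in (".wav", ".alaw")
```

With these figures, one minute of ".alaw" takes 480,000 bytes and one minute of ".vox" takes 240,000 bytes, which matches the 2x saving mentioned above.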

Performance

Relatively cheap server hardware should be sufficient to feed the biggest Toolpack systems with up to 16 TMedia units (over 30,000 simultaneous playbacks), in most situations.

However, it's important to properly understand the hardware requirements to avoid performance bottlenecks.

Typical bottlenecks

CPU usage

CPU usage is very rarely a bottleneck with the Stream Server application, unless heavy audio transcoding or audio mixing is used.

  • Without transcoding or mixing, CPU usage is almost always below 10% of one CPU core (even for large number of playing channels)
  • CPU usage of "live" transcoding varies from one file format to another. For example, transcoding of "vox" files uses little CPU. A typical quad-core Intel processor can transcode 10,000 to 20,000 simultaneously playing channels.
  • Audio mixing and audio gain use very little CPU. A typical quad-core Intel processor can perform "live" audio mixing for more than 20,000 simultaneously playing channels.

CPU usage on a TMG7800-ctrl host

Tested with release 2.7.22. The TMG7800-ctrl host has an Intel Xeon X3470 (2.93Ghz, Quad core + hyperthreading, 8M Cache).

 Codec                                       1000 playbacks   2000 playbacks   4000 playbacks   8000 playbacks   16000 playbacks
 aLaw (.alaw or .wav)                        low              low              10%              20%              35%
 pcm (.pcm or .wav) 8Khz, 16 bits, mono      low              35%              65%              160%             200%
 vox (.vox)                                  100%             200%             300%             450%             480%
 44Khz, 16 bits, stereo (.wav)               150%             290%             430%             500%             -
 G.721, G.723, G.726 (.g721, .g723, .g726)   480%             600%             -                -                -
 aLaw + audio gain adjustment                low              low              20%              30%              82%
 vox + audio gain adjustment                 120%             220%             325%             550%             600%
 alaw + alaw (mixing 2 files)                20%              50%              100%             180%             200%
 vox + vox (mixing 2 files)                  220%             325%             550%             -                -

Notes:

  • TMG7800-ctrl is limited to around 16,000 channels due to available Ethernet bandwidth (dual gigabit Ethernet)
  • Reaching these levels of performance may require updating the Stream Server configuration to allow 8 transcoding CPU cores and 1GB of RAM
  • CPU usage may appear non-linear in some cases, mostly because of the effects of HyperThreading (the CPU has 4 physical cores + 4 virtual "hyperthread" cores which don't have the same processing capacity as the physical cores)

CPU usage on a TMG3200 host

Tested with release 2.7.22. The TMG3200 host has an Intel Atom 550 (1.5Ghz Dual core + hyperthreading, 1M Cache).

 Codec                                       200 playbacks    400 playbacks    1000 playbacks   2000 playbacks   4000 playbacks
 aLaw (.alaw or .wav)                        low              low              low              10%              20%
 pcm (.pcm or .wav) 8Khz, 16 bits, mono      low              15%              30%              60%              120%
 vox (.vox)                                  15%              30%              60%              120%             300%
 44Khz, 16 bits, stereo (.wav)               60%              120%             330%             -                -
 G.721, G.723, G.726 (.g721, .g723, .g726)   160%             330%             -                -                -
 aLaw + audio gain adjustment                low              low              10%              20%              40%
 vox + audio gain adjustment                 15%              30%              70%              150%             350%
 alaw + alaw (mixing 2 files + gain)         low              15%              40%              70%              140%
 vox + vox (mixing 2 files + gain)           40%              80%              150%             350%             -

Notes:

  • Reaching these levels of performance may require updating the Stream Server configuration to allow 2 transcoding CPU cores and 512MB of RAM
  • CPU usage may appear non-linear in some cases, mostly because of the effects of HyperThreading (the CPU has 2 physical cores + 2 virtual "hyperthread" cores which don't have the same processing capacity as the physical cores)

Disk performance

Disk performance can quickly become a bottleneck if the pool of frequently played files is larger than the RAM cache size (typically, more than a couple of GB).

Some quick tips:

  • A server-grade hard drive can sustain 50 to 100 file play requests per second (could be less if the file play requests include sequences of multiple files, or multiple files to choose from).
  • Performance of multiple drives joined as RAID is not always linear with the number of drives
  • SSD drives offer performance several orders of magnitude above the performance of hard drives (a single SSD can typically replace a RAID of more than 10 of the fastest hard drives)
  • For playback-only applications, SSD drives will last much longer than hard drives due to the absence of moving parts when reading data

=> We highly recommend using SSD drives in servers that will be used for playback of a large number of channels (>2000) with the Stream Server.
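
To get a rough idea of the disk load, the number of play requests per second can be estimated from the number of simultaneous playbacks and the average playback duration (in steady state, request rate = simultaneous playbacks / average duration). The sketch below is a back-of-the-envelope estimate only: the function names are hypothetical, it ignores RAM cache hits (which reduce disk load), and it assumes linear scaling across drives, which, as noted above, RAID does not always deliver.

```python
import math


def play_requests_per_second(channels: int, avg_play_seconds: float) -> float:
    """Steady-state file play request rate for a given number of channels."""
    return channels / avg_play_seconds


def hard_drives_needed(requests_per_second: float, per_drive: float = 50.0) -> int:
    """Drives needed at 50 (pessimistic) to 100 (optimistic) requests/s per drive."""
    return math.ceil(requests_per_second / per_drive)
```

For example, 2,000 channels playing prompts of 20 seconds on average generate about 100 play requests per second, which is already at or beyond what a single hard drive can sustain.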

RAM usage

The Stream Server application RAM usage can be relatively high, based on the number of simultaneously playing channels. A simplified calculation would be that each GB of RAM allowed to the Stream Server application allows playback of about 4,000 to 8,000 simultaneous channels.
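
The simplified calculation above can be written as a small helper. This is only a sketch of the rule of thumb stated in this section (4,000 to 8,000 channels per GB); the function name is hypothetical, and the result does not include any extra RAM you may want to allow for caching.

```python
from typing import Tuple


def ram_gb_range(channels: int) -> Tuple[float, float]:
    """Return (optimistic, conservative) GB of RAM for a number of channels."""
    optimistic = channels / 8000    # 8,000 channels per GB
    conservative = channels / 4000  # 4,000 channels per GB
    return optimistic, conservative
```

For 16,000 channels this gives between 2 GB and 4 GB of RAM for the Stream Server application.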

The Stream Server application will use all the RAM that it was assigned for caching of recently played files. So it may be useful to allow Stream Server to use more RAM than the bare minimum for the required number of playing channels.

Network bandwidth

The number of simultaneous playbacks with one Stream Server application is generally limited by the available network bandwidth. Expected performance of networks is:

  • A healthy Gigabit Ethernet network can support 8,000 playing channels (can support around 12,000, but we don't recommend going above 8,000)
  • A TMG7800-ctrl server can support 16,000 playing channels, due to load sharing between independent eth0 and eth1 networks
  • A TMG7800 system can support 32,000 playing channels, due to load sharing between eth0 and eth1 of both primary and secondary TMG7800-ctrl servers
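
These figures follow from the recommended limit of 8,000 channels per Gigabit Ethernet link. The sketch below reproduces that arithmetic; the function names are hypothetical. The payload figure assumes each channel is sent in TDM format (aLaw, 64 kbit/s) and does not include packet header overhead, which depends on packetization details not covered here.

```python
CHANNELS_PER_GIGABIT_LINK = 8000  # recommended limit from this page
ALAW_PAYLOAD_BPS = 64000          # 8 kHz x 8 bits per channel


def max_channels(gigabit_links: int) -> int:
    """Recommended number of playing channels for a number of independent links."""
    return gigabit_links * CHANNELS_PER_GIGABIT_LINK


def payload_mbps(channels: int) -> float:
    """Audio payload bandwidth in Mbit/s, excluding packet headers."""
    return channels * ALAW_PAYLOAD_BPS / 1_000_000
```

max_channels(2) gives the 16,000 channels of one TMG7800-ctrl server (eth0 and eth1), and max_channels(4) gives the 32,000 channels of a system with primary and secondary servers. The audio payload alone for 8,000 channels is 512 Mbit/s.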

Note: If playing from HTTP server, don't forget to also validate required HTTP bandwidth between the TMG7800-ctrl servers and the HTTP servers (presumably through the "mgmt" IP interfaces of the TMG7800-ctrl servers, rather than private eth0/eth1 interfaces)

HTTP server downloads

The performance of playing files from HTTP servers is rarely limited by the Stream Server itself. Stream Server can easily fill a gigabit network with HTTP download requests, or reach an HTTP server's disk performance limit. Performance of playback from HTTP servers is generally limited by:

  • Network performance for file chunks download through HTTP
  • The disk performance of the HTTP server (number of "GET" requests per second that it can handle)
  • The proportion of files found in the local disk cache versus files that need to be downloaded from the server

Important tips:

  • Make sure to enable the HTTP "keepalive" option of the HTTP server (this prevents TCP connections from being closed and re-opened between each file chunk download)
  • Use SSD drive (or RAID with high IO per second) on the HTTP server
  • Configure an appropriate HTTP download chunk size in Toolpack
    • Not too big, to avoid wasting bandwidth downloading big file chunks whose end is almost never played
    • Not too small, to avoid an excessive number of HTTP "GET" requests per second
  • Configure an appropriate number of HTTP download threads per server in Toolpack (8 to 16 threads seems to be a sweet spot on a Gigabit LAN)
    • Large enough to compensate network latency
    • Large enough so HTTP server has multiple HTTP requests to process from separate hard drives of a RAID simultaneously
    • Not too large to avoid unnecessary overhead
  • Use a sufficient local disk cache size, ideally big enough to cache most files from the HTTP servers (we don't recommend over 1TB, however)
  • Adjust the delay between file modification time checks with the server
    • Not too small to avoid large number of empty HTTP "GET" requests to re-validate file modification dates with the server
    • Not too large to avoid unnecessary delays when a file content is modified, before the Stream Server notices the file was changed
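
The chunk size trade-off described in the tips above can be estimated with simple arithmetic. The sketch below is a worst-case illustration, assuming every chunk is a cache miss and that files are aLaw at 8,000 bytes per second; the function names and the example chunk sizes are hypothetical and are not Toolpack defaults.

```python
def chunk_requests_per_second(channels: int, chunk_bytes: int,
                              bytes_per_second: int = 8000) -> float:
    """HTTP GET rate when every playing channel downloads its file by chunks."""
    return channels * bytes_per_second / chunk_bytes


def wasted_bytes(played_bytes: int, chunk_bytes: int) -> int:
    """Bytes downloaded but never played when playback stops early."""
    chunks = -(-played_bytes // chunk_bytes)  # ceiling division
    return chunks * chunk_bytes - played_bytes
```

For example, with 1,000 channels, 64,000-byte chunks produce about 125 GET requests per second. If only 10,000 bytes of a file are played, a 65,536-byte chunk wastes 55,536 bytes, while 4,096-byte chunks waste only 2,288 bytes at the cost of many more requests.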


Good example of well-balanced host hardware
