Customer application framework:play audio files
Overview
The CAF (customer application framework) API offers functions to play or record audio files:
On call legs (CTBCAFCallLeg):
- PlayStream()
- RecordStream()
On audio mixers (CTBCAFMixer):
- MixerPlayStream()
- MixerRecordStream()
It also offers callbacks on the call flow and call behavior classes (based on CTBCAFCallFlow or CTBCAFCallBehavior) to notify the application when files have started or stopped playing or recording:
- OnStreamPlayingStarted()
- OnStreamPlayingDone()
- OnStreamRecordingStarted()
- OnStreamRecordingDone()
- OnMixerStreamPlayingStarted()
- OnMixerStreamPlayingDone()
- OnMixerStreamRecordingStarted()
- OnMixerStreamRecordingDone()
API parameters
PlayStream / MixerPlayStream
Class CTBCMC_PLAY_ATTRIBUTE is used to build play attributes. Available attributes are:
- AddPlayFilePath: Function to add a file to the list of files to play. It takes the following parameters:
- File path (or URI) (see "Play path format" below for details)
- Start/end offsets (optional)
- Repeat count (optional) specific for this file in the sequence of files
- fAllowBargeInInterruption: Allow barge-in interruption (automatic stopping of playback upon detected DTMF)
- fPrepareForRecording: Prepare hardware resources for recording (pre-reserve recording resources). Provides performance gain when both playing/recording on a call leg or mixer.
- un32RepeatCount: Number of times to play the whole sequence of files
- fNotifyStartOfNewFile: Asks for OnStreamPlayingStarted (or OnMixerStreamPlayingStarted) events for each file in the sequence of files (otherwise, one for the whole sequence)
- s8AudioGainDB: Gain (or loss) of audio level
- fAllowMixing: Allow this playing sequence to be mixed with another file sequence simultaneously playing to the same call leg (or mixer)
- un8PlayIndex: Sequence "index" (0 to 3) to assign to this playing sequence. Will replace a previous playing sequence using the same index, but will be mixed with simultaneously playing sequences with other indexes on the same call leg (or mixer).
- fPaused: Prepare the playback, but start paused until further notice.
Play path format
The format of the path used for playing files can be relative, absolute, or a URI of a file on a HTTP server.
In a sequence of files to play, a mix of relative path, absolute path or URI can be used without restriction.
An absolute path
Absolute paths can be used. Examples of absolute paths:
/lib/tb/toolpack/pkg/prompts/welcome.vox
c:/audio/test.pcm
A relative path
Paths are relative to the tbstreamserver application's working directory:
- /lib/tb/toolpack/setup/12358/2.7/apps/tbstreamserver/
(in the path above, replace 12358 by your System Id, and 2.7 by your current Toolpack major version)
Examples of relative paths:
file://prompts/my_prompt.wav
file://../../../audio_files/hello_world.alaw
A file name without a path
When providing a file name without a path, there are two possible behaviors:
- From routing scripts, the default "prompts" folder will be used: /lib/tb/toolpack/pkg/prompts/
- C++ API CTBCAFPlayAttributes::AddPlayList yields the same result: Using folder /lib/tb/toolpack/pkg/prompts/
- C++ API CTBCMC_PLAY_ATTRIBUTE::AddPlayFilePath assumes a relative path (see above)
The URI of a file on a HTTP server
The Stream Server application supports playing files that are located on a remote HTTP server. In that case, use a standard HTTP URI. Examples of HTTP URI for files on remote servers:
http://www.my_files_server.com/ring_back_tone/user_1441.wav
http://10.0.0.10:8080/dir/subdir/file.g723
Using file choice
The Stream Server application is able to choose the first available file among a list of files to choose from.
The choices can be any mix of relative paths, absolute paths or URIs, without restriction.
Syntax of file choice
A file choice is provided by creating a comma-separated list of file choices, enclosed in parentheses:
(first_path,second_path,third_path)
This allows the PlayStream API to play a chain of files in which each file can itself be a choice between multiple files (local, or on a remote HTTP server).
Examples
- User custom file, with fall-back to default: (prompts/user_1441/ring_back_tone.wav,prompts/default/ring_back_tone.wav)
- User language, with default language: (prompts/fr/welcome.wav,prompts/en/welcome.wav,prompts/default/welcome.wav)
- Per day of the week prompts: (prompts/monday/menu.wav,prompts/week_day/menu.wav,prompts/default/menu.wav)
- HTTP server redundancy: (http://primary_server/prompts/welcome.wav,http://secondary_server/prompts/welcome.wav,prompts/service_unavailable.wav)
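The file-choice syntax above is plain string formatting, so an application can build it from a list of candidate paths. The following is a minimal sketch; MakeFileChoice is a hypothetical helper written for illustration and is not part of the CAF API. The resulting string would then be passed wherever a play file path is expected.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the CAF API): joins candidate paths into
// the "(first_path,second_path,third_path)" file-choice syntax.
// A single candidate is returned unchanged, since no choice is needed.
std::string MakeFileChoice(const std::vector<std::string>& candidates) {
    if (candidates.empty()) return "";
    if (candidates.size() == 1) return candidates.front();

    std::string choice = "(";
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (i > 0) choice += ",";   // comma-separated, no spaces
        choice += candidates[i];
    }
    choice += ")";
    return choice;
}
```

For example, MakeFileChoice({"prompts/fr/welcome.wav", "prompts/en/welcome.wav"}) yields (prompts/fr/welcome.wav,prompts/en/welcome.wav).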
RecordStream / MixerRecordStream
Class CTBCMC_RECORD_ATTRIBUTE is used to build record attributes. Available attributes are:
- AddRecFilePath: Function to set the path of the file to record (Note: URI not supported when recording)
- fRecordTones: Indicates if tones (DTMF) must be recorded or suppressed from the recording
- fPrepareForPlaying: Prepare hardware resources for playing (pre-reserve playing resources). Provides performance gain when both playing/recording on a call leg or mixer.
- fPaused: Prepare the recording, but start paused until further notice.
Record path format
The format of the path used for recording files can be relative or absolute. But it cannot be a URI of a file on a HTTP server (not supported).
Features for playing files
Here is a list of features supported when playing files with PlayStream() or MixerPlayStream() API calls:
Load sharing
When a file play is requested, Toolpack will try to do load sharing among available Stream Server applications (on various Toolpack hosts):
- See which server has the most known files among the sequence of files to play
- If more than one server has the files, the least loaded Stream Server is chosen
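The two selection rules above can be sketched as follows. This is only an illustration of the described policy, not Toolpack's actual implementation: the StreamServerInfo structure, its fields and the ChooseServer function are all hypothetical names, and the sketch assumes that "load" can be expressed as a single number where lower means less loaded.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one Stream Server (illustration only).
struct StreamServerInfo {
    std::string name;
    int knownFiles;  // how many files of the requested sequence this server has
    double load;     // relative load, lower means less loaded
};

// Returns the index of the server to use, or -1 if the list is empty.
// Rule 1: prefer the server that knows the most files of the sequence.
// Rule 2: on a tie, prefer the least loaded server.
int ChooseServer(const std::vector<StreamServerInfo>& servers) {
    int best = -1;
    for (int i = 0; i < static_cast<int>(servers.size()); ++i) {
        if (best < 0) { best = i; continue; }
        const StreamServerInfo& candidate = servers[i];
        const StreamServerInfo& current = servers[best];
        const bool knowsMore = candidate.knownFiles > current.knownFiles;
        const bool tieButLessLoaded =
            candidate.knownFiles == current.knownFiles && candidate.load < current.load;
        if (knowsMore || tieButLessLoaded) best = i;
    }
    return best;
}
```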
Playing a chain of multiple files
When calling the PlayStream() API, a sequence of multiple files can be provided. For each file in the sequence, a start offset, an end offset and a repeat count can be provided.
Providing choice between multiple files
As explained in "Using file choice" above, it's possible to provide multiple choices for a file to play, and the Stream Server application will play the first file found among the choices.
Start/end offsets, repeat count
For each file in a sequence of files to play, a start offset, an end offset and a repeat count can be provided. A global repeat count (for the whole sequence) can also be provided.
Pause/Resume
Playing streams can be paused or resumed at any time during a call flow, using the PauseStream(), ResumeStream(), MixerPauseStream() or MixerResumeStream() API calls. Pausing or resuming a stream does not modify any hardware resources, and thus has a minimal cost and impact on system performance.
Live transcoding
Stream Server supports playing files in various formats. When a playing file is not already in TDM format (aLaw, 8Khz, mono), it will be transcoded "live" to TDM format.
Supported file formats are:
- ".alaw": Raw alaw encoded audio file (8khz, 8 bits, mono, alaw)
- ".ulaw": Raw ulaw encoded audio file (8khz, 8 bits, mono, ulaw)
- ".pcm" : Linear 16 bits PCM audio file (8khz, 16 bits, mono)
- ".g721": ADPCM 32Kbps, G721 encoding (Note: heavy CPU usage when transcoding)
- ".g723": ADPCM 24Kbps, G723 encoding (Note: heavy CPU usage when transcoding)
- ".g726": ADPCM 16Kbps, G726 encoding (Note: heavy CPU usage when transcoding)
- ".vox" : ADPCM 32Kbps encoding
- ".vox6": ADPCM 24Kbps (6Khz sampling rate) encoding
- ".wav" : Microsoft Wave file format using any of the following options:
- Encodings:
- aLaw / uLaw
- PCM 8 bits / 16 bits
- ADPCM (G721 32Kbps, G723 24Kbps or G726 16Kbps, VOX)
- Channels:
- Mono / Stereo
- Sample rate:
- 8Khz, 11.025Khz, 16Khz, 22.05Khz, 44.1Khz or 48Khz
Audio mixing
The CAF PlayStream() and MixerPlayStream() API calls both support managing multiple simultaneously playing sequences of files (using the un8PlayIndex attribute):
- Each play sequence is assigned an "index" (un8PlayIndex)
- Each playing "index" can be controlled in an independent manner (started, stopped, paused, resumed)
- Each playing "index" can have its own independent audio gain
- Audio output of all simultaneously playing sequences (all "indexes") is mixed by the Stream Server application, and sent to TMedia units as one audio stream (and thus does not require increased Ethernet bandwidth on the LAN between Stream Server and TMedia)
Audio gain
An audio gain (or loss) can be provided to a playing file, in case some audio files on disk are known to have inappropriate level. This is also very useful when "mixing" multiple playing streams to the same call leg (or same audio mixer).
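To make the relationship between a gain in dB and mixing more concrete, here is a minimal sketch of applying a per-sequence gain and mixing two 16-bit linear PCM samples. It relies only on the standard conversion from decibels to an amplitude factor (10^(dB/20)). The function names are hypothetical, and this is not a description of how the Stream Server implements mixing internally.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Converts a gain in dB (negative values are a loss) to a linear amplitude factor.
double GainDbToLinear(int gainDb) {
    return std::pow(10.0, gainDb / 20.0);
}

// Mixes two 16-bit linear PCM samples, each with its own gain in dB.
// The sum is saturated to the 16-bit range instead of wrapping around.
std::int16_t MixSamples(std::int16_t a, int gainDbA, std::int16_t b, int gainDbB) {
    double mixed = a * GainDbToLinear(gainDbA) + b * GainDbToLinear(gainDbB);
    mixed = std::max(-32768.0, std::min(32767.0, mixed));
    return static_cast<std::int16_t>(std::lround(mixed));
}
```

The saturation step illustrates why a negative gain is often useful when mixing: two loud sequences summed at full level can clip, while lowering one of them leaves headroom.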
Playing remote (HTTP) files
The Stream Server is able to play files that are located on remote HTTP servers.
- Files are loaded from server chunk by chunk, when required (to avoid loading big files entirely when only beginning is being played)
- HTTP server download chunk size can be adjusted (smaller chunks more suitable for short call durations, longer chunks more efficient due to reduced number of HTTP download requests per second)
- A local disk cache is used on the Stream Server host to avoid loading multiple times recently accessed files from HTTP servers.
- File modification date checks are performed once in a while (can be configured) so Stream Server can detect if a cached file has been modified on the server
- Multiple download connections per server are used (can be configured) for increased performance, in particular when servers have multiple hard drives that can simultaneously perform multiple IO operations, or when network latency is high compared to available bandwidth
Caching
For performance reasons, different levels of caching are used when playing files:
- Caching of remote HTTP files on local disk (size can be configured, we recommend 10GB to 1TB of local disk cache)
- Caching of recently played files into RAM (size can be configured, we recommend 1GB to 10GB of RAM cache)
- Caching of known files (so that Toolpack does not always have to search all Stream Servers to know which server(s) have the requested file)
Features for recording files
Load sharing
When a record file request is made, Toolpack attempts to do load sharing among available Stream Server applications (on various Toolpack hosts):
- First, see if a file with the same path already exists on one of the Stream Servers. Overwrite this file if found (to avoid two recorded files with the same path!)
- If file does not exist anywhere, the least loaded Stream Server is chosen
- If the file is found on more than one Stream Server host (presumably because it was recorded on one host while the other host was down, and that other host already had a copy of the file), then Stream Server will choose any of the Stream Servers for recording (based on load sharing) and will delete the extra copies of the file with the same path.
Live transcoding
Just as Stream Server can perform live transcoding when playing files, it can also perform live transcoding when recording files. This happens automatically whenever the record file extension is not ".wav" or ".alaw".
The file formats that we recommend the most for recording are:
- ".alaw" (no transcoding required, no file header write required)
- ".vox" (light-weight transcoding required, saves 2x disk space compared to .alaw files)
- ".wav" (no transcoding required, easier to open with audio editors due to standard "wav" file format)
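The "2x disk space" figure follows directly from the bit rates listed earlier: raw aLaw is 8Khz x 8 bits = 64Kbps, while ".vox" is 32Kbps ADPCM. The sketch below does that arithmetic for a given recording duration. The constant and function names are hypothetical, and the result covers raw audio data only (a ".wav" file adds a small header).

```cpp
#include <cstdint>

// Bit rates taken from the supported format list above.
constexpr std::uint64_t kAlawBitsPerSecond = 8000 * 8;  // 8Khz, 8 bits, mono
constexpr std::uint64_t kVoxBitsPerSecond = 32000;      // ADPCM 32Kbps

// Size in bytes of the raw audio data for a recording of the given duration.
constexpr std::uint64_t RecordingBytes(std::uint64_t bitsPerSecond, std::uint64_t seconds) {
    return bitsPerSecond * seconds / 8;
}
```

One minute of recording is therefore RecordingBytes(kAlawBitsPerSecond, 60) = 480,000 bytes in ".alaw" and RecordingBytes(kVoxBitsPerSecond, 60) = 240,000 bytes in ".vox".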
Helpful variables to build play or record file paths
When providing a path to play or record a file, Toolpack will automatically replace some variables found in the path with the appropriate value.
The following variables can be used:
- @{Nap}: Nap of this call leg
- Example: "NAP_PROVIDER_3"
- @{CalledNumber}: Called number of call leg
- Example: "15552211"
- @{CallingNumber}: Calling number of call leg
- Example: "5551313"
- @{LegId}: Current LegId (Unique Id for this leg)
- Example: "F3D67B4B"
- @{Protocol}: Protocol type of current leg
- Example: "SS7"
- @{Direction}: Direction of current leg (IN or OUT)
- Example: "IN"
- @{DATE %Y...%S}: Current date/time, formatted according to C function strftime
- Example: "2014-06-14"
- @{TB_SETUP_HOME}: Path of Toolpack's "setup" folder.
- Example: "/lib/tb/toolpack/setup/12358"
- @{TB_SETUP_PATH}: Path of Toolpack's current version's "setup" folder
- Example: "/lib/tb/toolpack/setup/12358/2.7"
- @{PKG_HOME}: Path where packages are stored. Note: It's not recommended to use that path on redundant systems, package file replication may cause confusion in recorded files.
- Example: "/lib/tb/toolpack/pkg"
- @{CURRENT_PKG}: Version of current package
- Example: "2.6.45"
- @{PROMPT_PATH}: Default path where audio prompts are stored. Note: It's not recommended to use that path on redundant systems, package file replication may cause confusion in recorded files.
- Example: "/lib/tb/toolpack/pkg/prompts"
- @{RECORD_PATH}: Path of Toolpack's default folder to store recorded calls (*** Toolpack 2.8 and above only)
- Example: "/lib/tb/toolpack/setup/12358/recorded_calls"
- @{TBX_GW_PORT}: Current "System Id" (also called "Gateway Port")
- Example: "12358"
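Toolpack performs the substitution itself, so the application only needs to put the variables in the path string. To preview what a path template would expand to (for example in a unit test or a log message), the substitution can be imitated locally. The sketch below is an illustration under that assumption: PreviewPath is a hypothetical helper, not a Toolpack function, and it only replaces the names given in the map (it does not interpret the @{DATE ...} format itself).

```cpp
#include <map>
#include <string>

// Hypothetical preview helper (not part of Toolpack): replaces each "@{Name}"
// found in 'path' by values.at("Name"). Unknown variables are left untouched.
std::string PreviewPath(const std::string& path,
                        const std::map<std::string, std::string>& values) {
    std::string out;
    std::size_t pos = 0;
    while (pos < path.size()) {
        const std::size_t start = path.find("@{", pos);
        const std::size_t end =
            (start == std::string::npos) ? std::string::npos : path.find('}', start + 2);
        if (end == std::string::npos) {          // no further complete variable
            out.append(path, pos, std::string::npos);
            break;
        }
        out.append(path, pos, start - pos);      // literal text before the variable
        const std::string name = path.substr(start + 2, end - start - 2);
        const auto it = values.find(name);
        if (it != values.end()) {
            out += it->second;
        } else {
            out.append(path, start, end - start + 1);  // keep "@{Name}" as is
        }
        pos = end + 1;
    }
    return out;
}
```

With the example values listed above, the template @{RECORD_PATH}/@{Nap}/@{LegId}.alaw previews as /lib/tb/toolpack/setup/12358/recorded_calls/NAP_PROVIDER_3/F3D67B4B.alaw.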
Performance
Relatively cheap server hardware should be sufficient to feed the biggest Toolpack systems with up to 16 TMedia units (over 30,000 simultaneous playbacks), in most situations.
However, it's important to properly understand the hardware requirements to avoid performance bottlenecks.
Typical bottlenecks
CPU usage
CPU usage is very rarely a bottleneck with the Stream Server application, unless heavy audio transcoding or audio mixing is used.
- Without transcoding or mixing, CPU usage is almost always below 10% of one CPU core (even for large number of playing channels)
- CPU usage of "live" transcoding varies from one file format to another. For example, transcoding of "vox" files uses little CPU. A typical quad-core Intel processor can transcode 10,000 to 20,000 simultaneously playing channels.
- Audio mixing and audio gain use very little CPU. A typical quad-core Intel processor can perform "live" audio mixing for more than 20,000 simultaneously playing channels.
Codec | 1000 playbacks | 2000 playbacks | 4000 playbacks | 8000 playbacks | 16000 playbacks |
---|---|---|---|---|---|
aLaw (.alaw or .wav) | low | low | 10% | 20% | 35% |
pcm (.pcm or .wav) 8Khz, 16 bits, mono | low | 35% | 65% | 160% | 200% |
vox (.vox) | 100% | 200% | 300% | 450% | 480% |
44Khz, 16 bits, stereo (.wav) | 150% | 290% | 430% | 500% | - |
G.721, G.723, G.726 (.g721, .g723, .g726) | 480% | 600% | - | - | - |
aLaw + audio gain adjustment | low | low | 20% | 30% | 82% |
vox + audio gain adjustment | 120% | 220% | 325% | 550% | 600% |
alaw + alaw (mixing 2 files + gain) | 20% | 50% | 100% | 180% | 200% |
vox + vox (mixing 2 files + gain) | 220% | 325% | 550% | - | - |
Notes:
- TMG7800-ctrl is limited to around 16,000 channels due to available Ethernet bandwidth (dual gigabit Ethernet).
- A TMG7800 system (having redundant host) can reach 32,000 channels as stream server applications from both hosts are used in load sharing mode
- Reaching these levels of performance may require updating the Stream Server configuration to allow 8 transcoding CPU cores, and 1GB of RAM
- CPU usage may appear non-linear in some cases, mostly because of the effects of HyperThreading (the CPU has 4 physical cores + 4 virtual "hyperthread" cores which don't have the same processing capacity as the physical cores)
CPU usage on a TMG3200 host
Tested with release 2.7.22.
The TMG3200 host has an Intel Atom 550 (1.5Ghz Dual core + hyperthreading, 1M Cache).
Codec | 200 playbacks | 400 playbacks | 1000 playbacks | 2000 playbacks | 4000 playbacks |
---|---|---|---|---|---|
aLaw (.alaw or .wav) | low | low | low | 10% | 20% |
pcm (.pcm or .wav) 8Khz, 16 bits, mono | low | 15% | 30% | 60% | 120% |
vox (.vox) | 15% | 30% | 60% | 120% | 300% |
44Khz, 16 bits, stereo (.wav) | 60% | 120% | 330% | - | - |
G.721, G.723, G.726 (.g721, .g723, .g726) | 160% | 330% | - | - | - |
aLaw + audio gain adjustment | low | low | 10% | 20% | 40% |
vox + audio gain adjustment | 15% | 30% | 70% | 150% | 350% |
alaw + alaw (mixing 2 files + gain) | low | 15% | 40% | 70% | 140% |
vox + vox (mixing 2 files + gain) | 40% | 80% | 150% | 350% | - |
Notes:
- Reaching these levels of performance may require updating the Stream Server configuration to allow 2 transcoding CPU cores, and 512MB of RAM
- CPU usage may appear non-linear in some cases, mostly because of the effects of HyperThreading (the CPU has 2 physical cores + 2 virtual "hyperthread" cores which don't have the same processing capacity as the physical cores)
Disk performance
Disk performance can quickly become a bottleneck if the pool of frequently played files is larger than the RAM cache size (typically, more than a couple of GB).
Some quick tips:
- A server-grade hard drive can sustain 50 to 100 file play requests per second (could be less if the file play requests include sequence of multiple files, or multiple files to choose from).
- Performance of multiple drives joined as RAID is not always linear with the number of drives
- SSD drives offer performance several orders of magnitude above the performance of hard drives (a single SSD can typically replace a RAID of more than 10 of the fastest hard drives)
- For playback-only applications, SSD drives will last much longer than hard drives due to the absence of moving parts when reading data
=> We highly recommend using SSD drives in servers that will be used for playback of a large number of channels (>2000) with the Stream Server.
RAM usage
The Stream Server application RAM usage can be relatively high, based on the number of simultaneously playing channels. A simplified calculation would be that each GB of RAM allowed to the Stream Server application allows playback of about 4,000 to 8,000 simultaneous channels.
The Stream Server application will use all the RAM that it was assigned for caching of recently played files. So it may be useful to allow Stream Server to use more RAM than the bare minimum for the required number of playing channels.
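The simplified rule above (about 4,000 to 8,000 simultaneous channels per GB of RAM) can be turned into a quick sizing estimate. The sketch below only restates that rule as arithmetic; the RamEstimate structure and EstimateRamGb function are hypothetical names, and the result is a bare minimum for playback that does not include extra RAM for caching recently played files.

```cpp
// Range of RAM (in GB) suggested by the simplified rule of
// 4,000 to 8,000 simultaneous channels per GB.
struct RamEstimate {
    double minGb;  // optimistic: 8,000 channels per GB
    double maxGb;  // conservative: 4,000 channels per GB
};

// Hypothetical sizing helper: bare minimum RAM for the given number of
// simultaneously playing channels, excluding any extra RAM cache.
RamEstimate EstimateRamGb(unsigned channels) {
    return RamEstimate{channels / 8000.0, channels / 4000.0};
}
```

For 16,000 channels this gives a range of 2GB to 4GB. Allowing more than that lets the Stream Server keep more recently played files in its RAM cache.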
Network bandwidth
The number of simultaneous playbacks with one Stream Server application is generally limited by the available network bandwidth. Expected performance of networks is:
- A healthy Gigabit Ethernet network can support 8,000 playing channels (can support around 12,000, but we don't recommend going above 8,000)
- A TMG7800-ctrl server can support 16,000 playing channels, due to load sharing between independent eth0 and eth1 networks
- A TMG7800 system can support 32,000 playing channels, due to load sharing between eth0 and eth1 of both primary and secondary TMG7800-ctrl servers
Note: If playing from HTTP server, don't forget to also validate required HTTP bandwidth between the TMG7800-ctrl servers and the HTTP servers (presumably through the "mgmt" IP interfaces of the TMG7800-ctrl servers, rather than private eth0/eth1 interfaces)
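The channel limits above can be sanity-checked against the audio payload alone. Audio in TDM format (aLaw, 8Khz, 8 bits, mono) is 64Kbps per channel, so the payload bandwidth grows linearly with the number of channels. The sketch below computes only that payload. PayloadMbps is a hypothetical name, and packet header overhead on the LAN between Stream Server and TMedia is deliberately not included, since it depends on transport details not covered here.

```cpp
// Audio payload bandwidth, in Mbit/s, for the given number of playing
// channels at 64Kbps each (aLaw, 8Khz, 8 bits, mono).
// Packet header overhead is NOT included.
constexpr double PayloadMbps(unsigned channels) {
    return channels * 64000.0 / 1e6;
}
```

PayloadMbps(8000) is 512 Mbit/s and PayloadMbps(12000) is 768 Mbit/s, which is consistent with a Gigabit link having comfortable headroom at 8,000 channels and little headroom at 12,000. PayloadMbps(16000) is 1024 Mbit/s, which exceeds a single Gigabit link and matches the need for load sharing between eth0 and eth1.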
HTTP server downloads
The performance of files playing from HTTP servers is rarely limited by the Stream Server itself. Stream Server can easily fill out a gigabit network with HTTP download requests, or reach a HTTP server disk performance limit. The HTTP server performance is generally limited by:
- Network performance for file chunks download through HTTP
- The disk performance of the HTTP server (number of "GET" requests per second that it can handle)
- The proportion of files found in the local disk cache versus files that need to be downloaded from the server
Important tips:
- Make sure to enable the HTTP "keepalive" option of the HTTP server (this avoids TCP connections being closed and re-opened between each file chunk download)
- Use SSD drive (or RAID with high IO per second) on the HTTP server
- Configure an appropriate HTTP download chunk size in Toolpack
- Not too big, to avoid wasting bandwidth downloading big file chunks whose end is almost never played
- Not too small to avoid excessive number of HTTP "GET" requests per second
- Configure an appropriate number of HTTP download threads per server in Toolpack (8 to 16 threads seems a sweet spot on a Gigabit LAN)
- Large enough to compensate network latency
- Large enough so HTTP server has multiple HTTP requests to process from separate hard drives of a RAID simultaneously
- Not too large to avoid unnecessary overhead
- Use a sufficient local disk cache size, ideally big enough to cache most files from the HTTP servers (we don't recommend over 1TB, however)
- Adjust the delay between file modification time checks with the server
- Not too small to avoid large number of empty HTTP "GET" requests to re-validate file modification dates with the server
- Not too large to avoid unnecessary delays when a file content is modified, before the Stream Server notices the file was changed
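The chunk size trade-off can be estimated with simple arithmetic: each channel playing a file that is not yet in the local cache consumes the file at its encoded byte rate, and one HTTP "GET" is needed per chunk. The sketch below is a rough steady-state estimate under those assumptions. HttpGetRequestsPerSecond is a hypothetical name, and it ignores cache hits, file modification checks and files shorter than one chunk.

```cpp
// Rough steady-state estimate of HTTP "GET" requests per second.
//   uncachedChannels   : channels playing files not found in the local cache
//   fileBytesPerSecond : encoded byte rate of the files (8000 for ".alaw")
//   chunkBytes         : configured HTTP download chunk size, in bytes
double HttpGetRequestsPerSecond(unsigned uncachedChannels,
                                double fileBytesPerSecond,
                                double chunkBytes) {
    if (chunkBytes <= 0.0) return 0.0;  // invalid chunk size, nothing to estimate
    return uncachedChannels * fileBytesPerSecond / chunkBytes;
}
```

For example, 1,000 uncached channels playing ".alaw" files (8,000 bytes per second) with 64,000-byte chunks generate about 125 requests per second. Halving the chunk size doubles that rate.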
Failure recovery
If the stream server application fails, or if the host on which it was running fails:
- Call control application is notified that playbacks and recording are stopped:
- OnStreamPlayingDone( with cause TBCMC_IVR_REASON_CODE_DISCONNECTED )
- OnStreamRecordingDone( with cause TBCMC_IVR_REASON_CODE_DISCONNECTED )
No action is automatically taken by Toolpack. The application will decide how to handle that interrupted playback or recording.
- If another stream server is available (on another redundant host), the application may decide to:
- Restart the playback from the beginning, or from the offset it had already reached in the played file
- Restart the recording
- For an IVR call, reset to "top" of the menu after playing a "sorry" prompt
- Drop the call
- etc...
RAM caching
The tbstreamserver application will make maximum use of the RAM that it's been allowed to use (as defined in its Web Portal configuration page).
In fact, it's using available RAM to cache recently played files, so the next time they're played it's not required to read them from disk. This caching mechanism has the following advantages:
- Increased performance per hard drive (number of play per second, and/or number of sustainable simultaneous playbacks)
- Increased hard drive life span due to reduced number of head moves
- Improved responsiveness (latency) for starting to play files during higher load or bursts of new files to play
RAM usage and Linux Swappiness
However, modern operating systems (such as Linux) tend to swap out to hard drive portions of RAM that have not been accessed for long periods, and unfortunately this can be the case for tbstreamserver's RAM cache. Having the RAM cache swapped out to disk is the worst thing that can happen. It's actually much worse than not caching the file at all (it can even affect all playing channels by completely stalling the application for a few seconds).
To prevent that, Linux can be instructed NOT to swap out untouched portions of RAM aggressively. This is known as "swappiness".
It is highly recommended, for any system running tbstreamserver (or any Toolpack system in general), to set swappiness close to 0 (the default being 60, which is definitely not suitable for any real-time software).
You can find information on swappiness and how to modify it here: http://en.wikipedia.org/wiki/Swappiness
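On Linux, the current swappiness value is exposed by the kernel in /proc/sys/vm/swappiness. A monitoring or startup check could read it and warn when it is not close to 0. The sketch below shows one way to do this; ReadSwappiness is a hypothetical helper, not part of Toolpack, and it only reads the value (changing it requires administrator rights, for example through the vm.swappiness sysctl setting).

```cpp
#include <fstream>
#include <string>

// Reads the Linux swappiness value from the given file.
// Returns -1 if the file cannot be opened or does not contain a number
// (for example on a non-Linux host).
int ReadSwappiness(const std::string& path = "/proc/sys/vm/swappiness") {
    std::ifstream in(path);
    int value = -1;
    if (!(in >> value)) return -1;
    return value;
}
```

An application could call ReadSwappiness() at startup and log a warning when the returned value is well above 0.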