The goal of this tutorial is to set up a demo environment allowing users to place calls to Asterisk and invoke a speech application utilizing the AWS Transcribe and Polly APIs. The tutorial provides complete step-by-step instructions for installation of Asterisk and UniMRCP server with AWS Transcribe and Polly plugins.
In the scope of this tutorial, a virtual machine (VM) will be launched on AWS, which is in line with the use of the AWS Transcribe and Polly APIs. A similar environment can be built on other cloud-based and/or on-prem infrastructures.
Both UniMRCP server and Asterisk are going to be installed on a single VM for simplicity.
While Ubuntu 20.04 LTS is used in this tutorial, the same or similar instructions apply to other Linux distributions supported by UniMRCP.
Select Ubuntu Server 20.04 LTS.
Select an instance type.
Configure the security group by specifying inbound rules for the SIP port and the RTP port required to place calls to Asterisk. This task can be done later, after the instance is launched.
Create a new or use an existing SSH key and launch the instance.
Take note of the public and private IP addresses, which will be used later in the Asterisk configuration.
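The inbound rules can be written down as data before they are entered in the AWS console. The following is a minimal sketch, not an official procedure. It assumes that Asterisk listens for SIP on its default UDP port 5060 and uses the RTP range 10000-10010 that is set later in rtp.conf. The helper names and the open source range 0.0.0.0/0 are illustrative and should be narrowed in practice. The dictionary layout mirrors the IpPermissions structure of the EC2 API, but no AWS call is made.

```python
def inbound_rule(protocol, from_port, to_port, cidr, description):
    """Build one inbound rule shaped like an EC2 IpPermissions entry."""
    return {
        "IpProtocol": protocol,
        "FromPort": from_port,
        "ToPort": to_port,
        "IpRanges": [{"CidrIp": cidr, "Description": description}],
    }


def asterisk_inbound_rules(sip_port=5060, rtp_start=10000, rtp_end=10010,
                           cidr="0.0.0.0/0"):
    """Return the SIP and RTP rules needed to place calls to Asterisk.

    Assumptions: SIP over UDP on port 5060 (Asterisk default) and the
    RTP range configured in rtp.conf further down in this tutorial.
    """
    return [
        inbound_rule("udp", sip_port, sip_port, cidr, "SIP signaling"),
        inbound_rule("udp", rtp_start, rtp_end, cidr, "RTP media"),
    ]


if __name__ == "__main__":
    for rule in asterisk_inbound_rules():
        print(rule)
```

Keeping the RTP range in one place makes it easier to keep the security group and rtp.conf in agreement.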
Connect to the instance via SSH using the associated key and the public IP address. The default username is ubuntu.
ssh -i your-private-key ubuntu@public-ip-address
UniMRCP binary packages are available to authenticated users only. To register a free account, visit the following page. Skip this step if you already have an account.
A newly registered account must be verified by the user and then activated by the administrator before proceeding further.
Account verification and activation.
Create a text file unimrcp.conf in the directory /etc/apt/auth.conf.d/.
sudo nano /etc/apt/auth.conf.d/unimrcp.conf
Supply login information in the following format.
machine unimrcp.org
login username
password password
The username and password fields must be replaced with the corresponding account credentials.
Account credentials.
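The login entry has a fixed three-line shape, so it can be generated rather than typed. The sketch below is illustrative: the function name is hypothetical, and the whitespace check rests on the assumption that the file is tokenized on whitespace, as netrc-style files usually are.

```python
def render_apt_auth(machine, login, password):
    """Return the three-line login entry for one repository host."""
    for value in (machine, login, password):
        # Assumption: fields are whitespace-separated tokens, so a value
        # containing whitespace would be split and misread.
        if not value or any(ch.isspace() for ch in value):
            raise ValueError("fields must be non-empty and contain no whitespace")
    return "machine %s\nlogin %s\npassword %s\n" % (machine, login, password)


if __name__ == "__main__":
    # Placeholder credentials; replace with the real account details.
    print(render_apt_auth("unimrcp.org", "username", "password"), end="")
```

The printed text can be pasted into /etc/apt/auth.conf.d/unimrcp.conf.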
Create a text file unimrcp.list in the directory /etc/apt/sources.list.d.
sudo nano /etc/apt/sources.list.d/unimrcp.list
Configure a repository by adding the following entry.
deb [arch=amd64] https://unimrcp.org/repo/apt/ focal main asterisk-16
For verification of binary packages, UniMRCP provides a public GnuPG key, which can be retrieved and installed as follows.
wget -O - https://unimrcp.org/keys/unimrcp-gpg-key.public | sudo apt-key add -
In order to check for updates and apply the changes in the APT configuration, use the following command.
sudo apt-get update
This section provides instructions for installation and configuration of the UniMRCP server with the Transcribe and Polly plugins.
In order to install the UniMRCP server with the Transcribe and Polly plugins, including all the dependencies, use the following command.
sudo apt-get install unimrcp-transcribe unimrcp-polly
As a result, apt-get resolves the dependencies and prompts to download all the required packages, which are then installed in the directory /opt/unimrcp.
In order to install the additional data files for the sample client application umc, the following command can be used.
sudo apt-get install umc-addons
This package is optional and provides additional data to be used for validation of basic setup.
The Transcribe and Polly plugins to the UniMRCP server are licensed software.
In order to obtain a trial license, node information must be retrieved and submitted for license generation.
Use the installed tool unilicnodegen to retrieve the node information.
sudo /opt/unimrcp/bin/unilicnodegen
As a result, a text file uninode.info will be saved in the current directory.
Consider the following procedure to place an order.
- Navigate to https://unispeech.io/transcribe.
- Under the section Obtain License, select
  - License Variation: Trial
  - License Term: 30-day
  - License Type: Node-bound
  - License Quantity: 2
- Click Add to cart.
- Navigate to https://unispeech.io/polly.
- Under the section Obtain License, select
  - License Variation: Trial
  - License Term: 30-day
  - License Type: Node-bound
  - License Quantity: 2
- Click Add to cart.
- Click View cart.
- Click Proceed to checkout and then Place order.
- Attach the retrieved uninode.info file to the placed order.
The orders are normally processed within one business day.
The provided license files need to be placed into the directory /opt/unimrcp/data.
sudo cp umstranscribe_*.lic /opt/unimrcp/data
sudo cp umspolly_*.lic /opt/unimrcp/data
Create a text file aws.credentials in the directory /opt/unimrcp/data.
sudo nano /opt/unimrcp/data/aws.credentials
Place your AWS IAM user credentials, collected in Section 2.2, in the following format.
{
    "aws_access_key_id": "••••••••••••",
    "aws_secret_access_key": "••••••••••••••••••••••••••••••••••••"
}
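A malformed credentials file is an easy mistake to make, so it may help to check the text before restarting the server. The following is a rough sketch. The function name is hypothetical, and the only assumption about the format is what the sample above shows: a JSON object with the two keys aws_access_key_id and aws_secret_access_key.

```python
import json

# Keys taken from the sample credentials file in this tutorial.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def check_credentials_text(text):
    """Return a list of problems found in the credentials JSON text."""
    try:
        data = json.loads(text)
    except ValueError as err:
        return ["not valid JSON: %s" % err]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for key in REQUIRED_KEYS:
        value = data.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append("missing or empty key: %s" % key)
    return problems


if __name__ == "__main__":
    sample = '{"aws_access_key_id": "EXAMPLE", "aws_secret_access_key": "EXAMPLE"}'
    print(check_credentials_text(sample) or "looks well-formed")
```

The check only covers the shape of the file. Whether the keys are accepted by AWS can only be seen in the server logs once a request is made.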
Open the configuration file unimrcpserver.xml, located in the directory /opt/unimrcp/conf.
sudo nano /opt/unimrcp/conf/unimrcpserver.xml
Keep only the required MRCP resources, such as speechsynth and speechrecog, enabled and disable the remaining ones, such as recorder and speakverify. This step is optional.
<resource-factory>
  <resource id="speechsynth" enable="true"/>
  <resource id="speechrecog" enable="true"/>
  <resource id="recorder" enable="false"/>
  <resource id="speakverify" enable="false"/>
</resource-factory>
In order to load the Transcribe and Polly plugins into the UniMRCP server, add the following entries under the XML element <plugin-factory>. Take out or disable the entries for other plugins.
<!-- Factory of plugins (MRCP engines) -->
<plugin-factory>
  <engine id="Demo-Recog-1" name="demorecog" enable="false"/>
  <engine id="Demo-Synth-1" name="demosynth" enable="false"/>
  <engine id="Transcribe-1" name="umstranscribe" enable="true"/>
  <engine id="Polly-1" name="umspolly" enable="true"/>
</plugin-factory>
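The same change can be expressed programmatically, which may be handy when the setup is scripted. The sketch below works on an XML fragment held in a string rather than on the real unimrcpserver.xml, and the function name is hypothetical. It enables the wanted engines, disables all others, and appends any engine that is missing.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; the real file contains more elements.
SAMPLE = """<plugin-factory>
  <engine id="Demo-Recog-1" name="demorecog" enable="true"/>
  <engine id="Demo-Synth-1" name="demosynth" enable="true"/>
  <engine id="Transcribe-1" name="umstranscribe" enable="false"/>
</plugin-factory>"""


def enable_only(factory_xml, wanted):
    """Enable the engines in `wanted` (id -> plugin name), disable the rest."""
    root = ET.fromstring(factory_xml)
    seen = set()
    for engine in root.findall("engine"):
        engine_id = engine.get("id")
        engine.set("enable", "true" if engine_id in wanted else "false")
        seen.add(engine_id)
    # Append wanted engines that were not present in the fragment.
    for engine_id, name in wanted.items():
        if engine_id not in seen:
            ET.SubElement(root, "engine",
                          {"id": engine_id, "name": name, "enable": "true"})
    return root


if __name__ == "__main__":
    wanted = {"Transcribe-1": "umstranscribe", "Polly-1": "umspolly"}
    print(ET.tostring(enable_only(SAMPLE, wanted), encoding="unicode"))
```

Note that ElementTree discards XML comments when parsing, so editing the actual configuration file by hand, as described above, remains the safer route.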
Open the configuration file logger.xml, located in the directory /opt/unimrcp/conf.
sudo nano /opt/unimrcp/conf/logger.xml
In order to enable log statements produced by the plugins, add the following entries under the element <sources>.
<source name="TRANSCRIBE-PLUGIN" priority="INFO" masking="NONE"/>
<source name="POLLY-PLUGIN" priority="INFO" masking="NONE"/>
The configuration files of the plugins umstranscribe.xml and umspolly.xml are located in the directory /opt/unimrcp/conf.
While the default settings are sufficient for use in the scope of this tutorial, refer to the Usage Guide of the plugins for more information.
Start the UniMRCP server as a service.
sudo systemctl restart unimrcp
Open the current log file of the server, located in the directory /opt/unimrcp/log.
cat /opt/unimrcp/log/unimrcpserver_current.log
Check whether the plugins are loaded normally.
[INFO] Load Plugin [Transcribe-1] [/opt/unimrcp/plugin/umstranscribe.so]
[INFO] Load Plugin [Polly-1] [/opt/unimrcp/plugin/umspolly.so]
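Instead of reading the whole log, the relevant lines can be picked out with a short script. This is a sketch under one assumption: that the log lines have the "Load Plugin [id] [path]" shape shown in the excerpt above. The function names are hypothetical, and the engine ids are the ones configured under <plugin-factory>.

```python
import re

# Assumed line shape: "... Load Plugin [engine-id] [path-to-plugin]"
LOAD_PLUGIN = re.compile(r"Load Plugin \[([^\]]+)\] \[([^\]]+)\]")


def loaded_plugins(log_text):
    """Map engine id -> plugin path for every 'Load Plugin' line."""
    return {m.group(1): m.group(2) for m in LOAD_PLUGIN.finditer(log_text)}


def missing_plugins(log_text, expected_ids):
    """Return the expected engine ids that never appear as loaded."""
    loaded = loaded_plugins(log_text)
    return [engine_id for engine_id in expected_ids if engine_id not in loaded]


if __name__ == "__main__":
    # Illustrative log text; in practice read unimrcpserver_current.log.
    sample_log = (
        "[INFO] Load Plugin [Transcribe-1] [/opt/unimrcp/plugin/umstranscribe.so]\n"
        "[INFO] Load Plugin [Polly-1] [/opt/unimrcp/plugin/umspolly.so]\n"
    )
    print(missing_plugins(sample_log, ["Transcribe-1", "Polly-1"]))
```

An empty list means that every expected engine was reported as loaded.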
Next, check for the license information.
[NOTICE] UniMRCP Transcribe License
-product name: umstranscribe
-product version: 1.0.0
-license owner: -
-license type: trial
-issue date: 2021-05-19
-exp date: 2021-05-19
-channel count: 2
-feature set: 0
[NOTICE] UniMRCP Polly License
-product name: umspolly
-product version: 1.0.0
-license owner: -
-license type: trial
-issue date: 2021-05-19
-exp date: 2021-05-19
-channel count: 2
-feature set: 0
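Since trial licenses expire, it can be useful to extract the expiration date from the license block. The sketch below assumes the "-name: value" layout shown above and an ISO-formatted date. The function names and the sample dates are made up for illustration.

```python
import re
from datetime import date

# Assumed field shape: "-field name: value"
FIELD = re.compile(r"^\s*-([a-z ]+):\s*(.*)$")


def parse_license_block(lines):
    """Collect the '-name: value' fields that follow a license NOTICE line."""
    fields = {}
    for line in lines:
        match = FIELD.match(line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def days_left(fields, today):
    """Number of days between `today` and the license expiration date."""
    expires = date.fromisoformat(fields["exp date"])
    return (expires - today).days


if __name__ == "__main__":
    # Hypothetical values, not taken from a real license.
    block = ["-product name: umspolly", "-license type: trial",
             "-exp date: 2021-06-18", "-channel count: 2"]
    print(days_left(parse_license_block(block), date(2021, 5, 19)))
```

A zero or negative result indicates that the license has reached or passed its expiration date.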
The optional package umc-addons must be installed for this test to work.
Launch the sample UniMRCP client application umc.
cd /opt/unimrcp/bin
sudo ./umc
Run a typical speech recognition scenario by issuing the command run tsr1 from the console of the umc client application.
run tsr1
Run a typical speech synthesis scenario by issuing the command run bss1 from the console of the umc client application.
run bss1
Visually inspect the logs for any possible warnings or errors.
This section provides instructions for installation and configuration of Asterisk and a sample Python application.
In order to install the Asterisk packages, use the following command.
sudo apt-get install asterisk
In order to install the app-unimrcp module for Asterisk, use the following command.
sudo apt-get install asterisk-app-unimrcp
In order to install the Python module for Asterisk, required for execution of sample AGI speech applications, use the following commands.
sudo apt-get install python-asterisk pip
sudo pip install pyst2
Open the configuration file of the app-unimrcp module mrcp.conf, located in the directory /etc/asterisk.
sudo nano /etc/asterisk/mrcp.conf
For the default profile ums2, set the private IP address of the instance. Since both Asterisk and the UniMRCP server are installed on the same instance, the parameters server-ip, client-ip and rtp-ip must all be set to that IP address.
;
; General settings
;
[general]
; Default ASR and TTS profiles.
default-asr-profile = ums2
default-tts-profile = ums2
; UniMRCP logging level to appear in Asterisk logs. Options are:
; EMERGENCY|ALERT|CRITICAL|ERROR|WARNING|NOTICE|INFO|DEBUG -->
log-level = DEBUG
max-connection-count = 100
max-shared-count = 100
offer-new-connection = 1
; rx-buffer-size = 1024
; tx-buffer-size = 1024
; request-timeout = 5000
; speech-channel-timeout = 30000
;
; Profile for UniMRCP Server [MRCPv2]
;
[ums2]
; MRCP settings
version = 2
;
; SIP settings
server-ip = 172.31.22.59
server-port = 8060
; SIP user agent
client-ip = 172.31.22.59
client-port = 25097
sip-transport = udp
;
; RTP factory
rtp-ip = 172.31.22.59
rtp-port-min = 28000
rtp-port-max = 29000
;
; Jitter buffer settings
playout-delay = 50
max-playout-delay = 200
; RTP settings
ptime = 20
codecs = PCMU PCMA L16/96/8000 telephone-event/101/8000
; RTCP settings
rtcp = 0
Replace 172.31.22.59 with the private IP address of your instance collected in Section 3.1.
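Because three parameters must carry the same address, a quick consistency check can catch a missed replacement. The following sketch parses an abbreviated copy of the profile with Python's configparser. It assumes mrcp.conf is close enough to INI syntax for that parser, which holds for the settings shown in this tutorial. The function names are hypothetical.

```python
import configparser

# Abbreviated copy of the profile shown above.
SAMPLE = """
[general]
default-asr-profile = ums2
default-tts-profile = ums2

[ums2]
version = 2
server-ip = 172.31.22.59
client-ip = 172.31.22.59
rtp-ip = 172.31.22.59
"""


def profile_addresses(conf_text, profile="ums2"):
    """Return the three address parameters of the given profile."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    section = parser[profile]
    return {key: section[key] for key in ("server-ip", "client-ip", "rtp-ip")}


def single_host_ok(conf_text, private_ip, profile="ums2"):
    """True if all three parameters equal the private IP of the instance."""
    addresses = profile_addresses(conf_text, profile)
    return all(value == private_ip for value in addresses.values())


if __name__ == "__main__":
    print(single_host_ok(SAMPLE, "172.31.22.59"))
```

In a single-VM setup, a False result usually means that one of the three lines still holds the sample address.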
Deploy the following sample speech transcription AGI application agi_transcription.py into the directory /usr/share/asterisk/agi-bin.
sudo nano /usr/share/asterisk/agi-bin/agi_transcription.py
#!/usr/bin/python3

"""
    Asterisk AGI Speech Transcription Application

    This script performs speech transcription in a loop by playing the transcribed text back to the caller.

    * Revision: 1
    * Date: Apr 28, 2021
    * Vendor: Universal Speech Solutions LLC
"""

import sys
from asterisk.agi import *


class SpeechTranscriptionApp:

    """A class representing speech transcription application"""

    def __init__(self, options):
        """Constructor"""
        self.options = options
        self.prompt = "Welcome to speech transcription application. Please speak."
        self.instance = None
        self.status = None
        self.cause = None

    def transcribe_speech(self):
        """Performs a streaming speech transcription"""
        self.grammars = "%s,%s" % (self.compose_speech_grammar(), self.compose_dtmf_grammar())
        self.synth_and_recog()

    def synth_and_recog(self):
        """This is an internal function which calls SynthAndRecog"""
        if not self.prompt:
            self.prompt = ' '
        # Prompt and grammars are passed as escaped, quoted arguments
        args = "\\\"%s\\\",\\\"%s\\\",%s" % (
            self.prompt, self.grammars, self.options)
        # Reset the result variables before each recognition attempt
        agi.set_variable('RECOG_STATUS', '')
        agi.set_variable('RECOG_COMPLETION_CAUSE', '')
        self.action = None
        agi.appexec('SynthandRecog', args)
        self.status = agi.get_variable('RECOG_STATUS')
        agi.verbose('got status %s' % self.status)
        if self.status == 'OK':
            self.cause = agi.get_variable('RECOG_COMPLETION_CAUSE')
            agi.verbose('got completion cause %s' % self.cause)
        else:
            agi.verbose('recognition completed abnormally')

    def compose_speech_grammar(self):
        """Composes a built-in speech grammar"""
        grammar = 'builtin:speech/transcribe'
        return grammar

    def compose_dtmf_grammar(self):
        """Composes a built-in DTMF grammar"""
        grammar = 'builtin:dtmf/digits'
        return grammar

    def append_grammar_parameter(self, grammar, name, value, separator):
        """Appends a name/value parameter to the specified grammar"""
        grammar += "%s%s=%s" % (separator, name, value)
        return grammar

    def get_prompt(self):
        """Composes the next prompt"""
        prompt = 'You said ' + self.instance + '. Please speak.'
        agi.verbose('got prompt %s' % prompt)
        return prompt

    def check_dialog_completion(self):
        """Checks whether the dialog is complete"""
        complete = False
        if self.instance == 'Exit' or self.instance == 'Quit' or self.instance == 'exit' or self.instance == 'quit' or self.instance == '0':
            complete = True
        return complete

    def run(self):
        """Interacts with the caller in a loop until the dialog is complete"""
        processing = True
        while processing:
            self.transcribe_speech()
            processing = True
            if self.status == 'OK':
                if self.cause == '000':
                    # Successful recognition: echo the result back
                    self.instance = agi.get_variable('RECOG_INSTANCE(0/0)')
                    agi.verbose("got instance %s" % self.instance)
                    self.prompt = self.get_prompt()
                    if self.check_dialog_completion():
                        self.prompt = 'Thank you. See you next time.'
                        processing = False
                elif self.cause != '001' and self.cause != '002':
                    processing = False
            elif self.cause != '001' and self.cause != '002':
                processing = False

        # Play the final prompt before exiting
        agi.appexec('MRCPSynth', "\\\"%s\\\"" % self.prompt)


agi = AGI()
options = 'plt=1&b=1&sct=1500&sint=15000&nit=10000'
transcribe_app = SpeechTranscriptionApp(options)

transcribe_app.run()
agi.verbose('exiting')
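The control flow of the run loop can be isolated into a pure function that is testable without Asterisk. The sketch below is a restatement for illustration, not part of the deployed script. The function name and return labels are hypothetical, and the completion causes are read as in MRCPv2: 000 for success, 001 for no-match and 002 for no-input timeout. Unlike the script, it compares the exit words case-insensitively.

```python
EXIT_WORDS = {"exit", "quit", "0"}


def next_step(status, cause, instance):
    """Return 'continue', 'goodbye' or 'stop' for one recognition result."""
    if status == "OK" and cause == "000":
        # Successful recognition: say goodbye on an exit word.
        if instance is not None and instance.lower() in EXIT_WORDS:
            return "goodbye"
        return "continue"
    if cause in ("001", "002"):
        # No-match or no-input timeout: prompt the caller again.
        return "continue"
    # Any other cause, or a failed request, ends the dialog.
    return "stop"


if __name__ == "__main__":
    for result in [("OK", "000", "hello"), ("OK", "002", None),
                   ("OK", "000", "quit"), ("ERROR", None, None)]:
        print(result, "->", next_step(*result))
```

Separating the decision from the AGI calls makes it possible to check the dialog behavior with plain unit tests before placing a call.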
Change the ownership of the agi-bin directory to asterisk and make the script executable.
sudo chown -R asterisk:asterisk /usr/share/asterisk/agi-bin
sudo chmod +x /usr/share/asterisk/agi-bin/agi_transcription.py
Open the configuration file extensions.conf, located in the directory /etc/asterisk.
sudo nano /etc/asterisk/extensions.conf
Add an extension 701 under the demo context.
exten => 701,1,Answer()
exten => 701,2,agi(agi_transcription.py)
Open the configuration file sip.conf, located in the directory /etc/asterisk.
sudo nano /etc/asterisk/sip.conf
Since Asterisk is located in a private network, the NAT settings must be configured accordingly in the general section.
localnet=172.31.22.59/255.255.255.0
externaddr=18.237.90.248
nat=yes
Replace 172.31.22.59 with the private IP address and 18.237.90.248 with the public IP address of your instance collected in Section 3.1.
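Swapping the private and public addresses is a common slip in these settings. The sketch below uses Python's ipaddress module to check that the private address falls inside localnet and that externaddr is not a private address. The function name is hypothetical, and the values are the sample ones from this tutorial.

```python
import ipaddress


def check_nat_settings(localnet, externaddr, private_ip):
    """Return a list of problems found in the NAT-related settings."""
    problems = []
    # strict=False accepts a host address with a netmask, as in the sample.
    network = ipaddress.ip_network(localnet, strict=False)
    if ipaddress.ip_address(private_ip) not in network:
        problems.append("private IP %s is outside localnet %s" % (private_ip, network))
    if ipaddress.ip_address(externaddr).is_private:
        problems.append("externaddr %s is a private address" % externaddr)
    return problems


if __name__ == "__main__":
    print(check_nat_settings("172.31.22.59/255.255.255.0",
                             "18.237.90.248", "172.31.22.59"))
```

An empty list means that the two addresses are at least plausible for their roles. It does not prove that they belong to the instance.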
Add a SIP profile by giving it a name and setting a password, which must be used when registering a SIP phone with Asterisk.
[astums]
type=friend
context=default
secret=******
host=dynamic
disallow=all
allow=ulaw
Open the configuration file rtp.conf, located in the directory /etc/asterisk.
sudo nano /etc/asterisk/rtp.conf
Narrow the default RTP port range to match the inbound rule specified in Section 3.1. Set the intended rtpstart and rtpend port numbers.
;
; RTP Configuration
;
[general]
;
; RTP start and RTP end configure start and end addresses
;
; Defaults are rtpstart=5000 and rtpend=31000
;
rtpstart=10000
rtpend=10010
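The chosen range has to agree with the inbound rule of the security group. The sketch below checks this and also flags an overlap with the RTP range set in mrcp.conf (28000-29000 in this tutorial). Keeping the two ranges apart is an assumption made here because both are used on the same host, not a documented requirement. The function names are hypothetical.

```python
def ranges_overlap(first, second):
    """True if two inclusive (start, end) port ranges share a port."""
    return first[0] <= second[1] and second[0] <= first[1]


def check_rtp_range(rtpstart, rtpend, inbound_rule, other_ranges=None):
    """Return a list of problems with the rtp.conf port range.

    inbound_rule is the (start, end) range opened in the security group;
    other_ranges maps a label to another (start, end) range on the host.
    """
    problems = []
    if rtpstart > rtpend:
        problems.append("rtpstart is greater than rtpend")
    if rtpstart < inbound_rule[0] or rtpend > inbound_rule[1]:
        problems.append("range is not covered by the inbound rule")
    for name, port_range in (other_ranges or {}).items():
        if ranges_overlap((rtpstart, rtpend), port_range):
            problems.append("range overlaps %s" % name)
    return problems


if __name__ == "__main__":
    print(check_rtp_range(10000, 10010, (10000, 10010),
                          {"mrcp.conf": (28000, 29000)}))
```

With the values used in this tutorial, the function returns an empty list.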
Start Asterisk as a service.
sudo systemctl restart asterisk
Register a SIP phone to the configured profile astums on Asterisk and dial the extension 701 associated with the speech transcription application.
This section provides additional references to the tutorial.