UMS Nuance BS - Usage Manual | UniMRCP Documentation

Name	Unit	Description
license-file	File path	Specifies the license file. File name may include patterns containing '*' sign. If multiple files match the pattern, the most recent one gets used.
subscription-key-file	File path	Specifies the Nuance Mix credentials to use. File name may include patterns containing '*' sign. If multiple files match the pattern, the most recent one gets used.
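
The wildcard rule described for license-file and subscription-key-file can be illustrated with a short Python sketch. It assumes that "most recent" means the latest modification time, which the table does not state explicitly; the function name and the file names in the usage note are hypothetical and are not part of the plugin.

```python
import glob
import os
from typing import Optional


def resolve_latest(pattern: str) -> Optional[str]:
    """Return the most recently modified file matching `pattern`, or None."""
    # Expand the '*' pattern and keep regular files only.
    matches = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    if not matches:
        return None
    # Assumption: "most recent" is judged by modification time.
    return max(matches, key=os.path.getmtime)
```

For example, `resolve_latest("umsnuancebs_*.lic")` would return whichever matching file was modified last, or `None` if nothing matches.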

Name	Unit	Description
streaming-recognition	String	Specifies parameters of streaming recognition method employed via gRPC.
results	String	Specifies parameters of recognition results set in RECOGNITION-COMPLETE events.
speech-dtmf-input-detector	String	Specifies parameters of the speech and DTMF input detector.
utterance-manager	String	Specifies parameters of the utterance manager.
rdr-manager	String	Specifies parameters of the Recognition Details Record (RDR) manager.
monitoring-agent	String	Specifies parameters of the monitoring agent.
license-server	String	Specifies parameters used to connect to the license server. The use of the license server is optional.

Name	Unit	Description
language	String	Specifies the default language to use, if not set by the client.
start-of-input	String	Specifies the source of start of input event sent to the client (use "service-originated" for an event originated based on a first-received interim result and "internal" for an event determined by plugin).
max-alternatives	Integer	Specifies the maximum number of speech recognition result alternatives to be returned. Can be overridden by client by means of the header field N-Best-List-Length.
alternatives-below-threshold	Boolean	Specifies whether to return speech recognition result alternatives with the confidence score below the confidence threshold.
single-utterance	Boolean	Specifies whether to detect a single spoken utterance or perform continuous recognition.
skip-unsupported-grammars	Boolean	Specifies whether to skip or raise an error while referencing a malformed or not supported grammar.
skip-empty-results	Boolean	Specifies whether to implicitly initiate a new gRPC request if the current one completes with an empty result.
transcription-grammar	String	Specifies the name of the built-in speech transcription grammar. The grammar can be referenced as builtin:speech/transcribe or builtin:grammar/transcribe, where transcribe is the default value of this parameter.
grammar-param-separator	String	Specifies a separator of optional parameters passed to a built-in grammar. The separator defaults to ';'.
http-proxy	String	Specifies the URI of HTTP proxy, if used.
stream-creation-timeout	Time interval [msec]	Specifies how long to wait for gRPC stream creation. If the timeout is set to 0, no timer is used. Otherwise, gRPC stream creation is cancelled once the timeout elapses.
grpc-log-redirection	Boolean	Specifies whether to enable gRPC log redirection.
grpc-log-verbosity	String	Specifies gRPC logging verbosity. One of DEBUG, INFO, ERROR. See GRPC_VERBOSITY for more info.
grpc-log-trace	String	Specifies a comma separated list of tracers producing gRPC logs. Use 'all' to turn all tracers on. See GRPC_TRACE for more info.
inter-result-timeout	Time interval [msec]	Specifies a timeout between interim results containing transcribed speech. If the timeout elapses, input is considered complete. The timeout defaults to 0 (disabled).
auth-validation-period	Time interval [sec]	Specifies a period in seconds used to re-validate access token.
auth-request-timeout	Time interval [sec]	Specifies a timeout in seconds set on HTTP requests placed to re-validate access token.
selector-channel	String	Specifies the selector channel used in the Start request. Can be overridden by client.
selector-library	String	Specifies the selector library used in the Start request. Can be overridden by client.
session-timeout	Time interval [sec]	Specifies the session timeout used in the Start request. Can be overridden by client.
model-uri	String	Specifies the model URI used in the Start request. Can be overridden by client.
model-type	String	Specifies the model type used in the Start request. Can be overridden by client.
speech-domain	String	Specifies the speech_domain field in RecognitionParameters. Can be overridden by client.
formatting-scheme	String	Specifies the formatting scheme field in RecognitionParameters. Can be overridden by client.
auto-punctuate	Boolean	Specifies whether to enable automatic punctuation. Can be overridden by client.
filter-profanity	Boolean	Specifies whether to filter profanities in recognition result. Can be overridden by client.
include-tokenization	Boolean	Specifies whether to include tokenized recognition result. Can be overridden by client.
discard-speaker-adaptation	Boolean	Specifies whether to discard updated speaker data, if speaker profiles are used. By default, data is stored. Can be overridden by client.
suppress-call-recording	Boolean	Specifies whether to redact transcription results in the call logs and disable audio capture. Can be overridden by client.
mask-load-failures	Boolean	Specifies whether recognition continues, rather than terminates, when loading an external resource fails. Can be overridden by client.
suppress-initial-capitalization	Boolean	Specifies whether to suppress automatic capitalization of the first word in a sentence. Can be overridden by client.
allow-zero-base-lm-weight	Boolean	Specifies whether custom resources (DLMs, wordsets, and others) can use the entire weight space. Can be overridden by client.
filter-wakeup-word	Boolean	Specifies whether to remove the wakeup word from the final result. Can be overridden by client.
end-stream-no-valid-hypotheses	Boolean	Determines whether the dialog application or the client application handles the dialog flow when ASRaaS does not return a valid hypothesis. Can be overridden by client.
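
To make the roles of transcription-grammar and grammar-param-separator more concrete, the following Python sketch splits a built-in grammar reference into its name and optional parameters. The `?` between the grammar name and the parameters, the `name=value` form of each parameter, and the function name are assumptions made for illustration; only the `builtin:speech/` and `builtin:grammar/` forms and the default ';' separator come from the table above.

```python
from typing import Dict, Tuple


def parse_builtin_grammar(uri: str, separator: str = ";") -> Tuple[str, Dict[str, str]]:
    """Split 'builtin:speech/NAME?k=v;k=v' into (NAME, {k: v})."""
    if not uri.startswith("builtin:"):
        raise ValueError("not a built-in grammar reference: " + uri)
    # Assumption: optional parameters follow a '?' after the grammar name.
    path, _, query = uri[len("builtin:"):].partition("?")
    kind, _, name = path.partition("/")
    # Only the two forms named in the table are accepted in this sketch.
    if kind not in ("speech", "grammar") or not name:
        raise ValueError("unexpected built-in grammar path: " + path)
    params: Dict[str, str] = {}
    for item in query.split(separator):
        if not item:
            continue
        key, _, value = item.partition("=")
        params[key.strip()] = value.strip()
    return name, params
```

Passing a different `separator` mirrors what changing grammar-param-separator in the configuration would do to the expected reference syntax.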

Name	Unit	Description
format	String	Specifies the format of results to be returned to the client (use "standard" for NLSML and "json" for JSON).
indent	Integer	Specifies the indent to use while composing the results.
replace-dots	Boolean	Specifies whether to replace '.' with '_' in the parameter names, used while composing an XML content. The parameter is observed only if the format is set to standard.
replace-dashes	Boolean	Specifies whether to replace '-' with '_' in the parameter names, used while composing an XML content. The parameter is observed only if the format is set to standard.
confidence-format	String	Specifies the format of the confidence score to be returned. The parameter is observed only if the format is set to standard. Use one of: auto for a format based on protocol version; mrcpv2 for a float value in the range of 0..1; mrcpv1 for an integer value in the range of 0..100.
tag-format	String	Specifies the format of the instance element to be returned. The parameter is observed only if the format is set to standard. Use one of: semantics/xml for a query result represented in XML [default]; semantics/json for a query result represented in JSON; swi-semantics/xml for a query result set in an inner `<SWI_meaning>` element represented in XML; swi-semantics/json for a query result set in an inner `<SWI_meaning>` element represented in JSON.
tag-encoding	String	Specifies the encoding of the instance element to be returned. The parameter is observed only if the format is set to standard and tag-format is semantics/json. Use one of: none for a plain string; escaped for a string with escaped special characters such as <, >, ", ', & [default].
event-input-text	String	Specifies the input text to be filled in NLSML on a triggered activity. The parameter defaults to 'null', if not specified.
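
The effect of confidence-format, replace-dots and replace-dashes can be sketched in a few lines of Python. This is an illustration under assumptions rather than the plugin's implementation: the two-decimal rendering of the float value, the rounding of the integer value, and both function names are hypothetical.

```python
def format_confidence(score: float, confidence_format: str = "auto",
                      mrcp_version: int = 2) -> str:
    """Render a 0..1 confidence score the way confidence-format describes."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in the range 0..1")
    if confidence_format == "auto":
        # 'auto' picks the representation from the protocol version.
        confidence_format = "mrcpv1" if mrcp_version == 1 else "mrcpv2"
    if confidence_format == "mrcpv2":
        return f"{score:.2f}"            # float in 0..1 (two decimals assumed)
    if confidence_format == "mrcpv1":
        return str(round(score * 100))   # integer in 0..100 (rounding assumed)
    raise ValueError("unknown confidence format: " + confidence_format)


def sanitize_name(name: str, replace_dots: bool = True,
                  replace_dashes: bool = True) -> str:
    """Replace '.' and/or '-' with '_' in a parameter name."""
    if replace_dots:
        name = name.replace(".", "_")
    if replace_dashes:
        name = name.replace("-", "_")
    return name
```

With these definitions, a score of 0.5 is rendered as `0.50` for mrcpv2 and as `50` for mrcpv1, and a name such as `client-data.id` becomes `client_data_id` when both replacements are enabled.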

Name	Unit	Description
vad-mode	Integer	Specifies an operating mode of VAD in the range of [0 ... 3]. Default is 1.
speech-start-timeout	Time interval [msec]	Specifies how long to wait in transition mode before triggering a start of speech input event.
speech-complete-timeout	Time interval [msec]	Specifies how long to wait in transition mode before triggering an end of speech input event. The complete timeout is used when there is an interim result available.
speech-incomplete-timeout	Time interval [msec]	Specifies how long to wait in transition mode before triggering an end of speech input event. The incomplete timeout is used as long as there is no interim result available. Afterwards, the complete timeout is used.
noinput-timeout	Time interval [msec]	Specifies how long to wait before triggering a no-input event.
input-timeout	Time interval [msec]	Specifies how long to wait for input to complete.
dtmf-interdigit-timeout	Time interval [msec]	Specifies a DTMF inter-digit timeout.
dtmf-term-timeout	Time interval [msec]	Specifies a DTMF input termination timeout.
dtmf-term-char	Character	Specifies a DTMF input termination character.
speech-leading-silence	Time interval [msec]	Specifies desired silence interval preceding spoken input.
speech-trailing-silence	Time interval [msec]	Specifies desired silence interval following spoken input.
speech-output-period	Time interval [msec]	Specifies an interval used to send speech frames to the recognizer.
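
The relationship between speech-incomplete-timeout and speech-complete-timeout can be expressed as a small selection rule. The Python sketch below only restates the table: the class and function names are hypothetical, and the timeout values in the usage note are arbitrary examples, not defaults.

```python
from dataclasses import dataclass


@dataclass
class DetectorTimeouts:
    """Hypothetical holder for two detector settings, in milliseconds."""
    speech_complete_timeout_ms: int
    speech_incomplete_timeout_ms: int


def end_of_speech_timeout(timeouts: DetectorTimeouts,
                          interim_result_available: bool) -> int:
    """Pick the timeout that applies before an end-of-speech-input event."""
    # The incomplete timeout applies until the first interim result arrives;
    # afterwards, the complete timeout is used.
    if interim_result_available:
        return timeouts.speech_complete_timeout_ms
    return timeouts.speech_incomplete_timeout_ms
```

For instance, with `DetectorTimeouts(1000, 3000)` the detector would wait 3000 msec while no interim result has been received and 1000 msec once one is available.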

Name	Unit	Description
save-waveforms	Boolean	Specifies whether to save waveforms or not.
purge-existing	Boolean	Specifies whether to delete existing records on start-up.
max-file-age	Time interval [min]	Specifies a time interval in minutes after expiration of which a waveform is deleted. Set 0 for infinite.
max-file-count	Integer	Specifies the max number of waveforms to store. If reached, the oldest waveform is deleted. Set 0 for infinite.
waveform-base-uri	String	Specifies the base URI used to compose an absolute waveform URI.
waveform-folder	Dir path	Specifies a folder the waveforms should be stored in.
file-prefix	String	Specifies a prefix used to compose the name of the file to be stored. Defaults to 'umsnuancebs-', if not specified.
use-logging-tag	Boolean	Specifies whether to use the MRCP header field Logging-Tag, if present, to compose the name of the file to be stored.
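
One way to read max-file-age and max-file-count together is as a retention policy. The Python sketch below computes which stored files would be removed; it assumes that age is measured from a file timestamp in epoch seconds and that max-file-count acts as a cap on the number of files kept. The function name and the file names in the usage note are hypothetical.

```python
from typing import List, Tuple


def files_to_purge(files: List[Tuple[str, float]], now: float,
                   max_file_age_min: int = 0,
                   max_file_count: int = 0) -> List[str]:
    """Return names of files to delete; `files` holds (name, timestamp) pairs."""
    ordered = sorted(files, key=lambda f: f[1])  # oldest first
    purge: List[str] = []
    keep: List[Tuple[str, float]] = []
    for name, ts in ordered:
        # A value of 0 means "infinite", so the age check is skipped.
        if max_file_age_min > 0 and now - ts > max_file_age_min * 60:
            purge.append(name)
        else:
            keep.append((name, ts))
    # A value of 0 means "infinite", so the count check is skipped.
    if max_file_count > 0 and len(keep) > max_file_count:
        excess = len(keep) - max_file_count
        purge.extend(name for name, _ in keep[:excess])
    return purge
```

Given three files stamped at 0, 600 and 1200 seconds and a current time of 1300, a max-file-age of 10 minutes would remove the first two, while a max-file-count of 2 would remove only the oldest.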

Name	Unit	Description
save-records	Boolean	Specifies whether to save recognition details records or not.
purge-existing	Boolean	Specifies whether to delete existing records on start-up.
max-file-age	Time interval [min]	Specifies a time interval in minutes after expiration of which a record is deleted. Set 0 for infinite.
max-file-count	Integer	Specifies the max number of records to store. If reached, the oldest record is deleted. Set 0 for infinite.
record-folder	Dir path	Specifies a folder to store recognition details records in. Defaults to ${UniMRCPInstallDir}/var.
file-prefix	String	Specifies a prefix used to compose the name of the file to be stored. Defaults to 'umsnuancebs-', if not specified.
use-logging-tag	Boolean	Specifies whether to use the MRCP header field Logging-Tag, if present, to compose the name of the file to be stored.

Name	Unit	Description
refresh-period	Time interval [sec]	Specifies a time interval in seconds used to periodically refresh usage details. See `<usage-refresh-handler>`.

Name	Unit	Description
enable	Boolean	Specifies whether the use of license server is enabled or not. If enabled, the license-file attribute is not honored.
server-address	String	Specifies the IP address or host name of the license server.
certificate-file	File path	Specifies the client certificate used to connect to the license server. File name may include patterns containing a '*' sign. If multiple files match the pattern, the most recent one gets used.
ca-file	File path	Specifies the certificate authority used to validate the license server.
channel-count	Integer	Specifies the number of channels to check out from the license server. If not specified or set to 0, either all available channels or a pool of channels will be checked out, depending on the configuration of the license server.
http-proxy-address	String	Specifies the IP address or host name of the HTTP proxy server, if used.
http-proxy-port	Integer	Specifies the port number of the HTTP proxy server, if used.
security-level	Integer	Specifies the SSL security level, which defaults to 1. Applicable since OpenSSL 1.1.0.

Sensitivity-Level	Vad-Mode
[0.00 ... 0.25)	0
[0.25 ... 0.50)	1
[0.50 ... 0.75)	2
[0.75 ... 1.00]	3
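
The mapping in the table above is a simple bucketing of the range 0..1 into four VAD modes, with the last interval closed at 1.00. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and the code illustrates the table rather than the plugin's actual code.

```python
def vad_mode_from_sensitivity(level: float) -> int:
    """Map a Sensitivity-Level in 0..1 to a Vad-Mode in 0..3."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("sensitivity level must be in the range 0..1")
    # Four equal buckets of width 0.25; the value 1.0 falls into mode 3.
    return min(int(level * 4), 3)
```

The `min(..., 3)` clamp is what makes the upper bound inclusive: without it, a level of exactly 1.0 would yield 4.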

Name	Unit	Description
start-of-input	String	Specifies the source of start of input event sent to the client (use "service-originated" for an event originated based on a first-received interim result and "internal" for an event determined by plugin).
alternatives-below-threshold	Boolean	Specifies whether to return speech recognition result alternatives with the confidence score below the confidence threshold.
speech-start-timeout	Time interval [msec]	Specifies how long to wait in transition mode before triggering a start of speech input event.
skip-empty-results	Boolean	Specifies whether to implicitly initiate a new gRPC request if the current one completes with an empty result.
interim-result-timeout	Time interval [msec]	Specifies a timeout between interim results containing transcribed speech. If the timeout elapses, input is considered complete.
logging-tag	String	Specifies the logging tag.
tag-format	String	Specifies the format of the instance element to be returned.
service-uri	String	Specifies the service endpoint URI.
topic	String	Specifies the topic field in RecognitionParameters.
speech-domain	String	Specifies the speech_domain field in RecognitionParameters.
formatting-scheme	String	Specifies the formatting scheme field in RecognitionParameters.
auto-punctuate	Boolean	Specifies whether to enable automatic punctuation.
filter-profanity	Boolean	Specifies whether to filter profanities in recognition result.
include-tokenization	Boolean	Specifies whether to include tokenized recognition result.
discard-speaker-adaptation	Boolean	Specifies whether to discard updated speaker data, if speaker profiles are used. By default, data is stored.
suppress-call-recording	Boolean	Specifies whether to redact transcription results in the call logs and disable audio capture.
mask-load-failures	Boolean	Specifies whether recognition continues, rather than terminates, when loading an external resource fails.
suppress-initial-capitalization	Boolean	Specifies whether to suppress automatic capitalization of the first word in a sentence.
allow-zero-base-lm-weight	Boolean	Specifies whether custom resources (DLMs, wordsets, and others) can use the entire weight space.
filter-wakeup-word	Boolean	Specifies whether to remove the wakeup word from the final result.
end-stream-no-valid-hypotheses	Boolean	Determines whether the dialog application or the client application handles the dialog flow when ASRaaS does not return a valid hypothesis. Can be overridden by client.
selector-channel	String	Specifies the selector channel used in the Start request. Can be overridden by client.
selector-library	String	Specifies the selector library used in the Start request. Can be overridden by client.
session-timeout	Time interval [sec]	Specifies the session timeout used in the Start request. Can be overridden by client.
model-uri	String	Specifies the model URI used in the Start request. Can be overridden by client.
model-type	String	Specifies the model type used in the Start request. Can be overridden by client.
recognition-resources-json	String	Specifies the resources field set in RecognitionInitMessage. The value of this parameter must transparently be specified in JSON.
start-request-payload-json	String	Specifies the payload of the Start request. The value of this parameter must transparently be specified in JSON.
execute-request-payload-json	String	Specifies the payload of the Execute request. The value of this parameter must transparently be specified in JSON.
client-data.*	String	Specifies transparent name/value parameters set in the client_data field in RecognitionInitMessage. The name must start with a prefix "client-data.".
user-id	String	Specifies the user_id field set in RecognitionInitMessage.
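
The client-data.* convention and the *-json parameters lend themselves to a short illustration. The Python sketch below separates prefixed name/value pairs from the other parameters and checks that a JSON-valued parameter parses. Whether the prefix is stripped before the pair is placed in client_data is an assumption, and the function names and sample keys are hypothetical.

```python
import json
from typing import Any, Dict, Tuple

CLIENT_DATA_PREFIX = "client-data."


def split_client_data(params: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (client_data, remaining_params) from name/value parameters."""
    client_data: Dict[str, str] = {}
    remaining: Dict[str, str] = {}
    for name, value in params.items():
        if name.startswith(CLIENT_DATA_PREFIX) and len(name) > len(CLIENT_DATA_PREFIX):
            # Assumption: the prefix is dropped from the client_data key.
            client_data[name[len(CLIENT_DATA_PREFIX):]] = value
        else:
            remaining[name] = value
    return client_data, remaining


def parse_json_param(value: str) -> Any:
    """Parse a *-json parameter value; raises ValueError on malformed JSON."""
    return json.loads(value)
```

Validating a JSON-valued parameter on the client side before sending it avoids passing a malformed payload through to the service.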

¶ 1 Overview

¶ 1.1 Installation

¶ 1.2 Applicable Versions

¶ 2 Supported Features

¶ 2.1 MRCP Methods

¶ 2.2 MRCP Events

¶ 2.3 MRCP Header Fields

¶ 2.4 Grammars

¶ 2.5 Results

¶ 3 Configuration Format

¶ 3.1 Document

¶ 3.2 Streaming Recognition

¶ 3.3 Results

¶ 3.4 Speech and DTMF Input Detector

¶ 3.5 Utterance Manager

¶ 3.6 RDR Manager

¶ 3.7 Monitoring Agent

¶ 3.8 Usage Change Handler

¶ 3.9 Usage Refresh Handler

¶ 3.10 License Server

¶ 4 Configuration Steps

¶ 4.1 Using Default Configuration

¶ 4.2 Starting Conversation

¶ 4.3 Stopping Conversation

¶ 4.4 Kicking off Conversation

¶ 4.5 Interacting with User

¶ 4.6 Specifying Recognition Language

¶ 4.7 Specifying Sampling Rate

¶ 4.8 Specifying Speech Input Parameters

¶ 4.9 Specifying DTMF Input Parameters

¶ 4.10 Specifying No-Input and Recognition Timeouts

¶ 4.11 Specifying Speech Recognition Mode

¶ 4.12 Specifying Vendor Specific Parameters

¶ 4.13 Specifying Recognition Resources

¶ External Reference

¶ Inline Wordset

¶ Builtin

¶ Wakeup Word

¶ 4.14 Specifying Client Data

¶ 4.15 Specifying User ID

¶ 4.16 Maintaining Utterances

¶ 4.17 Maintaining Recognition Details Records

¶ 5 Recognition Grammars and Results

¶ 5.1 Using Built-in Speech Contexts

¶ 5.2 Using Built-in Speech Transcription

¶ 5.3 Using Built-in DTMF Grammars

¶ 6 Monitoring Usage Details

¶ 6.1 Log Usage

¶ 6.2 Update Usage

¶ 6.3 Dump Channels

¶ 7 Usage Examples

¶ 7.1 Conversation Flow

¶ 8 Sequence Diagrams

¶ 8.1 MRCPv1

¶ 8.2 MRCPv2

¶ 9 Security Considerations

¶ 9.1 Network Connection

¶ 10 References

¶ 10.1 Nuance

¶ 10.2 Specifications