Mar 28, 2022
Google Speech Recognition (GSR) Plugin 1.23.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.7.0
Google APIs 1.5.0
gRPC 1.30.3
Protobuf 3.12.2
The plugin supports the following Google Speech-to-Text API versions:
v1
v1p1beta1
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.7.10-1.el7.x86_64.rpm)
Red Hat / CentOS 8 (unimrcp-gsr-1.7.10-1.el8.x86_64.rpm)
Ubuntu 18.04 LTS (unimrcp-gsr_1.7.10-bionic_amd64.deb)
Ubuntu 20.04 LTS (unimrcp-gsr_1.7.10-focal_amd64.deb)
This release provides a few improvements and fixes.
The detailed list of changes introduced in this release follows.
- Added support for sorting result alternatives based on confidence score (the highest first).
- Made separator of built-in grammar parameters configurable.
- Set the SSL security level to 1 by default. This setting applies to the use of the license server on Ubuntu 20.04 and does not affect other supported distributions.
- Fixed a possible memory leak that occurred when two file entries (waveforms and/or RDRs) had the same creation time, a very rare case that does not arise in regular operation.
- Added a new attribute 'sort-alternatives' to the element 'streaming-recognition'.
- Added a new attribute 'grammar-param-separator' to the element 'streaming-recognition'.
- Added a new attribute 'security-level' to the element 'license-server'.
- Updated the Usage Guide to reflect the changes introduced in this release.
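The sorting of result alternatives by confidence score can be illustrated with a minimal Python sketch. The `Alternative` type and the `sort_alternatives` function are hypothetical names used only for illustration; they are not part of the plugin, which applies this behavior internally when 'sort-alternatives' is enabled.

```python
from dataclasses import dataclass


@dataclass
class Alternative:
    # Hypothetical container for one recognition alternative.
    transcript: str
    confidence: float


def sort_alternatives(alternatives):
    """Return a new list ordered by confidence score, the highest first."""
    # sorted() is stable, so alternatives with equal scores keep their
    # original relative order.
    return sorted(alternatives, key=lambda a: a.confidence, reverse=True)


alts = [
    Alternative("call john", 0.62),
    Alternative("call joan", 0.91),
    Alternative("call jon", 0.75),
]
ranked = sort_alternatives(alts)
print([a.transcript for a in ranked])
```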
Dec 20, 2021
Google Speech Recognition (GSR) Plugin 1.22.1 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.7.0
Google APIs 1.5.0
gRPC 1.30.3
Protobuf 3.12.2
The plugin supports the following Google Speech-to-Text API versions:
v1
v1p1beta1
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.7.9-1.el7.x86_64.rpm)
Red Hat / CentOS 8 (unimrcp-gsr-1.7.9-1.el8.x86_64.rpm)
Ubuntu 18.04 LTS (unimrcp-gsr_1.7.9-bionic_amd64.deb)
Ubuntu 20.04 LTS (unimrcp-gsr_1.7.9-focal_amd64.deb)
This is a patch release which fixes an issue in processing DTMFs and speech concurrently.
The detailed list of changes introduced in this release follows.
- Skip the "Audio Timeout Error" message received from Google when the caller inputs a long sequence of DTMFs while streaming of audio data chunks to Google is in progress. Such a condition could cause an empty NLSML result to be set in the RECOGNITION-COMPLETE event.
Nov 1, 2021
Google Speech Recognition (GSR) Plugin 1.22.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.7.0
Google APIs 1.5.0
gRPC 1.30.3
Protobuf 3.12.2
The plugin supports the following Google Speech-to-Text API versions:
v1
v1p1beta1
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.7.8-1.el7.x86_64.rpm)
Red Hat / CentOS 8 (unimrcp-gsr-1.7.8-1.el8.x86_64.rpm)
Ubuntu 18.04 LTS (unimrcp-gsr_1.7.8-bionic_amd64.deb)
Ubuntu 20.04 LTS (unimrcp-gsr_1.7.8-focal_amd64.deb)
This release adds support for triggering a webhook upon completion of speech transcription. The release also provides a few fixes and improvements in the existing functionality.
The detailed list of changes introduced in this release follows.
- Added support for triggering a webhook upon completion of the speech transcription. The content received back as a result of the webhook invocation is set in the instance element of the NLSML result in the RECOGNITION-COMPLETE event.
- Output license and maintenance expiration dates to status files, if/when applicable.
- If a DTMF digit is detected after the end-of-utterance event is received with no prior transcription results but before the gRPC call is complete, then such an occurrence could cause license depletion.
- Fixed a possible segfault on processing of the header field Logging-Tag having an empty [no] value.
- Added a new element 'webhook' to the document 'umsgsr'.
- Updated the Usage Guide to reflect the changes introduced in this release.
Aug 9, 2021
Google Speech Recognition (GSR) Plugin 1.21.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.7.0
Google APIs 1.5.0
gRPC 1.30.3
Protobuf 3.12.2
The plugin supports the following Google Speech-to-Text API versions:
v1
v1p1beta1
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.7.7-1.el7.x86_64.rpm)
Red Hat / CentOS 8 (unimrcp-gsr-1.7.7-1.el8.x86_64.rpm)
Ubuntu 18.04 LTS (unimrcp-gsr_1.7.7-bionic_amd64.deb)
Ubuntu 20.04 LTS (unimrcp-gsr_1.7.7-focal_amd64.deb)
This release is built against a newer version of Google APIs and a patched version of the gRPC library. The release supports additional parameters introduced in the v1p1beta1 API.
The detailed list of changes introduced in this release follows.
- Implemented the SRGS XML grammar support for DTMF input too.
- Added support for word-level confidence available in v1p1beta1 only.
- Added support for spoken punctuation and emojis available in v1p1beta1 only.
- Added initial support for the new model adaptation concept available in v1p1beta1 only.
- Set the audience on requests sent to regional service endpoints.
- Added new attributes 'word-confidence', 'spoken-punctuation' and 'spoken-emojis' to the element 'streaming-recognition'.
- The entire data structure received from the service is now logged in the JSON format.
- Patched the gRPC library to allow the user application to set audience/scope on requests sent to Google.
- Upgraded Google APIs from unigoogleapis-1.4.0 to unigoogleapis-1.5.0.
- Updated the Usage Guide to reflect the changes introduced in this release.
Apr 26, 2021
Google Speech Recognition (GSR) Plugin 1.20.1 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.7.0
Google APIs 1.4.0
gRPC 1.30.2
Protobuf 3.12.2
The plugin supports the following Google Speech-to-Text API versions:
v1
v1p1beta1
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.7.6-1.el7.x86_64.rpm)
Red Hat / CentOS 8 (unimrcp-gsr-1.7.6-1.el8.x86_64.rpm)
Ubuntu 18.04 LTS (unimrcp-gsr_1.7.6-bionic_amd64.deb)
Ubuntu 20.04 LTS (unimrcp-gsr_1.7.6-focal_amd64.deb)
This is a patch release which fixes support for the inter-result timeout.
The detailed list of changes introduced in this release follows.
- Fixed support for the inter-result timeout. The regression was introduced in GSR 1.19.0.
Mar 5, 2021
Google Speech Recognition (GSR) Plugin 1.20.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.7.0
Google APIs 1.4.0
gRPC 1.30.2
Protobuf 3.12.2
The plugin supports the following Google Speech-to-Text API versions:
v1
v1p1beta1
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.7.5-1.el7.x86_64.rpm)
Red Hat / CentOS 8 (unimrcp-gsr-1.7.5-1.el8.x86_64.rpm)
Ubuntu 16.04 LTS (unimrcp-gsr_1.7.5-xenial_amd64.deb)
Ubuntu 18.04 LTS (unimrcp-gsr_1.7.5-bionic_amd64.deb)
The Google Speech-to-Text API version as well as account credentials can now be specified by the user application per MRCP session. The release is built against a newer version of Google APIs.
The detailed list of changes introduced in this release follows.
- Allow the user application to specify the parameters 'gapp-credentials-file', 'service-uri', 'api', 'region' per MRCP session. The parameters set in the first RECOGNIZE request are observed for creation of a gRPC channel.
- Added new attributes 'service-uri' and 'region' to the element 'streaming-recognition'.
- Upgraded Google APIs from unigoogleapis-1.3.0 to unigoogleapis-1.4.0.
- Updated the Usage Guide to reflect the changes introduced in this release.
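The per-session behavior described above, where the parameters set in the first RECOGNIZE request are observed for creation of the gRPC channel, can be sketched as follows. This is only an illustrative model: the `Session` class and its methods are hypothetical, and the assumption that later requests do not alter the channel settings is inferred from the note above rather than confirmed by the plugin documentation.

```python
# Parameter names are taken from the release note; everything else is hypothetical.
CHANNEL_PARAMS = ("gapp-credentials-file", "service-uri", "api", "region")


class Session:
    def __init__(self, defaults):
        self.defaults = dict(defaults)   # values from the plugin configuration
        self.channel_config = None       # fixed once the channel is created

    def on_recognize(self, request_params):
        """Return the channel settings in effect for this session."""
        if self.channel_config is None:
            # First RECOGNIZE request: merge per-session overrides into defaults.
            config = dict(self.defaults)
            config.update(
                {k: v for k, v in request_params.items() if k in CHANNEL_PARAMS}
            )
            self.channel_config = config
        # Assumption: subsequent requests reuse the settings chosen at creation.
        return self.channel_config


session = Session({"api": "v1", "service-uri": "speech.googleapis.com"})
first = session.on_recognize({"api": "v1p1beta1", "language": "en-US"})
second = session.on_recognize({"api": "v1"})
```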
Oct 15, 2020
Google Speech Recognition (GSR) Plugin 1.19.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.7.0
Google Speech-to-Text API v1
gRPC 1.30.2
Protobuf 3.12.2
The plugin supports the following Google Speech-to-Text API versions:
v1
v1p1beta1
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.7.4-1.el7.x86_64.rpm)
Ubuntu 16.04 LTS (unimrcp-gsr_1.7.4-xenial_amd64.deb)
Ubuntu 18.04 LTS (unimrcp-gsr_1.7.4-bionic_amd64.deb)
This release introduces support for multiple versions of Speech-to-Text API. The API version can be specified from the plugin configuration or via a server profile. The release addresses a possible license depletion case. The release is built against a newer version of Google APIs.
Please note that the entities MRCP channel and gRPC stub are now decoupled to allow the channel to be associated with different versions of the gRPC stub at run-time. In former versions, the plugin was pre-built using a specific version of the gRPC stub.
Everyone is advised to fully revalidate the existing functionality in a test environment before using this version of the plugin in production.
The detailed list of changes introduced in this release follows.
- Introduced support for multiple versions of Speech-to-Text API, including v1 and v1p1beta1 APIs. The API version can be specified from the plugin configuration or via a server profile.
- Added support for speech adaptation boost available for v1p1beta1 only.
- Added support for alternate languages available for v1p1beta1 only.
- Added support for different formats of the instance element in NLSML which can be set via a new attribute 'tag-format'.
- Fixed a possible license depletion that occurred when an end-of-utterance event was received without preceding interim results but with a follow-up final result. This problem was introduced in GSR 1.15.0.
- Added a new attribute 'api' to the element 'streaming-recognition'. The attribute defaults to 'v1' and also accepts 'v1p1beta1' value.
- Added a new attribute 'tag-format' to the element 'streaming-recognition'. The attribute can be set to 'default' (the original behavior), 'semantics/xml' or 'semantics/json'.
- Upgraded Google APIs from unigoogleapis-1.2.0 to unigoogleapis-1.3.0.
- Updated the Usage Guide to reflect the changes introduced in this release.
Aug 20, 2020
Google Speech Recognition (GSR) Plugin 1.18.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.7.0
Google Speech-to-Text API v1
gRPC 1.30.2
Protobuf 3.12.2
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.7.3-1.el7.x86_64.rpm)
Ubuntu 16.04 LTS (unimrcp-gsr_1.7.3-xenial_amd64.deb)
Ubuntu 18.04 LTS (unimrcp-gsr_1.7.3-bionic_amd64.deb)
This release is built against newer versions of gRPC, Protobuf and Google APIs. The release adds support for speaker diarization parameters.
The detailed list of changes introduced in this release follows.
- Allow logging tag to be specified via built-in grammar.
- Added support for speaker diarization parameters.
- Added new attributes 'speaker-diarization', 'min-speaker-count', 'max-speaker-count' to the element 'streaming-recognition'.
- Changed default values of 'speech-incomplete-timeout' from 3000 msec to 15000 msec and 'input-timeout' from 10000 msec to 30000 msec.
- Upgraded the gRPC library from 1.20.0 to 1.30.2 version.
- Upgraded the Protobuf library from 3.7.0 to 3.12.2 version.
- Upgraded Google APIs from unigoogleapis-1.1.0 to unigoogleapis-1.2.0.
- Updated the Usage Guide to reflect the changes introduced in this release.
Apr 10, 2020
Google Speech Recognition (GSR) Plugin 1.17.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.7.0
Google Speech-to-Text API v1
gRPC 1.20.0
Protobuf 3.7.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.7.1-1.el7.x86_64.rpm)
Ubuntu 16.04 LTS (unimrcp-gsr_1.7.1-xenial_amd64.deb)
Ubuntu 18.04 LTS (unimrcp-gsr_1.7.1-bionic_amd64.deb)
This release introduces support for multiple Google account credentials. The release is built against an upgraded version of Google APIs.
The detailed list of changes introduced in this release follows.
- Added support for multiple Google account credentials, which can be specified either per server profile or per MRCP session via feature tags.
- Fixed processing of phrase hints (data in SRGS XML item element) spanning multiple lines.
- Upgraded Google APIs (unigoogleapis-1.1.0).
- Class tokens can now be specified as a text of 'item' elements in an SRGS XML grammar or 'phrase' elements in 'speech-context'.
- Added a sample speech context 'time' to the default configuration which makes use of the class token $TIME. The sample speech context is disabled by default.
- Speech adaptation boost is NOT supported in this release, since, as of now, this feature is available in Google Speech-to-Text API v1p1beta1 only, but not v1.
- Updated the Usage Guide to reflect the changes introduced in this release.
Mar 2, 2020
Google Speech Recognition (GSR) Plugin 1.16.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.6.0
Google Speech-to-Text API v1
gRPC 1.20.0
Protobuf 3.7.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.6.7-1.el7.x86_64.rpm)
Ubuntu 16.04 LTS (unimrcp-gsr_1.6.7-xenial_amd64.deb)
Ubuntu 18.04 LTS (unimrcp-gsr_1.6.7-bionic_amd64.deb)
This release improves support for speech recognition performed with barge-in-able prompts and introduces a new feature of completing input based on inter-result timeout.
The detailed list of changes introduced in this release follows.
- Reinitiate the gRPC StreamingRecognize request if the current one completes with empty results. The behavior can be controlled by a new boolean parameter 'skip-empty-results', which can be overridden per individual request.
- Added support for inter-result timeout. If the specified timeout has elapsed, input is considered complete. The timeout defaults to 0 (disabled) and can be overridden per recognition request.
- When a STOP request is received or the no-input timeout has elapsed, cancel the ongoing gRPC request, if any, in order to complete the operation straightaway, instead of waiting for normal completion of gRPC streaming by sending the final data chunk.
- Added a new attribute 'skip-empty-results' to the element 'streaming-recognition'. The attribute defaults to 'true'.
- Added a new attribute 'inter-result-timeout' to the element 'streaming-recognition'.
- Updated the Usage Guide to reflect the changes introduced in this release.
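The inter-result timeout introduced in this release can be modeled with a small Python sketch. The `InterResultTimer` class is hypothetical, and the sketch assumes that the timer is restarted on every result received from the service and that no timeout applies before the first result; the release note itself only states that input is considered complete once the timeout has elapsed and that 0 disables the feature.

```python
class InterResultTimer:
    """Hypothetical model of 'inter-result-timeout' (in milliseconds)."""

    def __init__(self, timeout_ms=0):
        self.timeout_ms = timeout_ms     # 0 means the timeout is disabled
        self.last_result_ms = None       # time of the most recent result

    def on_result(self, now_ms):
        # Assumption: every interim result restarts the timer.
        self.last_result_ms = now_ms

    def is_input_complete(self, now_ms):
        if self.timeout_ms == 0 or self.last_result_ms is None:
            return False
        return now_ms - self.last_result_ms >= self.timeout_ms


timer = InterResultTimer(timeout_ms=2000)
timer.on_result(1000)
print(timer.is_input_complete(2500), timer.is_input_complete(3000))
```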
Nov 22, 2019
Google Speech Recognition (GSR) Plugin 1.15.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.6.0
Google Speech-to-Text API v1
gRPC 1.20.0
Protobuf 3.7.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.6.6-1.el7.x86_64.rpm)
Ubuntu 16.04 LTS (unimrcp-gsr_1.6.6-xenial_amd64.deb)
Ubuntu 18.04 LTS (unimrcp-gsr_1.6.6-bionic_amd64.deb)
This release introduces numerous supplementary features and enhancements to the existing functionality.
The detailed list of changes introduced in this release follows.
- Made the service URI configurable, mainly to allow specifying not only the address but also the port number of the service, which is required when UniMRCP server is behind a TMG HTTP proxy.
- If the configuration parameter 'use-logging-tag' is set to 'true', the header field Logging-Tag, if specified, is used as a suffix while composing file names of utterances and RDRs.
- If the configuration parameter 'stream-creation-timeout' is specified, a timer is set to track gRPC stream creation. If the service is unavailable or cannot be reached due to a network problem, the timer allows the plugin to respond with an error in a timely manner without waiting for expiration of the default gRPC deadline.
- Implemented redirection of logs produced by the gRPC library. This feature is disabled by default and can be controlled by the new configuration parameters 'grpc-log-redirection', 'grpc-log-verbosity' and 'grpc-log-trace'.
- Added support for HTTP proxy in communication with license servers available as a service.
- Added support for certain vendor-specific parameters, including 'speech-start-timeout'. See section 4.7 in the Usage Guide.
- Made default SRGS XML scope configurable. By default, the scope is considered as 'strict' but can implicitly be used as 'hint', if a new configuration parameter 'match-srgs' is set to 'false'.
- Compose the header field Waveform-URI based on the protocol version. Before, the format defined in MRCPv2 was used unconditionally.
- Added a new attribute 'service-uri' to the element 'streaming-recognition'. The attribute defaults to 'speech.googleapis.com' and can also be set to 'speech.googleapis.com:443'.
- Added a new attribute 'match-srgs' to the element 'streaming-recognition'. The attribute defaults to 'true'.
- Added a new attribute 'stream-creation-timeout' to the element 'streaming-recognition'. The attribute defaults to 0 (unset).
- Added new attributes 'grpc-log-redirection', 'grpc-log-verbosity' and 'grpc-log-trace' to the element 'streaming-recognition'.
- Added new attributes 'http-proxy-address' and 'http-proxy-port' to the element 'license-server'.
- Added a new attribute 'use-logging-tag' to the elements 'utterance-manager' and 'rdr-manager'.
- Updated the Usage Guide to reflect the changes introduced in this release.
May 9, 2019
Google Speech Recognition (GSR) Plugin 1.14.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.6.0
Google Speech-to-Text API v1
gRPC 1.20.0
Protobuf 3.7.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.6.5-1.el7.x86_64.rpm)
Ubuntu 16.04 LTS (unimrcp-gsr_1.6.5-xenial_amd64.deb)
Ubuntu 18.04 LTS (unimrcp-gsr_1.6.5-bionic_amd64.deb)
This release has been built against newer versions of gRPC, Protobuf and Google APIs and adds support for numerous new parameters settable in the RecognitionConfig message. The release also introduces binaries for Ubuntu 18.04 LTS.
The detailed list of changes introduced in this release follows.
- Added support for new parameters settable in the RecognitionConfig message such as 'profanity-filter', 'word-time-offsets', 'auto-punctuation', 'use-enhanced', 'model'. The parameters can be set globally in umsgsr.xml and be specified per recognition request either via vendor-specific parameters or optional attributes passed to a built-in grammar or via metadata set in an SRGS XML grammar. See the Usage Guide.
- Added support for the content type 'text/grammar-ref-list'.
- Do not set speech/result flag if the detector is already in the complete state. This could result in an attempt to send another audio chunk, when the input completion was already signaled.
- The upgraded gRPC library makes it possible to specify different Google service credentials per GSR/GSS/GDF plugin loaded into the same instance of UniMRCP server.
- Added new configuration parameters 'profanity-filter', 'word-time-offsets', 'auto-punctuation', 'use-enhanced', 'model'.
- Upgraded the gRPC library from 1.7.3 to 1.20.0 version.
- Upgraded the Protobuf library from 3.4.0 to 3.7.0 version.
- Introduced a new Googleapis library 1.0.0 containing the Google APIs used by UniMRCP server plugins.
- Updated the Usage Guide to reflect the changes introduced in this release.
Apr 3, 2019
Google Speech Recognition (GSR) Plugin 1.13.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.6.0
Google Speech API v1
gRPC 1.7.3
Protobuf 3.4.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.6.3-1.el7.x86_64.rpm)
Ubuntu 16.04 LTS (unimrcp-gsr_1.6.3-xenial_amd64.deb)
This release addresses an interoperability issue with Cisco Broadworks media server and also brings a few minor improvements.
The detailed list of changes introduced in this release follows.
- Allow the 'speech-complete' flag to be set explicitly in SRGS XML grammars by means of a new meta tag 'speech-complete'. If the 'speech-complete' flag is not explicitly set in an SRGS XML grammar or a speech context, then the parameter defaults to 'false' for speech contexts used as a hint, and to 'true' otherwise.
- Make sure START-OF-INPUT is sent before sending RECOGNITION-COMPLETE with a completion cause set to 'no-match' or 'success'. Fixed interoperability with Cisco Broadworks media server.
- When the license server is in use, fixed processing of a connection hang-up event that occurred while a license refresh request was being sent to the license server. This event could result in a service outage lasting a few seconds.
- Changed the default value of the attribute 'speech-start-timeout' from 300 ms to 50 ms.
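The defaulting rule for the 'speech-complete' flag described above can be expressed as a short Python sketch. The function name and its arguments are hypothetical and serve only to restate the rule from the release note.

```python
def resolve_speech_complete(explicit=None, used_as_hint=False):
    """Return the effective 'speech-complete' value.

    explicit     -- value set explicitly in the grammar or speech context, or None
    used_as_hint -- True if the speech context is used only as a hint
    """
    if explicit is not None:
        # An explicitly set flag always takes effect.
        return explicit
    # Otherwise: 'false' for speech contexts used as a hint, 'true' otherwise.
    return not used_as_hint


print(resolve_speech_complete(used_as_hint=True))
print(resolve_speech_complete(explicit=True, used_as_hint=True))
```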
Feb 21, 2019
Google Speech Recognition (GSR) Plugin 1.12.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.6.0
Google Speech API v1
gRPC 1.7.3
Protobuf 3.4.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.6.2-1.el7.x86_64.rpm)
Ubuntu 16.04 LTS (unimrcp-gsr_1.6.2-xenial_amd64.deb)
This release fixes the use of annual node-bound licenses, and an issue in interoperability with Aspect Prophecy. There are other minor improvements introduced in the release.
The detailed list of changes introduced in this release follows.
- Added support for an optional language parameter passed to a built-in grammar. Fixed interoperability with Aspect Prophecy. See Section 4.2 in the Usage Guide.
- The HTTP proxy URI, if used, can now be specified from configuration instead of an environment variable.
- Set an alarm in the status file if the license server is not reachable for a certain period of time, but the service is not yet affected. Clear the alarm as soon as the license server becomes available. See Section 6.2 in the Usage Guide.
- Fixed grammar reference in an NLSML result when using SRGS XML for DTMF.
- Fixed confidence level set in an NLSML result for DTMF to conform to the format specified in configuration.
- Fixed processing of malformed parameters passed to a built-in grammar.
- If an annual node-bound license is used, the expiration time of the license could be provisioned incorrectly, requiring a restart of the service in order to continue normal operation.
- Added a new attribute 'http-proxy' to the element 'streaming-recognition'.
- Updated the Usage Guide to reflect the changes introduced in this release.
Dec 21, 2018
Google Speech Recognition (GSR) Plugin 1.11.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.6.0
Google Speech API v1
gRPC 1.7.3
Protobuf 3.4.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7 (unimrcp-gsr-1.6.1-1.el7.x86_64.rpm)
Ubuntu 16.04 LTS (unimrcp-gsr_1.6.1-xenial_amd64.deb)
This release provides numerous SRGS-related enhancements, primarily intended to address interoperability issues with VoiceXML platforms that have a limited ability to reference built-in grammars.
The detailed list of changes introduced in this release follows.
- Added support for predefined metadata in SRGS XML grammars, which allows referencing a built-in grammar and/or specifying various input parameters. See Sections 4.2, 5.1, 5.2, 5.3, 5.4 in the Usage Guide.
- A speech context can now be dynamically specified based on the 'one-of' construct in an SRGS XML grammar. See Section 5.2 in the Usage Guide.
- A no-match event is triggered if transcription result does not literally match any phrase specified in SRGS XML.
- Updated the Usage Guide to reflect the changes introduced in this release.
Oct 24, 2018
Google Speech Recognition (GSR) Plugin 1.10.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.5.0
Google Speech API v1
gRPC 1.7.3
Protobuf 3.4.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
This release fixes a few minor issues and allows for new alternate ways of specifying recognition language and referencing speech context.
The detailed list of changes introduced in this release follows.
- Implemented an alternate way of referencing built-in grammars via 'special' ruleref in SRGS XML grammars. See Section 5.1 in the Usage Guide.
- Added support for 'xml:lang' attribute optionally specified in SRGS XML grammars. This method has the highest precedence in selection of recognition language. The language can be otherwise specified either by the global configuration parameter or the header field 'Speech-Language'. See Section 4.2 in the Usage Guide.
- Added a new attribute 'language' to individual speech contexts, which allows speech contexts to be defined for various languages regardless of the global configuration parameter.
- Speech input is considered complete only when an interim result uniquely matches one of the phrases in the used speech context.
- Fixed concatenation of received results when single-utterance is set to false.
- All the internal common class names are declared under a unique namespace so as not to cause a conflict with other speech recognition plugins loaded into the same instance of UniMRCP server.
- In case of DTMF input, allow both 'maxlength' and 'term char' to be specified at the same time. Before, 'maxlength' was not observed if 'term char' was specified.
- Added a new attribute 'language' to individual speech contexts.
- Fixed the digit format output in a log statement 'Detected Start of Event'.
- Updated the Usage Guide to reflect the changes introduced in this release.
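The combined handling of 'maxlength' and 'term char' for DTMF input mentioned above can be sketched in Python as follows. The `collect_dtmf` function is hypothetical, and the choice to exclude the termination character from the collected digits is an assumption made for this illustration.

```python
def collect_dtmf(digits, max_length=None, term_char=None):
    """Return (collected, complete) for a sequence of DTMF digits.

    Input completes on the termination character or when the maximum
    length is reached, whichever comes first.
    """
    collected = []
    for digit in digits:
        if term_char is not None and digit == term_char:
            # Assumption: the termination character is not part of the result.
            return "".join(collected), True
        collected.append(digit)
        if max_length is not None and len(collected) >= max_length:
            return "".join(collected), True
    # Neither condition was met; more input is expected.
    return "".join(collected), False


print(collect_dtmf("12#34", max_length=4, term_char="#"))
print(collect_dtmf("123456", max_length=4, term_char="#"))
```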
Aug 10, 2018
Google Speech Recognition (GSR) Plugin 1.9.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.5.0
Google Speech API v1
gRPC 1.7.3
Protobuf 3.4.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
Speech contexts now can optionally be used not only as a hint but also a basic grammar. If the new attribute 'speech-complete' is set to true, then input will complete as soon as an interim result matches one of the phrases specified in the speech context. Each phrase may also be assigned an arbitrary tag to be set as an instance in the returned NLSML results.
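The use of a speech context as a basic grammar can be illustrated with a minimal Python sketch. The function and the dictionary layout are hypothetical, and the case-insensitive, whitespace-normalized comparison is an assumption; the actual matching rules are those of the plugin.

```python
def match_speech_context(interim_transcript, phrases):
    """Return (phrase, tag) if the interim result equals a phrase, else None.

    `phrases` maps phrase text to an arbitrary tag (hypothetical structure).
    """
    normalized = " ".join(interim_transcript.lower().split())
    for phrase, tag in phrases.items():
        if normalized == " ".join(phrase.lower().split()):
            # With 'speech-complete' set to true, input would complete here
            # and the tag would be returned as the instance in NLSML.
            return phrase, tag
    return None


context = {"sales": "dept-1", "technical support": "dept-2"}
print(match_speech_context("Technical  support", context))
```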
This release also allows the format of the confidence score returned in the NLSML results to be specified via a new configuration parameter 'confidence-format'. By default, the configuration parameter is set to 'auto', which means the format is implicitly determined based on the version of the protocol being used. In order to use the floating point format in the range of 0..1 consistently, the parameter must be set to 'mrcpv2'; for the integer format in the range of 0..100, it must be set to 'mrcpv1'.
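The effect of 'confidence-format' can be sketched in Python as follows. The function name, the two-decimal rendering of the floating point format and the rounding of the integer format are assumptions made for illustration only.

```python
def format_confidence(score, confidence_format="auto", mrcp_version=2):
    """Render a confidence score in the range 0..1 per 'confidence-format'."""
    if confidence_format == "auto":
        # 'auto' derives the format from the protocol version in use.
        confidence_format = "mrcpv2" if mrcp_version == 2 else "mrcpv1"
    if confidence_format == "mrcpv2":
        return f"{score:.2f}"            # floating point, 0..1
    if confidence_format == "mrcpv1":
        return str(round(score * 100))   # integer, 0..100
    raise ValueError(f"unknown confidence format: {confidence_format}")


print(format_confidence(0.91, "mrcpv2"), format_confidence(0.91, "mrcpv1"))
```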
There is also a new configuration parameter 'alternatives-below-threshold', which controls whether or not alternatives with a confidence score below the specified threshold are included in the returned NLSML results.
Everyone is encouraged to upgrade.
The detailed list of changes introduced in this release follows.
- Speech contexts now can optionally be used not only as a hint but also a basic grammar.
- Allow to specify the format of the confidence score returned in the NLSML results based on a new configuration parameter.
- Allow to control whether or not to include alternatives with a confidence score below the specified threshold in the returned NLSML results.
- Keep track of the maximum number of channels used concurrently. This number is logged by default within the statement 'GSR Usage' and can also be written in the status file, if enabled.
- Fixed an issue in the SDI detector, encountered when 'speech-incomplete-timeout' is significantly longer than 'speech-complete-timeout'.
- Fixed processing of the header field 'Recognition-Timeout' to set the SDI 'speech-input-timeout' accordingly. Only the global configuration parameter was in effect before.
- Added a new attribute 'confidence-format' to the element 'streaming-recognition', which defaults to 'auto' and also accepts 'mrcpv2' and 'mrcpv1'.
- Added a new boolean attribute 'alternatives-below-threshold' to the element 'streaming-recognition', which defaults to 'false'.
- Updated the Usage Guide to reflect the changes introduced in this release.
Jul 25, 2018
Google Speech Recognition (GSR) Plugin 1.8.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.5.0
Google Speech API v1
gRPC 1.7.3
Protobuf 3.4.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
This release addresses interoperability with Cisco Unified Customer Voice Portal (CVP).
The release also fixes a few major issues encountered under certain circumstances and introduced in the previous 1.7.0 version of the plugin.
Everyone is encouraged to upgrade.
The detailed list of changes introduced in this release follows.
- Made the name of the built-in speech transcription grammar configurable, which defaults to 'transcribe'.
- Allow 'builtin:speech/name' and 'builtin:grammar/name' to be used interchangeably.
- Accept SRGS speech grammars without processing of any rules defined in the grammar.
- When the configuration parameter 'start-of-input' is set to 'service-originated', the no-input timer might not have been stopped upon start of speech. This issue was introduced in the 1.7.0 release.
- When the configuration parameter 'start-of-input' is set to 'internal', then under certain circumstances, this could have caused license depletion. This issue was introduced in 1.7.0 release.
- Escaped XML control characters which may appear in transcription results. For instance, "AT&T" will be replaced with "AT&amp;T".
- Added a new attribute 'skip-unsupported-grammars', which defaults to 'true'.
- Added a new attribute 'transcription-grammar', which defaults to 'transcribe'.
- Changed grammar referencing routine to be, by default, tolerant to malformed and not supported grammars.
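The escaping of XML control characters in transcription results mentioned above can be demonstrated with the Python standard library. This is only an illustration of the transformation; the `escape_transcript` wrapper is a hypothetical name, and the plugin performs the escaping internally.

```python
from xml.sax.saxutils import escape


def escape_transcript(text):
    """Escape '&', '<' and '>' so the text is safe inside an XML element."""
    return escape(text)


print(escape_transcript("AT&T"))
```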
Jun 11, 2018
Google Speech Recognition (GSR) Plugin 1.7.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.5.0
Google Speech API v1
gRPC 1.7.3
Protobuf 3.4.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
This release provides an alternate way for start of input detection, derived from a first interim result. This behavior is configurable via a new parameter 'start-of-input'. The parameter can be set either to 'service-originated', which is the new and default behavior, or to 'internal', which is the old behavior, based on internal speech activity detector.
Also, when both the speech and DTMF detectors are activated, the DTMF detector remains active until a first interim result of speech transcription becomes available. In order to achieve this behavior, the new configuration parameter 'start-of-input' must be set to 'service-originated'.
The detailed list of changes introduced in this release follows.
- Added support for the start of input event being derived from a first interim result received from the service.
- Added a new attribute 'start-of-input' to the element 'streaming-recognition', which is set to 'service-originated' by default.
- Changed the default value of the attribute 'interim-results' from 'false' to 'true'.
- Changed the default value of the attribute 'speech-incomplete-timeout' from 1000 to 3000.
- Changed the default configuration parameters to improve barge-in experience.
- Improved the speech and DTMF input detector.
- Updated the Usage Guide accordingly.
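The two modes of the parameter 'start-of-input' described above can be summarized with a small Python sketch. This is a simplified model based only on the description in this release note; the function name and its boolean arguments are assumptions made for illustration and do not correspond to the plugin's code.

```python
def start_of_input_detected(mode: str,
                            vad_detected_speech: bool,
                            interim_result_received: bool) -> bool:
    """Decide whether start of input should be reported.

    mode is the value of the 'start-of-input' parameter. The two boolean
    arguments are hypothetical inputs standing in for the internal speech
    activity detector and for the first interim result from the service.
    """
    if mode == "service-originated":
        # New default: start of input is derived from the first interim result.
        return interim_result_received
    if mode == "internal":
        # Old behavior: rely on the internal speech activity detector.
        return vad_detected_speech
    raise ValueError(f"unsupported start-of-input mode: {mode!r}")


if __name__ == "__main__":
    # Speech activity was detected locally, but no interim result has arrived yet.
    print(start_of_input_detected("service-originated", True, False))
    print(start_of_input_detected("internal", True, False))
```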
May 23, 2018
Google Speech Recognition (GSR) Plugin 1.6.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.5.0
Google Speech API v1
gRPC 1.7.3
Protobuf 3.4.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
This release fixes a specific problem in communication with the license server.
The detailed list of changes introduced in this release follows.
- When a license refresh request sent to the license server timed out due to a sporadic network problem, the plugin would remain permanently out of service until the next restart of the UniMRCP server. Now, the plugin reattempts the connection to the license server in such an event, without causing any interruption in service.
- A new entry 'license permit' has been added to the usage status file. The parameter is set either to 'true' or 'false' and indicates the current status of the license enforcement.
- The plugin configuration will no longer be overwritten while upgrading the .deb package. The RPM package was not affected.
Apr 9, 2018
Google Speech Recognition (GSR) Plugin 1.5.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.5.0
Google Speech API v1
gRPC 1.7.3
Protobuf 3.4.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
This release adds support for Speech-Incomplete-Timeout and also provides a few minor improvements.
The detailed list of changes introduced in this release follows.
- Added support for the MRCP header field Speech-Incomplete-Timeout, which is observed only when the attribute 'interim-results' is set to 'true'.
- Made indents in generated NLSML results configurable.
- Added a new attribute 'speech-incomplete-timeout' to the element 'speech-dtmf-input-detector', which is set to 1000 msec by default.
- Changed the default value of the configuration parameter 'vad-mode' from 1 to 2.
- Added a new attribute 'results-indent' to the element 'streaming-recognition', which is set to 2 by default.
- Added the session identifier to all the statements logged by the speech input detector.
Dec 21, 2017
Google Speech Recognition (GSR) Plugin 1.4.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.5.0
Google Speech API v1
gRPC 1.7.3
Protobuf 3.4.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
This release adds support for continuous speech recognition and has gRPC and Protobuf libraries upgraded to newer versions.
The detailed list of changes introduced in this release follows.
- Added support for continuous speech recognition, which needs to be enabled from configuration. Recognition runs in the single utterance mode by default.
- Added a new attribute 'single-utterance' to the element 'streaming-recognition', which is set to 'true' by default.
- Updated Section 3.2 Streaming Recognition.
- Added Section 4.7 Specifying Speech Recognition Mode.
Sep 18, 2017
Google Speech Recognition (GSR) Plugin 1.3.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.5.0
Google Speech API v1
gRPC 1.3.1
Protobuf 3.2.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
This release introduces reporting of recognition details records, adds usage monitoring capabilities and also provides a couple of minor fixes in the existing functionality.
The detailed list of changes introduced in this release follows.
- Added reporting of recognition details records. This feature is optional and needs to be enabled from configuration.
- Added usage details reports. The number of in-use and total licensed channels can be maintained in logs and also be written to a separate status file.
- Reworked configuration parameters defining lifetime of saved utterances.
- When recognition completes normally but with no results returned from the Google Cloud Speech service, the header field Completion-Cause is set to "001 no-match" instead of "006 recognizer-error".
- Fixed the length of stored wave files, which might have been cut short at the end and thus not have contained the whole audio data actually sent to the Google Cloud Speech service. The recognition was not affected.
- Added a new element 'rdr-manager' and attributes related to the configuration of recognition details records.
- Added a new element 'monitoring-agent' and attributes related to the configuration of status usage monitoring.
- Added new attributes 'purge-existing', 'max-file-age' and 'max-file-count' to the element 'utterance-manager'.
- Removed no longer used attributes 'purge-waveforms', 'purge-interval' and 'expiration-time' from the element 'utterance-manager'.
- Updated Section 3.6 Utterance Manager.
- Added Section 3.7 RDR Manager.
- Added Section 3.8 Monitoring Agent.
- Added Section 3.9 Usage Change Handler.
- Added Section 3.10 Usage Refresh Handler.
- Updated Section 4.7 Maintaining Utterances.
- Added Section 4.8 Maintaining Recognition Details Records.
- Added Section 4.9 Monitoring Usage Details.
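The change to the Completion-Cause header field listed above can be sketched in Python as follows. Only the "001 no-match" case for a normal completion with no results is taken from this release note; the mapping of failures to "006 recognizer-error", the use of "000 success" (the standard MRCP cause for a successful recognition), and the function name are assumptions made for illustration.

```python
from typing import Sequence


def completion_cause(completed_normally: bool, results: Sequence[str]) -> str:
    """Pick a Completion-Cause value for a RECOGNITION-COMPLETE event.

    Hypothetical helper: the arguments stand in for the outcome of a
    request to the Google Cloud Speech service.
    """
    if not completed_normally:
        # Assumption: failures of the request are reported as recognizer errors.
        return "006 recognizer-error"
    if not results:
        # Behavior described in this release: normal completion, no results.
        return "001 no-match"
    return "000 success"


if __name__ == "__main__":
    print(completion_cause(True, []))
    print(completion_cause(True, ["call customer service"]))
```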
Aug 23, 2017
Google Speech Recognition (GSR) Plugin 1.2.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.4.0 (note that binaries for 1.5.0 are coming up next week)
Google Speech API v1
gRPC 1.3.1
Protobuf 3.2.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
This release fixes interoperability with Genesys Voice Portal (GVP) by supporting built-in grammars being referenced via DEFINE-GRAMMAR requests.
A major addition in this release is support for a license server having the following main capabilities.
- Allows purchased licenses to be stored and stacked up.
- Supports flexible distribution of licenses over a secure network connection among multiple nodes (instances of the GSR plugin).
- Is available as a service and can also be installed on-premises.
For more information regarding the license server, please contact services@unimrcp.org.
Everyone is encouraged to upgrade.
The detailed list of changes introduced in this release follows.
- Added support for license server.
- Added support for built-in grammars being referenced via DEFINE-GRAMMAR requests.
- Fixed interoperability with Genesys Voice Portal (GVP).
- Added new elements and attributes related to the configuration of license server.
Jun 19, 2017
Google Speech Recognition (GSR) Plugin 1.1.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.4.0
Google Speech API v1
gRPC 1.3.1
Protobuf 3.2.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
This release adds support for speech contexts, which can optionally be provided to the recognizer to improve the recognition accuracy. Consult the updated Usage Guide on how to define and reference speech contexts.
The detailed list of changes introduced in this release follows.
- Added support for built-in and dynamically loadable speech contexts.
- Made the operating mode of VAD configurable.
- Fixed support for consecutive RECOGNIZE requests in an MRCP session.
- Fixed processing of error cases where gRPC methods may fail.
- Fixed input statistics provided on/after STOP.
- Added new elements and attributes related to the configuration of speech contexts.
- Added a new attribute 'vad-mode' in the element 'speech-dtmf-input-detector'.
MRCP Methods and Header Fields
- Added support for the method DEFINE-GRAMMAR, which is used by the MRCP client to dynamically specify a speech context to the MRCP server.
- Added support for the header field Sensitivity-Level, which is used to adjust the operating mode of VAD.
- Set the header field Waveform-URI in RECOGNITION-COMPLETE events and STOP responses, if requested by the MRCP client.
- Added a new log statement providing the GSR plugin version on initial load.
- Updated the Usage Guide to provide guidance for all the relevant changes made in this release.
May 22, 2017
Google Speech Recognition (GSR) Plugin 1.0.0 to the UniMRCP Server (UMS) has been released.
The plugin is based on the following components:
UniMRCP Server 1.4.0
Google Speech API v1
gRPC 1.3.1
Protobuf 3.2.0
The binaries are currently available for the following Linux distributions:
Red Hat / CentOS 7
Ubuntu 16.04 LTS
This is one of the most requested integrations allowing IVR platforms to utilize the Google Cloud Speech services via MRCP.
The recognition accuracy is very impressive across the large number of supported languages. The connection to the globally accessible services is maintained via gRPC, which ensures secure, reliable and fast transmission of data over the Internet.