One of the drawbacks of the Amazon Alexa is that there is no programmatic way to make Alexa talk if a cloudless solution is desired, that is, where Alexa is used directly without a helper application hosted by Amazon. You can talk to Alexa and play the "Simon says" game, in which Alexa repeats a phrase spoken to it, but there is no software that lets you type a phrase and have Alexa speak it.
There are many possible applications that could benefit from such a feature. Notably, Alexa works well as a notifier: it already packs applications that are able to trigger reminders, alarms and even shopping notifications. So why not extend Alexa to speak your own notifications out loud? For instance, monitoring server status and using Alexa as a vocal notifier when a service changes state.
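As a taste of the end result, a service monitor could publish a phrase to an MQTT topic for Alexa to speak. The sketch below is illustrative only: the helper name, broker URL and topic are assumptions, and the actual publishing would be done with an MQTT client such as the `mqtt` npm package.

```javascript
// Hypothetical helper: turn a service state change into a phrase to be spoken.
function formatNotification(service, state) {
  return `Attention: the ${service} service is now ${state}.`;
}

// Publishing the phrase would be done with an MQTT client, for example the
// `mqtt` npm package (broker URL and topic are placeholders):
//   const client = require('mqtt').connect('mqtt://broker.local');
//   client.publish('alexa/tts', formatNotification('nginx', 'down'));
```

The phrase is kept short and declarative on purpose, since TTS output that is too fast or too elaborate is harder for Alexa to parse.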
The entire setup described in this article consists of using good old text-to-speech (TTS) on a small Raspberry Pi board, coupled with an Alexa remote whose microphone the Raspberry Pi speaks into.
The Raspberry Pi Zero W model is preferred due to its wireless module that provides convenient connectivity. The price for a bare-bones Raspberry Pi Zero W is about USD 25, and the cost of a second-hand Alexa remote varies between USD 10-18.
The following steps are roughly followed:
The Audio Injector uses only the I2S GPIO pins, leaving the rest free to use. Note that the Amazon Alexa remote used for this project has six buttons and that the entire remote could be wired up; for the purpose of this project, however, only a single pin needs to be used in order to open the microphone.
The software setup consists of installing an operating system onto the Raspberry Pi, configuring the Audio Injector via ALSA and making sure that it works properly by playing some sample sounds, installing some TTS software and, finally, installing the provided software that subscribes to an MQTT server and waits for phrases to speak.
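By way of illustration, enabling the Audio Injector card on Raspberry Pi OS amounts to loading its device-tree overlay and then checking that ALSA can see and drive the card; treat the snippet below as a starting point and consult the Audio Injector documentation for the authoritative steps:

```
# /boot/config.txt - enable the Audio Injector (WM8731) overlay
dtoverlay=audioinjector-wm8731-audio

# after a reboot, the card should be listed and able to play sound:
#   aplay -l
#   speaker-test -c 2
```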
So, let's get this right: you want to make a machine talk by making another machine ironically pretend that it is a legitimate human.
Indeed, one of the problems with this setup is that the TTS output is more comprehensible to a human than to Alexa. Good choices of TTS software are, in order of preference, pico, festival and, lastly, Google, although the latter just creates an additional dependency on Google providing a consistent service. . .
To use the provided software, subversion, nodejs, npm and mpg123 must be installed:
aptitude install subversion nodejs npm mpg123
After which, the sources can be checked out from the Wizardry and Steamworks SVN server:
svn co http://svn.grimore.org/alexatts
The config.yml.dist file has to be renamed to config.yml and then edited to point to the MQTT server to subscribe to and to set the GPIO pin used for the microphone button. Additionally, a TTS engine must be chosen in config.yml (Google TTS is the default choice and seems to work flawlessly).
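The fragment below only illustrates the kind of settings involved; the key names are hypothetical, so consult the shipped config.yml.dist for the real ones:

```
# Hypothetical illustration - see config.yml.dist for the actual key names.
mqtt:
  url: mqtt://broker.example.com
  topic: alexa/tts
gpio:
  pin: 17          # pin wired to the remote's microphone button
tts:
  engine: google   # one of: pico, festival, google
```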
Next, the packages required for alexatts must be installed by issuing:
npm install
inside the alexatts directory.
Once the configuration is in place, the AlexaTTS software can be started by issuing:
nodejs main.js
at which point AlexaTTS will subscribe to the MQTT topic configured in config.yml and wait for messages to speak.
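Conceptually, handling an incoming message boils down to synthesizing the phrase to a WAV file with pico2wave and playing it through the Audio Injector with aplay (for the Google engine, mpg123 would play the resulting MP3 instead). The helper below only builds the command line; its name and its wiring into alexatts are assumptions, not the project's actual code:

```javascript
// Hypothetical sketch: build the shell command that speaks a phrase with the
// pico engine (pico2wave writes a WAV file, aplay plays it back).
function buildTtsCommand(phrase, wav = '/tmp/say.wav') {
  const safe = phrase.replace(/"/g, ''); // strip double quotes for the shell
  return `pico2wave -w ${wav} "${safe}" && aplay ${wav}`;
}

// The resulting string could then be handed to child_process.exec(), with the
// microphone button held open for the duration of playback.
```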
Although an Alexa remote has been used in this project, an alternative would be to bypass the remote entirely and attach the speaker directly to the Alexa, perhaps via a snap-on strap. However, it may well be that the earbud speaker is too quiet for the Alexa to perceive the sound.