Even though Amazon Alexa supports shopping from Amazon, the Amazon store is not available in certain regions of the world. Furthermore, one may want to place orders with businesses other than the Amazon store, in which case a different mechanism is required.
This tutorial presents one alternative method for placing orders with any online business by using automation software.
- node-red - provides the means to couple different modules together using a UML-like diagram where actions are chained together and the flow through various elements can be conditioned.
- node-red-contrib-selenium-webdriver - is a node-red module that can use Selenium to automate browser actions; it will provide the main way to interact with shopping websites.
- node-red-contrib-amazon-echo - is a node-red module that uses Philips Hue emulation to connect with Alexa and trigger various actions.
- node-red-contrib-image-output - although not a required component, when creating headless browser automations blindly (i.e. without actually seeing what is going on), this node-red module will allow a screenshot of the current webpage to be taken at any moment and displayed directly in the flow.
The node-red-contrib-amazon-echo module runs locally and does not use an Alexa app hosted with Amazon. One drawback is that TCP port 80 must be available: when Alexa scans the network, it will only discover node-red-contrib-amazon-echo when it is listening on port 80. One solution to this problem is to create a virtual machine and dedicate it to node-red to ensure that there is no interference from Apache or other services that may already be listening on that port.
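Before dedicating a machine to node-red, it may help to check whether port 80 can be bound at all. The following is a minimal sketch using only the Node.js net module; the helper name isPortFree is made up for illustration. Note that a negative result does not necessarily mean another service holds the port: unprivileged users typically cannot bind ports below 1024, so the check should be run as the user that will own the listener.

```javascript
const net = require("net");

// Resolves to true when the given TCP port can be bound on `host`,
// false when binding fails (port in use, or insufficient privileges).
// isPortFree is a hypothetical helper, not part of any node-red module.
function isPortFree(port, host = "0.0.0.0") {
  return new Promise((resolve) => {
    const server = net.createServer();
    server.once("error", () => resolve(false));
    // Close the probe listener again before reporting success.
    server.once("listening", () => server.close(() => resolve(true)));
    server.listen(port, host);
  });
}

isPortFree(80).then((free) => {
  console.log(free ? "port 80 can be bound" : "port 80 is taken or not bindable by this user");
});
```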
First, install node-red globally:
npm -g install node-red
and create a user and group pair, say, named node-red, to run node-red under.
Then create a file at /etc/systemd/system/node-red.service that will be responsible for automatically starting node-red on boot, with the following contents:
[Unit]
Description=Node Red Service
After=network.target

[Service]
ExecStart=/usr/bin/node-red
Type=simple
Restart=always
User=node-red
# Note RHEL/Fedora uses 'root', Debian/Ubuntu uses 'nogroup'
Group=node-red
Environment=PATH=/usr/bin:/usr/local/bin
Environment=NODE_ENV=production
Environment=DISPLAY=:0
WorkingDirectory=/opt/node-red

[Install]
WantedBy=multi-user.target
The systemd service can then be enabled by issuing the command:
systemctl enable node-red
and started with:
systemctl start node-red
The next step is to install the rest of the required modules:
npm -g install node-red-contrib-selenium-webdriver node-red-contrib-amazon-echo
node-red-contrib-selenium-webdriver depends on a number of system packages; on a Debian system they can be installed with the following command:
apt-get install -y xvfb x11-xkb-utils xfonts-100dpi xfonts-75dpi xfonts-scalable xfonts-cyrillic x11-apps clang libdbus-1-dev libgtk2.0-dev libnotify-dev libgnome-keyring-dev libgconf2-dev libasound2-dev libcap-dev libcups2-dev libxtst-dev libxss1 libnss3-dev gcc-multilib g++-multilib default-jre-headless
Selenium is a browser automation package that drives real browsers and performs actions such as clicking on HTML elements, executing JavaScript, finding elements, etc. Consequently, Selenium requires that a display server, such as Xorg, is available and running. Since the purpose of this project is to run in headless mode with just node-red as the main interface, the virtual display server Xvfb will be used to start a virtual X session in the background, in which the browsers used by Selenium and the node-red-contrib-selenium-webdriver package will run.
Xvfb should have been installed by the previous command, so it is only necessary to create a way to start it in the background. To do that, create a file at /etc/systemd/system/xvfb.service with the following contents:
[Unit]
Description=X Virtual Frame Buffer Service
After=network.target

[Service]
User=node-red
Group=node-red
ExecStart=/usr/bin/Xvfb :0 -screen 0 1024x768x24 -ac +extension GLX +render -noreset

[Install]
WantedBy=multi-user.target
where User and Group should be set to the same user and group that node-red runs under.
Now Xvfb can be enabled and started by issuing the following commands:
systemctl enable xvfb
systemctl start xvfb
Next, as a requirement of node-red-contrib-selenium-webdriver, the webdriver server has to be installed:
npm install -g webdriver-manager
The webdriver manager is a tool that downloads and updates various browser connectors that will be used by selenium to perform the automation.
As with node-red and Xvfb, the webdriver-manager server will have to be started and stopped conveniently. This can be done by creating a file at /etc/systemd/system/webdriver-manager.service with the following contents:
[Unit]
Description=Webdriver Manager Starter for Selenium
After=network.target

[Service]
User=root
Group=root
ExecStart=/usr/bin/webdriver-manager start
Environment=DISPLAY=:0

[Install]
WantedBy=multi-user.target
Note that the service files node-red.service and webdriver-manager.service set the environment variable DISPLAY to :0, which is the display that Xvfb is started on in xvfb.service. This allows the process started by node-red.service, namely node-red, and the process started by webdriver-manager.service to connect to the graphics display and run various browsers.
webdriver-manager can now be enabled and started by issuing the commands:
systemctl enable webdriver-manager
systemctl start webdriver-manager
Lastly, at least one browser has to be installed; for instance, Firefox can be installed by issuing the command:
aptitude install firefox-esr
Alexa can be set up by creating an Amazon Echo Hub node that connects to an Amazon Echo Device node, both of which are components of the node-red-contrib-amazon-echo module. The msg node on top is useful for debugging what flows out of the Amazon Echo Hub node, and the node to the right resembling an arrow is a link to a flow on a different page.
As the name of the node, Food, would imply, this particular automation will order food online by telling Alexa: "Alexa, turn Food on".
On a different page (a different node-red flow), the actual browser automation will take place. Since the browser automation will vary greatly depending on the online store to automate, it will not be provided here as an exported flow but will be described in detail.
The flow contains no branching and is a straight-through sequential execution of Selenium browser actions that perform the following steps to order food from a FoodPanda website (ignore the Start WebDriver, Stop WebDriver and catch uncaught nodes in red for now):
- open the website (FoodPanda),
- click the Login button on the page (Authentication),
- fill in the credentials (Username and Password),
- submit the login form (Submit),
- go to the orders page (Orders),
- reorder the most recent order (Reoder),
- open the cart (Cart),
- proceed to checkout (Checkout),
- confirm the order (Go!) - this is also the step where the payment is made based on formerly stored credit cards,
- take a screenshot using the node-red-contrib-image-output package.
As is probably obvious by now, the automation picks the most recently placed order and repeats it. This is done by visiting the "My Orders" page on FoodPanda and clicking the very first "Reorder" button. Alternatively, as an exercise, the flow could be altered to randomly select a previously placed order and then reorder it, like a food raffle or mystery meat. :D
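For the food raffle variation, one possible approach is to build an XPath that targets the n-th reorder button, with n chosen at random. The sketch below is only an illustration: the helper names and the base XPath are hypothetical, the real markup of the orders page has to be inspected first, and how the resulting expression is handed to the Selenium nodes depends on what node-red-contrib-selenium-webdriver allows.

```javascript
// Returns an XPath selecting the n-th (1-based) match of a base expression.
function nthMatchXPath(baseXPath, n) {
  return `(${baseXPath})[${n}]`;
}

// Picks a random 1-based position among `count` candidates.
// `random` can be replaced to make the choice deterministic.
function randomPosition(count, random = Math.random) {
  if (!Number.isInteger(count) || count < 1) {
    throw new RangeError("count must be a positive integer");
  }
  return Math.floor(random() * count) + 1;
}

// Hypothetical locator for the reorder buttons (an assumption, not taken
// from the real website).
const REORDER_BUTTONS = "//button[contains(., 'Reorder')]";

// Assume the five most recent orders take part in the raffle.
const raffleXPath = nthMatchXPath(REORDER_BUTTONS, randomPosition(5));
console.log(raffleXPath);
```

The parentheses around the base expression matter: without them, the position predicate would apply to each button relative to its parent element rather than to the overall list of matches.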
Most of the nodes offered by node-red-contrib-selenium-webdriver will search for HTML elements using a locator such as an XPath expression, a CSS selector or an element ID.
All elements can be picked using a web browser on a different machine. For instance, to copy the XPath of the "Sign In" button on the Google search engine, right-click the "Sign In" button and choose the "Inspect Element" menu item, which will display the HTML element for the button; then, with a right-click on the element itself, the XPath can be copied to the clipboard.
The XPath can then be entered into the Target field in node-red when By is set to xpath.
The same concept applies to selecting HTML elements by CSS selectors or by element ID.
As mentioned previously, the red nodes in the screenshots labeled Start WebDriver, Stop WebDriver and catch uncaught are a special workaround that is not mentioned in the node-red-contrib-selenium-webdriver documentation. It addresses what is most likely an inherent problem of having multiple components that have to run independently: the virtual X display Xvfb, the Selenium web driver, the browser itself inside the X display and, on top of that, node-red.
The main issue is that once node-red-contrib-selenium-webdriver starts an instance of a browser using Selenium, the browser will run in Xvfb and, in case the flow collapses for any reason (for instance, some HTML element could not be found), the browser will not be shut down. Further calls to the flow will then just end up running a bunch of web browsers inside Xvfb until memory is exhausted.
Indeed, node-red-contrib-selenium-webdriver does provide the Close node, which is responsible for closing the web browser instance, but the node might not be reached in case some intermediary node throws an exception.
The workaround permits the node-red user to start and stop webdriver-manager via sudo; this is done by adding the following entries to the sudoers configuration:
Cmnd_Alias WEBDRIVER_CONTROL = /bin/systemctl start webdriver-manager, /bin/systemctl stop webdriver-manager, /bin/systemctl restart webdriver-manager
node-red ALL=(ALL) NOPASSWD: WEBDRIVER_CONTROL
which will allow the user node-red to execute the commands:
- /bin/systemctl start webdriver-manager,
- /bin/systemctl stop webdriver-manager and,
- /bin/systemctl restart webdriver-manager
With the sudo permissions in place, whenever the node-red flow is triggered, node-red will restart webdriver-manager and thereby clear any previous failed instances.
Similarly, the catch uncaught node will catch any exception occurring along the flow pipeline and will then execute /bin/systemctl stop webdriver-manager to clear any web browsers even if the Close node can no longer be reached. The catch uncaught node is part of the base node-red package and is local to the current flow: it will catch any exception in the current flow.
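The idea behind the catch uncaught node can be pictured outside of node-red as a plain "clean up on failure" pattern. The sketch below is not how node-red implements it; it is only an analogy with made-up step names and no real browser, where the cleanup callback stands in for stopping webdriver-manager.

```javascript
// Runs async steps in order. If any step rejects, `cleanup` is awaited
// and the original error is re-thrown so the failure stays visible.
async function runWithCleanup(steps, cleanup) {
  try {
    for (const step of steps) {
      await step();
    }
  } catch (err) {
    await cleanup(err);
    throw err;
  }
}

// Illustrative run with stand-in steps (hypothetical names).
const events = [];
runWithCleanup(
  [
    async () => events.push("open site"),
    async () => { throw new Error("element not found"); },
    async () => events.push("checkout"), // never reached
  ],
  // In the tutorial's flow this is where webdriver-manager would be stopped.
  async () => events.push("stop webdriver-manager")
).catch((err) => {
  console.log(`flow failed: ${err.message}; events: ${events.join(", ")}`);
});
```

The steps after the failing one are skipped, just as the Close node is skipped in the flow, which is why the cleanup has to live outside the normal sequence.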
With everything set up, the only thing that remains to be done is to tell Alexa to discover new devices and say Alexa, turn on food.
Alternatively, if multiple reorder buttons were to be considered and a more complicated node-red flow created, one could perhaps say "Alexa, turn on food raffle", which makes slightly more sense than turning on food...
Unfortunately, the expression "turn on food" is rather awkward, but this is due to the hard-coded Alexa semantics exposed by node-red-contrib-amazon-echo which, in turn, tricks Alexa into believing that node-red-contrib-amazon-echo is, in fact, a Philips Hue hardware device.
Of course, it is not clear what the converse would mean, namely, turning off food... With the current flow and no other changes, turning off food is equivalent to turning on food and will also result in the flow placing the order. Using a conditional, perhaps a "function" node, one could filter out the "turn off food" semantic and either do nothing or process it some other way.
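A filter of that kind could look like the sketch below. It assumes that the message emitted by the Amazon Echo Device node carries a boolean msg.on property; this should be verified with the msg debug node before relying on it. The function name onlyTurnOn is made up for illustration.

```javascript
// Returns the message when Alexa asked to turn the device on, otherwise null.
// In a node-red function node, returning null stops the flow at that node.
function onlyTurnOn(msg) {
  // Assumption: msg.on is true for "turn on" and false for "turn off".
  return msg && msg.on === true ? msg : null;
}

// Inside a function node, the body would reduce to:
//   return onlyTurnOn(msg);
console.log(onlyTurnOn({ on: false }));
```

Placed between the Amazon Echo Device node and the link to the ordering flow, such a node would let "turn on food" through and silently drop "turn off food".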
Sky's the limit!
Bonus points: persuade McDonald's to expose a proper REST API for configuring food orders and placing them online.