How to Use Alexa Skills Kit SDK 2.0 with 3rd party APIs
Once a quarter we at SPR host a Solution Day, when people from different teams spend the day together in the office “learning by doing”. In this post, I recap our project for the day: integrating a Node.js Alexa skill written with Alexa Skills Kit SDK ^2.0.0 with an asynchronous third-party REST API call.
The project
At a recent Solution Day, I had the pleasure of working with a few of my colleagues on improving an Alexa skill that lets us interact with an IoT device we built at a previous Solution Day. The IoT device was built using a Particle Photon and provides telemetry data about our office kegerator: what’s on tap, how cold the beer is, and how much beer is left.
The original version of the skill was written in Node.js and ported from another installation that featured only a single faucet. Our office kegerator has two faucets. The result was sub-optimal in both performance and user experience. The goal of this Solution Day was to improve both.
While implementing these improvements, we upgraded the Alexa Skills Kit SDK to the latest version (from ^1.0.0 to ^2.0.0). Our specific problem (and the solution we came up with) involved integrating a Node.js Alexa skill written with Alexa Skills Kit SDK ^2.0.0 with an asynchronous third-party REST API call.
While this applies to any asynchronous function, in this post we’ll cover an example that used the Particle Cloud API. Like nearly every other post I’ve written, a lot of this isn’t new information – there’s a lot of “standing on the shoulders of giants” in here. The value comes in being able to find this information in one place.
Returning results in Alexa SDK ^1.0.0
A key change in the latest SDK version is how an AWS Lambda function invoked as part of a skill request terminates and reports results. In Alexa SDK ^1.0.0, returning results involved emitting an event, as in the example below:
self.emit(':tell', 'The keg is ' + tempVal + ' percent full.');
lambdaContext.done(null, 'Request for beer quantity successful');
Terminating/reporting results in this manner makes integration with 3rd party REST APIs very easy. These lines can simply be placed in a callback or a promise. When the API responds and results are returned, an event can be emitted, and the context is told we’re done. Here is a more complete example:
'HowMuchBeerIsLeft': function () {
  var tempVal = 0, self = this;
  request(consumptionMonitorUrl).then(function (body) {
    body = JSON.parse(body);
    // other logic
    self.emit(':tell', 'The keg is ' + tempVal + ' percent full.');
    lambdaContext.done(null, 'Request for beer quantity successful');
  });
},
Returning results in Alexa SDK ^2.0.0
The same is not true in Alexa SDK ^2.0.0 – with the new handler model, terminating a skill request and returning results requires actual ‘return’ statements at the end of the function. There’s not a whole lot of documentation either, but there are plenty of example projects. The new handler model looks like the example below, which is the default ErrorHandler from an example project:
const ErrorHandler = {
  canHandle() {
    return true;
  },
  handle(handlerInput, error) {
    return handlerInput.responseBuilder
      .speak('Sorry, I can\'t understand the command. Please say again.')
      .getResponse();
  },
};
This model is harder to use with a skill that calls a third-party API, because the handler has to wait for the asynchronous call to respond before it can return. A ‘return’ statement inside a callback or promise handler won’t work: it returns from the callback, not from the handler, and the handler has usually finished (returning nothing) before the callback ever runs.
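A minimal sketch of that pitfall. The `fakeApiCall` function is a hypothetical stand-in for any callback-based API call (it is not part of our skill), and the JSON body is made up for illustration:

```javascript
// Hypothetical stand-in for a callback-based API call (not part of the skill).
function fakeApiCall(callback) {
  setTimeout(() => callback(null, '{"result": "38.5"}'), 10);
}

// Broken: the inner `return` only returns from the callback, not from brokenHandle().
function brokenHandle() {
  fakeApiCall((error, body) => {
    return JSON.parse(body).result; // this value is simply discarded
  });
  // Execution reaches this point right away, so the function returns undefined.
}

const brokenResult = brokenHandle();
console.log(brokenResult); // undefined
```

The caller gets `undefined` because the callback has not run yet when `brokenHandle` finishes.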
One of probably a dozen solutions to this problem involves async/await syntax, introduced in ECMAScript 2017 (often informally called ES8). A good explanation of async/await can be found here: http://nikgrozev.com/2017/10/01/async-await/, though our examples do not use the request-promise library as that site does.
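As a rough illustration of the idea, here is a sketch in which `fakeTempLookup` is a hypothetical function standing in for a real API call, with a hard-coded value instead of a network request:

```javascript
// Hypothetical stand-in for an asynchronous API call that returns a Promise.
function fakeTempLookup() {
  return new Promise((resolve) => {
    setTimeout(() => resolve('38.50'), 10);
  });
}

// An async function always returns a Promise. `await` pauses it until
// fakeTempLookup() settles, so the `return` below sees the resolved value.
async function handle() {
  const temperature = await fakeTempLookup();
  return `The beer is being served at ${temperature} degrees Fahrenheit.`;
}

handle().then((speech) => console.log(speech));
```

Because `handle` is declared `async`, whatever calls it receives a Promise and can wait for the final value, which is what lets a plain ‘return’ work again.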
Functions
Our example solution shows a handler function waiting for a response from another function that returns a Promise. We build that Promise ourselves because, again, we’re not using the request-promise library (though we could). Our example also includes a ‘bodyTemplateMaker’ function (included below), because our skill runs on an Echo Spot, which has a screen. The ‘bodyTemplateMaker’ function (along with its related functions) is a helper for building a response that has visual components. It is pulled from an example skill on GitHub and is largely untouched.
Libraries we used:
const Alexa = require('ask-sdk-core');
const request = require('request');
const AWS = require('aws-sdk');
Handler (for Alexa skill):
const TempHandler = {
  canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest'
      && handlerInput.requestEnvelope.request.intent.name === 'TempIntent';
  },
  async handle(handlerInput) {
    let temperature = await tempLookup();
    return bodyTemplateMaker(
      'BodyTemplate1',
      handlerInput,
      mainImage,
      'How cold is the beer?',
      `Current Temp: ${temperature} F`,
      null,
      null,
      `The beer is being served at ${temperature} degrees Fahrenheit.`,
      null,
      null,
      mainImgBlurBG,
      false
    );
  }
};
Async function (for 3rd party API call):
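const tempLookup = () => {
  return new Promise(function (resolve, reject) {
    request(servingTempUrl, function (error, response, body) {
      // Reject on network errors so the awaiting handler can catch them.
      if (error) {
        return reject(error);
      }
      try {
        // JSON.parse throws on a malformed body; surface that as a rejection.
        resolve(parseFloat(JSON.parse(body).result).toFixed(2));
      } catch (parseError) {
        reject(parseError);
      }
    });
  });
};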
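If the lookup’s Promise can reject, the handler can wrap the ‘await’ in a try/catch and speak a fallback message. The sketch below is an assumption about how that might look, not code from our skill: `failingLookup`, `makeStubHandlerInput`, and `makeHandle` are hypothetical helpers, and the stub only mimics the `speak` and `getResponse` calls so the snippet runs without the SDK or a network connection.

```javascript
// Hypothetical lookup that always fails, standing in for tempLookup().
const failingLookup = () => Promise.reject(new Error('device offline'));

// Minimal stub of the parts of handlerInput.responseBuilder used here.
function makeStubHandlerInput() {
  const response = {};
  const responseBuilder = {
    speak(text) { response.speech = text; return responseBuilder; },
    getResponse() { return response; },
  };
  return { responseBuilder };
}

// Builds a handle() function around whichever lookup is passed in.
const makeHandle = (lookup) => async (handlerInput) => {
  try {
    const temperature = await lookup();
    return handlerInput.responseBuilder
      .speak(`The beer is being served at ${temperature} degrees Fahrenheit.`)
      .getResponse();
  } catch (error) {
    // A rejected Promise surfaces here as a thrown error.
    return handlerInput.responseBuilder
      .speak('Sorry, I could not reach the kegerator. Please try again.')
      .getResponse();
  }
};

const handleWithFailure = makeHandle(failingLookup);
handleWithFailure(makeStubHandlerInput()).then((r) => console.log(r.speech));
```

Either branch ends in a ‘return’, so the skill always produces a response even when the API call fails.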
Helper function (bodyTemplateMaker and friends):
function bodyTemplateMaker(pBodyTemplateType, pHandlerInput, pImg, pTitle, pText1, pText2, pText3, pOutputSpeech, pReprompt, pHint, pBackgroundIMG, pEndSession) {
  const response = pHandlerInput.responseBuilder;
  const image = imageMaker("", pImg);
  const richText = richTextMaker(pText1, pText2, pText3);
  const backgroundImage = imageMaker("", pBackgroundIMG);
  const title = pTitle;
  response.addRenderTemplateDirective({
    type: pBodyTemplateType,
    backButton: 'visible',
    image,
    backgroundImage,
    title,
    textContent: richText,
  });
  if (pHint)
    response.addHintDirective(pHint);
  if (pOutputSpeech)
    response.speak(pOutputSpeech);
  if (pReprompt)
    response.reprompt(pReprompt);
  if (pEndSession)
    response.withShouldEndSession(pEndSession);
  return response.getResponse();
}
function imageMaker(pDesc, pSource) {
  const myImage = new Alexa.ImageHelper()
    .withDescription(pDesc)
    .addImageInstance(pSource)
    .getImage();
  return myImage;
}
function richTextMaker(pPrimaryText, pSecondaryText, pTertiaryText) {
  const myTextContent = new Alexa.RichTextContentHelper();
  if (pPrimaryText)
    myTextContent.withPrimaryText(pPrimaryText);
  if (pSecondaryText)
    myTextContent.withSecondaryText(pSecondaryText);
  if (pTertiaryText)
    myTextContent.withTertiaryText(pTertiaryText);
  return myTextContent.getTextContent();
}
Unfortunately, we can’t share the entire codebase for this solution because it contains proprietary and/or private information. However, one thing we found difficult was navigating various source code repos to cobble our solution together. We’re hoping this is a succinct, yet complete, solution that others will find helpful.
Enjoy and good luck!