Build an app to practice a foreign language with AWS Amplify, Amazon Polly and Translate

5 min read · Feb 21, 2021

My Sundays usually start with a cup of coffee and a German class 📚

I have an amazing teacher, who persistently corrects my mistakes and puts words I have difficulties with in a document we share.

I'm not such a good student, though: I rarely check that document later and almost never practice new phrases between classes, which means that at the next lesson I keep making the same mistakes.

Today after my class I came up with a better practice mechanism, which I can combine with my chores or evening walks.

The idea is simple: I need someone to tell me a word in English, then give me a moment to remember the German word 🤔 and say it out loud. Then this someone should tell me the correct answer, so I know whether I was right or not.

Neither my boyfriend nor my imaginary cat 🐈 agreed to volunteer, so, as an engineer, I decided to take matters into my own hands and develop a quick app to help me.

What I have

A list of words or phrases I want to practice. Here is, for example, the list of words I got from my teacher at the end of today's class:

der Frühling
der Sonnenaufgang
die Blüte
verursachen
die Aufnahme
die Vorliebe
der Aufwand
mit jemandem übereinzustimmen
die Steuererklärung
die Herausforderung

What I want

Some kind of guided recording that gives me a hint about the word (for example its English translation), then leaves me enough time to say the phrase out loud, and finally tells me the phrase in German.

The goal is simple — I need to transform input into output!

Building blocks

What I need:

1. A service to translate my German words to English

2. A service to generate audio for both the English and the German variants

3. An app to combine it all

Let's go one by one. For the translation itself we can use Amazon Translate, which is a straightforward choice for item #1.

I had been wanting to try out Amazon Polly, a cloud service that converts text into lifelike speech. That would do for item #2.

As for #3, I decided to use React with create-react-app as a quick way to bootstrap the application.

Both Polly and Translate can be set up and used as individual services. But I had recently heard so much about Amplify, a set of tools that considerably speeds up cloud integration, that I thought it was time to try it. Better still, Amplify supports integration with Polly and Translate out of the box!

After an hour or two I had a working demo.

You can check the source code on GitHub, and below you can find a step-by-step guide to creating this app.

Instructions 🛠️

Set up project and dependencies

If you haven't done so already, you'll need to install the Amplify CLI. You can find text and video instructions at https://docs.amplify.aws/cli/start/install.

Then let's create a React app. I've called it ForgetMeNot and kept all the default settings, which are enough for our project.

npx create-react-app forgetmenot

Next, from inside the newly created forgetmenot directory, we should initialise an Amplify project.

amplify init

When you run the command above, it will ask you a number of basic questions and offer suggestions. For simplicity I accepted most of the suggested values:

? Enter a name for the project forgetmenot
? Enter a name for the environment dev
? Choose your default editor: None
? Choose the type of app that you're building javascript
Please tell us about your project
? What javascript framework are you using react
? Source Directory Path:  src
? Distribution Directory Path: build
? Build Command:  npm run-script build
? Start Command: npm run-script start
Using default provider  awscloudformation
? Select the authentication method you want to use: AWS profile

We're almost done with the dependencies. The last step is to install the aws-amplify npm package.

npm install aws-amplify

Add AWS services

As mentioned above, we'll be using several AWS services, including Polly and Translate. The Amplify CLI we installed provides a quick way to configure the backend and to set up the necessary resources and permissions.

Translating text from one language to another and converting text to speech both belong to the predictions category. To add services from this category, run

amplify add predictions

Note that we need to run this command twice, once to select translateText and once to select speechGenerator.

If you later want to change the selected settings, run

amplify update predictions

When adding prediction services you'll also be prompted to add authentication configuration. Again, you can go with the suggested default configuration.

? You need to add auth (Amazon Cognito) to your project in order to add storage for user files. Do you want to add auth now? Yes
Using service: Cognito, provided by: awscloudformation
 
 The current configured provider is Amazon Cognito. 
 
 Do you want to use the default authentication and security configuration? I want to learn more.
 Do you want to use the default authentication and security configuration? Default configuration

After making changes to the backend through the Amplify CLI, we should push those changes to the cloud by running

amplify push

Basic App

With this we're ready to dive into the code. Amplify has generated the backend configuration. Let's open App.js and include our backend services.

I’ve also modified the structure of App.js to include a simple layout and a couple of state variables we’ll need later:

Generate text

My initial plan was to combine all the text together, use a bilingual voice, and use SSML tags to indicate the language of each text chunk and to add pauses. However, Amplify does not support this yet, and I couldn't find a way to mix languages within the same conversion.
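For reference, here is a rough sketch of what that single combined input might have looked like. It is not code from the app: the buildPracticeSsml function and the shape of its pairs argument are made up for illustration. It relies only on two standard SSML tags that Polly itself understands when called directly, lang for switching language and break for pauses.

```javascript
// Escape characters that are reserved in XML so phrases are safe inside SSML.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build one SSML document: English hint, thinking pause, German answer.
// `pairs` is a hypothetical array of { english, german, pauseSeconds }.
function buildPracticeSsml(pairs) {
  const body = pairs
    .map(
      ({ english, german, pauseSeconds }) =>
        escapeXml(english) +
        `<break time="${pauseSeconds}s"/>` +
        `<lang xml:lang="de-DE">${escapeXml(german)}</lang>` +
        '<break time="1s"/>' // short gap before the next hint
    )
    .join('');
  return `<speak>${body}</speak>`;
}

console.log(
  buildPracticeSsml([{ english: 'spring', german: 'der Frühling', pauseSeconds: 3 }])
);
```

A string like this would have to be sent to Polly with the text type set to SSML, which is the part the Amplify integration did not expose at the time of writing.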

That's why I opted to split the whole message into alternating English and German chunks and play them one after another.

As for the pauses, I decided to use the length of the English message as an indicator of how long the waiting time should be, and to play a muted recording to imitate a pause.
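One simple way to turn the length of the English hint into a waiting time is a base delay plus a little extra per character, capped at a maximum. The sketch below is only an illustration of that idea: the function name and all three constants are assumptions, not values from the app.

```javascript
// Assumed tuning values for illustration; adjust to taste.
const BASE_PAUSE_MS = 1500; // minimum thinking time
const PER_CHAR_MS = 120; // extra time per character of the English hint
const MAX_PAUSE_MS = 8000; // never wait longer than this

// Longer hints usually mean longer German phrases, so wait longer for them.
function pauseForHint(englishText) {
  const length = englishText.trim().length;
  return Math.min(BASE_PAUSE_MS + length * PER_CHAR_MS, MAX_PAUSE_MS);
}

console.log(pauseForHint('spring')); // 1500 + 6 * 120 = 2220
```

A muted recording of roughly this duration, or a plain timer, can then stand in for the silence between the hint and the answer.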

Here is how the generateText function looked in the end.
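The original listing is not reproduced here, so what follows is a minimal sketch of the same idea rather than the app's actual code. The name generateRecordings and the shape of the returned objects are assumptions. The translate and convertToSpeech functions are passed in as arguments so the example runs without AWS; in the app they would be the helpers that call Amplify.

```javascript
// Hypothetical sketch: for each German phrase, get the English hint and
// an audio URL for both languages.
// translate(text) -> Promise<string>
// convertToSpeech(text, voiceId) -> Promise<string> (an audio URL)
async function generateRecordings(phrases, { translate, convertToSpeech }) {
  const recordings = [];
  for (const german of phrases) {
    const english = await translate(german);
    // The two speech conversions do not depend on each other.
    const [englishAudio, germanAudio] = await Promise.all([
      convertToSpeech(english, 'Joanna'), // English voice
      convertToSpeech(german, 'Marlene'), // German voice
    ]);
    recordings.push({ english, german, englishAudio, germanAudio });
  }
  return recordings;
}

// Usage with stand-in helpers instead of real AWS calls.
const fakeHelpers = {
  translate: async (text) => `EN(${text})`,
  convertToSpeech: async (text, voiceId) => `${voiceId}:${text}`,
};
generateRecordings(['der Frühling', 'die Blüte'], fakeHelpers).then((result) =>
  console.log(result)
);
```

Processing the phrases one at a time keeps the order of the list and avoids firing many requests at once, at the cost of some speed.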

I've moved the translate and convertToSpeech methods to a separate helper.js file.

I used voiceId to indicate the language for textToSpeech. Marlene is a German voice, while Joanna speaks English. You can find a list of other voices here.
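As a hedged sketch of what such helpers could look like: the request objects below follow the documented shape of the convert call in Amplify's Predictions category (translateText and textToSpeech), while the function signatures, and in particular passing the predictions client in as the first argument, are my own assumption to keep the example runnable without AWS. Check the response fields against the Amplify version you have installed.

```javascript
// `predictions` is expected to be Amplify's Predictions category, already
// configured; it is a parameter here so a stub can be used in its place.

// Translate a German phrase to English.
function translate(predictions, text) {
  return predictions
    .convert({
      translateText: {
        source: { text, language: 'de' },
        targetLanguage: 'en',
      },
    })
    .then((result) => result.text);
}

// Convert text to speech; the voice determines the language.
// 'Marlene' is a German voice, 'Joanna' an English one.
function convertToSpeech(predictions, text, voiceId) {
  return predictions
    .convert({
      textToSpeech: {
        source: { text },
        voiceId,
      },
    })
    .then((result) => result.speech.url); // URL playable by an audio element
}

// Stand-in client that mimics the assumed response shapes.
const stubPredictions = {
  convert: (request) =>
    Promise.resolve(
      request.translateText ? { text: 'spring' } : { speech: { url: 'blob:demo' } }
    ),
};
translate(stubPredictions, 'der Frühling').then((english) => console.log(english));
```

In the app, the first argument would be the Predictions object from the Amplify library, available once Amplify has been configured with the generated aws-exports file and the predictions provider has been registered. How the provider is registered differs between Amplify versions.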

Play recordings

To play the generated recordings I've used HTMLAudioElement, created through the Audio constructor. My approach lacks sophistication, but it works.
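A minimal sketch of sequential playback, under a few assumptions: the playInOrder name, the clip objects with url and muted fields, and the createAudio factory are all illustrative and not taken from the app. In the browser the factory would simply be (url) => new Audio(url); passing it in makes it possible to try the logic with a fake player.

```javascript
// Play clips one after another. A clip with `muted: true` still takes its
// full duration, so it behaves like a silent pause.
// createAudio(url) must return an HTMLAudioElement-like object.
function playInOrder(clips, createAudio) {
  return clips.reduce(
    (previous, clip) =>
      previous.then(
        () =>
          new Promise((resolve, reject) => {
            const audio = createAudio(clip.url);
            audio.muted = Boolean(clip.muted);
            audio.onended = resolve; // move on when this clip finishes
            audio.onerror = reject;
            // play() returns a promise in modern browsers and rejects if,
            // for example, autoplay is blocked.
            Promise.resolve(audio.play()).catch(reject);
          })
      ),
    Promise.resolve()
  );
}

// Fake player for trying the function outside a browser.
const played = [];
const fakeAudio = (url) => ({
  play() {
    played.push(this.muted ? `${url} (muted)` : url);
    setTimeout(() => this.onended(), 0); // pretend the clip ended
    return Promise.resolve();
  },
});

playInOrder(
  [{ url: 'english.mp3' }, { url: 'english.mp3', muted: true }, { url: 'german.mp3' }],
  fakeAudio
).then(() => console.log(played));
```

Browsers usually require a user gesture, such as a button click, before audio may start, so the first call should come from an event handler.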

Summary

This was a quick experiment in using several AWS services in combination with Amplify. Although Amplify is still missing some of Polly's features, it was possible to use textToSpeech, and the Amplify CLI made the setup quick.