Machine Learning in Action: Build a Universal Translator App for Android with Kotlin
June 20, 2018
8 minute read
I’ve started playing with some of the machine learning APIs that AWS provides. One of these is Amazon Translate, a service that translates from English to a variety of other languages. So I thought I would make an app that uses the on-board speech recognition of an Android device to turn speech into text, then translates that text to a new language.
What you will need
The requirements for this project are fairly simple. You will need an AWS account (which you can get for free), an Android phone (I had problems recording audio on the emulator so I recommend a real device), and Android Studio (which is a free download).
What we will do
We are going to implement two distinct pieces in this tutorial. Firstly, we are going to create a UI that records some audio and then converts it to text. Then we will send that text to the Amazon Translate service and display the result.
Speech to text
Start by creating a new Android project. Make sure you select an appropriate API level (which will depend on your device). Speech recognition was added in API level 8, so you have plenty of room to use older devices. Ensure you add Kotlin support as well. Select the blank template (Empty activity) to start.
Add the following permissions to the AndroidManifest.xml: RECORD_AUDIO, INTERNET, and ACCESS_NETWORK_STATE.
The first one allows us to access the microphone. The other two allow us to send the result to the Internet. By including ACCESS_NETWORK_STATE, you can detect whether you are connected to the Internet, so that the app doesn’t crash when the network is unavailable.
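As a rough sketch of that connectivity check, something like the following should work. The helper name isNetworkAvailable() is my own invention for illustration, and the code assumes the ACCESS_NETWORK_STATE permission has been granted in the manifest.

```kotlin
import android.content.Context
import android.net.ConnectivityManager

// Hypothetical helper (not part of the app described above).
// Requires the ACCESS_NETWORK_STATE permission.
fun isNetworkAvailable(context: Context): Boolean {
    val manager = context.getSystemService(Context.CONNECTIVITY_SERVICE) as ConnectivityManager
    // activeNetworkInfo is deprecated as of API level 29, but it is the
    // simplest option when you also want to support older devices.
    val info = manager.activeNetworkInfo
    return info != null && info.isConnected
}
```

You could call this before sending anything to the translation service and show a message instead of making the request when it returns false.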
Now, let’s take a look at the res/layout/activity_main.xml file:
Layouts don’t get much simpler than this. There is a button at the top to activate the recording, and two text boxes — the first will show the text that we heard and the second will show the translated text.
Now let’s take a look at the MainActivity.kt file:
The button to initiate recording is only enabled if speech recognition is available. When the user taps the button, we call out to the speech recognizer service via an intent. This will pop up a small dialog. At this point, the user speaks. When the user stops speaking, the service will return the text using onActivityResult(). At that point, we update the UI to display the results.
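Here is a minimal sketch of an activity that follows the flow just described. The view IDs (recordButton, heardText), the request code, and the "en-US" language tag are assumptions for illustration, and I am assuming the AndroidX flavor of AppCompatActivity (older projects would import it from android.support.v7.app).

```kotlin
import android.app.Activity
import android.content.Intent
import android.os.Bundle
import android.speech.RecognizerIntent
import android.speech.SpeechRecognizer
import android.widget.Button
import android.widget.TextView
import androidx.appcompat.app.AppCompatActivity

// Place this file in your app's package so that R resolves.
class MainActivity : AppCompatActivity() {
    private lateinit var heardText: TextView

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        // Hypothetical view IDs; use whatever your layout defines.
        heardText = findViewById(R.id.heardText)
        val recordButton = findViewById<Button>(R.id.recordButton)

        // Only enable the button when a recognizer is installed on the device.
        recordButton.isEnabled = SpeechRecognizer.isRecognitionAvailable(this)
        recordButton.setOnClickListener { startListening() }
    }

    private fun startListening() {
        val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
            putExtra(
                RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
            )
            putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en-US") // assumed source language
        }
        // Pops up the recognizer dialog; the result arrives in onActivityResult().
        startActivityForResult(intent, REQUEST_SPEECH)
    }

    override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
        super.onActivityResult(requestCode, resultCode, data)
        if (requestCode == REQUEST_SPEECH && resultCode == Activity.RESULT_OK) {
            // The recognizer returns a list of candidates, best match first.
            val matches = data?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
            heardText.text = matches?.firstOrNull() ?: ""
        }
    }

    companion object {
        private const val REQUEST_SPEECH = 100 // arbitrary request code
    }
}
```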
Getting started really is that easy. You can run the app at this point and see the speech-to-text conversion working.
There are a couple of caveats here. The most important is that Android calls out to a Google service to complete the process. As of API level 23, there is an additional option for the service called EXTRA_PREFER_OFFLINE that can be used to indicate you prefer to do this offline. Use it like this:
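A minimal sketch of setting the flag, assuming a hypothetical helper called buildRecognizerIntent() that creates the intent. The version check keeps the code safe on devices older than API level 23, where the extra does not exist.

```kotlin
import android.content.Intent
import android.os.Build
import android.speech.RecognizerIntent

// Hypothetical helper: builds the recognizer intent and, where supported,
// asks the recognizer to prefer offline processing.
fun buildRecognizerIntent(preferOffline: Boolean): Intent =
    Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
        // EXTRA_PREFER_OFFLINE was added in API level 23 (Marshmallow).
        if (preferOffline && Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
            putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true)
        }
    }
```

Note that this is a preference rather than a guarantee: the recognizer may still go to the network, for example if no offline language pack is installed.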
Set up text translation service
Before you can use the text translation service, you need to set it up. I’ve got an AWS CloudFormation template for this purpose, which I install using the AWS command line. First, copy it to an S3 bucket. I have a scratchpad S3 bucket that I use for this sort of thing:
The translate service doesn’t need anything special, but it does need an IAM role to authorize the request. The CloudFormation template sets up an unauthenticated IAM role, then associates that role with a newly created Amazon Cognito identity pool. The identity pool will give us temporary credentials to access the Amazon Translate service later on.
Create an AWS connection in the app
The app needs to know where and how to connect to the Amazon Translate API. For this, I created a JSON file in res/raw/aws.json with the information from my CloudFormation stack:
Don’t check in the aws.json file. It contains secrets!
To get the various values, take a look at the Outputs section of the CloudFormation stack. Once the stack is finished, you can use the following command:
This will show you the details for the named stack, including the three values that are harder to find elsewhere. The accountId is the 12-digit number for your account, which is available in the top banner of the AWS console.
To read this file, I’ve got a model:
This is a basic model for the five values we need to configure the AWS Mobile SDK. I’ve added two helper methods for converting from a JSON string to the object, and for reading the JSON string from a resource.
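A model along these lines could look like the sketch below. The class name TranslateConfig and every field name other than accountId are placeholders of mine; match them to the keys in your own aws.json. It uses the org.json classes that ship with Android.

```kotlin
import android.content.Context
import org.json.JSONObject

// Hypothetical model; the JSON key names below are assumptions.
data class TranslateConfig(
    val accountId: String,
    val identityPoolId: String,
    val unauthRoleArn: String,
    val authRoleArn: String,
    val region: String
) {
    companion object {
        // Converts a JSON string into the model. Throws JSONException
        // if a key is missing, which surfaces config mistakes early.
        fun fromJson(json: String): TranslateConfig {
            val obj = JSONObject(json)
            return TranslateConfig(
                accountId = obj.getString("accountId"),
                identityPoolId = obj.getString("identityPoolId"),
                unauthRoleArn = obj.getString("unauthRoleArn"),
                authRoleArn = obj.getString("authRoleArn"),
                region = obj.getString("region")
            )
        }

        // Reads the JSON string from a raw resource, e.g. R.raw.aws.
        fun fromResource(context: Context, resourceId: Int): TranslateConfig {
            val json = context.resources.openRawResource(resourceId)
                .bufferedReader()
                .use { it.readText() }
            return fromJson(json)
        }
    }
}
```

With this in place, loading the configuration from an activity would be a one-liner such as TranslateConfig.fromResource(this, R.raw.aws).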
Next, let’s add the AWS Mobile SDK for Android libraries. In the app-level build.gradle file, add the following dependencies:
We’re now ready to configure the AWS translate client. You may have noticed the blank initializeClient() in the earlier code. This is now going to be replaced:
The code creates a client object (which wraps the actual HTTP-based API) and links a credentials provider so that the temporary credentials for our unauthenticated IAM role are used for the request.
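As a sketch of that wiring, the function below builds the client from an identity pool ID and a region name. The function name createTranslateClient() and its parameters are my own; it assumes the aws-android-sdk-core and aws-android-sdk-translate libraries are on the classpath.

```kotlin
import android.content.Context
import com.amazonaws.auth.CognitoCachingCredentialsProvider
import com.amazonaws.regions.Region
import com.amazonaws.regions.Regions
import com.amazonaws.services.translate.AmazonTranslateAsyncClient

// Hypothetical factory: regionName is a string such as "us-east-1".
fun createTranslateClient(
    context: Context,
    identityPoolId: String,
    regionName: String
): AmazonTranslateAsyncClient {
    val region = Regions.fromName(regionName)

    // Fetches (and caches) temporary credentials for the unauthenticated
    // role attached to the Cognito identity pool.
    val credentialsProvider = CognitoCachingCredentialsProvider(
        context.applicationContext,
        identityPoolId,
        region
    )

    // The client wraps the HTTP API; point it at the same region.
    return AmazonTranslateAsyncClient(credentialsProvider).apply {
        setRegion(Region.getRegion(region))
    }
}
```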
Translate some text
We next need a function to translate the text. Even with the setup we used to get here, it’s remarkably simple to use:
The translateRequest contains just three fields: the source and destination languages and the text to be translated. We are going over the network, so this is run asynchronously: the call returns a Future and reports completion through an AsyncHandler. When the response is received, the result contains the translated text.
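Here is a minimal sketch of such a function. The name translate() and the two callback parameters are assumptions for illustration; the request and handler types come from the AWS Mobile SDK for Android.

```kotlin
import com.amazonaws.handlers.AsyncHandler
import com.amazonaws.services.translate.AmazonTranslateAsyncClient
import com.amazonaws.services.translate.model.TranslateTextRequest
import com.amazonaws.services.translate.model.TranslateTextResult

// Hypothetical wrapper around the asynchronous translate call.
fun translate(
    client: AmazonTranslateAsyncClient,
    text: String,
    onResult: (String) -> Unit,
    onFailure: (Exception) -> Unit
) {
    val request = TranslateTextRequest()
        .withText(text)
        .withSourceLanguageCode("en")
        .withTargetLanguageCode("es")

    // The handler is invoked on a background thread, not the UI thread.
    client.translateTextAsync(
        request,
        object : AsyncHandler<TranslateTextRequest, TranslateTextResult> {
            override fun onSuccess(request: TranslateTextRequest, result: TranslateTextResult) {
                onResult(result.translatedText)
            }

            override fun onError(exception: Exception) {
                onFailure(exception)
            }
        }
    )
}
```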
In this case, I’ve opted to set the target language to Spanish. You can use Arabic (ar), simplified Chinese (zh), French (fr), German (de), Portuguese (pt) and Spanish (es). Just change the targetLanguageCode accordingly. You can also add a settings panel or options list to choose the language.
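If you do add an options list, one simple approach is a lookup table from display name to language code. The names targetLanguages and languageCodeFor() below are made up for this sketch, and the fallback to Spanish is just my choice of default.

```kotlin
// Display names mapped to the language codes listed above.
val targetLanguages: Map<String, String> = linkedMapOf(
    "Arabic" to "ar",
    "Chinese (Simplified)" to "zh",
    "French" to "fr",
    "German" to "de",
    "Portuguese" to "pt",
    "Spanish" to "es"
)

// Returns the code for a display name, falling back to a default
// when the name is not in the table.
fun languageCodeFor(displayName: String, default: String = "es"): String =
    targetLanguages[displayName] ?: default
```

The keys can feed a spinner or list directly, and the selected entry's code becomes the targetLanguageCode of the request.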
The only thing left is to write the text to the prepared text view:
The main thing to remember here is that the translateClient invokes its callback on a background thread. You cannot update the UI from that thread, so you have to explicitly switch to the UI thread in the callback.
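A small sketch of that thread switch, assuming a hypothetical helper showTranslation() that is called from the translation callback:

```kotlin
import android.app.Activity
import android.widget.TextView

// Hypothetical helper: safe to call from any thread.
fun showTranslation(activity: Activity, target: TextView, translated: String) {
    // runOnUiThread posts the block to the main thread (or runs it
    // immediately if we are already there), so the view update is legal.
    activity.runOnUiThread {
        target.text = translated
    }
}
```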
I hope you enjoyed this foray into mobile machine learning. There are many more capabilities in the Amazon Machine Learning suite, including natural language processing, text-to-speech, image recognition, and custom deep learning capabilities. AWS even has its own speech-to-text service if you don’t feel like sending more data to Google.