In my second year of university we had a large number of foreign exchange students from China. I thought it would be super cool if I could understand what they were saying and communicate with them in their own language. Of course, the issue was that I didn’t know Mandarin. During high school I had successfully used Google Translate on my computer to do something similar, but looking at a screen gets in the way of a flowing conversation. Why couldn’t you design a wearable earphone/speaker combination that sends audio to your phone via Bluetooth, which forwards it to a server that translates it with Google Translate and sends the result back to the wearable device? That would give you real-time translation of foreign languages on the go, which would be useful for tourists, businesspeople, or those like myself looking to surprise or impress others. So, I set to work seeing what I could do.
Arduino is a great microcontroller platform for building product prototypes. I had experimented with it while I was in school, so I knew how it worked. For this project I bought an Adafruit Feather 32U4 Bluefruit Loose HDR microcontroller, which works just like an Arduino microcontroller. With it I used a breadboard, two speakers, and one microphone.
For this project there were three pieces of software I had to build. Firstly, I had to write the C++ script that would run on the Adafruit microcontroller. Secondly, I had to build an Android app that would run on my phone. Thirdly, I had to build a Python Flask server that would receive audio, translate it, and send it back. I wanted the system to translate in both directions: from a foreign language to a native language, and from a native language to a foreign language.
The microcontroller script must be able to read the analog input from the microphone and convert it into a pulse-code modulation (PCM) file. It must then send this PCM file to the phone app. An added complication is that the Adafruit has very limited memory, so the audio data must be written to an SD card before it can be sent to the phone app.
In the first versions of the system I created the PCM data by sampling the analog audio stream at a uniform interval and sending a collection of these samples to the phone app. The idea was that the server would add the headers to the PCM data so that it could be read as a waveform audio (WAV) file, which was the format supported by the language translation API that I used on the server. This part actually worked: the server was able to convert the PCM data created from raw analog audio into a WAV file that could be played back and translated.
The issue was that sending the collection of samples to the phone app was very difficult, because the Adafruit did not have enough RAM to hold them all in one large variable. I should note that the phone app could receive the raw analog byte stream from the microphone in real time via Bluetooth, so arguably there was no need to store the samples on the microcontroller at all, as they could have been stored in a variable on the phone app instead. The problem was that the software I used to create the app, MIT App Inventor, is a visual editor with limited programming functionality. I did not have time to build a native Android app, so I looked for other ways to send the PCM file to the phone app.
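The header step on the server can be sketched roughly as follows. This is a minimal illustration rather than the original server code: the function name `pcm_to_wav` is hypothetical, and the 8 kHz, 8-bit, mono format is an assumption, since the actual sample rate and bit depth depend on how the microcontroller sampled the microphone. It uses Python's standard `wave` module to write the WAV header around the raw samples.

```python
import io
import wave


def pcm_to_wav(pcm_bytes: bytes, sample_rate: int = 8000,
               sample_width: int = 1, channels: int = 1) -> bytes:
    """Wrap headerless PCM samples in a WAV container.

    The defaults (8 kHz, 1 byte per sample, mono) are assumptions for
    illustration; they must match how the samples were actually recorded.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)   # samples per second
        wav_file.writeframes(pcm_bytes)      # header sizes are filled in on close
    return buffer.getvalue()


# One second of "silence": 8-bit WAV samples are unsigned, so 128 is the midpoint.
silence = bytes([128] * 8000)
wav_bytes = pcm_to_wav(silence)
```

If the sample rate declared in the header does not match the real sampling interval on the microcontroller, the audio plays back too fast or too slow, which would likely hurt speech recognition.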
Eventually I found the open-source TMRpcm GitHub repository by TMRh20 (https://github.com/TMRh20/TMRpcm), which provides a solution to this problem. While the repository focuses on playing back PCM/WAV files from an SD card, it also has code for saving analog input from a microphone to the SD card as a PCM file.
If I could use this library and also find a way to transfer the WAV files from the SD card to the phone app via Bluetooth, the C++ portion of my project would be complete.
Android Phone App
As mentioned previously, I chose MIT App Inventor to build the app because I wanted to build and test my prototype quickly. I had also used MIT App Inventor to build an Android app while in school and liked how easy it was to use.
Flask Web Server
I decided to host my Flask server on a service called PythonAnywhere (https://www.pythonanywhere.com/) as it is free to use. I used the ‘speechrecognition’ library to recognize the speech and convert it to text. Then I used the ‘googletrans’ library to translate the text into the desired language. Next I used the ‘gTTS’ (Google Text-to-Speech) library to convert the translated text into speech saved as an MP3 file. Finally, I converted the MP3 file into a WAV file, which the server returned to the Android phone app.
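The three stages chain together roughly as in the sketch below. It is an illustration under stated assumptions, not the original server code: the function names, the language codes (`zh-CN`, `zh-cn`, `en`), and the choice to pass each stage in as a callable are all mine. The third-party imports sit inside the functions so the pipeline itself can be tried with stand-in stages. The `googletrans` adapter assumes a release where `translate()` is synchronous. The final MP3-to-WAV conversion is left out.

```python
import io
from typing import Callable


def translate_speech(wav_bytes: bytes,
                     recognize: Callable[[bytes], str],
                     translate: Callable[[str], str],
                     synthesize: Callable[[str], bytes]) -> bytes:
    """Run the stages in order: speech -> text -> translated text -> speech."""
    heard_text = recognize(wav_bytes)
    translated_text = translate(heard_text)
    return synthesize(translated_text)


def recognize_with_speechrecognition(wav_bytes: bytes,
                                     language: str = "zh-CN") -> str:
    """Speech to text; the language code is an assumed example."""
    import speech_recognition as sr  # third-party: SpeechRecognition
    recognizer = sr.Recognizer()
    with sr.AudioFile(io.BytesIO(wav_bytes)) as source:
        audio = recognizer.record(source)  # read the whole file
    return recognizer.recognize_google(audio, language=language)


def translate_with_googletrans(text: str, src: str = "zh-cn",
                               dest: str = "en") -> str:
    """Text to text; assumes a googletrans version with a synchronous API."""
    from googletrans import Translator  # third-party: googletrans
    return Translator().translate(text, src=src, dest=dest).text


def synthesize_with_gtts(text: str, lang: str = "en") -> bytes:
    """Text to speech; returns MP3 bytes (WAV conversion not shown)."""
    from gtts import gTTS  # third-party: gTTS
    mp3_buffer = io.BytesIO()
    gTTS(text=text, lang=lang).write_to_fp(mp3_buffer)
    return mp3_buffer.getvalue()
```

With the real libraries installed and network access available, the call would look like `translate_speech(wav_bytes, recognize_with_speechrecognition, translate_with_googletrans, synthesize_with_gtts)`. Translating in the other direction only requires swapping the language codes in each stage.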
That was my attempt at building an in-ear language translation device. I certainly learnt that writing software for audio is complicated and that the public libraries for it are poorly documented. Nevertheless, if I had managed to get the Adafruit microcontroller to work as intended, I think the device would have worked. Whether it would be able to consistently translate fast-moving conversation at a distance is another question entirely.
Thanks for reading. If you want to get in touch, send me an email.