Note - 8th May 2023

I created this blog in 2020 to document the software development projects I'd worked on in university. This was both for my personal record and also to demonstrate the skills & knowledge obtained. The projects were: Percy AI, RNN Trading Bot, Language Translation Headset. I'd like to expand the scope of this blog and write articles on different topics in the future when I have time.


Percy - Automatic Receipt & Invoice Data Extraction

From March to July 2020, I built a web application called Percy. It allows accountants to automatically extract financial information from receipts and invoices by simply uploading a photograph of the document. The core technology that makes this happen is a combination of an advanced machine learning optical character recognition (OCR) engine and a word categorization system that finds the financial information in the extracted text. This is a challenging task because the most advanced publicly available OCR engines struggle with interpreting the combination of numbers and letters present in receipts. To solve this problem, I had to build a smart word categorization system that could understand the nuances of the output from the OCR engine.

Overall, I am proud of the product, as it accurately extracts financial information from well-lit images taken on a flat surface and with high enough definition to be human-readable. If I were to continue working on the product, I would seek to improve the machine learning system that the OCR engine is built on to enhance the accuracy of the output. I believe that if this could be improved, it would significantly enhance the quality of the financial information extraction system.

Truthfully, my work on Percy was not so much an exercise in building a financial extraction system as it was learning how to build a fully functional, strongly branded web application from start to finish. For this reason, this blog post will focus on the full-stack development process that I used to create the application. If you wish to test my application yourself before reading this post, head over to percy.app, create an account, and upload a few receipts/invoices. While building the app, I worked hard to ensure that it is completely secure from a cybersecurity perspective so that all users’ personal and financial information is safe.

Origins of Percy: March 2020

The original mission of this application was to utilize the latest developments in machine learning to completely automate the manual tasks that accountants have to carry out on a daily basis. During the summer of 2018, I spent three months working as an intern in the finance team for a start-up called Velocity Black. They had real problems managing the high number of transactions they handled daily. For this reason, they relied heavily on interns and accounting assistants to process the receipts and invoices and reconcile bank accounts. The issue was that this is an incredibly boring and difficult task that is hardly fit for humans. Not surprisingly, those who joined the accounting team in these roles did not stay for long. The following year, during my internship for Whave in Uganda, I saw a similar situation, where an employee had to travel to the head office one day a week to add receipts and invoices to the accounting software. Seeing the business need for automation in this area, I set out to build software that automated these human-unfit accounting tasks.

I really wanted my app to have a strong, memorable brand. As my software was originally going to be an artificially intelligent accounting assistant, I thought giving it a human name would work well. It is nowadays incredibly hard to find a short, memorable domain name, so I spent a while searching for something that worked. I’m certainly happy with the name Percy, as it does sound somewhat related to accounting and is definitely memorable. Conveniently, it is also generalizable enough that I could choose to use this domain name/brand for a different project in the future if I wish.

Version One

The first version of Percy was a React/Node/Express app that just had the core system. It not only allowed you to extract information from receipts & invoices but also included a system for monitoring trade receivables and trade payables. As it did not have a database system yet, the images and extracted text were stored on the server. The hardest part by far of building the React component was dynamically displaying the photos of the extracted receipts with their relevant financial information from the state. This was made even more difficult by two functionalities I built: adding and removing rows, and displaying errors when the user entered the wrong type of information into boxes. Both had complicated implications for state management and culminated in a maze of countless ‘undefined’ errors which I had to navigate through. At the completion of the first version, bugs were still present in the system. However, in later versions, I managed to make this intricate system work perfectly, and this is something I am proud of.

The automated receipt and invoice data extraction system worked by sending users’ images from the Node.js server to a Python Flask server, which carried out the OCR and word categorization processes. The output was then sent as a JSON file back to the Node.js server, which saved the outputted text and sent the file paths to the React component. The user could then edit the output and add/remove rows.

The Finished Product

After the first version, I decided to make Percy focus only on the automated receipt & invoice data extraction component. For this reason, I switched from a pure ‘create-react-app’ project to a static HTML website that used an Express server, a Python Flask server, a MongoDB database, and a React component embedded into the static HTML. Web developers might describe it as a ‘MERN’ app.

The homepage

The sign-up page, with working account verification tokens and forgotten password resets.

The dashboard page, where you can view previously uploaded receipts/invoices or update your billing/account settings. You can also download previously uploaded receipts as a CSV file that can be opened in Microsoft Excel, or connect your accounting software package so that receipts/invoices are sent to it automatically along with the extracted information.

The upload page.

The extracted information page, where you can view and edit the extracted information, and upload it to your Percy account and/or connected external software.

The support page.

That concludes this blog post about Percy. I hope you have enjoyed learning about my web app. Don’t forget you can go and see it in action at percy.app.

Building A Recurrent Neural Network Trading Bot And The Infrastructure Behind It To Make Live Trades On The Stock And Forex Market (Python, Tensorflow, Keras, Interactive Brokers, Metatrader)

Towards the end of 2019, I developed a keen interest in building machine learning models. I was especially intrigued by Google’s work on AlphaZero, which employs reinforcement learning to train a highly intelligent model quickly. During my first and second years at university, I had learned about financial markets, optimal investment portfolios, and short-term trading strategies. After winning my university’s trading competition with a 12.01% return in 90 days, I started exploring quantitative finance on platforms like Quantopian and CloudQuant. However, I realized that generating alpha from news sentiment and historical stock price movements required advanced technologies, and these platforms didn’t support minute data or machine learning libraries. So, I decided to build my own trading system using a random forest classifier, an efficient machine learning model for this application.

Jupyter Notebooks on Google Cloud

I enjoyed using Jupyter notebooks on Google Cloud for machine learning tasks since it provided the computational power required for complex tasks and large datasets. Additionally, it allowed me access to the Windows operating system, which proved necessary later on. For my stock choices, I focused on relatively small, fast-growing NASDAQ stocks, using data from a software company called Digital Turbine (APPS) to test my model.

The Model

My model had a 63.04% accuracy for predicting buying opportunities in unseen future stock prices. With a 2% gain for each correct prediction and a 0.5% loss for each incorrect one, I achieved a 1,500% return within a 30-day period. However, I hadn’t accounted for the 0.50% buying/selling fee charged by brokers for small, fast-growing stocks, which rendered the system unprofitable. Despite this, I continued building the infrastructure for live trading using this model, as I thought it was an interesting project.
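
The effect of the fee on these figures can be checked with some simple arithmetic. The sketch below is only an illustration: it assumes every winning trade closes at exactly +2% and every losing trade at exactly -0.5%, and it treats the total fee per round trip as a parameter, since whether the 0.50% applies once or on both the buy and the sell is an assumption either way. The function name `expected_return_per_trade` is hypothetical.

```python
def expected_return_per_trade(win_rate, gain_pct, loss_pct, round_trip_fee_pct=0.0):
    """Average percentage return of a single trade, ignoring compounding."""
    gross = win_rate * gain_pct - (1.0 - win_rate) * loss_pct
    return gross - round_trip_fee_pct


# Figures taken from the text: 63.04% accuracy, +2% per win, -0.5% per loss.
gross_edge = expected_return_per_trade(0.6304, 2.0, 0.5)

# Assumption A: the 0.50% fee is paid once per round trip.
net_edge_one_fee = expected_return_per_trade(0.6304, 2.0, 0.5, round_trip_fee_pct=0.5)

# Assumption B: the 0.50% fee is paid on both the buy and the sell.
net_edge_two_fees = expected_return_per_trade(0.6304, 2.0, 0.5, round_trip_fee_pct=1.0)

print(f"gross edge per trade:    {gross_edge:.3f}%")
print(f"net edge, fee paid once: {net_edge_one_fee:.3f}%")
print(f"net edge, fee both ways: {net_edge_two_fees:.3f}%")
```

Under assumption B the fee removes almost all of the gross edge per trade, before any spread or slippage is taken into account.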

The Infrastructure

Unfortunately, I didn’t take any screenshots of the infrastructure in use on my Google Cloud Windows server, so I can’t share any images.

Interactive Brokers Trader Workstation

I wanted to trade stocks that had been rising for at least three months and had large percentage moves during the day. These were more likely to keep rising and provide more earning potential after deducting broker spreads. I found that Interactive Brokers lets users trade stocks programmatically through its Trader Workstation (TWS) API. The TWS API runs a server that sends messages to and receives messages from the Trader Workstation client.

I integrated the random forest classifier model into the TWS API by importing the pickle file and feeding it price data. The model would then output a ‘1’ to buy or a ‘0’ not to buy. If the stock price rose by 2%, the system would automatically sell, and if it dropped by 0.5% below the purchase price, it would also sell automatically.
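
The decision rule described above can be written down in a few lines. This is a minimal sketch rather than the original trading script: it leaves out the TWS API calls and the pickle loading entirely, and the function name `next_action` and the two constants are hypothetical names chosen for illustration.

```python
TAKE_PROFIT = 0.02   # sell once the price is 2% above the purchase price
STOP_LOSS = 0.005    # sell once the price is 0.5% below the purchase price


def next_action(model_output, purchase_price, current_price):
    """Return 'buy', 'sell' or 'hold' for one price update.

    model_output is the classifier's prediction (1 = buy, 0 = do not buy).
    purchase_price is None while no position is open.
    """
    if purchase_price is None:
        return "buy" if model_output == 1 else "hold"

    change = (current_price - purchase_price) / purchase_price
    if change >= TAKE_PROFIT or change <= -STOP_LOSS:
        return "sell"
    return "hold"


print(next_action(1, None, 50.0))     # no position, model says buy
print(next_action(0, 100.0, 102.5))   # position open, up 2.5%
print(next_action(1, 100.0, 100.5))   # position open, inside both limits
```

Keeping the rule in a pure function like this makes it easy to test against historical prices before connecting it to a live broker feed.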

When I created my Interactive Brokers account to test my system on live data, I discovered that stock price feeds required a minimum of $2,000 in the trading account. This was too large an amount just for testing purposes, so I took a break from the project.

MetaTrader

Later, I learned that the forex market does not have restrictions on subscribing to live data feeds as it’s a decentralized market. So, I started looking for a way to use my machine learning model with MetaTrader.

I found an open-source wrapper library on GitHub called ‘dwx-zeromq-connector’ that connected Python 3 with MetaTrader 4. I wrote a Python script that imported my random forest classifier machine learning model as a pickle file and used it to make predictions, leading to opening trades on the MetaTrader software.

In the end, I had a setup that looked similar to the one shown in the images below. While these images are not mine, they are taken from the ‘dwx-zeromq-connector’ GitHub repository and included here to demonstrate how my system appeared in action.

Images: the setup as shown in the ‘dwx-zeromq-connector’ GitHub repository.

With the project complete, I had successfully built a machine learning model for trading and integrated it with the MetaTrader platform. Although the initial model was not profitable due to broker fees, it was an engaging and educational experience that allowed me to explore the intersection of machine learning and trading. The skills and knowledge gained from this project could potentially be applied to future trading strategies or refined to create more profitable models.

On-The-Move Language Translation Via A Phone App And A Custom Designed Headset - AI Hardware (C++, Arduino, Python, Flask)

Arduino is an excellent microcontroller for building product prototypes. I had experimented with it during school and was familiar with how it worked. For this project, I purchased an Adafruit Feather 32U4 Bluefruit LE microcontroller, which functions similarly to an Arduino microcontroller. Along with it, I used a breadboard, two speakers, and one microphone.

Image: All the hardware components, small enough to fit in a wearable device.


Software

For this project, I had to develop three pieces of software. First, I needed to write the C++ script that would run on the Adafruit microcontroller. Second, I had to create an Android app that would run on my phone. Third, I had to build a Python Flask server that could receive audio, translate it, and send it back. I wanted the system to be capable of translating a foreign language to a native language and vice versa.


C++ Script

This script needed to be capable of receiving the analog input from the microphone and converting it into a pulse-code modulation (PCM) file. It then had to send this PCM file to the phone app. An added complication was the limited memory storage of the Adafruit microcontroller, requiring the analog data to be sent to an SD card before being transmitted to the phone app.

In the initial versions of the system, I created the PCM file by sampling the analog audio stream at uniform intervals and sending a collection of these samples to the phone app. The server would then add headers to the PCM file so that it could be read as a waveform audio file (WAV), supported by the language translation API used on the server. In the end, this system worked, and the server was able to convert the PCM file created from raw analog audio into a WAV that could be played back and translated.

The challenge was sending the collection of samples to the phone app, as the Adafruit microcontroller did not have sufficient RAM to store a large variable. It is worth noting that the phone app could receive the raw analog bytestream from the microphone in real-time via Bluetooth, so arguably there was no need to save it as a variable on the microcontroller, as it could be stored as a variable on the phone app. However, the software I used to create the app, MIT App Inventor, was a visual editor with limited programming functionality. I did not have time to build an Android app in a native language, so I searched for other ways to send the PCM file to the phone app.
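
The server-side step of adding headers to raw PCM so that it can be read as a WAV can be sketched with Python's standard `wave` module. This is an illustration, not the original server code: the helper name `pcm_to_wav` is hypothetical, and the sample rate, sample width, and channel count are assumed values that would have to match however the microcontroller actually sampled the microphone.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=8000, sample_width=1, channels=1):
    """Wrap raw PCM samples in a WAV container and return the WAV bytes.

    The defaults (8 kHz, 8-bit, mono) are assumptions for this sketch.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)      # the header is written for us
    return buffer.getvalue()


# 100 samples of silence (128 is the midpoint for 8-bit unsigned audio).
wav_bytes = pcm_to_wav(bytes([128] * 100))

# Reading it back confirms the header describes the data correctly.
with wave.open(io.BytesIO(wav_bytes), "rb") as check:
    print(check.getnchannels(), check.getframerate(), check.getnframes())
```

The PCM payload itself is unchanged; the WAV file is simply the same samples preceded by a short header describing their format.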

Image: Testing the speakers.


Image: Working with Adafruit’s Bluefruit library.


Eventually, I found the open-source TMRpcm GitHub repository by TMRh20, which provided a solution to this problem. Although the repository primarily focused on playing back PCM/WAV files from an SD card, it also included code for saving analog input from a microphone to the SD card as a PCM file.

Image: Experimenting with the TMRpcm library.


If I could use this framework and find a way to transfer the WAV files from the SD card to the phone app via Bluetooth, the C++ portion of my project would be complete.

Android Phone App

As mentioned earlier, I opted for MIT App Inventor to build this app because I wanted to develop and test my prototype quickly. I had previously used MIT App Inventor to create an Android app while in school and appreciated its ease of use.

Image: The UI of the app.


Image: The block code of the app.


Flask Web Server

For my Flask server, I chose to use a service called PythonAnywhere because it is free to use. I employed the ‘speechrecognition’ library to recognize the spoken language and convert it to text. Then, I used the ‘googletrans’ library to translate the text to the desired language. Next, I utilized the ‘gTTS’ (Google Text to Speech) library to convert the text into speech, saved as an MP3 file. Finally, I converted the MP3 file into a WAV file, which I returned to the Android phone app.
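
The four stages above chain together in a fixed order, which the following sketch makes explicit. It is not the original server code. To keep the example runnable without network access, each stage is passed in as a function; in a real server these would wrap the ‘speechrecognition’, ‘googletrans’, and ‘gTTS’ libraries and an MP3-to-WAV converter. The name `translate_audio`, its parameters, and the stand-in stage functions are all hypothetical.

```python
def translate_audio(wav_bytes, source_lang, target_lang,
                    recognize, translate, synthesize, to_wav):
    """Run the speech -> text -> translated text -> speech pipeline."""
    text = recognize(wav_bytes, source_lang)                 # speech to text
    translated = translate(text, source_lang, target_lang)   # text to text
    mp3_bytes = synthesize(translated, target_lang)          # text to speech
    return to_wav(mp3_bytes)                                 # MP3 to WAV


# Stand-in stages so the pipeline can be exercised offline.
def fake_recognize(wav_bytes, lang):
    return "hello"


def fake_translate(text, source_lang, target_lang):
    return {"hello": "bonjour"}.get(text, text)


def fake_synthesize(text, lang):
    return b"MP3:" + text.encode("utf-8")


def fake_to_wav(mp3_bytes):
    return b"WAV:" + mp3_bytes[len(b"MP3:"):]


reply = translate_audio(b"...", "en", "fr",
                        fake_recognize, fake_translate,
                        fake_synthesize, fake_to_wav)
print(reply)
```

Separating the stages this way also makes it straightforward to swap a single library out if one of the free services becomes unavailable.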

That was my experience attempting to build an in-ear language translation device. I certainly learned that writing software for audio is complex and often under-documented in public libraries. Nevertheless, if I had managed to get the Adafruit microcontroller to work as intended, I believe the device would have functioned. Whether it would have been able to consistently translate fast-moving conversations at a distance is another question entirely.
