Feature Detection for Receipts
Why feature detection for travel receipts is different
Over the last four years, I have been working for a Berlin-based startup on an AI solution for recognizing receipts and invoice documents. The goal was to recognize travel expense receipts so reliably that users would trust the recognized values, in particular the gross, net and tax amounts, and could find their receipts quickly by the displayed invoice issuer.
In our initial tests, these two capabilities proved essential for user confidence in automatic receipt recognition. After all, the recognized data is used for reimbursement by the employer or for declaration to the tax office. It is therefore very important to most users that as few errors as possible occur. Users build up this trust only after several weeks of use, and only then does the added value of automated document recognition become fully apparent to them.
At the beginning of our work, we looked around the market for turnkey solutions. In 2016, however, no service was up to the task, even though the recognition of line items was not yet relevant at that point. We therefore decided to implement an appropriate service ourselves.
How did we start?
Since we did not have significant amounts of travel expense receipts at the beginning of the implementation, the first part of the implementation was done without any deep learning. We used various technologies ranging from regular expressions to different heuristics and matching against reference databases. From analyzing the logs of the recognition algorithms, we quickly learned that no single component is crucial for recognition. Only the combination of different algorithms led to the desired success. The following example illustrates this well:
For recognizing the correct gross and net values of receipts from different countries, the analyses showed how important it is to correctly recognize the commodity group of a receipt, such as taxi, airplane or hotel. In many countries, the commodity group influences which tax rates can apply to a service.
The recognition of a printed tax rate in percent on the receipts contributed almost nothing to the recognition performance. However, the mathematical combinability of all recognized currency amounts on a receipt with the possible tax rates of a recognized country contributed significantly to the recognition performance. Thus, the recognition of a country had a much stronger impact on recognition performance for gross and net values than the recognition of printed tax rates on the receipts.
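The idea of checking which recognized amounts fit together arithmetically can be sketched as follows. This is only a minimal illustration under stated assumptions, not the production implementation: the `TAX_RATES` table, the function name `find_consistent_amounts` and the one-cent tolerance are hypothetical, and the rates listed are merely examples.

```python
from decimal import Decimal
from itertools import permutations

# Hypothetical lookup table: candidate tax rates per recognized country.
# Illustrative values only; a real table would also depend on the
# commodity group and the receipt date.
TAX_RATES = {
    "DE": [Decimal("0.19"), Decimal("0.07")],
    "AT": [Decimal("0.20"), Decimal("0.13"), Decimal("0.10")],
}


def find_consistent_amounts(amounts, country, tolerance=Decimal("0.01")):
    """Return (gross, net, tax, rate) tuples whose values fit together.

    Every ordered pair of recognized amounts is tried as (gross, net)
    against every tax rate that is possible for the given country.
    """
    matches = []
    for rate in TAX_RATES.get(country, []):
        for gross, net in permutations(amounts, 2):
            if net >= gross:
                continue
            expected_gross = (net * (1 + rate)).quantize(Decimal("0.01"))
            if abs(expected_gross - gross) <= tolerance:
                matches.append((gross, net, gross - net, rate))
    return matches


# All currency amounts found anywhere on a (made-up) receipt.
recognized = [Decimal("23.80"), Decimal("20.00"), Decimal("3.80"), Decimal("5.00")]
print(find_consistent_amounts(recognized, "DE"))
print(find_consistent_amounts(recognized, "AT"))
```

With the made-up amounts above, only the assumption "DE" yields a consistent gross/net pair, which mirrors why the recognized country matters more than a printed tax rate.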
Machine learning was integrated into the solution only once the confidence level of the machine learning models in production use was significantly better than that of the algorithms used until then. Several hundred thousand documents were required for this.
But even after this integration, machine learning accounted for only part of the recognition performance. For recognizing an invoice date, for example, complex regular expressions still perform better today, despite significantly expanded training data.
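To give an impression of what regex-based date recognition can look like, here is a deliberately simplified sketch. The patterns, the day-first assumption for dotted and slashed dates, and the function name `find_dates` are assumptions made for illustration; the expressions used in practice are considerably more complex.

```python
import re
from datetime import date

# Hypothetical, simplified patterns. Day-first order is assumed for the
# first pattern, which is not valid for every country.
DATE_PATTERNS = [
    # 30.05.2021, 30/05/2021 or 30-05-2021
    re.compile(r"\b(?P<d>\d{1,2})[./-](?P<m>\d{1,2})[./-](?P<y>\d{4})\b"),
    # ISO style: 2021-05-30
    re.compile(r"\b(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})\b"),
]


def find_dates(text):
    """Return all calendar-valid dates found in the OCR text."""
    found = []
    for pattern in DATE_PATTERNS:
        for match in pattern.finditer(text):
            try:
                found.append(
                    date(int(match["y"]), int(match["m"]), int(match["d"]))
                )
            except ValueError:
                # Digits looked like a date but are not one, e.g. 31.02.2021.
                continue
    return found


print(find_dates("Rechnung Nr. 4711 Datum: 30.05.2021 Total 23,80 EUR"))
```

Validating each match with `datetime.date` filters out digit sequences that only look like dates, which is one simple way to keep false positives down.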
Detection Network
Today, the recognition solution consists of many dozens of individual recognition blocks. These initially provide atomic recognition data with independent confidence levels. Only in downstream stages is the raw data from these blocks assembled into consistent overall data for invoices and receipts. Various methods, including deep learning, are also used in these downstream stages. The algorithms thus compete for recognition both individually and in combination against the other algorithms and their combinations.
Only this multi-stage, learning heuristic achieves the recognition performance users require, as described above. It allows the targeted trust to be built up and thus leads to a high level of customer loyalty for the solution based on it.
Feature recognition for receipts is therefore more than the general feature recognition of documents. This is mainly due to the internal structure of invoices: their fields are logically connected and therefore have to be recognized in context.
Although, due to COVID-19, the technology has so far been used only in company backends, you can now easily take advantage of it for your own invoices and receipts with the new my-Receipt app for iOS (https://apps.apple.com/app/id1567251540).
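How atomic results with independent confidence levels might be merged downstream can be sketched like this. The `Candidate` type, the block names and the noisy-OR combination rule are assumptions chosen for illustration; the article does not specify how the actual stages combine their inputs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    field: str         # e.g. "country" or "gross"
    value: str         # recognized raw value
    confidence: float  # independent confidence of the block, 0.0 to 1.0
    source: str        # name of the (hypothetical) recognition block


def assemble(candidates):
    """Pick one value per field from competing block outputs.

    Confidences of blocks that agree on a value are merged with a
    noisy-OR rule (an assumption): 1 - product of (1 - confidence).
    """
    miss = defaultdict(lambda: 1.0)
    for c in candidates:
        miss[(c.field, c.value)] *= 1.0 - c.confidence

    best = {}
    for (field, value), p_miss in miss.items():
        score = 1.0 - p_miss
        if field not in best or score > best[field][1]:
            best[field] = (value, score)
    return best


candidates = [
    Candidate("country", "DE", 0.6, "vat_id_regex"),
    Candidate("country", "DE", 0.5, "address_lookup"),
    Candidate("country", "AT", 0.7, "phone_prefix"),
]
print(assemble(candidates))
```

In this toy input, two moderately confident blocks that agree on "DE" outweigh a single, more confident block voting for "AT", which illustrates how combinations of algorithms can compete against individual ones.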
References:
[1] myReceipt App (May 30, 2021)
https://apps.apple.com/app/id1567251540