Receipt Recognition Using Microsoft Cognitive Services

February 15, 2019 by Softjourn


Background

In the early stages of a technology’s development, best practices are still being discovered through experimentation. This gives developers maximum flexibility in designing systems, but it also forces them to do everything themselves, often relearning lessons that others have already learned. Artificial intelligence (AI), and machine learning in particular, has been in this state for quite some time, requiring developers to assemble the necessary computational resources and algorithms on their own.

It is a sign of a technology’s maturity when it can be provided “as a service” (think “software-as-a-service,” “platform-as-a-service,” or “infrastructure-as-a-service”). Microsoft Cognitive Services (MSCS) and other analytics providers demonstrate the maturity of machine learning by offering “analytics-as-a-service,” i.e., making their AI platforms available to outside developers. That is not to say that no further improvements will come, but Microsoft believes that the basic operations are now understood well enough to be performed behind the curtain of an API (Application Programming Interface). A basic service might just process search queries, as any browser can already do. Microsoft Cognitive Services can go much further.

The Breadth of use cases of Microsoft Cognitive Services

To take a now commonplace example, it is possible to create a standardized recommendation system using Microsoft Cognitive Services. An application would upload its catalog data (describing products) and usage data (specifying user interactions with products). The application can then request product recommendations for any user. Additional APIs allow users to constrain the recommendation algorithm to include or exclude particular recommendations. This API is used, for example, by allrecipes.com and Orkestra.
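The catalog-plus-usage flow described above can be illustrated with a toy example. This is a minimal sketch, not Microsoft's algorithm and not a call to any MSCS API: it assumes a simple "users who interacted with X also interacted with Y" co-occurrence count, and the function names (`build_cooccurrence`, `recommend`), the `exclude` parameter, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(usage):
    """Index usage data given as (user_id, item_id) interaction pairs.

    Returns the set of items seen per user and, for each item, a count of
    how often every other item appeared in the same user's history.
    """
    items_by_user = defaultdict(set)
    for user_id, item_id in usage:
        items_by_user[user_id].add(item_id)

    cooccurrence = defaultdict(Counter)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            cooccurrence[a][b] += 1
            cooccurrence[b][a] += 1
    return items_by_user, cooccurrence


def recommend(user_id, items_by_user, cooccurrence, catalog,
              exclude=frozenset(), top_n=3):
    """Rank catalog items the user has not seen by co-occurrence score.

    `exclude` plays the role of a constraint that removes particular
    recommendations (a hypothetical stand-in for the constraint APIs).
    """
    seen = items_by_user.get(user_id, set())
    scores = Counter()
    for item in seen:
        for other, count in cooccurrence.get(item, {}).items():
            if other in catalog and other not in seen and other not in exclude:
                scores[other] += count
    # Highest score first; ties broken by item id so the output is stable.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical catalog data (products) and usage data (interactions).
catalog = {"pasta": "Pasta", "sauce": "Tomato sauce",
           "basil": "Fresh basil", "cake": "Chocolate cake"}
usage = [("ann", "pasta"), ("ann", "sauce"),
         ("bob", "pasta"), ("bob", "sauce"), ("bob", "basil"),
         ("cat", "pasta")]

items_by_user, cooccurrence = build_cooccurrence(usage)
print(recommend("cat", items_by_user, cooccurrence, catalog))
print(recommend("cat", items_by_user, cooccurrence, catalog, exclude={"sauce"}))
```

A hosted service would take on the storage, scaling, and model choice that this sketch ignores; the point here is only the shape of the inputs and outputs.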

Is this really learning? Perhaps not yet. But Microsoft Cognitive Services also has an API that allows the system to receive feedback about its recommendations (for example, whether or not users click on them). The recommendations are then no longer based only on some preexisting theory or on past examples of user behavior, but can be shaped by that behavior as it manifests itself.
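One way to picture such a feedback loop is to re-rank candidate recommendations by how often users actually click them. The sketch below is an illustration under stated assumptions, not the MSCS feedback API: the `ClickFeedback` class, its smoothing priors, and the sample items are hypothetical.

```python
from collections import defaultdict


class ClickFeedback:
    """Track impressions and clicks per item and re-rank by click rate.

    The priors act as smoothing so that an item shown only once or twice
    is not ranked on the basis of very little evidence (assumed values).
    """

    def __init__(self, prior_clicks=1.0, prior_impressions=2.0):
        self.prior_clicks = prior_clicks
        self.prior_impressions = prior_impressions
        self.impressions = defaultdict(int)
        self.clicks = defaultdict(int)

    def record(self, item_id, clicked):
        """Record that a recommendation was shown and whether it was clicked."""
        self.impressions[item_id] += 1
        if clicked:
            self.clicks[item_id] += 1

    def click_rate(self, item_id):
        """Smoothed click-through rate; unseen items get the prior rate."""
        clicks = self.clicks.get(item_id, 0) + self.prior_clicks
        shown = self.impressions.get(item_id, 0) + self.prior_impressions
        return clicks / shown

    def rerank(self, candidates):
        """Order candidates by smoothed click rate, highest first."""
        return sorted(candidates, key=lambda item: -self.click_rate(item))


feedback = ClickFeedback()
for _ in range(3):
    feedback.record("sauce", clicked=False)  # shown three times, never clicked
for _ in range(2):
    feedback.record("basil", clicked=True)   # shown twice, clicked both times

print(feedback.rerank(["sauce", "basil", "cake"]))
```

With these counts the clicked item moves ahead of the ignored one, and the never-shown item sits in between at the prior rate.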

Pros and cons of analytics-as-a-service, such as Microsoft Cognitive Services

Pros:

  • Microsoft takes care of the details.  This makes it easier to create an application, and thus to try out new ideas.
  • Microsoft will (hopefully) update the algorithms on the basis of the latest research, so you don’t have to!
  • As with any public cloud-based solution, MSCS enables flexible use of resources.

Cons:

  • With the ease of use gained through APIs, some expressivity is lost. For applications such as trading, where an organization’s value depends on outperforming its competitors, analytics-as-a-service may not be ideal if it does not provide adequate support for fine-tuning the algorithms.
  • The choice of provider also depends on the existing technology stack. Because this client is a Microsoft shop, using .NET, Azure, and VSTS (Visual Studio Team Services), it made sense to go with the Microsoft product for this project.

Other analytics-as-a-service providers

Microsoft is not the only game in town providing analytics services for financial and other applications. Another is IBM, whose Cloud offering leverages Watson (yes, the Jeopardy! player) to provide analytics in domains including investment management. Bottlenose, yet another, draws on a wide variety of data streams to provide analytic insights in areas including finance, competitive intelligence, and risk estimation. Then there is Domo, which provides a platform through which organizations can access analytics not via APIs (at least not yet), but via numerous third-party apps.

The Need

One of our clients manages prepaid cards for its customers, making it possible for those customers to track expenses. Often, corporate customers wish to track the expenses of their employees. In the old days, traveling employees paid for their expenses out of pocket and submitted expense reports for review and, ultimately, reimbursement. Our client allows these expenses to be prepaid by the company, removing the need for employees to use their own funds and to sift through old receipts. But how can these expenses be validated when employees might, through error or malice, submit receipts that should not qualify for spending through the card? And, most immediately, how can the information even be pulled from the receipts?

The Solution

For this problem, the balance tipped in favor of using analytics-as-a-service, and in particular Microsoft Cognitive Services.

The corporate customer’s employee uploads a picture of the receipt, taken with a smartphone. Softjourn’s proof of concept (POC) sends it to MSCS’s Computer Vision API to perform optical character recognition (OCR). This pulls out editable lines of text from the receipt image, which are returned to the POC along with indications of Microsoft’s level of confidence in the result. Text with a low confidence level must be sent to be read by a human; other text contains errors that Softjourn can correct automatically.

Next, templates encoding standard receipt formats are selected and applied to extract the important pieces of information from the receipt, including the transaction total, amount of tax, card charged, and address of the establishment. This allows for initial validation of transaction totals. Receipts from designated vendors may already have an approval code; for others, there may be more work to do for validation. To this end, MSCS also returns a “raciness” indicator (checking for a situation in which the expense ought to be rejected for lack of relevance or appropriateness).
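The post-OCR steps described above (routing low-confidence text to a human, correcting common OCR errors, applying a template, and validating totals) can be sketched as follows. This is an illustration, not Softjourn's POC code and not the Computer Vision API's response format: the `OcrLine` structure, the numeric confidence scale, the 0.80 threshold, the character-confusion table, the single regex template, and the "subtotal + tax = total" check are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class OcrLine:
    text: str
    confidence: float  # hypothetical score between 0 and 1


REVIEW_THRESHOLD = 0.80  # assumed cut-off for sending a line to a human

# Amount-like tokens such as "12.5O" or "$l2.50" (letters misread for digits).
AMOUNT_LIKE = re.compile(r"(?<![A-Za-z0-9])\$?[0-9OolIS]+\.[0-9OolIS]{2}(?![A-Za-z0-9])")
CONFUSIONS = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5"})

# One hypothetical receipt template: field name -> pattern for a whole line.
TEMPLATE = {
    "subtotal": re.compile(r"^SUBTOTAL\s+\$?(\d+\.\d{2})$", re.IGNORECASE),
    "tax": re.compile(r"^TAX\s+\$?(\d+\.\d{2})$", re.IGNORECASE),
    "total": re.compile(r"^TOTAL\s+\$?(\d+\.\d{2})$", re.IGNORECASE),
}


def split_by_confidence(lines, threshold=REVIEW_THRESHOLD):
    """Separate lines that can be processed automatically from those needing review."""
    accepted = [line for line in lines if line.confidence >= threshold]
    needs_review = [line for line in lines if line.confidence < threshold]
    return accepted, needs_review


def fix_amounts(text):
    """Replace letters commonly confused with digits inside amount-like tokens."""
    def repair(match):
        token = match.group(0)
        # Leave tokens with no real digit alone; they are probably words.
        if not any(ch.isdigit() for ch in token):
            return token
        return token.translate(CONFUSIONS)
    return AMOUNT_LIKE.sub(repair, text)


def extract_fields(lines, template=TEMPLATE):
    """Apply the template to corrected line text and collect the first match per field."""
    fields = {}
    for line in lines:
        text = fix_amounts(line.text.strip())
        for name, pattern in template.items():
            match = pattern.match(text)
            if match and name not in fields:
                fields[name] = Decimal(match.group(1))
    return fields


def totals_consistent(fields):
    """Initial validation: all three amounts present and subtotal + tax == total."""
    if any(name not in fields for name in ("subtotal", "tax", "total")):
        return False
    return fields["subtotal"] + fields["tax"] == fields["total"]


# Hypothetical OCR output for one receipt.
ocr_lines = [
    OcrLine("CAFE EXAMPLE", 0.97),
    OcrLine("SUBTOTAL 10.00", 0.95),
    OcrLine("TAX 0.8O", 0.91),
    OcrLine("TOTAL $1O.80", 0.88),
    OcrLine("C4RD ****", 0.42),
]

accepted, needs_review = split_by_confidence(ocr_lines)
fields = extract_fields(accepted)
print(fields, totals_consistent(fields))
print([line.text for line in needs_review])
```

Template selection is left out here; a fuller version would first pick a template per vendor or receipt layout and would also extract the card charged and the establishment's address.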

 


