Using Python and the Project Oxford Vision API to OCR an online image from any URL

03/30/2016

The Computer Vision API is a collection of state-of-the-art image processing algorithms designed to return information based on the visual content of an image and to generate your ideal thumbnail. With this API, you can choose the visual features that best suit your needs.

The Project Oxford Vision API has the following features:

Analyse an image

This module produces visual features based on the input image's visual content: image categories, adult content detection, dominant colour, and more.

Use the adult and racy features to enable automated restriction of sexual content and protect your users.
Utilize categorization to help append tags to images, as well as group images into clusters.

Generate a thumbnail

Given an input image, you can generate a high-quality, storage-efficient thumbnail.

Leverage thumbnail generation to help present images in the best form suited to your needs.
Use smart cropping to preserve the region of interest when a thumbnail's aspect ratio differs from that of your original image.

OCR

Optical Character Recognition (OCR) detects text in an image and extracts the recognized characters into a machine-usable character stream.

Run on embedded images to generate text and enable searching.
Allow users to take photos of text instead of transcribing to save time and effort.

Looking for an end-to-end solution from Microsoft for monitoring image content?

Moderate your content

The scenario-based Content Moderator service brings together Project Oxford APIs and more to proactively alert you to content policy violations.

Create and maintain custom image and text blacklists with automated alerts.
Fuzzy matching can detect permutations of content you have blacklisted.

Detect child exploitation

Use our PhotoDNA Cloud Service to automatically detect and report the distribution of child exploitation images.

Distinguish and flag illegal images when they are uploaded to your platform.
The service can identify images even when they have been altered.

In this blog post, I am going to cover using Python with the OCR functionality of the Microsoft Project Oxford API.

The Python script simply passes the URL of an image to the Project Oxford API, which performs OCR on the image and returns the recognized text as JSON.

Python Script

import http.client, urllib.parse, json

headers = {
   # The request body is sent as JSON
   'Content-Type': 'application/json',
}

params = urllib.parse.urlencode({
   # Specify your subscription key
   'subscription-key': 'INSERT PROJECTOXFORD VISION KEY',
   # Specify values for optional parameters, as needed
   'language': 'en',
   'detectOrientation': 'true',
})

# The body is a JSON object holding the URL of the image to OCR
body = json.dumps({'Url': 'Simply add url of image'})

try:
   conn = http.client.HTTPSConnection('api.projectoxford.ai')
   conn.request("POST", "/vision/v1/ocr?%s" % params, body, headers)
   response = conn.getresponse()
   data = response.read()
   print(data)
   conn.close()
except Exception as e:
   # Not every exception carries errno/strerror, so print the exception itself
   print("Error: {0}".format(e))

The Target Image

The output

b'{"language":"en","textAngle":0.0,"orientation":"Up","regions":[{"boundingBox":"36,353,928,539","lines":[{"boundingBox":"372,353,262,80","words":[{"boundingBox":"372,353,262,80","text":"KEEP"}]},{"boundingBox":"341,478,317,84","words":[{"boundingBox":"341,478,317,84","text":"CALM"}]},{"boundingBox":"445,590,109,36","words":[{"boundingBox":"445,590,109,36","text":"AND"}]},{"boundingBox":"36,654,928,109","words":[{"boundingBox":"36,654,928,109","text":"System.out.println"}]},{"boundingBox":"115,778,774,114","words":[{"boundingBox":"115,778,348,114","text":"(\\"Hello"},{"boundingBox":"500,778,389,114","text":"World\\")"}]}]}]}'
Press any key to continue . . .
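The raw bytes are not very readable, so it can help to decode the JSON and walk the regions, lines and words to rebuild the text. The following is a minimal sketch rather than part of the original script: the helper names `parse_box` and `extract_lines` are made up for illustration, the sample response is shortened from the output above, and reading each `boundingBox` as left, top, width, height is my assumption about the response format.

```python
import json

# Shortened copy of the response printed above (two of the five text lines).
raw = b'{"language":"en","textAngle":0.0,"orientation":"Up","regions":[{"boundingBox":"36,353,928,539","lines":[{"boundingBox":"372,353,262,80","words":[{"boundingBox":"372,353,262,80","text":"KEEP"}]},{"boundingBox":"115,778,774,114","words":[{"boundingBox":"115,778,348,114","text":"(\\"Hello"},{"boundingBox":"500,778,389,114","text":"World\\")"}]}]}]}'


def parse_box(box):
    """Turn a "a,b,c,d" string into a tuple of four ints.

    Assumption: the four numbers are left, top, width, height.
    """
    return tuple(int(n) for n in box.split(","))


def extract_lines(raw_bytes):
    """Return a list of (text, bounding_box) pairs, one per recognized line."""
    result = json.loads(raw_bytes.decode("utf-8"))
    lines = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            # Join the words of a line with single spaces
            text = " ".join(word["text"] for word in line.get("words", []))
            lines.append((text, parse_box(line["boundingBox"])))
    return lines


ocr_lines = extract_lines(raw)
for text, box in ocr_lines:
    print(text, box)
```

In the script above, the same helper could be called as `extract_lines(data)` right after `response.read()`, assuming the request succeeded and the service returned a body of this shape.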

So overall the Project Oxford API is pretty impressive: with roughly 25 lines of code, the image has been OCR'd.