
Extracting text from an image with OCR (Optical Character Recognition) using Node.js

Extracting text from an image with OCR (optical character recognition) can be a complex task that requires a lot of development work. Haven OnDemand’s OCR Document API removes this complexity, and it is easy to use from Node.js: install Haven OnDemand’s official node module, POST an image containing text from either a publicly facing URL or a local file to our OCR Document API, and read the result.

 

Code

Completed code

 

First, create a package.json file. Run the command below and press Enter through the prompts to accept the defaults:

 

npm init

 

Next, install the official Haven OnDemand Node.js wrapper:

 

npm install --save havenondemand

 

Next, open the file where you will write your code, require Haven OnDemand, and create a client:

 

 

var havenondemand = require('havenondemand')
var client = new havenondemand.HODClient('APIKEY')

 

 

Replace ‘APIKEY’ with your API key, which can be found here after signing up.

 

Next, you’ll call the OCR Document API by submitting either a publicly facing URL or local file:

 

 

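// To send a local file instead, comment out the URL line below and uncomment this one:
// var data = {file: '/path/to/file.jpg', mode: 'document_photo', languages: ['en']}
var data = {url: 'https://www.havenondemand.com/sample-content/images/bowers.jpg', mode: 'document_photo', languages: ['en']}
client.post('ocrdocument', data, function(err, resp, body) {
 if (err) {
  // Stop here if the request failed; there is no usable response to read
  return console.log(err)
 }
 // text_block is an array; print the first block of extracted text
 var result = resp.body.text_block[0]
 console.log(result)
})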

 

 

Note: the call above uses two optional parameters. “mode” specifies the type of image (e.g. photo of a document, scanned image of a document, photo of a scene containing text, or text superimposed on an image). “languages” specifies the languages to extract from the image and accepts multiple languages in an array. If not specified, “mode” defaults to “document_photo” (i.e. photo of a document) and “languages” defaults to “en” (i.e. English). For a full list of supported media formats, see here. For a full list of languages supported by the OCR Document API, see here.
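To make the defaults described above explicit, here is a minimal sketch of a helper that builds the request parameters. The name buildOcrParams is hypothetical and not part of the havenondemand module. The mode names other than “document_photo” ('document_scan', 'scene_photo', 'subtitle') and the 'de' language code in the usage note are assumptions; check them against the API documentation before relying on them.

```javascript
// Hypothetical helper, not part of the havenondemand module.
// Only 'document_photo' is confirmed by this post; the other mode names are assumptions.
var KNOWN_MODES = ['document_photo', 'document_scan', 'scene_photo', 'subtitle']

function buildOcrParams(source, options) {
  options = options || {}
  var params = {}

  // Treat anything starting with http:// or https:// as a public URL,
  // and everything else as a path to a local file.
  if (/^https?:\/\//i.test(source)) {
    params.url = source
  } else {
    params.file = source
  }

  // Apply the same defaults the API uses when the parameters are omitted.
  params.mode = options.mode || 'document_photo'
  if (KNOWN_MODES.indexOf(params.mode) === -1) {
    throw new Error('Unknown OCR mode: ' + params.mode)
  }

  // Accept a single language code or an array of codes.
  var languages = options.languages || ['en']
  params.languages = Array.isArray(languages) ? languages : [languages]

  return params
}
```

The returned object can be passed as the second argument to client.post('ocrdocument', ...), for example buildOcrParams('/path/to/file.jpg', {languages: ['en', 'de']}).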

 

When you run the file, it prints the first text block of the API response, which contains the extracted text along with the block’s left, top, width, and height values. It will look like this:

 

 

{ text: 'The Life and Work of\nFredson Bowers\nby\nG. THOMAS TANSELLE\nN EVERY FIELD or ENDEAVOR mm: ARE A FEW FIGURES wnosn ACCOM-\nplishment and influence cause them to be the symbols of their age;\ntheir careers and oeuvres become the touchstones by which the\nfield is measured and its history told. In the related pursuits of\nanalytical and descriptive bibliography, textual criticism, and scholarly\nediting, Fredson Bowers was such a figure, dominating the four decades\nafter 1949, when his Principles of Bibliographical Description was pub-\nlished. By 1973 the period was already being called "the age of Bowers":\nin that year Norman Sanders, writing the chapter on textual scholarship\nfor Stanley Wells\'s Shakespeare: Select Bi bliogmphies, gave this title to\na section of his essay. For most people, it would be achievement enough\nto rise to such a position in a field as complex as Shakespearean textual\nstudies; but Bowers played an equally important role in other areas.\nEditors of nineteenth-century American authors, for example, would\nalso have to call the recent past "the age of Bowers," as would the writers\nof descriptive bibliographies of authors and presses. His ubiquity in\nthe broad field of bibliographical and textual study, his seemingly com-\nplete possession of it, distinguished him from his illustrious predeces-\nsors and made him the personification of bibliographical scholarship in\nhis time.\nWhen in 1969 Bowers was awarded the Gold Medal of the Biblio•\ngraphical Society in London, John Carter\'s citation referred to the\nPrinciples as "majestic," called Bowers\'s current projects "formidable,"\nsaid that he had "imposed critical discipline" on the texts of several\nauthors, described Studies in Bibliography as a "great and continuing\nachievement," and included among his characteristics "uncompromising\nseriousness of purpose" and "professional intensity." Bowers was not\nunaccustomed to such encomia, but he had also experienced his share of\nattacks: his scholarly positions were not universally popular, and he\nexpressed them with an aggressiveness that almost seemed calculated to',
 left: 53,
 top: 157,
 width: 563,
 height: 817 }
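The sample output shows that text_block is an array and that the OCR text keeps the line breaks of the original page, including words split by a hyphen at the end of a line (for example “pub-\nlished”). Here is a minimal sketch of one way to collect the text from every block and rejoin those split words. The function name collectText is hypothetical, and the sketch assumes every block has the same shape as the one shown above.

```javascript
// Hypothetical helper: gathers the text of every block in an OCR response body.
function collectText(body) {
  // Fall back to an empty list if the response has no text blocks.
  var blocks = (body && body.text_block) || []

  return blocks
    .map(function (block) { return block.text || '' })
    .join('\n')
    // Rejoin words split by a hyphen at a line end: "pub-\nlished" becomes "published".
    // This is a heuristic: a genuinely hyphenated word broken at a line end loses its hyphen.
    .replace(/([A-Za-z])-\n([a-z])/g, '$1$2')
}

// Example with a shortened block in the same shape as the sample output.
var sample = {
  text_block: [
    { text: 'was pub-\nlished. By 1973', left: 53, top: 157, width: 563, height: 817 }
  ]
}
console.log(collectText(sample)) // prints "was published. By 1973"
```

Inside the client.post callback, this could be called as collectText(resp.body) instead of reading only text_block[0].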

 

 
