SCALE - Spelling City Automated List Expediter

Create spelling lists automatically on Spelling City using Tesseract OCR via the node-tesseract-ocr wrapper and Puppeteer.

You will need a few things in order to use this project:

  • You need to install Tesseract. You can follow the installation instructions here
    • Once it's installed, make sure you can run tesseract --version in a terminal and receive valid output. I created this using tesseract v5.0.0-alpha.20191030
  • You need to have Node JS installed. You can find it here
    • Just like with Tesseract, once Node is installed, make sure you can run node -v in a terminal without receiving an error. I am currently using v10.16.2
  • You will need an account on Spelling City
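
If you would rather check both command-line prerequisites in one go, here is a minimal sketch using only Node's built-in child_process module. The helper names (firstLineOfOutput, checkPrerequisites) are made up for this example and are not part of the project.

```javascript
// Sketch: check that the `tesseract` and `node` commands respond on this machine.
// The function names here are hypothetical helpers, not part of SCALE.
const { spawnSync } = require("child_process");

// Run `<command> <args>` and return the first line it prints,
// or null if the command is missing or exits with an error.
function firstLineOfOutput(command, args) {
  const result = spawnSync(command, args, { encoding: "utf8" });
  if (result.error || result.status !== 0) {
    return null;
  }
  // Some tools print version info to stderr, so look at both streams.
  const output = `${result.stdout}${result.stderr}`.trim();
  return output.split(/\r?\n/)[0] || null;
}

// Report what each prerequisite prints for its version flag.
function checkPrerequisites() {
  return {
    tesseract: firstLineOfOutput("tesseract", ["--version"]),
    node: firstLineOfOutput("node", ["-v"]),
  };
}
```

Calling checkPrerequisites() returns an object with one entry per tool. A null entry means that command could not be run, which usually points to a missing install or a PATH problem.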

You can see it working in this short video I created.

How to use

  • git clone https://github.com/caleywoods/scale.git
  • cd scale && npm install
  • cp ./credentials_example.json ./credentials.json
  • Open credentials.json and replace the placeholder username and password with your own
  • To create an example list using the default input photo, run node app.js
  • You should see a Chrome window open and quickly run through all the steps to create a new spelling list
  • The process will end and leave you at the list verification screen, where you can:
    • Verify that all words were created correctly
    • Verify that all definitions are the ones intended
    • Optionally rename the list
  • If you're satisfied with the list, you can save it or save and assign it to your child
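
A common stumbling block with the steps above is running the app before credentials.json has been edited. The sketch below shows one way to catch that early. It assumes the file contains string fields named username and password, which is a guess based on the step above, so check credentials_example.json for the real key names. The helper functions are hypothetical and not part of app.js.

```javascript
// Sketch: sanity-check credentials.json before launching the browser.
// ASSUMPTION: the file has "username" and "password" string fields.
const fs = require("fs");

const REQUIRED_KEYS = ["username", "password"];

// Return a list of human-readable problems; an empty list means it looks fine.
function findCredentialProblems(credentials, example) {
  const problems = [];
  for (const key of REQUIRED_KEYS) {
    const value = credentials[key];
    if (typeof value !== "string" || value.trim() === "") {
      problems.push(`${key} is missing or empty`);
    } else if (example && value === example[key]) {
      // Same value as the example file, so it was probably never edited.
      problems.push(`${key} still matches the example placeholder`);
    }
  }
  return problems;
}

// Load both JSON files from disk and compare them.
function checkCredentialFiles(credentialsPath, examplePath) {
  const credentials = JSON.parse(fs.readFileSync(credentialsPath, "utf8"));
  const example = JSON.parse(fs.readFileSync(examplePath, "utf8"));
  return findCredentialProblems(credentials, example);
}
```

For instance, checkCredentialFiles("./credentials.json", "./credentials_example.json") should return an empty array once both values have been replaced with your own.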

Notes

There are two test files included in the repo. input.jpg is a photo, taken with a Google Pixel 2, of a list of spelling words. The font is less than ideal, but it seems to work even without training the Tesseract engine. If you had to make do with a font like this in the long term, you would probably want to train Tesseract on more images that use the same font.

The second file, input2.PNG, is a screenshot of words from the internet in a very clear font. Ideally this is what you would want to be working with, albeit maybe with larger text. To use this second file, just provide it as the argument to tesseract.recognize() in app.js.

To use your own images, just drop them in the directory and point tesseract.recognize() at them.
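
As a rough illustration of pointing tesseract.recognize() at a different image, here is a minimal sketch built on the node-tesseract-ocr wrapper. The config values (lang, oem, psm) are assumptions, since this README does not list what app.js actually passes. The parseWordList and wordsFromImage helpers are also hypothetical, and app.js may post-process the OCR text differently.

```javascript
// Sketch: run OCR on an image and turn the result into a list of words.
// ASSUMPTION: these config values are illustrative, not the ones in app.js.
const config = { lang: "eng", oem: 1, psm: 3 };

// Turn raw OCR text into a clean list: one entry per non-empty line.
function parseWordList(ocrText) {
  return ocrText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Recognize the text in `imagePath` and return the parsed words.
async function wordsFromImage(imagePath) {
  // Required lazily so parseWordList works without the package installed.
  const tesseract = require("node-tesseract-ocr");
  const text = await tesseract.recognize(imagePath, config);
  return parseWordList(text);
}
```

With the dependencies installed, wordsFromImage("./input2.PNG").then(console.log) would print whatever words Tesseract manages to read from the screenshot.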

Useful Links

  • Tuning Tesseract OCR - A nice article that digs into some of the details about how to tweak Tesseract to make it work better with the type of input you want to give it.
  • Node Tesseract Usage - Contains a link to all of the config parameters you can send in the config portion for Tesseract within app.js.