Description
% This function takes a camera image of a page of Thai text in a
  % document layout and processes it into a clean document image.
  % The camera image may:
  % – be an RGB image
  % – contain noise
  % – contain regions that are not text (e.g. background that’s not on the page)
  % – be rotated
  % – have different lighting conditions
% First convert the image into a grayscale image
% Use 1D region labelling to find the number of characters
  % and the horizontal location of each character.
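The 1D region labelling step can be illustrated with a short Python sketch (not the original MATLAB code). It assumes the image is already binary with text as 1, and the helper names `column_profile` and `label_runs_1d` are hypothetical. Each run of non-empty columns is treated as one character.

```python
from typing import List, Tuple


def column_profile(binary: List[List[int]]) -> List[int]:
    """Count foreground (1) pixels in each column of a binary image."""
    if not binary:
        return []
    return [sum(column) for column in zip(*binary)]


def label_runs_1d(profile: List[int]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) column indices of each non-empty run."""
    runs = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a new run begins here
        elif count == 0 and start is not None:
            runs.append((start, x - 1))    # the run ended at the previous column
            start = None
    if start is not None:                  # a run that touches the right edge
        runs.append((start, len(profile) - 1))
    return runs


image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 0],
]
runs = label_runs_1d(column_profile(image))
print(len(runs), runs)
```

The number of runs gives the character count and each pair gives a horizontal extent. Thai vowel and tone marks stacked above or below a consonant share its columns, so under this scheme they fall into the same run as the base character.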
% Threshold the image using locally adaptive thresholding
% Invert the binary image so that the text becomes the foreground
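A minimal Python sketch of locally adaptive thresholding with the inversion folded in. The original does not say which adaptive method it uses, so the local-mean rule, the `radius` and `offset` parameters, and the function name are all assumptions made for illustration.

```python
def adaptive_threshold_inverted(gray, radius=1, offset=15):
    """Mark a pixel as text (1) when it is darker than its local mean by `offset`.

    `gray` is a list of rows of intensities (0 = black, 255 = white).
    Dark text maps straight to 1, so the result is already inverted.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Window is clipped at the image border.
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] < local_mean - offset else 0
    return out


# Brightness rises from left to right; dark strokes sit at columns 2 and 6.
row = [40, 60, 20, 100, 120, 140, 100, 180, 200]
gray = [row[:] for _ in range(3)]
binary = adaptive_threshold_inverted(gray)
print(binary[0])
```

In this example a single global threshold of 110 would also mark the dim background columns 0, 1 and 3 as text, while the local rule picks out only the two strokes.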
% Remove unwanted background that’s not text
  % Do this by region labelling. Remove regions whose area is larger
  % than an upper threshold (assume they are not text)
  % Remove regions whose area is smaller than a lower threshold
  % (assume these are noise)
% The thresholds are +/- the standard deviation of the region areas
% The resulting images are combined with a logical AND to remove unwanted artifacts
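The area filter can be sketched in Python as follows. It assumes the thresholds are the mean region area plus and minus one standard deviation, which is one reading of the description, and that 8-connectivity is used. The function names are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def label_regions(binary):
    """Return the pixel list of every 8-connected foreground region."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r in range(height):
        for c in range(width):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:                       # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append(pixels)
    return regions


def clean_by_area(binary):
    """Drop regions outside mean +/- one std of area (assumed thresholds)."""
    regions = label_regions(binary)
    if not regions:
        return [row[:] for row in binary]
    areas = [len(pixels) for pixels in regions]
    mu, sigma = mean(areas), pstdev(areas)
    height, width = len(binary), len(binary[0])
    not_large = [[0] * width for _ in range(height)]
    not_small = [[0] * width for _ in range(height)]
    for pixels in regions:
        for y, x in pixels:
            if len(pixels) <= mu + sigma:
                not_large[y][x] = 1            # survives the "too big" test
            if len(pixels) >= mu - sigma:
                not_small[y][x] = 1            # survives the "too small" test
    # Logical AND of the two masks keeps only regions that pass both tests.
    return [[a & b for a, b in zip(ra, rb)]
            for ra, rb in zip(not_large, not_small)]


def stamp(img, top, left, size):
    """Draw a filled size-by-size square (test helper)."""
    for y in range(top, top + size):
        for x in range(left, left + size):
            img[y][x] = 1


page = [[0] * 18 for _ in range(8)]
for left in (0, 3, 6, 9, 12, 15):
    stamp(page, 0, left, 2)    # six character-sized blobs, area 4 each
stamp(page, 4, 0, 1)           # one speck of noise, area 1
stamp(page, 4, 4, 3)           # one oversized blob, area 9
cleaned = clean_by_area(page)
print(sum(map(sum, cleaned)))
```

When the areas are widely spread, the lower threshold (mean minus one standard deviation) can drop to zero or below, in which case no small regions are removed. That is a limitation of this particular assumption rather than of the approach described above.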
% Rotate the image to the correct orientation
  % Use the Hough transform to find the angle of rotation
% Only keep lines that are long enough to be considered:
  % more than half the length of the longest line.
  % This removes lines that correspond to small details of the
  % structure of Thai characters, which produce
  % weird/unwanted angles (e.g. 45 degrees and -45 degrees appear often
  % even with a perfectly aligned/rotated document).
% Find the mean, mode, and median of the angles for reference.
% Use the mode value for rotation (concluded from running the script on
  % many samples)
  % The rotation angle must be adjusted to make sure the image rotates
  % correctly.
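The line filtering and angle statistics can be sketched in Python as below. The sketch assumes the Hough line detection has already been done elsewhere and delivers hypothetical `(angle_degrees, length)` pairs. Grouping angles into `bin_width` bins before taking the mode is an added assumption so that near-equal floating-point angles count together.

```python
from statistics import mean, median, multimode


def dominant_angle(lines, bin_width=1.0):
    """Summarise line angles after discarding short lines.

    `lines` is a list of (angle_degrees, length) pairs from a line detector.
    Only lines longer than half the longest line are kept.
    """
    if not lines:
        raise ValueError("no lines detected")
    longest = max(length for _, length in lines)
    kept = [angle for angle, length in lines if length > longest / 2]
    # Quantise the angles so that the mode is meaningful for floats.
    binned = [round(angle / bin_width) * bin_width for angle in kept]
    return {
        "kept": kept,
        "mean": mean(kept),
        "median": median(kept),
        "mode": multimode(binned)[0],   # first mode if several tie
    }


lines = [
    (2.0, 300), (2.0, 280), (3.0, 310), (2.0, 200),   # long text baselines
    (45.0, 40), (-45.0, 35), (-45.0, 30),             # short character strokes
]
stats = dominant_angle(lines)
print(stats["mean"], stats["median"], stats["mode"])
```

The short 45 and -45 degree segments are dropped by the length test before any statistic is computed. How the chosen angle is then converted into a rotation depends on the angle convention of the line detector and the rotation routine, which this sketch does not cover.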
% Find the areas where the sentences are and clean up noise
  % First remove any regions that have an area more than 1 std above
  % the mean and an extent more than 1 std above the mean.
% Next find the bounding box for the text, assuming the text is written
  % in a document style with margins around the text box.
  % Interpolate the cumulative pixel count
  % to find the edges of the bounding box, and remove any noise outside
  % the box.
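The description does not spell out the interpolation, so the Python sketch below shows only one possible reading: accumulate the foreground pixel count along an axis and linearly interpolate the positions where the running total crosses a small lower fraction and a large upper fraction of the total. The fractions 0.02 and 0.98 and the function names are assumptions.

```python
import math
from itertools import accumulate


def crossing(cumulative, target):
    """Interpolated index at which the running total reaches `target` (> 0)."""
    previous = 0
    for index, value in enumerate(cumulative):
        if value >= target:
            # Linear interpolation between (index - 1, previous) and (index, value).
            return (index - 1) + (target - previous) / (value - previous)
        previous = value
    return float(len(cumulative) - 1)


def text_bounds(profile, lower=0.02, upper=0.98):
    """Return inclusive (first, last) indices that enclose the bulk of the pixels.

    `profile` holds the foreground pixel count per row or per column.
    Returns None when the profile is empty of foreground pixels.
    """
    total = sum(profile)
    if total == 0:
        return None
    cumulative = list(accumulate(profile))
    first = crossing(cumulative, lower * total)
    last = crossing(cumulative, upper * total)
    return max(0, math.floor(first)), min(len(profile) - 1, math.ceil(last))


# Text occupies indices 5..8; single stray pixels sit in the margins at 2 and 12.
profile = [0, 0, 1, 0, 0, 20, 30, 25, 22, 0, 0, 0, 1, 0]
bounds = text_bounds(profile)
print(bounds)
```

Running the same function on the row profile and the column profile gives the four edges of the box; pixels outside it can then be set to background. In this example the stray pixels at indices 2 and 12 fall outside the returned range.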
% Then resize the image to the original size
% Do a final noise clean-up and smoothing of the text by image erosion
  % followed by dilation (a morphological opening filter).
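Morphological opening is erosion followed by dilation with the same structuring element. The Python sketch below uses a 3x3 square element and treats pixels outside the image as background; both choices are assumptions, since the original does not state the element it uses.

```python
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _is_set(img, y, x):
    """True if (y, x) is inside the image and is a foreground pixel."""
    return 0 <= y < len(img) and 0 <= x < len(img[0]) and img[y][x] == 1


def erode(img):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [
        [1 if all(_is_set(img, y + dy, x + dx) for dy, dx in OFFSETS) else 0
         for x in range(len(row))]
        for y, row in enumerate(img)
    ]


def dilate(img):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return [
        [1 if any(_is_set(img, y + dy, x + dx) for dy, dx in OFFSETS) else 0
         for x in range(len(row))]
        for y, row in enumerate(img)
    ]


def opening(img):
    """Erosion followed by dilation: removes specks smaller than the element."""
    return dilate(erode(img))


image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],   # the lone pixel at column 5 is noise
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
opened = opening(image)
for row in opened:
    print(row)
```

The 3x3 block survives unchanged while the isolated pixel disappears. Strokes thinner than the structuring element would also be removed, so the element size has to stay below the stroke width of the text.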
% Separate out the sentences (OPTIONAL: useful for noisy images;
  % not needed if the image is not noisy)
https://stackoverflow.com/questions/28935983/preprocessing-image-for-tesseract-ocr-with-opencv
 
 

