Why Are My Drawn Bounding Boxes Inverted?
I think I am missing some really simple concept, or perhaps I don't understand the direction in which coordinates are read or drawn by PIL.ImageDraw versus the output created by pytesser.
Solution 1:
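You can also use image_to_data. It returns left, top, width and height for each box, measured from the top-left corner of the image, so you don't need to do any arithmetic on the coordinates.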
import cv2
import pytesseract
# Load the image
img = cv2.imread("cRPKk.jpg")
# Convert to gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# OCR
d = pytesseract.image_to_data(gry, output_type=pytesseract.Output.DICT)
n_boxes = len(d['level'])
for i in range(n_boxes):
    # Each entry is already in top-left-origin pixel coordinates
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
cv2.imshow("img", img)
cv2.waitKey(0)
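One thing to watch for: as far as I can tell, image_to_data returns a row for every layout level (page, block, paragraph, line, word), so the loop above can draw nested boxes. A minimal sketch of filtering down to rows that actually carry text, assuming the dictionary has the usual 'text' and 'conf' keys. The helper name word_boxes and the sample dictionary are made up for illustration:

```python
def word_boxes(data, min_conf=0):
    """Yield (x, y, w, h) for rows that have non-empty text and enough confidence."""
    for i, text in enumerate(data["text"]):
        # conf may arrive as an int, float, or string depending on the pytesseract version
        conf = float(data["conf"][i])
        if text.strip() and conf >= min_conf:
            yield (data["left"][i], data["top"][i], data["width"][i], data["height"][i])


# Hand-made stand-in for the dictionary returned by image_to_data (not real OCR output)
sample = {
    "text": ["", "Hello"],
    "conf": ["-1", "96"],
    "left": [0, 12], "top": [0, 8], "width": [200, 40], "height": [50, 14],
}
print(list(word_boxes(sample)))  # [(12, 8, 40, 14)]
```

In the loop above you would iterate over word_boxes(d) instead of range(n_boxes) and pass each tuple to cv2.rectangle as before.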
Solution 2:
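PyTesseract and PIL "scan" in different vertical directions, so the Y coordinates were incorrect: the Tesseract box coordinates are measured up from the bottom-left corner of the image, while PIL.ImageDraw measures down from the top-left corner.
As suggested by the brilliant @jasonharper: just subtract each Y value from the height of the image before using it.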
The code has been adjusted where
bottom = tess_boxes['bottom'][idx]
top = tess_boxes['top'][idx]
became
bottom = h-tess_boxes['bottom'][idx]
top = h-tess_boxes['top'][idx]
where "h" is the height of the image (w, h = input_image.size).
The result is as desired where the boxes wrap around the target characters.
Thank you @jasonharper
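For anyone who wants the flip in one place, here is a minimal sketch of the same subtraction as a helper. The name flip_box and the sample numbers are made up for illustration; it assumes the box values use a bottom-left origin, as described above:

```python
def flip_box(left, bottom, right, top, image_height):
    """Convert a box with a bottom-left origin to PIL's top-left origin.

    Returns (x0, y0, x1, y1) with y0 <= y1, the order ImageDraw.rectangle expects.
    """
    y0 = image_height - top      # upper edge, measured down from the top of the image
    y1 = image_height - bottom   # lower edge, measured down from the top of the image
    return (left, y0, right, y1)


# Hypothetical box on a 100-pixel-tall image: starts 10 px above the bottom edge, 20 px tall
print(flip_box(5, 10, 25, 30, 100))  # (5, 70, 25, 90)
```

The returned tuple can be passed straight to ImageDraw.Draw(input_image).rectangle(...). The X values are untouched because both libraries count them from the left edge.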