Towards Automated Transcription of Label Text
from Pinned Insect Collections
Nitin Agarwal¹
Nicola Ferrier²
Mark Hereld²
¹Interactive Graphics & Visualization Lab
²Argonne National Laboratory




Abstract

We present a computer vision system that transcribes the text on tiny printed labels stacked beneath pinned insects, as found in museum collections. Because the labels are often occluded by the pin, the insect specimen, and other labels, the approach uses multiple views of each label. It handles both the occlusion and the extreme viewing angles required to image stacked labels. Automated image analysis identifies the lines of text and then aligns and rectifies the images. Combining the aligned and rectified images from multiple viewpoints yields a composite image that optical character recognition (OCR) tools can read to extract the text. We demonstrate the approach experimentally on both museum specimens and test labels.
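
The compositing step described above can be illustrated with a small sketch. It is not the method from the paper, whose fusion rule is not given on this page. It assumes the views have already been aligned and rectified to a common pixel grid, that a per-view visibility mask is available, and that a per-pixel median over the unoccluded samples is an acceptable way to combine them. The function name `composite` and the list-of-lists image representation are hypothetical choices made for illustration.

```python
from statistics import median


def composite(views, masks):
    """Fuse aligned, rectified grayscale views into one image.

    views: list of 2-D lists (rows of pixel intensities), all the same size.
    masks: parallel list of 2-D lists of booleans; True marks a pixel that is
           visible (not hidden by the pin, the specimen, or another label).
    Returns a 2-D list; pixels visible in no view are set to None.

    Assumption: a per-pixel median is used here only as an example fusion
    rule; the paper may combine views differently.
    """
    if not views or len(views) != len(masks):
        raise ValueError("need at least one view and one mask per view")
    height, width = len(views[0]), len(views[0][0])
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            # Collect this pixel only from the views in which it is visible.
            samples = [v[r][c] for v, m in zip(views, masks) if m[r][c]]
            row.append(median(samples) if samples else None)
        out.append(row)
    return out


# Three 2x2 views of the same label; False marks an occluded pixel.
views = [
    [[10, 200], [30, 40]],
    [[12, 0], [30, 44]],
    [[11, 202], [0, 42]],
]
masks = [
    [[True, True], [True, True]],
    [[True, False], [True, True]],
    [[True, True], [False, True]],
]
fused = composite(views, masks)
```

In this toy input every pixel is visible in at least two views, so the occluded samples (the zeros) never reach the fused image. A composite of this kind would then be passed to an OCR tool.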

Code (with demo)


Code


Paper

N. Agarwal, N. Ferrier, M. Hereld
Towards Automated Transcription of Label Text from Pinned Insect Collections
Winter Conference on Applications of Computer Vision (WACV), 2018 [Paper]
Supplementary Document
[Bibtex]



Example Results





Related Work

M. Hereld, N. Ferrier, N. Agarwal, P. Sierwald. Designing a high-throughput pipeline for digitizing pinned insects. In eScience, 2017. [PDF]



