Abstract: This paper is concerned with
the problem of establishing an index based on word matching. It is
assumed that the book was digitised as well as possible and that some
pre-processing techniques, such as line orientation correction and some
noise removal, were already applied. However, two main factors make it
impossible to apply ordinary optical character recognition (OCR)
techniques: the presence of antique fonts and the degraded state of
many characters due to irreversible degradation of the original over
time. In this paper we give a short introduction to word segmentation,
which involves finding the lines that characterise a word. We then
discuss different approaches to word matching and how they can be
combined to obtain an ordered list of candidate words for the
matching. This discussion is illustrated by examples.
Keywords: Word matching, Pattern Recognition, Textual Image Analysis,
Classification, Image Processing, Mathematical Morphology,
Document Mining.
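The abstract mentions combining several word-matching approaches into one ordered list of candidate words. The abstract does not say how the combination is done, so the following is only a minimal sketch that assumes a Borda count as the fusion rule. The function name `combine_rankings`, the matcher names and the candidate labels are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def combine_rankings(rankings):
    """Fuse several best-first candidate lists with a Borda count.

    Assumption: each matcher returns its candidates ordered from best
    to worst. This is an illustrative fusion rule, not the paper's.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, word in enumerate(ranking):
            # The best candidate earns n points, the worst earns 1.
            scores[word] += n - position
    # Highest total first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))


# Hypothetical outputs of three matchers for the same query word image.
matcher_a = ["word_1", "word_2", "word_3"]
matcher_b = ["word_2", "word_1", "word_3"]
matcher_c = ["word_1", "word_3", "word_2"]

ranked = combine_rankings([matcher_a, matcher_b, matcher_c])
print(ranked)
```

A rank-based rule is used here because it needs no calibration between matchers whose raw scores live on different scales. A weighted sum of normalised scores would be an equally plausible alternative.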
Cited by:
º T. Konidaris, B. Gatos, K. Ntzios, I. Pratikakis, S. Theodoridis and
S. J. Perantonis, "Keyword-guided Word Spotting in Historical Printed
Documents using Synthetic Data and User Feedback", in Int. Journal on
Document Analysis and Recognition, Vol. 9, Numbers 2-4, pp. 167-177,
Springer-Verlag, April, 2007.
º Basilis Gatos, Thomas Konidaris, Ioannis Pratikakis and Stavros J.
Perantonis, "A Holistic Methodology for Keyword Search in Historical
Typewritten Documents", in Advances in Artificial Intelligence, LNCS,
Springer, Vol. 3955, pp. 490-493, 2006.
º B. Gatos, I. Pratikakis, S.J. Perantonis, "Hybrid Off-Line Cursive
Handwriting Word Recognition", in 18th International Conference on
Pattern Recognition (ICPR'06), pp. 998-1002, 2006.
º B. Gatos, I. Pratikakis, A.L.
Kesidis, S.J. Perantonis, "Efficient Off-Line Cursive Handwriting Word
Recognition", in IWFHR10, Int. Workshop on Frontiers in Handwriting
Recognition, La Baule, France, Oct. 23-26, 2006.
º W.T. Chan, Y. Zhang, S.P.Y. Fung, D. Ye, and H. Zhu, "Efficient
Algorithms for Finding a Longest Common Increasing Subsequence", in
Proc. of the 16th Annual International Symposium on Algorithms and
Computation (ISAAC 2005), Sanya, Hainan, China, 19-21 Dec 2005.
º B. Gatos, T. Konidaris, K.
Ntzios, I. Pratikakis, S.J. Perantonis, "A Segmentation-free Approach
for Keyword Search in Historical Typewritten Documents", in Eighth
International Conference on Document Analysis and Recognition
(ICDAR'05), pp. 54-58, 2005.
Related Works:
31. Map Segmentation by Colour Cube Genetic K-Mean Clustering.
55. Exploiting and Evolving Rn Mathematical Morphology Feature Spaces.
51. Evolving a Stigmergic Self-Organized Data-Mining.
53. Swarming around Shellfish Larvae Images.
29. Artificial Ant Colonies in Digital Image Habitats - A Mass Behaviour Effect Study on Pattern Recognition.
63. Social Cognitive Maps, Swarm Collective Perception and Distributed Search on Dynamic Landscapes.
45. Swarms on Continuous Data.