Kumar, Atul
- Automatic Text Correction for Devanagari OCR
Authors
Affiliations
1 Department of Computer Science, Punjabi University, Patiala – 147002, Punjab, IN
Source
Indian Journal of Science and Technology, Vol 9, No 45 (2016)
Abstract
Objectives: This paper proposes a new technique, based on a confusion matrix, for correcting errors made by a Devanagari OCR (Optical Character Reader) system. Methods/Statistical Analysis: The confusion matrix is generated from a large Hindi corpus. The system takes each word of the OCR output and generates a number of candidate strings from the five most-confused characters for each character of the input word, along with the probability of each string for ranking. Each string is validated against a character trigram dictionary, and the valid strings are used for the best suggestions. Findings: The top five words are taken as suggestions. The system has been tested on a variety of Devanagari OCR output documents. The system provides the suggestions for all correct words at the top position. For more than 10,000 unique words in Devanagari OCR output, the system achieves an accuracy of 97%. Application/Improvements: This system is used in post-processing of Devanagari OCR. With some improvements, the system can also be used for the Gurumukhi and Urdu scripts.
Keywords
Automatic Text Correction, Confusion Matrix, Devanagari, OCR, Trigram.
- Binarization of Degraded Documents using Local Thresholding based on Moving Averages
Authors
Affiliations
1 Department of Computer Science, Punjabi University, Patiala, Punjab, IN
Source
Indian Journal of Science and Technology, Vol 9, No 32 (2016)
Abstract
Objectives: Image binarization converts colour and grayscale images into black-and-white images, with black as foreground and white as background. This paper presents a new technique for binarization of degraded documents using moving averages. Methods/Statistical Analysis: Binarization of highly degraded images is a very challenging task. In the proposed technique, moving averages are calculated horizontally and vertically after setting the parameter values. A local threshold value is then calculated for every pixel, based on the moving averages within a local rectangular block, both horizontally and vertically. Findings: The proposed method has been tested on a large number of degraded documents and compared with existing techniques. The proposed technique makes use of the intensity of neighbouring pixels both upward and from left to right. The proposed method produces better binarization results than the existing Sauvola and Niblack methods. Applications: The proposed method produces a binarized image that can be used in applications such as optical character recognition and document layout analysis.
Keywords
Binarization, Degraded Documents, Grayscale, Local Thresholding, Moving Averages.
- Speaker Adaptation on Hidden Markov Model using MFCC and Rasta-PLP and Comparative Study
Authors
Affiliations
1 KIIT College of Engineering, Gurgaon - 122102, Haryana, IN
2 Ansal University, Gurgaon - 122003, Haryana, IN