|
- 2017
Fast Image Embedded Chinese Text Extracting by Homogeneous Space Mapping
Keywords: Chinese Text Embedded Image, Homogeneous Mapping, Text Extraction, Information Security
Abstract: Text-embedded images are widely used on the mobile Internet to spread malicious information. This paper proposes a fast algorithm, based on homogeneous space mapping, for extracting Chinese text from text-embedded images. Image enhancement functions are first applied to highlight the edge and texture features of an image. In the enhanced image, the Sobel operator extracts the edge features, and a wavelet packet transform extracts 24-dimensional texture feature vectors. Together, the texture and edge features describe the homogeneity of the image and are used to construct its homogeneous feature map. Because text and non-text regions differ in homogeneity, this map is used to distinguish them and to further reduce the non-text regions, so that the text regions are highlighted. Homogeneous text samples are then used to train the text region detector, which greatly reduces the computational complexity. Finally, the characters are segmented and recognized. Experiments were conducted to verify the validity and practicability of the proposed algorithm. The recognition rate reaches 86%, which is higher than that of other methods used in industry. The algorithm has also been verified on an operator's malicious information monitoring system, where it provides secure filtering of malicious information.
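As an illustration of the edge-feature step only, the following is a minimal sketch of Sobel gradient extraction on a grayscale image stored as a list of rows. It is not the authors' implementation: the function names, the border handling, and the `threshold` parameter are assumptions made for this example, and the image enhancement, wavelet packet texture features, and homogeneous mapping stages are not shown.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a grayscale image.

    `image` is a list of equal-length rows of numbers. Border pixels,
    where the 3x3 window does not fit, are left at 0.0 (an assumption
    of this sketch, not something the abstract specifies).
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


def edge_mask(image, threshold):
    """Mark pixels whose gradient magnitude exceeds `threshold`.

    `threshold` is a hypothetical tuning parameter for this example.
    """
    return [[m > threshold for m in row] for row in sobel_magnitude(image)]
```

Calling `edge_mask(image, threshold)` on an image with a sharp vertical stroke marks the pixels along the stroke boundary, which is the kind of edge evidence the abstract combines with texture features to build the homogeneous feature map.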
|