Journal of System Simulation ›› 2019, Vol. 31 ›› Issue (5): 1010-1018. doi: 10.16182/j.issn1004731x.joss.17-0163


Thai Language Names, Place Names and Organization Names Entity Recognition

Wang Hongbin1,2, Gao Hongkui1,2, Shen Qiang1,2, Xian Yantuan1,2   

  1. College of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China;
  2. Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, Kunming 650500, China
  • Received:2017-04-19 Revised:2017-07-28 Online:2019-05-08 Published:2019-11-20

Abstract: Named entity recognition in Thai aims to identify the names of persons, places, organizations or institutions, and similar entities. Because Thai word formation and grammar rules are complex, the proposed approach treats Thai named entity recognition as a sequence labeling task: each word in a Thai sentence is assigned a label. Taking the characteristics of the Thai language into account, contextual features are extracted from the samples in a Thai entity recognition corpus and used to train a hidden Markov model and a conditional random field model, respectively, so that a labeling model is built from the training corpus in each case. The labeling models are then evaluated experimentally on a test corpus. The experimental results show that both the hidden Markov model and the conditional random field model are feasible for recognizing person names, place names, and organization or institution names, and that the recognition performance is good.

Key words: named entity recognition, hidden Markov statistical model, conditional random field statistical model, sequence labeling
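
The sequence labeling formulation described in the abstract can be illustrated with a minimal sketch of a hidden Markov model tagger decoded with the Viterbi algorithm. This is not the authors' implementation: the B-PER/B-LOC/O tag set, the add-one smoothing, and the romanized placeholder tokens in `toy_corpus` are all assumptions made for illustration, and the contextual features and the conditional random field model mentioned in the abstract are not shown.

```python
from collections import Counter, defaultdict
import math

START = "<s>"  # hypothetical sentence-start pseudo-tag


def train_hmm(tagged_sentences):
    """Collect transition and emission counts from (word, tag) sequences."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag] -> count
    emit = defaultdict(Counter)   # emit[tag][word] -> count
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), vocab


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (smoothing choice is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(words, model):
    """Return the most probable tag sequence for `words` under the HMM."""
    if not words:
        return []
    trans, emit, tags, vocab = model
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 reserves mass for unseen words
    # Each column maps tag -> (best log score so far, previous tag on that path).
    columns = [{
        t: (log_prob(trans[START], t, n_tags) + log_prob(emit[t], words[0], n_words), None)
        for t in tags
    }]
    for word in words[1:]:
        column = {}
        for t in tags:
            score, prev = max(
                (columns[-1][p][0] + log_prob(trans[p], t, n_tags), p) for p in tags
            )
            column[t] = (score + log_prob(emit[t], word, n_words), prev)
        columns.append(column)
    # Backtrack from the best final tag.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


# Toy training data with romanized placeholder tokens (not from the paper's corpus).
toy_corpus = [
    [("somchai", "B-PER"), ("works", "O"), ("in", "O"), ("bangkok", "B-LOC")],
    [("malee", "B-PER"), ("visited", "O"), ("chiangmai", "B-LOC")],
    [("somchai", "B-PER"), ("visited", "O"), ("bangkok", "B-LOC")],
]
model = train_hmm(toy_corpus)
print(viterbi(["malee", "works", "in", "bangkok"], model))
```

A conditional random field would replace the generative transition and emission probabilities with weights over arbitrary contextual features of the sentence, while decoding would still use the same Viterbi-style dynamic program.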
