
Friday, November 19, 2010

Reading #17: Distinguishing Text from Graphics in On-line Handwritten Ink

Comment:
Chris
Summary:


Another paper on shape vs. text, but different from the previous text vs. shape papers we've read. The most important difference is that this paper actually uses context information: the authors build an HMM over the whole sequence of strokes. Besides individual stroke features, they use temporal information and gap information for the stroke sequence.

1. Independent stroke model. For each stroke, a feature vector of eleven values is extracted. The model needs training samples. In the testing phase, for a given stroke s, they calculate the probability that it is text, which can be written as P(Text | Stroke s). So, given an input stroke, we can simply calculate this probability using the trained model. The independent stroke model looks at each stroke in isolation, and in some cases this causes many errors, so the authors propose an improvement on it.
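To make the train-then-score idea concrete, here is a minimal sketch. The summary above does not say which classifier or which eleven features the paper uses, so logistic regression and the two toy features (normalized stroke length and curvature) are my own stand-ins, and the training data is made up. It is only meant to show how a trained model turns one stroke's features into P(Text | Stroke s).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_stroke_model(features, labels, lr=1.0, epochs=200):
    """Fit logistic-regression weights with batch gradient descent.

    features: list of feature vectors, one per training stroke
    labels:   1 for a text stroke, 0 for a shape stroke
    """
    n = len(features)
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Prediction error for this stroke under the current weights.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def p_text(stroke, w, b):
    """P(Text | Stroke s) for one stroke, looked at in isolation."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, stroke)) + b)


# Made-up training strokes: (normalized length, curvature). Hypothetical
# features, not the paper's eleven.
TRAIN_X = [[0.2, 0.9], [0.3, 0.8], [0.1, 0.7],   # short, curvy: text
           [0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]   # long, straight: shape
TRAIN_Y = [1, 1, 1, 0, 0, 0]

w, b = train_stroke_model(TRAIN_X, TRAIN_Y)
```

Calling p_text([0.2, 0.8], w, b) then scores a new short, curvy stroke on its own, without looking at its neighbours, which is exactly the limitation the next step addresses.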

2. Hidden Markov model. The authors also observe that the state transition probabilities carry valuable information, which serves as context for a given stroke sequence. In this case there are two states, text and shape, and four transitions between them: text to text, text to shape, shape to shape, and shape to text. The shape vs. text problem is really the problem of assigning a state to each stroke, so it can easily be modeled as a hidden Markov process. The hidden layer is the sequence of states, and the observation layer is the features we actually calculate. So the problem becomes finding the state sequence that maximizes the HMM probability, which combines the transition probabilities with the probability of the observed features given the states.




However, P(X | T) cannot be calculated directly from the models, so they use Bayes' rule to obtain it.
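Here is a rough sketch of how the two-state decoding could look, assuming the per-stroke model already gives P(text | stroke) for every stroke. The function name, the class prior, and the single "stay in the same state" probability are all my own illustrative choices, not values from the paper. The emission term uses Bayes' rule in the form P(x | t) proportional to P(t | x) / P(t), and the best state sequence is found with the standard Viterbi algorithm in log space.

```python
import math

STATES = ("text", "shape")


def viterbi_labels(p_text_per_stroke, prior_text=0.5, p_stay=0.8):
    """Return the most probable text/shape label for each stroke.

    p_text_per_stroke: P(text | stroke) from an independent stroke model
    prior_text:        assumed class prior P(text), must be in (0, 1)
    p_stay:            assumed probability of keeping the same state, in (0, 1)
    """
    if not p_text_per_stroke:
        return []
    prior = {"text": prior_text, "shape": 1.0 - prior_text}
    # The four transitions: text->text, text->shape, shape->shape, shape->text.
    trans = {a: {b: (p_stay if a == b else 1.0 - p_stay) for b in STATES}
             for a in STATES}

    def log_emission(state, p):
        # Bayes' rule: P(x | t) is proportional to P(t | x) / P(t).
        posterior = p if state == "text" else 1.0 - p
        posterior = min(max(posterior, 1e-12), 1.0)  # avoid log(0)
        return math.log(posterior) - math.log(prior[state])

    score = {s: math.log(prior[s]) + log_emission(s, p_text_per_stroke[0])
             for s in STATES}
    backpointers = []
    for p in p_text_per_stroke[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES,
                            key=lambda r: score[r] + math.log(trans[r][s]))
            pointers[s] = best_prev
            new_score[s] = (score[best_prev] + math.log(trans[best_prev][s])
                            + log_emission(s, p))
        backpointers.append(pointers)
        score = new_score

    # Trace the best path back from the last stroke to the first.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

For example, with probabilities [0.9, 0.8, 0.45, 0.9, 0.85], thresholding each stroke at 0.5 would call the third stroke a shape, while the decoder keeps it as text because switching state twice costs more than the weak evidence is worth. That is the kind of context effect the HMM adds over the independent stroke model.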

3. Bi-partite HMM. In this step, they use the gap information between two consecutive strokes. To characterize the gap, they choose 5 features, and the training and modeling process is similar to the first step for the individual stroke model. After that, they simply incorporate this information into the HMM from step 2. The incorporation is very straightforward.
The results show that their approach achieves very good accuracy for shape vs. text. Step 2 in particular improves the accuracy a lot, while step 3 does not change the accuracy rate very much.

Discussion:


What a great paper!!! I really like it, and it gave me so much valuable information!! First, it considers context information for the shape vs. text task. This is a great improvement over the traditional methods that rely only on individual stroke features. Second, they build an HMM for the whole stroke sequence, which is a purely probabilistic approach and gives pretty good results. Third, it provides an easy way to add other context information to the HMM that is already built. Besides the gap context information, we can further incorporate other context information very easily. The beautiful idea of this paper can be very useful for other research. The one disadvantage I can see is that people will not always draw the text in its natural order. Because the HMM depends heavily on temporal order, once people violate this order, the method will tend to get lower accuracy.

1 comment:

  1. Also, the HMM is based on the initial probability distribution for each of its components, which makes it rely heavily on the domain data. I'm not sure how feasible this kind of system is for online recognition, because it takes time to train the system to an appropriate level of recognition accuracy.
