Commented on :
Sam
Summary :
This survey chapter gives a brief overview of what gesture recognition is and how to write a simple gesture recognizer, introducing the Rubine, Long, and $1 recognizers. Although these recognizers are simple and easy to implement, the chapter clearly shows the basic idea of sketch recognition, which is still the core idea behind much more complicated systems.
Of these three recognizers, the first two use a classifier to recognize gestures, while the $1 recognizer uses template matching. Rubine provided 13 features, which are still widely used today. As an extension of these 13 features, Long provides 22 features to slightly improve the accuracy. After calculating all the features for a given stroke, both use a linear classifier to recognize the stroke. (The major part of training is to calculate the weight for each feature i of each class label c, denoted Wci. The classifier then takes a linear combination of the feature values using these weights and returns the label c that maximizes the confidence value.)
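The classification step can be sketched in a few lines. This is only a minimal sketch: the function name, the example weights, and the example feature values are made up for illustration, and the bias term w_c0 follows the usual form of a linear classifier rather than anything stated in this summary. Training (estimating the weights) is not shown.

```python
def classify(features, weights):
    """Return (label, score) for the class with the largest linear score.

    weights maps each class label c to a list [w_c0, w_c1, ..., w_cF],
    where w_c0 is a bias term (an assumption here) and w_ci is the
    weight of feature i for class c.
    """
    best_label, best_score = None, float("-inf")
    for label, w in weights.items():
        # Linear combination: bias plus the weighted sum of feature values.
        score = w[0] + sum(wi * fi for wi, fi in zip(w[1:], features))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical weights for two classes over three features (made-up numbers).
example_weights = {
    "circle": [0.5, 1.0, -2.0, 0.1],
    "line": [-0.2, -1.0, 3.0, 0.0],
}
example_features = [0.9, 0.1, 4.0]

label, score = classify(example_features, example_weights)
print(label, score)
```

With real Rubine or Long features, the feature list would have 13 or 22 entries, and the weights would come from training data.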
The $1 recognizer uses template matching. It calculates the distance between the new gesture and every sample gesture stored in the database, and returns the sample with the smallest distance. To handle rotation, it uses the indicative angle. The $1 recognizer is so simple that it can be embedded in any application easily, which is its main advantage.
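A simplified sketch of this idea follows. It is not the full $1 algorithm: it assumes every gesture has already been resampled to the same number of points, it skips the scaling step, and it does not search for a better angle after the initial rotation. The indicative angle is taken as the angle from the centroid to the first point. All function names and the example gestures are hypothetical.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def rotate_to_zero(points):
    """Center the gesture on its centroid and rotate it so that the
    indicative angle (centroid -> first point) becomes zero."""
    cx, cy = centroid(points)
    angle = math.atan2(points[0][1] - cy, points[0][0] - cx)
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    return [((x - cx) * cos_a - (y - cy) * sin_a,
             (x - cx) * sin_a + (y - cy) * cos_a) for x, y in points]


def path_distance(a, b):
    """Average distance between corresponding points of two gestures."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def recognize(candidate, templates):
    """Return the name of the stored template closest to the candidate."""
    cand = rotate_to_zero(candidate)
    return min(templates,
               key=lambda name: path_distance(cand, rotate_to_zero(templates[name])))


# Two made-up templates, each with five points.
example_templates = {
    "line": [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)],
    "vee": [(0, 2), (1, 1), (2, 0), (3, 1), (4, 2)],
}
# The "vee" template rotated by 90 degrees and moved elsewhere on the canvas.
example_candidate = [(8, 10), (9, 11), (10, 12), (9, 13), (8, 14)]

print(recognize(example_candidate, example_templates))
```

Because both the candidate and the templates are rotated to the same reference angle before comparison, the rotated "vee" still matches its template.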
Discussion :
1. The Rubine recognizer has a good recognition rate. Long extends the feature set to 22 features. However, most of those features are actually derived from the Rubine features, so I still wonder whether they really improve the accuracy. More features also mean more computation, and accuracy does not necessarily increase with the number of features. What we should do is find the features that best distinguish the different symbol classes.
2. It is true that template matching has very high accuracy. But the most important problem is computation cost. Thus, for the $1 recognizer, the number of symbol classes should be small, and we should store only a limited number of sample gestures. Obviously the $1 recognizer cannot be applied to a large system, but the idea is excellent. I would prefer to first use a linear or non-linear classifier to get a top-N list, and then run template matching on only those N candidates, which might be faster and achieve higher accuracy. Also, the $1 recognizer rescales each stroke to a fixed length and width during preprocessing. We might instead fix the length and let the width vary, so that the aspect ratio stays unchanged.
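The two-stage idea could look roughly like the sketch below. Everything here is hypothetical: the function name, the toy scores, and the toy gestures are made up, and the classifier scores are assumed to be computed elsewhere (for example by a linear classifier over Rubine features). The distance function is passed in so that any template-matching distance can be used.

```python
import math


def two_stage_recognize(candidate, class_scores, templates, distance, top_n=3):
    """Shortlist the top_n classes by classifier score, then run
    template matching only against samples of those classes.

    class_scores: dict label -> classifier score (higher is better).
    templates:    dict label -> list of stored sample gestures.
    distance:     function(gesture_a, gesture_b) -> float, lower is better.
    """
    shortlist = sorted(class_scores, key=class_scores.get, reverse=True)[:top_n]
    best_label, best_dist = None, float("inf")
    for label in shortlist:
        for sample in templates[label]:
            d = distance(candidate, sample)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label, best_dist


def mean_point_distance(a, b):
    """Toy distance: average distance between corresponding points."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


# Made-up data: three classes, one stored sample each.
toy_templates = {
    "a": [[(0, 0), (1, 0), (2, 0)]],
    "b": [[(0, 0), (1, 1), (2, 2)]],
    "c": [[(0, 0), (0, 1), (0, 2)]],
}
toy_scores = {"a": 2.0, "b": 1.5, "c": -3.0}
toy_candidate = [(0, 0), (1, 0.9), (2, 2.1)]

print(two_stage_recognize(toy_candidate, toy_scores, toy_templates,
                          mean_point_distance, top_n=2))
```

The cost of the second stage now depends on N rather than on the total number of classes. The risk is that the first stage prunes the correct class, so N trades speed against accuracy.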
3. These three recognizers can only be applied to unistroke gestures. If a symbol is written with multiple strokes, they cannot recognize it. In particular, if we want to use template matching for multistroke symbols, the time complexity becomes exponential in the number of strokes.
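To get a feel for this growth, assume (this is an assumption for illustration, not something the chapter states) that a template matcher treats a multistroke symbol as one long stroke and has to try every stroke order and both drawing directions of each stroke. The number of variants per symbol is then n! * 2^n for n strokes, which grows at least exponentially. The function name is hypothetical.

```python
import math


def multistroke_variants(n_strokes):
    """Number of ways to order n strokes and pick a direction for each."""
    return math.factorial(n_strokes) * 2 ** n_strokes


for n in range(1, 6):
    print(n, multistroke_variants(n))
```

Under this assumption, a symbol with three strokes already has 48 variants to match against, and one with five strokes has 3840.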
4. What do you think would happen if we used the same feature set but different machine learning algorithms?
For example, assume we use the Rubine features and can choose any algorithm: linear classifier, ANN, HMM, DT, SVM, Boosting, etc. We might try to guess what results we would get. Is the algorithm we choose strongly related to the recognition accuracy?