
Sunday, September 5, 2010

Reading #5: Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes (Wobbrock)

Comment :
Sam


Summary :

This paper introduces the $1 recognizer developed by Dr. Wobbrock. The $1 recognizer is simple enough to be integrated into almost any system without trouble, and it requires only about a hundred lines of code, which is its major feature. It handles only single strokes, and a new stroke class can easily be added to the existing stroke set. It does not rely on complicated trained models; it uses template matching instead, which means no training is required. It uses a distance measure to calculate the similarity of two strokes. It is easy to understand and easy to implement, which is why the $1 recognizer can be used even without any prior knowledge of AI or gesture recognition.
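To make the template-matching idea concrete, here is a minimal sketch of nearest-template classification using an average point-to-point distance. The function names `path_distance` and `classify` and the toy templates are my own choices for illustration, not taken from the paper's code, and the sketch assumes every stroke already has the same number of points.

```python
import math

def path_distance(a, b):
    """Average Euclidean distance between corresponding points of two strokes."""
    if len(a) != len(b):
        raise ValueError("strokes must have the same number of points")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def classify(candidate, templates):
    """Return (label, distance) for the stored template closest to the candidate."""
    best_label, best_dist = None, math.inf
    for label, points in templates:
        d = path_distance(candidate, points)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist

# Hypothetical toy templates: adding a new class is just appending one more entry.
templates = [
    ("horizontal", [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]),
    ("vertical", [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]),
]
label, dist = classify([(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)], templates)
print(label, dist)
```

Nothing is trained here: the "model" is just the list of stored examples, which is why a new gesture class can be added at any time.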

There are four steps in recognition: (1) resample, (2) rotate, (3) scale and translate, and (4) find the optimal angle and the best score. In the resample step, he uses a fixed number of points per stroke, with equal distance between neighboring points. In the rotate step, he uses the "indicative angle", which is the angle formed between the centroid of the gesture and the gesture's first point. This step removes most of the variance in angle. In the scale and translate step, he scales the stroke to a fixed size bounded by a square with a fixed side length, then moves the gesture so that its centroid is at (0,0). After the first three steps, there is still no guarantee that the two strokes being compared are aligned at the best angle. Thus, in the fourth step, he further searches for the angle that minimizes the distance between the two strokes. Finally, the template with the minimum distance over all comparisons is selected, and its class label is output.
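The four steps can be sketched roughly as follows. This is my own minimal reading of the description above, not the authors' code. The defaults (64 points, a 250-unit square, a search over ±45° down to 2°) and the golden-section search are what I recall from the published pseudocode, so treat them as assumptions. The rotation sign convention and the guard for flat strokes are my choices, and all function names are hypothetical.

```python
import math

PHI = 0.5 * (-1.0 + math.sqrt(5.0))  # golden ratio constant, about 0.618

def resample(pts, n):
    """Step 1: return n points spaced equally along the stroke's path (n >= 2)."""
    pts = list(pts)
    total = sum(math.dist(pts[i - 1], pts[i]) for i in range(1, len(pts)))
    interval, acc, out, i = total / (n - 1), 0.0, [pts[0]], 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and acc + d >= interval:
            t = (interval - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # q becomes the start of the next segment
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:  # rounding can leave the last point off
        out.append(pts[-1])
    return out[:n]

def centroid(pts):
    return (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))

def rotate_by(pts, angle):
    """Rotate points around their centroid by angle (radians)."""
    cx, cy = centroid(pts)
    c, s = math.cos(angle), math.sin(angle)
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in pts]

def rotate_to_zero(pts):
    """Step 2: rotate so the indicative angle (centroid to first point) is 0."""
    cx, cy = centroid(pts)
    return rotate_by(pts, -math.atan2(pts[0][1] - cy, pts[0][0] - cx))

def scale_and_translate(pts, size):
    """Step 3: scale the bounding box to size x size, then centre on (0, 0)."""
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    w = (max(xs) - min(xs)) or 1.0  # guard for flat strokes (my assumption)
    h = (max(ys) - min(ys)) or 1.0
    scaled = [(x * size / w, y * size / h) for x, y in pts]
    cx, cy = centroid(scaled)
    return [(x - cx, y - cy) for x, y in scaled]

def path_distance(a, b):
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def distance_at_best_angle(cand, tmpl, lo=-math.pi / 4, hi=math.pi / 4,
                           tol=math.radians(2.0)):
    """Step 4: golden-section search for the rotation with the smallest distance."""
    x1 = PHI * lo + (1 - PHI) * hi
    x2 = (1 - PHI) * lo + PHI * hi
    f1 = path_distance(rotate_by(cand, x1), tmpl)
    f2 = path_distance(rotate_by(cand, x2), tmpl)
    while abs(hi - lo) > tol:
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = PHI * lo + (1 - PHI) * hi
            f1 = path_distance(rotate_by(cand, x1), tmpl)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = (1 - PHI) * lo + PHI * hi
            f2 = path_distance(rotate_by(cand, x2), tmpl)
    return min(f1, f2)

def preprocess(pts, n=64, size=250.0):
    return scale_and_translate(rotate_to_zero(resample(pts, n)), size)

def recognize(raw, templates, n=64, size=250.0):
    """Return (label, distance) of the best-matching template."""
    cand = preprocess(raw, n, size)
    return min(((label, distance_at_best_angle(cand, preprocess(t, n, size)))
                for label, t in templates), key=lambda r: r[1])

# Hypothetical templates; the candidate is the caret drawn larger and elsewhere.
templates = [
    ("caret", [(0, 0), (5, 10), (10, 0)]),
    ("square", [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]),
]
label, score = recognize([(100, 100), (115, 130), (130, 100)], templates)
print(label, score)
```

Because every stroke goes through the same `preprocess` call, position and size no longer matter by the time distances are compared, and the angle search in step 4 only has to clean up whatever misalignment the indicative angle left behind. One known weak spot of scaling each axis separately is nearly straight strokes, where tiny wobbles get stretched to the full square.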

$1 recognizer: http://depts.washington.edu/aimgroup/proj/dollar/
$N recognizer: http://faculty.washington.edu/wobbrock/pubs/gi-10.2.pdf

Discussion :

The $1 recognizer is simple enough to be integrated into any system, and despite its simplicity, its accuracy is very high. However, it has some drawbacks.
  1. Low efficiency. Template matching is time-consuming, especially when there are many gesture classes and many sample gestures per class; in that situation the $1 recognizer becomes infeasible to use. No training is required, but the recognition process costs too much time.
  2. It only handles single strokes. There are ways to extend the $1 recognizer to handle multistroke gestures, but doing so takes some tricks, because with template matching the number of sample gestures grows exponentially as the number of strokes per gesture grows.
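To put a number on the multistroke problem: if a multistroke gesture is matched by treating every stroke order and every stroke direction as a separate single-stroke template (which, as far as I understand, is the approach the $N recognizer linked above starts from), then a gesture with n strokes expands into n! · 2^n templates. The helper name below is hypothetical, and this is only a counting exercise, not part of any recognizer.

```python
import math

def unistroke_permutations(num_strokes):
    """Stroke orderings (n!) times per-stroke drawing directions (2**n)."""
    return math.factorial(num_strokes) * 2 ** num_strokes

for n in range(1, 6):
    print(n, unistroke_permutations(n))
# 1 2
# 2 8
# 3 48
# 4 384
# 5 3840
```

Even a five-stroke gesture already needs thousands of comparisons per example, which compounds the efficiency concern in the first point.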

6 comments:

  1. Nice summary. Even though the $1 Recognizer has lower efficiency, it still has high accuracy with even a single example (97%). And compared to the time the Dynamic Time Warping template matcher took, the $1 Recognizer looks super fast!

  2. Yes, I agree that the $1 Recognizer is comparatively successful (even though it is so lightweight), but I'm not convinced by the authors' claim when they compare it to only two other algorithms, and only in gesture recognition. They mention HMMs and ANNs as possible solutions; it would be nice if they had compared against one of those too, just to say "hey, our algorithm works well in gesture recognition and also compares well to other algorithms".

  3. Training and recognizing can be treated as a tradeoff: training is fast, while recognizing is slow. No free lunch.

    DTW, in my opinion, is mostly used in off-line projects, so it does not seem fair to compare the two.

  4. It's true that the $1 recognizer works well with single strokes. But I think the way it rotates needs to be fixed. I think the original angle has to be remembered and taken into account together with the indicative angle. This could solve the problem of arrows pointing in different directions. Maybe this is not as simple as I think.

  5. I liked the summary very much, I think it really gets the most important parts of the paper in an easy to read pair of paragraphs.

    I agree that the growth in time consumption when adding new gestures and examples is a drawback. However, the implementations I have seen so far seem to have no problem handling a relatively large number of gestures, and they get good accuracy with few examples, so I think this drawback is not too bad given the processing power found today even in a cellphone.

  6. Yes, $1 is very cheap for beginners. I tried the website you posted, and the accuracy seems good. Also, template matching itself is a natural way of dealing with classification. We are using it in our final project.
