Tuesday, December 14, 2010

Reading #18: Spatial Recognition and Grouping of Text and Graphics (Shilman)

Comment:
Chris

Summary:
This paper presents a framework for simultaneously grouping and recognizing shapes and symbols in free-form ink diagrams. The approach is entirely spatial: it does not require any ordering on the strokes. The framework works as follows:
1. Build a proximity graph. Each node corresponds to a stroke, and an edge is added when two strokes are close to one another.
2. Search this graph for the optimal grouping, using a cost function to score candidates. They use dynamic programming and the A* algorithm for the search; the paper focuses on A*. Each state in the search space has an associated cost value. Because the search is otherwise brute force, they propose two optimizations:
1) A grouping is valid only if its vertices are connected in the neighborhood graph.
2) Restrict the size of each subset V in the graph to be less than a constant k, which greatly reduces the time complexity.
3. To recognize each group, they use an AdaBoost classifier, which is learned automatically from the training data set.
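
To make steps 1 and 2 more concrete, here is a minimal Python sketch of a proximity graph and of enumerating connected groups of bounded size. It is not the authors' implementation. The function names, the closest-point distance measure, the threshold value, and the toy strokes are all my own assumptions, and I read the size limit as "at most k". The cost function, the A* search, and the AdaBoost recognizer are left out; each candidate group produced here would be handed to them.

```python
from itertools import combinations
from math import hypot


def stroke_distance(a, b):
    """Smallest point-to-point distance between two strokes.

    Assumption: a stroke is a list of (x, y) points and "close" means the
    closest pair of points is near; the paper may measure proximity differently.
    """
    return min(hypot(ax - bx, ay - by) for ax, ay in a for bx, by in b)


def build_proximity_graph(strokes, threshold):
    """Return adjacency sets: i and j are neighbors if within `threshold`."""
    graph = {i: set() for i in range(len(strokes))}
    for i, j in combinations(range(len(strokes)), 2):
        if stroke_distance(strokes[i], strokes[j]) <= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def connected_subsets(graph, k):
    """Enumerate every connected vertex subset with at most k vertices.

    Groups are grown one neighbor at a time, so disconnected subsets are
    never generated (optimization 1) and growth stops at size k
    (optimization 2).
    """
    found = set()
    frontier = {frozenset([v]) for v in graph}
    while frontier:
        found |= frontier
        next_frontier = set()
        for group in frontier:
            if len(group) == k:
                continue  # size limit reached, do not grow further
            neighbors = set().union(*(graph[v] for v in group)) - group
            for n in neighbors:
                next_frontier.add(group | {n})
        frontier = next_frontier - found
    return found


# Toy data (hypothetical): two strokes near each other, one far away.
strokes = [
    [(0, 0), (1, 0)],
    [(1.5, 0), (2, 0)],
    [(10, 10), (11, 10)],
]
graph = build_proximity_graph(strokes, threshold=1.0)
groups = connected_subsets(graph, k=2)
```

With these toy strokes, strokes 0 and 1 end up connected and stroke 2 is isolated, so the only candidate groups are the three single strokes plus the pair {0, 1}. The far-away stroke is never paired with the others, which is where the savings over trying every subset come from.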


The results show that their method achieves about 97% accuracy on their test data set.


Discussion:
Fairly good paper. Instead of separating segmentation and classification into two steps, they find the optimal grouping and the recognition simultaneously. They use a fairly general method, A*, to search the whole search space, which has high time complexity, and fairly general optimizations to keep the search under control. Even though the reported accuracy is very high, there are some problems. The threshold that controls how the proximity graph is built can be set inappropriately, so that important groupings are missed; even if the threshold value works very well, some missed groupings cannot be avoided in practice.