Two approaches to Robust Stochastic Parsing
NLP applications in all domains need more than a formal grammar to process input in practice, because natural language contains phenomena that a formal grammar usually cannot describe. Typical examples are disfluencies and extra-grammatical constructions, and a robust technique is needed to handle them. An important issue in the development of robust parsing techniques is the degree of flexibility: which phenomena outside the system's grammar should the parser be able to handle? A second question is how to select the correct analysis among the large number of candidate analyses that this flexibility produces. This report presents experiments with two different techniques. The first is based on the combination of partial parses, the second on controlled relaxation of grammar rules. In both techniques the "best" analysis is selected with a statistically based ranking procedure. The grammars and test sentences are extracted from two treebanks, ATIS and Susanne. Experimental results show that the first technique has the advantage of full coverage, while the second has better accuracy. The best performance is achieved by parsing in three passes: first with the initial grammar, then with the rule-relaxation approach, and finally, if no analysis has been found, with the combination of partial analyses.
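The three-pass strategy described in the abstract can be pictured as a simple fallback cascade. The following is only a minimal sketch under stated assumptions, not the report's implementation: the names `parse_in_passes`, `best`, and the three toy parsers are hypothetical, each parser is assumed to return a list of (analysis, score) pairs, and the scores stand in for whatever statistical ranking the report actually uses.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Assumption: an analysis is a bracketed string paired with a model score.
Analysis = Tuple[str, float]
Parser = Callable[[Sequence[str]], List[Analysis]]


def best(analyses: List[Analysis]) -> Optional[Analysis]:
    """Return the highest-scoring analysis, or None if the list is empty."""
    return max(analyses, key=lambda a: a[1]) if analyses else None


def parse_in_passes(
    tokens: Sequence[str],
    passes: Sequence[Tuple[str, Parser]],
) -> Optional[Tuple[str, Analysis]]:
    """Try each pass in order and stop at the first that yields an analysis."""
    for name, parser in passes:
        choice = best(parser(tokens))
        if choice is not None:
            return name, choice
    return None


# Toy stand-ins for the three passes (hypothetical behaviour, for illustration).
def strict(tokens: Sequence[str]) -> List[Analysis]:
    # Only accepts one "grammatical" sentence.
    return [("S(full)", 0.9)] if list(tokens) == ["show", "flights"] else []


def relaxed(tokens: Sequence[str]) -> List[Analysis]:
    # Accepts short inputs and produces several competing analyses.
    if len(tokens) <= 3:
        return [("S(relaxed-a)", 0.4), ("S(relaxed-b)", 0.6)]
    return []


def partial(tokens: Sequence[str]) -> List[Analysis]:
    # Always returns something, which is what gives full coverage.
    return [("FRAG(" + " ".join(tokens) + ")", 0.1)]


PASSES = [
    ("initial grammar", strict),
    ("rule relaxation", relaxed),
    ("partial parses", partial),
]

if __name__ == "__main__":
    for sentence in (["show", "flights"], ["show", "uh", "flights"],
                     ["uh", "show", "me", "flights"]):
        print(sentence, "->", parse_in_passes(sentence, PASSES))
```

Ordering the passes from strictest to most permissive reflects the trade-off stated in the abstract: the more accurate pass is tried first, and the pass with full coverage serves only as a last resort.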
IC_TECH_REPORT_200497.pdf