Semi-automatic Grammar Recovery

Authors
Ralf Lämmel and Chris Verhoef

Abstract
We propose an approach to the construction of grammars for existing languages. The main characteristic of the approach is that the grammars are not constructed from scratch but are instead recovered by extracting them from language references, compilers, and other artifacts. We provide a structured process for recovering grammars, including the adaptation of raw extracted grammars and the derivation of parsers. The process is applicable to possibly all existing languages for which business-critical applications exist. We illustrate the approach with a non-trivial case study. Using our process and some basic tools, we constructed, in a few weeks, a complete and correct VS COBOL II grammar specification for IBM mainframes. In addition, we constructed a parser for VS COBOL II and were the first to publish a (web-enabled) grammar specification, so that others can use this result to construct their own grammar-based tools for VS COBOL II or derivatives.

Keywords
Reengineering, System renovation, Software renovation factories, Grammar engineering, Grammar recovery, Grammar reverse engineering, VS COBOL II, COBOL

BibTeX entry
@article{LV01-SPE,
author  = {L\"ammel, R. and Verhoef, C.},
title   = {{Semi-automatic Grammar Recovery}},
volume  = {31},
number  = {15},
pages   = {1395--1438},
month   = {December},
journal = {Software---Practice \& Experience},
year    = {2001},
}

Article