email:
hercules@dsv.su.se
Maria Bergholtz
Dr. Paul Johannesson, (Docent), Scientific Advisor
Dr. Eduard Hovy, Scientific Advisor, (ISI/USC)
Project period July, 1996 - December 31, 1999.
The aim is to study formal specifications, both schemas and instances, expressed in EXPRESS/STEP and to propose natural language descriptions for them. To validate our results, we will interview automotive designers and constructors to obtain the "correct" natural language expressions describing the formal specifications.
We will implement a text and sentence planner and a natural language surface grammar and lexicon in Prolog for the generation of natural language (English). We will also investigate into which current EXPRESS/STEP tools our generation tool could be integrated.
We will investigate other domains, that is, other Application Protocols (APs), e.g. for ships and electrotechnical plants, and propose guidelines for how to create a lexicon for other domains with minimal work by reusing the results from our work. The text and sentence planner and the natural language surface grammar will be similar in other domains (APs).
The importance of this project is that we will create a support tool for the EXPRESS/STEP standard which will help designers and constructors of cars to understand their EXPRESS specifications by reading natural language output. This support tool will help not only designers and constructors but also other people involved in the car design process to validate the formal specification by reading it in natural language. Natural language generation is important since not everyone is knowledgeable in the EXPRESS language.
One strength of this project is a synergy effect: other domains (other APs) in the STEP/EXPRESS world could use our results and our guidelines to easily create natural language generation systems.
One of the first attempts at sentence generation from a conceptual representation is described in [Goldm75]. The next effort was the translation of first-order logic formulas to natural language in [Chest76]. Arguments for using natural language generation for validation of formal specifications are presented in [Swart82]. A set of translation rules for translating entity relationship diagrams to natural language (NL) was defined in [Chen83]. Other approaches to natural language generation for validation of formal specifications expressed in conceptual models are proposed in [Rolla92]. A suggestion to generate a whole NL discourse built on Hobbs' coherence relations [Hobbs85,90] for validation of a conceptual model has been made in [Dalia92a,92b], and a refinement of the generation by using aggregation rules has been suggested in [Dalia93,95c,95d,96b].
Examples of complete support tools for validation and specification, with graphics and the possibility of executing the formal specification, include AMADEUS [Black87], which uses a combination of graphics and single-sentence parsing and generation. Other tools are MGI (MOLOC Graphical Interface) and MOLOC [Johan91], which are used at Stockholm University for educational purposes. WATSON [Kelly91] is used for formal specification of telephone switches. WATSON can read informal natural language scenarios, create a formal specification from them, and execute the specification with both simulation and theorem-proving techniques. Yet another tool is AT&T's Visionnaire [Henju91]. However, none of the support tools mentioned above has a natural language generation component, except for AMADEUS and VINST [Breta95]. VINST's NL generator is described in [Dalia95a].
Within the project Concise Natural Language Generation from Formal Specifications, supported by NUTEK (Swedish National Board for Industrial and Technical Development) under contract P3672-1 and carried out in collaboration with Ericsson Utvecklings AB (formerly Ellemtel Utvecklings AB), we have developed methods [Dalia95b] for paraphrasing instances expressed in the Delphi formal language [Höök93,Ridle94]. The Delphi language is a conceptual modelling language extended with first-order predicate logic. It is used for expressing the functionality of telephone services.
It has been shown within Ericsson that, in order to reduce lead times in the sales and production process, it is necessary to comprehend the requirements of a customer early in the requirements engineering process. These requirements can be elicited by means of a tool with which the customer and the salesman together specify the functionality of a telecom service and translate it to a formal language, Delphi. The natural language generated from the Delphi specification can be used during the whole sales, requirements engineering and construction process, and even in the tutorial process, to inform users at various levels about the functionality of the specification [Engst92].
EXPRESS is currently a static data modelling language [ISO-91] and provides constructs such as entities, relations and attributes. EXPRESS is part of STEP (STandard for the Exchange of Product model data) [ISO-94]. Within STEP there are Generic Resources, which are domain independent, and Application Protocols (APs), which are domain specific. The APs are expressed in the EXPRESS language. An AP describes the processes, information flow and functional requirements of a specific application; an example of an AP is AP 214, Core Data for Automotive Mechanical Design Process (ISO 10303-214). Many tools have been developed for the EXPRESS language, but none has yet tried to generate natural language from EXPRESS specifications [Gitti94].
The Application Protocols, specifically AP 214 Core Data for Automotive Design Process within the STEP/EXPRESS world, will be a crucial component in this work.
The concepts used there will be implemented in our base dictionary, and each EXPRESS specification will then contribute new words and expressions to be implemented.
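As an illustration of how a base dictionary could drive the wording, here is a minimal Python sketch (the project itself plans a Prolog implementation). The entity and attribute names, the LEXICON contents and the function names are all hypothetical and are not taken from AP 214; a real system would read entities from a parsed EXPRESS schema.

```python
# Hypothetical base dictionary: schema identifiers -> English nouns.
# Identifiers missing from it fall back to a mechanical rewrite, which is
# where each new EXPRESS specification would contribute new entries.
LEXICON = {"id": "identifier", "item_version": "item version"}


def lexicalize(identifier, lexicon):
    """Return the English noun for a schema identifier."""
    return lexicon.get(identifier, identifier.replace("_", " "))


def with_article(noun):
    """Prefix a noun with 'a' or 'an' (naive vowel test, an assumption)."""
    article = "an" if noun[0].lower() in "aeiou" else "a"
    return f"{article} {noun}"


def join_list(phrases):
    """Join phrases as 'x', 'x and y' or 'x, y and z'."""
    if len(phrases) == 1:
        return phrases[0]
    return ", ".join(phrases[:-1]) + " and " + phrases[-1]


def describe_entity(name, attributes, lexicon):
    """Paraphrase an entity and its attribute names as one English sentence."""
    subject = with_article(lexicalize(name, lexicon))
    subject = subject[0].upper() + subject[1:]
    if not attributes:
        return f"{subject} has no attributes."
    objects = join_list([with_article(lexicalize(a, lexicon)) for a in attributes])
    return f"{subject} has {objects}."


print(describe_entity("item_version", ["id", "description"], LEXICON))
```

Here "description" is not in the dictionary, so the fallback is used; adding an entry to LEXICON is all that is needed to change the wording for a new domain.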
We will write a generation grammar and dictionary in DCG (Definite Clause Grammar) format [Clock80].
One possibility is to use an existing surface grammar, e.g. FUF (Functional Unification Formalism) and SURGE [Elhad92], which is implemented in LISP.
A basic query interface will be designed as part of the content selection process of the natural language generation system.
The second part of the project will be to generalize our work: to see whether our approach is feasible for other application protocols (APs), e.g. for ships and electrotechnical plants, to write guidelines for how to create natural language generation systems for the other APs, and to see whether there are EXPRESS tools which could make use of a natural language generation system.
Similar studies on proposed texts for conceptual models and formal languages have been made in [Dalia92a,92b,93,95c,96a,b], and prototype tools were designed in [Dalia92a,92b,95a,95b,95c]. A discourse is a set of coherent natural language sentences (a text), and a discourse theory is a theory of how the sentences in a discourse are related to each other.
The aim of aggregation rules [Dalia93,95c,96a,b] is to remove redundant and repeated text in a discourse while keeping its content.
A simple example of aggregation:
John has a car.
Mary has a car.
John's car is red.
Mary's car is red.
Aggregation =>
John and Mary have red cars.
We will use this technique in the EXPRESS domain but also try to extract new aggregation rules from the answers to the interview forms.
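The John and Mary example can be sketched in Python as follows. This is only a toy illustration of the idea of aggregation, not the rule set of [Dalia93,96a,b]; the data representation, the function names and the naive pluralization by adding "s" are assumptions made for the sketch.

```python
def indefinite(phrase):
    """Prefix a phrase with 'a' or 'an' (naive vowel test)."""
    return ("an " if phrase[0].lower() in "aeiou" else "a ") + phrase


def aggregate(possessions, colours):
    """Merge 'X has a Y' and 'X's Y is C' facts into as few sentences as possible.

    possessions: list of (owner, thing) pairs
    colours:     dict mapping (owner, thing) to a colour word
    """
    groups = {}
    for owner, thing in possessions:
        # Step 1: fold the colour fact into the noun phrase ("red car").
        colour = colours.get((owner, thing))
        phrase = f"{colour} {thing}" if colour else thing
        # Step 2: collect all owners that share the same noun phrase.
        groups.setdefault(phrase, []).append(owner)

    sentences = []
    for phrase, owners in groups.items():
        if len(owners) == 1:
            sentences.append(f"{owners[0]} has {indefinite(phrase)}.")
        else:
            # Step 3: one plural sentence for the whole group.
            subject = ", ".join(owners[:-1]) + " and " + owners[-1]
            sentences.append(f"{subject} have {phrase}s.")
    return sentences


facts = [("John", "car"), ("Mary", "car")]
colours = {("John", "car"): "red", ("Mary", "car"): "red"}
print(aggregate(facts, colours))
```

If the two cars had different colours, the noun phrases would differ and the sketch would fall back to one sentence per owner, so no content is lost by aggregating.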
The tool will be designed according to the technique with text and sentence planners in [Dalia95a,95b,96a,b] and [McKeo88].
We take a slightly simplified view of the text generation process as a pipeline of three stages. Text planning determines the content and overall discourse structure of the text. It is followed by sentence planning, which decides on sentence structure and scope. This in turn is followed by surface generation, which performs the syntax-based surface form realization and lexical selection.
Questions which will be posed and hopefully answered are: How can EXPRESS be used for Natural Language Generation (NLG)? What is lacking in the EXPRESS language for it to be used for NLG? Are the concepts used in AP 214 easy to express in natural language? What can be added to AP 214 and to other APs to make NLG easier? Which EXPRESS tools could be integrated with a natural language generation component? What about parsing of natural language to EXPRESS?
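The three-stage pipeline can be sketched as three functions chained together. This is a minimal Python illustration under stated assumptions: the fact triples, the VERBS table and all function names are hypothetical, and each stage is reduced to its simplest possible form.

```python
# Hypothetical input: (subject, relation, object) facts from a specification.
FACTS = [
    ("wheel", "part_of", "car"),
    ("car", "has_attribute", "colour"),
    ("car", "has_attribute", "weight"),
    ("door", "part_of", "car"),
]

# Hypothetical lexicon mapping relations to verb phrases.
VERBS = {"has_attribute": "has", "part_of": "is part of"}


def text_planning(facts, topic):
    """Stage 1: select the facts that mention the topic and order them."""
    selected = [f for f in facts if topic in (f[0], f[2])]
    # Facts with the topic as subject come first; sorted() is stable,
    # so the original order is kept within each group.
    return sorted(selected, key=lambda f: f[0] != topic)


def sentence_planning(plan):
    """Stage 2: decide sentence scope, one sentence per (subject, relation)."""
    specs = {}
    for subj, rel, obj in plan:
        specs.setdefault((subj, rel), []).append(obj)
    return [(subj, rel, objs) for (subj, rel), objs in specs.items()]


def surface_generation(specs):
    """Stage 3: choose words and word order for each sentence specification."""
    sentences = []
    for subj, rel, objs in specs:
        objects = " and ".join(f"a {o}" for o in objs)
        sentences.append(f"A {subj} {VERBS[rel]} {objects}.")
    return " ".join(sentences)


def generate(facts, topic):
    """Run the three stages in sequence."""
    return surface_generation(sentence_planning(text_planning(facts, topic)))


print(generate(FACTS, "car"))
```

Keeping the stages as separate functions mirrors the point made for other domains: only the lexicon (here VERBS and the fact vocabulary) would need to change, while the planners stay the same.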
Project period July, 1996 - December 31, 1999.
Initial contact with design departments at car company Volvo via Volvo Data Corporation.
Try to find relevant EXPRESS parsing tools.
Literature search of methods and tools.
Proposing natural language texts describing EXPRESS specifications.
Initial tests of translating EXPRESS to Prolog format.
Writing a first draft prototype for generation of natural language from EXPRESS.
Writing scientific papers for publication and obtaining input about our research at conferences.
Putting together a stable prototype of all components.
Writing scientific papers about the prototype and the preliminary findings of the project.
Carry out a study of all tools around STEP/EXPRESS to see if our generation tool will fit in.
What will be needed to create a natural language to EXPRESS tool?
Doing a last literature search. Writing scientific papers for publication.
This time schedule is a rough one and may be revised.
Breta95 I. Bretan et al.: A Multimodal Environment for Telecommunication Specifications. In Proceedings of the 1st International Conference on Recent Advances in Natural Language Processing, pp. 191-198, Tzigov Chark, Bulgaria, September, 1995.
Chen83 P. P-S. Chen: English Sentence Structure and Entity Relationship Diagrams, Information Sciences 29, pp. 127-149, 1983.
Chest76 D. Chester: The Translation of Formal Proofs into English, Journal of Artificial Intelligence, no 7, pp. 261-278, 1976.
Clock84 W.F. Clocksin & C.S. Mellish: Programming in Prolog, Springer Verlag 1984.
Dalia92a H. Dalianis: A method for validating a conceptual model by natural language discourse generation. CAISE-92 Int. Conf. on Advanced Information Systems Engineering, Loucopoulos P. (Ed.), Springer Verlag Lecture Notes in Computer Science, no 593, pp. 425-444, 1992.
Dalia92b H. Dalianis: User adapted natural language discourse generation for validation of conceptual models. Licentiate Thesis (SYSLAB Report No. 5). Dept. of Computer and Systems Sciences, The Royal Institute of Technology and Stockholm University, Sweden.
Dalia93 H. Dalianis & E. Hovy: Aggregation in Natural Language Generation. EWNLG-93, Proceedings of the 4th European Workshop on Natural Language Generation, Pisa, Italy 1993. Also in Trends in Natural Language Generation: an Artificial Intelligence Perspective, Springer Verlag Lecture Notes in Computer Science (forthcoming 1995).
Dalia95a H. Dalianis: Aggregation in the NL-generator of the VIsual and Natural language Specification Tool. In Proceedings of The Seventh International Conference of the European Chapter of the Association for Computational Linguistics (EACL-95), Student Session, pp. 286-290, Dublin, Ireland, March 27-31, 1995.
Dalia95b H. Dalianis: Aggregation, Formal Specification and Natural Language Generation. In Proceedings of the NLDB'95, First International Workshop on the Applications of Natural Language to Data Bases, pp 135-149, Versailles, France, June 28-29, 1995.
Dalia96a H. Dalianis: Concise Natural Language Generation from Formal Specifications. Ph.D. dissertation (Teknologie Doktorsavhandling), Department of Computer and Systems Sciences, Royal Institute of Technology/Stockholm University, June 1996, Report Series No. 96-008, ISSN 1101-8526, SRN SU-KTH/DSV/R--96/8--SE.
Dalia96b H. Dalianis & E. Hovy: On Lexical Aggregation and Ordering. In the Proceedings of the 8th International Workshop on Natural Language Generation, INLG-96, Herstmonceux, Sussex, UK, June 13-15, 1996.
Elhad92 M. Elhadad & J. Robin: Controlling Content Realization with Functional Unification Grammars, in Aspects of Automated Natural Language Generation, R. Dale, E. Hovy, D. Rosner and O. Stock eds, Springer Verlag, pp. 89-104, 1992.
Engst92 M. Engstedt & S. Preifelt: Results from the user tests of VINST, Ellemtel Utvecklings AB, (F92 0684), 1992.
Gitti94 A. Gittinger et al: EXPRESS Tools: Esprit Project Kactus P8145 Working paper WT1, April 5 1994.
Goldm75 N. Goldman: Conceptual generation, in Conceptual Information Processing, Ed. R.C. Schank, North Holland Publishing Company, pp. 289-374, 1975.
Henju91 O. I. Henjum & O.B.H. Clarisse: Confirming Customer Expectations, in Proceedings of the National Communications Forum, pp. 657-664, vol 45, 1991.
Hobbs85 J.R. Hobbs: On the Coherence and Structure of Discourse, Center for the Study of Language and Information, Report No. CSLI-85-37, October 1985.
Hobbs90 J.R. Hobbs: Literature and Cognition, CSLI Lecture Notes Number 21, Center for the Study of Language and Information, 1990.
Höök93 H. Höök: A General Description of the Delphi Language. Ellemtel internal report, 1993.
ISO-91 The EXPRESS Language Reference Manual, ISO TC184/SC4/WG5, N14, Leeds, April 29, 1991.
ISO-94 Product Data Representation and Exchange. Overview and fundamental principles, ISO TC184/SC4, ISO 10303-1, 1994.
Johan91 P. Johannesson: MOLOC: Using Prolog for conceptual Modeling. In the Proceedings of the 9th International Conference on Entity-Relationship Approach, Ed. H. Kangassalo, pp. 289-302, North Holland 1991.
Kelly91 V. E. Kelly & U. Nonnenmann: Reducing the Complexity of Formal Specification Acquisition, in Automating Software Design, ed by M. R. Lowry & R.D. McCartney, AAAI Press, Menlo Park, California, 1991.
Mann84 W. C. Mann: Discourse Structures for Text Generation, Proceedings of the 22nd Annual Meeting of the Association for Computational Linguistics, Stanford, CA, June 1984.
Mann88 W.C. Mann et al: Rhetorical Structure Theory: Towards a Functional Theory of Text Organization, In TEXT Vol 8:3, 1988.
McKeo88 K. McKeown & W.R. Swartout: Language generation and explanation, in Advances in Natural Language Generation, edited by M. Zock & G. Sabah, Pinter Publishers Ltd, 1988.
Perei80 F.C.N. Pereira & D.H.D. Warren: Definite Clause Grammars for Language Analysis - A Survey of the Formalism and a Comparison with Augmented Transition Networks. J. of Artificial Intelligence 13, 1980, pp. 231-278.
Ridle94 G. Ridley: Formal Methods for Requirement Specification - A Practical Approach using the EUA-Delphi Technology, Ellemtel Utvecklings AB, 1994.
Rolla92 C. Rolland & C. Proix: A Natural Language approach for Requirements Engineering, CAISE-92 Int. Conf. on Advanced Information Systems Engineering, (Ed.) P. Loucopoulos, Springer Verlag Lecture Notes in Computer Science, no 593, pp. 257-277, 1992.
Swart82 B. Swartout: GIST English Generator. In Proceedings of AAAI-82, American Association of Artificial Intelligence, Carnegie-Mellon University and University of Pittsburgh, Pittsburgh, Pennsylvania, 1982.