NASA
Ames Research Center,
Code IC,
Robust Software Engineering Group,
Bernd Fischer,
Johann Schumann.
Journal of Functional Programming,
Vol. 13, No. 3, May 2003, pp. 483-508.
Abstract
Data analysis is an important scientific task which is required whenever
information needs to be extracted from raw data.
Statistical approaches to data analysis, which use methods from
probability theory and numerical analysis, are well-founded but
difficult to implement: the development of a statistical data analysis
program for any given application is time-consuming and requires
substantial knowledge and experience in several areas.
In this paper, we describe AutoBayes, a program synthesis system that
generates data analysis programs from statistical models. A statistical model
specifies the properties for each problem variable (i.e., observation or
parameter) and its dependencies in the form of a probability distribution.
It is a fully declarative problem description, similar in spirit
to a set of differential equations.
From such a model, AutoBayes synthesizes optimized and fully commented
C/C++ code which can be linked dynamically into the Matlab and
Octave environments.
Code is generated in a schema-guided deductive synthesis process.
A schema consists of a code template and applicability constraints, which
are checked against the model during synthesis using theorem-proving
technology. AutoBayes augments schema-guided synthesis with symbolic-algebraic
computation and can thus derive closed-form solutions for many problems.
It is well-suited for tasks like estimating best-fitting model parameters
for the given data. Here, we describe the system architecture, in
particular the schema-guided synthesis kernel. AutoBayes's capabilities are
illustrated by a number of advanced textbook examples and benchmarks.
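
The idea of a schema as a code template guarded by applicability constraints can be pictured with a small sketch. It is only an illustration and not AutoBayes's schema language: the names `Model`, `Schema`, `GAUSS_ML`, and `synthesize` are hypothetical, the applicability check is a plain predicate rather than a theorem-prover call, and the sketch emits Python rather than C/C++. The template encodes the standard closed-form maximum-likelihood estimates for a Gaussian (sample mean and the 1/n variance).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Model:
    """Toy declarative model (hypothetical): one observed variable,
    its distribution family, and the parameters to be estimated."""
    variable: str
    distribution: str
    unknowns: Tuple[str, ...]


@dataclass(frozen=True)
class Schema:
    """A code template plus an applicability constraint (hypothetical)."""
    name: str
    applies: Callable[[Model], bool]
    template: str


# Assumption: a simple predicate stands in for the theorem-prover check.
GAUSS_ML = Schema(
    name="gauss_closed_form_ml",
    applies=lambda m: (m.distribution == "gauss"
                       and set(m.unknowns) == {"mu", "sigma_sq"}),
    template=(
        "def estimate({var}):\n"
        "    # Closed-form ML estimates for a Gaussian\n"
        "    n = len({var})\n"
        "    mu = sum({var}) / n\n"
        "    sigma_sq = sum((v - mu) ** 2 for v in {var}) / n\n"
        "    return mu, sigma_sq\n"
    ),
)


def synthesize(model: Model, schemas: Sequence[Schema]) -> Optional[str]:
    """Return code from the first applicable schema, or None if none applies."""
    for schema in schemas:
        if schema.applies(model):
            return schema.template.format(var=model.variable)
    return None


model = Model(variable="data", distribution="gauss", unknowns=("mu", "sigma_sq"))
source = synthesize(model, [GAUSS_ML])

# Load the generated function and run it on a small sample.
namespace = {}
exec(source, namespace)
mu, sigma_sq = namespace["estimate"]([1.0, 2.0, 3.0, 4.0])
```

In this sketch, a model that no schema covers (for example a Poisson model) makes `synthesize` return `None`. A real synthesis system would at that point try other schemas or decompose the problem further.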
Key words
formal methods, formal specifications, program synthesis, deductive synthesis,
Bayesian networks, statistical models, data analysis.
Contents
- Introduction
- Probabilistic and Graphical Reasoning
- An Example: Mixture of Gaussians
- System Architecture
- The Synthesis Kernel
- Examples and Results
- Related Work
- Conclusions and Future Work
Bernd Fischer /
fisch@email.arc.nasa.gov