Software systems are becoming increasingly complex. This leads to an inevitable increase in bugs. Model learning is becoming a popular technique for finding these bugs. Instead of viewing a software system via its internal structure, model learning algorithms construct a model from observations of the system’s behaviour. The design of algorithms for models and model learning is a fundamental problem, and research in the area is important for making the techniques more applicable and accessible. This thesis presents the following contributions in this regard:
1. Model learning algorithms work by iteratively constructing a hypothesis for the system under learning. We are the first to formalize the notion of quality for these hypotheses, and we present a modification for the learning process that ensures that the quality of subsequent hypotheses never decreases (Chapters 3 and 4).
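The iterative loop sketched above can be made concrete in a few lines. This is an illustrative sketch only, with names of our own choosing (`learn`, `find_cex`, `grow`), not the thesis's algorithm; the toy "system" is a fixed set, and a counterexample is any element the hypothesis still lacks. Because refinement only ever adds information here, hypothesis quality never decreases, mirroring the monotonicity property described in contribution 1.

```python
# A minimal sketch of counterexample-driven hypothesis refinement.
# All names are illustrative, not taken from the thesis.

def learn(initial, find_counterexample, refine):
    """Refine a hypothesis until no counterexample remains."""
    hypothesis = initial
    while (cex := find_counterexample(hypothesis)) is not None:
        hypothesis = refine(hypothesis, cex)
    return hypothesis

# Toy instance: the "system" is the set {"a", "b"}; a counterexample
# is any element the current hypothesis is still missing.
target = {"a", "b"}
find_cex = lambda h: next(iter(target - h), None)
grow = lambda h, c: h | {c}  # only ever adds: quality is monotone

print(sorted(learn(set(), find_cex, grow)))  # → ['a', 'b']
```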
2. Hypotheses are refined by finding counterexamples for them. Input sequences that separate the states of a hypothesis play an important role in finding counterexamples. We present an algorithm for constructing these separating sequences efficiently (Chapter 2). In addition, we introduce fuzzing as a new, complementary technique for finding counterexamples (Chapter 6).
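To show what a separating sequence is, here is the naive baseline for one pair of Mealy-machine states: breadth-first search over pairs of states until an input produces different outputs. This is a sketch under our own assumptions (dictionary-encoded transition and output tables), not the efficient all-pairs algorithm of Chapter 2.

```python
from collections import deque

# Naive separating-sequence search for ONE pair of Mealy-machine states.
# The thesis (Chapter 2) computes such sequences for all pairs more
# efficiently; this BFS baseline only illustrates the concept.

def separating_sequence(delta, lam, s, t, inputs):
    """Shortest input sequence on which states s and t give different
    outputs, or None if no such sequence exists.
    delta: (state, input) -> next state;  lam: (state, input) -> output."""
    seen = {(s, t)}
    queue = deque([(s, t, [])])
    while queue:
        p, q, prefix = queue.popleft()
        for a in inputs:
            if lam[(p, a)] != lam[(q, a)]:
                return prefix + [a]          # outputs differ: done
            step = (delta[(p, a)], delta[(q, a)])
            if step not in seen:
                seen.add(step)
                queue.append((*step, prefix + [a]))
    return None

# Example machine: states 0 and 1 agree on every single input,
# but input 'a' takes them to states that disagree on the next 'a'.
delta = {(0, 'a'): 2, (0, 'b'): 0, (1, 'a'): 1, (1, 'b'): 1,
         (2, 'a'): 2, (2, 'b'): 2}
lam = {(0, 'a'): 'x', (0, 'b'): 'y', (1, 'a'): 'x', (1, 'b'): 'y',
       (2, 'a'): 'z', (2, 'b'): 'y'}
print(separating_sequence(delta, lam, 0, 1, ['a', 'b']))  # → ['a', 'a']
```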
3. We explain how different model formalisms and observations of their behaviour can be expressed as a logical formula, and we show that satisfiability modulo theories (SMT) solvers can be used to construct a hypothesis from this formula (Chapter 5). This new approach to model learning removes redundancy that is present in the classical approach as applied in practice.
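The core idea can be illustrated by phrasing "there exists an n-state Mealy machine consistent with the observed behaviour" as a constraint-satisfaction problem. A real implementation would hand the constraints to an SMT solver; in this sketch a brute-force search over all transition and output tables stands in for the solver, and all function names (`run`, `synthesize`) are ours, not from the thesis.

```python
from itertools import product

# Stand-in for SMT-based model learning: search for transition and
# output tables of an n-state Mealy machine consistent with every
# observed trace. A solver would do this search symbolically.

def run(delta, lam, word):
    """Outputs produced from state 0 on the given input word."""
    state, outs = 0, []
    for a in word:
        outs.append(lam[(state, a)])
        state = delta[(state, a)]
    return tuple(outs)

def synthesize(observations, n_states, inputs, outputs):
    """Find tables consistent with every observation, or None."""
    cells = [(s, a) for s in range(n_states) for a in inputs]
    for delta_vals in product(range(n_states), repeat=len(cells)):
        delta = dict(zip(cells, delta_vals))
        for lam_vals in product(outputs, repeat=len(cells)):
            lam = dict(zip(cells, lam_vals))
            if all(run(delta, lam, w) == o
                   for w, o in observations.items()):
                return delta, lam
    return None

# Observations: input 'a' yields output 0; 'aa' yields outputs 0 then 1.
obs = {('a',): (0,), ('a', 'a'): (0, 1)}
delta, lam = synthesize(obs, n_states=2, inputs=['a'], outputs=[0, 1])
print(run(delta, lam, ('a', 'a')))  # → (0, 1)
```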
4. A prerequisite for model learning is that the system’s input format is known. We give an overview of the tools and techniques for reverse engineering the input format, and of their applications in security (Chapter 7).