Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ


Choose your language

InfoQ Homepage Interviews Christian Colombo on FSM-based Monitoring and Runtime Verification

Christian Colombo on FSM-based Monitoring and Runtime Verification


1. We are here at Erlang Factory 2011 in London and I am sitting here with Christian Colombo. So, Christian, who are you?

I work at the University of Malta, I am an assistant lecturer there, and we do some research mostly focused on runtime monitoring. And now we are shifting our focus to runtime monitoring on Erlang. So that is why I am here.


2. What is monitoring? What is so interesting about monitoring?

Usually we just receive maybe quite blurred specifications and we as developers we have to implement it and come up with some sort of working thing according to those specifications. And then we give it to our client and we receive feedback, maybe he has encountered a bug or something like that, so how do we ensure that we don’t get so many bugs? What should we do? Ok there is testing there are many techniques but monitoring would ensure that you would insert some monitors inside the code so that even when the code has left you and has been delivered to the customer, there are still checks going on.

So if something goes wrong it is detected, you don’t need the client to tell you, the monitors would detect the bugs themselves. And you can do something about them from the monitors themselves. So the monitor can detect the bug and fix it on the fly so that is the difference, whereas for example in testing you just detect bugs in runtime verification you can detect bugs and fix them there and then.


3. Does that mean I have to write a lot of logging statements? Do I have to log everything? How does that work?

Not really logging, what you need to do is use aspect oriented programming so that you can use sort of pattern matching to detect relevant events in your program, so for example you want to say "On this method call tell the monitor that it has to do a next step", "On this exception handler call, you tell the monitor that it has to do a step". So, essentially yes you are sort of generating a trace of relevant events. But the code is instrumented automatically, you don’t have to go in certain places in the code and insert the code manually, so it’s done automatically. And that way it’s less error prone as well.


4. So you use AspectJ to insert these probes?

Yes, in the case of Java we use AspectJ, obviously then according to the technology where we apply monitoring you have to use the relevant technology for aspect oriented programming for that language.


5. But the people who write, if I write an application and I want to use monitoring do I have anything to do with AspectJ or is that in your tool?

You don’t have to use Aspect J you just have to write the specification using finite state automata, you just specify the events that you are interested in and then my tool automatically generates the aspect oriented code depending on the technology that you are using. So in the case of Java it generates AspectJ in the case of Erlang, it generates code for the tracer. So you configure the tracer to generate the events automatically.


6. With AspectJ, do you handle static class files, or do you do that at load time?

You have the choice, you can do it before you run the code but you can do it actually on the fly because AspectJ supports that. So it really depends on the technology, it’s not sort of part of our tool, that is handled by the aspect oriented programming technology that you are using.


7. Your tool is called Larva for Java and eLarva for Erlang. How do I use your tool? Do I have to include a library? How do I write my specifications?

All you have to do is you follow the manual, you just write a specification, you compile the specification, and after the compilation process you would have two set of files, you would have the aspect oriented files, and the Java files which basically the Java files implement the automaton that you have coded as your specification, that is implemented as a Java program. Then the aspect oriented program that the compiler generates would glue the monitors to your system. So essentially then all you have to do is run the system, with the AspectJ code and the monitor code and you are automatically running the monitor concurrently.

All you have to do is just write the specifications, compile it, and just put the files with your system and now it will run with the monitor running as well.


8. So the monitor runs in the same process as the program?

In the case of Java, yes you would have them running in the same address space. In the case of Erlang because it supports more easily distribution, you can run them on a different separate machine so you can avoid direct overheads, into your system.


9. With the JVM would it also be possible to move the process of monitoring out?

The only problem would be that you'd have more latency probably and so probably it would slow down a bit your system because it’s synchronous so that would probably tend to slow down a bit the system.


10. The specifications that you talked about - are they a model, are they some state machine in a custom language?

They are basically automata, a finite state machine with some extensions where you can have clocks to monitor real time properties, we also have some translators from your specification language to our specification language, so that shouldn’t really be a problem. Because these are quite expressive as property languages so all you have to do possibly if your specifications are in another language, probably it would be easy also to just translate them into this accepted language. Which Larva handles.


11. What would the specification describe? Is it the behavior of the application?

Yes, maybe I should give an example to make it easier. So for example imagine that I have a system where I have a login module. And I want to check that no more that three bad logins successively can happen. So all you have to do is describe this in an automaton and then you say "Ok, if I encounter a bad login and then another bad login, and then another bad login then I go to a bad state" and the monitor would signal the bad state with some flag or something or take some action to block the user for example. So that would be a typical property that you would write in Larva. Then you can make it more interesting, so for example you want to say no more than three bad logins successively within five minutes.

So you can insert the real time aspect or you can parameterize it and say no more that three bad logins for each user so you are parameterizing your monitoring and then essentially Larva would replicate the monitor for each user the system encounters. That would make it much more easy for the monitor.


12. How does this relate to model driven development, or model based development? How does it relate?

In fact it’s very similar, for example if you have model based testing, you can use your models and instead of just using them for testing you can keep them for after deployment so you are not just testing the code before deployment but then the checks keep on being there and if something bad happens you detect it because usually in the testing phases it’s difficult to generate all the possible scenarios that you would encounter in real life so that would be a great advantage with no cost because you already have your models all you are doing is you are running them during the normal use of the system.

Obviously there will be some overhead cost which you have to pay but usually it is acceptable because usually it runs for example ten percent or five percent of the system execution time. So usually it’s an acceptable cost to pay for that extra assurance that your code is really respecting the model that it was intended to implement.


13. How does this relate to things like design by contract, or simply assertions or something like that?

Basically runtime monitoring inserts assertions automatically into your code. So instead of manually going into your code and adding assertions here and there, essentially that is handled by the aspect-oriented system that just injects assertions into your code. More than that sometimes the properties are quite complex, so it’s not easy just to put an assertion somewhere sometimes you would need extra variables, you would need some timers. So you end up cluttering your code with the possibility that while you are adding extra checks you’re possibly inserting other bugs.

Sometimes it’s not really an option, so having the monitor you would specify the properties separately, then you compile everything automatically so the possibility of adding bugs assuming that the compiler is acting correctly there is none, and automatically it’s inserted into your system. So the way I see it you are avoiding a lot of error prone extra assertions into your program and you have the checking still there.


14. You are separating the concern.

Exactly, that’s basically the way I see it that’s the most advantageous aspect of monitoring, you can organize your code much better and you can separate the modules such that you can focus while you are implementing your system you can focus on what the system does and when you are coding the exception handling tasks and the extra checks you just put them in a separate module which is the monitor. And another advantage is that the monitor you specify it in another language, which is at a higher level so usually in assertions you are just using the same language for example if you are writing an assertion in Java you are using Java again which might tend to repeat the same problem.

But if you are writing it now in a mathematical notation you are probably reducing the possibilities of repeating the same mistakes that you had in the implementation. So you sort of have the specification at the mathematical level, you have the implementation at the code level, and you are sort of connecting the two together, because of monitoring.


15. Do you think that you could call that an application of aspect oriented programming or is it just accident that you use AspectJ for instrumentation?

I would say that it is quite, aspect oriented programming plays a crucial role in this because you re using it to inject code. But then there is also another level which is the mathematical notation, the application of formal methods. So you can use aspect oriented programming for checking - that is great but then using our framework that is another added value which is the mathematical notation, so you are not coding again using the same language, in this case Java, but you are taking it at a higher level and you are specifying your properties mathematically. So the way I see it there are two aspects: one, which is the insertion of the code and one which is the compilation of mathematical specifications into code.


16. What’s your experience with, should the monitor impact the way the program runs or should it be possible to turn the monitor off and the program still runs correctly?

I think it highly depends on the administrator of the system. If the administrator of the system is very cautious and maybe he doesn’t want the monitor to interfere so in that case the monitor plays a minor role, so that it just detects problems if they occur. And in that case you can even take the monitoring offline not necessarily in sync with the system. But then you can take another approach which is the error handling code is handled by the monitor so the monitor plays a crucial role in the system so while the system is executing the monitor can even take actions. I think that really depends on the mentality of who is designing the system. If you trust the monitor it can play a crucial role and it would be a nice way to sort of separate the modules and the concerns.

But for example we had a case study on a financial system where we didn’t really want to interfere with the system because the system had already been designed and monitoring was sort of not part of the design. So they wanted monitor sort of like an automatic "auditor" of their financial system. In the case where you have a new system being built and being designed with monitoring in mind, I think that’s where monitoring can play a much bigger role in the system.


17. I think you also mentioned that monitoring can help with migrating of code if you move from one version of the code to a next and to make sure it still works. Have you seen this?

The advantage is this if you use assertions they are very brittle so if you update your code you have to update your assertions. When you have the mathematical specification, if you change your code the specification would usually not change. So I think that is a big advantage as well so taking the same property even across different technologies so for example now we are looking at another version of Larva where the same specifications can be compiled to different technologies. So in reality not only different versions of the same code but even across different technologies.


18. It’s also harder to cheat on mathematics than in code.

Usually even the description in mathematics is shorter so usually you kind of pin point problems and bugs more easily and in our case we use finite state automata which given that they are pictorial maybe they would be more helpful to sort of look at it and you can understand what is happening.


19. You can have two teams working on it, testers or modelers and the implementers.

Exactly and also monitoring offers a way you can connect the two together so you sort of have an interface between them and assuring that every step in the code would reflect a step in the specification. So that is really great even if you find the bug it would be usually easily traceable to your specification.


20. Talking of actions that the monitor can take is there a difference between Java and Erlang in what the monitor can do?

Actually the versions that we have now in Java we have synchronous monitoring in Erlang we have asynchronous monitoring so that really affects what actions you can take. So in Java we have the freedom to take any action because you know the error has happened now and you want to take an action now and you have the freedom to take the action. In Erlang it’s a bit different because since it is asynchronous if a problem happens now the monitor might detect it a few seconds late or a few minutes late so you might be a bit late to take actions but the advantage is that Erlang processes are separate so usually it’s much more easy to isolate errors.

Even though errors might be detected a bit late you can still usually either kill a process or kill and restart a process or maybe if you have another process which would replace the violating process and you want to replace it. So that would be the advantage of Erlang that even though they have asynchronous monitoring since the processes are isolated and do not share memory usually taking actions is quite easier because errors usually are isolated.


21. If we want to look at Larva or eLarva where could we find it?

We have our website our research is called SVRG so just Google it [Editor's note: and ]and you would find our website and on demand Larva is available for free and even eLarva we would be pleased to share it with other developers.

Sep 23, 2011