DPEF (Durable and Plain Experiment Format)
Introduction
When we started developing the Optsicom Optimization Suite, we designed it as a personal tool to facilitate the development of optimization algorithms. In fact, the Optsicom Suite was born under the name JMH (Java Metaheuristics) and was developed during Micael Gallego's PhD, presented in December 2008. Things have changed since then: several researchers now develop and improve the Optsicom Suite, and more people use it to develop optimization algorithms.
During this time, we have detected an important issue with the format we use to save experiment results. This information needs to be inspected by researchers with different programming skills, operating systems, programming languages, etc. In some cases, the researchers belong to different organizations, and communication is neither easy nor direct. From the beginning until the end of 2010, the Optsicom Framework stored experiment results in files using the serialized Java objects format (SJOF). Obviously, this format is not well suited for public consumption. Moreover, it makes managing several experiments difficult: to use information in this format, it is necessary to load the whole experiment before performing any analysis.
Due to the problems with the serialized Java objects format, we decided to store the experiment results in a database. This format is more suitable for managing several experiments, the information can be queried in powerful ways, and it can be used (saved and retrieved) remotely. However, we have experienced several issues with the database format. Our database structure is very naive and very inefficient. We also detected several important bugs that can produce incorrect results when performing analysis. To correct these issues, we need to change the database schema. The problem is that the database already contains data, and evolving a schema with existing data is very difficult. The database format also shares a problem with the serialized Java objects format: it is not well suited for sharing information between researchers.
Another important issue that we need to tackle is the durability of the format. We cannot change the format of the experiment results every year. We need a format that is easy to share between researchers and easy to process with programming languages other than Java. The formats best suited for durability and sharing are text-based ones. If a researcher can open a file and understand its content, the format is durable and easy to share.
For these reasons, we have decided to create DPEF, the Durable and Plain Experiment Format. The format is described in the following sections. It will be used as the canonical format for storing all experiment results. It can also be used to perform schema evolution in the database, because we can always save and restore information in this format.