MINOS is a research project in the Helsinki Institute for Information Technology (HIIT), funded by the Academy of Finland. The research is carried out during 2002-2005 by the Complex Systems Computation Group (CoSCo).
The general objective of MINOS is to study and develop minimum encoding approaches to predictive modeling and their relationship to statistical procedures.

A strong basic research component
The basic research component of MINOS is concerned with the fundamental relationships between the MDL approach and other probabilistic modeling frameworks, and with the problem of developing computationally feasible solutions to minimum encoding inference for important model families, including graphical models (e.g., Bayesian networks) and finite mixture models.

An applied methodological component
In the applied component, the developed minimum encoding techniques are applied to predictive modeling and data analysis in various real-world problem domains. These include both industrial applications (e.g., telecommunications, fault diagnosis problems, customer segmentation) and scientific problems in areas such as medicine, social sciences and ecology.

With the above general objective in mind, the research concentrates on the following more specific research areas.

Minimum encoding model selection
Our purpose is to study and develop computationally feasible approximations of minimum encoding model selection criteria, in particular for the new Normalized Maximum Likelihood (NML) formulation of MDL. From the applied research point of view, one promising approach is to compute the minimum encoding model selection criteria with predictive sequential formulas, which can be used for developing computationally efficient, on-line model selection algorithms.

Minimum encoding predictive inference
From the predictive inference point of view, the minimum encoding approach faces a number of interesting problems, e.g.:
- Straightforward application of the NML criterion does not produce a random process, which may cause anomalies in certain prediction tasks.
- If the utilities attached to possible outcomes do not follow the standard logarithmic loss framework, the predictive inference procedures have to be modified accordingly.
- Minimum encoding predictive inference becomes a more complex issue when one wishes to use linear combinations of several model classes.

The research methods range from theoretical analysis to functional prototypes of the developed methods.
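The first problem listed above, that NML does not define a random process, can be illustrated with a small example. The following is a minimal sketch, assuming the Bernoulli model class, which is chosen here only because its NML distribution can be computed by brute force; it is not taken from the project's own work, and all function names are hypothetical. It computes the NML probability of a binary sequence of length n, then checks whether summing the length n+1 distribution over the last symbol gives back the length n distribution, as a random process would require.

```python
from math import comb


def max_likelihood(k: int, n: int) -> float:
    """Maximized Bernoulli likelihood of a sequence with k ones among n symbols."""
    if n == 0:
        return 1.0
    p = k / n  # maximum likelihood estimate of the success probability
    # Python evaluates 0.0 ** 0 as 1.0, which is the convention needed here.
    return (p ** k) * ((1 - p) ** (n - k))


def nml_normalizer(n: int) -> float:
    """Sum of maximized likelihoods over all binary sequences of length n."""
    return sum(comb(n, k) * max_likelihood(k, n) for k in range(n + 1))


def nml_prob(seq: tuple) -> float:
    """NML probability of a binary sequence, for the sample size len(seq)."""
    n, k = len(seq), sum(seq)
    return max_likelihood(k, n) / nml_normalizer(n)


def marginal_from_longer(seq: tuple) -> float:
    """Probability of seq obtained by marginalizing the length n+1 NML distribution."""
    return nml_prob(seq + (0,)) + nml_prob(seq + (1,))


if __name__ == "__main__":
    seq = (0, 0)
    print("NML at n=2:            ", nml_prob(seq))
    print("marginal of NML at n=3:", marginal_from_longer(seq))
```

For the sequence of two zeros, the two values differ (0.4 versus 31/78, roughly 0.397), so under this model class the NML distributions for different sample sizes are not the marginals of one common process.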
Complex Systems Computation Group
DeepC Project
Minimum Description Length on the Web
J. Rissanen, Complexity and Information in Data. Chapter 15 in A. Greven, G. Keller, and G. Warnecke (Eds.), ENTROPY, Princeton University Press, Princeton and Oxford, 2003.
P. Kontkanen, W. Buntine, P. Myllymäki, J. Rissanen, H. Tirri, Efficient Computation of Stochastic Complexity. Pp. 181-188 in Proceedings of the 9th International Workshop on Artificial Intelligence and Statistics (AISTATS), Society for Artificial Intelligence and Statistics, 2003.
P. Kontkanen, P. Myllymäki, W. Buntine, J. Rissanen, H. Tirri, An MDL Framework for Data Clustering. In P. Grünwald, I.J. Myung and M. Pitt (Eds.), Advances in Minimum Description Length: Theory and Applications, MIT Press, 2005.
T. Roos, P. Myllymäki, H. Tirri, On the Behavior of MDL Denoising. In Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (AISTATS), 2005.
Professor Henry Tirri
Complex Systems Computation Group
Helsinki Institute for Information Technology
University of Helsinki and Helsinki University of Technology