» Complex Systems Computation Group CoSCo

MDL-Based Methods for Image Denoising (KUKOT)

Duration: 01.01.2006-31.07.2008
Funding: Tekes
Project leader: Professor Petri Myllymäki
Keywords: minimum description length (MDL), denoising

Abstract

We can consider digital bit streams processed in the ICT sector as consisting of two overlapping parts, where one part is useful information and the other is useless noise. There is noise in all digital media; it is generated by the faults in original information sources (such as poor image resolution) and errors in signal transmission (such as disruptions in wireless communications or faults in hard drives). Noise can be filtered if the features of the source are known (in some degree at least), but it is very difficult to build general methods for denoising since they have to be able to construct adaptive models of random noise sources. The main problem with such adaptive modelling is the regularization of models; too complex (over-adaptive) models will interpret noise as part of the information and thus be rendered useless.

Minimum Description Length (MDL) (see www.mdl-research.org) is an information-theoretical framework developed by the father of arithmetic encoding, Jorma Rissanen. It provides an elegant solution for the regularization problem. Unfortunately, the methods based on the MDL theory are often very challenging computationally. The project team has studied how to implement MDL in a manner that is feasible for practical applications, and managed to develop computationally efficient methods suitable for many interesting model classes. In addition, the team has developed two new variants of the NML criterion: sequential NML and factorized NML. The analysis of these new methods is still in progress.

The research consortium consists of two sub-groups: the Complex Systems Computation group at the Department of Computer Science at the University of Helsinki (Prof. Petri Myllymäki, the coordinator) and the Laboratory of Computational Technology at Helsinki University of Technology (Dr. Jukka Heikkonen).

Publications

P. Kontkanen and P. Myllymäki, MDL Histogram Density Estimation. In Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics (AISTATS 2007), Puerto Rico, March 2007.
P. Kontkanen and P. Myllymäki, A linear-time algorithm for computing the multinomial stochastic complexity. Information Processing Letters 103 (2007) 6 (September), 227–233.
T. Silander and P. Kontkanen and P. Myllymäki, On Sensitivity of the MAP Bayesian Network Structure to the Equivalent Sample Size Parameter. Pp. 360–367 in the Proceedings of the The 23rd Conference on Uncertainty in Artificial Intelligence (UAI-2007), edited by R. Parr and L. van der Gaag. AUAI Press, 2007.
H. Wettig, P. Kontkanen and P. Myllymäki, Calculating the Normalized Maximum Likelihood Distribution for Bayesian Forests. IADIS International Journal on Computer Science and Information Systems 2 (2007) 2 (October). Also: In Proceedings of the IADIS International Conference on Intelligent Systems and Agents 2007. Lisbon, Portugal, 2007 (Outstanding paper award).
T. Mononen and P. Myllymäki, Fast NML Computation for Naive Bayes Models. Pp. 151-160 in the Proceedings of the 10th International Conference on Discovery Science, edited by V. Corruble, M. Takeda and E. Suzuki. Lecture Notes in Artificial Intelligence 4755, Springer 2007.
J. Rissanen, P. Grünwald, J. Heikkonen, P. Myllymäki, T. Roos, and J. Rousu, (2007). Editorial: Information Theoretic Methods for Bioinformatics, EURASIP Journal on Bioinformatics and Systems Biology.
P. Kontkanen, H. Wettig and P. Myllymäki, NML Computation Algorithms for Tree-Structured Multinomial Bayesian Networks. EURASIP Journal on Bioinformatics and Systems Biology, Volume 2007 (2007), Article ID 90947, 11 pages.
J. Rissanen, and T. Roos, (2007). Conditional NML Universal Models, in Proc. 2007 Information Theory and Applications Workshop (ITA-07), pp. 337–341.
T. Roos, (2007), Statistical and Information-Theoretic Methods for Data Analysis, Doctoral Dissertation, Department of Computer Science, University of Helsinki.
J. Rissanen, Information and Complexity in Statistical Modeling. Springer, 2007.
P. Grünwald, P. Myllymäki, I. Tabus, M. Weinberger and B. Yu (eds.), Festschrift in Honor of Jorma Rissanen. TICSP Series #38, Tampere International Center for Signal Processing, 2008. Presented at the 2008 IEEE Information Theory Workshop.
P. Myllymäki, T. Roos, T. Silander, P. Kontkanen and H. Tirri, (2008), Factorized NML Models. Chapter 11 in Festschrift in Honour of Jorma Rissanen , edited by P. Grünwald, P. Myllymäki, I. Tabus, M. Weinberger and B. Yu.
T. Mononen and P. Myllymäki, Computing the NML for Bayesian Forests via Matrices and Generating Polynomials. In Proceedings of the 2008 IEEE Information Theory Workshop (ITW-08). IEEE 2008.
T. Roos, (2008). Monte Carlo Estimation of Minimax Regret with an Application to MDL Model Selection, IEEE Information Theory Workshop 2008 (ITW-08), Porto, Portugal, May 5–9.
T. Mononen and P. Myllymäki, On the Multinomial Stochastic Complexity and its Connection to the Birthday Problem. To appear in Proceedings of the 2008 International Conference on Information Theory and Statistical Learning (ITSL 2008).
P. Kontkanen and P. Myllymäki, An Empirical Comparison of NML Clustering Algorithms. To appear in Proceedings of the 2008 International Conference on Information Theory and Statistical Learning (ITSL 2008).
T. Roos, T. Silander, P. Kontkanen, and P. Myllymäki, (2008). Bayesian Network Structure Learning using Factorized NML Universal Models, 2008 Information Theory and Applications Workshop (ITA-08), San Diego, CA, January–February 2008.
T. Silander, T. Roos, P. Kontkanen, and P. Myllymäki, (2008). Factorized NML Criterion for Learning Bayesian Network Structures, 4th European Workshop on Probabilistic Graphical Models (PGM-08), September 17–19, Hirtshals, Denmark (to appear).
T. Mononen, and P. Myllymäki, (2008). Computing the Multinomial Stochastic Complexity in Sub-Linear Time, 4th European Workshop on Probabilistic Graphical Models (PGM-08), September 17–19, Hirtshals, Denmark (to appear).

University of Helsinki | Department of Computer Science | Helsinki Institute for Information Technology

cosco@hiit.fi