# Discretization of continuous features

In statistics and machine learning, **discretization** is the process of converting continuous attributes, features or variables into discrete or nominal ones by partitioning their range into intervals. This can be useful when creating probability mass functions, or more formally, in density estimation. It is a form of data binning, as in making a histogram. Whenever continuous data is discretized, some discretization error is introduced; the goal is to reduce that error to a level considered negligible for the modeling purposes at hand.

Typically, data is discretized into *K* partitions of equal width (equal intervals) or into *K* partitions that each hold roughly 1/*K* of the data (equal frequencies).^{[1]}
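The two schemes can be compared on a small sample. The following is a minimal sketch using NumPy; the function names `equal_width_bins` and `equal_frequency_bins` and the sample values are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def equal_width_bins(x, k):
    """Assign each value to one of k intervals of equal width."""
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), k + 1)
    # Pass only the interior edges so the maximum lands in the last bin (k - 1).
    return np.digitize(x, edges[1:-1]), edges

def equal_frequency_bins(x, k):
    """Assign each value to one of k intervals holding about 1/k of the data."""
    x = np.asarray(x, dtype=float)
    # Bin edges are the empirical quantiles at 0, 1/k, 2/k, ..., 1.
    edges = np.quantile(x, np.linspace(0, 1, k + 1))
    return np.digitize(x, edges[1:-1]), edges

# Hypothetical sample with one outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 100]
width_labels, width_edges = equal_width_bins(values, 3)
freq_labels, freq_edges = equal_frequency_bins(values, 3)
print(width_labels.tolist())
print(freq_labels.tolist())
```

With an outlier in the sample, equal-width binning places almost all values in the first interval, whereas equal-frequency binning spreads them evenly across the intervals. Heavily tied values can make equal-frequency bins uneven, which this sketch does not handle.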

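Mechanisms for discretizing continuous data include Fayyad & Irani's MDL method,^{[2]} which uses mutual information to recursively choose the best cut points and a minimum description length (MDL) criterion to decide when to stop splitting, as well as CAIM, CACC, Ameva, and many others.^{[3]}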
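The recursive, entropy-based splitting idea can be sketched as follows. This is a simplified illustration that uses the MDL acceptance criterion as it is commonly stated for Fayyad & Irani's method. It is not a reference implementation. The names `entropy`, `mdlp_cut_points` and `_split` are assumptions, and every boundary between distinct values is evaluated as a candidate cut rather than only class-boundary points.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def _split(pairs, cuts):
    """Recursively split sorted (value, label) pairs, appending cut points."""
    n = len(pairs)
    ys = [y for _, y in pairs]
    ent_s = entropy(ys)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary can separate equal values
        ent_l, ent_r = entropy(ys[:i]), entropy(ys[i:])
        gain = ent_s - (i / n) * ent_l - ((n - i) / n) * ent_r
        if best is None or gain > best[0]:
            best = (gain, i, ent_l, ent_r)
    if best is None:
        return  # fewer than two distinct values: nothing to split
    gain, i, ent_l, ent_r = best
    # MDL criterion: accept the cut only if the information gain exceeds
    # the cost of encoding the cut point and the class statistics.
    k, k1, k2 = len(set(ys)), len(set(ys[:i])), len(set(ys[i:]))
    delta = math.log2(3 ** k - 2) - (k * ent_s - k1 * ent_l - k2 * ent_r)
    if gain <= (math.log2(n - 1) + delta) / n:
        return
    cuts.append((pairs[i - 1][0] + pairs[i][0]) / 2)
    _split(pairs[:i], cuts)
    _split(pairs[i:], cuts)

def mdlp_cut_points(values, labels):
    """Return the sorted cut points chosen for one continuous feature."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    cuts = []
    _split(pairs, cuts)
    return sorted(cuts)

# Hypothetical data: the class changes once, between 10 and 11.
xs = list(range(1, 21))
ys = ["a"] * 10 + ["b"] * 10
print(mdlp_cut_points(xs, ys))
```

Unlike equal-width or equal-frequency binning, the number of intervals is not fixed in advance: splitting stops as soon as no candidate cut passes the MDL test, so a feature that carries little class information may receive no cut points at all.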

Many machine learning algorithms are known to produce better models when continuous attributes are discretized.^{[4]}

## Software

This is a partial list of software that implements the MDL algorithm.

- discretize4crf: a tool designed to work with popular CRF implementations (C++)

