Lightweight Graphical Models for Selectivity Estimation Without Independence Assumptions

Title	Lightweight Graphical Models for Selectivity Estimation Without Independence Assumptions
Publication Type	Journal Articles
Year of Publication	2011
Authors	Tzoumas K, Deshpande A, Jensen CS
Journal	Proceedings of the VLDB Endowment
Volume	4
Issue	7
Date Published	2011///
Abstract	As a result of decades of research and industrial development, mod-ern query optimizers are complex software artifacts. However, the quality of the query plan chosen by an optimizer is largely deter- mined by the quality of the underlying statistical summaries. Small selectivity estimation errors, propagated exponentially, can lead to severely sub-optimal plans. Modern optimizers typically maintain one-dimensional statistical summaries and make the attribute value independence and join uniformity assumptions for efficiently esti- mating selectivities. Therefore, selectivity estimation errors in to- day’s optimizers are frequently caused by missed correlations be- tween attributes. We present a selectivity estimation approach that does not make the independence assumptions. By carefully using concepts from the field of graphical models, we are able to fac- tor the joint probability distribution of all the attributes in the data- base into small, usually two-dimensional distributions. We describe several optimizations that can make selectivity estimation highly efficient, and we present a complete implementation inside Post- greSQL’s query optimizer. Experimental results indicate an order of magnitude better selectivity estimates, while keeping optimiza- tion time in the range of tens of milliseconds.

Lightweight Graphical Models for Selectivity Estimation Without Independence Assumptions

Publications