Abstract
We investigate the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval. The EMD is based on the minimal cost that must be paid to transform one distribution into the other, in a precise sense, and was first proposed for certain vision problems by Peleg, Werman, and Rom. For image retrieval, we combine this idea with a representation scheme for distributions that is based on vector quantization. This combination leads to an image comparison framework that often accounts for perceptual similarity better than other previously proposed methods. The EMD is based on a solution to the transportation problem from linear optimization, for which efficient algorithms are available, and also allows naturally for partial matching. It is more robust than histogram matching techniques, in that it can operate on variable-length representations of the distributions that avoid quantization and other binning problems typical of histograms. When used to compare distributions with the same overall mass, the EMD is a true metric. In this paper we focus on applications to color and texture, and we compare the retrieval performance of the EMD with that of other distances.
Similar content being viewed by others
References
Ahuja, R.K., Magnanti, T.L., and Orlin, J.B. 1993. Network Flows. Prentice Hall: Englewood Cliffs, NJ.
Belongie, S., Carson, C., Greenspan, H., and Malik, J. 1998. Colorand texture-based image segmentation using EM and its application to content-based image retrieval. In IEEE International Conference on Computer Vision, Bombay, India. pp. 675–682.
Bentley, J.L. 1975. Multidimensional binary search trees used for associative searching. Communications of the ACM, 18:509–517.
Bigün, J. and du Buf, J.M. 1994. N-folded symmetries by complex moments in Gabor space and their application to unsupervised texture segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(1):80–87.
Bovik, A.C., Clark, M., and Geisler, W.S. 1990. Multichannel texture analysis using localized spatial filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(12):55–73.
Bozkaya, T. and Ozsoyoglu, M. 1997. Distance-based indexing for high-dimensional metric spaces. SIGMOD Record (ACM Special Interest Group on Management of Data), 26(2):357–368.
Brodatz, P. 1966. Textures: A Photographic Album for Artists and Designers: Dover: New York, NY.
Clarkson, K.L. 1997. Nearest neighbor queries in metric spaces. In ACM Symposium on the Theory of Computing, pp. 609–617.
Cover, T.M. and Thomas, J.A. 1991. Elements of Information Theory. Wiley Series in Telecommunications. John Wiley & Sons: New York, NY, USA.
Das, M., Riseman, E.M., and Draper, B.A. 1997. FOCUS: Searching for multi-colored objects in a diverse image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 756–761.
Daugman, J.D. 1998. Complete discrete 2-d Gabor transforms by neural networks for image analysis and compression. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36:1169–1179.
Duda, R.O. and Hart, P.E. 1973. Pattern Classification and Scene Analysis. Wiley: New York.
Farrokhnia, F. and Jain, A.K. 1991. A multi-channel filtering approach to texture segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 364–370.
Field, D.J. 1987. Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A, 4(12):2379–2394.
Gabor, D. 1946. Theory of communication. The Journal of the Institute of Electrical Engineers, Part III, 93(21):429–457.
Hafner, J., Sawhney, H.S., Equitz, W., Flickner, M., and Niblack, W. 1995. Efficient color histogram indexing for quadratic form distance functions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(7):729–735.
Hillier, F.S. and Lieberman, G.J. 1990. Introduction to Mathematical Programming. McGraw-Hill, New York, NY.
Hitchcock, F.L. 1941. The distribution of a product from several sources to numerous localities. J. Math. Phys., 20:224–230.
Karmarkar, N. 1984. A new polynomial-time algorithm for linear programming. In Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, Washington, DC. pp. 302–311.
Klee, V. and Minty, G. 1972. How good is the simplex algorithm. In Inequalities, Vol. III, O. Shisha (Ed.). Academic Press: New York, NY, pp. 159–175.
Kullback, S. 1968. Information Theory and Statistics Dover: New York, NY.
Liu, F. and Picard, R.W. 1996. Periodicity, directionality, and randomness: Wold features for image modeling and retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(7):722–733.
Manjunath, B.S. and Ma, W.Y. 1996. Texture features for browsing and retrieval of image data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8):837–842.
Nasrabad, N.M. and King, R.A. 1988. Image coding using vector quantization: A review. IEEE Transactions on Communication, 36(8):957–971.
Niblack, W., Barber, R., Equitz, W., Flickner, M.D., Glasman, E.H., Petkovic, D., Yanker, P., Faloutsos, C., Taubin, G., and Heights, Y. 1993. Querying images by content, using color, texture, and shape. In SPIE Conference on Storage and Retrieval for Image and Video Databases, Vol. 1908, pp. 173–187.
Peleg, S., Werman, M., and Rom, H. 1989. A unified approach to the change of resolution: Space and gray-level. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11:739–742.
Poynton, C. 1996. A Technical Introduction to Digital Video. John Wiley and Sons, New York, NY.
Puzicha, J., Hofmann, T., and Buhmann, J. 1997. Non-parametric similarity measures for unsupervised texture segmentation and image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 267–272.
Rachev, S.T. 1984. The Monge-Kantorovich mass transference problem and its stochastic applications. Theory of Probability and its Applications, XXIX(4):647–676.
Russell, E.J. 1969. Extension of Dantzig's algorithm to finding an initial near-optimal basis for the transportation problem. Operations Research, 17:187–191.
Shen, H.C. and Wong, A.K.C. 1983. Generalized texture representation and metric. Computer, Vision, Graphics, and Image Processing, 23:187–206.
Smith, J.R. 1997. Integrated Spatial and Feature Image Systems: Retrieval, Analysis and Compression. Ph.D. Thesis, Columbia University.
Stolfi, J. 1994. Personal communication.
Stricker, M. and Orengo, M. 1995. Similarity of color images. In SPIE Conference on Storage and Retrieval for Image and Video Databases III, Vol. 2420, pp. 381–392.
Swain, M.J. and Ballard, D.H. 1991. Color indexing. International Journal of Computer Vision, 7(1):11–32.
Tversky, A. 1977. Features of similarity. Psychological Review, 84(4):327–352.
Werman, M., Peleg, S., and Rosenfeld, A. 1985. A distance metric for multi-dimensional histograms. Computer, Vision, Graphics, and Image Processing, 32:328–336.
Wyszecki, G. and Stiles, W.S. 1982. Color Science: Concepts and Methods, Quantitative Data and Formulae. John Wiley and Sons: New York, NY.
Zikan, K. 1990. The Theory and Applications of Algebraic Metric Spaces. Ph.D. Thesis, Stanford University.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Rubner, Y., Tomasi, C. & Guibas, L.J. The Earth Mover's Distance as a Metric for Image Retrieval. International Journal of Computer Vision 40, 99–121 (2000). https://doi.org/10.1023/A:1026543900054
Issue Date:
DOI: https://doi.org/10.1023/A:1026543900054