At SONY CSL, we have pioneered the use of timbre representations (MFCCs, spectral models, etc.) for comparing music titles. These representations were traditionally used in speech processing.
Our work consisted in using them to extract a representation of the global timbre color of an entire title. From this representation, distances between titles can be computed that capture a form of “timbral similarity”. These algorithms won the ISMIR 2004 genre classification contest and are now widely used and extended in the MIR community.
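To make the idea concrete, the sketch below computes a global timbre model for each title and a distance between them. It is a simplified illustration, not our original algorithm: the original work modelled MFCC frames with Gaussian mixture models, whereas here each title is summarized by a single Gaussian (mean and covariance of its MFCC frames) compared with a symmetrized Kullback-Leibler divergence, a common simplification in later MIR work. The librosa calls are standard; the file names are hypothetical.

```python
import numpy as np
import librosa

def timbre_model(path, n_mfcc=13):
    """Summarize the global timbre of a title as (mean, covariance) of its MFCC frames."""
    y, sr = librosa.load(path, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)      # shape: (n_mfcc, n_frames)
    frames = mfcc.T                                             # one row per analysis frame
    mu = frames.mean(axis=0)
    cov = np.cov(frames, rowvar=False) + 1e-6 * np.eye(n_mfcc)  # regularize for invertibility
    return mu, cov

def kl_gauss(mu0, cov0, mu1, cov1):
    """Closed-form KL divergence KL(N0 || N1) between two multivariate Gaussians."""
    k = mu0.shape[0]
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - k
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def timbral_distance(path_a, path_b):
    """Symmetrized KL divergence between the global timbre models of two titles."""
    mu_a, cov_a = timbre_model(path_a)
    mu_b, cov_b = timbre_model(path_b)
    return kl_gauss(mu_a, cov_a, mu_b, cov_b) + kl_gauss(mu_b, cov_b, mu_a, cov_a)

if __name__ == "__main__":
    # Hypothetical file names; a smaller value means a more similar global timbre.
    print(timbral_distance("title_a.wav", "title_b.wav"))
```

A distance of this kind can be used directly for nearest-neighbour retrieval of “sounds like” titles, or fed to a classifier for tasks such as genre classification.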