The problem of performing similarity queries on compressed data is considered. We study the fundamental tradeoff between compression rate, sequence length, and reliability of queries performed on compressed data. For a Gaussian source and quadratic similarity criterion, we show that queries can be answered reliably if and only if the compression rate exceeds a given threshold – the identification rate – which we explicitly characterize. We then discuss the case of binary symmetric source and Hamming distortion, for which we have similar results. Finally, we turn to the general case and show that the exact calculation of the identification rate is tightly connected to the existence of an isoperimeric inequality over an appropriate domain.