Deep neural networks achieve state-of-the-art performance for image classification and other tasks but are easily fooled by forgeries that slightly modify a legitimate image in a specific direction yet remain visually indistinguishable from the original. We formulate detection of such forgeries as a watermark detection problem and derive locally optimal detectors based on Gaussian mixture models (GMMs) for low-dimensional image representations. The GMM parameters are learned from training data, and the reliability of our forgery detector is assessed on several image classification tasks.
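
As a rough illustration of the kind of detector described above, the following sketch computes a textbook locally optimal statistic under a GMM. It assumes, beyond what is stated here, that a forgery adds a small multiple of a known direction `w` to a low-dimensional representation `x`, so that the statistic is T(x) = -wᵀ∇ log p(x) with p the GMM density. The function names `gmm_score` and `lo_statistic`, the direction `w`, and the toy parameters are hypothetical; in practice the GMM parameters would be fitted to training data (for example with scikit-learn's `GaussianMixture`), and the actual detector may differ.

```python
import numpy as np


def gmm_score(x, weights, means, covs):
    """Gradient of log p(x) for a full-covariance GMM.

    x: (d,), weights: (K,), means: (K, d), covs: (K, d, d).
    """
    d = x.shape[0]
    diffs = x - means                                    # (K, d)
    precisions = np.linalg.inv(covs)                     # (K, d, d)
    # Sigma_k^{-1} (x - mu_k) for every component k
    pdiffs = np.einsum("kij,kj->ki", precisions, diffs)
    maha = np.einsum("ki,ki->k", diffs, pdiffs)          # Mahalanobis terms
    _, logdets = np.linalg.slogdet(covs)
    log_comp = np.log(weights) - 0.5 * (maha + logdets + d * np.log(2 * np.pi))
    # Responsibilities r_k(x), computed stably in the log domain
    resp = np.exp(log_comp - log_comp.max())
    resp /= resp.sum()
    # grad log p(x) = -sum_k r_k(x) Sigma_k^{-1} (x - mu_k)
    return -(resp[:, None] * pdiffs).sum(axis=0)


def lo_statistic(x, w, weights, means, covs):
    """Locally optimal statistic for an assumed additive shift along w."""
    return -w @ gmm_score(x, weights, means, covs)


# Toy usage with made-up parameters (not taken from any experiment).
weights = np.array([0.6, 0.4])
means = np.array([[0.0, 0.0], [3.0, 1.0]])
covs = np.array([np.eye(2), [[2.0, 0.3], [0.3, 0.5]]])
w = np.array([1.0, 0.0])       # hypothetical forgery direction
x = np.array([0.5, -0.2])      # hypothetical image representation
t = lo_statistic(x, w, weights, means, covs)
```

A detector of this form would flag `x` as a forgery when `t` exceeds a threshold chosen for a target false-alarm rate; the threshold itself is not specified here.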