When a picture is taken with a camera, the raw sensor data undergoes
several processing steps before the final image is obtained. In
particular, the image is denoised and demosaiced, i.e. the missing
colors of the Bayer pattern are interpolated. Chromatic aberrations
and optical distortions are corrected, and non-linear transformations
are applied to enhance the contrast. The image is finally compressed
so that it can be stored and transmitted in a reasonable time. The
JPEG (Joint Photographic Experts Group) standard is the most commonly
used compression format for digital photographs today. JPEG is a
lossy compression method which reduces the image size at the price of
image degradation. The main degradation is the appearance of a
blocking artifact in the form of 8×8 squares, forming what is called
the JPEG grid. The stronger the compression, the more visible this
grid becomes. Even when this signature is imperceptible to the naked
eye in lightly compressed images, it is statistically significant and
therefore detectable.
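As a purely illustrative aside (this is not the method developed in
this thesis), the short Python sketch below shows one naive way such a
statistical signature can be exposed: in a compressed image, pixel
differences measured across 8×8 block boundaries tend to be larger
than those measured inside blocks. The function name and the choice of
measure are hypothetical.

    import numpy as np

    def blockiness_score(gray):
        """Contrast between horizontal pixel differences across 8x8 block
        boundaries and those inside blocks (illustrative measure only).

        A score clearly above zero hints at a JPEG grid aligned with the
        standard origin; a score near zero gives no such evidence."""
        d = np.abs(np.diff(gray.astype(float), axis=1))  # |I(y, x+1) - I(y, x)|
        cols = np.arange(d.shape[1])
        boundary = (cols % 8) == 7   # differences that cross a block border
        return d[:, boundary].mean() - d[:, ~boundary].mean()

Deciding whether such a score is significant, rather than merely
positive, is precisely where the statistical validation discussed
below comes in.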
In addition to these standard, automatic operations that define the
image processing chain, users can perform manual global and local
modifications. These can be alterations of color and brightness, or
local retouching. These modifications can have various purposes,
including forgery, and they are within everyone's reach thanks to
easy-to-use image editing software. Local forgery generally involves
erasing, masking, cloning or inserting objects in an image. These
local operations break the homogeneity of the compression traces. The
state of the art in forgery detection builds on these considerations
to propose digital filters, i.e. operators able to highlight areas
that appear to have been forged. However, a review of these methods
shows that they suffer from shortcomings in the evaluation of the
confidence that can be attributed to their detections. In the absence
of any probabilistic and statistical modeling, this confidence cannot
be quantified.
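As a second simplified sketch (again, not one of the thesis
algorithms), the rupture of homogeneity can be illustrated by
estimating the most likely grid offset inside a local window and
comparing it with the offset estimated on the whole image: a
disagreement points to a region whose compression traces are
inconsistent with the rest of the image. The function below reuses the
boundary-contrast idea of the previous sketch; its name and the window
size are hypothetical.

    import numpy as np

    def local_grid_offset(gray, x0, y0, w=64):
        """Most likely horizontal grid offset (0..7) inside a w-by-w window,
        expressed in absolute image coordinates (illustrative only).

        A region pasted from another JPEG image may carry a grid whose
        offset disagrees with the one estimated on the whole image."""
        win = gray[y0:y0 + w, x0:x0 + w].astype(float)
        d = np.abs(np.diff(win, axis=1))
        cols = np.arange(d.shape[1])
        # scores[o] measures boundary contrast for a grid starting at window column o
        scores = [d[:, cols % 8 == (o + 7) % 8].mean() for o in range(8)]
        return int((np.argmax(scores) + x0) % 8)  # convert to absolute offset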
Through the algorithms presented in this thesis, we aim to reconstruct
the complete compression history of the analyzed images. We provide a
statistical validation of the inconsistencies detected in the image
based on a contrario detection methods, using large deviation
arguments. This framework allows us to characterize the detected
events as ones that could not have occurred by chance, since they are
highly improbable in a normal image.
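In the usual a contrario formulation (given here as a generic sketch;
the exact background model and the large-deviation bounds used in the
thesis are not reproduced), a candidate event gathering k out of n
observations, each having probability p under the background model, is
validated when its Number of False Alarms, NFA = N_tests · P(Bin(n, p) ≥ k),
falls below a threshold, typically 1. A minimal computation, assuming
SciPy is available:

    from scipy.stats import binom

    def nfa(n_tests, k, n, p):
        """Number of False Alarms of an event with k hits out of n trials,
        each having probability p under the background (null) model."""
        return n_tests * binom.sf(k - 1, n, p)  # binom.sf(k-1, n, p) = P(X >= k)

An event with NFA below 1 is expected to occur less than once by
chance over the whole set of tests, which is what justifies reporting
it as a detection.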
As a result, the proposed detection tools offer an automatic analysis
of images, and do not require any interpretation or expertise in the
field. The algorithms developed are published and made available
online so that they can be used by as many users as possible, in
particular by fact-checking journalists via the InVID-WeVerify
verification tool developed by the Agence France-Presse news agency.
To make this research reproducible, our scientific publications are
delivered together with their source code and an online execution via
the IPOL journal (Image Processing On Line, https://www.ipol.im).
In the final part of this thesis, we explore the evaluation of forgery
detection methods themselves. We propose a methodology and a dataset
to study the sensitivity of the detection tools to specific traces, as
well as their ability to perform detection without semantic cues in
the image. More than a mere evaluation tool, this methodology can be
used to analyze the strengths and weaknesses of each method.