Copyright, neighboring rights, fake news: is it possible to know the author, the source and the rights of the images published in the press? Thirty-six French and international press sites were analyzed to evaluate the presence of this crucial information in image metadata.

 

Last year IMATAG published the first study on the presence of metadata, including credit and copyright, in images used by newspaper publishers on their websites.

The conclusion, although predictable, was alarming: the vast majority (97%) of the images published on the internet are stripped of their credit metadata, which jeopardizes any attempt to trace an image reused outside its original context (a web page) back to its source, author or rights holder.

Our report exposed publishers' negligence in keeping this identification data attached to the image file, even though a simple technical specification given to their CMS provider could save the credit from the purge caused by image compression.

#1  FOCUS 2019: PRESS SITES

Here is an update of these statistics. Our robots visited the websites of the same publishers as in 2018, this time prioritizing the images displayed on each site's homepage and on the full article pages linked from it. The observation was made in May 2019, on a sample of 36 French and international press sites, representing 100,000 images (with a minimum of 1,000 photos per site).

[Figure: credit or copyright metadata study on news websites]

#2  ONLY 5 SITES HAVE MORE THAN 50% OF IMAGES WITH CREDIT

Publishers with a high rate of credit metadata have a CMS designed to keep them “alive” (that is, not to delete them). Very few sites fall into this category; Der Spiegel is the best example.

#3  5 OTHERS HAVE GOOD INTENTIONS BUT ALSO A FEW HICCUPS

Sites between 30% and 50% may show strange behaviors when certain details are examined.

For example, Politico serves images with credits by default when analyzed by our robots. However, if you browse the site as a human user, the images displayed are resized to fit your device profile (screen size and resolution, network). These resized images, alas, have no metadata.

Default image as found in the page’s source code: credit is “Copyright 2018 the Associated Press. All Rights Reserved.”
Resized image when the page is displayed on a laptop: metadata fields are missing.

Another example: Le Figaro. This year, we modified our algorithm to focus only on the photo credit fields. We discovered that some credits were missing even though the images had metadata: the “owner” field had been filled in instead of the “credit” field. The information was not in the right place, so it was not counted as a credit.
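
To make the distinction concrete, here is a small Python sketch of a check that separates a properly filled credit from one stored in the wrong field. It is an illustration under assumptions, not the algorithm used for the study: the dictionary keys ("credit", "owner", "caption") are simplified, hypothetical names for the fields already extracted from an image, and "Example Agency" is a made-up value.

```python
def credit_status(fields, fallback_fields=("owner",)):
    """Classify one image's metadata as 'credited', 'misplaced' or 'missing'.

    `fields` maps simplified, hypothetical field names to their text values.
    """
    if fields.get("credit", "").strip():
        return "credited"
    # A rights holder is named, but not where a credit reader looks for it.
    for name in fallback_fields:
        if fields.get(name, "").strip():
            return "misplaced"
    return "missing"


print(credit_status({"credit": "Copyright 2018 the Associated Press. All Rights Reserved."}))
print(credit_status({"owner": "Example Agency"}))
print(credit_status({"caption": "A street scene"}))
```

A tool that reads only the credit field would count the second image as uncredited, which is the situation described above.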

#4  BELOW 30%, THE PUBLISHER DOES NOT REALLY CARE

Sites below 30% may have metadata “by accident”: the CMS does not erase them, but in most cases they probably disappeared earlier, during the editing process.
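
The three groups used in this article can be expressed as a simple calculation. The Python sketch below assumes that a site's rate is the share of its sampled images carrying a credit; the site names and counts are invented for illustration and are not figures from the study.

```python
def credit_rate(credited_images: int, total_images: int) -> float:
    """Percentage of sampled images that carry a credit."""
    if total_images <= 0:
        raise ValueError("total_images must be positive")
    return 100.0 * credited_images / total_images


def tier(rate: float) -> str:
    """Map a rate (in percent) to the groups described in this article."""
    if rate > 50:
        return "more than 50%"
    if rate >= 30:
        return "30% to 50%"
    return "below 30%"


# Hypothetical samples: (images with a credit, images analyzed).
samples = {
    "site-a.example": (820, 1000),
    "site-b.example": (410, 1000),
    "site-c.example": (12, 1000),
}

for site, (credited, total) in samples.items():
    rate = credit_rate(credited, total)
    print(f"{site}: {rate:.1f}% -> {tier(rate)}")
```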

#5  THE MAJORITY OF SITES CLEAR METADATA SYSTEMATICALLY

In the end, the majority of sites have a credit metadata rate close to 0%. Whether the negligence lies with the news publisher, its editors or its image providers, the whole chain bears collective responsibility, and the consequences will be far greater now that the European Parliament has passed a law to reward contributors to the information flows exploited by the GAFA platforms.

Without proof that the content relayed by Facebook or Google News is yours, no neighboring rights! This is exactly what metadata were designed for: to sign your visual productions.

#6  RANKING OF PHOTO AGENCIES: NEW EXCLUSIVE IMATAG DATA

To demonstrate the usefulness of credit metadata, IMATAG examined the 3% of web images that still contain a credit and identified those naming a photographic agency.

The agencies were then ranked by the number of websites on which their images were found, which gives a good idea of their global coverage.
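
A ranking of this kind can be sketched in a few lines of Python. The example assumes that each observation is a (website, credit text) pair and that an agency is recognized by a whole-word, case-insensitive match on its name. The list of names, the example sites and the sample credits are illustrative assumptions, not the data or the matching rules of the study.

```python
import re
from collections import defaultdict

# Agency names to look for in credit text (illustrative list).
AGENCIES = ("Getty Images", "AFP", "Reuters", "Associated Press")
PATTERNS = {
    name: re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE) for name in AGENCIES
}


def rank_agencies(observations):
    """Rank agencies by the number of distinct sites where they are credited.

    `observations` is an iterable of (site, credit_text) pairs.
    """
    sites_by_agency = defaultdict(set)
    for site, credit in observations:
        for name, pattern in PATTERNS.items():
            if pattern.search(credit):
                sites_by_agency[name].add(site)
    # Most widely seen agency first; ties broken alphabetically.
    return sorted(
        ((name, len(sites)) for name, sites in sites_by_agency.items()),
        key=lambda item: (-item[1], item[0]),
    )


observations = [
    ("site-a.example", "Photo: Getty Images"),
    ("site-b.example", "AFP / Getty Images"),
    ("site-b.example", "Getty Images"),
    ("site-c.example", "Copyright 2018 the Associated Press. All Rights Reserved."),
    ("site-a.example", "REUTERS/Jane Doe"),
]

for name, site_count in rank_agencies(observations):
    print(name, site_count)
```

Counting distinct sites rather than images keeps a single high-volume publisher from dominating the ranking, which matches the idea of measuring coverage.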

 

The supremacy of Getty Images in this area needed no proof; it is simply glaring in this study. The trio of AFP, Reuters and AP follows, in relatively equal shares. Other newswires come next, ranked according to their production volume and the extent of their geographical coverage.

WHAT TO REMEMBER FROM THIS STUDY

To assert their copyright and neighboring rights, all actors in the image production and distribution chain must mobilize to safeguard image metadata intended to identify rights holders.

To mitigate the erasure or falsification of this essential data, the press industry also needs a single, secure Content Registry, where publishers record the content they produce and broadcast and where platforms can consult it to check rights (copyright) or assess veracity (fake news).

This solution exists: we will discuss RoC, the Register of Content, in a future article!
