Text and data mining (TDM) denotes the automated analysis of large quantities of digital content in order to derive patterns, trends and connections. Under Art. 4 of the EU Copyright Directive (DSM Directive) and its national implementations — §44b UrhG (Copyright Act) in Germany, §42h(6) öUrhG (Austrian Copyright Act) in Austria — commercial TDM is permitted in principle, as long as the rightholder has not declared a reservation of use. For works available online, this reservation must be machine-readable.
Note: This article offers a general overview and does not replace individual legal advice. The legal position of mediaintel is documented in the TDM reservation.
The DSM Directive and its Article 4
Directive (EU) 2019/790 on copyright in the Digital Single Market, the DSM Directive for short, created the first EU-wide exceptions for text and data mining. Art. 3 permits TDM by research organisations and cultural heritage institutions for the purposes of scientific research, with no opt-out for rightholders. Art. 4 permits TDM for all other purposes, including commercial ones, but grants rightholders the right to expressly reserve the use.
§44b UrhG (DE) and §42h öUrhG (AT)
Germany implemented Art. 4 in §44b UrhG: reproductions for the purpose of text and data mining are permitted, provided the rightholder has not reserved them; for works available online, the reservation is only effective if declared in machine-readable form. Austria combined both TDM exceptions in §42h öUrhG: §42h(1) corresponds to the research exception under Art. 3 (without opt-out, for scientific research), while §42h(6) is the commercially usable counterpart to §44b UrhG — TDM on lawfully accessible works, provided the rightholder has not declared a reservation. Both provisions require that lawfully accessible works be used.
The machine-readable reservation of use (opt-out)
The reservation must be discoverable and machine-readable for the party engaged in the mining. In practice this happens via established signals and locations, such as:
- entries in the robots.txt in the root directory of the domain that exclude text and data mining or specific crawlers,
- metadata, HTTP headers or rights statements anchored in the HTML that prohibit TDM use,
- express reservations in terms and conditions, terms of use or the legal notice,
- structured rights statements according to relevant standards.
An important distinction: merely blocking AI-specific bots (such as GPTBot) is not automatically a general TDM reservation of use. The reservation must clearly target text and data mining and be discoverable by the party doing the mining. A uniform technical standard is still under development, so anyone capturing content lawfully must evaluate and respect the signals in common use. Where an effective reservation exists, the TDM exception ceases to apply and the use requires the rightholder's consent.
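How such signals might be evaluated can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of this article: the crawler name `ExampleMonitorBot` is hypothetical, and the field name `tdm-reservation` (as an HTTP header and as an HTML meta tag, with the value `1` meaning "reserved") is taken from the TDM Reservation Protocol proposed by a W3C community group, which is only one of several possible signals. Reservations declared in terms of use or a legal notice are not covered here and would still need separate review.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent token of the monitoring crawler (an assumption).
CRAWLER_AGENT = "ExampleMonitorBot"


class _MetaReservationParser(HTMLParser):
    """Records the content of <meta name="tdm-reservation"> if present."""

    def __init__(self):
        super().__init__()
        self.value = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "tdm-reservation":
            self.value = attr.get("content", "").strip()


def robots_allows(robots_txt: str, url: str, agent: str = CRAWLER_AGENT) -> bool:
    """Check whether the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def tdm_reserved(headers: dict, html: str) -> bool:
    """Detect an assumed 'tdm-reservation: 1' signal in headers or HTML."""
    normalized = {key.lower(): str(value).strip() for key, value in headers.items()}
    if normalized.get("tdm-reservation") == "1":
        return True
    meta = _MetaReservationParser()
    meta.feed(html)
    return meta.value == "1"


def may_mine(robots_txt: str, url: str, headers: dict, html: str) -> bool:
    """Mining is considered only if crawling is allowed and nothing is reserved."""
    return robots_allows(robots_txt, url) and not tdm_reserved(headers, html)
```

The two checks are kept separate on purpose: a robots.txt that only excludes GPTBot leaves `robots_allows` true for the hypothetical crawler and, by itself, does not make `tdm_reserved` true, which mirrors the distinction drawn above.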
What this means for media monitoring
Media monitoring is a form of automated content analysis and therefore falls within the scope of the TDM rules. Lawful monitoring thus means two things: capturing only lawfully accessible content — and evaluating and observing declared reservations of use.
The position of mediaintel
mediaintel captures content from lawfully accessible, vetted sources, evaluates and observes declared and machine-readable reservations of use, and carries traceable source and licence information with every item. Conversely, mediaintel's own reservation for the content of this website is documented in the TDM reservation.
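What "traceable source and licence information with every item" could look like as a data structure is sketched below. This is purely illustrative and not a description of mediaintel's actual system: the class `CapturedItem`, its field names and the helper `mineable` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CapturedItem:
    """One captured piece of content together with its provenance (hypothetical)."""
    url: str                   # where the content was captured
    source_name: str           # vetted source the item belongs to
    licence: str               # licence or legal basis recorded at capture time
    lawfully_accessible: bool  # result of the access check
    tdm_reserved: bool         # True if a reservation of use was found
    captured_at: datetime      # timestamp of the capture


def mineable(items):
    """Keep only items that are lawfully accessible and carry no reservation."""
    return [
        item for item in items
        if item.lawfully_accessible and not item.tdm_reserved
    ]


now = datetime.now(timezone.utc)
items = [
    CapturedItem("https://example.org/a", "Example News", "TDM exception", True, False, now),
    CapturedItem("https://example.org/b", "Example News", "none", True, True, now),
]
usable = mineable(items)  # only the first item remains
```

Storing the reservation status and the licence alongside each item means the decision to use or exclude it can be reconstructed later from the record alone.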
Frequently asked questions
What is a TDM reservation?
A TDM reservation (opt-out) is a rightholder's declaration that their content may not be used for text and data mining. Under Art. 4 of the DSM Directive and §44b UrhG (Copyright Act), TDM is permitted in principle, provided the rightholder has not declared such a reservation — for works available online, it must be machine-readable.
What does §44b UrhG regulate?
§44b of the German Copyright Act (UrhG) implements Art. 4 of the DSM Directive and permits reproductions for text and data mining. Uses are permitted as long as the rightholder has not reserved them; for works available online, the reservation must be made in machine-readable form. In Austria, the corresponding provision is found in §42h(6) öUrhG.
How does lawful media monitoring respect reservations of use?
By capturing only lawfully accessible content, evaluating and observing declared and machine-readable reservations of use, and carrying traceable source and licence information with every item. Content with a declared reservation is not used for mining purposes.