The Digital Services Act

New rules for online platforms and how to hold them accountable

Dr. Martin Degeling
Stiftung Neue Verantwortung
IFIP summer school on privacy and identity management - 09.08.2023

About me

  • Research on Usable Privacy and Security, the GDPR and Web Privacy (Ruhr University, CMU)
  • since 2022: Stiftung Neue Verantwortung, a research-oriented tech-policy think tank

Summary

  • Intro to the Digital Services Act (DSA) and what it means for (data-driven) research on socio-technical systems like social networks
  • Deep Dive into methods for TikTok Risk Assessments

Slides at: martin.degeling.com/slides/ifip

Introduction to the DSA

"The Digital Services Act is an EU regulation that defines obligations for online services regarding liability for illegal content, content moderation, transparency, and due diligence for service providers."

Why the DSA?

Who is regulated by the DSA?

  • intermediaries: online services that only pass on data
  • hosting: services that store information
  • online platforms: hosters that disseminate user information to the public
  • VLOPs and VLOSEs: online platforms and search engines with more than 45 M monthly users in the EU

Image Source: EU Commission
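
The tiers above are cumulative and threshold-based; as a rough sketch, the categorisation could be expressed like this in Python (the 45 million threshold is from the DSA, the function and parameter names are my own):

```python
def dsa_category(stores_information: bool,
                 disseminates_to_public: bool,
                 monthly_eu_users: int) -> str:
    """Roughly map a service's properties to its DSA category.

    Illustrative sketch only; the legal definitions are more nuanced.
    """
    if not stores_information:
        return "intermediary"          # only passes on data
    if not disseminates_to_public:
        return "hosting"               # stores, but does not disseminate
    if monthly_eu_users > 45_000_000:  # threshold set in the DSA
        return "VLOP"
    return "online platform"
```

For example, a platform storing and publicly disseminating content for well over 45 M monthly EU users falls into the VLOP category.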

What are the new rules for whom?

New obligations (cumulative: each tier also carries the obligations of the tiers before it)

All intermediary services:

  • Transparency reporting
  • Requirements on ToS with due account of fundamental rights
  • Cooperation with national authorities
  • Points of contact

Hosting services additionally:

  • Notice and action / providing information to users
  • Reporting criminal offenses

Online platforms additionally:

  • Complaint and redress mechanism and out-of-court dispute settlement
  • Trusted flaggers
  • Measures against abusive notices and counter-notices

What are the new rules for whom? (continued)

New obligations (cumulative)

Online platforms additionally:

  • Special obligations for marketplaces
  • Ban on targeted adverts to children
  • Transparency of recommender systems
  • User-facing transparency of online advertising

Very large platforms additionally:

  • Risk management obligations and crisis response
  • External & independent auditing
  • User choice for recommender systems
  • Data sharing with authorities and researchers
  • Codes of conduct
  • Crisis response cooperation

What are the VLOPs/VLOSes?

Very Large Online Platform

  • Alibaba AliExpress
  • Amazon Store
  • Apple AppStore
  • Booking.com
  • Meta: Instagram & Facebook
  • Google: YouTube, Play, Maps, Shopping
  • LinkedIn
  • Pinterest
  • Snapchat
  • TikTok
  • Twitter
  • Wikipedia
  • Zalando

Very Large Search Engine

  • Bing
  • Google

After the EU published the list in April 2023, Amazon complained that it is unfair to be singled out,
and Zalando argued that it is a safe platform.

Nothing new

Many of the proposed measures are already happening:

  • There already is a code of practice on disinformation in the EU, but it is not mandatory, so Twitter withdrew
  • Facebook offers CrowdTangle, a tool for researchers to study content distribution, but it has been put on hold and no other platform offers something similar
  • Some transparency measures exist (some thanks to the GDPR), but others, like the Twitter and Reddit APIs, were shut down
  • Trusted flaggers (e.g. YouTube Heroes) already exist, but adoption is sparse

The DSA is still necessary...

  • ... to get from self-regulation to common standards
  • ... to raise the bar and level the playing field

Why is the DSA important for privacy and security researchers?

  • privacy is about autonomy; topics like transparency, user choice and control appear in the GDPR as well as in the DSA
  • the DSA will open up new research avenues with access to additional data

Why are we looking at TikTok?

  • TikTok is a designated VLOP (> 100 million monthly users in the EU)
  • relatively new and less studied
  • immense impact on its users as well as on the ecosystem

A TikTok hands on

  • Shows videos in a constant stream (swipe up), allows editing
  • Interaction through likes and comments
  • Various search functions to explore content
  • Financed through ads, micro-payments
  • a myriad of features (not shown: lives, shopping)

What's so new about the TikTok model?

Subscriptions (e.g. podcasts, RSS): active selection

Network (e.g. the "old" Facebook, IG): selection by others

Algorithm (e.g. TikTok FYP, IG Reels, Twitter): weights (based on implicit feedback)

Graphic: Arvind Narayanan: Understanding Social Media Recommendation Algorithms, 3.9.2023
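
The contrast between the three distribution models can be sketched as ranking functions. This is purely illustrative Python, not any platform's actual code; all field names and weights are made up:

```python
def subscription_feed(posts, subscribed_to):
    # Active selection: only sources the user explicitly chose, newest first.
    return sorted((p for p in posts if p["author"] in subscribed_to),
                  key=lambda p: p["age_hours"])

def network_feed(posts, friends):
    # Selection by others: what the user's network shared, newest first.
    return sorted((p for p in posts if p["shared_by"] & friends),
                  key=lambda p: p["age_hours"])

def algorithmic_feed(posts, weights):
    # Weights on implicit feedback: rank every post by a predicted score,
    # regardless of who the user follows.
    def score(p):
        return sum(weights[s] * p["signals"][s] for s in weights)
    return sorted(posts, key=score, reverse=True)
```

The key difference: in the first two models the candidate set is limited by the user's own choices or network, while the algorithmic model scores the entire content pool.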

A recommendation algorithm

Our Project at SNV:

How to conduct risk assessments

VLOPs are required to conduct these on their own platforms. But we have little trust that they will be thorough, so researchers and civil society should do them too.

Systemic risks listed in the DSA (1/2)

  • risks associated with the dissemination of illegal content, such as: child sexual abuse material, hate speech or other types of misuse of their services for illegal activities. (Recital 80)
  • impact of the service on the exercise of fundamental rights, as protected by the EU Charter, including: human dignity, freedom of expression, media freedom and pluralism, the right to private life, data protection, the right to non-discrimination, the rights of the child and consumer protection. (Recital 81)

Systemic risks listed in the DSA (2/2)

  • the actual or foreseeable negative effects on democratic processes, civic discourse and electoral processes, as well as public security. (Recital 82)
  • negative effect on the protection of public health, minors and serious negative consequences to a person’s physical and mental well-being, or on gender-based violence. (Recital 83)

From a very abstract risk to a concrete scenario

Identify stakeholders and map out the audit process
Define and prioritise risk scenarios
Define and prioritise measurements to observe risk scenarios
Analyse results and create a report

Get a good understanding of the platform. Use this information to determine the profiles of stakeholders who should be involved in the process. Depending on the experience and expertise needed, stakeholders could be platform developers, researchers, legal experts and representatives of the parties affected.

More details
  • Media: What type of media is the platform based on?
  • Audience: What is the audience of the platform?
  • Products: What technical products exist on the platform?
  • Strategy: What is the platform's main strategy?

Possible stakeholders: legal experts, platforms, users and civil society, researchers, independent contractors

Define and prioritise scenarios. A scenario is a description of specific issues related to a 'systemic risk'. It breaks down abstract risks into concrete testable hypotheses by defining the affected party and its characteristics, the harm, the involved elements of the platform and the further impact. A systemic risk may often involve several scenarios; therefore, selecting scenarios and deciding if they have a 'high' priority is necessary.

More details
Template: An individual/group/institution (affected party), defined by some characteristics, has experienced a harm that is related to something happening on the platform (platform involvement), and this also has macro impact.

Example: A young adult who is temporarily in a personal crisis is overly exposed to videos describing or showing self-harm by the recommender system of the 'For You' feed, and this might exacerbate the general mental health crisis of young adults.

Each scenario is then rated as normal or high priority.
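
The scenario template lends itself to a small data structure; a sketch (the field names mirror the template, everything else is illustrative):

```python
from dataclasses import dataclass

@dataclass
class RiskScenario:
    """One concrete, testable scenario derived from an abstract systemic risk."""
    affected_party: str        # who is harmed
    characteristic: str        # what makes them vulnerable
    harm: str                  # what happens to them
    platform_involvement: str  # which platform element is involved
    macro_impact: str          # effect beyond the individual
    high_priority: bool = False

# The self-harm example from this slide, encoded in the template:
example = RiskScenario(
    affected_party="young adult",
    characteristic="temporarily in a personal crisis",
    harm="overly exposed to videos describing or showing self-harm",
    platform_involvement="recommender system of the 'For You' feed",
    macro_impact="might exacerbate the general mental health crisis of young adults",
    high_priority=True,
)
```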

Develop measurements to understand the scenario. There are different types of algorithm audits, as well as platform elements to consider. These can range from automated measurements that look at the actual implementation to user perspectives through surveys. An auditor needs to develop multiple measurements and then prioritise them to find the best measurement(s) to test a specific scenario.

More details

Select an audit type and the platform elements it connects to.

Audit types: code audit, document audit, architecture audit, user survey

Connected platform elements: user experience, user interface, algorithmic logic (e.g. review source code, evaluate model parameters), content moderation, terms and conditions, advertisement, report mechanisms, data-related practices (e.g. check data flows, ensure that sensitive data like ethnicity is not used as model input)

Each candidate measurement is then rated on two dimensions:

  • Priority: How can a scenario be effectively detected with this measurement? Criteria: detectability, platform influence, macro effects, individual harm
  • Difficulty: How difficult will it be to execute this measurement? Criteria: implementation costs, replicability, representativeness
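
One way to operationalise this prioritisation is to score each candidate measurement on the criteria above and rank by detection value per unit of effort. The scoring scale and the sum/ratio aggregation here are illustrative choices, not part of the framework:

```python
def prioritise(measurements):
    """Rank candidate measurements: high detection value and low difficulty first.

    Each measurement is a dict with scores from 1 (low) to 5 (high)
    on the criteria named above.
    """
    def key(m):
        priority = (m["detectability"] + m["platform_influence"]
                    + m["macro_effects"] + m["individual_harm"])
        difficulty = (m["implementation_cost"] + m["replicability_effort"]
                      + m["representativeness_effort"])
        return priority / difficulty
    return sorted(measurements, key=key, reverse=True)
```

A cheap, reasonably detectable measurement (e.g. scraping) would then outrank a slightly more revealing but much harder one (e.g. a full code audit of a closed platform).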

After conducting the measurements, you need to analyse the results and write an audit report. The report should foster observability, enable reproducibility, and recommend mitigation measures.

More details
The process covers planning, scenarios, measurements and evaluation, aiming for observability and reproducibility. Suggested report structure: 1) Executive summary, 2) Introduction, 3) Scenarios, 4) Methods and measurements, 5) Results and audit opinion.
A. Meßmer & M. Degeling, "Auditing Recommender Systems" (2023)

One pillar: audit types and platform elements

There are various possibilities to study risks on different elements of the platform with different methods

A. Meßmer & M. Degeling, "Auditing Recommender Systems" (2023)

Document Audit

  • Public documentation
  • Terms of service
  • Leaks

Goal: Better understand the platform's processes and the company's motivation.

Exhibit A: TikTok's Official Explanation

Parameters TikTok lists:

  • User interactions: likes, shares, comments, ...
  • Video information: captions, sounds, hashtags
  • Device and account settings: language, country, device (less important)

Exhibit B: TikTok Leak

The "internal memo":

  • Optimization for usage time and user retention
  • Each video is assigned a value for each user, estimating how likely it is to result in an interaction
  • Important: calculations are mostly based on metadata (who posted, how many likes there are already), less on the actual content
Ben Smith, NYTimes: How TikTok Reads Your Mind, 6.12.2021, Translated Memo
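
The memo, as described in the NYT article, sketches this value as a weighted combination of predicted engagement signals. A rough Python approximation of that description (the weights and signal names here are placeholders, not TikTok's actual parameters):

```python
def video_value(p_like, p_comment, p_play, expected_play_time,
                w_like=1.0, w_comment=2.0, w_play=0.5, w_time=1.5):
    """Hypothetical per-user value of a video: a weighted sum of
    predicted interaction probabilities and expected play time.

    Illustrative only; the real model and weights are not public.
    """
    return (w_like * p_like
            + w_comment * p_comment
            + w_play * p_play
            + w_time * expected_play_time)
```

The point of the sketch is structural: the score rises with any predicted interaction, which is why optimising it optimises usage time and retention.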

Exhibit B: TikTok Leak

The value of a video differs depending on the actor:

  • User: View time, app usage, satisfaction with the app
  • Creator: Number of interactions, Traffic, creator revenue
  • Platform: Brand Effect, Content Security, Platform Revenue
  • Indirect: Number of comment complaints, "Content Ecosystem"
Ben Smith, NYTimes: How TikTok Reads Your Mind, 6.12.2021, Translated Memo

Architecture Audit

How do different (software) products work together to create the platform experience

Exhibit A: Informed speculation

  • Multiple Steps of Content Moderation
  • Additional Factors: Cool Down and Gravedigging

Exhibit B: Leaks on Heating

  • Nothing goes viral that was not reviewed (SMW, 14.03.23)
  • TikTok employees can "heat" certain videos

"The total video views of heated videos accounts for a large portion of the daily total video views, around 1-2%, which can have a significant impact on overall core metrics."

Emily Baker-White: TikTok's Secret 'Heating' Button Can Make Anyone Go Viral, 20.01.2023, Forbes.com

Automated Audit

Using automated means to simulate users (aka scraping)

  • Through the Web
  • On the mobile App

Exhibit A: Scraping the website

  • Accessing TikTok through the website is easy
  • "unofficial" APIs simplify access
  • what you see is limited and "clean"
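
TikTok's web pages have embedded their page state as a JSON blob inside a script tag; a minimal sketch of extracting such a blob (the tag id "SIGI_STATE" has been used by TikTok's web app in the past, but platforms rename these blobs regularly, so treat it as an assumption):

```python
import json
import re

def extract_embedded_state(html: str, script_id: str = "SIGI_STATE"):
    """Pull a JSON state object out of a <script id="..."> tag.

    Returns the parsed object, or None if the tag is not found.
    """
    match = re.search(
        r'<script id="%s"[^>]*>(.*?)</script>' % re.escape(script_id),
        html, re.DOTALL)
    if not match:
        return None
    return json.loads(match.group(1))
```

This is also where the "limited and clean" caveat bites: the state object only contains what the web frontend was given, which differs from what the mobile app receives.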

Exhibit B: Automated Mobile Swiping

Using mitmproxy and adb, we can automate the use of TikTok:

  • At startup the server sends 8 videos and measures how long they are viewed
  • Recommender adaptation: the first bubble can be reached within minutes
  • Result: virality happens faster for shorter videos
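
The swiping side of this setup can be driven with adb alone; a minimal sketch (screen coordinates and timings are device-specific assumptions, and the traffic inspection via mitmproxy happens separately as a proxy between device and server):

```python
import subprocess
import time

def swipe_up(duration_ms: int = 200) -> list:
    # Build an adb command that swipes from the lower to the upper
    # screen area (coordinates assume a 1080x1920 screen).
    return ["adb", "shell", "input", "swipe",
            "540", "1500", "540", "500", str(duration_ms)]

def watch_and_swipe(n_videos: int, view_seconds: float, run=subprocess.run):
    """Watch each video for `view_seconds`, then swipe to the next one.

    `run` is injectable so the loop can be tested without a device attached;
    the view time per video is the main signal the recommender observes.
    """
    for _ in range(n_videos):
        time.sleep(view_seconds)
        run(swipe_up(), check=True)
```

Varying `view_seconds` per video is what lets an experiment probe how strongly view time influences subsequent recommendations.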

Findings from Scraping TikTok

  • Explainers are often wrong
  • Viewing videos two or more times had the biggest impact of the factors we tested
  • Content Moderation has holes: Content that is blocked from user pages can still show up on the FYP (shown by Tracking Exposed/AI Forensics)

Code Audit

TikTok is not an open-source oriented company

Code Audit

But adversarial methods can help. See MobSF (Mobile Security Framework).

Crowd-Source Audit

Surveying Users

  • to assess risks, it is necessary to ask actual users
  • they can explain how they perceive the algorithm (see Klug et al.)
  • or how they manage privacy (see Ebert et al.)

Overview Audit types and platform elements

A. Meßmer & M. Degeling, "Auditing Recommender Systems" (2023)

How to assess systemic risks

Understanding systemic risks requires mixed methods. Creating scenarios can help.

  • documents: to understand TikTok's position
  • scraping: to assess the topic on the platform (see WSJ, CCDH)
  • surveys: to learn which aspects of the platform users find critical (IG Leak)

DSA: Things to watch for

  • How platforms implement transparency measures
  • How oversight (DSCs, ECAT) is managed
  • How platforms interpret their risks (see Meta 2020 & Election)

Beware of the platform provided tools!

Platforms are pushing new transparency features and promising new APIs. But this is often mere PR:

  • TikTok's APIs are limited with respect to the data that can be accessed
  • access to the APIs is decided on by TikTok
  • the ad library is missing information needed to be useful

Takeaways

  • the Digital Services Act is something privacy and security researchers should know about
  • academia and civil society should use new transparency for research
  • interdisciplinary research is necessary to study platforms (and a CS background is beneficial)

Thanks

follow our research github.com/snv-berlin/tiktok-audit

or me :) chaos.social/@mrtn3000