Let's use WhatsApp

This is a post about messenger apps, privacy and data flow.

It is prompted by a discussion I had about which messenger app to use. The quick conclusion is that they all have issues, but some are better than others. I will not go through the different apps, but instead do an overview of how the data flows and the privacy implications.

In this post we will omit any issues related to transmission security, cryptographic implementations and device security. All of these are very relevant, complicated and difficult to get right.

Two basic designs: star and mesh

When talking about a system to send and receive messages, there are, broadly speaking, two basic design topologies: star and mesh.

Star

With the star topology, every user connects to the same centralized server.

You have multiple users, and when user A wants to send to user B, the process is:

User A uses his app and sends a message to the centralized server
The central server forwards the message to user B, who reads it on his app

Even though User C is using the same server, he will not be involved.

The “centralized server” will in reality be a lot of servers in one or more datacenters, but they are all operated by the same entity.

This is the model of e.g. Messenger, Signal, Telegram and most of the other common messaging systems.
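To make the data flow concrete, here is a minimal sketch of the star model in Python. The class and method names (CentralServer, send, fetch, seen) are made up for illustration and do not describe how any real messenger is built. The point is only that a single operator relays, and therefore observes, every message.

```python
from collections import defaultdict


class CentralServer:
    """Hypothetical single operator that relays every message."""

    def __init__(self):
        self.inboxes = defaultdict(list)  # recipient -> pending messages
        self.seen = []                    # (sender, recipient) pairs the operator observed

    def send(self, sender, recipient, body):
        # Every message passes through here, so the operator sees all traffic.
        self.seen.append((sender, recipient))
        self.inboxes[recipient].append((sender, body))

    def fetch(self, user):
        # Hand over the pending messages and empty the inbox.
        messages, self.inboxes[user] = self.inboxes[user], []
        return messages


server = CentralServer()
server.send("A", "B", "hello")
server.send("C", "A", "hi")

inbox_b = server.fetch("B")  # only B's messages; C was not involved in A -> B
operator_view = server.seen  # the operator still knows about every conversation
```

User C never sees the message from A to B, but the operator's log covers all three users.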

Mesh

In a mesh topology, we have multiple server operators, and data is only shared between the sender's server and the recipient's.

User A uses his app, and sends a message to his service provider (provider A)
Provider A forwards the message to user B’s service provider (provider B)
Provider B forwards the message to user B, who reads it using his app

User C is connecting through a completely different provider.

This is the model of email, SMS and Matrix.
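The three steps above can be sketched in the same way. Again, the names (Provider, register, submit, deliver, and the shared directory) are assumptions made for this illustration and not the API of email, SMS or Matrix. What matters is which providers end up observing a given message.

```python
class Provider:
    """Hypothetical independent operator serving its own users."""

    def __init__(self, name, directory):
        self.name = name
        self.directory = directory  # shared lookup: user -> that user's provider
        self.inboxes = {}
        self.seen = []              # (sender, recipient) pairs this provider observed

    def register(self, user):
        self.inboxes[user] = []
        self.directory[user] = self

    def submit(self, sender, recipient, body):
        # Step 1: the sender's app hands the message to its own provider.
        self.seen.append((sender, recipient))
        target = self.directory[recipient]
        if target is self:
            self.inboxes[recipient].append((sender, body))
        else:
            # Step 2: forward to the recipient's provider only.
            target.deliver(sender, recipient, body)

    def deliver(self, sender, recipient, body):
        # Step 3: the recipient's provider stores it for the recipient's app.
        self.seen.append((sender, recipient))
        self.inboxes[recipient].append((sender, body))


directory = {}
provider_a = Provider("provider A", directory)
provider_b = Provider("provider B", directory)
provider_c = Provider("provider C", directory)
provider_a.register("A")
provider_b.register("B")
provider_c.register("C")

provider_a.submit("A", "B", "hello")
```

After this runs, provider A and provider B have each observed the message, while provider C has observed nothing.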

A technical note about end-to-end encryption

End-to-end encryption (E2E encryption) means that only users A and B can read messages sent between them. This can be done with both models. In any case, you will still have metadata associated with sending and receiving messages, such as who talks to whom, when and how often.
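A small sketch of what that means in practice. The XOR function below is a toy stand-in for a real cipher and must not be used for anything serious; the key exchange between A and B is not shown, and the envelope fields are assumptions made for this illustration. The message body is unreadable to the server, but the envelope around it is not.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for a real cipher. Illustration only, not secure practice."""
    return bytes(d ^ k for d, k in zip(data, key))


# Assumption: A and B already share this key; how they agreed on it is not shown.
shared_key = secrets.token_bytes(32)

plaintext = b"meet at noon"
ciphertext = xor_bytes(plaintext, shared_key)

# What A hands to the server: an opaque body plus routing information.
envelope = {"from": "A", "to": "B", "size": len(ciphertext), "body": ciphertext}

# What the operator can still read without the key: the metadata.
server_view = {k: v for k, v in envelope.items() if k != "body"}

# Only B, who holds the key, recovers the content.
decrypted = xor_bytes(envelope["body"], shared_key)
```

Here server_view still contains the sender, the recipient and the message size, even though the body itself is encrypted.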

As a side note, governments are currently looking into how to make end-to-end encryption illegal. The implications are dire, and well outside the scope of this post.

It’s about the business model

Different companies will have different business models, but the prevailing one is to amass a lot of users, and then sell ads and/or user data. Perhaps, some sort of premium subscription is offered to avoid ads.

This model is only meaningful with a centralized system for attention and data collection. You have the added benefit of network effects, where lots of users will make the service more attractive, and you can implement easy entry/difficult exit to make people and customers stay longer. Vendor lock-in is also strong.

Building mesh-style systems requires a radically different business model. This could be paid subscriptions, grants, support contracts or similar. Email is a very old invention from before the current internet, and SMS/telephone are very hardware focused, so these messaging systems follow a different logic than what we see being used and developed for the internet today.

What about privacy?

To me, this is closely related to the question of trust. Will my data be misused?

If some company promises me that they will never look in my messages, and that all my data is end-to-end encrypted, do I trust them?

I see multiple layers to this question.

Does their business model even support privacy?
Are they known for high software quality?
Are they known for high standards of governance? And what about legal issues?
Even if they behave correctly now, what about the future?

Let’s unpack

If your business is to monetize your users' data, I would not trust your promises of privacy. Your definition of “privacy” would probably be very different from what is usually meant by the word.

Your company publishes an app that you want everybody to run. Why would I trust you enough to install your app on the device that follows me everywhere, handles my banking and is an integral part of my life? I know most people do not see it that way, but that is placing a lot of trust in your software. Even if there are no intentional backdoors or questionable data collection, how do I know that your developers have done a good job? How do you detect corrupt behavior in your development team? This is a quality assurance question. You could present recurring audit reports from third-party entities that I trust, or you could go open source and ask the public to verify your claims.

Next, we need to look at how the company handles data internally. In the case of non-E2E systems, we must look at all the data, and for E2E systems, we must consider the metadata. How do you prevent your employees from gaining and abusing access? You will do data mining of user data. Some companies do a lot, others a lot less. How do you ensure privacy when doing that? And, just to be clear, unless you dilute or do high-level aggregation of your data, “anonymizing” is close to a non-existent concept.
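To illustrate why replacing names with pseudonyms is not the same as anonymizing, here is a small sketch. The message log, the user names and the "outside knowledge" are all made up for this example, and the pseudonym function is just one naive way someone might try to hide identities. Metadata patterns survive the renaming, so a little outside knowledge is enough to single a user out again.

```python
import hashlib


def pseudonym(user: str) -> str:
    """Naive 'anonymization': replace the name with a truncated hash."""
    return hashlib.sha256(user.encode()).hexdigest()[:8]


# Made-up metadata log: (sender, recipient, hour of day).
log = [
    ("alice", "bob", 9),
    ("alice", "carol", 13),
    ("dave", "bob", 9),
    ("alice", "bob", 21),
]
pseudonymized = [(pseudonym(s), pseudonym(r), hour) for s, r, hour in log]

# Assumed outside knowledge: somebody saw Alice send messages at 13:00 and 21:00.
known_hours = {13, 21}

# Find every pseudonym whose sending hours include the known ones.
candidates = {
    sender
    for sender, _, _ in pseudonymized
    if known_hours <= {h for s, _, h in pseudonymized if s == sender}
}
```

Only one pseudonym matches the known pattern, so everything else in the log tied to that pseudonym is now linked to Alice. Aggregating first, for example by publishing only message counts per hour across all users, would remove that link.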

There is a recurring follow-up question to your internal data handling. How do you comply with requests from law enforcement? The company will of course abide by the law, and the real question is which jurisdiction you are in. Are you China, EU or US based? The rules governing US companies are not compatible with EU law.

The last point on the list is about the future.

Your company has a privacy-friendly business model, makes good software and has good data governance. What happens if the company gets new management, maybe by going bankrupt, being sold, being taken over or merging? New management might feel that they sit on a goldmine of data ripe for mining. Maybe you change jurisdiction.

Even if nothing happens to the specific company, laws will continuously be updated, also in more profound ways. We have not reached a point where standards, law enforcement and the legal system have found a general consensus and we have good predictability.

Let’s use WhatsApp?

That was the question that triggered this post.

As you can see, I have not even looked at WhatsApp specifically. They use some sort of star topology, make some promises about encryption and data handling, and you have to use their app. A quick Google search shows that they are owned by Meta. So I guess that settles any questions about the business model.

I am not on WhatsApp, and chances are that I never will be. Not because they are WhatsApp; I just don’t have the need, and it must solve some real problem for me before I even consider it.

I do use some questionable messaging systems (including email and SMS). It is fairly unavoidable if you want to participate in this world.

My best recommendation is to spread your data out between multiple entities, not put all your eggs in one tech giant's basket, and minimize the use of non-private systems.

And to the “I have nothing to hide” crowd, it is not about you, it is about the companies and why they collect your data.