Newsletter
V1.0.0 - meta

Information is physical


I am a physicalist.

Thus, I believe that like everything else, information is physical.
More and more people are becoming information workers, and those who remain see more and more of their work turn into information work.

Paradoxically, most people do not feel the physicality of information. Information work often feels less real than physical work. Some people go as far as describing information work as "bullshit".

This is often idealised thinking, coming from the idea that information ought to be known everywhere at all times. Unfortunately, all processes involving information are physical. Thus they have a cost, and thus they only ever happen if someone somewhere pays that cost.

Missing this is a pervasive blindspot. One of its most infamous manifestations is neglecting the cost of Economic Planning.
For now, I'll call this blindspot "Information Idealism": the idea that information processes ought to have already happened, and at no cost.


I am a big believer in todo-lists and breaking down problems into concrete action points.
Too often I see people stuck when they have a task on their todo-list that is related to information processing.

I hope that this list of different information sub-processes will help internalise the physical nature of information, and help break down information tasks into more concrete action points.

I'll start with obvious stuff, and then move on to more complex ideas.
While it can be tempting to skip the obvious stuff, I am also a big believer in reflecting on obvious stuff.


That list is long. Managing information is just that complex.
While it is tempting to just leave it to smarts and IQ, this is unreliable: smart people regularly get sick or tired, and creating chokepoints around rare individuals is a recipe for failure.
Instead, I want to build foundations that let us manage information at scale.

Without further ado, let's start.

Observation

The only way to acquire factual information is by observing the world.

This observation process is physical, and often quite complex. For instance, visual perception involves the complex machinery of the eye or of cameras.

In practice, it means that fetching information requires constructing a physical system that interacts with the world and sets itself in a particular state as a result of these interactions.

This is highly non-trivial. As a result, our default expectation should be that we miss most information about everything. To reliably know something about the world, we must establish a physical process that captures the information.
An Information Idealist, upon witnessing a failure caused by some missing information, would not acknowledge the need for and the cost of such a process. They would instead complain about unfairness, that someone is at fault or did not try hard enough, and overall that something must have gone wrong.

Storing

Storing information is done through physical means. If a meeting happens, and no notes are taken, then most of the information is lost. There needs to be a process which goes from what is said in the meeting to some record, whether paper or digital.

This is a fact of physics.

Most people hate being bothered with taking notes. They think that the information will be magically retained in their brain, and that notes are a waste of time.

Transmitting

If Alice knows something, and Bob does not, then for Bob to know it, there must be a physical process which goes from Alice's brain to Bob's brain.

If this line of communication is not established, then it is physically impossible for Bob to know the information.
One such line might be for Alice to hint at the information in Carole's presence through body language, for Carole to see it and later mention the anecdote to Bob, and for Bob to reverse engineer the information from the anecdote.
Another line might be for Alice to just write it down and send it to Bob.

At some point, someone must expend energy to establish this line of communication. Bob will not magically know what is in Alice's mind.

Parsing

In the age of computers and the internet, storing and transmitting information is cheap. Too cheap to meter even.

In companies, we could just transcribe all meetings that happen and send them to everyone. Unfortunately, we then hit another bottleneck: reading itself takes time and energy. Specifically, converting the symbols on a page into a mental representation is a costly physical process itself.

Thus, if as a unified entity, a team or a group wants to know the state of the art of a field, they will need to have at least one person who reads the literature.

Summarising

Even after someone has read everything, they cannot communicate all of it to others. They need to select and summarise the relevant pieces of information.

This process of selecting and summarising the relevant pieces of information is itself physical. When the information to be selected and summarised is numerical, this spans the field of computational statistics. As opposed to mere statistics, computational statistics is about studying how we can cheaply compute the relevant statistical summaries.
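As an illustration of what "cheaply computing a summary" can mean, here is a minimal sketch of a one-pass mean and variance, using Welford's online algorithm. The class name `RunningSummary` and the sample readings are assumptions made up for this example. The point is that the summary is updated with a small, fixed amount of work per new value, without storing or re-reading the whole dataset.

```python
from dataclasses import dataclass


@dataclass
class RunningSummary:
    """One-pass mean and variance (Welford's online algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the current mean

    def add(self, value: float) -> None:
        # Constant work per value: no need to keep earlier values around.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; undefined for fewer than two values.
        return self.m2 / (self.count - 1) if self.count > 1 else float("nan")


# Hypothetical readings, purely for illustration.
summary = RunningSummary()
for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    summary.add(reading)

print(summary.mean, summary.variance)
```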

Unfortunately, most human-relevant information is not numerical, and the process requires a human in the loop. To some extent, this can be alleviated with Large Language Models, but we do not yet have a good understanding of when they can and cannot be used.

Structuring

Parsing information was discussed above. The goal of structuring information is to reduce the cost of parsing it.

Two main situations require structuring information:

  1. When groups grow in numbers, and some information needs to be shared across everyone, the cost of parsing information grows linearly with the number of people. To reduce this cost, it makes sense to dedicate some effort to making it easier to parse. To a great extent, this is the role of teachers.
  2. When an entity needs to stay coherent, it makes sense to have "hubs": people dedicated to integrating information from many different sources. To let them integrate information from as many sources as possible, it makes sense to structure information in a way that is easy to parse. To a great extent, this is the role of managers and leaders.

This is why we have powerpoints, textbooks, one-pagers, infographics and videos. They are all about making it cheaper to parse information.

Committing to Human Memory

A task that is central to knowledge workers is... integrating knowledge.
There are some complex components to integrating knowledge, that we will tackle later in this list.
But the most basic component is committing information to our memory.

The only reliable way to commit information to memory is repetition. Paradoxically, shockingly, very few knowledge workers explicitly use Spaced Repetition software at work.

This is directly related to the physicality of this process. If you internalise that the only reliable way to commit information to memory is repetition, then it is painfully obvious that people who do not do it will just forget.
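To make the repetition concrete, here is a minimal sketch of a review schedule. It assumes the simplest possible rule: the gap between reviews doubles each time. That rule, the function name `review_dates` and the start date are assumptions for illustration. Real Spaced Repetition software adjusts intervals based on how well each item was recalled.

```python
from datetime import date, timedelta


def review_dates(first_seen: date, reviews: int, first_gap_days: int = 1) -> list[date]:
    """Return review dates whose gaps double after each review (assumed rule)."""
    dates = []
    gap = first_gap_days
    current = first_seen
    for _ in range(reviews):
        current = current + timedelta(days=gap)
        dates.append(current)
        gap *= 2  # each review pushes the next one twice as far away
    return dates


# Hypothetical example: something learnt on 1 January, reviewed five times.
schedule = review_dates(date(2024, 1, 1), reviews=5)
for day in schedule:
    print(day.isoformat())
```

Even under this generous assumption, a single piece of information needs five deliberate reviews spread over a month. Someone has to do them.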

As a manager, if your employees do not use spaced repetition software, you know that you will need to repeat the information to them multiple times over weeks. In effect, you will be acting as the Spaced Repetition system for them.
At the same time, most employees hate it when their manager repeats the same things many times.

(There is a similar dynamic between teachers and gifted students, where gifted students hate being taught by repetition. What usually happens is that gifted students will hit a ceiling, at which point they will be stuck because they have learnt to avoid repetition.)

Double-Checking

As part of developing a critical mind, we are taught that it is important to double-check information we are exposed to.

But this is not free. Double-checking information takes time and energy. And as a direct result of not budgeting for it, no one does this.

This is quite a stupid collective failure. We say that "[X] process ought to happen", and then proceed to not dedicate any resources to it. This is a central example of Information Idealism.

I long for a media outlet (whether a newspaper or a social network) that explicitly manages its double-checking budget.

Concretely, to consume articles on the platform, you would have one of two choices:

  • Occasionally, contribute to the platform by performing some double-checking work. Check that a line of reasoning holds, that some fact has been cross-confirmed by independent reliable sources, that the strength of the claims made by the author is coherent with their expertise, etc.
  • Pay a fee to the platform, and receive the articles without having to work for them.

I would gladly pay a fee to such a platform, if it meant that:

  • I would not have to perform any double-checking work.
  • The articles I consume would be double-checked.

Synchronisation

Very often, we want to maintain synchronisation between multiple information systems.
This is a very generic concept: it ranges from ensuring the integrity of replicated data, to having everyone in a team be on the same page.

What matters is that synchronisation requires work. By default, information systems integrate different pieces of information, and go out of sync.
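For the replicated-data end of that range, the work can be sketched in a few lines. This is a minimal illustration, not how any particular database does it. The function names `digest` and `out_of_sync`, the idea of a designated primary copy, and the sample records are all assumptions made for the example.

```python
import hashlib


def digest(records: list[str]) -> str:
    """Fingerprint a copy's content with SHA-256."""
    h = hashlib.sha256()
    for record in records:
        h.update(record.encode("utf-8"))
        h.update(b"\n")  # separator so ["ab", "c"] and ["a", "bc"] hash differently
    return h.hexdigest()


def out_of_sync(primary: list[str], replicas: dict[str, list[str]]) -> list[str]:
    """Names of replicas whose content no longer matches the primary copy."""
    reference = digest(primary)
    return sorted(
        name for name, records in replicas.items() if digest(records) != reference
    )


# Hypothetical data: one replica has drifted from the primary.
primary = ["alice: ships friday", "bob: reviews thursday"]
replicas = {
    "copy_a": ["alice: ships friday", "bob: reviews thursday"],
    "copy_b": ["alice: ships monday", "bob: reviews thursday"],
}
stale = out_of_sync(primary, replicas)
print(stale)
```

Detecting the drift already requires reading every record of every copy. Repairing it is additional work on top of that.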

It can happen within relationships, when people do not regularly synchronise what they want out of the relationship.
It can happen within a state, when people are not putting in the work to keep their beliefs and values aligned.
At work, it happens within programs, when different people start to expect different things from the same piece of code and use it in conflicting ways.

When this happens, people feel bad. They are frustrated, and believe that it means that someone did something bad.
This is another case of Information Idealism.

Once we internalise the fact that synchronisation itself is the result of a physical process, it becomes obvious that over time, things will naturally go out of sync.

Truth

From my point of view, truth is not some abstract concept, but rather the result of a physical process of synchronisation.

To the extent that a statement of fact is true, it is true because it is the result of a process that connects it to the fact. The statement means that its utterance is the end of a long causal chain that originated in the fact itself.

People regularly feel offended when I call them on their bullshit. However, most of the time, me calling them on their bullshit has nothing to do with doubting their integrity or honesty. Even when I fully believe what they say, there is just no causal story that originates in the fact they claim and ends with their claim.

Alice may genuinely tell me that [X] happened. But if she does so only because she fully believes Bob, who has not witnessed [X] but nevertheless fully believes it, I do not need to doubt her honesty to know that she is bullshitting.
In real life, Alice is often an older family member, Bob is a random internet podcaster and [X] is some pseudo-scientific absurdity (Did you know that in Argentina, 37 species of blue dogs are created every year?). In this situation, Alice's honesty or my trust in her are both irrelevant.

More prosaically, this also happens in team meetings. Someone gets assigned a bunch of tasks, they say they will do all of them, but they do not have a habit of writing these tasks down or keeping track of their progress. Regardless of how honest they are, I know that they will forget at least some of the tasks or the relevant constraints.
A similar situation occurs when someone tells me they stuck to a plan, while I see that they have not edited it in a month, thereby showing that they have updated neither the plan nor their understanding of it for a full month.
Both have happened to me, when I knew for a fact that the other person was honest. It was just that at a physical level, there was no way for them to be truthful.

Unfolding

So far, I have been talking about empirical information. That is, information that is directly based on observations of the world.

However, information can also be logical, i.e. derived from other pieces of information and reasoning. While pure mathematics might seem non-physical, this article about formalism should help understand why it is still physical. Tl;dr: maths is about what happens when you perform specific physical actions, aka symbol manipulations.

But logical information is rarely purely mathematical. It is instead a bunch of conclusions that we build on top of empirical information.

And as you might expect, this process of unfolding the logical consequences of empirical information is physical.

It means the process requires energy. Thus it has a cost. Thus it does not happen by default. Therefore, by default, we do not know all the logical consequences of our empirical information, and we must put in the work to find out each additional bit of logical information.
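A toy version of unfolding makes the cost visible. The sketch below is a naive forward-chaining loop over if-then rules, and it counts how many rule checks it performs. The function name `unfold`, the rule format and the example facts are assumptions made for illustration, not a reference to any particular reasoning system.

```python
def unfold(
    facts: set[str], rules: list[tuple[frozenset[str], str]]
) -> tuple[set[str], int]:
    """Forward-chain over rules; return the known facts and the number of rule checks."""
    known = set(facts)
    checks = 0
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            checks += 1  # every look at a rule is work, even when it yields nothing
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known, checks


# Hypothetical empirical facts and rules.
facts = {"it rained", "the window was open"}
rules = [
    (frozenset({"the floor is wet"}), "the floor is slippery"),
    (frozenset({"it rained", "the window was open"}), "the floor is wet"),
]
known, checks = unfold(facts, rules)
print(sorted(known), checks)
```

Two new facts were derived here, but it took six rule checks to get them and to confirm that nothing else follows. The conclusions were implied by the starting facts all along. They were only known once the loop had run.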

It so happens that in the case of unfolding, neglecting its cost because of Information Idealism has an official name: The Problem of Logical Omniscience. Tl;dr: if you try to reason non-physically about knowledge, you end up with extremely counterintuitive results, such as expecting all humans to know all mathematical theorems (including the ones we haven't proven yet!).

Coherence

And last but not least, coherence is the result of a physical process.
By coherence here, I mean the property of an information system not containing contradictions.

Information Idealism sets an expectation that information systems should be coherent by default.
But by now, I think the punchline is clear: coherence is the result of a physical process.

To make an information system coherent, there must be a physical process that iterates through each piece of information contained therein, and checks that it is coherent with the others.
By default, no such process happens, and as an information system integrates more and more information, it will naturally accumulate contradictions.
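Here is a minimal sketch of such a checking process, under a deliberately crude assumption: each piece of information is a (subject, value) pair, and two pieces contradict each other when they give different values for the same subject. The names `Claim` and `contradictions` and the sample claims are made up for the example.

```python
from itertools import combinations

Claim = tuple[str, str]  # (subject, asserted value), an assumed toy representation


def contradictions(claims: list[Claim]) -> list[tuple[Claim, Claim]]:
    """Compare every pair of claims; return those that disagree about the same subject."""
    return [
        (a, b)
        for a, b in combinations(claims, 2)
        if a[0] == b[0] and a[1] != b[1]
    ]


# Hypothetical claims accumulated by a team over time.
claims = [
    ("release date", "friday"),
    ("owner", "alice"),
    ("release date", "monday"),
    ("owner", "alice"),
]
found = contradictions(claims)
pairs_checked = len(claims) * (len(claims) - 1) // 2
print(found, pairs_checked)
```

With n claims, this naive check compares n(n-1)/2 pairs, so its cost grows roughly with the square of the amount of stored information. Real contradictions are also rarely this easy to spot, which only makes the process more expensive.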

A triumph of the 20th century has been to tame this chaos by building database systems rigid enough that we could prove their coherence.
Getting this right was a major feat of both research and engineering, and it is at the core of our modern IT infrastructure. (If for some reason you want to understand more about this topic, I recommend the book Foundations of Databases.)

In practice, keeping organic information systems coherent is impossible. No human or group of humans can be made coherent.
We can only make organic information systems less incoherent, or coherent up to a specific level.

Unfortunately, Information Idealism blinds us to this fact. Instead of an opportunity to learn, the feeling of witnessing a contradiction in ourselves is distinct enough to have its own dedicated name: Cognitive Dissonance.

