5 March 2025 ⋅ Back to blog index
An arXiv preprint titled “A proof of the Collatz conjecture”, submitted by Toshiharu Kawasaki on February 28th, was brought to my attention this morning.
For the benefit of my non mathematically inclined readers, the Collatz conjecture is among the most famous unsolved mathematical problems and perhaps the most famous unsolved problem that is understandable to laypersons. It asks the following: Start from some positive integer . If is even, divide it by 2, otherwise multiply it by 3 and add 1. Then repeat the process. For example, starting from , you get the sequence . If you try it with other starting values of , you should quickly notice that whatever value you choose, the process seems to always end in a loop . This has been checked to hold for values of up to billions of billions, but as of yet nobody has been able to prove that it holds for absolutely all . It is not for lack of trying; many a mathematician has attacked the problem, which has even been jokingly described as a plot from the USSR to slow down the progress of mathematics in the US. There are some partial results, however. The current strongest partial result is from Terence Tao, who proved that if is a sequence with limit , then for “almost every ” in a certain precise sense, the iteration goes below .
@pianocktailiste asked on Bluesky was whether the paper had been confirmed correct, or whether a flaw had been identified. The paper is 13 pages full of math proofs; usually this would take a couple hours at least to fully understand. Nevertheless, after scanning the paper for about two minutes, I said that I did not believe for a second this paper contained a correct proof of the Collatz conjecture.
I will indeed show below that the paper is incorrect. But I would also like to take this opportunity to discuss a broader point. I suppose that, to non-mathematicians, it could sound shocking that a mathematician took a math paper proving an important conjecture, that perhaps the author spent months writing, skimmed the paper for two minutes, and dismissed it as worthless without even taking the time to understand the proof. To be clear, I am not a mathematician, just some random graduate student in theoretical computer science, but I bet many experienced mathematicians would have had the same reaction as me. Is this some kind of elitism where mathematicians discard anything that does not fit in their established methods, calling “naïve” what could be an original idea?
While I'm just, again, some poor random graduate student in theoretical computer science, many established researchers, especially those who work in areas close to famous problems, receive tons of papers, usually from amateur mathematicians, purporting to solve these problems. Most of these are indeed worthless. How do you decide quickly if a paper is worth your time? Scott Aaronson, a recognized complexity theorist, wrote a blog post with his criteria, which I encourage you to read.
In this blog post, I'll explain what I did in each pass of reading the paper. This will bring me both to why the paper is wrong, and to why I was already very confident it was wrong after just two minutes of skimming it.
Here were my first impressions:
The paper is written in precise mathematical language, not weasel words and handwaving. This is a good sign. (For comparison, look at bullshit proofs of the Collatz conjecture by professional bullshitters. These are “not even wrong”: it is impossible to figure out what if anything they mean mathematically.)
Already in the introduction there are noticeable typos (“allmost” and “in the sence”). Granted, let's forget about these.
The abstract and the introduction say very little. It is just mentioned that the paper first proves a new fixed point theorem in metric spaces, then deduces the Collatz conjecture. For a result attacked unsuccessfully by so many people, a correct proof is likely to require a complex proof strategy, and while there are some counterexamples to this, it is overwhelmingly likely that some new conceptual idea is required, and not just a rehashing of natural computations and known lemmas that thousands of people explored before. One would hope for an introduction giving an overview of the key ideas that make the proof succeed where so many people failed.
The introduction contains the “dangling” sentence “Using this theorem, we obtain for almost all .” This is stated but it is not explained how it is relevant to the paper or why four logs are used and not three or five.
One thing the introduction does say is what metric space the fixed point theorem is applied with in order to prove the Collatz conjecture: it is just with its usual metric. This feels fishy: is an extremely special metric space, for example a sequence converges iff it is eventually constant. One does not expect metric spaces as natural context to study this problem, and it seems this generalization can only make easy things harder to prove, and harder things impossible.
The first section after the introduction, presenting the fixed point theorem, contains a lot of formulas and computations. I skipped this.
I got to Theorem 3.1, which seems to be the meat of the proof. It defines six functions by big case disjunctions on whether their arguments are even, equal to 1, and similar conditions. This is a huge red flag: it seems to be essentially trying to analyze the behavior of Collatz iterations “locally”, with simple case disjunctions on the values modulo small numbers, which is probably the first idea that comes to the mind of most people who think about this problem (e.g., you quickly notice that every step is always followed by a step ).
Skimming to the end, I saw lots of computations of polynomials and lots of case disjunctions. Again this seems just too naïve for a problem of this calibre: a great many people have tried all sorts of case disjunctions like this, the computations seem long but trivial, and polynomials are very simple types of bounds so also unlikely to work. I am reminded of Terence Tao's blog post Be sceptical of your own work, where he warns against the following type of proof: Reason by contradiction, do lots of random computations, make a small mistake somewhere, derive a contradiction from this mistake, declare victory.
After the two minutes this lasted, I declared I didn't believe at all this could be a correct proof, not because I found a mistake — I barely read anything in the proofs! — but because the techniques seem like obvious methods that must have been tried zillions of times before, and they are packaged in a weird way that doesn't suggest the author has a good understanding of the problem. In terms of Aaronson's blog post, the paper fails criterion 10: “The techniques just seem too wimpy for the problem at hand.”
Motivated to actually exhibit a flaw in the paper and not just declare it too naïve, I tried to read portions of it in detail.
The author starts by defining a notion of “-weighted generalized pseudocontraction”. From the terminology and definition, it is clear that this should be seen as a generalization of the usual notion of contraction (-Lipschitz map with ), and therefore the theorem is a generalization of the usual Banach fixed point theorem. It is stated as theorems 2.1, 2.2, 2.3, which contain a lot of equations repeated (it is not very pleasant to figure out what changes between them). Then I spent some time on the proofs on page 2. I can reprove lemma 2.1 easily, but I don't understand what the author is doing in each step of the proposed derivation. Lemma 2.2 is trivial, though it also suggests that the definition of -weighted generalized pseudocontractions was not really the right one to begin with. Then I got lost trying to understand what is done on page 3. The theorems have five sub-cases, but it seems only one is being used by the proof of the Collatz conjecture, and the other cases (seemingly useless) are purported to be trivial but I don't understand what the author derives this from.
In parallel, I tried to understand the section proving Collatz from the fixed point theorem. It quickly appears that the big case distinctions in the definitions of seem to be completely useless, because they only serve to make a certain inequality satisfied (the one from the definitions of …-pseudocontractions), but it would be sufficient to use just one constant for each of these subcases, instead of six, or probably to just reason in “after a certain index” style. It is entirely unclear to me how all these constants (1, 1, 0, 1, 1, 0, 0, 0, 2, …) were chosen. Then again lots of computations follow. At this point, it started to take too much time to unravel the computations: I am appallingly bad myself at doing computations, as I tend to do all sorts of trivial mistakes as soon as I cannot entirely keep in my head what the computations are supposed to represent, which makes me even worse at checking computations.
I took a more high-level view again and tried to understand what the heck the paper was actually trying to do. As the fixed point theorem was supposed to be a Banach-type one, I went looking for the place where its proof would ressemble the classical and more or less obvious proof of the Banach fixed point theorem. This was on page 4, in the inequalities
These are preceded by a derivation of
Now if we specialize this to the context where it is used at the end of page 11, the ambient metric space is , the constant is , and the map is
But then the inequality is completely false! Take for example and , then the right hand side is while the left hand side is .
Behind the thick smoke screen of (to me, seemingly random) computations, what the paper is really trying to do deep down is to show that the map satisfies a very simple property which makes it resemble contractions enough that Banach's fixed point theorem can be recovered — except the map just doesn't have that property.
Admittedly, I somewhat forgot to keep non-mathematical readers in mind while trying to write down this post. But if you're still here, these would be my takeaways: There can of course be bad reasons to dismiss a paper, but there are also a lot of possible good reasons. What seems like a potentially novel idea to outsiders is often quickly recognized as just an application of ideas that are never going to work by experienced mathematicians. Humans are bad at checking proofs line-by-line, and rather want to understand what the author is trying to do, which concepts are being applied, what insight on the problem is being used. If there seems to be no insight, the purported proof is probably wrong but possibly difficult to refute precisely, and might not be worth reading at all.
# Comment by Informatheux (9 March 2025 at 11:37)
Je pensais pas que ce serait aussi trivialement faux. Je comprends pas pourquoi certains s'obstinent à publier des preuves qui ont toutes les chances d'être fausses : on parlera d'eux, certes, mais en mal uniquement.
# Comment by Jean Abou Samra (9 March 2025 at 18:42)
Je suis aussi perplexe sur ce qui se passe dans la tête de l'auteur de cet article (est-ce qu'il y croit vraiment ?), mais en même temps, je n'ai jamais compris les élèves qui essaient d'enfumer le correcteur en examen de maths, et pourtant ils semblent être courants… Et forcément, les problèmes hyper-célèbres comme Collatz attirent toutes sortes de gens trop sûrs d'eux, etc.
Leave a comment