How dare mathematicians dismiss papers after skimming them for two minutes?

5 March 2025 ⋅ Back to blog index

An arXiv preprint titled “A proof of the Collatz conjecture”, submitted by Toshiharu Kawasaki on February 28th, was brought to my attention this morning.

For the benefit of my non mathematically inclined readers, the Collatz conjecture is among the most famous unsolved mathematical problems and perhaps the most famous unsolved problem that is understandable to laypersons. It asks the following: Start from some positive integer nn. If nn is even, divide it by 2, otherwise multiply it by 3 and add 1. Then repeat the process. For example, starting from n=11n = 11, you get the sequence 11,34,17,52,26,13,40,20,10,5,16,8,4,2,1,4,2,1,4,2,1,4,2,1,11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1, 4, 2, 1, 4, 2, 1, …. If you try it with other starting values of nn, you should quickly notice that whatever value you choose, the process seems to always end in a loop 4,2,1,4,2,1,4, 2, 1, 4, 2, 1, …. This has been checked to hold for values of nn up to billions of billions, but as of yet nobody has been able to prove that it holds for absolutely all nn. It is not for lack of trying; many a mathematician has attacked the problem, which has even been jokingly described as a plot from the USSR to slow down the progress of mathematics in the US. There are some partial results, however. The current strongest partial result is from Terence Tao, who proved that if (un)(u_n) is a sequence with limit ++∞, then for “almost every nn” in a certain precise sense, the iteration goes below nn.

@pianocktailiste asked on Bluesky was whether the paper had been confirmed correct, or whether a flaw had been identified. The paper is 13 pages full of math proofs; usually this would take a couple hours at least to fully understand. Nevertheless, after scanning the paper for about two minutes, I said that I did not believe for a second this paper contained a correct proof of the Collatz conjecture.

I will indeed show below that the paper is incorrect. But I would also like to take this opportunity to discuss a broader point. I suppose that, to non-mathematicians, it could sound shocking that a mathematician took a math paper proving an important conjecture, that perhaps the author spent months writing, skimmed the paper for two minutes, and dismissed it as worthless without even taking the time to understand the proof. To be clear, I am not a mathematician, just some random graduate student in theoretical computer science, but I bet many experienced mathematicians would have had the same reaction as me. Is this some kind of elitism where mathematicians discard anything that does not fit in their established methods, calling “naïve” what could be an original idea?

While I'm just, again, some poor random graduate student in theoretical computer science, many established researchers, especially those who work in areas close to famous problems, receive tons of papers, usually from amateur mathematicians, purporting to solve these problems. Most of these are indeed worthless. How do you decide quickly if a paper is worth your time? Scott Aaronson, a recognized complexity theorist, wrote a blog post with his criteria, which I encourage you to read.

In this blog post, I'll explain what I did in each pass of reading the paper. This will bring me both to why the paper is wrong, and to why I was already very confident it was wrong after just two minutes of skimming it.

Here were my first impressions:

After the two minutes this lasted, I declared I didn't believe at all this could be a correct proof, not because I found a mistake — I barely read anything in the proofs! — but because the techniques seem like obvious methods that must have been tried zillions of times before, and they are packaged in a weird way that doesn't suggest the author has a good understanding of the problem. In terms of Aaronson's blog post, the paper fails criterion 10: “The techniques just seem too wimpy for the problem at hand.”

Motivated to actually exhibit a flaw in the paper and not just declare it too naïve, I tried to read portions of it in detail.

The author starts by defining a notion of “(α,β,γ,δ,ε,ζ)(α, β, γ, δ, ε, ζ)-weighted generalized pseudocontraction”. From the terminology and definition, it is clear that this should be seen as a generalization of the usual notion of contraction (kk-Lipschitz map with k<1k < 1), and therefore the theorem is a generalization of the usual Banach fixed point theorem. It is stated as theorems 2.1, 2.2, 2.3, which contain a lot of equations repeated (it is not very pleasant to figure out what changes between them). Then I spent some time on the proofs on page 2. I can reprove lemma 2.1 easily, but I don't understand what the author is doing in each step of the proposed derivation. Lemma 2.2 is trivial, though it also suggests that the definition of (α,β,γ,δ,ε,ζ)(α, β, γ, δ, ε, ζ)-weighted generalized pseudocontractions was not really the right one to begin with. Then I got lost trying to understand what is done on page 3. The theorems have five sub-cases, but it seems only one is being used by the proof of the Collatz conjecture, and the other cases (seemingly useless) are purported to be trivial but I don't understand what the author derives this from.

In parallel, I tried to understand the section proving Collatz from the fixed point theorem. It quickly appears that the big case distinctions in the definitions of α,β,γ,δ,ε,ζα, β, γ, δ, ε, ζ seem to be completely useless, because they only serve to make a certain inequality satisfied (the one from the definitions of …-pseudocontractions), but it would be sufficient to use just one constant for each of these subcases, instead of six, or probably to just reason in “after a certain index” style. It is entirely unclear to me how all these constants (1, 1, 0, 1, 1, 0, 0, 0, 2, …) were chosen. Then again lots of computations follow. At this point, it started to take too much time to unravel the computations: I am appallingly bad myself at doing computations, as I tend to do all sorts of trivial mistakes as soon as I cannot entirely keep in my head what the computations are supposed to represent, which makes me even worse at checking computations.

I took a more high-level view again and tried to understand what the heck the paper was actually trying to do. As the fixed point theorem was supposed to be a Banach-type one, I went looking for the place where its proof would ressemble the classical and more or less obvious proof of the Banach fixed point theorem. This was on page 4, in the inequalities

d(Tnx,Tmx)k=nm1d(Tkx,Tk+1x)k=nm1Ak/2d(x,Tx)An/21A1/2d(x,Tx)\begin{align} d(T^n x, T^m x) &≤ \sum_{k=n}^{m-1} d(T^k x, T^{k+1}x) \\ &≤ \sum_{k=n}^{m-1} A^{k/2} d(x, T x) \\ &≤ \frac{A^{n/2}}{1-A^{1/2}} d(x, T x) \end{align}

These are preceded by a derivation of

d(Tnx,Tn+1x)2And(x,Tx)2d(T^n x, T^{n+1} x)^2 ≤ A^n d(x, T x)^2

Now if we specialize this to the context where it is used at the end of page 11, the ambient metric space is N, the constant AA is 1/21/2, and the map TT is

T:x{1if x=1x2if x is even3x+12otherwiseT : x ↦ \begin{cases} 1 &\text{if} \ x = 1 \\ \frac{x}{2} &\text{if} \ x \ \text{is even} \\ \frac{3x+1}{2} &\text{otherwise} \end{cases}

But then the inequality is completely false! Take for example x=5x = 5 and n=1n = 1, then the right hand side is 1/2852=4.51/2 ⋅ |8-5|^2 = 4.5 while the left hand side is 842=16|8-4|^2 = 16.

Behind the thick smoke screen of (to me, seemingly random) computations, what the paper is really trying to do deep down is to show that the map TT satisfies a very simple property which makes it resemble contractions enough that Banach's fixed point theorem can be recovered — except the map just doesn't have that property.

Admittedly, I somewhat forgot to keep non-mathematical readers in mind while trying to write down this post. But if you're still here, these would be my takeaways: There can of course be bad reasons to dismiss a paper, but there are also a lot of possible good reasons. What seems like a potentially novel idea to outsiders is often quickly recognized as just an application of ideas that are never going to work by experienced mathematicians. Humans are bad at checking proofs line-by-line, and rather want to understand what the author is trying to do, which concepts are being applied, what insight on the problem is being used. If there seems to be no insight, the purported proof is probably wrong but possibly difficult to refute precisely, and might not be worth reading at all.


Comments

Comment by Informatheux (9 March 2025 at 11:37)

Je pensais pas que ce serait aussi trivialement faux. Je comprends pas pourquoi certains s'obstinent à publier des preuves qui ont toutes les chances d'être fausses : on parlera d'eux, certes, mais en mal uniquement.

Comment by Jean Abou Samra (9 March 2025 at 18:42)

Je suis aussi perplexe sur ce qui se passe dans la tête de l'auteur de cet article (est-ce qu'il y croit vraiment ?), mais en même temps, je n'ai jamais compris les élèves qui essaient d'enfumer le correcteur en examen de maths, et pourtant ils semblent être courants… Et forcément, les problèmes hyper-célèbres comme Collatz attirent toutes sortes de gens trop sûrs d'eux, etc.

Leave a comment