Comment on "Dr. Bertlmann's Socks in a Quaternionic World of Ambidextral Reality"

I point out critical errors in the paper "Dr. Bertlmann's Socks in a Quaternionic World of Ambidextral Reality" by Joy Christian, published in IEEE Access. Christian's model does not generate the singlet correlations but in fact simply reproduces the Bertlmann effect. John Bell's friend and colleague Reinhold Bertlmann of CERN, in his younger days, always wore one pink and one blue sock, at random. The moment you saw his left foot, you knew what colour sock would be on his right foot. Action at a distance? As John Bell liked to explain, quantum entanglement cannot be explained away in such an easy way. Yet Christian's model assigns the two particles of the EPR-B experiment an equal and opposite spin at the source, the choice being determined by a fair coin toss. However they are measured, these spins are recovered. Christian's computer simulation works by not actually simulating his model at all but by almost directly tracing the negative cosine built into his computer algebra package. Bell's theorem has not been disproved. Debate as to what it means for the foundations of physics as well as for quantum information engineering (quantum communication, computation) is more lively today than ever before. Christian's paper, alas, does not contribute to the debate, but distracts from it. A possible role for Geometric Algebra is still wide open and deserves further investigation, informed by a proper understanding of the mathematical content of Bell's theorem.


I. INTRODUCTION
Bell's 1964 theorem [1] states that the conventional framework of quantum mechanics is incompatible with a physical principle called local realism. Bell's theorem is a cornerstone of modern quantum information theory, and of quantum computing. Proof that it is wrong would unleash a revolution in science with enormous impact on society and technology. Every textbook on quantum mechanics would have to be rewritten.
At the core of Bell's proof of his theorem is an elegant and simple probability inequality, going back to Boole (1853) [4]. However, Christian [8] claims in a recent paper in IEEE Access to have a counterexample. His other recent papers [6] and [7], the last of which was also published in IEEE Access, make the same claim. They have the common purpose of disproving Bell's theorem by giving a local realistic model, which in a nutshell means a classical physical model, which reproduces quantum correlations. Christian's models are formulated in the language of geometric algebra (GA), see Doran and Lasenby (2003) [10].
In this paper, I will go straight to the heart of Christian's new paper [8]. I show that the mathematical definitions in the new paper lead to a trivial model. The author's calculations within his model are incorrect. Inconsistent and ambiguous notation distracts, but cannot hide elementary errors in reasoning and mathematics. Bell's theorem is not contradicted. The computer simulation reported in the paper is not a simulation of the model described in the paper, and it does not prove anything.
Christian's idea that quantum correlations are explained by the geometry of space might seem appealing, but his work lends no support to this idea. Anyway, such an explanation would not be "local" in any meaningful sense. Christian seems to see the local spatial coordinate system of Alice as the mirror image of Bob's, the two orientations being determined completely at random, again and again! However, in modern accounts of Bell's theorem, angles and orientations play no role whatsoever. The new generation of loophole-free Bell experiments [20] measures correlations between four binary variables: two binary inputs and two binary outputs; one input and output at each of two distant locations. Bell does not take account of the geometry of space because his argument, on the side of local realism, does not depend on it in any way whatsoever.

II. THE HEART OF THE MATTER
In the language of probability theory, the mathematical core of Bell's original proof of his theorem is the assertion that one cannot find a single probability space on which are defined random variables $X_a$ and $Y_b$ taking values in the set $\{-1, +1\}$, for all unit vectors $a$, $b$ in $\mathbb{R}^3$, and such that
$$\mathbb{E}(X_a Y_b) = -a \cdot b$$
for all $a$, $b$. Moreover, the expectation values of $X_a$ and $Y_b$ are all zero. The point is that these statistical properties are predicted in a particular set-up studied in quantum mechanics called the EPR-B model after the famous papers of Einstein, Podolsky and Rosen, 1935 [12], and Bohm and Aharonov, 1957 [3]. This collection of joint probability distributions of two binary variables, indexed by pairs of directions $(a, b)$, is called "the singlet correlations". Both in real experiments and in quantum mechanical theory, one can only look at one pair of directions at a time. Thus, one can perform an experiment and observe one realisation of a pair of binary variables $(X_a, Y_b)$ for a given pair of settings $(a, b)$. This will then be repeated many times, for the same or for different setting pairs. Notice that when $a = b$, QM predicts perfect anti-correlation. In the EPR-B thought experiment, we imagine performing a measurement on each of two spin-half particles at a great distance from one another. The result $X_a$ of measuring Alice's particle is obtained before the direction $b$ in which Bob has chosen to measure his could possibly be known at Alice's location. Bob's setting can have no effect whatsoever on Alice's outcome. But Bob could measure in any direction, and if Alice were to measure in that same direction, her outcome would be the opposite of Bob's. This suggests that all the outcomes $X_a$, $Y_b$ for all possible directions $a$, $b$ exist in advance, perhaps as deterministic functions of the chosen directions and of some hidden variables.
Classical physical thinking therefore suggests that a local hidden variables (LHV) model could reproduce the singlet correlations. It would consist of two functions $A(a, \lambda)$ and $B(b, \lambda)$, taking the values $\pm 1$, and a probability distribution $\rho$ over the space $\Lambda$ of possible values of the hidden variable $\lambda$. Now simply change notation: write $\omega$ instead of $\lambda$, $\Omega$ instead of $\Lambda$. The random variable $A_a$ is just the function $A(a, \cdot)$ defined on $\Omega$. The word "local" refers to the fact that each of those measurement functions depends only on the measurement setting at the relevant location. The hidden variable need not be thought to be "localised" at any particular place. It could contain components to account for randomness at each measurement location in interaction with the measurement setting chosen there. Moreover, all these contributions can be correlated with one another. Bell (1964) proved that there can be no LHV model which even approximately reproduces the singlet correlations. His proof was soon improved by Clauser, Horne, Shimony, and Holt (1969) [9]. Consider two directions $a_1$ and $a_2$, and two directions $b_1$ and $b_2$, all in the equatorial plane. They correspond to angles in $[0, 2\pi)$. Consider angles $\alpha_1 = 0$, $\alpha_2 = \pi/2$; $\beta_1 = \pi/4$, and $\beta_2 = 3\pi/4$. Thus $-a_i \cdot b_j = -\cos(\alpha_i - \beta_j)$. Three of the differences $\alpha_i - \beta_j$ are equal to $\pm\pi/4$ and one is $\pm 3\pi/4$. This means that $-a_i \cdot b_j$ equals $-\cos(\pi/4) = -\sqrt{2}/2$ for three of the four combinations and $+\cos(\pi/4) = +\sqrt{2}/2$ for the fourth. I will temporarily abuse notation and write $X_i$ for $X_{a_i}$ and $Y_j$ for $Y_{b_j}$, and consider
$$X_1 Y_2 - X_1 Y_1 - X_2 Y_1 - X_2 Y_2 = X_1 (Y_2 - Y_1) - X_2 (Y_1 + Y_2).$$
Since the four random variables take the values $\pm 1$ only, one of the terms in brackets equals $\pm 2$ and the other equals 0. They are multiplied by $\pm 1$ and subtracted. The whole expression therefore takes the value $\pm 2$. Its mean value therefore cannot exceed $+2$.
Thus we obtain the one-sided Bell-CHSH inequality
$$\mathbb{E}(X_{a_1} Y_{b_2}) - \mathbb{E}(X_{a_1} Y_{b_1}) - \mathbb{E}(X_{a_2} Y_{b_1}) - \mathbb{E}(X_{a_2} Y_{b_2}) \le 2. \qquad (2)$$
However, if the joint probability distribution of these four random variables were to reproduce the singlet correlations, the same expression would have to equal $2\sqrt{2}$. The mathematical core of Bell's proof of his theorem can also be expressed in the language of distributed computing. It then becomes the assertion that one cannot write programs for two separated classical computers, each receiving as inputs streams of directions and generating as outcomes streams of numbers $\pm 1$, such that the correlation between the outputs given the inputs $a$, $b$ is $-a \cdot b$. In the appendix to this paper we explain the equivalence of the probability theory no-go theorem and the distributed computing no-go theorem.
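The counting step behind the Bell-CHSH bound can be checked by brute force. The following is a minimal sketch, not part of the original argument: it assumes the combination $X_1Y_2 - X_1Y_1 - X_2Y_1 - X_2Y_2$ (the function name `chsh` and the dictionaries `alpha`, `beta` are illustrative), enumerates all sixteen assignments of $\pm 1$ to the four variables, and evaluates the same combination of the singlet correlations at the four angles given in the text.

```python
import itertools
import math

# Angles from the text: Alice's alpha_1, alpha_2 and Bob's beta_1, beta_2.
alpha = {1: 0.0, 2: math.pi / 2}
beta = {1: math.pi / 4, 2: 3 * math.pi / 4}


def chsh(x1, x2, y1, y2):
    """The combination X1*Y2 - X1*Y1 - X2*Y1 - X2*Y2, written in factored form."""
    return x1 * (y2 - y1) - x2 * (y1 + y2)


# Values taken by the combination over every assignment of +/-1 to the four variables.
values = {chsh(*v) for v in itertools.product((-1, 1), repeat=4)}


def rho(i, j):
    """Singlet correlation -a_i . b_j = -cos(alpha_i - beta_j)."""
    return -math.cos(alpha[i] - beta[j])


# The same combination, applied to the singlet correlations.
singlet_value = rho(1, 2) - rho(1, 1) - rho(2, 1) - rho(2, 2)

print(sorted(values))
print(singlet_value, 2 * math.sqrt(2))
```

Because the combination only ever takes the values $\pm 2$, its mean under any joint distribution of the four variables lies in $[-2, 2]$, whereas the singlet correlations would require $2\sqrt{2}$.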
Christian's latest paper [8] includes what appears to be a construction of a LHV model with measurement functions $A$ and $B$ having the prescribed properties. His hidden variable is a fair coin toss, and he gives it a physical interpretation as left- or right-handedness of a coordinate system. He also supplies a computer simulation. In this note we will show that Christian's construction (just as in his preceding works) actually leads to the unwanted result
$$\mathbb{E}\bigl(A(a, \lambda)\, B(b, \lambda)\bigr) = -1$$
for all $a$, $b$. One could call this the Bertlmann's socks correlation. Christian's model is a local hidden variables model, a trivial one. It does not reproduce the predictions of quantum mechanics for the EPR-B thought experiment. Christian has not disproved Bell's theorem. We will also explain what is wrong with his computer simulation. It apparently does reproduce the singlet correlations, but this means that it cannot actually be a simulation of Christian's model. By Bell's theorem, it cannot even be a simulation of a local hidden variables model. Indeed, inspection of the code shows that the program merely computes $-a \cdot b$ plus some random bivector noise of mean zero, directly from $a$ and $b$.
Heine Rasmussen (in an internet forum debate) pointed out the following short cut to realising that Christian's claims must be false. For $a \ne \pm b$, the probability distribution of the pair of binary variables $(X_a, Y_b)$ predicted by quantum mechanics gives positive probability to each of four distinct joint outcomes $(\pm 1, \pm 1)$. There is no way one can simulate a single draw from a probability distribution over four outcomes, each of positive probability, as a deterministic function of the outcome of one fair coin toss. Christian's hidden variable $\lambda$, which one may identify with the elementary outcome $\omega$ of the alleged probability model on which all those random variables are defined, is a fair coin toss, and in his model, the results of measurement of spin of the two particles in any two directions are functions only of $\lambda$ and of the relevant direction. Christian's computer simulation program uses a fair coin toss to average Geometric Algebra products using the fundamental GA formula $a \cdot b = \tfrac{1}{2} ab + \tfrac{1}{2} ba$. For those needing an introduction to the whole field of Bell's theorem, Bell's inequalities, local hidden variables, loophole-free Bell experiments, and computer simulations thereof, the appendix to this paper supplies some further background and makes some further remarks concerning issues raised by the referees.
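Rasmussen's observation can be made concrete with a few lines of code. This is a sketch under stated assumptions: for $\pm 1$-valued outcomes with zero means and correlation $-a \cdot b$, the joint probabilities are $(1 - xy\, a \cdot b)/4$; the functions `A` and `B` below are hypothetical Bertlmann-type measurement functions chosen only for illustration, and all function names are mine.

```python
import math


def singlet_joint(theta):
    """Joint law of (x, y) in {-1,+1}^2 with zero means and E[xy] = -cos(theta)."""
    c = math.cos(theta)
    return {(x, y): (1 - x * y * c) / 4 for x in (-1, 1) for y in (-1, 1)}


def coin_model_support(A, B, a, b):
    """Joint outcomes reachable when both results are functions of a single coin."""
    return {(A(a, lam), B(b, lam)) for lam in (-1, 1)}


def A(a, lam):
    """Hypothetical measurement function: Alice's outcome is the coin itself."""
    return lam


def B(b, lam):
    """Hypothetical measurement function: Bob's outcome is minus the coin."""
    return -lam


probs = singlet_joint(math.pi / 3)                 # angle between a and b, not 0 or pi
support = coin_model_support(A, B, 0.0, math.pi / 3)

print(probs)     # four joint outcomes, each with positive probability
print(support)   # outcomes reachable from one coin toss
```

Whatever deterministic functions replace `A` and `B`, the set returned by `coin_model_support` has at most two elements, so it can never cover the four joint outcomes to which quantum mechanics assigns positive probability.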

III. THE GEOMETRIC ALGEBRA AND THE COMPUTER SIMULATION
I will now go into specific problems in Christian's newest paper [8], which the reader will need to have to hand. Christian formulates his model in terms of the Clifford Algebra $Cl_{(3,0)}(\mathbb{R})$. Recall that the algebra is generated by starting with three elements (called basis vectors) $e_1$, $e_2$, and $e_3$, which anticommute with one another, and which each square to $+1$. Using those multiplication rules, we can furthermore generate on the one hand the scalar 1, and on the other hand three basis bivectors $e_1 e_2$, $e_1 e_3$, $e_2 e_3$ and a basis trivector $e_1 e_2 e_3$. Our algebra consists exactly of all real linear combinations of the scalar 1, the three basis vectors, the three basis bivectors, and the single basis trivector. All this makes $Cl_{(3,0)}(\mathbb{R})$ a $1 + 3 + 3 + 1 = 8$ dimensional real vector space endowed with a compatible non-commutative but associative multiplication operation.
Real multiples of 1 are called scalars, real linear combinations of $e_1$, $e_2$, and $e_3$ are called vectors, real linear combinations of $e_1 e_2$, $e_1 e_3$, $e_2 e_3$ are called bivectors, and real multiples of $e_1 e_2 e_3$ are called trivectors, and also called pseudo-scalars. The scalars 0 and 1 play, as elements of the algebra, the roles of an (additive and multiplicative) zero and of a multiplicative identity. Every element of the algebra can be written in a unique way as a sum of a scalar, vector, bivector and trivector. The algebra is called graded; it is built up of elements of grades 0, 1, 2 and 3. We think of the vectors in the algebra as real 3D vectors in Euclidean space. The bivectors can be thought of as oriented plane elements, the trivectors as oriented volume elements. The bivectors together with the scalars form a four-dimensional sub-algebra. It is the algebra of the quaternions, discovered by Hamilton.
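The multiplication rules just recalled are enough to implement the whole algebra in a few lines. The sketch below is my own illustration, not code from [8] or from GAViewer: basis blades are encoded as 3-bit masks, a multivector is a dictionary from mask to real coefficient, and the names `reorder_sign` and `gp` are assumptions of this sketch.

```python
# Basis blades of Cl(3,0) as 3-bit masks: bit 0 = e1, bit 1 = e2, bit 2 = e3.
# Mask 0 is the scalar 1 and mask 7 is the trivector e1 e2 e3.


def reorder_sign(a, b):
    """Sign picked up when the basis vectors of blade a followed by blade b are sorted."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(u, v):
    """Geometric product of two multivectors, using e_i e_i = +1."""
    out = {}
    for a, x in u.items():
        for b, y in v.items():
            blade = a ^ b  # repeated basis vectors square to +1 and drop out
            out[blade] = out.get(blade, 0) + reorder_sign(a, b) * x * y
    return {k: c for k, c in out.items() if c != 0}


e1, e2, e3 = {1: 1}, {2: 1}, {4: 1}
I = gp(gp(e1, e2), e3)  # the pseudo-scalar e1 e2 e3

# The even-grade blades (scalar and bivectors) are closed under multiplication.
even = [0, 3, 5, 6]
even_closed = all(bin(a ^ b).count("1") % 2 == 0 for a in even for b in even)

print(gp(e1, e1), gp(e1, e2), gp(e2, e1))  # square of e1; e1 e2 and e2 e1 differ in sign
print(gp(I, I))                            # square of the pseudo-scalar
print(even_closed)
```

The eight masks 0 to 7 correspond to the $1 + 3 + 3 + 1$ basis elements, and the closure check on the even-grade masks reflects the four-dimensional quaternion sub-algebra mentioned above.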
I will start with some notational problems. The main mathematical problem will come later, and cannot be resolved by cleaning up the notation, as I will show. Ambiguous notation is just a warning signal.
Equations (32) and (33) of [8] introduce bivectors $L(a, \lambda)$, $L(b, \lambda)$, $D(a)$ and $D(b)$. Here, $a$ and $b$ are ordinary unit length 3D vectors, but also seen as elements of grade 1 in the Geometric Algebra. The second formal argument of $L$ is the scalar called $\lambda$, which can take the values $+1$ and $-1$. We are told in (32) that $L(a, \lambda) = \lambda D(a) = \lambda I a$ and in (33) that $L(b, \lambda) = \lambda D(b) = \lambda I b$, where $I$ is the trivector $e_1 e_2 e_3$. (Christian writes $L(a, \lambda) = \lambda I \cdot a$ and $I = e_1 \wedge e_2 \wedge e_3$, but the symbols "$\cdot$" and "$\wedge$" in these contexts are superfluous.) The pseudo-scalar $I$ commutes with everything and $I^2 = -1$.
It follows directly from Christian's (32) and (33) that
$$L(a, \lambda)\, L(b, \lambda) = \lambda^2\, I a\, I b = I^2\, ab = -ab,$$
which does not depend on $\lambda$ at all. In fact, from geometric algebra we know that
$$L(a, \lambda)\, L(b, \lambda) = -ab = -a \cdot b - I\,(a \times b).$$
Christian states, just after his (28), that the scalar $\lambda$ stands for the "handedness" of a basis of the tangent space at any point of $S^3$. The tangent space at every point is $\mathbb{R}^3$. If one fixes an orthonormal basis of the tangent space at one point, one can label its elements, and try to move it smoothly around the manifold. It is a fact that one can move around the manifold $S^3$ and discover that the labelling has changed when one comes back to the same position. One needs to complete the circuit twice to get back to the initial configuration. This is Dirac's famous belt trick, cf. Christian's Möbius strip example in his Section II.
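The claim that the product $L(a, \lambda) L(b, \lambda)$ does not depend on $\lambda$ can be checked numerically. The sketch below is an illustration under an assumption that is mine, not Christian's: it uses the standard Pauli-matrix representation of the algebra, in which a vector $a$ becomes $a_1\sigma_1 + a_2\sigma_2 + a_3\sigma_3$ and the pseudo-scalar $I$ becomes $i$ times the identity. The helper names `vec`, `mul`, `scale`, `L` and `close` are hypothetical.

```python
def vec(a):
    """Pauli-matrix image of the 3D vector a = (a1, a2, a3)."""
    a1, a2, a3 = a
    return ((a3, a1 - 1j * a2), (a1 + 1j * a2, -a3))


def mul(m, n):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def scale(c, m):
    """Multiply every entry of the matrix m by the number c."""
    return tuple(tuple(c * x for x in row) for row in m)


def L(a, lam):
    """L(a, lam) = lam * I * a, with I represented by 1j times the identity."""
    return scale(lam * 1j, vec(a))


def close(m, n, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(m[i][j] - n[i][j]) < tol for i in range(2) for j in range(2))


a = (0.6, 0.0, 0.8)   # two arbitrary unit vectors
b = (0.0, 1.0, 0.0)
minus_ab = scale(-1, mul(vec(a), vec(b)))
products = {lam: mul(L(a, lam), L(b, lam)) for lam in (-1, 1)}

print(all(close(products[lam], minus_ab) for lam in (-1, 1)))
```

Both values of $\lambda$ give the same matrix, namely the image of $-ab$, because $\lambda$ enters the product only through $\lambda^2 = 1$.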
Is it possible that the contradiction just obtained follows from unfortunate notation? Perhaps the author has in mind both a right-handed and a left-handed cross product. And even, perhaps, a right-and left-handed geometric product? Can we restore consistency by introducing either of these features explicitly?
For example, consider what happens if we keep one Geometric Algebra product, but introduce two Euclidean space cross-products, $\times^{(\lambda)}$ where $\lambda = \pm 1$, by the rules
$$a \times^{(\lambda)} b = \lambda\,(a \times b).$$
Now equation (36), corrected, makes sense and is consistent with what follows. Having restored consistency to the definitions, we can now quickly check formulas (34) and (35), taking account of (36).
However, there is no point at all in trying to fix these notational issues. Let us jump to the definition of the important measurement functions, $A$ and $B$. I will not use any modification of earlier definitions. I start with the left hand parts of Christian's defining equalities (39) and (40); the ones with limits as $s_1$ converges to $a$ and as $s_2$ converges to $b$ of a $D$ times an $L$. Those limits can be written down immediately, giving us
$$A(a, \lambda) = \lambda, \qquad B(b, \lambda) = -\lambda,$$
exactly as Christian himself tells us with the right hand sides of his (39) and (40), and consistent with his (43)-(47). Moreover, in (47) to (49), Christian writes explicitly that the product of $A$ and $B$ is identically equal to minus 1. For this very reason, Christian embarks on alternative but more complex computations of the same quantity, and arrives at a very different result, namely the actual singlet correlations. These alternative calculations must be wrong. Anyone who has some patience and can do elementary calculus will easily locate fatal errors. Christian attempts to evaluate the product of two limits, one as $s_1$ converges to $a$, the other as $s_2$ converges to $b$. He evaluates this limit by imposing $s_1 = s_2 = s$ before taking the limit. Taking the product first, the variables $s_1$ and $s_2$ happen to cancel, since $s^2 = 1$. He now takes the limit as $s_1 \to a$, $s_2 \to b$ of an expression which does not depend on $s_1$, $s_2$ at all.
Christian rounds things off with a computer simulation written in the language of the Geometric Algebra package GAViewer, which accompanied the book Dorst et al. (2007) [11]. The package does not run on present day Mac or Linux machines without recompiling and building it from the source code. Fortunately, the code in Christian's paper is easy to read, with the help of the GAViewer user's manual.
The outcomes of the measurement functions are computed but ignored, except in order to compute their averages, which of course are close to zero. The program jumps to an intermediate step in Christian's theoretical evaluation of the product of the outcomes. About half the time, the "product of the measurements" is defined by the code as the quaternion $-ab$; the other half of the time it is the quaternion $-ba$. Recall the fundamental facts of 3D Geometric Algebra
$$ab = a \cdot b + I\,(a \times b), \qquad ba = a \cdot b - I\,(a \times b).$$
The program randomly samples many uniformly distributed, independent unit length vectors $a$ and $b$. For a given pair, it computes either $-ab$ or $-ba$, chosen by the outcome of a fair coin toss. It also computes the arc cosine of $a \cdot b$. The scalar part of the average of the geometric products is plotted against the angles $\cos^{-1}(a \cdot b)$, grouped into bins. The bivector parts are printed to show they are small, but they have to be discarded from the plot anyway.
Christian's companion IEEE Access paper [7] also concluded with a computer simulation. His computer programmer for that paper, C.F. Diether, adopted Gill's [17] implementation of Pearle's (1970) [22] detection loophole model. Gill had fixed errors in Pearle's (1970) classic paper, posted R code on the internet, and discussed it in public internet discussions in early 2014. That was the first time that anyone had implemented the Pearle model as a computer simulation. Till then, it had been seen as a purely theoretical result on the minimal efficiency of detectors needed to violate the CHSH inequality in the face of the detection loophole when using the usual state and spin measurements. Gill (2020a) [16] already showed that [7] contains the same defects as [8].
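The behaviour of the GAViewer program described above can be mimicked without any Geometric Algebra package. The following sketch is a re-implementation of that description in Python, not Christian's code: it uses only the identities $-ab = -a \cdot b - I(a \times b)$ and $-ba = -a \cdot b + I(a \times b)$, and the function names, number of trials, number of bins and seed are all assumptions of mine.

```python
import math
import random


def random_unit_vector(rng):
    """Uniform point on the unit sphere via normalised Gaussian coordinates."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:
            return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def simulate(n_trials=20000, n_bins=20, seed=1):
    rng = random.Random(seed)
    sums, counts = [0.0] * n_bins, [0] * n_bins
    bivector = [0.0, 0.0, 0.0]
    for _ in range(n_trials):
        a, b = random_unit_vector(rng), random_unit_vector(rng)
        lam = rng.choice((-1, 1))      # fair coin: +1 picks -ab, -1 picks -ba
        scalar = -dot(a, b)            # scalar part, identical for -ab and -ba
        # Bivector coefficients: -(a x b) for -ab, +(a x b) for -ba.
        bivector = [s - lam * c for s, c in zip(bivector, cross(a, b))]
        theta = math.acos(max(-1.0, min(1.0, dot(a, b))))
        k = min(int(theta / math.pi * n_bins), n_bins - 1)
        sums[k] += scalar
        counts[k] += 1
    means = [s / c if c else None for s, c in zip(sums, counts)]
    return means, [x / n_trials for x in bivector]


means, mean_bivector = simulate()
for k, m in enumerate(means):
    if m is not None:
        centre = (k + 0.5) * math.pi / len(means)
        print(round(centre, 3), round(m, 3), round(-math.cos(centre), 3))
print(mean_bivector)
```

The coin toss never touches the scalar part, so the plotted curve is the negative cosine by construction; no measurement outcomes $\pm 1$ enter the calculation at any point.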

IV. CONCLUSION
Christian's model simply reproduces the Bertlmann effect. Bertlmann always wore one pink and one blue sock, at random. The moment you saw his left foot, you knew what sock he wore on his right foot. As John Bell explained [2], quantum entanglement cannot be explained in such a way. Christian's model assigns the two particles of the EPR-B experiment an equal and opposite spin at the source, and however they are measured, these spins are recovered.

A. INSIDE BELL'S THEOREM: NO-GO THEOREMS IN PROBABILITY THEORY AND IN DISTRIBUTED COMPUTING
I see Bell's theorem as the metaphysical statement that quantum mechanics (QM) is incompatible with local realism (LR). More precisely, and following Tsirelson's Citizendium.org article [23], Bell's theorem states that conventional quantum mechanics is a mathematical structure incompatible with the conjunction of three mathematical properties: relativistic local causality (commonly abbreviated to "locality"), counterfactual definiteness ("realism") and no-conspiracy ("freedom"). By conventional quantum mechanics, I mean: quantum mechanics including the Born rule, but with a minimum of further interpretational baggage. Whether the physicist likes to think of probabilities in a Bayesian or in a frequentist sense is up to them. In Many Worlds interpretations (and some other approaches), the Born rule is argued to follow from the deterministic (unitary evolution) part of the theory. But anyway, everyone agrees that it is there.
Bell himself, a physicist writing for physicists, sometimes used the phrase "my theorem" to refer to his inequalities: first his (Bell, 1964) three correlations inequality [1], and later what is now called the Bell-CHSH (Clauser, Horne, Shimony, Holt 1969) [9] four correlations inequality (2). I see those inequalities as simple probabilistic lemmas used in Bell's various proofs over the years of the same theorem (the incompatibility of QM and LR). Fine (1982) [13] converted Bell's theorem into an "if and only if" result. He showed that the satisfaction of all eight one-sided Bell-CHSH inequalities together with the four no-signalling equalities is necessary and sufficient for a local hidden variables theory to explain the sixteen conditional probabilities $p(x, y \mid a, b)$ of pairs of binary outcomes $x$, $y$ given pairs of binary settings $a$, $b$, in a Bell-CHSH-type experiment. (No-signalling is the statement that Alice cannot see from her statistics what Bob is doing: $p(x \mid a, b)$ does not depend on $b$, and similarly for Bob, $p(y \mid a, b)$ does not depend on $a$.) A precursor of Fine's theorem can be recognised in Boole's (1853) book [4]. Illustrating general methodology developed in his book, Boole derives the conditions on three probabilities $p$, $q$ and $r$ of three events which must hold in order that a probability space exists on which those three events can be defined with precisely those three probabilities, given certain logical relations between those three events, and comes up with what can be recognised, with some creativity, as the six one-sided Bell three-correlation inequalities. With four events, his methodology would have given us Fine's theorem. More recently these results have been generalised to experiments with arbitrary numbers of parties, measurement settings, and measurement outcomes [5].
In a Bell-CHSH type experiment we have two locations or labs, in which two experimenters Alice and Bob can each choose a binary setting to a device, which then generates a binary output or measurement outcome. The experimenters have previously set things up so that the setting choices correspond to certain angles or directions. Just two settings are considered in each wing of the experiment. This is repeated, say N times. We will talk about one run consisting of N individual trials. The binary setting choices are externally generated, perhaps by tossing coins or performing some other auxiliary experiment. One published experiment used, as inputs, the bits of a maximally compressed video recording of the movie "Back to the Future". The spatial-temporal arrangement of the 2N measurements is such that there is no way a signal carrying Alice's nth setting, sent just before it is inserted into her device, could reach Bob's lab before his device has generated its nth outcome, even if transmitted at the speed of light, and vice versa.
These experiments involve measurements of the "spin" of "quantum spin-half particles" (electrons, for instance); or alternatively, measurements of the polarization of photons in the plane perpendicular to their directions of travel. The two settings, both of Alice and Bob, correspond to two directions (spin) or orientations (polarization), usually in the plane, but conceivably in three-dimensional space. (Polarization can be represented as a direction in 3D, on what is called the Poincaré sphere.) Focussing on the case of spins: in the so-called singlet state of two entangled spin-half quantum systems, one can conceivably measure each subsystem in any 3D direction whatsoever, and the resulting pair of $\pm 1$-valued outcomes $(X_a, Y_b)$ would have the "correlation" $\mathbb{E}(X_a Y_b) = -a \cdot b$. Marginally, they would be completely random, $\mathbb{E}(X_a) = \mathbb{E}(Y_b) = 0$.
These statistical predictions are easy to compute, using the standard rules of quantum mechanics, for the so-called EPR-B experiment: the Einstein, Podolsky, Rosen (1935) thought experiment, transferred to spin by Bohm and Aharonov (1957). (Translated to the polarization example, this joint probability distribution of two binary variables is often called Malus' law.) We will stick to the spin-half terminology and talk about "the singlet correlations" referring to the whole family, indexed by pairs of directions, of joint probability distributions of two binary variables just described. The archetypical example (though itself only a thought experiment) of such an experiment would involve two Stern-Gerlach devices and is a basic example in many quantum physics texts.
Bell was interested in what one nowadays calls (stochastic) "local hidden variables theories" (LHV). According to such a theory, the statistics predicted by quantum mechanics, and observed in experiments, are merely the reflection of a classical underlying theory of an essentially deterministic and local nature. There might be local randomness, for instance, further randomness in the measurement devices. Different sources of randomness could even be correlated. Mathematically, such theories are generally agreed to assert the mathematical existence of a classical probability space on which are defined random variables $X_a$ and $Y_b$ for all directions $a$ and $b$ in the plane (or in space), such that each pair $(X_a, Y_b)$ has the previously described joint probability distribution. A natural mathematical question is: can such a probability space exist? The answer is well-known to be "no". Just one of the many ways to prove this theorem is through Bell's inequalities.
The underlying probability space is usually called $\Lambda$ instead of $\Omega$, and the elementary outcomes $\lambda \in \Lambda$ stand for the configuration of all the particles involved in the whole combined set-up of a source connected to two distant detectors, which are fed the settings $a$ and $b$ from outside. Thus $X_a(\lambda)$ stands within the mathematical model for the outcome which Alice would theoretically see if she used the setting $a$, even if she actually used another. There is no claim that these variables exist in reality, whatever that means. We are talking about the mathematical existence of a model with certain mathematical properties.
In a sequence of trials, one might initially suppose that for each trial there is some kind of resetting of apparatus, so that at the $n$th trial we see the outcomes corresponding to $\lambda = \lambda_n$, where $\lambda_1, \lambda_2, \ldots$ are independent draws from the same probability measure on the same probability space $\Lambda$. Now suppose we could come up with such a theory, and indeed come up with a (classical) Monte-Carlo computer simulation of that theory, on a classical PC. Then we could do the following. Simulate $N$ outcomes of the hidden variable $\lambda$, and simply write them into two computer programs as $N$ constants defined in the preamble to the programs. More conveniently, if they were simulated by a pseudo random number generator (RNG), then we could write the constants used in the generator, and an initial seed, as just a few constants, and reproduce the RNG itself inside both programs. The programs are to be run on two computers thought of as belonging to Alice and Bob. Think of the case of directions in the plane. The two programs are started. They both set up a dialogue (a loop). Initially, $n$ is set to 1. Alice's computer prints the message "Alice, this is trial number n = ... . Please input an angle." Alice's computer then waits for Alice to type an angle and hit the "enter" key. Bob's computer does exactly the same thing, repeatedly asking Bob for an angle.
If, on her $n$th trial, Alice submits the direction $a$, then the program on her computer evaluates and outputs $X_a(\lambda_n) = \pm 1$, increments $n$ by one, and the dialogue is repeated. Alice's computer does not need Bob's direction for this: locality! Bob's does not need Alice's. Thus, if one could implement a local hidden variables theory for one trial of a Bell-type experiment in one computer program, then one could simulate the singlet correlations derived from one run of many trials on two completely separate computers, each running its own program, and each receiving its own stream of inputs (settings) and generating its own stream of outputs.
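The two-computer protocol described above can be sketched as follows. Everything specific here is an illustrative assumption rather than anything taken from the text: the class name `Station`, the seeds, and in particular the hidden variable model, in which $\lambda_n$ is a uniformly random angle and each outcome is the sign of $\cos(\text{angle} - \lambda_n)$. Each station holds its own copy of the seeded generator and only ever sees its own setting.

```python
import math
import random


class Station:
    """One computer: its own copy of the seeded RNG, fed only its own settings."""

    def __init__(self, seed, sign):
        self.rng = random.Random(seed)   # the same seed is used at both stations
        self.sign = sign                 # +1 for Alice, -1 for Bob
        self.n = 0                       # trial counter

    def measure(self, angle):
        self.n += 1
        lam = self.rng.uniform(0.0, 2.0 * math.pi)   # hidden variable lambda_n
        return self.sign * (1 if math.cos(angle - lam) >= 0 else -1)


def run(n_trials=40000, seed=2024, settings_seed=7):
    alice, bob = Station(seed, +1), Station(seed, -1)
    user = random.Random(settings_seed)   # stands in for the outside users
    alphas = (0.0, math.pi / 2)
    betas = (math.pi / 4, 3 * math.pi / 4)
    sums, counts = {}, {}
    for _ in range(n_trials):
        i, j = user.randrange(2), user.randrange(2)
        x = alice.measure(alphas[i])      # uses Alice's setting only
        y = bob.measure(betas[j])         # uses Bob's setting only
        sums[i, j] = sums.get((i, j), 0) + x * y
        counts[i, j] = counts.get((i, j), 0) + 1
    corr = {k: sums[k] / counts[k] for k in sums}
    # Index pair (0, 1) is the setting pair (1, 2) of the text.
    s = corr[0, 1] - corr[0, 0] - corr[1, 0] - corr[1, 1]
    return s, corr


s, corr = run()
print(corr)
print(s, 2 * math.sqrt(2))
```

This particular model gives perfect anti-correlation at equal settings, but its correlations are a piecewise linear function of the angle rather than a cosine, so the CHSH combination stays near 2 (up to sampling noise) and well short of $2\sqrt{2}$.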
This idea goes back a long way. It is for instance mentioned in Jaynes (1999) [21]. In his paper, presented at the MaxEnt conference the preceding year, Jaynes had argued that Bell did not understand conditional probability. The paper was discussed by Steve Gull, who disagreed, and had posed the problem: "Write a program which is to run on two PC's which mimics the QM predictions for the EPR setup. There must be no communication between the computers after the time of program load". He then presented a "Sketch proof of impossibility" using Fourier theory, no Bell inequalities at all. Jaynes was dumbfounded and predicted that it would take 30 years to understand Gull's new result, just as it had taken 20 years to understand Bell's. Gull's overhead transparencies are reproduced [19] on his home page. Gill and Karakozak (2020) [18] have worked out the proof in full detail.
But why should those two functions be the same, for each trial? Maybe the parameters of the underlying physics drift, and even occasionally jump. Experimenters know there is an enormous stability problem in these experiments. The LHV model for the $n$th trial should be allowed to depend on all previous inputs and outputs of all previous trials, as well as depending on time, $n$. The only thing that is forbidden is that Alice's $n$th output depends on Bob's $n$th input, and vice versa.
Focussed on classical networked computer simulations of stringent CHSH type experiments, Gill (2003) [14] produced a martingale inequality using a variant of the CHSH statistic. He considered the case that settings are chosen completely at random. He noticed that if the denominators of the four fractions defining the four sample correlations are replaced by their expectation values $N/4$ and the whole statistic is multiplied by $N/4$, it then equals a sum over the $N$ trials of a quantity which, under local realism, and assuming the complete randomness of the setting choices only, has conditional expectation (given the past) less than 3/4. Subtract off $(3/4)N$ and one has, under local realism, a supermartingale in the time variable $N$. The conditional expectations of the increments of the process, given the past, are negative. The increments of the process are bounded, and martingale theory supplies powerful exponential inequalities on the probabilities of large deviations upwards.
Gill's methods were further refined. Hensen et al. (2015) [20], reporting the first ever successful "loophole-free" Bell-type experiment in the journal Nature, also derived and employed a new martingale based modification of the CHSH inequality. Consider such an experiment and let us say that the $n$th trial results in a success, if and only if the two outcomes are equal and the settings are the pair (1, 2), or the two outcomes are opposite and the settings are not the pair (1, 2). The quantum engineering is such as to ensure a large positive correlation between the outcomes for setting pair (1, 2) and a large negative correlation for setting pairs (1, 1), (2, 1) and (2, 2). Let us denote the total number of successes in a fixed number, $N$, of trials, by $S_N$. Then Hensen et al. (2015) [20] show that, for all $x$, under the assumption of local realism,
$$P(S_N \ge x) \le P\bigl(\mathrm{Bin}(N, 3/4) \ge x\bigr), \qquad (12)$$
where $\mathrm{Bin}(N, p)$ denotes a binomially distributed random variable with parameters $N$, the number of trials, and success probability per independent trial $p$. "The assumption of local realism" is a bundle of physics concepts. Notice that a network of two classical PCs both performing a completely deterministic computation, and allowed to communicate over a classical wired connection between every trial and the next, but not during each trial, does satisfy those assumptions. The theorem applies to a classical distributed computer simulation of the usual quantum optics lab experiment. Time trends and time jumps in the simulated physics, and correlations (dependency) due to use of memory of past settings (even of the past settings in the other wing in the experiment) do not destroy the theorem. The probability inequality (12) is driven solely by the completely random choice anew, trial after trial, of one of the four pairs of settings, while each computer is only fed its own setting, not that given to the other computer.
Statistical randomisation neutralises effects of uncontrolled (and maybe even unknown) confounders, and leads to guaranteed P-values.
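The success probability 3/4 appearing in the binomial bound can be illustrated for a single trial. The sketch below is an illustration, not the martingale proof of Hensen et al.: it enumerates the sixteen deterministic local strategies $(x_1, x_2, y_1, y_2)$, where $x_i$ is Alice's outcome for setting $i$ and $y_j$ is Bob's outcome for setting $j$, and computes the chance of a success when the setting pair is chosen uniformly at random. The function names are mine.

```python
import itertools


def success(i, j, x, y):
    """Success rule from the text: equal outcomes on pair (1, 2), opposite otherwise."""
    return (x == y) if (i, j) == (1, 2) else (x != y)


def win_probability(x1, x2, y1, y2):
    """Success probability of one deterministic strategy under uniform random settings."""
    xs, ys = {1: x1, 2: x2}, {1: y1, 2: y2}
    wins = sum(success(i, j, xs[i], ys[j]) for i in (1, 2) for j in (1, 2))
    return wins / 4


best = max(win_probability(*s) for s in itertools.product((-1, 1), repeat=4))
print(best)
```

No deterministic local strategy succeeds on all four setting pairs, and mixing such strategies at random cannot do better than the best of them, which is why the per-trial success probability under local realism is capped at 3/4.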
Take for instance $N = 10\,000$. Take a critical level of $x = 0.8N$. Local realism says that $S_N$ is stochastically smaller (in the right tail) than the $\mathrm{Bin}(N, 0.75)$ distribution. According to quantum mechanics, and using the optimal pairs of settings and the optimal quantum state, $S_N$ would have the $\mathrm{Bin}(N, p)$ distribution with $p = \tfrac{1}{2} + \tfrac{\sqrt{2}}{4} \approx 0.85$. Under those two distributions, the probabilities of outcomes respectively larger and smaller than $0.80N$ are about $10^{-30}$ and $10^{-40}$ respectively. I challenge anyone who believes in local realism to develop a computer simulation of a local realistic physical model, which subject to the external experimental constraints sketched by Bell in [2], and nowadays routinely imposed in "loophole-free Bell experiments", reliably achieves a greater than 80% large $N$ success rate. The simulation experiment must be reproducible through the use of a "set seed" facility in any used RNG, so that it can be verified by intensive testing that Alice's $n$th output does not depend on Bob's $n$th input, nor on future inputs of Alice or Bob. The inputs must not be generated inside the simulation, but must be supplied by the outside user. The statistical analysis of the outputs must also be left to the user.
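The two tail probabilities can be examined with an exact binomial calculation in log space. This is a rough numerical sketch with helper names of my own choosing; it prints base-10 logarithms so that the orders of magnitude can be compared with those quoted above. The exact figures depend on whether the boundary value $0.8N$ is included and on whether an exact or an approximate tail is used, so only rough agreement should be expected.

```python
import math


def log_binom_pmf(n, k, p):
    """Natural log of the Bin(n, p) probability of exactly k successes."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def log10_tail(n, ks, p):
    """log10 of the total Bin(n, p) probability of the values in ks (log-sum-exp)."""
    logs = [log_binom_pmf(n, k, p) for k in ks]
    m = max(logs)
    return (m + math.log(sum(math.exp(v - m) for v in logs))) / math.log(10)


N = 10_000
x = 8_000                                  # critical level 0.8 N
p_qm = 0.5 + math.sqrt(2) / 4              # optimal quantum success probability

upper_lr = log10_tail(N, range(x, N + 1), 0.75)    # P(Bin(N, 0.75) >= x)
lower_qm = log10_tail(N, range(0, x + 1), p_qm)    # P(Bin(N, p_qm) <= x)

print(upper_lr, lower_qm)
```

Both numbers are hugely negative: a success count near $0.8N$ is essentially impossible under the local realist bound and essentially guaranteed to be exceeded under the quantum prediction.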

B. METAPHYSICS: WHAT SHOULD WE BELIEVE, NOW?
Several reviewers preferred novel physics and metaphysics in a paper on Bell's theorem, instead of well known mathematics. I have the same preference but as a mathematician, not a physicist or a philosopher, I am not qualified to deliver. My "position" on the metaphysical or philosophical issues has varied over the years and remains open. I see several reasonable positions to hold. I do think that since the loophole-free experiments of 2015, "local realism" is no longer tenable. Those experiments do need improvement. For instance, Hensen et al.'s (2015) Delft experiment [20] had $N = 245$, far too small. The P-value for testing the null hypothesis of local realism was 3%. The result was promising and the experimental set-up was brilliant and innovative (making use of a technique called entanglement swapping), and moreover a model of "good scientific practice".

RICHARD D. GILL was born in 1951 in the UK. B.A. degree in mathematics, Cambridge University, 1973; diploma of statistics, Cambridge University, 1974; PhD degree in mathematics, Free University Amsterdam, 1979. In his career he has been head of the statistics department, CWI Amsterdam; professor of mathematical statistics in Utrecht, later in Leiden; and is now emeritus professor in Leiden. His early work was in counting processes, survival analysis, martingale methods, and semiparametric models. Later he has worked in forensic statistics, quantum information, and on scientific integrity. His work on experimental loopholes in Bell-type experiments was incorporated in the famous "loophole-free" Bell experiments of 2015. He is a member of the Royal Dutch Academy of Science and a past president of the Netherlands Society for Statistics and Operations Research.