Thursday, November 24, 2011

How classical fields, particles emerge from quantum theory

As Gene and Sidney Coleman have pointed out, the term "interpretation of quantum mechanics" is a misnomer encouraging its users to generate logical fallacies. Why? It's because we should always use a theory, or a more accurate, complete, and universal theory, to interpret its special cases, to interpret its approximations, to interpret the limits, and to interpret the phenomena it explains.

However, there's no language "deeper than quantum mechanics" that could be used to interpret quantum mechanics. Unfortunately, what the "interpretation of quantum mechanics" ends up with is an attempt to find a hypothetical "deeper classical description" underneath the basic wheels and gears of quantum mechanics. But there's demonstrably none. Instead, what makes sense is an "interpretation of classical physics" in terms of quantum mechanics. And that's exactly what I am going to focus on in this text.

Plan of this blog entry

After a very short summary of the rules of quantum mechanics, I present the widely taught "mathematical limit" based on the smallness of Planck's constant. However, that doesn't really fully explain why the world seems classical to us. I will discuss two somewhat different situations which however cover almost every example of a classical logic emerging from the quantum starting point:

  1. Classical coherent fields (e.g. light waves) appearing as a state of many particles (photons)
  2. Decoherence which makes us interpret absorbed particles as point-like objects and which makes generic superpositions of macroscopic objects unfit for well-defined questions about classical facts
Fine, so let's start.




How quantum mechanics works

I will assume that the dear reader doesn't suffer from psychological problems that would prevent her (or usually him) from understanding that at the fundamental level, especially in the microscopic world, the basic conceptual logic of the laws of Nature qualitatively differs from classical physics. Everyone who is learning quantum mechanics must learn this new foundational basis from scratch.

Even though many previous insights where classical physics was OK may be confirmed by the quantum physics machinery, we need a brand new setup that works for the intrinsically quantum phenomena and the proofs of "common sense" facts in the everyday world are sometimes non-trivial, don't follow from any "basic assumptions" of the quantum theory, and are often true just approximately. Physicists had to make this switch 85 years ago and everyone who wants to follow their path and comprehend where physics has been standing and going since 1925 has to do the same thing.

So the people who are constantly distracted by some instinctive ideas that the world has to be realist or deterministic or classical or that an instrumentalist theory simply has to be wrong are politely asked to stop reading this article and return to their everyday activities involving their garden or lunch. They would probably get nothing out of this article and unfortunately, I cannot guarantee that they will ever understand modern physics. Thank you very much for your understanding and cooperation.

Now, when the obnoxious medieval bigots are gone, we may finally talk about some serious physics – how the phenomena around us actually work.

Internal wheels and gears behind quantum mechanics

Quantum mechanics is a framework to predict the probabilities of any phenomena in Nature – or in gedanken Universes that are qualitatively analogous to Nature or in very different parts of a hypothetical multiverse – out of the past data we have measured or otherwise perceived as facts. It satisfies a couple of universal postulates of quantum mechanics that we will return to momentarily.

However, every meaningful particular quantum theory must also choose some Hamiltonian or action or S-matrix or transition amplitudes, something that actually tells us how its Hilbert space evolves in time. We may talk about quantum models such as the description of a couple of qubits or spins; non-relativistic quantum mechanics with a particular potential \(V(x)\); the Dirac equation for a single particle which doesn't quite work because the energy is unbounded from below; quantum electrodynamics; quantum chromodynamics; the Standard Model; the Minimal Supersymmetric Standard Model and other phenomenological models beyond the Standard Model; or string theory. We may also include lots of quantum theories that are studied mostly for mathematical reasons but that nevertheless obey the same principles – such as Chern-Simons theory.

Quantum mechanics says that the state of the world may be described by pure states such as \(\ket\psi\) that belong to a complex linear vector space, the Hilbert space \({\mathcal H}\), which is equipped with a sesquilinear inner product (it's linear in the ket-vector and antilinear i.e. linear with complex conjugation in the bra-vector): you may multiply two vectors to get a complex number. All future predictions may be obtained from this pure state \(\ket\psi\) if we evolve it in time. All predictable questions in quantum mechanics may be reduced to this template:
What is the probability that a particular observable \(L\) will have a particular eigenvalue \(l_i\) at some time \(t\)?
Every observable \(L\) is represented by a Hermitian linear operator
\[\hat L:\,{\mathcal H}\to{\mathcal H}\] which has real eigenvalues. The eigenvalues \(l_i\) correspond to possible values of \(\hat L\) that may be observed. The probability that \(L\) will be observed as having the eigenvalue \(l_i\) is given by
\[ {\rm Prob}_i = |\langle\phi_i | \psi(t)\rangle|^2 = \bra{\psi(t)} P_i \ket{\psi(t)},\quad P_i\equiv \ket{\phi_i}\bra{\phi_i} \] where \(\ket{\psi(t)}\) is the state vector evolved to the right time; \(\bra{\phi_i}\) is the bra-vector (a row, i.e. the Hermitian conjugate of a column ket-vector \(\ket{\phi_i}\)) corresponding to an eigenstate of \(\hat L\) with eigenvalue \(l_i\), i.e. satisfying \(\hat L\ket{\phi_i} = l_i \ket{\phi_i}\), and normalized to one, \(\langle\phi_i|\phi_i\rangle=1\). If the subspace of the Hilbert space corresponding to the same eigenvalue \(l_i\) is multi-dimensional, we must sum these probabilities over a (any) whole orthonormal basis of the subspace corresponding to the given eigenvalue.
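If you want to see this rule as linear algebra, here is a minimal NumPy sketch; the observable and the state below are random toy data invented for illustration, not anything specific from the text:

import numpy as np

# A random 3x3 Hermitian "observable" and a normalized state vector.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
L = (A + A.conj().T) / 2                      # Hermitian => real eigenvalues
psi = rng.normal(size=3) + 1j * rng.normal(size=3)
psi /= np.linalg.norm(psi)

eigvals, eigvecs = np.linalg.eigh(L)          # columns are eigenstates |phi_i>
probs = np.abs(eigvecs.conj().T @ psi) ** 2   # Prob_i = |<phi_i|psi>|^2
print(eigvals)                                # the only values one may observe
print(probs, probs.sum())                     # probabilities, summing to one

The eigenvalues printed on the first line are the only possible outcomes; the probabilities on the second line sum to one, as they must.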

It's important that every question about the real world, including questions about properties of cats, physicists, and other people, must be formulated according to the formalism described above. If some property of Nature or an object in Nature isn't associated with a linear operator, then it's not an observable according to quantum mechanics. This means not only that it lacks a "quantum reincarnation of a physical quantity" but also that "from the viewpoint of physics, such a thing isn't physical" because it can't be observed in a single experiment.

If you were talking about any "observable feature of reality" or even a "beable" but it wouldn't be given by a linear operator, then you would be making an elementary mistake according to quantum mechanics. All usual quantities such as energy \(E=H\) (Hamiltonian), momentum \(\vec p\), angular momentum \(\vec J\), position of a particle \(\vec x\), the average electric field vector inside your room \(\vec E\) etc. are associated with a linear operator (or a vector of linear operators, or a whole "field" of linear operators) acting on the Hilbert space. We sometimes emphasize their operator character by a hat written above the letter.
There exists another special subset of operators \(\hat L\), namely operators corresponding to "Yes/No" questions. These are represented by Hermitian projection operators \(P\). Their being Hermitian and their being projection operators means that
\[ P^\dagger = P, \qquad P^2=P. \] It is easily seen that the second condition (saying that if you project twice, you get the same thing and waste your time) must also be obeyed by their eigenvalues. But \(p^2=p\) is satisfied by \(p=0\) and \(p=1\) only. So those are the eigenvalues. The eigenvalue \(0\) corresponds to "No" and the eigenvalue \(1\) corresponds to "Yes". Once again (and this is a special case of what I have already said): according to quantum mechanics, every physically meaningful "Yes/No" question about Nature must be associated with a Hermitian projection operator \(P\) acting on the Hilbert space. (There may be additional conditions imposed on \(P\) for the question to be really meaningful and for the answer to be interpreted as a "classical fact".)

What is the answer to this question \(P\)? The answer is that the probability of "Yes" is
\[{\rm Prob(Yes)} = \bra{\psi(t)} P \ket{\psi(t)}. \] Note that there is no second power here; for \(P=\ket{\phi}\bra{\phi}\), this expression already equals \(|\langle\phi|\psi(t)\rangle|^2\). That's it and once again, it's a special case of the story for a general operator. Nothing more accurate than probabilities may be predicted, not even in principle: the free will theorem and other things guarantee that not even God may know the actual "Yes/No" answer in advance. But of course, once the answer is "lived" in the world around us, the only possible answers are "Yes" and "No". There are no eigenvalues in between so it is mathematically guaranteed that no "vague" answer to a well-defined "Yes/No" question may occur in Nature. Only "Yes" and "No" may appear and the probabilities are given above.
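Again as a toy NumPy sketch with made-up vectors: the projector \(P=\ket\phi\bra\phi\) is Hermitian, squares to itself, has eigenvalues \(0\) and \(1\) only, and \(\bra\psi P\ket\psi\) directly gives the probability of "Yes":

import numpy as np

phi = np.array([1.0, 1.0j]) / np.sqrt(2)      # a normalized state |phi>
P = np.outer(phi, phi.conj())                 # P = |phi><phi|
assert np.allclose(P, P.conj().T)             # Hermitian
assert np.allclose(P @ P, P)                  # P^2 = P
print(np.round(np.linalg.eigvalsh(P), 9))     # eigenvalues: [0., 1.]

psi = np.array([0.6, 0.8])                    # a normalized state to measure
prob_yes = np.real(psi.conj() @ P @ psi)      # <psi|P|psi>, no extra square
print(prob_yes)                               # 0.5 for this toy choice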

As you may see, the axioms of quantum mechanics are very concise. The people thinking in the "binary way" could even say that all questions may be reduced to lots of "Yes/No" questions so my rule for the projection operators is, in some sense, sufficient for all predictions in science.

We may know the exact initial pure state \(\ket\psi\) if we, for example, measure a maximal set of mutually commuting observables. In practice, we don't know \(\ket\psi\) exactly and we must consider statistical mixtures. They're also described by Hermitian linear operators, ones whose trace equals one:
\[\rho:\,{\mathcal H}\to{\mathcal H},\qquad {\rm Tr}(\rho) = 1.\] The operator \(\rho\) is called the density matrix (mathematical expression) and it is physically identified with a mixed state (physical expression). In the case of a pure state, the corresponding \(\rho\) is
\[ \rho = \ket\psi \bra \psi.\] However, in a general case, the density matrix is a linear superposition of such terms,
\[ \rho = \sum_i p_i \ket{\psi_i} \bra {\psi_i}, \] summed over some states \(\ket{\psi_i}\). When the states are orthogonal, the non-negative coefficients \(p_i\), which sum to one, may be interpreted as probabilities. Density matrices generalize classical probability distributions on the phase space into the case of quantum mechanics.
When we calculate probabilities using the density matrix \(\rho\) rather than a pure state \(\ket\psi\), it simply means that we make a simple replacement
\[ \bra \psi (\dots) \ket \psi \to {\rm Tr}[\rho (\dots)]. \] Note that for \(\rho=\ket\psi \bra\psi\), these two things coincide due to the cyclic property of the trace.
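A toy check of this replacement rule, assuming nothing but a two-dimensional Hilbert space and a made-up observable:

import numpy as np

sz = np.diag([1.0, -1.0])                     # a toy observable (Pauli z)
psi = np.array([0.6, 0.8j])                   # pure state, 0.36 + 0.64 = 1
rho_pure = np.outer(psi, psi.conj())          # rho = |psi><psi|
print(np.trace(rho_pure @ sz).real)           # Tr[rho sz] = 0.36 - 0.64
print((psi.conj() @ sz @ psi).real)           # <psi|sz|psi>, the same number

# A genuine mixture: 50% |0>, 50% |1>. Unit trace, averaged expectation.
rho_mixed = 0.5 * np.outer([1, 0], [1, 0]) + 0.5 * np.outer([0, 1], [0, 1])
print(np.trace(rho_mixed).real, np.trace(rho_mixed @ sz).real)   # 1.0, 0.0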

That's it. The rules are concise, beautiful, and universal. They also reduce to classical physics whenever classical physics was successful, as I will discuss in the rest of the text. The only two reasons why people have a problem with the fundamental quantum setup above are that
  1. they aren't capable of mathematically calculating the predictions and verifying that they agree with the observations
  2. they have metaphysical or psychological reasons why they're unwilling to believe that this is how the world works, despite all the evidence
We have said "good-bye" to anti-quantum zealots at the beginning but because of quantum tunneling, there is a nonzero probability amplitude that one of them managed to get to this point, so I gave her (or, more likely him) a special gift to make sure that I haven't forgotten about her (or his) portion of the wave function.

Removing hats and neglecting Planck's constant

All the universal setup of quantum mechanics was explained above; the only obstacles that could prevent a rational person from understanding Nature are difficulties with the structure of the relevant Hilbert spaces, operators on them, and especially the Hamiltonian (or evolution operators or whatever we have that determines the evolution of the state vectors from one time to another).

In the previous section, I discussed how the questions are being reformulated to the mathematical formalism and how they're being answered according to quantum mechanics. But I haven't discussed how the wave function evolves in time. It's because everyone knows that. Whenever the moment-by-moment evolution exists (and it's problematic in theories based on general relativity), the wave function obeys Schrödinger's equation,
\[ i\hbar \frac{{\rm d}}{{\rm d}t}\ket{\psi(t)} = \hat H(t) \ket{\psi(t)}. \] I have indicated that the Hamiltonian \(\hat H\) is time-dependent; but in most cases we care about, it is not. In Nature, its time dependence always arises from some "effective incorporation" of a time-dependent environment but the evolution of the environment could be described by more fundamental laws with a time-independent \(\hat H\), anyway.
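To make the evolution concrete, here is a sketch that solves Schrödinger's equation exactly for a two-level system; the Hamiltonian is a made-up toy example and I set \(\hbar=1\):

import numpy as np

hbar = 1.0
H = np.array([[0.0, 1.0],
              [1.0, 0.0]])                    # toy Hamiltonian coupling two states
E, V = np.linalg.eigh(H)                      # H = V diag(E) V^dagger

def evolve(psi0, t):
    """|psi(t)> = exp(-i H t / hbar) |psi(0)>, built from energy eigenstates."""
    return V @ (np.exp(-1j * E * t / hbar) * (V.conj().T @ psi0))

psi0 = np.array([1.0, 0.0])
for t in [0.0, np.pi / 4, np.pi / 2]:
    p0 = abs(evolve(psi0, t)[0]) ** 2         # probability of staying in state 0
    print(round(t, 4), round(p0, 6))          # oscillates as cos^2(t)

The survival probability oscillates as \(\cos^2 t\), the usual Rabi-type behavior for this toy Hamiltonian.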

Also note that I have used the ordinary time-derivative of the wave function. It's because \(\ket\psi\) doesn't really depend on anything else than time. The state vector "contains in it" all the dependence on the positions of particles or whatever basic degrees of freedom you consider in your system. Of course, if you were writing a more concrete version of the general equation above, you could replace \(\ket{\psi(t)}\) by \(\psi(x,y,z;t)\) or something analogous. Because new variables besides time have appeared, we would need to use a partial derivative.

In this picture, the operators such as \(\hat x\) are time-independent; only the wave function varies. However, Schrödinger's equation may be replaced by the Heisenberg equation in which \(\ket\psi\) is constant but the operators vary. These two "pictures" may be obtained from one another by a time-dependent, rotating system of coordinates on the Hilbert space. When you translate them properly, the predictions for the probability amplitudes will be identical. There are also mixed or Dirac or interaction pictures. And one may also calculate the probability amplitudes by Feynman's path integral
\[{\mathcal A}_{i\to f} \sim \int {\mathcal D} \phi(x,y,z,t) \exp(i S[\phi(x,y,z,t)]/\hbar) \]and show why it's equivalent to Schrödinger's picture, too. All these dynamical formalisms are equivalent: they produce exactly the same results for all the things that may be experimentally verified, even in principle, i.e. for the probabilities that an observable will be observed to have a particular value.
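The equivalence of the Schrödinger and Heisenberg pictures is easy to verify numerically for a small system; a sketch with a made-up \(2\times 2\) Hamiltonian, again with \(\hbar=1\):

import numpy as np

H = np.array([[1.0, 0.5], [0.5, -1.0]])       # toy Hamiltonian
L = np.array([[0.0, 1.0], [1.0, 0.0]])        # some observable
E, V = np.linalg.eigh(H)
U = lambda t: V @ np.diag(np.exp(-1j * E * t)) @ V.conj().T   # exp(-iHt)

psi0 = np.array([1.0, 0.0])
t = 0.7
schrodinger = (U(t) @ psi0).conj() @ L @ (U(t) @ psi0)   # evolve the state
L_t = U(t).conj().T @ L @ U(t)                           # evolve the operator
heisenberg = psi0.conj() @ L_t @ psi0
print(np.allclose(schrodinger, heisenberg))              # True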

Quantum mechanical theories often have "straightforward classical limits". Even though the quantum mechanical theory is the fundamental one and everything else is derived from it, the history of science went in the opposite direction. We first knew the classical limit and then we "deduced" (well, one should really say "induced") the full quantum theory out of it. I want to stress that not every quantum theory may be "reconstructed" out of its classical limit in this way.

Fine. So let's look at the cases where it can. How do the classical and quantum theories differ? Schrödinger's picture is the least convenient one for these comparisons; the Heisenberg picture or Feynman's path-integral approach give a clearer understanding of the quantum-classical relationship. In Feynman's approach, one integrates over all conceivable classical trajectories. Each of them is given a weight whose absolute value is constant; the complex phase is given by the classical action. It may be shown that the trajectories that obey the classical equations of motion will be the most important ones (once we partially integrate the integral over a vicinity of a given trajectory). That's an easy way to see that the configurations obeying the classical equations will be important in the classical limit according to Feynman's approach.

Returning a few decades before Feynman, we may use the Heisenberg picture. The operators \(\hat L\) are evolving according to the Heisenberg equations
\[ i\hbar \frac{{\rm d}}{{\rm d}t} \hat L(t) = [\hat L(t), \hat H(t)] + i\hbar\frac{\partial \hat L}{\partial t}. \]Ignore the last "partial term" because this "explicit time-dependence" is usually zero for similar reasons to those we have already mentioned. Once again, the Hamiltonian is usually time-independent. A funny thing is that this commutator may be shown to reduce to the classical Poisson brackets:
\[ \lim_{\hbar\to 0} \frac{1}{i\hbar} [\hat F,\hat G] = \{F,G\}. \] You may learn how to formulate the good old equations of classical physics in terms of Poisson brackets – it's a purely mathematical trick changing nothing about the physics or its interpretation. And another mathematical operation is enough to prove the identity above. Up to the factor of \(i\hbar\), the quantum commutator
\[ [\hat F,\hat G]=\hat F\hat G-\hat G\hat F \] reduces to the Poisson bracket of the corresponding classical observables (which no longer have any hats because they're not operators anymore). And because classical physics may also be formulated in terms of Hamilton's equations involving \(\{H,L\}\), it follows that the dynamical equations of quantum mechanics, namely the Heisenberg equations, reduce to the right classical equations, the Hamiltonian equations, in the classical i.e. \(\hbar\to 0\) limit.

For the evolution to be nonzero, it's important that we divided by the tiny constant \(i\hbar\), to make the tiny commutator finite again. Indeed, all the commutators are tiny, e.g.
\[ [\hat x,\hat p] = i\hbar, \] which means that in the classical limit, finite ready-to-be-classical observables such as \(\hat x\) and \(\hat p\) reduce to commuting objects and there's no contradiction with the fact that they're "just numbers" in classical physics. This smallness of the commutator has many obvious consequences: for large objects, both \(\hat x\) and \(\hat p\) are "finite" in the SI units (meter-kilogram-second). That's why uncertainties of both \(\hat x,\hat p\) of order \(10^{-17}\) in the SI units are enough to obey the Heisenberg uncertainty principle. That's no limitation in practice.
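You may watch this commutator materialize in a truncated harmonic-oscillator basis. A sketch with made-up units \(\hbar=m=\omega=1\); the wrong entry in the last diagonal corner is purely an artifact of cutting the Hilbert space off at \(N\) levels:

import numpy as np

N, hbar, m, omega = 12, 1.0, 1.0, 1.0
a = np.diag(np.sqrt(np.arange(1, N)), k=1)    # annihilation operator
x = np.sqrt(hbar / (2 * m * omega)) * (a + a.T)
p = 1j * np.sqrt(m * omega * hbar / 2) * (a.T - a)
comm = x @ p - p @ x                          # should be i*hbar*identity
print(np.round(np.diag(comm).imag, 6))        # hbar everywhere but the corner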

Let me mention that I have talked about mechanics but the same thing holds for field theory, too. You may write e.g. quantum electrodynamics as "classical electrodynamics with hats". The electric and magnetic fields and potentials get tiny commutators proportional to \(i\hbar\), well, this is also multiplied by various \(\delta\)-functions of the positions where the operators are attached. Field theory may be imagined to be nothing else than mechanics with (continuously) infinitely many degrees of freedom: some coordinates and velocities are sitting at each point of space.

The \(\hbar\to 0\) limit works as well as it does in mechanics. In particular, that's one way to get a classical field theory as a limit of the corresponding quantum field theory. You may focus on configurations where the "wave functional", which depends on things like \(\vec A(x,y,z)\), is "concentrated" near some classical configuration \(\vec A(x,y,z)\). The width of the "wave packet" in the infinite-dimensional space may also become relatively small when \(\vec A(x,y,z)\) itself is large enough, so the uncertainty principle becomes inconsequential in the classical, \(\hbar\to 0\) limit.

These mathematical operations sending \(\hbar\to 0\) are pretty simple conceptually and mathematically and they're presented rather well by all quantum mechanics textbooks. However, the textbooks are not doing a good job in explaining why we "really see" the world as classical in the conventional macroscopic situations. Typical readers don't quite understand why the simple consequences of a vanishing \(\hbar\) are enough to reconstruct the classical perceptions. I deliberately talk about "perceptions" because quantum mechanics never changes its "conceptual structure": the explanations below just show that its predictions become indistinguishable from the predictions of classical physics, a theory using entirely different conceptual underpinnings.

In the rest of the text, I will discuss the emergence of classical objects and perceptions from quantum physics. The discussion will be divided into two basic sections: fields and particles.

The comments about the fields won't include any (significant) decoherence; the emergence of classical fields will depend on statistical properties of "large numbers" in the very same way that allows one to extract a smooth thermodynamic description of a thermal system out of the chaotic microscopic description rooted in statistical physics (which may be even classical statistical physics).

My comments about the emergence of well-localized particles will depend on decoherence. This discussion covers all the situations that the people who have been told good-bye love to babble about in terms of "collapses of the wave function" and similar nonsense. Decoherence is the right insight you must master to properly understand the situations where the misguided people talk about "collapses of the wave function"; the latter doesn't exist and decoherence isn't really making the system "collapse" into one particular answer, one particular eigenvalue. So let me begin.

Getting classical fields from many quantum particles

Electrodynamics will be our example; most of the results below generalize to other bosonic fields as well: gravitational fields, perhaps Z-boson, W-boson, gluon, and Higgs fields as well, among others. (And I could also talk about "emergent fields" in low-temperature physics that are relevant for the description of superconductivity; or about phonon fields in solid materials: these are additional analogous situations.)

However, Z-bosons, W-bosons, gluons, and Higgs fields operate in the high-energy regime in which a "purely classical limit" never becomes too useful. For gravitational fields, the opposite holds: the classical description is almost always the only useful one because, due to the weakness of gravity, it is hard to detect any quantum phenomena such as individual gravitons. The number of gravitons in the same state must always be huge for us to have a chance to detect a gravitational wave. So even though I said that the results will be more general, the electromagnetic field is really the most important real-world (I don't want to say that the world is objectively, classically real here!) realization of the ideas we're going to discuss.

In classical physics, the electromagnetic field is described by \(\vec E(x,y,z,t)\) and \(\vec B(x,y,z,t)\) which obey the classical Maxwell's equation. They may also be replaced by the potentials \(\phi(x,y,z,t)\) and \(\vec A(x,y,z,t)\), the electrostatic and vector potential. However, the latter are not uniquely determined for a given observable situation; a gauge transformation
\[ (\phi, \vec A) \to (\phi+\frac{\partial \lambda}{\partial t}, \vec A-\vec \nabla \lambda ) \] leads to identical observations in all conceivable experiments. Sorry if some signs are wrong or powers of \(c\) are missing. Imagine that I insert your favorite textbook on classical electromagnetism here.
In the quantum world, electromagnetism has to be – much like everything else – described in agreement with the postulates of quantum mechanics. So the quantities such as \(\phi,\vec A\) become operators on the Hilbert space. The state vector may be expressed as a "functional" of the variables such as \(\vec A(x,y,z)\) at each point of space: the dependence on \(\phi\) may be totally liquidated: that's one way to transform the gauge symmetry into cash. As I previously mentioned, you may imagine that the corresponding "wave packetal" (it's a wave packet among functionals, not functions, you surely get it) is concentrated near a classical configuration. Its width may be chosen very small.

Just like wave packets in mechanics of large bodies may easily have small \(\Delta x\) and \(\Delta p\), the same fact works for wave packetals \(\Psi[\vec A(x,y,z)] \). In the classical \(\hbar\to 0\) limit, you may extract classical electromagnetism i.e. Maxwell's equations from the corresponding quantum Heisenberg equations for the operators by simplifying the commutator and neglecting subleading terms in \(\hbar\).

In the real world, it won't be guaranteed that the packetal stays "narrow". It may spread to totally different places of the configuration space, much like Schrödinger's cat or the ordinary wave function of an electron. However, your multimeter always shows some particular values of the electric fields, currents, and so on. Why you can't ever see an ambiguous answer on the multimeter does depend on decoherence and will be discussed in the next section. (This decoherence only begins once the photons start to interact with the macroscopic objects such as multimeters.)

However, in the rest of this section, I want to focus on another way to see classical physics of fields emerge out of large ensembles of photons, one that mimics the thermodynamic limit of statistical physics (even in the context of classical mechanics).

An advantage of the electromagnetic field is that it is not self-interacting. If you shine two beams of light against each other, they will not collide. Because the classical Maxwell's equations are linear (much like Schrödinger's equation in quantum mechanics for which the linearity is mandatory, however), one may make a superposition of two vacuum solutions of the equations to get another solution. In the real world, we actually know that this is not quite right. Photons may violently collide with each other. The leading interaction comes from a "box" (square) Feynman diagram with an electron (lightest charged particle) running in the loop, and with four photon external lines attached. This gives a tiny but nonzero probability that two photons collide and change their energies and directions. It doesn't affect anything in your real life and doesn't cause any observable decoherence but the state-of-the-art technology was actually capable of (barely) detecting these QED-predicted photon-photon interactions (at SLAC).

Fine, so photons don't interact with each other. If you create one, it freely propagates through space before it hits a charged particle in the future (and another photon isn't a charged particle, so it can't count).

What are the states of the photons in which you typically produce many photons in a light bulb, to pick an example? A light bulb usually produces photons one-by-one and they're largely uncorrelated with each other. So the multi-photon state may be written as
\[ \ket\psi = \ket{\psi_1} \otimes \ket{\psi_2} \otimes \cdots \otimes \ket{\psi_N}\] where the number of photons \(N\) is very high but finite. You may imagine that each of the photons has its own wave function – that's the assumption of their independence (no entanglement i.e. no correlation) – and the overall state just factorizes into a tensor product. Well, I should still symmetrize the multi-photon wave function in the case when the states \(\ket{\psi_i}\) differ from each other.
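In code, such an unentangled multi-particle state is just a Kronecker (tensor) product of the one-particle states; a toy sketch with three made-up "photons" living in two-dimensional one-particle spaces, ignoring the symmetrization mentioned above:

import numpy as np
from functools import reduce

single = [np.array([1.0, 0.0]),
          np.array([1.0, 1.0]) / np.sqrt(2),
          np.array([0.6, 0.8])]               # three independent toy photons
psi = reduce(np.kron, single)                 # |psi_1> x |psi_2> x |psi_3>
print(psi.shape, np.linalg.norm(psi))         # (8,), norm 1: still normalized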

Imagine that they do differ and you want to calculate the average value of the electric field \(\vec E\) at some point in space, away from the light bulb. The electric field may be rewritten as some combination of creation and annihilation operators for photons in all conceivable states, with various coefficients. And by symmetry, or because the many photons contribute randomly, you get zero. So even though we are surely imagining – and we may measure – nonzero values of the electric field away from the light bulb at some moments and locations, the statistical expectation is zero and the fluctuations of the electric field are due to the randomness of the emission processes. (The multimeters would measure particular nonzero values of electric and magnetic fields due to decoherence, see the next section.)

However, you may also ask what is the probability that the energy density in a certain region near the light bulb will belong to a certain interval. In this case, there's no symmetry between positive (possible) and negative (impossible) energy density, so the mean value of the energy density calculated from the multiphoton state \(\ket\psi\) above is a particular positive number. The result will be written as a sum of many terms. By the central limit theorem, the distribution for the energy density will be normal (Gaussian) and its width will be very small relatively to the energy density itself.

So even though the light bulb's multiphoton state predicts a vanishing mean value for the electric fields \(\vec E\) – that's what we mean by saying that the light from a light bulb is incoherent – the energy density has a distribution suggesting a classical limit. The energy density is equal to a particular number which depends on the wattage of the light bulb, plus minus a small error.

This is nothing else than how we get smooth and non-fluctuating thermodynamic quantities, such as temperature and pressure of gas inside a bottle, from the chaotically fluctuating properties of the molecules of the gas, using the methods of statistical physics. If you sum many things that have the same or similar distribution, the sum will reflect the "average molecule" (or "average photon") and the relative error of the sum will be very small if the number of molecules (or photons) is large. Let me emphasize that this is the right analogy; the probabilities predicted from quantum mechanics should be interpreted exactly in the same way as probabilities predicted from classical statistical physics. The only difference is that quantum mechanics implies that the "exact truth" about the system doesn't exist even in principle. However, in practice, you don't care about it because you can't know the positions and momenta of the many gas molecules in a bottle, anyway. The predictive power is exactly the same; the quantum probabilities have exactly the same physical consequences as the probabilities arising from classical statistical physics where you could "imagine" an objectively real state behind everything.
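The statistical mechanism is easy to simulate. The sketch below is a purely classical-statistics toy, not a quantum calculation: it sums \(N\) independent positive "energy contributions" and shows the relative width of the total shrinking like \(1/\sqrt{N}\):

import numpy as np

rng = np.random.default_rng(1)
for N in [100, 10_000]:
    # 1000 independent realizations of a sum of N positive contributions
    samples = rng.exponential(1.0, size=(1000, N)).sum(axis=1)
    rel_width = samples.std() / samples.mean()
    print(N, round(float(rel_width), 4), round(1 / np.sqrt(N), 4))

The sampled relative width tracks \(1/\sqrt{N}\), which is why macroscopic sums look sharp and classical.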

But I want to discuss coherent light in some more detail.

Imagine a laser or an antenna emitting electromagnetic waves at a well-defined angular frequency \(\omega\). In this case, the electric and magnetic fields \(\vec E(x,y,z)\) and \(\vec B(x,y,z)\) are predictable and regular: we deal with nice, classical waves. How do they emerge from the wave function of the many photons?

First, you may still imagine that these classical values of the electromagnetic fields emerge from very narrow distributions (the wave functional that depends on the classical values of \(\vec A(x,y,z)\)) of the quantities you want to measure. But the wave functionals are only a convenient language to describe the state vector in the classical limit when the total energy is high i.e. it corresponds to very many photons.

When the energy is low and corresponds to just a few photons (or other particles), we want to choose a different basis of the Hilbert space for the electromagnetic field: we want to realize that the Hilbert space is isomorphic to the multi-photon Hilbert space of many particles; this fact may be deduced from the electromagnetic field's isomorphism to an infinite-dimensional harmonic oscillator. Its basis vectors include the vacuum state \(\ket 0\): note that \(\langle 0 | 0 \rangle=1\), not zero: it's something else than the trivial element of the vector space. And the other basis vectors of the Hilbert space are \(N\)-photon states described by the overall wave function for all these photons,
\[ \psi_N (x_1,x_2,\dots x_N). \] The photons also have polarizations so the wave function has many components, too. I don't want to scare you by the indices but the wave function of a single photon mathematically looks like the (complexified) classical electromagnetic potential \(\vec A(x,y,z)\), with some extra subtleties. (But its interpretation is different!)

To specify the exact state vector of the electromagnetic field, you have to tell me the complex amplitude \(\psi_0\) in front of the vacuum state as well as all the functions \(\psi_N\) above depending on all the positions (and polarizations) as indicated above. Yes, any linear superposition of this kind is allowed; that's required by the superposition principle of quantum mechanics (physically) or by the linearity of the Hilbert space of states (mathematically). There's no rule that the "number of photons" has to be sharply defined. And indeed, we will see that the states corresponding to the "classical field configurations" don't have a quite well-defined number of the photons; they're not eigenstates of \(N_{\rm photons}\). Also, the number of photons isn't really conserved (think about Feynman's and his dad's story about the bag with particular words that you may deplete if you say the word too many times): that's really why the number of photons isn't ever well-defined (especially because one produces infinitely many very low-frequency ones during any accelerated motion of a charged particle).

So far, I haven't mentioned one thing: electromagnetic waves at high frequencies \(\omega\) have a high energy of a single photon \(E=\hbar\omega\), so even a pretty decent energy you may donate to the electromagnetic field is only enough to produce one quantum (photon) or a few of them. Moreover, the wavelength of the wave function (or the classical electromagnetic wave) is very short for high-frequency photons and they may therefore form highly localized packets. That's why for high frequencies e.g. gamma rays, the description of the electromagnetic fields in terms of particles, photons, is more useful in practice. You may even talk about the "classical limit" of the electromagnetic field which is composed of particles, not fields. For low frequencies, the single photon's energy \(E=\hbar\omega\) is tiny and you need many photons for them to be seen at all (to carry some energy, i.e. the quantity needed to do work or any big impact, for that matter). That's why low-frequency electromagnetic fields are usefully described as classical fields in the classical limit: that corresponds to the simple removal of hats from the electric and magnetic field operators.

Fine. Let us look at coherent light now. Take an antenna, a laser, or a maser. It is no longer true that the photons will be independent and chaotic. In the case of the antenna, the photons are created "in the same state" because the frequency and angular distribution is predictable from the "classical" motion of the charges we impose upon the antenna. In the case of the laser, the photons will be in the same state because the emission of a photon into a state becomes \((N+1)\) times more likely if there are already \(N\) photons in the state. Well, \((N+1)\) may be split into \(N\), which corresponds to "stimulated emission", and \(1\), which corresponds to "spontaneous emission" of the photon. Lasers and masers are all about the term \(N\).

Let me mention that the coefficient \(N+1\) comes from squaring the matrix element of an ordinary quantum harmonic oscillator's raising operator,
\[ \abs{\bra{N+1} \hat a^\dagger \ket N}^2 = N+1. \] That's why the probabilities behave like that. That's also why the photons will be created in the same one-particle state \(\ket\psi\). As I have said, the mathematical content of this state is isomorphic to a (complexified) classical electromagnetic wave \(\vec A(x,y,z)\) but the interpretation is different.
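The \((N+1)\) enhancement takes a few lines to check with a truncated creation operator; the Fock states are just basis vectors here:

import numpy as np

dim = 10
adag = np.diag(np.sqrt(np.arange(1, dim)), k=-1)   # creation operator a^dagger
for N in range(5):
    ket_N = np.zeros(dim); ket_N[N] = 1.0          # |N>
    bra_Np1 = np.zeros(dim); bra_Np1[N + 1] = 1.0  # <N+1|
    amp = bra_Np1 @ adag @ ket_N                   # <N+1| a^dagger |N>
    print(N, round(abs(amp) ** 2, 6))              # prints N+1: 1, 2, 3, 4, 5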

The most accurate vector in the multi-photon Hilbert space that describes a particular classical configuration is actually the so-called coherent state (the name refers to its relevance for coherent sources of light such as lasers), schematically:
\[ \ket{\psi}_{\rm coherent} = \exp \left[\int {\rm d}^3x\,\vec A(x,y,z)\cdot \hat{\vec a}^\dagger (x,y,z) \right] \ket 0 \] You see that it's some exponential of an operator acting on the vacuum state. The operator in the exponent is a combination of creation operators for the photons. It may be difficult to sort out what this exponential does but you should start with the analogous construction for a harmonic oscillator. With renaming the "creation" operator as a "raising" operator if it makes your reasoning simpler, you will see that such an exponential inserted in front of the vacuum state (the minimal-uncertainty Gaussian) simply moves the Gaussian (or its equally Gaussian Fourier transform) in the direction of \(x\) or \(p\), in either representation, and/or modulates the Gaussian by a phase that linearly depends on the variable \(x\) or \(p\). The resulting state still saturates the Heisenberg uncertainty principle.

(You could also make the state narrower or wider in the \(x\)-representation and wider or narrower in the \(p\)-representation at the same moment and you would still saturate the uncertainty principle. Such states would be more general and called "squeezed states": they're needed when you discuss particle production including the Hawking radiation. Here we're talking about the "coherent states" only. They're given by the simplest exponential-of-linear formula above and they don't change the width of the Gaussians.)

How many photons does the coherent state contain? Well, to answer this question, expand the exponential via the Taylor expansion,
\[ \exp(x) = 1+x+\frac{x^2}{2!}+ \frac{x^3}{3!} + \dots \] Because \(x\) is proportional to the creation operators (some superposition of them), it always increases the number of photons by one. Because the Taylor expansion contains any non-negative integer power of \(x\), it means that the coherent state will contain components with an arbitrary number of photons, \(0,1,2,\dots\). However, you may ask which number of photons is most likely. Indeed, the probability distribution for different numbers of photons will be rather sharply peaked around some value \(N\) for which \(N\hbar\omega\), assuming that you only use photons of the same frequency \(\omega\), is equal to the classical energy contained in the configuration \(\vec A(x,y,z)\), i.e. in the corresponding classical electromagnetic field (plus minus a relative error that is small if \(N\) is large).
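For a single mode, the photon-number distribution of the coherent state is the Poisson distribution, \(P(N) = e^{-|\alpha|^2} |\alpha|^{2N}/N!\). A quick numerical sketch with a made-up \(\alpha\), showing the peak near \(|\alpha|^2\) and the relative width of order \(1/\sqrt{N}\):

import numpy as np
from math import factorial

alpha = 6.0                                   # toy choice; mean photon number 36
Ns = np.arange(0, 100)
P = np.array([np.exp(-alpha**2) * alpha**(2 * n) / factorial(n) for n in Ns])
mean = (Ns * P).sum()
std = np.sqrt(((Ns - mean)**2 * P).sum())
print(round(float(mean), 3), round(float(std), 3))   # ~36 and ~6 = sqrt(36)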

To get a clue, you may imagine that you only pick the term in the coherent state for which \(N\), the number of photons, has this most likely value. Then you reduce the multi-photon state to a state we have already considered: it is a tensor product of many single-photon states. In this case, however, the factors represent the same one-particle wave function. (So the symmetrization isn't needed.)

Once again, you may calculate the total energy density at some point (and its relative error) from the coherent state and you get a large number plus minus a relatively small error, for the statistical reasons analogous to thermodynamics. However, the coherent state has one property that the incoherent state doesn't: if you calculate the mean value of \(\vec E, \vec B\) at some point, you get a nonzero value, too. It's exactly the value determined from the vector potential \(\vec A(x,y,z)\) we have incorporated as the coefficients in the exponential defining the coherent state. So all the photons constructively interfere and add the same contribution to \(\vec E,\vec B\) at a given point. That's what we mean by saying that the laser beam is "coherent".
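This nonzero mean field reflects the defining property of a single-mode coherent state, \(\hat a\ket\alpha = \alpha\ket\alpha\). A sketch verifying \(\bra\alpha \hat a\ket\alpha = \alpha\) in a truncated Fock space, with a made-up \(\alpha\) small enough that the truncation is harmless:

import numpy as np
from math import factorial

dim, alpha = 40, 2.0 + 1.0j                   # toy coherent amplitude
ket = np.array([alpha**n / np.sqrt(factorial(n)) for n in range(dim)])
ket = ket * np.exp(-abs(alpha)**2 / 2)        # normalization e^{-|alpha|^2/2}
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)  # annihilation operator
print(np.round(ket.conj() @ a @ ket, 6))      # ~ alpha = 2+1j
print(np.round(np.linalg.norm(ket), 6))       # ~ 1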

This discussion should help you to understand how the classical electromagnetic fields are approximately represented by quantum state vectors of the quantized electromagnetic field. The wave functional of the electromagnetic field may be written as a "wave packetal" centered around a classical configuration of the fields. If the widths of this "wave packetal" are the same as they are in the vacuum state, and the Gaussian is just moved to a different place, the corresponding state may be expressed as the "coherent state" which is the exponential of a linear combination of creation operators, acting upon the vacuum state. (You should normalize the state to unity, too: I didn't do it explicitly to avoid irrelevant clutter.)

Such a coherent state has a completely undetermined number of photons: all values have a nonzero probability. However, the number's probability distribution still has a maximum and it becomes rather sharp when the most likely number of photons is high. When you use the multi-photon construction of the Hilbert space (you represent operators in terms of creation and annihilation operators for photons, rather than \(\vec E, \vec B\); the relationships between these operators are well-known and linear), you may see that the multi-photon coherent state gives rise to the right energy density (plus minus a small error), and because it's coherent in this case, it also gives rise to a nonzero mean value of the electromagnetic fields (plus minus a small error).

That's the summary of our discussion of the emergence of classical fields from a quantum field theory (or a multi-particle quantum theory that allows the number of particles to change; or its generalization such as string/M-theory).

Seeing sharp locations of quantum particles

Finally, I want to discuss the things that are discussed in terms of a "collapse of a wave function" by the readers who have hopefully been entirely good-byed by now (together with some of the open-minded readers, to whom I apologize for the relative complexity of this text: but it's probably too late because they're gone).

There's no collapse but there's decoherence which tells you which outcomes may be interpreted in terms of classical physics and become classical facts (once they happen) whose probabilities (but nothing else) may be predicted from quantum mechanics in advance.

Decoherence is a "loss of coherence". We have already discussed coherence: it's a situation in which a phase (otherwise pretty random) is fully synchronized between two or many terms or contributors so that they contribute in the same direction and, like in "constructive interference", produce a nonzero final result. OK, there's some coherence but coherence in what?

Well, coherence between relative phases of parts of the state vector. We want to see why the right questions to be asked are "what's the probability that a cat will be dead" or "that it will be alive" rather than "what's the probability that a cat will be in state \(\ket\psi\)" where
\[\ket\psi = 0.6 \ket{\rm alive} + 0.8i \ket{\rm dead}. \] Note that the latter state surely exists in the Hilbert space and at least in principle (and for small enough but growing objects, even in practice), it may be produced as the initial state. However, what I want to say is that the projection operator \(\hat P\) onto this state isn't a natural observable defining a "Yes/No" question we have discussed at the beginning. For the pure "dead" and "alive" states, the projection operator is a good observable for a question to be asked. Why is it so?

Well, imagine that you have a general state
\[ \ket\psi = a\ket{\rm alive} + d\ket{\rm dead} \] where \(a,d\in{\mathbb C}\). We will see that something is special about the basis vectors "alive" and "dead" which doesn't hold for their generic superpositions. However, this "special role" isn't God-given and doesn't exist a priori. It is given by the Hamiltonian or, more generally, by the dynamical laws of Nature. It does depend on the dynamics. There can't be any predetermined "beables" in Nature: we know that the preferred states always sensitively depend on the Hamiltonian.

Fine, what will the state above evolve to after a short amount of time?
\[ \ket\psi = a\ket{\rm alive} \otimes \ket{\rm alive\,env.} + d\ket{\rm dead} \otimes \ket{\rm dead\,env.} \] It differs from the initial state in one respect: both terms contain an extra tensor product with a state of the environment. By the environment, I mean any degrees of freedom whose detailed state quickly becomes unobservable or chaotic. Once again, which degrees of freedom are the environment isn't and can't be God-given. There's no a priori separation. Even in a particular situation, it's up to your assumptions and required precision. At least in principle, there's some dependence. This situation-dependent nature of the environment isn't a bug of quantum theory that needs to be fixed: it's a feature of Nature and an important insight we have made about Her.

You should imagine that the "alive environment" and "dead environment" are e.g. states of the electromagnetic field including photons whose exact directions and polarizations (and shape, because it's very complicated) depend on the exact state of the cat. The most important condition that is really responsible for the preferred status of the "dead" and "alive" basis is that
\[ \langle {\rm alive\,env.} | {\rm dead\,env.} \rangle =0. \] At least approximately, the states of the environment corresponding to the "waiting to become preferred" basis vectors are orthogonal to one another (or each other, in the typical case when the number of possibilities is much greater than two).

That's important. Now, we ultimately give up any chance to talk about the entanglement, correlations, and coherence between all the detailed properties of the infrared photons emitted to the environment, and so on. We know that it's unmanageable so we can't really ask useful questions about it. Instead, we can only ask questions about the degrees of freedom describing the cat itself (without the environment, without all the thermal photons it has emitted to the environment etc.). It may be shown that all probabilities involving properties of the cat's degrees of freedom only may be calculated from a "smaller density matrix" that only lives in the space of operators
\[ \rho_{\rm cat}: \,{\mathcal H}_{\rm cat} \to {\mathcal H}_{\rm cat} \] where
\[ {\mathcal H} = {\mathcal H}_{\rm cat} \otimes {\mathcal H}_{\rm environment}, \] at least approximately. The probability of all "Yes/No" questions \(P\) may be calculated from our old formula
\[ {\rm Prob}(P) = {\rm Tr} (\rho_{\rm cat} P) \] where only \(P\) acting on the cat's Hilbert space is allowed. The density matrix relevant for all such questions restricted to properties of the cat itself is
\[ \rho_{\rm cat} = {\rm Tr}_{\rm environment} \rho. \] We just partially trace over the index associated with the environment: you should imagine that because of the tensor product structure, each Hilbert-space index of the whole system is really a pair of a cat-related index and an environment-related index. The latter is being traced over but the former is not; that's why the partial trace remains an operator acting on the cat's Hilbert space.

Now, we may take the last displayed formula with the partial trace and apply it to \(\rho = \ket\psi\bra\psi\) where
\[ \ket\psi = a\ket{\rm alive} \otimes \ket{\rm alive\,env.} + d\ket{\rm dead} \otimes \ket{\rm dead\,env.} \] What we obtain is the following: because the "dead environment" and "alive environment" states are orthogonal to each other, the formula for the cat's density matrix will preserve the separation of the sum into the two terms. There won't be any mixed terms and we get
\[ \rho_{\rm cat} = |a|^2 \ket{\rm alive}\bra{\rm alive} + |d|^2 \ket{\rm dead}\bra{\rm dead}. \] This density matrix is diagonal which means that there are no \(\ket{\rm dead}\bra{\rm alive}\) terms or their Hermitian conjugates. The reason why it's diagonal boils down to the orthogonality of the states of the environment emitted by the cat's preferred basis vectors. They're orthogonal because macroscopic objects at different places emit photons at different places as well, if you wish, and the inner product of those photons' states is equal to zero. The exact or approximate locality of the laws of physics guarantees that states of particles in different regions are orthogonal, which is why the localized states of an absorbed particle become preferred relatively to their generic linear superpositions. The position \(x\) of a particle didn't become a "natural observable following the classical logic" because it was predetermined by God or by comrades Bohm and Bell as a "beable" (a pseudoscientific term!); it becomes a typical preferred "classical observable" after decoherence because the laws of physics are local in \(x\).
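The whole computation fits in a few lines for the smallest possible toy model: one "cat" qubit, one "environment" qubit, and perfectly orthogonal environment states (a drastic simplification, of course):

import numpy as np

a, d = 0.6, 0.8j                              # amplitudes, |a|^2 + |d|^2 = 1
alive, dead = np.array([1.0, 0.0]), np.array([0.0, 1.0])
env_alive, env_dead = np.array([1.0, 0.0]), np.array([0.0, 1.0])  # orthogonal

psi = a * np.kron(alive, env_alive) + d * np.kron(dead, env_dead)
rho = np.outer(psi, psi.conj()).reshape(2, 2, 2, 2)   # (cat, env, cat', env')
rho_cat = np.trace(rho, axis1=1, axis2=3)     # partial trace over environment
print(np.round(rho_cat, 3))                   # diag(0.36, 0.64), no off-diagonals

The off-diagonal \(\ket{\rm dead}\bra{\rm alive}\) entries vanish exactly because the two environment states are orthogonal; make them overlap and the off-diagonal terms survive in proportion to that overlap.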

Now, we're finished. The density matrix is diagonalized. Because the density matrix is the quantum counterpart of the "probability distribution" on the phase space, its eigenvalues correspond to probabilities and the corresponding eigenstates are the states whose probabilities we're considering. And we've seen that these eigenstates end up being the quantum states that are able to copy the information about themselves into the environment, in an orthogonal i.e. mutually exclusive fashion.

I have sketched a similar rule for conventional observables such as the energy: only its eigenvalues are allowed values we may measure. In some slightly generalized sense, it is true even for the "operator of the probabilities" called the density matrix. If you don't want to count the density matrix as another observable, and it's legitimate not to count it as one because of its proximity to the state vector which is surely not an observable, you should accept another independent axiom: mutually exclusive possibilities are those whose state vectors have a vanishing corresponding matrix element of the reduced density matrix.

(For years, I've thought that something special could happen when a piece of the density matrix is proportional to the identity matrix, which is diagonal in every basis; however, this situation doesn't really occur in practice because it finely depends on your separation of the environment. If you think that such situations would make it easier to observe and perceive the exotic superpositions, you will be disappointed right after another interaction with the environment.)

You would get contradictions with the laws expected from classical probability theory if you tried to interpret the diagonal entries of a non-diagonal density matrix as probabilities. In the same way, you avoid all possible contradictions with classical logic e.g. with
\[ P(A\,{\rm or}\, B) = P(A) + P(B) - P(A\,{\rm and}\,B) \](and you may prove it) if you only allow the eigenvalues of the density matrix to be interpreted as probabilities and their eigenstates as the acceptable outcomes an observer may observe. The interpretation based on "consistent histories" is playing with probabilities of whole histories – given by sequences of projection operators at different times – and studies necessary and sufficient conditions for the histories to be the bases of "legitimate questions" for which a quantum theory calculates "the right probabilities of the individual histories". The consistency of the set of histories is a sort of mutual "orthogonality condition" for each pair of histories in the set.

At any rate, the squared amplitudes \(|a|^2\) and \(|d|^2\) are interpreted as the probabilities \({\rm Tr}(\rho P_{\rm alive/dead})\) that you observe a dead or alive cat. More generally, we would consider many states that are macroscopically distinguishable from each other and that – as we know from experience as well as pure logic – may be interpreted as classical facts. Once you get classical facts, and quantum mechanics predicts their probabilities, you may deal with them and interpret them exactly as you did in classical physics. But the rules to calculate them are conceptually different from classical physics. They're totally new, sexy, revolutionary: they're quantum laws.

You may also use quantum mechanics to prove that different observers will have the same perceptions if their senses work, that there will be a correlation between the direction of a particle in a cloud chamber at two different moments (so it will produce a straight line), despite the superpositions in the initial state (i.e. despite the fact that you can't be certain what the direction of the line will be). You may prove all things that are actually seen and that you know intuitively except that in quantum mechanics, they don't come from the "classical dogmas" but they come from calculations that must always obey the template we have described: linear operators on the Hilbert space for every question, squared absolute values of complex amplitudes as the probability. There's absolutely no contradiction. Many things that classical physicists (and anti-quantum zealots) considered (or still consider) to be God-given dogmas are emergent properties that require a non-trivial calculation and that sometimes depend on approximations (which are amazingly good for macroscopic objects).

As Sidney Coleman was explaining in his "in your face" lecture, one may also use the formulae of quantum mechanics to prove that outcomes of repeated experiments will be independent and truly random, following the predictable distribution, and there won't be any "pattern" in the outcomes.

All these things work: quantum mechanics passes all the tests, including those where classical physics used to work before the early 20th century. That's why the theory is alive. On the other hand, classical physics in all forms has been falsified. That's how science proceeds: it may only falsify a theory if its predictions disagree with the genuinely observed facts; prejudices are irrelevant. No such falsification or failure exists for the general framework of quantum mechanics so the theory is still alive and chances are growing that no modification of its basic postulates may ever be consistent.

And that was a modest verbal shadow of the memo.


snail feedback (2):


reader Nick Libreman said...

There are many strongly worded assertions here which, albeit currently accepted as true by the mainstream scientific community, the reader is hopefully not expected to take as dogma and is free to question.

For instance I would examine this claim:

"However, there's no language "deeper than quantum mechanics" that could be used to interpret quantum mechanics."

This is stated matter-of-factly and I can't blame the author because there truly is no currently known/accepted deeper "language", or is there?

Science is based on the idea of falsification, so let's take a shot at it. What would suffice to falsify, or at least put in question, the above statement? If we'd find a completely classical approach (meaning not using Planck's constant or other quantum-mechanical ad-hoc constructs) to calculate energy levels and orbital radii of *any* atom (even a muonic one), transition probabilities, and the energy of an emitted photon ... would that satisfy this condition?

All of these are considered impossible to calculate by classical means, without quantum-mechanical constructs, according to state-of-the-art mainstream physics.

So, what if we could calculate all this using just classical physics? What if I told you that we can! You would probably find that hard to believe as you absolutely should as a properly skeptical scientist.

But skepticism must end where the equations start - dismissing the idea that it can be done without examining the calculations and equations is not science; it is not skepticism; it is unfortunately just ignorance, as the famous quote says:

"Condemnation without investigation is the height of ignorance." —Albert Einstein

That's why I present the equations and all the aspects I mentioned to you for examination. I welcome and encourage all the skepticism and criticism that is necessary to make sure it is indeed correct ... and I hope I'm not met with dismissal without examination. Math does not lie; if you can calculate Planck's constant from first principles, it is not a matter to be regarded lightly - it can indeed turn out to be revolutionary because, according to the current state of modern science, this is impossible.

Download PDF

regards,
Libreman


reader Andy Everett said...

I wish I had found this earlier, thank you!