Lorentz transformation

In physics, the Lorentz transformation (or transformations) are coordinate transformations between two coordinate frames that move at constant velocity relative to each other.

Frames of reference can be divided into two groups, inertial (relative motion with constant velocity) and non-inertial (accelerating in curved paths, rotational motion with constant angular velocity, etc.). The term "Lorentz transformations" only refers to transformations between inertial frames, usually in the context of special relativity.

In each reference frame, an observer can use a local coordinate system (almost exclusively Cartesian coordinates in this context) to measure lengths, and a clock to measure time intervals. An observer is a real or imaginary entity that can take measurements, such as a human or any other living organism, or even a robot or computer. An event is something that happens at a point in space at an instant of time, or more formally a point in spacetime. The transformations connect the space and time coordinates of an event as measured by an observer in each frame.[nb 1]

They supersede the Galilean transformation of Newtonian physics, which assumes an absolute space and time (see Galilean relativity). The Galilean transformation is a good approximation only at relative speeds much smaller than the speed of light. Lorentz transformations have a number of unintuitive features that do not appear in Galilean transformations. For example, they reflect the fact that observers moving at different velocities may measure different distances, elapsed times, and even different orderings of events, but always such that the speed of light is the same in all inertial reference frames. The invariance of light speed is one of the postulates of special relativity.

Historically, the transformations were the result of attempts by Lorentz and others to explain how the speed of light was observed to be independent of the reference frame, and to understand the symmetries of the laws of electromagnetism. The Lorentz transformation is in accordance with special relativity, but was derived before special relativity. The transformations are named after the Dutch physicist Hendrik Lorentz.

The Lorentz transformation is a linear transformation. It may include a rotation of space; a rotation-free Lorentz transformation is called a Lorentz boost. In Minkowski space, the mathematical model of spacetime in special relativity, the Lorentz transformations preserve the spacetime interval between any two events. This property is the defining property of a Lorentz transformation. They describe only the transformations in which the spacetime event at the origin is left fixed. They can be considered as a hyperbolic rotation of Minkowski space. The more general set of transformations that also includes translations is known as the Poincaré group.

History

Many physicists, including Woldemar Voigt, George FitzGerald, Joseph Larmor, and Hendrik Lorentz[1] himself, had been discussing the physics implied by these equations since 1887.[2] Early in 1889, Oliver Heaviside had shown from Maxwell's equations that the electric field surrounding a spherical distribution of charge should cease to have spherical symmetry once the charge is in motion relative to the ether. FitzGerald then conjectured that Heaviside's distortion result might be applied to a theory of intermolecular forces. Some months later, FitzGerald published the conjecture that bodies in motion are contracted, in order to explain the baffling outcome of the 1887 ether-wind experiment of Michelson and Morley. In 1892, Lorentz independently presented the same idea in a more detailed manner, which was subsequently called the FitzGerald–Lorentz contraction hypothesis.[3] Their explanation was widely known before 1905.[4]

Lorentz (1892–1904) and Larmor (1897–1900), who believed in the luminiferous ether hypothesis, also looked for the transformation under which Maxwell's equations are invariant when transformed from the ether to a moving frame. They extended the FitzGerald–Lorentz contraction hypothesis and found that the time coordinate has to be modified as well ("local time"). Henri Poincaré gave a physical interpretation to local time (to first order in v/c) as the consequence of clock synchronization, under the assumption that the speed of light is constant in moving frames.[5] Larmor is credited with having been the first to understand the crucial time dilation property inherent in his equations.[6]

In 1905, Poincaré was the first to recognize that the transformation has the properties of a mathematical group, and named it after Lorentz.[7] Later in the same year Albert Einstein published what is now called special relativity, by deriving the Lorentz transformation under the assumptions of the principle of relativity and the constancy of the speed of light in any inertial reference frame, and by abandoning the mechanical aether.[8]

Derivation

An event is something that happens at a certain point in spacetime, or more generally, the point in spacetime itself. In any inertial frame an event is specified by a time coordinate t and a set of Cartesian coordinates x, y, z to specify position in space in that frame. Subscripts label individual events.

From Einstein's second postulate of relativity it follows immediately that

c^2(t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2 = 0

in all inertial frames for events connected by light signals. The quantity on the left is called the spacetime interval between events (t1, x1, y1, z1) and (t2, x2, y2, z2). The interval between any two events, not necessarily separated by light signals, is in fact invariant, i.e., independent of the state of relative motion of observers in different inertial frames. This can be shown using homogeneity and isotropy of space (several derivations more explicit than the one given here exist). The transformation sought must therefore possess the property that

c^2(t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2 = c^2(t_2' - t_1')^2 - (x_2' - x_1')^2 - (y_2' - y_1')^2 - (z_2' - z_1')^2,

where t, x, y, z are the spacetime coordinates used to define events in one frame, and t′, x′, y′, z′ are the coordinates in another frame. Now one observes that a linear solution to the simpler problem

c^2t^2 - x^2 - y^2 - z^2 = c^2t'^2 - x'^2 - y'^2 - z'^2

solves the general problem too. Finding the solution to the simpler problem is just a matter of look-up in the theory of classical groups that preserve bilinear forms of various signature.[nb 2] The Lorentz transformation is thus an element of the group O(3, 1) or, for those that prefer the other metric signature, O(1, 3).[nb 3]

Generalities

The relations between the primed and unprimed spacetime coordinates are the Lorentz transformations. Each coordinate in one frame is a linear function of all the coordinates in the other frame, and the inverse functions are the inverse transformation. Depending on how the frames move relative to each other, and how they are oriented in space relative to each other, other parameters that describe direction, speed, and orientation enter the transformation equations.

Transformations describing relative motion with constant (uniform) velocity and without rotation of the space coordinate axes are called boosts, and the relative velocity between the frames is the parameter of the transformation. The other basic type of Lorentz transformation is a rotation in the spatial coordinates only. Frames related in this way are also inertial, since there is no relative motion: the frames are simply tilted (and not continuously rotating). In this case the quantities defining the rotation are the parameters of the transformation (e.g., axis–angle representation, or Euler angles, etc.). A combination of a rotation and a boost is a homogeneous transformation, which transforms the origin back to the origin.

The full Lorentz group O(3, 1) also contains special transformations that are neither rotations nor boosts, but rather reflections in a plane through the origin. Two of these can be singled out: spatial inversion, in which the spatial coordinates of all events are reversed in sign, and temporal inversion, in which the time coordinate of each event has its sign reversed.

Boosts should not be conflated with mere displacements in spacetime; in the latter case, the coordinate systems are simply shifted and there is no relative motion. However, these also count as symmetries forced by special relativity since they leave the spacetime interval invariant. A combination of a rotation with a boost, followed by a shift in spacetime, is an inhomogeneous Lorentz transformation, an element of the Poincaré group, which is also called the inhomogeneous Lorentz group.

Physical formulation of Lorentz boosts

Coordinate transformation

The spacetime coordinates of an event, as measured by each observer in their inertial reference frame (in standard configuration) are shown in the speech bubbles.
Top: frame F′ moves at velocity v along the x-axis of frame F.
Bottom: frame F moves at velocity −v along the x′-axis of frame F′.[9]
This diagram actually shows the inverse configuration of F′ "stationary" while F is boosted away along the negative x′ direction, although it correctly gives the original transformation since the coordinates ct, x of F are projected onto the coordinates ct′, x′ of F′. The event (ct, x) = (8, 6) in F corresponds to approximately (ct′, x′) ≈ (5.55, 1.67) in F′, with rapidity ζ ≈ −0.66. Notice the difference in length and time scales, such that the speed of light is invariant.
This diagram shows the original configuration of F "stationary" while F′ is boosted away along the positive x direction, although it correctly gives the inverse transformation since the coordinates ct′, x′ of F′ are projected onto the coordinates ct, x of F. The event (ct′, x′) = (8, 6) in F′ corresponds to approximately (ct, x) ≈ (14.3, 13.28) in F, with rapidity ζ ≈ +0.66. Again, the difference in length and time scales is such that the speed of light is invariant.

A "stationary" observer in frame F defines events with coordinates t, x, y, z. Another frame F′ moves with velocity v relative to F, and an observer in this "moving" frame F′ defines events using the coordinates t′, x′, y′, z′.

The coordinate axes in each frame are parallel (the x and x′ axes are parallel, the y and y′ axes are parallel, and the z and z′ axes are parallel), remain mutually perpendicular, and relative motion is along the coincident xx′ axes. At t = t′ = 0, the origins of both coordinate systems are the same, (x, y, z) = (x′, y′, z′) = (0, 0, 0). In other words, the times and positions are coincident at this event. If all these hold, then the coordinate systems are said to be in standard configuration, or synchronized.

If an observer in F records an event t, x, y, z, then an observer in F′ records the same event with coordinates[10]

Lorentz boost (x direction)
\begin{align}
t' &= \gamma \left( t - \frac{v x}{c^2} \right)  \\ 
x' &= \gamma \left( x - v t \right)\\
y' &= y \\ 
z' &= z
\end{align}

where v is the relative velocity between frames in the x-direction, c is the speed of light, and

 \gamma = \frac{1}{ \sqrt{1 - \frac{v^2}{c^2}}}

(lowercase gamma) is the Lorentz factor.

Here, v is the parameter of the transformation; for a given boost it is a constant number, but it can take a continuous range of values. In the setup used here, positive relative velocity v > 0 is motion along the positive directions of the xx′ axes, zero relative velocity v = 0 is no relative motion, while negative relative velocity v < 0 is relative motion along the negative directions of the xx′ axes. The magnitude of relative velocity v cannot equal or exceed c, so only subluminal speeds −c < v < c are allowed. The corresponding range of γ is 1 ≤ γ < ∞.

The transformations are not defined if v is outside these limits. At the speed of light (v = c) γ is infinite, and faster than light (v > c) γ is a complex number, each of which makes the transformations unphysical. The space and time coordinates are measurable quantities and numerically must be real numbers, not complex.
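The boost along x and the restriction on v can be sketched numerically. The following is a minimal illustration, not a reference implementation: the function names (lorentz_factor, boost_x, interval) are chosen here for illustration, and the sample event uses units in which c = 1.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v, c=C):
    """Return gamma for a relative speed v, requiring -c < v < c."""
    if abs(v) >= c:
        raise ValueError("the boost is only defined for -c < v < c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def boost_x(t, x, y, z, v, c=C):
    """Coordinates (t', x', y', z') in F' of the event (t, x, y, z) in F."""
    g = lorentz_factor(v, c)
    return g * (t - v * x / c**2), g * (x - v * t), y, z


def interval(t, x, y, z, c=C):
    """Spacetime interval c^2 t^2 - x^2 - y^2 - z^2 measured from the origin."""
    return (c * t) ** 2 - x**2 - y**2 - z**2


# Sample event in units where c = 1 (an assumption made for readability).
event = (8.0, 6.0, 1.0, 2.0)
boosted = boost_x(*event, v=0.6, c=1.0)
print(boosted)
print(interval(*event, c=1.0), interval(*boosted, c=1.0))
```

The two printed intervals agree up to rounding, which is the defining property of the transformation; passing a speed with magnitude at or above c raises an error rather than returning an infinite or complex result.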

As an active transformation, an observer in F′ notices the coordinates of the event to be "boosted" in the negative directions of the xx′ axes, because of the −v in the transformations. This has the equivalent effect of the coordinate system F′ being boosted in the positive directions of the xx′ axes, while the event does not change and is simply represented in another coordinate system, a passive transformation.

The inverse relations (t, x, y, z in terms of t′, x′, y′, z′) can be found by algebraically solving the original set of equations. A more efficient way is to use physical principles. Here F′ is the "stationary" frame while F is the "moving" frame. According to the principle of relativity, there is no privileged frame of reference, so the transformations from F′ to F must take exactly the same form as the transformations from F to F′. The only difference is that F moves with velocity −v relative to F′ (i.e., the relative velocity has the same magnitude but is oppositely directed). Thus if an observer in F′ notes an event t′, x′, y′, z′, then an observer in F notes the same event with coordinates

Inverse Lorentz boost (x direction)
\begin{align}
t &= \gamma \left( t' + \frac{v x'}{c^2} \right)  \\ 
x &= \gamma \left( x' + v t' \right)\\
y &= y' \\ 
z &= z',
\end{align}

and the value of γ remains unchanged. This "trick" of simply reversing the direction of relative velocity while preserving its magnitude, and exchanging primed and unprimed variables, always applies to finding the inverse transformation of every boost in any direction.
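The "trick" can be checked with a short round trip. This is a sketch under the assumption of units with c = 1 and motion along x only; the helper names are illustrative.

```python
import math


def boost_x(t, x, v, c=1.0):
    """Forward boost along x: (t, x) in F to (t', x') in F'."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (t - v * x / c**2), g * (x - v * t)


def inverse_boost_x(t_p, x_p, v, c=1.0):
    """Inverse boost: the same formulas with primed inputs and v replaced by -v."""
    return boost_x(t_p, x_p, -v, c)


t, x = 8.0, 6.0
t_p, x_p = boost_x(t, x, v=0.6)
t_back, x_back = inverse_boost_x(t_p, x_p, v=0.6)
print((t_p, x_p), (t_back, x_back))
```

Reusing the forward function with the sign of v reversed, rather than coding the inverse separately, mirrors the argument from the principle of relativity.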

Sometimes it is more convenient to use β = v/c (lowercase beta) instead of v, so that

\begin{align}
ct' &= \gamma \left( ct - \beta x \right) \,,  \\ 
x' &= \gamma \left( x - \beta ct \right) \,, \\
\end{align}

which shows the symmetry in the transformation more clearly. From the allowed ranges of v and the definition of β, it follows that −1 < β < 1. The use of β and γ is standard throughout the literature.

The Lorentz transformations can also be derived in a way that resembles circular rotations in 3d space using the hyperbolic functions. For the boost in the x direction, the results are

Lorentz boost (x direction with rapidity ζ)
\begin{align}
ct' &=  ct \cosh\zeta - x \sinh\zeta \\ 
x' &= x \cosh\zeta - ct \sinh\zeta \\
y' &= y \\ 
z' &= z
\end{align}

where ζ (lowercase zeta) is a parameter called rapidity (many other symbols are used, including θ, ϕ, φ, η, ψ, ξ). Given the strong resemblance to rotations of spatial coordinates in 3d space in the Cartesian xy, yz, and zx planes, a Lorentz boost can be thought of as a hyperbolic rotation of spacetime coordinates in the xt, yt, and zt Cartesian-time planes of 4d Minkowski space. The parameter ζ is the hyperbolic angle of rotation, analogous to the ordinary angle for circular rotations. This transformation can be illustrated with a Minkowski diagram.

The hyperbolic functions arise from the difference between the squares of the time and spatial coordinates in the spacetime interval, rather than a sum. The geometric significance of the hyperbolic functions can be visualized by taking x = 0 or ct = 0 in the transformations. Squaring and subtracting the results, one can derive hyperbolic curves of constant coordinate values but varying ζ, which parametrizes the curves according to the identity

 \cosh^2\zeta - \sinh^2\zeta = 1 \,.

Conversely the ct and x axes can be constructed for varying coordinates but constant ζ. The definition

 \tanh\zeta = \frac{\sinh\zeta}{\cosh\zeta} \,,

provides the link between a constant value of rapidity and the slope of the ct axis in spacetime. A consequence of these two hyperbolic formulae is an identity that matches the Lorentz factor

 \cosh\zeta = \frac{1}{\sqrt{1 - \tanh^2\zeta}} \,.

Comparing the Lorentz transformations in terms of the relative velocity and rapidity, or using the above formulae, the connections between β, γ, and ζ are

 \beta = \tanh\zeta  \,,
 \gamma = \cosh\zeta  \,,
 \beta \gamma = \sinh\zeta  \,.

Taking the inverse hyperbolic tangent gives the rapidity

 \zeta = \tanh^{-1}\beta  \,.

Since −1 < β < 1, it follows that −∞ < ζ < ∞. From the relation between ζ and β, positive rapidity ζ > 0 is motion along the positive directions of the xx′ axes, zero rapidity ζ = 0 is no relative motion, while negative rapidity ζ < 0 is relative motion along the negative directions of the xx′ axes.
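The relations between β, γ, and ζ can be verified numerically. The sketch below assumes units in which c = 1 and uses the illustrative function name boost_x_rapidity; the value β = 0.6 is an arbitrary sample.

```python
import math


def boost_x_rapidity(ct, x, zeta):
    """Boost along x written as a hyperbolic rotation with rapidity zeta."""
    return (ct * math.cosh(zeta) - x * math.sinh(zeta),
            x * math.cosh(zeta) - ct * math.sinh(zeta))


beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta**2)
zeta = math.atanh(beta)  # rapidity from the inverse hyperbolic tangent

print(zeta, math.cosh(zeta), math.sinh(zeta))
print(boost_x_rapidity(8.0, 6.0, zeta))
```

For this sample, cosh ζ reproduces γ and sinh ζ reproduces βγ, so the rapidity form gives the same primed coordinates as the form written with v.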

The inverse transformations are obtained by exchanging primed and unprimed quantities to switch the coordinate frames, and negating rapidity ζ → −ζ since this is equivalent to negating the relative velocity. Therefore,

Inverse Lorentz boost (x direction with rapidity ζ)
\begin{align}
ct & = ct' \cosh\zeta + x' \sinh\zeta \\ 
x &= x' \cosh\zeta + ct' \sinh\zeta \\
y &= y' \\ 
z &= z'
\end{align}

The inverse transformations can be similarly visualized by considering the cases when x′ = 0 and ct′ = 0.

So far the Lorentz transformations have been applied to one event. If there are two events, there is a spatial separation and time interval between them. It follows from the linearity of the Lorentz transformations that two values of space and time coordinates can be chosen, the Lorentz transformations can be applied to each, then subtracted to get the Lorentz transformations of the differences;

\Delta t' = \gamma \left( \Delta t - \frac{v \Delta x}{c^2} \right) \,,
\Delta x' = \gamma \left( \Delta x - v \Delta t \right) \,,

with inverse relations

\Delta t = \gamma \left( \Delta t' + \frac{v \Delta x'}{c^2} \right) \,,
\Delta x = \gamma \left( \Delta x' + v \Delta t' \right) \,.

where Δ (capital Delta) indicates a difference of quantities, e.g., Δx = x2 − x1 for two values of x coordinates, and so on.
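As an illustration of the difference form, consider two events that are simultaneous in F but separated along x. The sketch below assumes c = 1 and sample values chosen for round numbers; boost_differences is an illustrative name.

```python
import math


def boost_differences(dt, dx, v, c=1.0):
    """Transform a time interval dt and a separation dx from F to F'."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (dt - v * dx / c**2), g * (dx - v * dt)


# Two events simultaneous in F (dt = 0) and 4 units apart along x.
dt_p, dx_p = boost_differences(0.0, 4.0, v=0.6)
print(dt_p, dx_p)

# The interval c^2 dt^2 - dx^2 is the same in both frames.
print(0.0**2 - 4.0**2, dt_p**2 - dx_p**2)
```

The time difference in F′ is nonzero even though it vanishes in F, while the interval built from the differences is unchanged.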

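These transformations on differences rather than spatial points or instants of time are useful for a number of reasons. For instance, calculations and experiments usually concern lengths and time intervals rather than single coordinates, and the transformations of velocity follow by making the differences infinitesimally small and dividing the equations.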

Physical implications

A critical requirement of the Lorentz transformations is the invariance of the speed of light, a fact used in their derivation, and contained in the transformations themselves. If in F the equation for a pulse of light along the x direction is x = ct, then in F′ the Lorentz transformations give x′ = ct′, and vice versa, for any −c < v < c.

For relative speeds much less than the speed of light, the Lorentz transformations reduce to the Galilean transformation

 t'\approx t
 x'\approx x - vt

in accordance with the correspondence principle. It is sometimes said that nonrelativistic physics is a physics of "instantaneous action at a distance".[11]
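The low-speed limit can be illustrated by comparing the two transformations directly. This is a sketch with arbitrary sample values (an event 2 s and 1000 m from the origin, a frame moving at 30 m/s); the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz(t, x, v, c=C):
    """Lorentz boost along x."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (t - v * x / c**2), g * (x - v * t)


def galilean(t, x, v):
    """Galilean transformation along x."""
    return t, x - v * t


t, x = 2.0, 1000.0

# Everyday speed: the two results agree to far better than a nanosecond
# and a nanometre.
slow_l, slow_g = lorentz(t, x, 30.0), galilean(t, x, 30.0)
print(slow_l, slow_g)

# Relativistic speed: the results differ substantially.
fast_l, fast_g = lorentz(t, x, 0.6 * C), galilean(t, x, 0.6 * C)
print(fast_l, fast_g)
```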

Three unintuitive, but correct, predictions of the transformations are the relativity of simultaneity, time dilation, and length contraction.

Vector transformations

An observer in frame F observes F′ to move with velocity v, while F′ observes F to move with velocity −v. The coordinate axes of each frame are still parallel and orthogonal. The position vector as measured in each frame is split into components parallel and perpendicular to the relative velocity vector v. Left: Standard configuration. Right: Inverse configuration.

The use of vectors allows positions and velocities to be expressed in arbitrary directions compactly. A single boost in any direction depends on the full relative velocity vector v with a magnitude |v| = v that cannot equal or exceed c, so that 0 ≤ v < c.

Only time and the coordinates parallel to the direction of relative motion change, while those coordinates perpendicular do not. With this in mind, split the spatial position vector r as measured in F, and r′ as measured in F′, each into components perpendicular (⊥) and parallel (‖) to v,

\mathbf{r}=\mathbf{r}_\perp+\mathbf{r}_\|\,,\quad \mathbf{r}' = \mathbf{r}_\perp' + \mathbf{r}_\|' \,,

then the transformations are

t'  = \gamma \left(t - \frac{\mathbf{r}_\parallel \cdot \mathbf{v}}{c^{2}} \right)
\mathbf{r}_\|' = \gamma (\mathbf{r}_\| - \mathbf{v} t)
\mathbf{r}_\perp' = \mathbf{r}_\perp

where · is the dot product. The Lorentz factor γ retains its definition for a boost in any direction, since it depends only on the magnitude of the relative velocity. The definition β = v/c with magnitude 0 ≤ β < 1 is also used by some authors.

Introducing a unit vector n = v/|v| = β/|β| in the direction of relative motion, the relative velocity is the vector v = vn with magnitude v = |v| and direction n, and vector projection and rejection give respectively

\mathbf{r}_\parallel = (\mathbf{r}\cdot\mathbf{n})\mathbf{n}\,,\quad \mathbf{r}_\perp = \mathbf{r} - (\mathbf{r}\cdot\mathbf{n})\mathbf{n}

Accumulating the results gives the full transformations,

Lorentz boost (in direction n with magnitude v)
t'  = \gamma \left(t - \frac{v\mathbf{n}\cdot \mathbf{r}}{c^2} \right) \,,
 \mathbf{r}' = \mathbf{r} + (\gamma-1)(\mathbf{r}\cdot\mathbf{n})\mathbf{n} - \gamma t v\mathbf{n} \,.

The projection and rejection also apply to r′. For the inverse transformations, exchange r and r′ to switch observed coordinates, and negate the relative velocity v → −v (or simply the unit vector n → −n since the magnitude v is always positive) to obtain

Inverse Lorentz boost (in direction n with magnitude v)
t = \gamma \left(t' + \frac{\mathbf{r}' \cdot v\mathbf{n}}{c^{2}} \right) \,,
\mathbf{r} = \mathbf{r}' + (\gamma-1)(\mathbf{r}'\cdot\mathbf{n})\mathbf{n} + \gamma t' v\mathbf{n} \,.

The unit vector has the advantage of simplifying equations for a single boost, allows either v or β to be reinstated when convenient, and the rapidity parametrization is immediately obtained by replacing β and βγ. It is not convenient for multiple boosts.
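A boost in an arbitrary direction can be sketched with plain tuples for the vectors. The function name boost and the sample velocities are illustrative, and units with c = 1 are assumed; the inverse is obtained by negating the velocity vector, as described above.

```python
import math


def boost(t, r, v, c=1.0):
    """Boost the event (t, r) by the relative velocity vector v (3-tuples)."""
    speed = math.sqrt(sum(vi * vi for vi in v))
    if speed == 0.0:
        return t, tuple(r)  # no relative motion: nothing changes
    if speed >= c:
        raise ValueError("the boost requires |v| < c")
    g = 1.0 / math.sqrt(1.0 - (speed / c) ** 2)
    n = tuple(vi / speed for vi in v)  # unit vector along v
    r_dot_n = sum(ri * ni for ri, ni in zip(r, n))
    t_p = g * (t - speed * r_dot_n / c**2)
    r_p = tuple(ri + (g - 1.0) * r_dot_n * ni - g * t * speed * ni
                for ri, ni in zip(r, n))
    return t_p, r_p


event = (8.0, (6.0, 1.0, 2.0))
print(boost(*event, (0.6, 0.0, 0.0)))       # reduces to the x-direction boost

t_p, r_p = boost(*event, (0.3, 0.4, 0.0))   # oblique boost
print(boost(t_p, r_p, (-0.3, -0.4, 0.0)))   # inverse recovers the event
```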

The vectorial relation between relative velocity and rapidity is[12]

 \boldsymbol{\beta} = \beta \mathbf{n} = \mathbf{n} \tanh\zeta \,,

and the "rapidity vector" can be defined as

 \boldsymbol{\zeta} = \zeta\mathbf{n} = \mathbf{n}\tanh^{-1}\beta \,,

each of which serves as a useful abbreviation in some contexts. The magnitude of ζ is the absolute value of the rapidity scalar confined to 0 ≤ ζ < ∞, which agrees with the range 0 ≤ β < 1.

Transformation of velocities

The transformation of velocities provides the definition of relativistic velocity addition ⊕. The ordering of vectors is chosen to reflect the ordering of the addition of velocities: first v (the velocity of F′ relative to F), then u′ (the velocity of X relative to F′), to obtain u = v ⊕ u′ (the velocity of X relative to F).

Defining the coordinate velocities and Lorentz factor by

\mathbf{u} = \frac{d\mathbf{r}}{dt} \,,\quad \mathbf{u}' = \frac{d\mathbf{r}'}{dt'} \,,\quad \gamma_\mathbf{v} = \frac{1}{\sqrt{1-\dfrac{\mathbf{v}\cdot\mathbf{v}}{c^2}}}

taking the differentials in the coordinates and time of the vector transformations, then dividing equations, leads to

\mathbf{u}'=\frac{1}{ 1-\frac{\mathbf{v}\cdot\mathbf{u}}{c^2} }\left[\frac{\mathbf{u}}{\gamma_\mathbf{v}}-\mathbf{v}+\frac{1}{c^2}\frac{\gamma_\mathbf{v}}{\gamma_\mathbf{v}+1}\left(\mathbf{u}\cdot\mathbf{v}\right)\mathbf{v}\right]

The velocities u and u′ are the velocity of some massive object. They can also be for a third inertial frame (say F′′), in which case they must be constant. Denote either entity by X. Then X moves with velocity u relative to F, or equivalently with velocity u′ relative to F′; in turn F′ moves with velocity v relative to F. The inverse transformations can be obtained in a similar way, or, as with position coordinates, by exchanging u and u′ and changing v to −v.

The transformation of velocity is useful in stellar aberration, the Fizeau experiment, and the relativistic Doppler effect.
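The velocity transformation can be checked against two familiar cases: collinear velocities, and a light signal moving perpendicular to the boost. The sketch assumes c = 1; transform_velocity and dot are illustrative names.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def transform_velocity(u, v, c=1.0):
    """Velocity u' in F' of an object with velocity u in F,
    where F' moves with velocity v relative to F (3-tuples)."""
    g = 1.0 / math.sqrt(1.0 - dot(v, v) / c**2)
    uv = dot(u, v)
    scale = 1.0 / (1.0 - uv / c**2)
    k = g / (g + 1.0) * uv / c**2  # coefficient of the extra term along v
    return tuple(scale * (ui / g - vi + k * vi) for ui, vi in zip(u, v))


# Collinear case: reduces to (u - v) / (1 - u v / c^2).
collinear = transform_velocity((0.8, 0.0, 0.0), (0.6, 0.0, 0.0))
print(collinear)

# Light moving along y in F still has speed c = 1 in F'.
light = transform_velocity((0.0, 1.0, 0.0), (0.6, 0.0, 0.0))
print(light, math.sqrt(dot(light, light)))
```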

The Lorentz transformations of acceleration can be similarly obtained by taking differentials in the velocity vectors, and dividing these by the time differential.

Transformation of coordinate derivatives

Numerous equations in physics are partial differential equations involving space and time coordinates. Since the space and time coordinates change under Lorentz transformations, the derivatives must also. Using the chain rule one finds the transformation of the coordinate time and space derivatives to be

\frac{\partial}{\partial t'}=\gamma\left(\frac{\partial}{\partial t}+v\mathbf{n}\cdot\nabla\right)
\nabla'=\nabla+(\gamma-1)\mathbf{n}(\mathbf{n}\cdot\nabla)+\gamma\frac{v\mathbf{n}}{c^2}\frac{\partial}{\partial t}

with inverses

\frac{\partial}{\partial t}=\gamma\left(\frac{\partial}{\partial t'}-v\mathbf{n}\cdot\nabla'\right)
\nabla =\nabla'+(\gamma-1)\mathbf{n}(\mathbf{n}\cdot\nabla')-\gamma\frac{v\mathbf{n}}{c^2}\frac{\partial}{\partial t'}

These are not quite the same as the transformations of coordinates. It turns out many physical quantities transform either like the coordinates, or like the derivatives.

Transformation of other quantities

In general, given four quantities A and Z = (Zx, Zy, Zz) and their Lorentz-boosted counterparts A′ and Z′ = (Z′x, Z′y, Z′z), a relation of the form

A^2 - \mathbf{Z}\cdot\mathbf{Z} = {A'}^2 - \mathbf{Z}'\cdot\mathbf{Z}'

implies the quantities transform under Lorentz transformations similar to the transformation of spacetime coordinates;

A'  = \gamma \left(A - \frac{v\mathbf{n}\cdot \mathbf{Z}}{c} \right) \,,
 \mathbf{Z}' = \mathbf{Z} + (\gamma-1)(\mathbf{Z}\cdot\mathbf{n})\mathbf{n} - \frac{\gamma A v\mathbf{n}}{c} \,.

The decomposition of Z (and Z′) into components perpendicular and parallel to v is exactly the same as for the position vector, as is the process of obtaining the inverse transformations (exchange (A, Z) and (A′, Z′) to switch observed quantities, and reverse the direction of relative motion by n → −n);

A = \gamma \left(A' + \frac{\mathbf{Z}' \cdot v\mathbf{n}}{c} \right) \,,
\mathbf{Z} = \mathbf{Z}' + (\gamma-1)(\mathbf{Z}'\cdot\mathbf{n})\mathbf{n} + \frac{\gamma A' v\mathbf{n}}{c} \,.

The quantities (A, Z) collectively make up a four vector, where A is the "timelike component", and Z the "spacelike component". Examples of A and Z are the following:

Four-vector | A | Z
Position four-vector | time (multiplied by c), ct | position vector, r
Four-momentum | energy (divided by c), E/c | momentum, p
Four-spin | (no name), s_t | spin, s
Four-current | charge density (multiplied by c), ρc | current density, j
Electromagnetic four-potential | electric potential (divided by c), φ/c | magnetic potential, A

For a given object (e.g. particle, fluid, field, material), if A or Z correspond to properties specific to the object like its charge density, mass density, spin, etc., its properties can be fixed in the rest frame of that object. Then the Lorentz transformations give the corresponding properties in a frame moving relative to the object with constant velocity. This breaks some notions taken for granted in non-relativistic physics. For example, the energy E of an object is a scalar in non-relativistic mechanics, but not in relativistic mechanics because energy changes under Lorentz transformations; its value is different for various inertial frames. In the rest frame of an object, it has a rest energy and zero momentum. In a boosted frame its energy is different and it appears to have a momentum. Similarly, in non-relativistic quantum mechanics the spin of a particle is a constant vector, but in relativistic quantum mechanics spin s depends on relative motion. In the rest frame of the particle, the spin pseudovector can be fixed to be its ordinary non-relativistic spin with a zero timelike quantity s_t; however, a boosted observer will perceive a nonzero timelike component and an altered spin.[13]
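The energy and momentum example can be sketched with the general four-vector formulas. The code below assumes units with c = 1 and a hypothetical particle of mass 2 at rest in F, so that its timelike component is E/c = mc and its momentum is zero; boost_four_vector is an illustrative name.

```python
import math


def boost_four_vector(a, z, n, v, c=1.0):
    """Boost the four-vector (a, z) by speed v along the unit vector n."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    z_dot_n = sum(zi * ni for zi, ni in zip(z, n))
    a_p = g * (a - v * z_dot_n / c)
    z_p = tuple(zi + (g - 1.0) * z_dot_n * ni - g * a * v * ni / c
                for zi, ni in zip(z, n))
    return a_p, z_p


m = 2.0                          # hypothetical mass, with c = 1
rest = (m, (0.0, 0.0, 0.0))      # (E/c, p) in the rest frame
e_p, p_p = boost_four_vector(*rest, n=(1.0, 0.0, 0.0), v=0.6)
print(e_p, p_p)

# The combination A^2 - Z.Z is unchanged by the boost.
print(m**2, e_p**2 - sum(pi * pi for pi in p_p))
```

In the boosted frame the energy is γmc² and the momentum is −γmv along the boost direction, the sign reflecting that the object moves backwards relative to F′.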

Not all quantities are invariant in the form shown above. For example, orbital angular momentum L does not have a timelike quantity, and neither does the electric field E nor the magnetic field B. The definition of angular momentum is L = r × p, and in a boosted frame the altered angular momentum is L′ = r′ × p′. Applying this definition using the transformations of coordinates and momentum leads to the transformation of angular momentum. It turns out L transforms with another vector quantity N = (E/c²)r − tp related to boosts; see relativistic angular momentum for details. For the case of the E and B fields, the transformations cannot be obtained as directly using vector algebra. An efficient method of deriving the EM field transformations, which also illustrates the unity of the electromagnetic field, uses tensor algebra, given below.

Mathematical formulation

Throughout, italic non-bold capital letters are 4×4 matrices, while non-italic bold letters are 3×3 matrices.

Boost matrix

The separate algebraic equations are often used in practical calculations, but for theoretical purposes it is useful to arrange the coordinates in column vectors and the quantities defining the transformation into a transformation matrix thus

 X' = \begin{bmatrix} c\,t' \\ x' \\ y' \\ z' \end{bmatrix} \,, \quad B(\mathbf{v}) = \begin{bmatrix} \gamma&-\gamma\beta n_x&-\gamma\beta n_y&-\gamma\beta n_z\\ -\gamma\beta n_x&1+(\gamma-1)n_x^2&(\gamma-1)n_x n_y&(\gamma-1)n_x n_z\\ -\gamma\beta n_y&(\gamma-1)n_y n_x&1+(\gamma-1)n_y^2&(\gamma-1)n_y n_z\\ -\gamma\beta n_z&(\gamma-1)n_z n_x&(\gamma-1)n_z n_y&1+(\gamma-1)n_z^2\\ \end{bmatrix} \,, \quad X = \begin{bmatrix} c\,t \\ x \\ y \\ z \end{bmatrix}

and all the separate equations compress into one matrix equation;

X' = B(\mathbf{v})X

The boost matrix B is symmetric; it equals its transpose. In the inverse transformations the transformation matrix is the matrix inverse of the original transformation. Instead of explicitly calculating the inverse matrix by brute force, the simple change v → −v suffices, and the inverse transformation is

B (\mathbf{v})^{-1} = B (-\mathbf{v}) \quad \Rightarrow \quad X = B(-\mathbf{v})X'
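The matrix form and its inverse can be checked with nested lists. This is a sketch in which boost_matrix, matmul, and apply are illustrative helpers, β and the unit vector n are passed separately, and the sample direction (0.6, 0, 0.8) is arbitrary.

```python
import math


def boost_matrix(beta, n):
    """4x4 boost matrix B for speed beta = v/c along the unit vector n."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    b = [[0.0] * 4 for _ in range(4)]
    b[0][0] = g
    for i in range(3):
        b[0][i + 1] = b[i + 1][0] = -g * beta * n[i]
        for j in range(3):
            b[i + 1][j + 1] = (1.0 if i == j else 0.0) + (g - 1.0) * n[i] * n[j]
    return b


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(b, x):
    return [sum(b[i][k] * x[k] for k in range(4)) for i in range(4)]


# Boost along x acting on the column X = (ct, x, y, z).
print(apply(boost_matrix(0.6, (1.0, 0.0, 0.0)), [8.0, 6.0, 1.0, 2.0]))

# B(v) B(-v) is the identity for an oblique direction as well.
n = (0.6, 0.0, 0.8)
forward = boost_matrix(0.5, n)
backward = boost_matrix(0.5, tuple(-ni for ni in n))
product = matmul(forward, backward)
print(product)
```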

The boosts along the Cartesian directions can be readily obtained, for example the unit vector in the x direction has components nx = 1 and ny = nz = 0. Looking at the patterns in the boost matrices along the Cartesian directions, the general boost matrix can be systematically rewritten by introducing

K_x = \begin{bmatrix}
0 &1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
\end{bmatrix}\,,\quad K_y = \begin{bmatrix}0 & 0 & 1 & 0\\
0 & 0 & 0 & 0\\
1 & 0 & 0 & 0\\
0 & 0 & 0 & 0
\end{bmatrix}\,,\quad K_z = \begin{bmatrix}0 & 0 & 0 & 1\\
0 & 0 & 0 & 0\\
0 & 0 & 0 & 0\\
1 & 0 & 0 & 0
\end{bmatrix}

Collecting these into a vector of matrices K = (Kx, Ky, Kz), the matrix n·K = nxKx + nyKy + nzKz and its square allow the compact expression

B(\mathbf{v}) = I - \gamma\beta(\mathbf{n}\cdot\mathbf{K}) + (\gamma-1)(\mathbf{n}\cdot\mathbf{K})^2

or in the rapidity parametrization

B(\boldsymbol{\zeta}) = I - \sinh\zeta(\mathbf{n}\cdot\mathbf{K}) + (\cosh\zeta-1)(\mathbf{n}\cdot\mathbf{K})^2

which resembles Rodrigues' rotation formula for spatial rotations.
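The compact expression can be evaluated directly from the matrices Kx, Ky, Kz. The sketch below builds n·K and its square with plain lists; boost_from_generators and matmul are illustrative names, and the rapidity corresponds to the sample value β = 0.6.

```python
import math

K_X = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
K_Y = [[0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
K_Z = [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def boost_from_generators(zeta, n):
    """B = I - sinh(zeta) (n.K) + (cosh(zeta) - 1) (n.K)^2."""
    nk = [[n[0] * K_X[i][j] + n[1] * K_Y[i][j] + n[2] * K_Z[i][j]
           for j in range(4)] for i in range(4)]
    nk2 = matmul(nk, nk)
    s, ch = math.sinh(zeta), math.cosh(zeta)
    return [[(1.0 if i == j else 0.0) - s * nk[i][j] + (ch - 1.0) * nk2[i][j]
             for j in range(4)] for i in range(4)]


B = boost_from_generators(math.atanh(0.6), (1.0, 0.0, 0.0))
for row in B:
    print(row)
```

For a boost along x the result has γ on the first two diagonal entries, −βγ on the two off-diagonal entries of the ct–x block, and the identity elsewhere.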

The matrices make one or more successive transformations easier to handle, rather than rotely iterating the transformations to obtain the result of more than one transformation. For two boosts along the same direction, the result is another boost, and rapidity provides a natural way to handle this. If a frame F′ is boosted with rapidity ζ1 relative to frame F in direction n, and another frame F′′ is boosted with rapidity ζ2 relative to F′ along the same direction, the separate boosts are

X'' = B(\zeta_2\mathbf{n})X' \,, \quad X' = B(\zeta_1\mathbf{n})X

then ζ1 + ζ2 is the rapidity of the overall boost of F′′ relative to F in the same direction as n,

X'' = B[ (\zeta_1+\zeta_2)\mathbf{n}]X

Moreover, the relative velocities are related to the rapidities by

\beta = \tanh(\zeta_1+\zeta_2) \,,\quad \beta_1 = \tanh\zeta_1 \,,\quad \beta_2 = \tanh\zeta_2 \,,

and the hyperbolic identity

\tanh(\zeta_1+\zeta_2) = \frac{\tanh\zeta_1 + \tanh\zeta_2}{1+\tanh\zeta_1 \tanh\zeta_2}

coincides with the resultant relative velocity of the two relative velocities along the same direction. Thus rapidities add if the boosts are collinear as they are here, while the relative velocities do not. The relative velocities can be in the same or opposite directions, but must be collinear.
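The additivity of rapidities for collinear boosts can be confirmed numerically. The values β1 = 0.6 and β2 = 0.8 are arbitrary samples, and add_collinear is an illustrative name.

```python
import math


def add_collinear(beta1, beta2):
    """Resultant beta of two collinear relative velocities."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)


beta1, beta2 = 0.6, 0.8

# Adding the rapidities and converting back to a velocity ...
via_rapidity = math.tanh(math.atanh(beta1) + math.atanh(beta2))
# ... agrees with the velocity-addition formula.
via_formula = add_collinear(beta1, beta2)

print(via_rapidity, via_formula)
print(beta1 + beta2)  # the naive sum exceeds 1; the relativistic result does not
```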

For two or more consecutive boosts that are not collinear but in different directions, the result is still a Lorentz transformation, but not a single boost. Also, Lorentz boosts along different directions do not commute: changing their order changes the resultant transformation. The non-commutativity of Lorentz boosts is another unintuitive feature of special relativity that is unlike Galilean relativity. In Newtonian mechanics, any pair of Galilean boosts can be performed in either order, and both results are the same Galilean transformation.

The most general proper Lorentz transformation also contains a rotation of the three axes, because the composition of two boosts is not a pure boost but is a boost followed or preceded by a rotation. The rotation is the Wigner rotation, and gives rise to the Thomas precession. The boost is given by a symmetric matrix, but the general Lorentz transformation matrix need not be symmetric. Explicit formulae for the composite transformation matrices are given in the linked article.
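The statements about non-collinear boosts can be checked numerically. The sketch below composes a boost along x with a boost along y in both orders; the helper `boost_along` and the rapidity values are arbitrary choices for illustration.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric, signature (-,+,+,+)

def boost_along(axis, zeta):
    """Boost with rapidity zeta along Cartesian axis 1 (x), 2 (y) or 3 (z)."""
    B = np.eye(4)
    B[0, 0] = B[axis, axis] = np.cosh(zeta)
    B[0, axis] = B[axis, 0] = -np.sinh(zeta)
    return B

Bx = boost_along(1, 0.5)
By = boost_along(2, 0.8)

xy = Bx @ By    # boost along y first, then along x
yx = By @ Bx    # boost along x first, then along y

boosts_commute = np.allclose(xy, yx)               # False: the order matters
product_is_symmetric = np.allclose(xy, xy.T)       # False: not a pure boost
product_is_lorentz = np.allclose(xy.T @ ETA @ xy, ETA)
print(boosts_commute, product_is_symmetric, product_is_lorentz)
```

The product fails the symmetry test that every pure boost matrix passes, which is the numerical signature of the extra rotation, yet it still preserves the metric.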

Rotation matrix

A rotation on the spatial coordinates only, leaving the time coordinate alone, leaves the spacetime interval invariant. Therefore, ordinary spatial rotations are also Lorentz transformations. The 4d matrix is simply

\quad R(\boldsymbol{\theta}) = \begin{bmatrix} 1 & 0 \\ 0 & \mathbf{R}(\boldsymbol{\theta}) \end{bmatrix}

where R is a 3d rotation matrix. In this article the axis-angle representation is used, and the "axis-angle vector" θ = θe is a useful definition: the angle θ multiplied by a unit vector e parallel to the axis. The inverse of R corresponds to rotations using the same axis and angle, but in the opposite sense. The rotation matrix is orthogonal, so the transpose equals the inverse,

R (\boldsymbol{\theta})^{-1} = R(\boldsymbol{\theta})^\mathrm{T} = R (-\boldsymbol{\theta}) \,.

Looking at the patterns in the rotation matrices about the Cartesian axes, it is useful to introduce the matrices

J_x =  \begin{bmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 \\
0 & 0 & 1 & 0 \\
\end{bmatrix}\,,\quad J_y = 
\begin{bmatrix}
0 & 0 & 0 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 0 & 0\\
0 & -1 & 0 & 0
\end{bmatrix}\,,\quad J_z =  \begin{bmatrix}
0 & 0 & 0 & 0\\
0 & 0 & -1 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 0 & 0
\end{bmatrix}

Collecting these into a vector J = (Jx, Jy, Jz), these matrices allow the 4d rotation matrix to be expressed in the Rodrigues quadratic,

R(\boldsymbol{\theta}) = I +\sin\theta(\mathbf{e}\cdot\mathbf{J})+(1-\cos\theta)(\mathbf{e}\cdot\mathbf{J})^2

In this article, the right-handed convention for the spatial coordinates is used (see orientation (vector space)), so that rotations are positive in the anticlockwise sense according to the right-hand rule, and negative in the clockwise sense. This matrix rotates any 3d vector about the axis e through angle θ anticlockwise (an active transformation), which has the equivalent effect of rotating the coordinate frame clockwise about the same axis through the same angle (a passive transformation).
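A short NumPy sketch of the Rodrigues form may help to fix the sign conventions. The function name `rotation` is hypothetical; the generators are entered exactly as in the matrices above, with coordinate order (ct, x, y, z).

```python
import numpy as np

# Rotation generators J_x, J_y, J_z as 4x4 matrices.
J = np.zeros((3, 4, 4))
J[0, 2, 3], J[0, 3, 2] = -1.0, 1.0    # J_x
J[1, 1, 3], J[1, 3, 1] = 1.0, -1.0    # J_y
J[2, 1, 2], J[2, 2, 1] = -1.0, 1.0    # J_z

def rotation(theta, e):
    """R = I + sin(theta) (e.J) + (1 - cos(theta)) (e.J)^2 about the unit axis e."""
    e = np.asarray(e, dtype=float)
    e = e / np.linalg.norm(e)
    eJ = np.einsum("i,ijk->jk", e, J)
    return np.eye(4) + np.sin(theta) * eJ + (1.0 - np.cos(theta)) * (eJ @ eJ)

R_z = rotation(np.pi / 2, [0, 0, 1])     # quarter turn about the z axis
X = np.array([0.0, 1.0, 0.0, 0.0])       # event at t = 0 on the x axis
X_rotated = R_z @ X                      # active rotation: x axis goes to y axis
print(X_rotated)
```

The quarter turn carries the x axis onto the y axis, which is the anticlockwise sense described above, and the time component is untouched.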

Introduction to the Lorentz group

Main article: Lorentz group

It is a result of special relativity that the quantity

 X \cdot X = X^\mathrm{T} \eta X = {X'}^\mathrm{T} \eta {X'}

is an invariant, where η is the Minkowski metric as a square matrix

 \eta = \begin{bmatrix} -1&0&0&0\\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}

and the coordinates change under a Lorentz transformation

X' = \Lambda X

where Λ is a constant square matrix. Boosts and rotations themselves are Lorentz transformations since each operation leaves the spacetime interval invariant, and the composition of any two is also a Lorentz transformation. Specifically, the composition of two pure rotations (without boosts) is a rotation, but the composition of two pure boosts (without rotations) is generally a boost followed or preceded by a rotation.

The set of all Lorentz transformations Λ is denoted \mathcal{L}. This set together with matrix multiplication forms a group, in this context known as the Lorentz group. Also, the above expression X·X is a quadratic form of signature (3,1) on spacetime, and the group of transformations which leaves this quadratic form invariant is the indefinite orthogonal group O(3,1), a Lie group. In other words, the Lorentz group is O(3,1). As presented in this article, any Lie groups mentioned are matrix Lie groups. In this context the operation of composition amounts to matrix multiplication.

For the specific cases that Λ is a boost, rotation, or both, there is an additional detail; the determinant of any boost or rotation matrix is +1. The group of Lorentz transformations consisting only of boosts and rotations is called the "restricted Lorentz group", and is the identity component SO⁺(3,1) of the special indefinite orthogonal group SO(3,1).

However, Λ is not limited to boosts and rotations. Other Lorentz transformations may have a determinant of opposite sign and other properties, for example any boosts and/or rotation, combined with parity inversion and/or time reversal, will also leave the above quadratic form invariant. The other transformations are outlined later.
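The invariance condition and the determinant signs can be illustrated numerically. The following is a sketch under arbitrary choices of rapidity, angle and event; the helper `preserves_interval` is a name introduced here.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric, signature (-,+,+,+)

def preserves_interval(L):
    """True if X^T eta X is unchanged by X -> L X for every X."""
    return np.allclose(L.T @ eta @ L, eta)

zeta, theta = 0.7, 0.4
boost = np.eye(4)
boost[:2, :2] = [[np.cosh(zeta), -np.sinh(zeta)],
                 [-np.sinh(zeta), np.cosh(zeta)]]        # boost along x
rot = np.eye(4)
rot[1:3, 1:3] = [[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta), np.cos(theta)]]         # rotation about z
P = np.diag([1.0, -1.0, -1.0, -1.0])                     # parity inversion
T = np.diag([-1.0, 1.0, 1.0, 1.0])                       # time reversal

candidates = {"boost": boost, "rotation": rot, "boost*rotation": boost @ rot,
              "P": P, "T": T}
results = {name: (preserves_interval(L), np.linalg.det(L))
           for name, L in candidates.items()}

X = np.array([2.0, 1.0, -3.0, 0.5])      # an arbitrary event
X_prime = boost @ rot @ X
interval, interval_prime = X @ eta @ X, X_prime @ eta @ X_prime
print(results, interval, interval_prime)
```

All five matrices preserve the quadratic form, but only the boost, the rotation and their product have determinant +1.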

In fact, the above transformation does not include all the symmetries in spacetime. For the spacetime interval to be invariant, it can be shown[14] that it is necessary and sufficient for the coordinate transformation to be of the form

X' = \Lambda X + C

where C is a constant column containing translations in time and space. If C ≠ 0, this is an inhomogeneous Lorentz transformation or Poincaré transformation.[15][16] If C = 0, this is a homogeneous Lorentz transformation.

The set of Poincaré transformations also satisfies the properties of a group and is called the Poincaré group or inhomogeneous Lorentz group. The extra translations mean the Poincaré group is not O(3,1); details are given in the linked article. Under the Erlangen program, Minkowski space can be viewed as the geometry defined by the Poincaré group, which combines Lorentz transformations with translations. This is the full symmetry of special relativity.

Generators and parameters of the homogeneous Lorentz group

The axis-angle vector θ and rapidity vector ζ are altogether six continuous variables which make up the group parameters (in this particular representation), and J and K are the corresponding six generators of the group.[nb 4]

Physically, the generators of the Lorentz group are operators that correspond to important symmetries in spacetime: J are the rotation generators which correspond to angular momentum, and K are the boost generators which correspond to the motion of the system in spacetime.

Lorentz generators can be added together, or multiplied by real numbers, to get more Lorentz generators. For example,

\boldsymbol{\zeta} \cdot\mathbf{K} + \boldsymbol{\theta} \cdot\mathbf{J} = \zeta_x K_x + \zeta_y K_y + \zeta_z K_z 
 + \theta_x J_x + \theta_y J_y + \theta_z J_z= \begin{bmatrix}
0 & \zeta_x & \zeta_y & \zeta_z \\
\zeta_x & 0 & -\theta_z & \theta_y \\
\zeta_y & \theta_z & 0 & -\theta_x \\
\zeta_z & -\theta_y & \theta_x & 0 \\
\end{bmatrix}

is a generator. Therefore, the set of all Lorentz generators

V = \{ \boldsymbol{\zeta} \cdot\mathbf{K} + \boldsymbol{\theta} \cdot\mathbf{J}  \}

together with the operations of ordinary matrix addition and multiplication of a matrix by a number, forms a vector space over the real numbers.[nb 5] The generators Jx, Jy, Jz, Kx, Ky, Kz form a basis set of V, and the components of the axis-angle and rapidity vectors, θx, θy, θz, ζx, ζy, ζz, are the coordinates of a Lorentz generator with respect to this basis.[nb 6]

Three of the commutation relations of the Lorentz generators are

 [J_x ,J_y ] =  J_z
 [K_x ,K_y ] = - J_z
 [J_x ,K_y ] = K_z

where the bracket [A, B] = AB − BA is a binary operation known as the commutator, and the other relations can be found by taking cyclic permutations of x, y, z components (i.e. change x to y, y to z, and z to x, repeat).

These commutation relations, and the vector space of generators, fulfill the definition of the Lie algebra so(3, 1). In summary, a Lie algebra is defined as a vector space V over a field of numbers, and with a binary operation [ , ] (called a Lie bracket in this context) on the elements of the vector space, satisfying the axioms of bilinearity, alternatization, and the Jacobi identity. Here the operation [ , ] is the commutator which satisfies all of these axioms, the vector space is the set of Lorentz generators V as given previously, and the field is the set of real numbers.
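The commutation relations can be verified directly from the explicit 4×4 generators. The sketch below uses a small helper `E(a, b)` (a hypothetical name) for the matrix with a single unit entry in row a, column b.

```python
import numpy as np

def E(a, b):
    """4x4 matrix with a 1 in row a, column b and zeros elsewhere."""
    M = np.zeros((4, 4))
    M[a, b] = 1.0
    return M

# Generators as given in the article (coordinate order ct, x, y, z).
Kx, Ky, Kz = (E(0, i) + E(i, 0) for i in (1, 2, 3))
Jx = E(3, 2) - E(2, 3)
Jy = E(1, 3) - E(3, 1)
Jz = E(2, 1) - E(1, 2)

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return A @ B - B @ A

checks = {
    "[Jx, Jy] = Jz":  np.array_equal(comm(Jx, Jy), Jz),
    "[Kx, Ky] = -Jz": np.array_equal(comm(Kx, Ky), -Jz),
    "[Jx, Ky] = Kz":  np.array_equal(comm(Jx, Ky), Kz),
    # Cyclic permutations x -> y -> z -> x
    "[Jy, Jz] = Jx":  np.array_equal(comm(Jy, Jz), Jx),
    "[Ky, Kz] = -Jx": np.array_equal(comm(Ky, Kz), -Jx),
    "[Jy, Kz] = Kx":  np.array_equal(comm(Jy, Kz), Kx),
}

# Jacobi identity for one triple of generators.
jacobi = comm(Jx, comm(Ky, Kz)) + comm(Ky, comm(Kz, Jx)) + comm(Kz, comm(Jx, Ky))
print(checks)
```

The minus sign in [Kx, Ky] = −Jz is the algebraic origin of the rotation that appears when two non-collinear boosts are composed.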

The exponential map (Lie theory) from the Lie algebra to the Lie group,

\mathrm{exp} \, : \, \mathfrak{so}(3,1) \rightarrow \mathrm{SO}(3,1),

provides a one-to-one correspondence between small enough neighborhoods of the origin of the Lie algebra and neighborhoods of the identity element of the Lie group. In the case of the Lorentz group, the exponential map is just the matrix exponential. Globally, the exponential map is not one-to-one, but in the case of the Lorentz group, it is surjective (onto) the connected component of the identity. Hence any element of the restricted Lorentz group can be expressed as an exponential of an element of the Lie algebra.

To see the exponential mapping heuristically, consider the infinitesimal Lorentz boost in the x direction for simplicity (the generalization to any direction follows an almost identical procedure). The infinitesimal transformation is a small boost away from the identity, obtained by the Taylor expansion of the boost matrix to first order about ζ = 0,

 B(\zeta\mathbf{e}_x) = I + \zeta \left. \frac{\partial B(\zeta\mathbf{e}_x)}{\partial \zeta } \right|_{\zeta=0} + \cdots

where the higher order terms not shown are negligible because ζ is small. The derivative of the matrix is the matrix of the entries differentiated with respect to the same variable (see matrix calculus), and it is understood the derivatives are found first then evaluated at ζ = 0, which turn out to give

 \left. \frac{\partial B(\zeta\mathbf{e}_x) }{\partial \zeta } \right|_{\zeta=0} = -K_x

The derivative of any smooth curve A(t) with A(0) = I in the group depending on some group parameter t with respect to that group parameter, evaluated at t = 0, serves as a definition of a corresponding group generator X, and this reflects an infinitesimal transformation away from the identity. The smooth curve can always be taken as an exponential as the exponential will always map X smoothly back into the group via t → exp(tX) for all t; this curve will yield X again when differentiated at t = 0. In other words, linking terminology used in mathematics and physics: A group generator is any element of the Lie algebra. A group parameter is a component of a coordinate vector representing an arbitrary element of the Lie algebra with respect to some basis. A basis, then, is a set of generators being a basis of the Lie algebra in the usual vector space sense.

In the limit of an infinite number of infinitely small steps, the finite boost transformation in the form of a matrix exponential is obtained

 B(\zeta\mathbf{e}_x) =\lim_{N\rightarrow\infty}\left(I-\frac{\zeta K_x}{N}\right)^{N}=e^{-\zeta K_x}

where the limit definition of the exponential has been used (see also characterizations of the exponential function).
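The convergence of the limit can be observed numerically. This is a minimal sketch; the rapidity value and the sample values of N are arbitrary, and `boost_x_limit` is a name introduced for the illustration.

```python
import numpy as np

Kx = np.zeros((4, 4))
Kx[0, 1] = Kx[1, 0] = 1.0            # boost generator along x

def boost_x(zeta):
    """Closed-form boost along x with rapidity zeta."""
    B = np.eye(4)
    B[0, 0] = B[1, 1] = np.cosh(zeta)
    B[0, 1] = B[1, 0] = -np.sinh(zeta)
    return B

def boost_x_limit(zeta, N):
    """N successive small boosts, each of rapidity zeta / N, to first order."""
    return np.linalg.matrix_power(np.eye(4) - zeta * Kx / N, N)

zeta = 0.9
exact = boost_x(zeta)
errors = {N: np.max(np.abs(boost_x_limit(zeta, N) - exact))
          for N in (10, 1000, 100000)}
print(errors)
```

The largest entrywise error shrinks roughly in proportion to 1/N, as expected for a first-order approximation of each small step.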

Almost identical results appear for the other Cartesian directions, and the general boost matrix is

B(\boldsymbol{\zeta}) = e^{-\boldsymbol{\zeta}\cdot\mathbf{K}}

similarly the general rotation matrix is

R(\boldsymbol{\theta}) = e^{\boldsymbol{\theta}\cdot\mathbf{J}}

and the general Lorentz transformation is

\Lambda (\boldsymbol{\zeta}, \boldsymbol{\theta}) = e^{-\boldsymbol{\zeta} \cdot\mathbf{K} + \boldsymbol{\theta} \cdot\mathbf{J} }.

This is in general a product of a rotation and a boost, but the decomposition of a general Lorentz transformation into such factors is nontrivial. In particular,

e^{-\boldsymbol{\zeta} \cdot\mathbf{K} + \boldsymbol{\theta} \cdot\mathbf{J} } \ne e^{-\boldsymbol{\zeta} \cdot\mathbf{K}} e^{\boldsymbol{\theta} \cdot\mathbf{J}},

because the generators do not commute. For a description of how to find the factors of a general Lorentz transformation in terms of a boost and a rotation in principle (this usually does not yield an intelligible expression in terms of generators J and K), see Wigner rotation. If, on the other hand, the decomposition is given in terms of the generators, and one wants to find the product in terms of the generators, then the Baker–Campbell–Hausdorff formula applies.
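The failure of the exponential to factorize can be seen in a small numerical experiment. The sketch below uses a truncated Taylor series for the matrix exponential (the helper `expm_taylor` is an assumption of this example, adequate only for small matrices of modest norm) and takes the boost part with the sign convention B(ζ) = exp(−ζ·K) used above.

```python
import numpy as np

def expm_taylor(A, terms=40):
    """Matrix exponential by a truncated Taylor series."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k          # A^k / k!
        result = result + term
    return result

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
Kx = np.zeros((4, 4)); Kx[0, 1] = Kx[1, 0] = 1.0           # boost generator, x
Jz = np.zeros((4, 4)); Jz[1, 2], Jz[2, 1] = -1.0, 1.0      # rotation generator, z

zeta, theta = 0.5, 0.3
combined = expm_taylor(-zeta * Kx + theta * Jz)             # single exponential
product = expm_taylor(-zeta * Kx) @ expm_taylor(theta * Jz) # boost times rotation

closed_form = np.eye(4)
closed_form[0, 0] = closed_form[1, 1] = np.cosh(zeta)
closed_form[0, 1] = closed_form[1, 0] = -np.sinh(zeta)

factorizes = np.allclose(combined, product)                 # False
print(factorizes, np.max(np.abs(combined - product)))
```

Both matrices are Lorentz transformations, but they differ at second order in the parameters, in line with the leading commutator term of the Baker–Campbell–Hausdorff formula.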

Generators and parameters of the inhomogeneous Lorentz group

For inhomogeneous Lorentz transformations, the additional generators are the components of the four-momentum: energy is the generator of time translation, and the 3d momentum components are the generators of spatial translations in those directions. The extra parameters corresponding to these generators are displacements in space and time. The commutation relations are enlarged to include the momenta with the boost and rotation generators.

Classification of the homogeneous Lorentz group

From the invariance of the spacetime interval it follows immediately

\eta = \Lambda^\mathrm{T} \eta \Lambda

and this matrix equation contains the general conditions on the Lorentz transformation to ensure invariance of the spacetime interval. Taking the determinant of the equation using the product rule[nb 7] gives immediately

[\det (\Lambda)]^2 = 1 \quad \Rightarrow \quad \det(\Lambda) = \pm 1

Writing the Minkowski metric as a block matrix, and the Lorentz transformation in the most general form,

\eta = \begin{bmatrix}-1 & 0 \\ 0 & \mathbf{I}\end{bmatrix} \,, \quad \Lambda=\begin{bmatrix}\Gamma & -\mathbf{a}^\mathrm{T}\\-\mathbf{b} & \mathbf{M}\end{bmatrix}  \,,

carrying out the block matrix multiplications obtains general conditions on Γ, a, b, M to ensure relativistic invariance. Not much information can be directly extracted from all the conditions, however one of the results

\Gamma^2=1+\mathbf{b}^\mathrm{T}\mathbf{b}

is useful; b^T b ≥ 0 always, so it follows that

 \Gamma^2 \geq 1 \quad \Rightarrow \quad \Gamma \leq - 1 \,,\quad \Gamma \geq  1

The negative inequality may be unexpected, because Γ multiplies the time coordinate and this has an effect on time symmetry. If the positive inequality holds, then Γ is the Lorentz factor.

The determinant and inequality provide four ways to classify Lorentz transformations (herein LTs for brevity). However, any particular LT has only one determinant sign and only one inequality. There are four sets which include every possible pair given by the intersections ("n"-shaped symbol meaning "and") of these classifying sets. In set notation the four sets and their intersections are:

Classifying sets:

Antichronous (or non-orthochronous) LTs
 \mathcal{L}^\downarrow = \{ \Lambda  \, : \, \Gamma \leq -1 \}
Orthochronous LTs
 \mathcal{L}^\uparrow = \{ \Lambda  \, : \, \Gamma \geq 1 \}
Proper LTs
 \mathcal{L}_{+} = \{ \Lambda  \, : \, \det(\Lambda) = +1 \}
Improper LTs
 \mathcal{L}_{-} = \{ \Lambda  \, : \, \det(\Lambda) = -1 \}

Intersections:

Proper antichronous LTs
\mathcal{L}_+^\downarrow = \mathcal{L}_+ \cap \mathcal{L}^\downarrow
Proper orthochronous LTs
\mathcal{L}_+^\uparrow = \mathcal{L}_+ \cap \mathcal{L}^\uparrow
Improper antichronous LTs
\mathcal{L}_{-}^\downarrow = \mathcal{L}_{-} \cap \mathcal{L}^\downarrow
Improper orthochronous LTs
\mathcal{L}_{-}^\uparrow = \mathcal{L}_{-} \cap \mathcal{L}^\uparrow

where "+" and "−" indicate the determinant sign, while "↑" for ≥ and "↓" for ≤ denote the inequalities.

The full Lorentz group splits into the union ("u"-shaped symbol meaning "or") of four disjoint sets

\mathcal{L} = \mathcal{L}_+^\uparrow \cup \mathcal{L}_{-}^\uparrow \cup \mathcal{L}_+^\downarrow \cup \mathcal{L}_{-}^\downarrow

A subgroup of a group must be closed under the same operation of the group (here matrix multiplication). In other words, for two Lorentz transformations Λ and L from a particular set, the composite Lorentz transformations ΛL and LΛ must return to the same set Λ and L came from. This will not always be the case; it can be shown that the composition of any two Lorentz transformations taken from the same one of the four sets always has positive determinant and positive inequality, and so is a proper orthochronous transformation.

The orthochronous, proper, proper orthochronous sets of LTs are all subgroups. Another subgroup is the union of proper orthochronous and improper antichronous sets, \mathcal{L}_0 = \mathcal{L}_+^\uparrow \cup \mathcal{L}_{-}^\downarrow. Rotations and boosts are elements of the proper orthochronous Lorentz group.

The other sets involving the improper and/or antichronous properties do not form subgroups, because the composite transformation always has a positive determinant or inequality, whereas the original separate transformations will have negative determinants and/or inequalities. However, the elements of these sets can be expressed in terms of proper orthochronous transformations with appropriate parity inversion P and/or time reversal T. These are in matrix form

 P = \begin{bmatrix} 1 & 0 \\ 0 & - \mathbf{I} \end{bmatrix} \,, \quad T = \begin{bmatrix} - 1 & 0 \\ 0 & \mathbf{I} \end{bmatrix}

so if Λ is proper orthochronous, then TΛ is improper antichronous, PΛ is improper orthochronous, and TPΛ = PTΛ is proper antichronous.
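The classification can be expressed as a short function that inspects the determinant and the time-time entry Γ. This is an illustrative sketch; the function name `classify` and the sample boost are not taken from the article.

```python
import numpy as np

def classify(L):
    """Name the component of the Lorentz group that the matrix L belongs to."""
    proper = np.linalg.det(L) > 0          # det = +1 (proper) or -1 (improper)
    orthochronous = L[0, 0] > 0            # Gamma >= 1 or Gamma <= -1
    return ("proper" if proper else "improper") + " " + \
           ("orthochronous" if orthochronous else "antichronous")

zeta = 0.6
B = np.eye(4)                              # a proper orthochronous example:
B[0, 0] = B[1, 1] = np.cosh(zeta)          # boost along x with rapidity zeta
B[0, 1] = B[1, 0] = -np.sinh(zeta)

P = np.diag([1.0, -1.0, -1.0, -1.0])       # parity inversion
T = np.diag([-1.0, 1.0, 1.0, 1.0])         # time reversal

labels = {
    "B": classify(B),
    "TB": classify(T @ B),
    "PB": classify(P @ B),
    "TPB": classify(T @ P @ B),
    "(TB)(TB)": classify((T @ B) @ (T @ B)),   # two elements of the same set
}
print(labels)
```

The last entry shows why the improper antichronous set is not a subgroup: composing two of its elements lands in the proper orthochronous set.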

Tensor formulation

For the notation used, see Ricci calculus.

Contravariant vectors

Writing the general matrix transformation of coordinates as the matrix equation

\begin{bmatrix}
{x'}^0 \\ {x'}^1 \\ {x'}^2 \\ {x'}^3
\end{bmatrix} =

\begin{bmatrix}
 \Lambda^0{}_0 & \Lambda^0{}_1 & \Lambda^0{}_2 & \Lambda^0{}_3 \\
 \Lambda^1{}_0 & \Lambda^1{}_1 & \Lambda^1{}_2 & \Lambda^1{}_3 \\
 \Lambda^2{}_0 & \Lambda^2{}_1 & \Lambda^2{}_2 & \Lambda^2{}_3 \\
 \Lambda^3{}_0 & \Lambda^3{}_1 & \Lambda^3{}_2 & \Lambda^3{}_3 \\
\end{bmatrix}

\begin{bmatrix}
x^0 \\ x^1 \\ x^2 \\ x^3
\end{bmatrix}

allows the transformation of other physical quantities that cannot be expressed as four-vectors, e.g., tensors or spinors of any order in 4d spacetime, to be defined. In the corresponding tensor index notation, the above matrix expression is

{x^\prime}^{\nu} = {\Lambda^\nu}_\mu x^\mu,

where lower and upper indices label covariant and contravariant components respectively, and the summation convention is applied. It is a standard convention to use Greek indices that take the value 0 for time components, and 1, 2, 3 for space components, while Latin indices simply take the values 1, 2, 3, for spatial components. Note that the first index (reading left to right) corresponds in the matrix notation to a row index. The second index corresponds to the column index.

The transformation matrix is universal for all four-vectors, not just 4-dimensional spacetime coordinates. If A is any four-vector, then in tensor index notation

 {A^\prime}^{\nu} = \Lambda^{\nu}{}_\mu A^\mu \,.

Alternatively, one writes

 A^{\nu^\prime} = \Lambda^{\nu^\prime}{}_\mu A^\mu \,.

in which the primed indices denote the indices of A in the primed frame. This notation cuts the risk of exhausting the Greek alphabet roughly in half.

For a general n-component object one may write

{X'}^\alpha  = \Pi(\Lambda)^\alpha {}_\beta X^\beta \,,

where Π is the appropriate representation of the Lorentz group, an n×n matrix for every Λ. In this case, the indices should not be thought of as spacetime indices (sometimes called Lorentz indices), and they run from 1 to n. E.g. if X is a bispinor, then the indices are called Dirac indices.

Covariant vectors

There are also vector quantities with covariant indices. They are generally obtained from their corresponding objects with contravariant indices by the operation of lowering an index, e.g.

x_\nu = \eta_{\mu\nu}x^\mu,

where η is the metric tensor. (The linked article also provides more information about what the operation of raising and lowering indices really is mathematically.) The inverse of this transformation is given by

x^\mu = \eta^{\nu\mu}x_\nu,

where, when viewed as matrices, η^{μν} is the inverse of η_{μν}. As it happens, η^{μν} = η_{μν} as matrices. This is referred to as raising an index. To transform a covariant vector A_μ, first raise its index, then transform it according to the same rule as for contravariant 4-vectors, then finally lower the index;

{}{A^\prime}_\nu = \eta_{\rho\nu}{\Lambda^\rho}_\sigma\eta^{\mu\sigma}A_\mu.

But

\eta_{\rho\nu}{\Lambda^\rho}_\sigma\eta^{\mu\sigma} = {(\Lambda^{-1})^\mu}_\nu,

i.e. it is the (μ, ν)-component of the inverse Lorentz transformation. One defines (as a matter of notation),

{\Lambda_\nu}^\mu \equiv {(\Lambda^{-1})^\mu}_\nu,

and may in this notation write

{}{A^\prime}_\nu = {\Lambda_\nu}^\mu A_\mu.

Now for a subtlety. The implied summation on the right hand side of

{}{A^\prime}_\nu = {\Lambda_\nu}^\mu A_\mu = {(\Lambda^{-1})^\mu}_\nu A_\mu

is running over a row index of the matrix representing Λ^{−1}. Thus, in terms of matrices, this transformation should be thought of as the inverse transpose of Λ acting on the column vector A_μ. That is, in pure matrix notation,

A^\prime = (\Lambda^{-1})^{\mathrm{T}}A.

This means exactly that covariant vectors (thought of as column matrices) transform according to the dual representation of the standard representation of the Lorentz group. This notion generalizes to general representations, simply replace Λ with Π(Λ).
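The two routes to the covariant transformation law can be compared numerically. This is a sketch with an arbitrary boost and an arbitrary four-vector; the variable names are chosen for this example only.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])     # eta_{mu nu}; numerically its own inverse
zeta = 0.7
L = np.eye(4)                            # boost along x with rapidity zeta
L[0, 0] = L[1, 1] = np.cosh(zeta)
L[0, 1] = L[1, 0] = -np.sinh(zeta)

A_up = np.array([3.0, 1.0, -2.0, 0.5])   # contravariant components A^mu
A_down = eta @ A_up                      # lower the index

# Route 1: raise the index, transform as a contravariant vector, lower again.
A_down_prime_1 = eta @ (L @ (np.linalg.inv(eta) @ A_down))
# Route 2: act with the inverse transpose of Lambda on the column of A_mu.
A_down_prime_2 = np.linalg.inv(L).T @ A_down

A_up_prime = L @ A_up
contraction_before = A_down @ A_up                 # A_mu A^mu in the old frame
contraction_after = A_down_prime_2 @ A_up_prime    # A'_mu A'^mu in the new frame
print(A_down_prime_1, A_down_prime_2)
```

Both routes give the same components, and the contraction of the covariant with the contravariant components is unchanged by the transformation.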

Tensors

If A and B are linear operators on vector spaces U and V, then a linear operator A ⊗ B may be defined on the tensor product of U and V, denoted U ⊗ V, according to[17]

(A \otimes B)(u \otimes v) = Au \otimes Bv, \qquad u \in U, v \in V, u \otimes v \in U \otimes V.               (T1)

From this it is immediately clear that if u and v are four-vectors in V, then u ⊗ v ∈ T_2V ≡ V ⊗ V transforms as

u \otimes v \rightarrow \Lambda u \otimes \Lambda v = {\Lambda^\mu}_\nu u^\nu \otimes {\Lambda^\rho}_\sigma v^\sigma = {\Lambda^\mu}_\nu  {\Lambda^\rho}_\sigma u^\nu \otimes v^\sigma \equiv {\Lambda^\mu}_\nu  {\Lambda^\rho}_\sigma w^{\nu\sigma}.               (T2)

The second step uses the bilinearity of the tensor product and the last step defines a 2-tensor in component form, or rather, it just renames the tensor u ⊗ v.
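In components, the tensor product of two four-vectors is an outer product, and the transformation rule contracts one copy of Λ with each index. The following sketch checks this with `numpy.einsum`; the vectors and the boost are arbitrary illustrative values.

```python
import numpy as np

zeta = 0.4
L = np.eye(4)                            # boost along x with rapidity zeta
L[0, 0] = L[1, 1] = np.cosh(zeta)
L[0, 1] = L[1, 0] = -np.sinh(zeta)

u = np.array([1.0, 0.2, -0.3, 0.5])      # components u^nu
v = np.array([2.0, -1.0, 0.0, 0.7])      # components v^sigma

w = np.outer(u, v)                       # w^{nu sigma} = u^nu v^sigma

# Transform each factor first, then take the tensor product ...
w_from_factors = np.outer(L @ u, L @ v)
# ... or transform the components of w with one Lambda per index.
w_from_components = np.einsum("mn,rs,ns->mr", L, L, w)
print(np.max(np.abs(w_from_factors - w_from_components)))
```

In matrix language the component rule reads w′ = Λ w Λ^T, which is the form used for rank-2 tensors such as the electromagnetic field tensor.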

These observations generalize in an obvious way to more factors, and using the fact that a general tensor on a vector space V can be written as a sum of a coefficient (component!) times tensor products of basis vectors and basis covectors, one arrives at the transformation law for any tensor quantity T. It is given by[18]

T^{\alpha' \beta' \cdots \zeta'}_{\theta' \iota' \cdots \kappa'} =
\Lambda^{\alpha'}{}_{\mu} \Lambda^{\beta'}{}_{\nu} \cdots \Lambda^{\zeta'}{}_{\rho}
\Lambda_{\theta'}{}^{\sigma} \Lambda_{\iota'}{}^{\upsilon} \cdots \Lambda_{\kappa'}{}^{\zeta}
T^{\mu \nu \cdots \rho}_{\sigma \upsilon \cdots \zeta},               (T3)

where Λ_χ{}^ψ is defined above. This form can generally be reduced to the form for general n-component objects given above with a single matrix (Π(Λ)) operating on column vectors. This latter form is sometimes preferred, e.g. for the electromagnetic field tensor.

Transformation of the electromagnetic field

Figure: Lorentz boost of an electric charge; the charge is at rest in one frame or the other.

Lorentz transformations can also be used to illustrate that the magnetic field B and electric field E are simply different aspects of the same force — the electromagnetic force, as a consequence of relative motion between electric charges and observers.[19] The fact that the electromagnetic field shows relativistic effects becomes clear by carrying out a simple thought experiment.[20]

The electric and magnetic fields transform differently from space and time, but exactly the same way as relativistic angular momentum and the boost vector.

The electromagnetic field strength tensor is given by


F^{\mu\nu} = 
\begin{bmatrix}
0     & -E_x/c & -E_y/c & -E_z/c \\
E_x/c & 0      & -B_z   & B_y    \\
E_y/c & B_z    & 0      & -B_x   \\
E_z/c & -B_y   & B_x    & 0
\end{bmatrix} \text{(SI units, signature }(+,-,-,-)\text{)}.

In relativity, the Gaussian system of units is often preferred over SI units, even in texts whose main choice of units is SI units, because in it the electric field E and the magnetic induction B have the same units making the appearance of the electromagnetic field tensor more natural.[21] Consider a Lorentz boost in the x-direction. It is given by[22]

 
{\Lambda^\mu}_\nu = 
\begin{bmatrix} 
 \gamma & -\gamma\beta & 0 & 0\\
-\gamma\beta & \gamma & 0 & 0\\
 0 & 0 & 1 & 0\\
 0 & 0 & 0 & 1\\
\end{bmatrix}, \qquad

F^{\mu\nu} = 
\begin{bmatrix}
0     & E_x & E_y & E_z \\
-E_x & 0      & B_z   & -B_y    \\
-E_y & -B_z    & 0      & B_x   \\
-E_z & B_y   & -B_x    & 0
\end{bmatrix} \text{(Gaussian units, signature }(-,+,+,+)\text{)},

where the field tensor is displayed side by side for easiest possible reference in the manipulations below.

The general transformation law (T3) becomes

F^{\mu' \nu'} =
\Lambda^{\mu'}{}_{\mu} \Lambda^{\nu'}{}_{\nu}
F^{\mu \nu}.

For the magnetic field one obtains


\begin{align}
B_{x'}    & = F^{2'3'}
          = \Lambda^2{}_\mu\Lambda^3{}_\nu F^{\mu\nu}
          = \Lambda^2{}_2\Lambda^3{}_3 F^{23}
          = 1\times 1\times B_x \\
          & = B_x,
\end{align}

\begin{align}
B_{y'}  &= F^{3'1'}
        = \Lambda^3{}_\mu\Lambda^1{}_\nu F^{\mu \nu}
        = \Lambda^3{}_3\Lambda^1{}_\nu F^{3\nu}
        = \Lambda^3{}_3\Lambda^1{}_0 F^{30} + \Lambda^3{}_3\Lambda^1{}_1 F^{31}
        = 1\times (-\beta\gamma) (-E_z) + 1\times\gamma B_y   
        = \gamma B_y +\beta\gamma E_z \\
        & = \gamma\left(\mathbf{B} - \boldsymbol{\beta} \times \mathbf{E}\right)_y
\end{align}

\begin{align}
B_{z'}  &= F^{1'2'}
        = \Lambda^1{}_\mu\Lambda^2{}_\nu F^{\mu\nu}
        = \Lambda^1{}_\mu\Lambda^2{}_2 F^{\mu 2}
        = \Lambda^1{}_0\Lambda^2{}_2 F^{02} + \Lambda^1{}_1\Lambda^2{}_2 F^{12}
        = (-\gamma\beta) \times 1\times E_y + \gamma\times 1 \times   B_z
        = \gamma B_z -\beta\gamma E_y \\
        & = \gamma\left(\mathbf{B} - \boldsymbol{\beta} \times \mathbf{E}\right)_z
\end{align}

For the electric field results


\begin{align}
E_{x'}    &= F^{0'1'}
          = \Lambda^0{}_\mu\Lambda^1{}_\nu F^{\mu\nu}
          = \Lambda^0{}_1\Lambda^1{}_0 F^{10} + \Lambda^0{}_0\Lambda^1{}_1 F^{01}
          = (-\gamma\beta)(-\gamma\beta)(-E_x) + \gamma\gamma E_x
          = -\gamma^2\beta^2(E_x) + \gamma^2 E_x
          = E_x(1-\beta^2)\gamma^2 \\
          & = E_x,
\end{align}

\begin{align}
E_{y'}   &= F^{0'2'}
          = \Lambda^0{}_\mu\Lambda^2{}_\nu F^{\mu\nu}
          = \Lambda^0{}_\mu\Lambda^2{}_2 F^{\mu 2}
          = \Lambda^0{}_0\Lambda^2{}_2 F^{02} + \Lambda^0{}_1\Lambda^2{}_2 F^{12}
          = \gamma \times 1 \times E_y + (-\beta\gamma)\times 1 \times B_z
          = \gamma E_y -\beta\gamma B_z \\
          & = \gamma\left(\mathbf{E} + \boldsymbol{\beta} \times \mathbf{B}\right)_y
\end{align}

\begin{align}
E_{z'} &  = F^{0'3'}
          = \Lambda^0{}_\mu\Lambda^3{}_\nu F^{\mu\nu}
          = \Lambda^0{}_\mu\Lambda^3{}_3 F^{\mu 3}
          = \Lambda^0{}_0\Lambda^3{}_3 F^{03} + \Lambda^0{}_1\Lambda^3{}_3 F^{13}
          = \gamma\times 1 \times E_z -\beta\gamma\times 1 \times (-B_y)
          = \gamma E_z +\beta\gamma B_y \\
        & = \gamma\left(\mathbf{E} + \boldsymbol{\beta} \times \mathbf{B}\right)_z.
\end{align}

Here, β = (β, 0, 0) is used. These results can be summarized by

\begin{align} 
& \mathbf {{E}_{\parallel'}} = \mathbf {{E}_{\parallel}}\\
& \mathbf {{B}_{\parallel'}} = \mathbf {{B}_{\parallel}}\\
& \mathbf {{E}_{\bot'}}= \gamma \left( \mathbf {E}_{\bot} + \boldsymbol{\beta} \times \mathbf {B}_{\bot} \right) = \gamma \left( \mathbf {E} + \boldsymbol{\beta} \times \mathbf {B} \right)_{\bot},\\
& \mathbf {{B}_{\bot'}}= \gamma \left( \mathbf {B}_{\bot} - \boldsymbol{\beta} \times \mathbf {E}_{\bot} \right) = \gamma \left( \mathbf {B} - \boldsymbol{\beta} \times \mathbf {E} \right)_{\bot},
\end{align}

and are independent of the metric signature. For SI units, substitute E → E/c. Misner, Thorne & Wheeler (1973) refer to this last form as the 3 + 1 view as opposed to the geometric view represented by the tensor expression

F^{\mu' \nu'} =
\Lambda^{\mu'}{}_{\mu} \Lambda^{\nu'}{}_{\nu}
F^{\mu \nu},

and make a strong point of the ease with which results that are difficult to achieve using the 3 + 1 view can be obtained and understood. Only objects that have well defined Lorentz transformation properties (in fact under any smooth coordinate transformation) are geometric objects. In the geometric view, the electromagnetic field is a six-dimensional geometric object in spacetime as opposed to two interdependent, but separate, 3-vector fields in space and time. The fields E (alone) and B (alone) do not have well defined Lorentz transformation properties. The mathematical underpinnings are equations (T1) and (T2) that immediately yield (T3). One should note that the primed and unprimed tensors refer to the same event in spacetime. Thus the complete equation with spacetime dependence is

F^{\mu' \nu'}(x') =
\Lambda^{\mu'}{}_{\mu} \Lambda^{\nu'}{}_{\nu}
F^{\mu \nu}(\Lambda^{-1} x') =
\Lambda^{\mu'}{}_{\mu} \Lambda^{\nu'}{}_{\nu}
F^{\mu \nu}(x).
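The agreement between the geometric view and the 3 + 1 view can be checked numerically for a boost along x. The sketch below uses the Gaussian-units field tensor with signature (−,+,+,+) as displayed above; the helper names and the sample field values are assumptions of this example.

```python
import numpy as np

def field_tensor(E, B):
    """Contravariant F^{mu nu} in Gaussian units, signature (-,+,+,+)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([
        [0.0,  Ex,  Ey,  Ez],
        [-Ex, 0.0,  Bz, -By],
        [-Ey, -Bz, 0.0,  Bx],
        [-Ez,  By, -Bx, 0.0],
    ])

def fields_from_tensor(F):
    """Read E and B back out of F^{mu nu}."""
    E = np.array([F[0, 1], F[0, 2], F[0, 3]])
    B = np.array([F[2, 3], F[3, 1], F[1, 2]])
    return E, B

beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
L = np.eye(4)                              # boost along x
L[0, 0] = L[1, 1] = gamma
L[0, 1] = L[1, 0] = -gamma * beta

E = np.array([1.0, 2.0, -0.5])
B = np.array([0.3, -1.2, 0.7])

# Geometric view: F'^{ab} = Lambda^a_m Lambda^b_n F^{mn}
F_prime = np.einsum("am,bn,mn->ab", L, L, field_tensor(E, B))
E_prime, B_prime = fields_from_tensor(F_prime)

# 3 + 1 view: parallel components unchanged, perpendicular ones mixed.
beta_vec = np.array([beta, 0.0, 0.0])
E_expected = gamma * (E + np.cross(beta_vec, B))
B_expected = gamma * (B - np.cross(beta_vec, E))
E_expected[0], B_expected[0] = E[0], B[0]
print(E_prime, B_prime)
```

The tensor law is a single line, whereas the 3 + 1 expressions need the parallel and perpendicular components to be treated separately.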

Length contraction has an effect on charge density ρ and current density J, and time dilation has an effect on the rate of flow of charge (current), so charge and current distributions must transform in a related way under a boost. It turns out they transform exactly like the space-time and energy-momentum four-vectors,

\begin{align} \mathbf{j}' & =\mathbf{j}-\gamma \rho v\mathbf{n} +\left( \gamma -1 \right)(\mathbf{j}\cdot \mathbf{n})\mathbf{n} \\ {\rho }' & =\gamma ( \rho - \mathbf{j}\cdot v\mathbf{n}/c^2) \end{align},

or, in the simpler geometric view,

j^{\mu^\prime} = \Lambda^{\mu^\prime}{}_\mu j^\mu.

One says that charge density transforms as the time component of a four-vector. It is a rotational scalar. The current density is a 3-vector.
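The equivalence of the 3 + 1 formulas for ρ and j with the four-vector law can be checked for a boost in an arbitrary direction. The sketch assumes units with c = 1 and the four-current j^μ = (cρ, j); the function name `boost_matrix` and the numerical values are illustrative only.

```python
import numpy as np

c = 1.0   # assumption: units with c = 1, to keep the numbers simple

def boost_matrix(v, n):
    """General boost with speed v along the unit vector n."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    beta = v / c
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = gamma
    L[0, 1:] = L[1:, 0] = -gamma * beta * n
    L[1:, 1:] += (gamma - 1.0) * np.outer(n, n)
    return L, gamma, n

rho = 2.0                                  # charge density
j = np.array([0.3, -0.1, 0.4])             # current density
v = 0.6
L, gamma, n = boost_matrix(v, [1.0, 2.0, 2.0])

# 3 + 1 view
j_prime = j - gamma * rho * v * n + (gamma - 1.0) * (j @ n) * n
rho_prime = gamma * (rho - (j @ n) * v / c**2)

# Geometric view: the four-current transforms with Lambda
J4 = np.concatenate(([c * rho], j))
J4_prime = L @ J4
print(J4_prime, rho_prime, j_prime)
```

The time component of the transformed four-current divided by c reproduces ρ′, and the spatial components reproduce j′.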

The Maxwell equations are invariant under Lorentz transformations.

Spinors

Equation (T1) holds unmodified for any representation of the Lorentz group, including the bispinor representation. In (T2) one simply replaces all occurrences of Λ by the bispinor representation Π(Λ),

u \otimes v \rightarrow \Pi(\Lambda) u \otimes \Pi(\Lambda) v = {\Pi(\Lambda)^\alpha}_\beta u^\beta \otimes {\Pi(\Lambda)^\rho}_\sigma v^\sigma = {\Pi(\Lambda)^\alpha}_\beta  {\Pi(\Lambda)^\rho}_\sigma u^\beta\otimes v^\sigma \equiv {\Pi(\Lambda)^\alpha}_\beta  {\Pi(\Lambda)^\rho}_\sigma w^{\beta\sigma}.               (T4)

The above equation could, for instance, be the transformation of a state in Fock space describing two free electrons.

Transformation of general fields

A general noninteracting multi-particle state (Fock space state) in quantum field theory transforms according to the rule[23]

U(\Lambda ,a)\Psi_{p_1\sigma_1 n_1;p_2\sigma_2 n_2\cdots} = e^{-ia_\mu((\Lambda p_1)^\mu + (\Lambda p_2)^\mu + \cdots)}
\sqrt{\frac{(\Lambda p_1)^0(\Lambda p_2)^0\cdots}{p_1^0p_2^0\cdots}}\sum_{\sigma_1'\sigma_2'\cdots}
D_{\sigma_1'\sigma_1}^{(j_1)}(W(\Lambda, p_1))D_{\sigma_2'\sigma_2}^{(j_2)}(W(\Lambda, p_2))\cdots
\Psi_{\Lambda p_1\sigma_1' n_1;\Lambda p_2\sigma_2' n_2\cdots},               (1)

where W(Λ, p) is the Wigner rotation and D^{(j)} is the (2j + 1)-dimensional representation of SO(3).

Footnotes

  1. One can imagine that in each inertial frame there are observers positioned throughout space, each endowed with a synchronized clock and at rest in the particular inertial frame. These observers then report to a central office, where a report is collected. When one speaks of a particular observer, one refers to someone having, at least in principle, a copy of this report. See, e.g., Sard (1970).
  2. It should be noted that the separate requirements of the three equations lead to three different groups. The second equation is satisfied for spacetime translations in addition to Lorentz transformations leading to the Poincare group or the inhomogeneous Lorentz group. The first equation (or the second restricted to lightlike separation) leads to a yet larger group, the conformal group of spacetime.
  3. The groups O(3, 1) and O(1, 3) are isomorphic. It is widely believed that the choice between the two metric signatures has no physical relevance, even though some objects related to O(3, 1) and O(1, 3) respectively, e.g., the Clifford algebras corresponding to the different signatures of the bilinear form associated to the two groups, are non-isomorphic.
  4. In quantum mechanics, relativistic quantum mechanics, and quantum field theory, a different convention is used for these matrices; the right hand sides are all multiplied by a factor of the imaginary unit i = √−1.
  5. Until now the term "vector" has exclusively referred to "Euclidean vector", examples are position r, velocity v, etc. The term "vector" applies much more broadly than Euclidean vectors, row or column vectors, etc., see linear algebra and vector space for details. The generators of a Lie group also form a vector space over a field of numbers (e.g. real numbers, complex numbers), since a linear combination of the generators is also a generator. They just live in a different space to the position vectors in ordinary 3d space.
  6. In ordinary 3d position space, the position vector r = xex + yey + zez is expressed as a linear combination of the Cartesian unit vectors ex, ey, ez which form a basis, and the Cartesian coordinates x, y, z are coordinates with respect to this basis.
  7. For two square matrices A and B, det(AB) = det(A)det(B)

This article is issued from Wikipedia - version of the Wednesday, May 04, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.