This article is a continuation of Introductory mathematics

It has been known since the time of Euclid that all of geometry can be derived from a handful of objects (points, lines...), a few actions on those objects, and a small number of axioms. Every field of science likewise can be reduced to a small set of objects, actions, and rules. Math itself is not a single field but rather a constellation of related fields. One way in which new fields are created is by the process of generalization.

A generalization is the formulation of general concepts from specific instances by abstracting common properties. Generalization is the process of identifying the parts of a whole, as belonging to the whole.[1]


Mathematical notation can be extremely intimidating. Wikipedia is full of articles with page after page of indecipherable text. At first glance this article might appear to be the same. I want to assure the reader that every effort has been made to simplify everything as much as possible while also providing links to articles with more in-depth information.

The following has been assembled from countless small pieces gathered from throughout the world wide web. I can't guarantee that there are no errors in it. Please report any errors or omissions on this article's talk page.



See also: Peano axioms and Hyperoperation*

The basis of all of mathematics is the "Next"* function. See Graph theory. Next(0)=1, Next(1)=2, Next(2)=3, Next(3)=4. (We might express this by saying that One differs from nothing as two differs from one.) This defines the Natural numbers (denoted \mathbb{N}_0). Natural numbers are those used for counting.

These have the convenient property of being transitive. That means that if a<b and b<c then it follows that a<c. In fact they are totally ordered. See Order theory*.

Addition (See Tutorial:arithmetic) is defined as repeatedly calling the Next function, and its inverse is subtraction. But this leads to the ability to write equations like 1-3=x for which there is no answer among natural numbers. To provide an answer mathematicians generalize to the set of all integers (denoted \mathbb{Z} because Zahlen means numbers in German) which includes negative integers.
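The idea that addition is nothing more than repeated calls to the Next function can be sketched in a few lines of Python. The function names next_number and add are hypothetical, chosen only for this illustration, and the sketch assumes both arguments are natural numbers.

```python
def next_number(n: int) -> int:
    """The successor ("Next") function on the natural numbers."""
    return n + 1


def add(a: int, b: int) -> int:
    """Add b to a by calling next_number exactly b times."""
    result = a
    for _ in range(b):
        result = next_number(result)
    return result


print(add(2, 3))  # 5
print(add(7, 0))  # 7, because zero calls to Next leave 7 unchanged
```

Subtraction would undo these calls one at a time, which is why 1-3 runs out of natural numbers before it finishes.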

The Additive identity is zero because x + 0 = x.
The absolute value or modulus of x is defined as |x| = \begin{cases} x, & \text{if } x \geq 0 \\ -x, & \text{if } x < 0. \end{cases}
The integers form a ring*: they are the ring of integers (denoted \mathcal O_\mathbb{Q}) of the field of rational numbers. Ring is defined below.
Zn is used to denote the set of integers modulo n *.
Modular arithmetic* is essentially arithmetic in the quotient ring Z/nZ (which has n elements).
An ideal* is a special subset of a ring. Ideals generalize certain subsets of the integers, such as the even numbers or the multiples of 3.

The Ulam spiral*. Black pixels = prime numbers*.

The study of integers is called Number theory.
a \mid b means a divides b.
a \nmid b means a does not divide b.
p^a \mid\mid n means p^a exactly divides n (i.e. p^a divides n but p^{a+1} does not).
A prime number is a natural number greater than 1 that can only be divided by itself and one.
If a, b, c, and d are distinct primes and x=abc and y=c^2 d then:
xy = lcm(x,y) * gcd(x,y) = abc^2 d * c
(See Tutorial:least common multiples)
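The relation between the product, the least common multiple, and the greatest common divisor can be checked numerically. The sketch below picks the primes 2, 3, 5, and 7 for a, b, c, and d; that choice is an assumption made only for illustration, and the helper lcm is defined here rather than taken from the source.

```python
from math import gcd


def lcm(x: int, y: int) -> int:
    """Least common multiple, using the identity x*y = lcm(x, y) * gcd(x, y)."""
    return x * y // gcd(x, y)


# Hypothetical choice of four distinct primes.
a, b, c, d = 2, 3, 5, 7
x = a * b * c      # 30
y = c**2 * d       # 175

print(gcd(x, y))   # 5, the shared prime c
print(lcm(x, y))   # 1050, which equals a*b*c^2*d
print(x * y == lcm(x, y) * gcd(x, y))  # True
```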

Multiplication (See Tutorial:multiplication) is defined as repeated addition, and its inverse is division. But this leads to equations like 3/2=x for which there is no answer among the integers. The solution is to generalize to the set of rational numbers (denoted \mathbb{Q}) which include fractions (See Tutorial:fractions). Any real number which isn't rational is irrational. See also p-adic number*

Rational numbers form a field. A Field is defined below.
Rational numbers form a division algebra* because every non-zero element has an inverse.
The set of all rational numbers minus zero forms a multiplicative group*.
The Multiplicative identity is one because x * 1 = x.
Division by zero is undefined and undefinable. 1/0 exists nowhere on the complex plane. It does, however, exist on the Riemann sphere (often called the extended complex plane) where it is surprisingly well behaved. See also Wheel theory* and L'Hôpital's rule.
(Addition and multiplication are fast but division is slow even for computers*.)

Exponentiation (See Tutorial:exponents) is defined as repeated multiplication, and its inverses are roots and logarithms. But this leads to multiple equations with no solutions:

Equations like \sqrt{2}=x. The solution is to generalize to the set of algebraic numbers (denoted \mathbb{A}). See also algebraic integer*. To see a proof that the square root of two is irrational see Square root of 2.
Equations like 2^{\sqrt{2}}=x The solution (because x is transcendental) is to generalize to the set of Real numbers (denoted \mathbb{R}).
Equations like \sqrt{-1}=x and e^x=-1. The solution is to generalize to the set of complex numbers (denoted \mathbb{C}) by defining i = sqrt(-1). A single complex number z=a+bi consists of a real part a and an imaginary part bi (See Tutorial:complex numbers). Imaginary numbers (denoted \mathbb{I}) often occur in equations involving change with respect to time. If friction is resistance to motion then imaginary friction would be resistance to change of motion wrt time. (In other words, imaginary friction would be mass.) In fact, in the equation for the Spacetime interval (given below), time itself is an imaginary quantity*.
Complex numbers can be used to represent and perform rotations but only in 2 dimensions. Hypercomplex numbers like quaternions (denoted \mathbb{H}), octonions (denoted \mathbb{O}), and sedenions* (denoted \mathbb{S}) are one way to generalize complex numbers to some (but not all) higher dimensions. A quaternion can be thought of as a complex number whose coefficients are themselves complex numbers.
(a + b\boldsymbol{\hat{\imath}}) + (c + d\boldsymbol{\hat{\imath}})\boldsymbol{\hat{\jmath}}  = a + b\boldsymbol{\hat{\imath}} + c\boldsymbol{\hat{\jmath}} + d\boldsymbol{\hat{\imath}\hat{\jmath}} =  a + b\boldsymbol{\hat{\imath}} + c\boldsymbol{\hat{\jmath}} + d\boldsymbol{\hat{k}}
\boldsymbol{\hat{\imath}}^2 = \boldsymbol{\hat{\jmath}}^2 = \boldsymbol{\hat{k}}^2 = \boldsymbol{\hat{\imath}} \boldsymbol{\hat{\jmath}} \boldsymbol{\hat{k}} = -1
\begin{alignat}{2}
\boldsymbol{\hat{\imath}}\boldsymbol{\hat{\jmath}} & = \boldsymbol{\hat{k}}, & \qquad \boldsymbol{\hat{\jmath}}\boldsymbol{\hat{\imath}} & = -\boldsymbol{\hat{k}}, \\
\boldsymbol{\hat{\jmath}}\boldsymbol{\hat{k}} & = \boldsymbol{\hat{\imath}}, & \boldsymbol{\hat{k}}\boldsymbol{\hat{\jmath}} & = -\boldsymbol{\hat{\imath}}, \\
\boldsymbol{\hat{k}}\boldsymbol{\hat{\imath}} & = \boldsymbol{\hat{\jmath}}, & \boldsymbol{\hat{\imath}}\boldsymbol{\hat{k}} & = -\boldsymbol{\hat{\jmath}}.
\end{alignat}
Split-complex numbers* (hyperbolic complex numbers) are similar to complex numbers except that i^2 = +1.
The Complex conjugate of the complex number z=a+bi is \overline{z}=a-bi. (Not to be confused with the dual of a vector.)
Complex numbers form a K-algebra* because complex multiplication is Bilinear*.
\sqrt{-100} * \sqrt{-100} = 10i * 10i = -100 \neq \sqrt{-100 * -100}
The complex numbers are not ordered. However the absolute value or modulus* of a complex number is:
|z| = |a + ib| = \sqrt{a^2+b^2}
There are n solutions of \sqrt[n]{z} for any nonzero complex number z.
0^0 = 1. See Empty product.
\log_b(x) = \frac{\log_a(x)}{\log_a(b)}

Tetration is defined as repeated exponentiation and its inverses are called super-root and super-logarithm.


{}^{b}a = \underbrace{a^{a^{{}^{.\,^{.\,^{.\,^a}}}}}}_{b\mbox{ copies of }a} = a\uparrow\uparrow b = \underbrace{a\uparrow (a\uparrow(\dots\uparrow a))}_{b\mbox{ copies of }a}
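Tetration as repeated exponentiation can be sketched as a short loop that builds the power tower from the top down. The function name tetration is hypothetical, and treating the empty tower (b = 0) as 1 is a convention assumed here, not something stated in the text.

```python
def tetration(a: int, b: int) -> int:
    """Return a tetrated to b: a power tower containing b copies of a."""
    result = 1  # assumed convention: the empty tower is 1
    for _ in range(b):
        # Each pass places one more copy of a underneath the tower so far.
        result = a ** result
    return result


print(tetration(2, 3))  # 16, i.e. 2**(2**2)
print(tetration(2, 4))  # 65536, i.e. 2**16
print(tetration(3, 2))  # 27, i.e. 3**3
```

The values grow extremely fast: tetration(2, 5) already has 19,729 decimal digits.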


When a quantity, like the charge of a single electron, becomes so small that it is insignificant we, quite justifiably, treat it as though it were zero. A quantity that can be treated as though it were zero, even though it very definitely is not, is called infinitesimal. If q is a finite ( q \cdot 1 ) amount of charge then using Leibniz's notation dq would be an infinitesimal ( q \cdot 1/\infty ) amount of charge. See Differential

Likewise when a quantity becomes so large that a regular finite quantity becomes insignificant then we call it infinite. We would say that the mass of the ocean is infinite ( M \cdot \infty ). But compared to the mass of the Milky Way galaxy our ocean is insignificant. So we would say the mass of the Galaxy is doubly infinite ( M \cdot \infty^2 ).

Infinite and infinitesimal quantities are examples of Hyperreal numbers (denoted {}^*\mathbb{R}). By the transfer principle, hyperreals obey the same first-order rules of arithmetic as real numbers. For example, 2 \cdot \infty is exactly twice as big as \infty. In reality, the mass of the ocean is a real number so it is hardly surprising that it behaves like one. See Epsilon numbers* and Big O notation*



From Wikipedia:Binary number

Decimal Binary
0 0
1 1
2 10
3 11
4 100
5 101
6 110
7 111
8 1000
9 1001
10 1010
11 1011
12 1100
13 1101
14 1110
15 1111

The binary numbers 1011 and 1010 are multiplied as follows:

           1 0 1 1   (A)  (11 in decimal)
         × 1 0 1 0   (B)  (10 in decimal)
           0 0 0 0   
   +     1 0 1 1     
   +   0 0 0 0
   + 1 0 1 1
   = 1 1 0 1 1 1 0
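The long multiplication above adds one shifted copy of A for every 1 bit in B. A minimal sketch of that shift-and-add procedure follows; the function name binary_multiply is hypothetical, and the inputs are assumed to be strings of 0s and 1s with no binary point.

```python
def binary_multiply(a: str, b: str) -> str:
    """Multiply two binary strings by summing shifted copies of a."""
    a_value = int(a, 2)
    total = 0
    # Walk the bits of b from least significant to most significant.
    for shift, bit in enumerate(reversed(b)):
        if bit == "1":
            total += a_value << shift  # partial product: a shifted left
    return bin(total)[2:]  # strip the "0b" prefix


print(binary_multiply("1011", "1010"))  # 1101110 (110 in decimal)
```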

Binary numbers can also be multiplied with bits after a binary point:

             1 0 1 . 1 0 1     A  (5.625 in decimal)
           × 1 1 0 . 0 1       B  (6.25  in decimal)
                 1 . 0 1 1 0 1   
   +           0 0 . 0 0 0 0     
   +         0 0 0 . 0 0 0
   +       1 0 1 1 . 0 1
   +     1 0 1 1 0 . 1
   =   1 0 0 0 1 1 . 0 0 1 0 1  (35.15625 in decimal)

From Wikipedia:Power of two

2^1 = 2
2^2 = 4
2^4 = 16
2^8 = 256
2^{16} = 65,536
2^{32} = 4,294,967,296
2^{64} = 18,446,744,073,709,551,616 (20 digits)
2^{128} = 340,282,366,920,938,463,463,374,607,431,768,211,456 (39 digits)
2^{256} = 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936 (78 digits)

Our universe is tiny. Starting with only 2 people and doubling the population every 100 years will in only 27,000 years result in enough people to completely fill the observable universe.



[-2,5[ or [-2,5) denotes the interval from -2 to 5, including -2 but excluding 5.
[3..7] denotes all integers from 3 to 7.
The set of all reals is unbounded at both ends.
An open interval does not include its endpoints.
Compactness* is a property that generalizes the notion of a subset being closed and bounded.
The unit interval* is the closed interval [0,1]. It is often denoted I.
The unit square* is a square whose sides have length 1.
Often, "the" unit square refers specifically to the square in the Cartesian plane with corners at the four points (0, 0), (1, 0), (0, 1), and (1, 1).
The unit disk* in the complex plane is the set of all complex numbers of absolute value less than one and is often denoted  \mathbb {D}



See also: Algebraic geometry*, Algebraic variety*, Scheme*, Algebraic manifold*, and Linear algebra

The one dimensional number line can be generalized to a multidimensional Cartesian coordinate system thereby creating multidimensional math (i.e. geometry). See also Curvilinear coordinates*

For sets A and B, the Cartesian product A × B is the set of all ordered pairs (a, b) where a ∈ A and b ∈ B.[2] The direct product* generalizes the Cartesian product. (See also Direct sum*)

\mathbb{R}^3 is the Cartesian product \mathbb{R} \times \mathbb{R} \times \mathbb{R}.
\mathbb{R}^\infty = \mathbb{R}^\mathbb{N}
\mathbb{C}^3 is the Cartesian product \mathbb{C} \times \mathbb{C} \times \mathbb{C} (See Complexification*)

A vector space is a coordinate space with vector addition and scalar multiplication (multiplication of a vector and a scalar belonging to a field).

i, j, and k are basis vectors
a = a_x i + a_y j + a_z k

If {\mathbf e_1} , {\mathbf e_2} , {\mathbf e_3} are orthogonal unit basis vectors*
and {\mathbf u} , {\mathbf v} , {\mathbf x} are arbitrary vectors then we can (and usually do) write:
\mathbf{u} = u_1 \mathbf{e_1} + u_2 \mathbf{e_2} + u_3 \mathbf{e_3} = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}
\mathbf{v} = v_1 \mathbf{e_1} + v_2 \mathbf{e_2} + v_3 \mathbf{e_3} = \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix}
\mathbf{x} = x_1 \mathbf{e_1} + x_2 \mathbf{e_2} + x_3 \mathbf{e_3} = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}
See also: Linear independence
A module* generalizes a vector space by allowing multiplication of a vector and a scalar belonging to a ring.

Coordinate systems define the length of vectors parallel to one of the axes but leave all other lengths undefined. This concept of "length" which only works for certain vectors is generalized as the "norm" which works for all vectors. The norm of vector \mathbf{v} is denoted \|\mathbf{v}\|. The double bars are used to avoid confusion with the absolute value of the function.

Taxicab metric (called L1 norm. See Lp space*. Sometimes called Lebesgue spaces. See also Lebesgue measure.)
\|\mathbf{v}\| = |v_1| + |v_2| + |v_3|

c² = (a+b)² - 4ab/2
c² = a² + b²

In Euclidean space the norm (called L2 norm) doesn't depend on the choice of coordinate system. As a result, rigid objects can rotate in Euclidean space. See proof of the Pythagorean theorem to the right. L2 is the only Hilbert space* among Lp spaces.
\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}
In Minkowski space (See Pseudo-Euclidean space*) the Spacetime interval is
\|s\| = \sqrt{x^2 + y^2 + z^2 + (cti)^2}
In complex space* the most common norm of an n dimensional vector is obtained by treating it as though it were a regular real valued 2n dimensional vector in Euclidean space
\left\| \boldsymbol{z} \right\| = \sqrt{z_1 \bar z_1 + \cdots + z_n \bar z_n}
A Banach space* is a normed vector space* that is also a complete metric space (there are no points missing from it).
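The norms listed above can be compared side by side. The sketch below uses plain Python lists as vectors; the function names are hypothetical, and the taxicab norm is computed with absolute values of the components, which is its usual definition.

```python
import math


def taxicab_norm(v):
    """L1 norm: sum of the absolute values of the components."""
    return sum(abs(component) for component in v)


def euclidean_norm(v):
    """L2 norm: square root of the sum of squared components."""
    return math.sqrt(sum(component ** 2 for component in v))


def complex_norm(z):
    """Norm of a complex vector: sqrt of the sum of z_k times its conjugate."""
    # z_k * conjugate(z_k) is real and non-negative, so .real loses nothing.
    return math.sqrt(sum((zk * zk.conjugate()).real for zk in z))


v = [3.0, -4.0, 0.0]
print(taxicab_norm(v))             # 7.0
print(euclidean_norm(v))           # 5.0
print(complex_norm([3 + 4j, 0j]))  # 5.0
```

The last line illustrates the remark about complex space: the one-dimensional complex vector (3 + 4i) has the same norm as the real two-dimensional vector (3, 4).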

Tangent bundle of a circle

A manifold \mathbf{M} is a type of topological space in which each point has a neighbourhood that is homeomorphic to Euclidean space. A manifold is locally, but not necessarily globally, Euclidean. A Riemannian metric* on a manifold allows distances and angles to be measured.

A Tangent space* \mathbf{T}_p \mathbf{M} is the set of all vectors tangent to \mathbf{M} at point p.
Informally, a tangent bundle* \mathbf{TM} (red cylinder in image to the right) on a differentiable manifold \mathbf{M} (blue circle) is obtained by joining all the tangent spaces* (red lines) together in a smooth and non-overlapping manner.[3] The tangent bundle always has twice as many dimensions as the original manifold.
A vector bundle* is the same thing minus the requirement that it be tangent.
A fiber bundle* is the same thing minus the requirement that the fibers be vector spaces.
The cotangent bundle* (Dual bundle*) of a differentiable manifold is obtained by joining all the cotangent spaces* (pseudovector spaces).
The cotangent bundle always has twice as many dimensions as the original manifold.
Sections of that bundle are known as differential one-forms.

The circle of center 0 and radius 1 in the complex plane* is a Lie group with complex multiplication.

A Lie group* is a group that is also a finite-dimensional real smooth manifold, in which the group operations of multiplication and inversion are smooth maps.[4] n×n invertible matrices* (See below) are a Lie group.

A Lie algebra* (See Infinitesimal transformation*) is a local or linearized version of a Lie group.
The Lie derivative generalizes the Lie bracket which generalizes the wedge product which is a generalization of the cross product which only works in 3 dimensions.


Multiplication of vectors

Multiplication can be generalized to allow for multiplication of vectors in 3 different ways:

Dot product

Dot product (a Scalar): 

\mathbf{u} \cdot \mathbf{v} = \| \mathbf{u} \|\ \| \mathbf{v}\| \cos(\theta) = u_1 v_1 + u_2 v_2 + u_3 v_3

\mathbf{u}\cdot\mathbf{v} = \begin{bmatrix} u_1 \mathbf{e_1} \\ u_2 \mathbf{e_2} \\ u_3 \mathbf{e_3} \end{bmatrix} \begin{bmatrix} v_1 \mathbf{e_1} & v_2 \mathbf{e_2} & v_3 \mathbf{e_3} \end{bmatrix} = \begin{bmatrix} u_1 v_1 + u_2 v_2 + u_3 v_3 \end{bmatrix}

Strangely, only parallel components multiply.
The dot product can be generalized to the bilinear form \beta(\mathbf{u,v}) = u^T Av = scalar where A is a (0,2) tensor. (For the dot product in Euclidean space A is the identity tensor. But in Minkowski space A is the Minkowski metric*).
Two vectors are orthogonal if \beta(\mathbf{u,v}) = 0.
A bilinear form is symmetric if \beta(\mathbf{u,v}) = \beta(\mathbf{v,u}).
Its associated quadratic form* is Q(\mathbf{x}) = \beta(\mathbf{x,x}).
In Euclidean space \|\mathbf{v}\|^2 = \mathbf{v}\cdot\mathbf{v}= Q(\mathbf{v}).
The inner product is a generalization of the dot product to complex vector space. \langle u,v\rangle=u\cdot \bar{v}=\langle v \mid u\rangle
The 2 vectors are called "bra" and "ket"*.
A Hilbert space* is an inner product space that is also a Complete metric space.
The inner product can be generalized to a sesquilinear form.
A complex Hermitian form (also called a symmetric sesquilinear form), is a sesquilinear form h : V × V → C such that[5] h(w,z) = \overline{h(z, w)}.
A is a Hermitian operator* iff \langle v \mid A u\rangle = \langle A v \mid u\rangle. Often written as \langle v \mid A \mid u\rangle.
The curl operator, \nabla\times is Hermitian.
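The bilinear form \beta(u,v) = u^T A v described above reduces to the ordinary dot product when A is the identity and to the Minkowski product when A is the Minkowski metric. The sketch below assumes the sign convention (+, +, +, -) with components ordered (x, y, z, ct), matching the spacetime interval given earlier; the function name bilinear_form and the sample vectors are hypothetical.

```python
def bilinear_form(u, A, v):
    """Return the scalar u^T A v for vectors u, v and a square matrix A."""
    return sum(
        u[i] * A[i][j] * v[j]
        for i in range(len(u))
        for j in range(len(v))
    )


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
u = [1, 2, 3]
v = [4, -5, 6]
print(bilinear_form(u, identity, v))  # 12, the ordinary dot product

# Assumed convention: signature (+, +, +, -), components (x, y, z, ct).
minkowski = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
light_ray = [3, 4, 0, 5]
print(bilinear_form(light_ray, minkowski, light_ray))  # 0
```

The second result shows a nonzero vector that is orthogonal to itself, something that cannot happen with the Euclidean dot product.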


Outer product

Outer product (a tensor called a dyadic):\mathbf{u} \otimes \mathbf{v}.

As one would expect, every component of one vector multiplies with every component of the other vector.

\mathbf{u} \otimes \mathbf{v} = \begin{bmatrix} u_1 \mathbf{e_1} \\ u_2 \mathbf{e_2} \\ u_3 \mathbf{e_3} \end{bmatrix} \begin{bmatrix} v_1 \mathbf{e_1} & v_2 \mathbf{e_2} & v_3 \mathbf{e_3} \end{bmatrix} =

\begin{bmatrix}
{\color{red} u_1 v_1 \mathbf{e_1} \otimes \mathbf{e_1} } & {\color{blue} u_1 v_2 \mathbf{e_1} \otimes \mathbf{e_2} } & {\color{blue} u_1 v_3 \mathbf{e_1} \otimes \mathbf{e_3} } \\
{\color{blue} u_2 v_1 \mathbf{e_2} \otimes \mathbf{e_1} } & {\color{red} u_2 v_2 \mathbf{e_2} \otimes \mathbf{e_2} } & {\color{blue} u_2 v_3 \mathbf{e_2} \otimes \mathbf{e_3} } \\
{\color{blue} u_3 v_1 \mathbf{e_3} \otimes \mathbf{e_1} } & {\color{blue} u_3 v_2 \mathbf{e_3} \otimes \mathbf{e_2} } & {\color{red} u_3 v_3 \mathbf{e_3} \otimes \mathbf{e_3} }
\end{bmatrix}

Taking the dot product of \mathbf{u} \otimes \mathbf{v} and any vector x (See Visualization of Tensor multiplication) causes the components of x not pointing in the direction of v to become zero. What remains is then rotated from v to u.
A rotation matrix can be constructed by summing three outer products. The first two sum to form a bivector. The third one rotates the axis of rotation zero degrees. \mathbf{e}_1 \otimes \mathbf{e}_2 - \mathbf{e}_2 \otimes \mathbf{e}_1 + \mathbf{e}_3 \otimes \mathbf{e}_3
\mathbf{e}_1 \otimes \mathbf{e}_2 \cdot \mathbf{e}_2 = \mathbf{e}_1
The Tensor product generalizes the outer product.
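The behaviour of the outer product described above can be checked with a small sketch. Vectors are plain Python lists, the helper names outer and apply are hypothetical, and applying the dyadic to a vector is implemented as ordinary matrix-times-vector multiplication.

```python
def outer(u, v):
    """Outer product: every component of u times every component of v."""
    return [[ui * vj for vj in v] for ui in u]


def apply(T, x):
    """Dot a second-order tensor (nested lists) with a vector."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]


e1, e2 = [1, 0, 0], [0, 1, 0]
print(outer(e1, e2))             # [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
print(apply(outer(e1, e2), e2))  # [1, 0, 0], so e2 is carried onto e1

u, v, x = [1, 2, 3], [0, 1, 0], [5, 7, 9]
print(apply(outer(u, v), x))     # [7, 14, 21], which is u times (v . x)
```

Only the part of x along v (here the component 7) survives, and the result points along u.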


Wedge product


A unit vector and a unit bivector are shown in red

Wedge product (a simple bivector): \mathbf{u} \wedge \mathbf{v} = \mathbf{u} \otimes \mathbf{v} - \mathbf{v} \otimes \mathbf{u} = [\overline{\mathbf{u}}, \overline{\mathbf{v}}]

The wedge product is also called the exterior product (sometimes mistakenly called the outer product).
The term "exterior" comes from the exterior product of two vectors not being a vector.
Just as a vector has length and direction so a bivector has an area and an orientation.
In three dimensions \mathbf{u} \wedge \mathbf{v} is a pseudovector and its dual is the cross product. \overline{\mathbf{u} \wedge \mathbf{v}} = \mathbf{u} \times \mathbf{v}

\mathbf{a \wedge b \wedge c = a \otimes b \otimes c - a \otimes c \otimes b + c \otimes a \otimes b - c \otimes b \otimes a + b \otimes c \otimes a - b \otimes a \otimes c}

The magnitude of a∧b∧c equals the volume of the parallelepiped.

The triple product a∧b∧c is a trivector which is a 3rd degree tensor.
In 3 dimensions a trivector is a pseudoscalar so in 3 dimensions every trivector can be represented as a scalar times the unit trivector. See Levi-Civita symbol
\mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c} = \left( \mathbf{a}\cdot(\mathbf{b}\times \mathbf{c}) \right) \, \mathbf{e}_1 \wedge \mathbf{e}_2 \wedge \mathbf{e}_3
The Matrix commutator generalizes the wedge product.
[A_1, A_2] = A_1A_2 - A_2A_1
The dual of vector a is bivector ā:
\overline{\mathbf{a}} \quad\stackrel{\rm def}{=} \quad\begin{bmatrix}\,\,0&\!-a_3&\,\,\,a_2\\\,\,\,a_3&0&\!-a_1\\\!-a_2&\,\,a_1&\,\,0\end{bmatrix}
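The wedge product defined above as u ⊗ v - v ⊗ u can be computed directly, and in three dimensions its three independent components match those of the cross product. In the sketch below the helper names are hypothetical, and the order in which the components are read off the antisymmetric matrix is a convention assumed for this illustration.

```python
def outer(u, v):
    """Outer product of two vectors as a nested list."""
    return [[ui * vj for vj in v] for ui in u]


def wedge(u, v):
    """Wedge product u ^ v = u (x) v - v (x) u, an antisymmetric matrix."""
    uv, vu = outer(u, v), outer(v, u)
    n = len(u)
    return [[uv[i][j] - vu[i][j] for j in range(n)] for i in range(n)]


def cross(u, v):
    """Ordinary cross product in three dimensions."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


u, v = [1, 2, 3], [4, 5, 6]
W = wedge(u, v)
print(W)                            # [[0, -3, -6], [3, 0, -3], [6, 3, 0]]
print([W[1][2], W[2][0], W[0][1]])  # [-3, 6, -3]
print(cross(u, v))                  # [-3, 6, -3]
```

Swapping the arguments flips every sign, and wedge(u, u) is the zero matrix, which is the antisymmetry the text describes.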



Tensor components explained

Multiplying a tensor and a vector results in a new vector that can not only have a different magnitude but can even point in a completely different direction:

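\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \cdot \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a_1 x + a_2 y + a_3 z \\ b_1 x + b_2 y + b_3 z \\ c_1 x + c_2 y + c_3 z \end{bmatrix}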

Some special cases:

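\begin{bmatrix} {\color{green}a_1} & a_2 & a_3 \\ {\color{green}b_1} & b_2 & b_3 \\ {\color{green}c_1} & c_2 & c_3 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}

\begin{bmatrix} a_1 & {\color{green}a_2} & a_3 \\ b_1 & {\color{green}b_2} & b_3 \\ c_1 & {\color{green}c_2} & c_3 \end{bmatrix} \cdot \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix}

\begin{bmatrix} a_1 & a_2 & {\color{green}a_3} \\ b_1 & b_2 & {\color{green}b_3} \\ c_1 & c_2 & {\color{green}c_3} \end{bmatrix} \cdot \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix}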

One can also multiply a tensor with another tensor. Each column of the second tensor is transformed exactly as a vector would be.

\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}

And we can also switch things around:

\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}

This is called a Permutation matrix*. See also Permutation group*.
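Multiplying a tensor by a vector, and by another tensor one column at a time, can be sketched with nested lists. The helper names mat_vec and mat_mul and the sample matrix M are hypothetical; placing the permutation matrix on the right, so that it swaps columns, is a choice made for this illustration.

```python
def mat_vec(M, x):
    """Multiply a matrix (nested lists) by a vector."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]


def mat_mul(A, B):
    """Multiply A by B by transforming each column of B as a vector."""
    columns = [mat_vec(A, [row[j] for row in B]) for j in range(len(B[0]))]
    # Reassemble the transformed columns into rows.
    return [[column[i] for column in columns] for i in range(len(A))]


M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
swap_last_two = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]

print(mat_vec(M, [1, 0, 0]))      # [1, 4, 7], the first column of M
print(mat_mul(M, identity) == M)  # True
print(mat_mul(M, swap_last_two))  # [[1, 3, 2], [4, 6, 5], [7, 9, 8]]
```

With the permutation matrix on the left instead, the same swap would act on the rows of M rather than its columns.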

Complex numbers can be used to represent and perform rotations but only in 2 dimensions.

Tensors, on the other hand, can be used in any number of dimensions to represent and perform rotations and other linear transformations. See the image to the right.

Any affine transformation is equivalent to a linear transformation followed by a translation of the origin. (The origin is always a fixed point for any linear transformation.) "Translation" is just a fancy word for "move".

Just as a vector is a sum of unit vectors multiplied by constants so a tensor is a sum of unit dyadics (e_1 \otimes e_2) multiplied by constants. Each dyadic is associated with a certain plane segment having a certain orientation and magnitude.

The order or degree of the tensor is the dimension of the tensor which is the total number of indices required to identify each component uniquely.[6] A vector is a 1st-order tensor.

A simple tensor is a tensor that can be written as a product of tensors of the form T=a\otimes b\otimes\cdots\otimes d. (See Outer Product above.) The rank of a tensor T is the minimum number of simple tensors that sum to T.[7] A bivector is a tensor of rank 2.

The Determinant of a square matrix is the signed area or volume of the parallelogram or parallelepiped spanned by its column vectors and is frequently useful.


\begin{pmatrix}0&1&0&0\\0&0&1&0\\0&0&0&1\\0&0&0&0\end{pmatrix}, \quad

\begin{pmatrix}0&0&1&0\\0&0&0&1\\0&0&0&0\\0&0&0&0\end{pmatrix}, \quad

\begin{pmatrix}0&0&0&1\\0&0&0&0\\0&0&0&0\\0&0&0&0\end{pmatrix}, \quad


From Wikipedia:Matrix similarity

In linear algebra, two n-by-n matrices A and B are called similar if

B = P^{-1} A P

for some invertible n-by-n matrix P. Similar matrices represent the same linear operator* under two (possibly) different bases*, with P being the change of basis* matrix.[8][9]

A transformation A \mapsto P^{-1}AP is called a similarity transformation or conjugation of the matrix A. In the general linear group*, similarity is therefore the same as conjugacy*, and similar matrices are also called conjugate; however in a given subgroup H of the general linear group, the notion of conjugacy may be more restrictive than similarity, since it requires that P be chosen to lie in H.

Decomposition of tensors

Every tensor of degree 2 can be decomposed into a symmetric and an anti-symmetric tensor

\text{Symmetric tensor} = \begin{bmatrix} x & a & b \\ a & y & c \\ b & c & z \end{bmatrix}

\text{Anti-symmetric tensor} = \begin{bmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{bmatrix}

The Outer product (tensor product) of a vector with itself is a symmetric tensor:

\begin{bmatrix} x & y & z \end{bmatrix} \otimes \begin{bmatrix} x & y & z \end{bmatrix} = \begin{bmatrix} xx & xy & xz \\ yx & yy & yz \\ zx & zy & zz \end{bmatrix}

The wedge product of 2 vectors is anti-symmetric:

\mathbf{u} \wedge \mathbf{v} = \mathbf{u} \otimes \mathbf{v} - \mathbf{v} \otimes \mathbf{u} =

\begin{bmatrix}
0 & u_1 v_2 - v_1 u_2 & u_1 v_3 - v_1 u_3  \\
u_2 v_1 - v_2 u_1 & 0 & u_2 v_3 - v_2 u_3 \\
u_3 v_1 - v_3 u_1 & u_3 v_2 - v_3 u_2 & 0
\end{bmatrix}
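The decomposition of a degree 2 tensor into symmetric and anti-symmetric parts can be carried out by averaging the tensor with its transpose. This is a minimal sketch using nested lists; the function name decompose and the sample tensor T are hypothetical.

```python
def decompose(T):
    """Split a square matrix into its symmetric and anti-symmetric parts."""
    n = len(T)
    symmetric = [[(T[i][j] + T[j][i]) / 2 for j in range(n)] for i in range(n)]
    antisymmetric = [[(T[i][j] - T[j][i]) / 2 for j in range(n)] for i in range(n)]
    return symmetric, antisymmetric


T = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sym, anti = decompose(T)
print(sym)   # [[1.0, 3.0, 5.0], [3.0, 5.0, 7.0], [5.0, 7.0, 9.0]]
print(anti)  # [[0.0, -1.0, -2.0], [1.0, 0.0, -1.0], [2.0, 1.0, 0.0]]
```

Adding the two parts entry by entry recovers T, and the anti-symmetric part has the same zero-diagonal pattern as the wedge product.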

Any n\times n matrix X with complex entries can be expressed as

X = A + N \,

where
  • A is diagonalizable
  • N is nilpotent
  • A commutes with N (i.e. AN = NA)

This is the Jordan–Chevalley decomposition*.

Block matrix

From Wikipedia:Block matrix

The matrix

\mathbf{P} = \begin{bmatrix}
1 & 1 & 2 & 2\\
1 & 1 & 2 & 2\\
3 & 3 & 4 & 4\\
3 & 3 & 4 & 4\end{bmatrix}

can be partitioned into four 2×2 blocks

\mathbf{P}_{11} = \begin{bmatrix}
1 & 1 \\
1 & 1 \end{bmatrix},   \mathbf{P}_{12} = \begin{bmatrix}
2 & 2\\
2 & 2\end{bmatrix},  \mathbf{P}_{21} = \begin{bmatrix}
3 & 3 \\
3 & 3 \end{bmatrix},   \mathbf{P}_{22} = \begin{bmatrix}
4 & 4\\
4 & 4\end{bmatrix}.

The partitioned matrix can then be written as

\mathbf{P} = \begin{bmatrix}
\mathbf{P}_{11} & \mathbf{P}_{12}\\
\mathbf{P}_{21} & \mathbf{P}_{22}\end{bmatrix}.

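Given an (m\times p) matrix \mathbf{A} with q row partitions and s column partitions, and a (p\times n) matrix \mathbf{B} with s row partitions and r column partitions that are compatible with the partitions of \mathbf{A}, the matrix product

\mathbf{C} = \mathbf{A}\mathbf{B}

can be formed blockwise, yielding \mathbf{C} as an (m\times n) matrix with q row partitions and r column partitions. The matrices in the resulting matrix \mathbf{C} are calculated by multiplying: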

\mathbf{C}_{\alpha \beta} = \sum^s_{\gamma=1}\mathbf{A}_{\alpha \gamma}\mathbf{B}_{\gamma \beta}.

Or, using the Einstein notation* that implicitly sums over repeated indices:

\mathbf{C}_{\alpha \beta} = \mathbf{A}_{\alpha \gamma}\mathbf{B}_{\gamma \beta}.
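The blockwise formula can be tried on the partitioned matrix P from the example above by multiplying P with itself. The sketch assumes every block is a nested list and that the partitions are compatible; the helper names mat_mul, mat_add, and block_mul are hypothetical.

```python
def mat_mul(A, B):
    """Ordinary matrix product of two nested-list matrices."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def block_mul(A_blocks, B_blocks):
    """C[alpha][beta] = sum over gamma of A[alpha][gamma] times B[gamma][beta]."""
    q, s, r = len(A_blocks), len(B_blocks), len(B_blocks[0])
    C = []
    for alpha in range(q):
        row = []
        for beta in range(r):
            total = mat_mul(A_blocks[alpha][0], B_blocks[0][beta])
            for gamma in range(1, s):
                total = mat_add(total, mat_mul(A_blocks[alpha][gamma], B_blocks[gamma][beta]))
            row.append(total)
        C.append(row)
    return C


P11, P12 = [[1, 1], [1, 1]], [[2, 2], [2, 2]]
P21, P22 = [[3, 3], [3, 3]], [[4, 4], [4, 4]]
P_blocks = [[P11, P12], [P21, P22]]

C = block_mul(P_blocks, P_blocks)
print(C[0][0])  # [[14, 14], [14, 14]]
print(C[1][1])  # [[44, 44], [44, 44]]
```

The same entries appear when the full 4×4 matrix P is multiplied by itself in the ordinary way.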


Linear groups

A square matrix of order n is an n-by-n matrix. Any two square matrices of the same order can be added and multiplied. A matrix is invertible if and only if its determinant is nonzero.

GLn(F) or GL(n, F), or simply GL(n) is the Lie group* of n×n invertible matrices with entries from the field F. The group GL(n, F) and its subgroups are often called linear groups or matrix groups.

SL(n, F) or SLn(F), is the subgroup* of GL(n, F) consisting of matrices with a determinant of 1.
U(n), the Unitary group of degree n is the group of n × n unitary matrices. (More general unitary matrices may have complex determinants with absolute value 1, rather than real 1 in the special case.) The group operation is matrix multiplication.[10]
SU(n), the special unitary group of degree n, is the Lie group* of n×n unitary matrices with determinant 1.


Symmetry groups

Affine group*

Poincaré group*: boosts, rotations, translations
Lorentz group*: boosts, rotations
The set of all boosts, however, does not form a subgroup, since composing two boosts does not, in general, result in another boost. (Rather, a pair of non-colinear boosts is equivalent to a boost and a rotation, and this relates to Thomas rotation.)

Aff(n,K): the affine group or general affine group of any affine space over a field K is the group of all invertible affine transformations from the space into itself.

E(n): rotations, reflections, and translations.
O(n): rotations, reflections
SO(n): rotations
so(3) is the Lie algebra of SO(3) and consists of all skew-symmetric 3 × 3 matrices.

Clifford group: The set of invertible elements x such that for all v in V x v \alpha(x)^{-1}\in V . The spinor norm* Q is defined on the Clifford group by Q(x) = x^\mathrm{t}x.

PinV(K): The subgroup of elements of spinor norm 1. Maps 2-to-1 to the orthogonal group
SpinV(K): The subgroup of elements of Dickson invariant 0 in PinV(K). When the characteristic is not 2, these are the elements of determinant 1. Maps 2-to-1 to the special orthogonal group. Elements of the spin group act as linear transformations on the space of spinors



In 4 spatial dimensions a rigid object can rotate in 2 different ways simultaneously*.


Stereographic projection of four-dimensional Tesseract in double rotation

See also: Hypersphere of rotations*, Rotation group SO(3)*, Special unitary group*, Plate trick*, Spin representation*, Spin group*, Pin group*, Spinor*, Clifford algebra, Indefinite orthogonal group*, Root system*, Bivectors, Curl

From Wikipedia:Rotation group SO(3):

Consider the solid ball in R3 of radius π. For every point in this ball there is a rotation, with axis through the point and the origin, and rotation angle equal to the distance of the point from the origin. The two rotations through π and through −π are the same. So we identify* (or "glue together") antipodal points* on the surface of the ball.

The ball with antipodal surface points identified is a smooth manifold*, and this manifold is diffeomorphic* to the rotation group. It is also diffeomorphic to the real 3-dimensional projective space* RP3, so the latter can also serve as a topological model for the rotation group.

These identifications illustrate that SO(3) is connected* but not simply connected*. As to the latter, consider the path running from the "north pole" straight through the interior down to the south pole. This is a closed loop, since the north pole and the south pole are identified. This loop cannot be shrunk to a point, since no matter how you deform the loop, the start and end point have to remain antipodal, or else the loop will "break open".

Belt Trick

A set of belts can be continuously rotated without becoming twisted or tangled. The cube must go through two full rotations for the system to return to its initial state. See Tangloids*.

Surprisingly, if you run through the path twice, i.e., run from north pole down to south pole, jump back to the north pole (using the fact that north and south poles are identified), and then again run from north pole down to south pole, so that φ runs from 0 to 4π, you get a closed loop which can be shrunk to a single point: first move the paths continuously to the ball's surface, still connecting north pole to south pole twice. The second half of the path can then be mirrored over to the antipodal side without changing the path at all. Now we have an ordinary closed loop on the surface of the ball, connecting the north pole to itself along a great circle. This circle can be shrunk to the north pole without problems. The Balinese plate trick* and similar tricks demonstrate this practically.

The same argument can be performed in general, and it shows that the fundamental group* of SO(3) is the cyclic group of order 2. In physics applications, the non-triviality of the fundamental group allows for the existence of objects known as spinors*, and is an important tool in the development of the spin-statistics theorem*.

Spin group

The universal cover* of SO(3) is a Lie group* called Spin(3)*. The group Spin(3) is isomorphic to the special unitary group* SU(2); it is also diffeomorphic to the unit 3-sphere* S3 and can be understood as the group of versors* (quaternions with absolute value 1). The connection between quaternions and rotations, commonly exploited in computer graphics, is explained in quaternions and spatial rotation*. The map from S3 onto SO(3) that identifies antipodal points of S3 is a surjective* homomorphism* of Lie groups, with kernel* {±1}. Topologically, this map is a two-to-one covering map*. (See the plate trick*.)
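The two-to-one map from unit quaternions onto rotations can be seen numerically: a versor q and its negative -q rotate every vector the same way. The sketch below stores a quaternion as a tuple (w, x, y, z) and rotates a vector v by forming q v q^-1; the function names and the 90 degree rotation about the z axis are assumptions made for illustration.

```python
import math


def q_mul(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def q_conj(q):
    """Conjugate, which is also the inverse of a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q via q v q^-1."""
    w, x, y, z = q_mul(q_mul(q, (0.0, *v)), q_conj(q))
    return (x, y, z)


# Versor for a 90 degree rotation about the z axis (half-angle form).
half_angle = math.radians(90) / 2
q = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
minus_q = tuple(-component for component in q)

rotated = rotate(q, (1.0, 0.0, 0.0))
rotated_minus = rotate(minus_q, (1.0, 0.0, 0.0))
print(rotated)        # approximately (0, 1, 0), up to rounding
print(rotated_minus)  # the same rotation, although -q is a different versor
```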

From Wikipedia:Spin group:

The spin group Spin(n)[11][12] is the double cover* of the special orthogonal group* SO(n) = SO(n, R), such that there exists a short exact sequence* of Lie groups* (with n ≠ 2)

1 \to \mathrm{Z}_2 \to \operatorname{Spin}(n) \to \operatorname{SO}(n) \to 1.

As a Lie group, Spin(n) therefore shares its dimension*, n(n − 1)/2, and its Lie algebra* with the special orthogonal group.

For n > 2, Spin(n) is simply connected* and so coincides with the universal cover* of SO(n)*.

The non-trivial element of the kernel is denoted −1, which should not be confused with the orthogonal transform of reflection through the origin*, generally denoted −I .

Spin(n) can be constructed as a subgroup* of the invertible elements in the Clifford algebra Cl(n). A distinct article discusses the spin representations*.


Matrix representations

See also: Group representation*

Real numbers

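If a vector is multiplied by the identity matrix* I then the vector is completely unchanged:

 I \cdot v = \begin{bmatrix}
    1 & 0 & 0 \\
    0 & 1 & 0 \\
    0 & 0 & 1 \\
  \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 1 \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}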

And if A=a \cdot I then

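 A \cdot v = \begin{bmatrix}
    a & 0 & 0 \\
    0 & a & 0 \\
    0 & 0 & a \\
  \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = a \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix}
    a \cdot x \\
    a \cdot y \\
    a \cdot z \\
  \end{bmatrix}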

Therefore A=a \cdot I can be thought of as the matrix form of the scalar a.

 A \cdot B = \begin{bmatrix}
    a & 0 & 0 \\
    0 & a & 0 \\
    0 & 0 & a \\
  \end{bmatrix} \begin{bmatrix}
    b & 0 & 0 \\
    0 & b & 0 \\
    0 & 0 & b \\
  \end{bmatrix} = \begin{bmatrix}
    ab & 0 & 0 \\
    0 & ab & 0 \\
    0 & 0 & ab \\
  \end{bmatrix}

 A + B = \begin{bmatrix}
    a & 0 & 0 \\
    0 & a & 0 \\
    0 & 0 & a \\
  \end{bmatrix} + \begin{bmatrix}
    b & 0 & 0 \\
    0 & b & 0 \\
    0 & 0 & b \\
  \end{bmatrix} = \begin{bmatrix}
    a + b & 0 & 0 \\
    0 & a + b & 0 \\
    0 & 0 & a + b \\
  \end{bmatrix}

 A^B = \begin{bmatrix}
    a^b & 0 & 0 \\
    0 & a^b & 0 \\
    0 & 0 & a^b \\
  \end{bmatrix}

 e^A = \begin{bmatrix}
    e^a & 0 & 0 \\
    0 & e^a & 0 \\
    0 & 0 & e^a \\
  \end{bmatrix} .

 \ln A = \begin{bmatrix}
    \ln a & 0 & 0 \\
    0 & \ln a & 0 \\
    0 & 0 & \ln a \\
  \end{bmatrix} .

(Note: Not all matrices have a logarithm and those matrices that do have a logarithm may have more than one logarithm. The study of logarithms of matrices leads to Lie theory since when a matrix has a logarithm then it is in a Lie group and the logarithm is the corresponding element of the vector space of the Lie algebra.)
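The correspondence between a scalar a and the matrix a \cdot I can be checked with a few lines of Python. This is only a minimal sketch using plain lists of rows; the helper names scalar_matrix, matmul and matadd are illustrative assumptions, not part of any library.

```python
def scalar_matrix(a, n=3):
    """Return the n x n matrix a*I as a list of rows."""
    return [[a if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Row-by-column product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def matadd(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


A, B = scalar_matrix(2), scalar_matrix(5)
v = [[1], [2], [3]]              # column vector (x, y, z)

scaled = matmul(A, v)            # A . v scales every component by a = 2
product = matmul(A, B)           # behaves like the scalar product 2 * 5
total = matadd(A, B)             # behaves like the scalar sum 2 + 5
```

Here product and total are again scalar matrices, which is the sense in which A = a \cdot I stands in for the number a.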


Complex numbers

Complex numbers can also be written in matrix form in such a way that complex multiplication corresponds perfectly to matrix multiplication:

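(a+ib)(c+id) = \begin{bmatrix}
     a  & b \\
    -b  & a
\end{bmatrix} \begin{bmatrix}
     c  & d \\
    -d  & c
\end{bmatrix} = \begin{bmatrix}
     ac-bd  & ad+bc \\
    -(ad+bc)  & ac-bd
\end{bmatrix}

(i)(i) = \begin{bmatrix}
     0  & 1 \\
    -1  & 0
\end{bmatrix} \begin{bmatrix}
     0  & 1 \\
    -1  & 0
\end{bmatrix} = \begin{bmatrix}
    -1  & 0 \\
     0  & -1
\end{bmatrix} = -I

 |z|^2 = \begin{vmatrix}
  a & -b  \\
  b &  a
\end{vmatrix} = a^2 + b^2.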
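As a quick numerical check of the claim that complex multiplication corresponds to matrix multiplication, the following sketch maps Python's built-in complex numbers to 2 × 2 matrices using the convention a + ib ↦ [[a, b], [−b, a]] from the product formula above. The function names are assumptions chosen for illustration.

```python
def to_matrix(z):
    """Map a + ib to [[a, b], [-b, a]]."""
    return [[z.real, z.imag], [-z.imag, z.real]]


def matmul2(A, B):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


z, w = complex(1, 2), complex(3, -4)
lhs = matmul2(to_matrix(z), to_matrix(w))     # multiply the matrices
rhs = to_matrix(z * w)                        # multiply first, then convert
i_squared = matmul2(to_matrix(1j), to_matrix(1j))
norm_squared = det2(to_matrix(z))             # a^2 + b^2
```

Both routes give the same matrix, the matrix for i squares to −I, and the determinant reproduces |z|^2.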



There are at least two ways of representing quaternions as matrices in such a way that quaternion addition and multiplication correspond to matrix addition and matrix multiplication.

Using 2 × 2 complex matrices, the quaternion a + bi + cj + dk can be represented as

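\begin{bmatrix}
z_1 & z_2 \\
-\overline{z_2} & \overline{z_1}
\end{bmatrix} = \begin{bmatrix}
a+bi & c+di \\
-(c-di) & a-bi
\end{bmatrix} = a \begin{bmatrix}
 1 & 0  \\
 0 & 1  \\
\end{bmatrix} + b \begin{bmatrix}
 i & 0  \\
 0 & -i  \\
\end{bmatrix} + c \begin{bmatrix}
 0 & 1  \\
 -1 & 0  \\
\end{bmatrix} + d \begin{bmatrix}
 0 & i  \\
 i & 0  \\
\end{bmatrix}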

Using 4 × 4 real matrices, that same quaternion can be written as

\begin{bmatrix}
 a & b & c & d \\
 -b & a & -d & c \\
 -c & d & a & -b \\
 -d & -c & b & a
\end{bmatrix} = a \begin{bmatrix}
 1 & 0 & 0 & 0 \\
 0 & 1 & 0 & 0 \\
 0 & 0 & 1 & 0 \\
 0 & 0 & 0 & 1
\end{bmatrix} + b \begin{bmatrix}
 0 & 1 & 0 & 0 \\
 -1 & 0 & 0 & 0 \\
 0 & 0 & 0 & -1 \\
 0 & 0 & 1 & 0
\end{bmatrix} + c \begin{bmatrix}
 0 & 0 & 1 & 0 \\
 0 & 0 & 0 & 1 \\
 -1 & 0 & 0 & 0 \\
 0 & -1 & 0 & 0
\end{bmatrix} + d \begin{bmatrix}
 0 & 0 & 0 & 1 \\
 0 & 0 & -1 & 0 \\
 0 & 1 & 0 & 0 \\
 -1 & 0 & 0 & 0
\end{bmatrix}


b \cdot i &= b \cdot (e_1 \wedge e_2 + e_4 \wedge e_3)\\
c \cdot j &= c \cdot (e_1 \wedge e_3 + e_2 \wedge e_4)\\
d \cdot k &= d \cdot (e_1 \wedge e_4 + e_3 \wedge e_2)\\
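The 4 × 4 real representation can be tested against the Hamilton relations i² = j² = k² = ijk = −1. The sketch below builds the matrix of a + bi + cj + dk exactly as laid out above; quat_matrix and the constant names are hypothetical helpers introduced only for this check.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def quat_matrix(a, b, c, d):
    """4 x 4 real matrix representing the quaternion a + bi + cj + dk."""
    return [[ a,  b,  c,  d],
            [-b,  a, -d,  c],
            [-c,  d,  a, -b],
            [-d, -c,  b,  a]]


I = quat_matrix(0, 1, 0, 0)
J = quat_matrix(0, 0, 1, 0)
K = quat_matrix(0, 0, 0, 1)
MINUS_ONE = quat_matrix(-1, 0, 0, 0)

squares = [matmul(M, M) for M in (I, J, K)]   # each should be MINUS_ONE
ij = matmul(I, J)                             # should equal K
ijk = matmul(ij, K)                           # should equal MINUS_ONE
```

Because every entry is an integer, the comparisons are exact rather than approximate.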

The obvious way of representing quaternions with 3 × 3 real matrices does not work because:

b \begin{bmatrix}
 0 & -1 & 0  \\
 1 & 0 & 0  \\
 0 & 0 & 0
\end{bmatrix} \cdot c \begin{bmatrix}
 0 & 0 & -1  \\
 0 & 0 & 0  \\
 1 & 0 & 0
\end{bmatrix} \neq d \begin{bmatrix}
 0 & 0 & 0  \\
 0 & 0 & -1  \\
 0 & {\color{red}1} & 0
\end{bmatrix}



See also: Split-complex numbers*

Unfortunately the matrix representation of a vector is not so obvious. First we must decide what properties the matrix should have. To see which, consider the square (quadratic form*) of a single vector:

Q(c) = c^2 = \langle c , c \rangle = (a\mathbf{e_1} + b\mathbf{e_2})^2
Q(c) = aa\mathbf{e_1}\mathbf{e_1} + bb\mathbf{e_2}\mathbf{e_2} + ab\mathbf{e_1}\mathbf{e_2} + ba\mathbf{e_2}\mathbf{e_1}
Q(c) = a^2\mathbf{e_1}\mathbf{e_1} + b^2\mathbf{e_2}\mathbf{e_2} + ab(\mathbf{e_1}\mathbf{e_2} + \mathbf{e_2}\mathbf{e_1})

From the Pythagorean theorem we know that:

c^2 = a^2 + b^2 + ab(0) = Scalar

So we know that

e_1^2 = e_2^2 = 1
e_1 e_2 = -e_2 e_1

In 3 dimensions, the set of 3 matrices that have these properties is called the Pauli matrices.

From Wikipedia:Pauli matrices

The Pauli matrices are a set of three 2 × 2 complex matrices which are Hermitian and unitary.[13] They are

  \sigma_1 = \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}

  \sigma_2 = \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}

  \sigma_3 = \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \,.

Squaring a Pauli matrix results in a "scalar":

\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \sigma_0 = I

Multiplication is anticommutative*:

\sigma_1  \sigma_2 = - \sigma_2  \sigma_1
\sigma_2  \sigma_3 = - \sigma_3  \sigma_2
\sigma_3  \sigma_1 = - \sigma_1  \sigma_3


\sigma_1 \sigma_2 \sigma_3 = \begin{pmatrix} i&0\\0&i\end{pmatrix} = i

commutation relations:

    \left[\sigma_1, \sigma_2\right] &= 2i\sigma_3 \, \\
    \left[\sigma_2, \sigma_3\right] &= 2i\sigma_1 \, \\
    \left[\sigma_3, \sigma_1\right] &= 2i\sigma_2 \, \\
    \left[\sigma_1, \sigma_1\right] &= 0\, \\

anticommutation* relations:

  \left\{\sigma_1, \sigma_1\right\} &= 2I\, \\
  \left\{\sigma_1, \sigma_2\right\} &= 0\,.\\

Exponential of a Pauli vector:

e^{i a(\hat{n} \cdot \vec{\sigma})} = I\cos{a} + i (\hat{n} \cdot \vec{\sigma}) \sin{a}

Adding the commutator to the anticommutator gives:

(\vec{a} \cdot \vec{\sigma})(\vec{b} \cdot \vec{\sigma}) = (\vec{a} \cdot \vec{b}) \, I + i ( \vec{a} \times \vec{b} )\cdot \vec{\sigma}

If  i is identified with the pseudoscalar  \sigma_x \sigma_y \sigma_z then the right hand side becomes  a \cdot b + a \wedge b which is also the definition for the geometric product of two vectors in geometric algebra (Clifford algebra). The geometric product of two vectors is a multivector.
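The identities listed above (unit squares, anticommutation, \sigma_1 \sigma_2 \sigma_3 = i, and the product formula for two Pauli vectors) can be verified directly. The sketch below assumes the standard Pauli matrices and uses small integer test vectors so that all comparisons are exact; the helper names are illustrative only.

```python
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]


def mul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def pauli_vector(v):
    """Return v . sigma = v1*S1 + v2*S2 + v3*S3."""
    return add(add(scale(v[0], S1), scale(v[1], S2)), scale(v[2], S3))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


a, b = (1, 2, 3), (4, -5, 6)
lhs = mul(pauli_vector(a), pauli_vector(b))
# (a . b) I + i (a x b) . sigma
rhs = add(scale(dot(a, b), I2), scale(1j, pauli_vector(cross(a, b))))
pseudoscalar = mul(mul(S1, S2), S3)      # sigma_1 sigma_2 sigma_3
```

The scalar part of the product carries a \cdot b and the remaining part carries the cross product, which is the matrix counterpart of the geometric product described above.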

Isomorphism to quaternions

Multiplying any 2 Pauli matrices results in a quaternion. Each entry below is the row label multiplied on the right by the column label:

\begin{array}{c|ccc}
\times & \sigma_1 & \sigma_2 & \sigma_3 \\ \hline
\sigma_1 & 1 & \hat{\imath} & -\hat{\jmath} \\ \hline
\sigma_2 & -\hat{\imath} & 1 & \hat{k} \\ \hline
\sigma_3 & \hat{\jmath} & -\hat{k} & 1 \\ \hline
\sigma_1 \sigma_2 \sigma_3 & \hat{k} & \hat{\jmath} & \hat{\imath}
\end{array}

Quaternions form a division algebra* (every non-zero element has an inverse), whereas Pauli matrices do not.

And multiplying a quaternion and a Pauli matrix results in a Pauli matrix or in the pseudoscalar \sigma_1 \sigma_2 \sigma_3:

\begin{array}{c|ccc}
\times & \sigma_1 & \sigma_2 & \sigma_3 \\ \hline
\hat{\imath} = \sigma_1 \sigma_2 & -\sigma_2 & \sigma_1 & \sigma_1 \sigma_2 \sigma_3 \\ \hline
\hat{\jmath} = \sigma_3 \sigma_1 & \sigma_3 & \sigma_1 \sigma_2 \sigma_3 & -\sigma_1 \\ \hline
\hat{k} = \sigma_2 \sigma_3 & \sigma_1 \sigma_2 \sigma_3 & -\sigma_3 & \sigma_2
\end{array}

It would appear therefore that quaternions are to the matrix representations of vectors what bivectors are to ordinary vectors.

Note: The (real) spinors* in three-dimensions are quaternions, and the action of an even-graded element on a spinor is given by ordinary quaternionic multiplication.[14]

Further reading: Generalizations of Pauli matrices*, Gell-Mann matrices* and Pauli equation*



See also: Dirac algebra*

External links:

A brief introduction to geometric algebra
A brief introduction to Clifford algebra
The Construction of Spinors in Geometric Algebra
Functions of Multivector Variables

From Wikipedia:Multivector:

The wedge product operation (See Exterior algebra) used to construct multivectors is linear, associative and alternating, which reflect the properties of the determinant. This means for vectors u, v and w in a vector space V and for scalars α, β, the wedge product has the properties,

  • Linear:  \mathbf{u}\wedge(\alpha\mathbf{v}+\beta\mathbf{w})=\alpha\mathbf{u}\wedge\mathbf{v}+\beta\mathbf{u}\wedge\mathbf{w};
  • Associative:  (\mathbf{u}\wedge\mathbf{v})\wedge\mathbf{w}=\mathbf{u}\wedge(\mathbf{v}\wedge\mathbf{w})=\mathbf{u}\wedge\mathbf{v}\wedge\mathbf{w};
  • Alternating:  \mathbf{u}\wedge\mathbf{v}=-\mathbf{v}\wedge\mathbf{u}, \quad\mathbf{u}\wedge\mathbf{u}=0.

However the wedge product is not invertible because many different pairs of vectors can have the same wedge product.

The product of p vectors, (\mathbf{v_1}\wedge\dots\wedge\mathbf{v_p}), is called a grade p multivector, or a p-vector. The maximum grade of a multivector is the dimension of the vector space V.

The set of all possible products of n orthogonal basis vectors with indices in increasing order, including 1 as the empty product, forms a basis for the entire geometric algebra (an analogue of the PBW theorem*).

Canonical basis

For example, the following is a basis for the geometric algebra \mathcal{G}(3,0):

\{1, e_1, e_2, e_3, e_1 e_2, e_1 e_3, e_2 e_3, e_1 e_2 e_3\}

A basis formed this way is called a canonical basis for the geometric algebra, and any other orthogonal basis for V will produce another canonical basis. Each canonical basis consists of 2^n elements. Every multivector of the geometric algebra can be expressed as a linear combination of the canonical basis elements.
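The rule "all products of basis vectors with indices in increasing order, including the empty product" is easy to enumerate. The following sketch lists the basis blades as tuples of indices, where the empty tuple stands for the scalar 1; the function name canonical_basis is an assumption made for this illustration.

```python
from itertools import combinations


def canonical_basis(n):
    """List the basis blades of an n-dimensional geometric algebra.

    Each blade is a tuple of increasing indices: () stands for the scalar 1,
    (1, 3) stands for e_1 e_3, and so on.
    """
    blades = []
    for grade in range(n + 1):
        blades.extend(combinations(range(1, n + 1), grade))
    return blades


basis = canonical_basis(3)
bivectors = [blade for blade in basis if len(blade) == 2]
```

For n = 3 this produces 2^3 = 8 blades, of which three are bivectors, matching the count of basis p-vectors given by the binomial coefficient.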

The general element of the Clifford algebra Cℓ0,3(R) is given by

 A = a_0 + a_1 e_1 + a_2 e_2 + a_3 e_3 + a_4 e_2 e_3 + a_5 e_3 e_1 + a_6 e_1 e_2 + a_7 e_1 e_2 e_3.

The linear combination of the even degree elements of Cℓ0,3(R) defines the even subalgebra C\ell^{[0]}_{0,3}(R) with the general element

 q = q_0 + q_1 e_2 e_3 + q_2 e_3 e_1 + q_3 e_1 e_2.

The basis elements can be identified with the quaternion basis elements i, j, k as

 i=  e_2 e_3, j=  e_3 e_1, k =  e_1 e_2,

The linearity of the wedge product allows a multivector to be defined as the linear combination of basis multivectors. There are \binom{n}{p} basis p-vectors in an n-dimensional vector space.[15]

W. K. Clifford* combined multivectors with the inner product defined on the vector space, in order to obtain a general construction for hypercomplex numbers that includes the usual complex numbers and Hamilton's quaternions.[16][17]

The Clifford product between two vectors is linear and associative like the wedge product. But unlike the wedge product the Clifford product is invertible.

Clifford's relation preserves the alternating property for the product of vectors that are perpendicular. But in contrast to the wedge product, the Clifford product of a vector with itself is no longer zero.

We know that velocity is a vector and that kinetic energy is proportional to velocity^2 (E = mv^2/2). We also know that energy is a scalar.

v^2=(v_1 e_1 + v_2 e_2)^2=v_1^2 e_1^2 + v_2^2 e_2^2 + v_1 v_2 e_1 e_2 + v_2 v_1 e_2 e_1 = v_1^2 \cdot 1 + v_2^2 \cdot 1 = scalar

Therefore the rules of Clifford algebra require:

\mathbf{e_1}\mathbf{e_1} = +1

\mathbf{e_2}\mathbf{e_2} = +1

\mathbf{e_1}\mathbf{e_2} = -\mathbf{e_2}\mathbf{e_1}

Now would be a good time to point out that \mathbf{e_1} and \mathbf{e_2} can be represented by matrices (the Pauli matrices above or the gamma matrices below) rather than by ordinary column vectors. These matrices are constructed in such a way as to cause the mathematical relationships shown above to be true. See below.

And further that:

(\mathbf{e_1}\mathbf{e_2})^2 = \mathbf{e_1}\mathbf{e_2}\mathbf{e_1}\mathbf{e_2} = -\mathbf{e_1}\mathbf{e_1}\mathbf{e_2}\mathbf{e_2} = -1

\mathbf{e_1}\mathbf{e_2} = i = Bivector?

And i, as we already know, has the effect of rotating complex numbers.

(\mathbf{e_1}\mathbf{e_2}\mathbf{e_3})^2 = -1

(\mathbf{e_1}\mathbf{e_2}\mathbf{e_3}\mathbf{e_4})^2 = 1

(\mathbf{e_1}\mathbf{e_2}\mathbf{e_3}\mathbf{e_4}\mathbf{e_5})^2 = 1
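The pattern of signs above follows from counting how many swaps of anticommuting basis vectors are needed to bring equal factors together. The sketch below multiplies basis blades using only the two rules e_i e_i = +1 and e_i e_j = −e_j e_i; it assumes a Euclidean signature, and the function names are hypothetical.

```python
def blade_product(a, b):
    """Multiply two basis blades of a Euclidean Clifford algebra.

    Blades are tuples of strictly increasing indices, e.g. (1, 2) for e_1 e_2.
    Returns (sign, blade) using e_i e_i = +1 and e_i e_j = -e_j e_i.
    """
    sign, result = 1, list(a)
    for index in b:
        # Moving e_index to its sorted place costs one sign flip for every
        # larger index it has to pass.
        larger = sum(1 for r in result if r > index)
        if larger % 2:
            sign = -sign
        if index in result:
            result.remove(index)      # e_i e_i = +1
        else:
            result.append(index)
            result.sort()
    return sign, tuple(result)


def pseudoscalar_square(n):
    """Return the sign of (e_1 e_2 ... e_n)^2."""
    pseudoscalar = tuple(range(1, n + 1))
    sign, blade = blade_product(pseudoscalar, pseudoscalar)
    return sign


signs = [pseudoscalar_square(n) for n in range(2, 6)]
```

The result agrees with the closed form (−1)^{n(n−1)/2}, since reversing a product of n anticommuting vectors takes n(n−1)/2 swaps.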

For any 2 arbitrary vectors:

fd = force*distance
fd = (f_1\mathbf{e_1} + f_2\mathbf{e_2})(d_1\mathbf{e_1} + d_2\mathbf{e_2})
fd = f_1d_1\mathbf{e_1}\mathbf{e_1} + f_2d_2\mathbf{e_2}\mathbf{e_2} + f_1d_2\mathbf{e_1}\mathbf{e_2} + f_2d_1\mathbf{e_2}\mathbf{e_1}
fd = f_1d_1\mathbf{e_1}\mathbf{e_1} + f_2d_2\mathbf{e_2}\mathbf{e_2} + f_1d_2\mathbf{e_1}\mathbf{e_2} - f_2d_1\mathbf{e_1}\mathbf{e_2}
fd = f_1d_1\mathbf{e_1}\mathbf{e_1} + f_2d_2\mathbf{e_2}\mathbf{e_2} + (f_1d_2 - f_2d_1)\mathbf{e_1}\mathbf{e_2}

Applying the rules of Clifford algebra we get:

fd = f_1d_1 + f_2d_2 + (f_1d_2 - f_2d_1)\mathbf{e_1} \wedge \mathbf{e_2}
fd = Energy + Torque
fd = {\color{red} f \cdot d} + {\color{blue} f \wedge d}
fd = {\color{red} Scalar} + {\color{blue} Bivector} = Multivector
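The derivation above reduces to two numbers: the scalar part f \cdot d and the coefficient of the bivector \mathbf{e_1} \wedge \mathbf{e_2}. A minimal sketch of that result, with a function name chosen only for illustration:

```python
def geometric_product_2d(f, d):
    """Return (scalar, bivector) parts of the Clifford product f d in Cl2,0.

    scalar   = f . d              (coefficient of 1)
    bivector = f1*d2 - f2*d1      (coefficient of e1 ^ e2)
    """
    f1, f2 = f
    d1, d2 = d
    scalar = f1 * d1 + f2 * d2
    bivector = f1 * d2 - f2 * d1
    return scalar, bivector


perpendicular = geometric_product_2d((3, 0), (0, 2))   # pure bivector
parallel = geometric_product_2d((3, 0), (2, 0))        # pure scalar
swapped = geometric_product_2d((0, 2), (3, 0))         # bivector changes sign
```

Perpendicular vectors give a pure bivector, parallel vectors give a pure scalar, and swapping the factors flips only the bivector part.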

For comparison here is the outer product of the same 2 vectors:

f \otimes d = \begin{bmatrix}
f_1 \mathbf{e}_1 \\
f_2 \mathbf{e}_2
\end{bmatrix} \begin{bmatrix}
d_1 \mathbf{e}_1 & d_2 \mathbf{e}_2
\end{bmatrix} = \begin{bmatrix}
{\color{red}  f_1 d_1 \mathbf{e}_{11} } & {\color{blue} f_1 d_2 \mathbf{e}_{12} } \\
{\color{blue} f_2 d_1 \mathbf{e}_{21} } & {\color{red}  f_2 d_2 \mathbf{e}_{22} }
\end{bmatrix} (See divergence, curl, & gradient below)

This particular Clifford algebra is known as Cl2,0. The subscript 2 indicates that the 2 basis vectors are square roots of +1. See Metric signature*. If we had used c^2 = -a^2 -b^2 then the result would have been Cl0,2.

From Wikipedia:Clifford algebra:

Every nondegenerate quadratic form on a finite-dimensional real vector space is equivalent to the standard diagonal form:

Q(v) = v^2 = v_1^2 + \cdots + v_p^2 - v_{p+1}^2 - \cdots - v_{p+q}^2 ,

where n = p + q is the dimension of the vector space. The pair of integers (p, q) is called the signature* of the quadratic form. The real vector space with this quadratic form is often denoted Rp,q. The Clifford algebra on Rp,q is denoted Cℓp,q(R). The symbol Cℓn(R) means either Cℓn,0(R) or Cℓ0,n(R) depending on whether the author prefers positive-definite or negative-definite spaces.

A standard basis {ei} for Rp,q consists of n = p + q mutually orthogonal vectors, p of which square to +1 and q of which square to −1. The algebra Cℓp,q(R) will therefore have p vectors that square to +1 and q vectors that square to −1.

From Wikipedia:Spacetime algebra:

Spacetime algebra* (STA) is a name for the Clifford algebra Cl1,3(R), or equivalently the geometric algebra G(M4), which can be particularly closely associated with the geometry of special relativity and relativistic spacetime. See also Algebra of physical space*.

The spacetime algebra may be built up from an orthogonal basis of one time-like vector \gamma_0 and three space-like vectors, \{\gamma_1, \gamma_2, \gamma_3\}, with the multiplication rule

 \gamma_\mu \gamma_\nu + \gamma_\nu \gamma_\mu = 2 \eta_{\mu \nu}

where \eta_{\mu \nu} is the Minkowski metric with signature (+ − − −).

Thus, \gamma_0^2 = {+1}, \gamma_1^2 = \gamma_2^2 = \gamma_3^2 = {-1}, otherwise \gamma_\mu \gamma_\nu = - \gamma_\nu \gamma_\mu.

The basis vectors \gamma_k share these properties with the Gamma matrices*, but no explicit matrix representation need be used in STA.

Associated with the orthogonal basis \{\gamma_\mu\} is the reciprocal basis \{\gamma^\mu = {\gamma_\mu}^{-1}\} for \mu = 0, \dots, 3, satisfying the relation

\gamma_\mu \cdot \gamma^\nu = {\delta_\mu}^\nu .
\delta^{i}_{j} = \begin{cases} 0 & (i \ne j), \\ 1 & (i = j). \end{cases} (See Kronecker delta*)

These reciprocal frame vectors differ only by a sign, with \gamma^0 = \gamma_0, and \gamma^k = -\gamma_k for k = 1, \dots, 3.

A vector may be represented in either upper or lower index coordinates a = a^\mu \gamma_\mu = a_\mu \gamma^\mu with summation over \mu = 0, \dots, 3, according to the Einstein notation*, where the coordinates may be extracted by taking dot products with the basis vectors or their reciprocals.

\begin{align}a \cdot \gamma^\nu &= a^\nu \\ a \cdot \gamma_\nu &= a_\nu .\end{align}


Gamma matrices

See also: Electron magnetic moment*
From Wikipedia:Gamma matrices

Gamma matrices*,  \{ \gamma^0, \gamma^1, \gamma^2, \gamma^3 \} , also known as the Dirac matrices, are a set of 4 × 4 conventional matrices with specific anticommutation* relations that ensure they generate* a matrix representation of the Clifford algebra Cℓ1,3(R). One gamma matrix squares to 1 times the identity matrix* and three gamma matrices square to -1 times the identity matrix.

(\gamma^0)^2 = I

(\gamma^1)^2 = (\gamma^2)^2 = (\gamma^3)^2 = -I

The defining property for the gamma matrices to generate a Clifford algebra is the anticommutation relation

\displaystyle\{ \gamma^\mu, \gamma^\nu \} = \gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2 \eta^{\mu \nu} I_4

where \{ , \} is the anticommutator*, \eta^{\mu \nu} is the Minkowski metric* with signature (+ − − −) and I_4 is the 4 × 4 identity matrix.

Minkowski metric

From Wikipedia:Minkowski_space#Minkowski_metric

The simplest example of a Lorentzian manifold is flat spacetime*, which can be given as R4 with coordinates (t,x,y,z) and the metric

ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2 = \eta_{\mu\nu} dx^{\mu} dx^{\nu}. \,

Note that these coordinates actually cover all of R4. The flat space metric (or Minkowski metric*) is often denoted by the symbol η and is the metric used in special relativity*.

A standard basis for Minkowski space is a set of four mutually orthogonal vectors { e0, e1, e2, e3 } such that

-\eta(e_0, e_0) = \eta(e_1, e_1) = \eta(e_2, e_2) = \eta(e_3, e_3) = 1 .

These conditions can be written compactly in the form

\eta(e_\mu, e_\nu) = \eta_{\mu \nu}.

Relative to a standard basis, the components of a vector v are written (v0, v1, v2, v3) where the Einstein summation convention* is used to write v = vμeμ. The component v0 is called the timelike component of v while the other three components are called the spatial components. The spatial components of a 4-vector v may be identified with a 3-vector v = (v1, v2, v3).

In terms of components, the Minkowski inner product between two vectors v and w is given by

\eta(v, w) = \eta_{\mu \nu} v^\mu w^\nu =  v^0 w_0 + v^1 w_1 + v^2 w_2 + v^3 w_3 = v^\mu w_\mu = v_\mu w^\mu,


\eta(v, v) = \eta_{\mu \nu} v^\mu v^\nu =  v^0v_0 + v^1 v_1 + v^2 v_2 + v^3 v_3 = v^\mu v_\mu.

Here lowering of an index with the metric was used.

The Minkowski metric[18] η is the metric tensor of Minkowski space. It is a pseudo-Euclidean metric, or more generally a constant pseudo-Riemannian metric in Cartesian coordinates. As such it is a nondegenerate symmetric bilinear form, a type (0,2) tensor. It accepts two arguments u, v.

The definition

u \cdot v =\eta(u, v)

yields an inner product-like structure on M, previously and also henceforth, called the Minkowski inner product, similar to the Euclidean inner product, but it describes a different geometry. It is also called the relativistic dot product. If the two arguments are the same,

u \cdot u =\eta(u, u) \equiv ||u||^2 \equiv u^2,

the resulting quantity will be called the Minkowski norm squared.

This bilinear form can in turn be written as

 u \cdot  v =  u^{\mathrm T}[\eta] v,

where [η] is a 4×4 matrix associated with η. Possibly confusingly, [η] is commonly denoted by just η. The matrix is read off from the explicit bilinear form as

\eta = \pm \begin{pmatrix}-1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix},

and the bilinear form

u \cdot v =\eta(u, v),

with which this section started by assuming its existence, is now identified.
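The bilinear form u \cdot v = u^T [η] v can be evaluated directly from the matrix. The sketch below assumes the sign choice η = diag(−1, 1, 1, 1) and units with c = 1; the function name minkowski is an illustrative assumption.

```python
ETA = [[-1, 0, 0, 0],
       [ 0, 1, 0, 0],
       [ 0, 0, 1, 0],
       [ 0, 0, 0, 1]]   # signature (- + + +), units with c = 1


def minkowski(u, v, eta=ETA):
    """Return u . v = u^T [eta] v for 4-vectors given as (t, x, y, z)."""
    return sum(u[m] * eta[m][n] * v[n] for m in range(4) for n in range(4))


light_like = minkowski((1, 1, 0, 0), (1, 1, 0, 0))   # norm squared of a light ray
time_like = minkowski((1, 0, 0, 0), (1, 0, 0, 0))    # negative in this signature
mixed = minkowski((2, 1, 0, 0), (3, 0, 1, 0))
```

Passing the negated matrix as eta gives the other sign convention, (+ − − −), which is the one used for the gamma matrices.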

When interpreted as the matrices of the action of a set of orthogonal basis vectors for contravariant* vectors in Minkowski space, the column vectors on which the matrices act become a space of spinors*, on which the Clifford algebra of spacetime* acts. This in turn makes it possible to represent infinitesimal spatial rotations* and Lorentz boosts. Spinors facilitate spacetime computations in general, and in particular are fundamental to the Dirac equation* for relativistic spin-½ particles.

In Dirac representation, the four contravariant* gamma matrices are

\gamma^0 = \begin{pmatrix}
  1 & 0 &  0 &  0 \\
  0 & 1 &  0 &  0 \\
  0 & 0 &  -1 &  0 \\
  0 & 0 &  0 & -1
\end{pmatrix}, \quad
\gamma^1 = \begin{pmatrix}
   0 &  0 & 0 & 1 \\
   0 &  0 & 1 & 0 \\
   0 & -1 & 0 & 0 \\
  -1 &  0 & 0 & 0
\end{pmatrix}

\gamma^2 = \begin{pmatrix}
   0 & 0 & 0 & -i \\
   0 & 0 & i &  0 \\
   0 & i & 0 &  0 \\
  -i & 0 & 0 &  0
\end{pmatrix}, \quad
\gamma^3 = \begin{pmatrix}
   0 & 0 & 1 &  0 \\
   0 & 0 & 0 & -1 \\
  -1 & 0 & 0 &  0 \\
   0 & 1 & 0 &  0
\end{pmatrix}

\gamma^0 is the time-like matrix and the other three are space-like matrices.

(\gamma^0)^2 = \begin{pmatrix}
  1 & 0 &  0 &  0 \\
  0 & 1 &  0 &  0 \\
  0 & 0 &  1 &  0 \\
  0 & 0 &  0 & 1
\end{pmatrix}

(\gamma^1)^2 = (\gamma^2)^2 = (\gamma^3)^2 = \begin{pmatrix}
  -1 & 0 &  0 &  0 \\
  0 & -1 &  0 &  0 \\
  0 & 0 &  -1 &  0 \\
  0 & 0 &  0 & -1
\end{pmatrix}

The matrices are also sometimes written using the 2×2 identity matrix*, I_2, and the Pauli matrices*.

The gamma matrices we have written so far are appropriate for acting on Dirac spinors* written in the Dirac basis; in fact, the Dirac basis is defined by these matrices. To summarize, in the Dirac basis:

\gamma^0 = \begin{pmatrix} I_2 & 0 \\ 0 & -I_2 \end{pmatrix},\quad \gamma^k = \begin{pmatrix} 0 & \sigma^k \\ -\sigma^k & 0 \end{pmatrix},\quad \gamma^5 = \begin{pmatrix} 0 & I_2 \\ I_2 & 0 \end{pmatrix}.

Another common choice is the Weyl or chiral basis,[19] in which \gamma^k remains the same but \gamma^0 is different, and so \gamma^5 is also different, and diagonal,

\gamma^0 = \begin{pmatrix} 0 & I_2 \\ I_2 & 0 \end{pmatrix},\quad \gamma^k = \begin{pmatrix} 0 & \sigma^k \\ -\sigma^k & 0 \end{pmatrix},\quad \gamma^5 = \begin{pmatrix} -I_2 & 0 \\ 0 & I_2 \end{pmatrix}.

Original Dirac matrices
Source: Weisstein, Eric W. "Dirac Matrices." From MathWorld--A Wolfram Web Resource.

\begin{array}{c|ccc}
\sigma_0 & \sigma_1 & \sigma_2 & \sigma_3 \\ \hline
\rho_1 & \alpha_1 & \alpha_2 & \alpha_3 \\ \hline
\rho_2 & y_1 & y_2 & y_3 \\ \hline
\rho_3 & \delta_1 & \delta_2 & \delta_3
\end{array}
\quad \text{where} \quad
\begin{aligned}
\sigma_0 &= I_4 \\
-\rho_1 &= y_5 \\
\rho_2 &= \alpha_5 \\
\rho_3 &= \alpha_4 = y_4
\end{aligned}
\quad \text{and} \quad
\begin{aligned}
\sigma_i &= I_2 \otimes \sigma_i^{(p)} \\
\rho_i &= \sigma_i^{(p)} \otimes I_2
\end{aligned}

where \sigma_i^{(p)} are the Pauli matrices and \otimes is the Kronecker product* (not the tensor product)

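I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
  \sigma_1^{(p)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
  \sigma_2^{(p)} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
  \sigma_3^{(p)} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}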

The 16 original Dirac matrices form six anticommuting sets of five matrices each (Arfken 1985, p. 214):

  1. \alpha_1, \alpha_2, \alpha_3, \quad      \rho_3, \rho_2 \quad      (\alpha_4, \alpha_5)
  2. y_1, y_2, y_3, \quad      \rho_3, -\rho_1 \quad      (y_4, y_5)
  3. \delta_1, \delta_2, \delta_3, \quad      \rho_1, \rho_2
  4. \alpha_1, y_1, \delta_1, \quad      \sigma_2, \sigma_3
  5. \alpha_2, y_2, \delta_2, \quad      \sigma_1, \sigma_3
  6. \alpha_3, y_3, \delta_3, \quad      \sigma_1, \sigma_2

Any of the 15 original Dirac matrices (excluding the identity matrix \sigma_0) anticommute with eight other original Dirac matrices and commute with the remaining eight, including itself and the identity matrix.

Any of the 16 original Dirac matrices multiplied times itself equals I_4

Higher-dimensional gamma matrices

Analogous sets of gamma matrices can be defined in any dimension* and for any signature of the metric. For example, the Pauli matrices are a set of "gamma" matrices in dimension 3 with metric of Euclidean signature (3,0). In 5 spacetime dimensions, the 4 gammas above together with the fifth gamma matrix to be presented below generate the Clifford algebra.

It is useful to define the product of the four gamma matrices as follows:

 \gamma^5 := i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix}
  0 & 0 & 1 & 0 \\
  0 & 0 & 0 & 1 \\
  1 & 0 & 0 & 0 \\
  0 & 1 & 0 & 0
\end{pmatrix} (in the Dirac basis).

Although \gamma^5 uses the letter gamma, it is not one of the gamma matrices of Cℓ1,3(R). The number 5 is a relic of old notation in which \gamma^0 was called "\gamma^4".
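The defining anticommutation relation and the value of \gamma^5 in the Dirac basis can both be checked by direct multiplication. The sketch below assembles the four matrices from 2 × 2 blocks, following \gamma^0 = diag(I_2, −I_2) and \gamma^k = [[0, \sigma^k], [−\sigma^k, 0]]; the helper names (block, neg, anticommutator and so on) are assumptions made for this example.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def neg(A):
    return [[-x for x in row] for row in A]


def block(tl, tr, bl, br):
    """Assemble a 4 x 4 matrix from four 2 x 2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# gamma^0 = diag(I2, -I2) and gamma^k = [[0, sigma^k], [-sigma^k, 0]]
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]
ETA_DIAG = [1, -1, -1, -1]      # Minkowski metric with signature (+ - - -)


def anticommutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(AB, BA)]


def satisfies_clifford_relation():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} I_4 for all index pairs."""
    for mu in range(4):
        for nu in range(4):
            c = 2 * ETA_DIAG[mu] if mu == nu else 0
            expected = [[c if i == j else 0 for j in range(4)] for i in range(4)]
            if anticommutator(GAMMA[mu], GAMMA[nu]) != expected:
                return False
    return True


product = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))
gamma5 = [[1j * x for x in row] for row in product]   # i g0 g1 g2 g3
```

All entries are 0, ±1 or ±i, so the comparisons are exact.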

From Wikipedia:Higher-dimensional gamma matrices

Consider a space-time of dimension d with the flat Minkowski metric*,

 \eta = \parallel \eta_{a b} \parallel = \text{diag}(+1,-1, \dots, -1) ~,

where a,b = 0,1, ..., d−1. Set N = 2^{\lfloor d/2 \rfloor}. The standard Dirac matrices correspond to taking d = N = 4.

The higher gamma matrices are a d-long sequence of complex N×N matrices \Gamma_i,\ i=0,\ldots,d-1 which satisfy the anticommutator* relation from the Clifford algebra* Cℓ1,d−1(R) (generating a representation for it),

 \{ \Gamma_a ~,~ \Gamma_b \} = \Gamma_a\Gamma_b + \Gamma_b\Gamma_a = 2 \eta_{a b} I_N ~,

where IN is the identity matrix* in N dimensions. (The spinors acted on by these matrices have N components in d dimensions.) Such a sequence exists for all values of d and can be constructed explicitly, as provided below.

The gamma matrices have the following property under hermitian conjugation,

 \Gamma_0^\dagger= +\Gamma_0 ~,~ \Gamma_i^\dagger= -\Gamma_i ~(i=1,\dots,d-1) ~.

Further reading: Quantum Mechanics for Engineers and How (not) to teach Lorentz covariance of the Dirac equation



See also: Rotor (mathematics)*
From Wikipedia:Geometric algebra

The inverse of a vector is:

 v^{-1} = \frac{1}{v} = \frac{v}{vv} = \frac{v}{v \cdot v + v \wedge v} = \frac{v}{v \cdot v}

The projection of v onto a (or the parallel part) is

 v_{\| a} = (v \cdot a)a^{-1}

and the rejection of v from a (or the orthogonal part) is

 v_{\perp a} = v - v_{\| a} = (v\wedge a)a^{-1} .

The reflection v' of a vector v along a vector a, or equivalently across the hyperplane orthogonal to a, is the same as negating the component of a vector parallel to a. The result of the reflection will be

v' = {-v_{\| a} + v_{\perp a}} = {-(v \cdot a)a^{-1} + (v \wedge a)a^{-1}}
= {(-a \cdot v - a \wedge v)a^{-1}}
= -ava^{-1}

If a is a unit vector then a^{-1}=\frac{a}{1} = a and therefore v' = -ava

-ava is called a sandwich product, which is a kind of double-sided product.

If we have a product of vectors R = a_1a_2 \cdots a_r then we denote the reverse as

R^\dagger = (a_1a_2\cdots a_r)^\dagger = a_r\cdots a_2 a_1.

Any rotation is equivalent to 2 reflections.

v'' = -bv'b = bavab = RvR^\dagger

R is called a Rotor

R = ba = b \cdot a + b \wedge a = Scalar + Bivector = Multivector

If a and b are unit vectors then the rotor is automatically normalised:

RR^\dagger = R^\dagger R=1 .
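The statement that two reflections make a rotation can be illustrated without a full geometric algebra library. For a unit vector a, the sandwich product −ava is equivalent to the ordinary vector formula v − 2(v \cdot a)a, and the sketch below uses that equivalent form as a stand-in for the Clifford product. The function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def reflect(v, a):
    """Reflect v along the unit vector a: the vector form of -a v a."""
    s = 2 * dot(v, a)
    return tuple(x - s * y for x, y in zip(v, a))


def rotate_by_reflections(v, a, b):
    """Reflect along a, then along b: the vector form of b a v a b."""
    return reflect(reflect(v, a), b)


a = (1.0, 0.0)
b = (math.cos(math.pi / 4), math.sin(math.pi / 4))   # 45 degrees from a
rotated = rotate_by_reflections((1.0, 0.0), a, b)
```

With a and b separated by 45 degrees, the vector (1, 0) ends up at approximately (0, 1): a rotation by 90 degrees, twice the angle between a and b.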

Two successive rotations become:

R_2R_1MR_1^\dagger R_2^\dagger

R2R1 represents Rotor R1 rotated by Rotor R2. This would be called a single-sided transformation. (R2R1R2 would be double-sided.) Therefore rotors do not transform double-sided the same way that other objects do. They transform single-sided.



The square root of the product of a quaternion with its conjugate is called its norm*:

\lVert q \rVert = \sqrt{qq^*} = \sqrt{q^*q} = \sqrt{a^2 + b^2 + c^2 + d^2}

A unit quaternion is a quaternion of norm one. Unit quaternions, also known as versors*, provide a convenient mathematical notation for representing orientations and rotations of objects in three dimensions.

From Wikipedia:Quaternions and spatial rotation

Every nonzero quaternion has a multiplicative inverse

(a+bi+cj+dk)^{-1} = \frac{1}{a^2+b^2+c^2+d^2}\,(a-bi-cj-dk).

Thus quaternions form a division algebra*.

The inverse of a unit quaternion is obtained simply by changing the sign of its imaginary components.

A 3-D Euclidean vector* such as (2, 3, 4) or (ax, ay, az) can be rewritten as 0 + 2 i + 3 j + 4 k or 0 + axi + ayj + azk, where i, j, k are unit vectors representing the three Cartesian axes*. A rotation through an angle of θ around the axis defined by a unit vector

\vec{u} = (u_x, u_y, u_z) = 0 + u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k}

can be represented by a quaternion. This can be done using an extension* of Euler's formula:

 \mathbf{q} = e^{\frac{\theta}{2}{(0 + u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k})}} = \cos \frac{\theta}{2} + (0 + u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k}) \sin \frac{\theta}{2}

It can be shown that the desired rotation can be applied to an ordinary vector \mathbf{p} = (p_x, p_y, p_z) = 0 + p_x\mathbf{i} + p_y\mathbf{j} + p_z\mathbf{k} in 3-dimensional space, considered as a quaternion with a real coordinate equal to zero, by evaluating the conjugation of p by q:

\mathbf{p'} = \mathbf{q} \mathbf{p} \mathbf{q}^{-1}

using the Hamilton product*

The conjugate of a product of two quaternions is the product of the conjugates in the reverse order.

Conjugation by the product of two quaternions is the composition of conjugations by these quaternions: If p and q are unit quaternions, then rotation (conjugation) by pq is

\mathbf{p q} \vec{v} (\mathbf{p q})^{-1} = \mathbf{p q} \vec{v} \mathbf{q}^{-1} \mathbf{p}^{-1} = \mathbf{p} (\mathbf{q} \vec{v} \mathbf{q}^{-1}) \mathbf{p}^{-1},

which is the same as rotating (conjugating) by q and then by p. The scalar component of the result is necessarily zero.
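The rotation formula \mathbf{p'} = \mathbf{q} \mathbf{p} \mathbf{q}^{-1} can be sketched in a few lines. Quaternions are stored here as tuples (a, b, c, d) meaning a + bi + cj + dk, the axis is assumed to be a unit vector, and all function names are illustrative rather than taken from any library.

```python
import math


def hamilton(p, q):
    """Hamilton product of quaternions given as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def conjugate(q):
    a, b, c, d = q
    return (a, -b, -c, -d)


def rotation_quaternion(axis, theta):
    """Unit quaternion cos(theta/2) + u sin(theta/2) for a unit axis u."""
    ux, uy, uz = axis
    half = theta / 2
    s = math.sin(half)
    return (math.cos(half), ux * s, uy * s, uz * s)


def rotate(p, axis, theta):
    """Return q p q^-1 as a quaternion; q^-1 is the conjugate because |q| = 1."""
    q = rotation_quaternion(axis, theta)
    return hamilton(hamilton(q, (0.0, *p)), conjugate(q))


# Rotate the x axis by 90 degrees about the z axis.
rotated = rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2)
```

The result is approximately (0, 0, 1, 0): a zero scalar part, as stated above, and a vector part pointing along the y axis.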

The imaginary part b\mathbf{i} + c\mathbf{j} + d\mathbf{k} of a quaternion behaves like a vector \vec{v} = (b,c,d) in three-dimensional vector space, and the real part a behaves like a scalar* in R. When quaternions are used in geometry, it is more convenient to define them as a scalar plus a vector*:

a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} = a + \vec{v}.

When multiplying the vector/imaginary parts, in place of the rules i² = j² = k² = ijk = −1 we have the quaternion multiplication rule:

\vec{v} \vec{w} = \vec{v} \times \vec{w} - \vec{v} \cdot \vec{w},

From these rules it follows immediately that (see details*):

(s + \vec{v}) (t + \vec{w}) = (s t - \vec{v} \cdot \vec{w}) + (s \vec{w} + t \vec{v} + \vec{v} \times \vec{w}).

It is important to note, however, that the vector part of a quaternion is, in truth, an "axial" vector or "pseudovector", not an ordinary or "polar" vector.

From Wikipedia:Quaternion:

the reflection of a vector r in a plane perpendicular to a unit vector w can be written:

r^{\prime} = - w\, r\, w.

Two reflections make a rotation by an angle twice the angle between the two reflection planes, so

v^{\prime\prime} = \sigma_2 \sigma_1 \, v \, \sigma_1 \sigma_2

corresponds to a rotation of 180° in the plane containing σ1 and σ2.

This is very similar to the corresponding quaternion formula,

v^{\prime\prime} = -\mathbf{k}\, v\, \mathbf{k}.

In fact, the two are identical, if we make the identification

\mathbf{k} = \sigma_2 \sigma_1, \mathbf{i} = \sigma_3 \sigma_2, \mathbf{j} = \sigma_1 \sigma_3

and it is straightforward to confirm that this preserves the Hamilton relations

\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = \mathbf{i} \mathbf{j} \mathbf{k} = -1.

In this picture, quaternions correspond not to vectors but to bivectors – quantities with magnitude and orientations associated with particular 2D planes rather than 1D directions. The relation to complex numbers becomes clearer, too: in 2D, with two vector directions σ1 and σ2, there is only one bivector basis element σ1σ2, so only one imaginary. But in 3D, with three vector directions, there are three bivector basis elements σ1σ2, σ2σ3, σ3σ1, so three imaginaries.

The usefulness of quaternions for geometrical computations can be generalised to other dimensions, by identifying the quaternions as the even part Cℓ+3,0(R) of the Clifford algebra Cℓ3,0(R).



See also: Bispinor*

External link:An introduction to spinors

Spinors may be regarded as non-normalised rotors which transform single-sided.[20]

Note: The (real) spinors* in three-dimensions are quaternions, and the action of an even-graded element on a spinor is given by ordinary quaternionic multiplication.[21]

A spinor transforms to its negative when the space is rotated through a complete turn from 0° to 360°. This property characterizes spinors.[22]

From Wikipedia:Orientation entanglement

In three dimensions...the Lie group* SO(3)* is not simply connected*. Mathematically, one can tackle this problem by exhibiting the special unitary group* SU(2), which is also the spin group* in three Euclidean* dimensions, as a double cover* of SO(3).

SU(2) is the following group,[23]

 \mathrm{SU}(2) = \left \{ \begin{pmatrix} \alpha&-\overline{\beta}\\ \beta & \overline{\alpha} \end{pmatrix}: \ \ \alpha,\beta\in\mathbf{C}, |\alpha|^2 + |\beta|^2 = 1\right \}  ~,

where the overline denotes complex conjugation*.

For comparison: Using 2 × 2 complex matrices, the quaternion a + bi + cj + dk can be represented as

\begin{pmatrix} a+bi & c+di \\ -(c-di) & a-bi \end{pmatrix}.

If X = (x1,x2,x3) is a vector in R3, then we identify X with the 2 × 2 matrix with complex entries

X = \begin{pmatrix} x_3 & x_1 - i x_2 \\ x_1 + i x_2 & -x_3 \end{pmatrix}.

Note that −det(X) gives the square of the Euclidean length of X regarded as a vector, and that X is a trace-free*, or better, trace-zero Hermitian matrix*.

The unitary group acts on X via

X\mapsto MXM^+

where M ∈ SU(2). Note that, since M is unitary,

\det(MXM^+) = \det(X), and
MXM^+ is trace-zero Hermitian.

Hence SU(2) acts via rotation on the vectors X. Conversely, since any change of basis* which sends trace-zero Hermitian matrices to trace-zero Hermitian matrices must be unitary, it follows that every rotation also lifts to SU(2). However, each rotation is obtained from a pair of elements M and −M of SU(2). Hence SU(2) is a double-cover of SO(3). Furthermore, SU(2) is easily seen to be itself simply connected by realizing it as the group of unit quaternions*, a space homeomorphic* to the 3-sphere*.

A unit quaternion has the cosine of half the rotation angle as its scalar part and the sine of half the rotation angle multiplying a unit vector along some rotation axis (here assumed fixed) as its pseudovector (or axial vector) part. If the initial orientation of a rigid body (with unentangled connections to its fixed surroundings) is identified with a unit quaternion having a zero pseudovector part and +1 for the scalar part, then after one complete rotation (2π rad) the pseudovector part returns to zero and the scalar part has become −1 (entangled). After two complete rotations (4π rad) the pseudovector part again returns to zero and the scalar part returns to +1 (unentangled), completing the cycle.
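
The sign flip after one full turn can be seen directly from the half-angle description above. This is a small sketch; the function name `rotation_quaternion` and the choice of the z-axis as the fixed rotation axis are assumptions for illustration.

```python
import math

def rotation_quaternion(angle, axis):
    """Unit quaternion (scalar, vector) for a rotation by `angle` radians
    about the unit vector `axis`: cos(angle/2) + sin(angle/2) * axis."""
    half = angle / 2.0
    return (math.cos(half), tuple(math.sin(half) * a for a in axis))

axis = (0.0, 0.0, 1.0)                                  # assumed fixed rotation axis
q_start = rotation_quaternion(0.0, axis)                # scalar part +1
q_one_turn = rotation_quaternion(2 * math.pi, axis)     # scalar part -1 (up to rounding)
q_two_turns = rotation_quaternion(4 * math.pi, axis)    # scalar part +1 again
```

The vector part is zero (up to floating-point rounding) in all three cases, so only the sign of the scalar part tells one full turn apart from two.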

From Wikipedia:Spinors in three dimensions

The association of a spinor with a 2×2 complex Hermitian matrix* was formulated by Élie Cartan.[24]

In detail, given a vector x = (x1, x2, x3) of real (or complex) numbers, one can associate the complex matrix

\vec{x} \rightarrow X \ =\left(\begin{matrix}x_3&x_1-ix_2\\x_1+ix_2&-x_3\end{matrix}\right).

Matrices of this form have the following properties, which relate them intrinsically to the geometry of 3-space:

  • det X = −(length x)^2.
  • X^2 = (length x)^2 I, where I is the identity matrix.
  • \frac{1}{2}(XY+YX)=({\bold x}\cdot{\bold y})I [24]
  • \frac{1}{2}(XY-YX)=iZ where Z is the matrix associated to the cross product z = x × y.
  • If u is a unit vector, then −UXU is the matrix associated to the vector obtained from x by reflection in the plane orthogonal to u.
  • It is an elementary fact from linear algebra* that any rotation in 3-space factors as a composition of two reflections. (Similarly, any orientation reversing orthogonal transformation is either a reflection or the product of three reflections.) Thus if R is a rotation, decomposing as the reflection in the plane perpendicular to a unit vector u_1 followed by the reflection in the plane perpendicular to u_2, then the matrix U_2U_1XU_1U_2 represents the rotation of the vector x through R.
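
The first four properties in the list above can be checked numerically for a pair of sample vectors. This is a sketch only: the helper names (`mat`, `matmul`, `det`, `close`, `scaled_identity`) and the sample vectors are assumptions, and matrices are stored as plain nested lists.

```python
def mat(x):
    """Return the 2x2 complex matrix associated with x = (x1, x2, x3)."""
    x1, x2, x3 = x
    return [[x3, x1 - 1j * x2],
            [x1 + 1j * x2, -x3]]

def matmul(A, B):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def close(A, B, tol=1e-12):
    """True if two 2x2 matrices agree entry by entry within tol."""
    return all(abs(A[r][c] - B[r][c]) < tol for r in range(2) for c in range(2))

def scaled_identity(s):
    return [[s, 0], [0, s]]

x = (1.0, 2.0, 2.0)                 # sample vector of length 3
y = (1.0, 0.0, 1.0)                 # second sample vector
z = (x[1] * y[2] - x[2] * y[1],     # cross product z = x cross y
     x[2] * y[0] - x[0] * y[2],
     x[0] * y[1] - x[1] * y[0])

X, Y, Z = mat(x), mat(y), mat(z)
XY, YX = matmul(X, Y), matmul(Y, X)
length_sq = sum(c * c for c in x)
dot_xy = sum(a * b for a, b in zip(x, y))

half_anticommutator = [[(XY[r][c] + YX[r][c]) / 2 for c in range(2)] for r in range(2)]
half_commutator = [[(XY[r][c] - YX[r][c]) / 2 for c in range(2)] for r in range(2)]
i_times_Z = [[1j * Z[r][c] for c in range(2)] for r in range(2)]
```

For these vectors `det(X)` equals minus `length_sq`, `matmul(X, X)` is `length_sq` times the identity, `half_anticommutator` is `dot_xy` times the identity, and `half_commutator` matches `i_times_Z`.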

Having effectively encoded all of the rotational linear geometry of 3-space into a set of complex 2×2 matrices, it is natural to ask what role, if any, the 2×1 matrices (i.e., the column vectors*) play. Provisionally, a spinor is a column vector

\xi=\left[\begin{matrix}\xi_1\\\xi_2\end{matrix}\right], with complex entries ξ1 and ξ2.

The space of spinors is evidently acted upon by complex 2×2 matrices. Furthermore, the product of two reflections in a given pair of unit vectors defines a 2×2 matrix whose action on euclidean vectors is a rotation, so there is an action of rotations on spinors.

Often, the first example of spinors that a student of physics encounters are the 2×1 spinors used in Pauli's theory of electron spin. The Pauli matrices* are a vector of three 2×2 matrices* that are used as spin* operators*.

Given a unit vector* in 3 dimensions, for example (a, b, c), one takes a dot product* with the Pauli spin matrices to obtain a spin matrix for spin in the direction of the unit vector.

The eigenvectors* of that spin matrix are the spinors for spin-1/2 oriented in the direction given by the vector.

Example: u = (0.8, -0.6, 0) is a unit vector. Dotting this with the Pauli spin matrices gives the matrix:

  S_u = (0.8,-0.6,0.0)\cdot \vec{\sigma}=0.8 \sigma_{1}-0.6\sigma_{2}+0.0\sigma_{3} = \begin{bmatrix}
    0.0 & 0.8+0.6i \\
    0.8-0.6i & 0.0
  \end{bmatrix}

The eigenvectors may be found by the usual methods of linear algebra*, but a convenient trick is to note that a Pauli spin matrix is an involutory matrix*, that is, the square of the above matrix is the identity matrix.

Thus a (matrix) solution to the eigenvector problem with eigenvalues of ±1 is simply 1 ± S_u. That is,

S_u (1\pm S_u) = \pm 1 (1 \pm S_u)

One can then choose either of the columns of the eigenvector matrix as the vector solution, provided that the column chosen is not zero. Taking the first column of the above, eigenvector solutions for the two eigenvalues are:

\begin{bmatrix} 1.0 + (0.0) \\ 0.0 + (0.8-0.6i) \end{bmatrix}, \qquad \begin{bmatrix} 1.0 - (0.0) \\ 0.0 - (0.8-0.6i) \end{bmatrix}
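
The involution trick can be checked for the example matrix S_u. This is a minimal sketch with an assumed helper `apply` for the matrix-vector product; it takes the first column of 1 + S_u and of 1 − S_u and confirms that S_u maps them to +1 and −1 times themselves.

```python
def apply(M, v):
    """Multiply a 2x2 matrix (nested lists) by a 2-component column vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

# Spin matrix for the unit vector u = (0.8, -0.6, 0).
S_u = [[0.0, 0.8 + 0.6j],
       [0.8 - 0.6j, 0.0]]

v_plus = [1.0 + S_u[0][0], S_u[1][0]]      # first column of 1 + S_u
v_minus = [1.0 - S_u[0][0], -S_u[1][0]]    # first column of 1 - S_u

Sv_plus = apply(S_u, v_plus)     # should equal +v_plus
Sv_minus = apply(S_u, v_minus)   # should equal -v_minus
```

Up to floating-point rounding, `Sv_plus` equals `v_plus` and `Sv_minus` equals minus `v_minus`, so the two columns are eigenvectors for the eigenvalues +1 and −1.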

The trick used to find the eigenvectors is related to the concept of ideals*, that is, the matrix eigenvectors (1 ± S_u)/2 are projection operators* or idempotents* and therefore each generates an ideal in the Pauli algebra. The same trick works in any Clifford algebra*, in particular the Dirac algebra*, which is discussed below. These projection operators are also seen in density matrix* theory where they are examples of pure density matrices.

More generally, the projection operator for spin in the (a, b, c) direction is given by

\frac{1}{2}\begin{bmatrix} 1+c & a-ib \\ a+ib & 1-c \end{bmatrix}

and any nonzero column can be taken as the corresponding spinor. While the two columns appear different, one can use a^2 + b^2 + c^2 = 1 to show that they are multiples (possibly zero) of the same spinor.

From Wikipedia:Tensor#Spinors:

When changing from one orthonormal basis* (called a frame) to another by a rotation, the components of a tensor transform by that same rotation. This transformation does not depend on the path taken through the space of frames. However, the space of frames is not simply connected* (see orientation entanglement* and plate trick*): there are continuous paths in the space of frames with the same beginning and ending configurations that are not deformable one into the other. It is possible to attach an additional discrete invariant to each frame that incorporates this path dependence, and which turns out (locally) to have values of ±1.[25] A spinor* is an object that transforms like a tensor under rotations in the frame, apart from a possible sign that is determined by the value of this discrete invariant.[26][27]

Succinctly, spinors are elements of the spin representation* of the rotation group, while tensors are elements of its tensor representations*. Other classical groups* have tensor representations, and so also tensors that are compatible with the group, but all non-compact classical groups have infinite-dimensional unitary representations as well.

From Wikipedia:Spinor:

Quote from Elie Cartan: The Theory of Spinors, Hermann, Paris, 1966: "Spinors...provide a linear representation of the group of rotations in a space with any number n of dimensions, each spinor having 2^\nu components where n = 2\nu+1 or 2\nu." The star (*) refers to Cartan 1913.

(Note: \nu is the number of simultaneous independent rotations* an object can have in n dimensions.)

Although spinors can be defined purely as elements of a representation space of the spin group (or its Lie algebra of infinitesimal rotations), they are typically defined as elements of a vector space that carries a linear representation of the Clifford algebra. The Clifford algebra is an associative algebra that can be constructed from Euclidean space and its inner product in a basis independent way. Both the spin group and its Lie algebra are embedded inside the Clifford algebra in a natural way, and in applications the Clifford algebra is often the easiest to work with. After choosing an orthonormal basis of Euclidean space, a representation of the Clifford algebra is generated by gamma matrices, matrices that satisfy a set of canonical anti-commutation relations. The spinors are the column vectors on which these matrices act. In three Euclidean dimensions, for instance, the Pauli spin matrices are a set of gamma matrices, and the two-component complex column vectors on which these matrices act are spinors. However, the particular matrix representation of the Clifford algebra, hence what precisely constitutes a "column vector" (or spinor), involves the choice of basis and gamma matrices in an essential way. As a representation of the spin group, this realization of spinors as (complex) column vectors will either be irreducible if the dimension is odd, or it will decompose into a pair of so-called "half-spin" or Weyl representations if the dimension is even.

In three Euclidean dimensions, for instance, spinors can be constructed by making a choice of Pauli spin matrices corresponding to (angular momenta about) the three coordinate axes. These are 2×2 matrices with complex entries, and the two-component complex column vectors on which these matrices act by matrix multiplication are the spinors. In this case, the spin group is isomorphic to the group of 2×2 unitary matrices with determinant one, which naturally sits inside the matrix algebra. This group acts by conjugation on the real vector space spanned by the Pauli matrices themselves, realizing it as a group of rotations among them, but it also acts on the column vectors (that is, the spinors).

From Wikipedia:Spinor:

In the 1920s physicists discovered that spinors are essential to describe the intrinsic angular momentum, or "spin", of the electron and other subatomic particles. More precisely, it is the fermions of spin-1/2 that are described by spinors, which is true both in the relativistic and non-relativistic theory. The wavefunction of the non-relativistic electron has values in 2 component spinors transforming under three-dimensional infinitesimal rotations. The relativistic Dirac equation* for the electron is an equation for 4 component spinors transforming under infinitesimal Lorentz transformations for which a substantially similar theory of spinors exists.


Maxwell's equations

From Wikipedia:Mathematical descriptions of the electromagnetic field

Analogous to the tensor formulation, two objects, one for the field and one for the current, are introduced. In geometric algebra (GA) these are multivectors. The field multivector, known as the Riemann–Silberstein vector*, is

 \bold{F} = \bold{E} + Ic\bold{B} = E^k\sigma_k + IcB^k\sigma_k

and the current multivector is

 c \rho - \bold{J} = c \rho - J^k\sigma_k

Here we work in the algebra of physical space* (APS) C\ell_{3,0}(\mathbb{R}) with the vector basis \{\sigma_k\}. The unit pseudoscalar* is I=\sigma_1\sigma_2\sigma_3 (assuming an orthonormal basis*). Orthonormal basis vectors share the algebra of the Pauli matrices*, but are usually not equated with them. After defining the derivative

 \boldsymbol{\nabla} = \sigma^k \partial_k

Maxwell's equations are reduced to the single equation[28]

 \left(\frac{1}{c}\dfrac{\partial }{\partial t} + \boldsymbol{\nabla}\right)\bold{F} = \mu_0 c (c \rho - \bold{J}).

In three dimensions, the derivative has a special structure allowing the introduction of a cross product:

 \boldsymbol{\nabla}\bold{F} = \boldsymbol{\nabla} \cdot \bold{F} + \boldsymbol{\nabla} \wedge \bold{F} = \boldsymbol{\nabla} \cdot \bold{F} + I \boldsymbol{\nabla} \times \bold{F}

from which it is easily seen that Gauss's law is the scalar part, the Ampère–Maxwell law is the vector part, Faraday's law is the pseudovector part, and Gauss's law for magnetism is the pseudoscalar part of the equation. After expanding and rearranging, this can be written as

\left( \boldsymbol{\nabla} \cdot \mathbf{E} - \frac{\rho}{\epsilon_0} \right)- c \left( \boldsymbol{\nabla} \times \mathbf{B} - \mu_0 \epsilon_0 \frac{\partial {\mathbf{E}}}{\partial {t}} - \mu_0 \mathbf{J} \right)+ I \left( \boldsymbol{\nabla} \times \mathbf{E} + \frac{\partial {\mathbf{B}}}{\partial {t}} \right)+ I c \left( \boldsymbol{\nabla} \cdot \mathbf{B} \right)= 0

We can identify APS as a subalgebra of the spacetime algebra* (STA) C\ell_{1,3}(\mathbb{R}), defining \sigma_k=\gamma_k\gamma_0 and I=\gamma_0\gamma_1\gamma_2\gamma_3. The \gamma_\mu have the same algebraic properties as the gamma matrices*, but their matrix representation is not needed. The derivative is now

\nabla = \gamma^\mu \partial_\mu.

The Riemann–Silberstein vector becomes a bivector

F = \bold{E} + Ic\bold{B} = E^1\gamma_1\gamma_0 + E^2\gamma_2\gamma_0 + E^3\gamma_3\gamma_0 -c(B^1\gamma_2\gamma_3 + B^2\gamma_3\gamma_1 + B^3\gamma_1\gamma_2),

and the charge and current density become a vector

J = J^\mu \gamma_\mu = c \rho \gamma_0 + J^k \gamma_k = \gamma_0(c \rho - J^k \sigma_k).

Owing to the identity

\gamma_0 \nabla = \gamma_0\gamma^0 \partial_0 + \gamma_0\gamma^k\partial_k = \partial_0 + \sigma^k\partial_k = \frac{1}{c}\dfrac{\partial }{\partial t} + \boldsymbol{\nabla},

Maxwell's equations reduce to the single equation

 \nabla F = \mu_0 c J.




A function f is like a "black box" that takes an input x, and returns a single corresponding output f(x).


The red curve is the graph of a function f in the Cartesian plane, consisting of all points with coordinates of the form (x, f(x)). The property of having one output for each input is represented geometrically by the fact that each vertical line (such as the yellow line through the origin) has exactly one crossing point with the curve.

From Wikipedia:Function (mathematics)

In mathematics, a function is a relation* between a set* of inputs and a set of permissible outputs with the property that each input is related to exactly one output. An example is the function f(x)=x^2 that relates each real number* x to its square x^2. The output of a function f corresponding to an input x is denoted by f(x) (read "f of x"). In this example, if the input is −3, then the output is 9, and we may write f(−3) = 9. See Tutorial:Evaluate by Substitution. Likewise, if the input is 3, then the output is also 9, and we may write f(3) = 9. (The same output may be produced by more than one input, but each input gives only one output.) The input variable(s)* are sometimes referred to as the argument(s) of the function.


Euclid's "common notions"

From Wikipedia:Euclidean geometry:

Things that do not differ from one another are equal to one another


Things that are equal to the same thing are also equal to one another

If a=b and b=c then a=c

If equals are added to equals, then the wholes are equal

If a=b and c=d then a+c=b+d

If equals are subtracted from equals, then the remainders are equal

If a=b and c=d then a-c=b-d

The whole is greater than the part.

If b>0 then a+b>a


Elementary algebra

ƒ(x) = x^2 is an example of an even function.

From Wikipedia:Elementary algebra:

Elementary algebra builds on and extends arithmetic by introducing letters called variables* to represent general (non-specified) numbers.

Algebraic expressions may be evaluated and simplified, based on the basic properties of arithmetic operations (addition, subtraction, multiplication, division and exponentiation). For example,

  • Added terms are simplified using coefficients. For example, x + x + x can be simplified as 3x (where 3 is a numerical coefficient).
  • Multiplied terms are simplified using exponents. For example, x \times x \times x is represented as x^3
  • Like terms are added together,[29] for example, 2x^2 + 3ab - x^2 + ab is written as x^2 + 4ab, because the terms containing x^2 are added together, and, the terms containing ab are added together.
  • Expressions can be factored. For example, 6x^5 + 3x^2, by dividing both terms by 3x^2 can be written as 3x^2 (2x^3 + 1)

ƒ(x) = x^3 is an example of an odd function.

For any function f, if a=b then:

  • f(a) = f(b)
  • a + c = b + c
  • ac = bc
  • a^c = b^c

One must be careful, though, when squaring both sides of an equation, since this can result in solutions that don't satisfy the original equation.

1 \neq -1 yet 1^2 = (-1)^2

A function is an even function if f(x) = f(-x) for every x in its domain.

A function is an odd function if f(x) = -f(-x) for every x in its domain.



Triangle with notations 2



The law of cosines reduces to the Pythagorean theorem when γ = 90°:

c^2 = a^2 + b^2 - 2ab\cos\gamma.

The law of sines (also known as the "sine rule") for an arbitrary triangle states:

\frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C} = \frac{abc}{2\Delta},

where \Delta is the area of the triangle

\mbox{Area} = \Delta = \frac{1}{2}a b\sin C.
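
The laws above can be tried on a concrete triangle. The following is only a sketch: the side lengths and the angle are made-up sample values, and it assumes the usual convention that angle A is opposite side a, B is opposite b, and C (the γ of the law of cosines) is opposite c.

```python
import math

a, b = 7.0, 5.0                 # two sample side lengths
C = math.radians(60.0)          # sample angle between sides a and b

# Law of cosines gives the third side.
c = math.sqrt(a * a + b * b - 2 * a * b * math.cos(C))

# Area from two sides and the included angle.
area = 0.5 * a * b * math.sin(C)

# Remaining angles: A from the law of cosines, B from the angle sum.
A = math.acos((b * b + c * c - a * a) / (2 * b * c))
B = math.pi - A - C

# Law of sines: all four quantities should agree.
ratios = [a / math.sin(A), b / math.sin(B), c / math.sin(C), a * b * c / (2 * area)]

# Right-angle special case: the law of cosines becomes the Pythagorean theorem.
hypotenuse = math.sqrt(3.0 ** 2 + 4.0 ** 2 - 2 * 3.0 * 4.0 * math.cos(math.pi / 2))
```

Here `c * c` comes out as 39 (up to rounding), the four entries of `ratios` agree with one another, and `hypotenuse` is 5 to within floating-point error.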

The law of tangents:

\frac{a-b}{a+b} = \frac{\tan\left(\frac{A-B}{2}\right)}{\tan\left(\frac{A+B}{2}\right)}.


Right triangles

A right triangle is a triangle with gamma=90 degrees.

For small values of x, sin x ≈ x. (If x is in radians).

SOH → sin = "opposite" / "hypotenuse"

CAH → cos = "adjacent" / "hypotenuse"

TOA → tan = "opposite" / "adjacent"


sin A = opposite / hypotenuse = a/c

cos A = adjacent / hypotenuse = b/c

tan A = opposite / adjacent = a/b

\sin x = \frac{e^{ix} - e^{-ix}}{2i}, \qquad \cos x = \frac{e^{ix} + e^{-ix}}{2}, \qquad \tan x = \frac{i(e^{-ix} - e^{ix})}{e^{ix} + e^{-ix}}.

(Note: the expression of tan(x) has i in the numerator, not in the denominator, because the order of the terms (and thus the sign) of the numerator is changed w.r.t. the expression of sin(x).)
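
The exponential forms of sine, cosine and tangent can be compared against the library functions. This is a minimal sketch; the function names `sin_exp`, `cos_exp` and `tan_exp` and the sample value of x are assumptions for illustration.

```python
import cmath
import math

def sin_exp(x):
    return (cmath.exp(1j * x) - cmath.exp(-1j * x)) / 2j

def cos_exp(x):
    return (cmath.exp(1j * x) + cmath.exp(-1j * x)) / 2

def tan_exp(x):
    # Note the i in the numerator and the swapped order of the two terms.
    return 1j * (cmath.exp(-1j * x) - cmath.exp(1j * x)) / (cmath.exp(1j * x) + cmath.exp(-1j * x))

x = 0.7   # sample angle in radians
errors = (abs(sin_exp(x) - math.sin(x)),
          abs(cos_exp(x) - math.cos(x)),
          abs(tan_exp(x) - math.tan(x)))

# Small-angle approximation sin x ≈ x (x in radians).
small_angle_error = abs(math.sin(0.01) - 0.01)
```

All three entries of `errors` are at the level of floating-point rounding, and `small_angle_error` is well below one part in a million.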


Hyperbolic functions

See also: Hyperbolic angle*
From Wikipedia:Hyperbolic function:

A ray through the unit hyperbola* x^2 - y^2 = 1 at the point (\cosh a, \sinh a), where a is twice the area between the ray, the hyperbola, and the x-axis. For points on the hyperbola below the x-axis, the area is considered negative (see animated version with comparison with the trigonometric (circular) functions).


Circle and hyperbola tangent at (1,1) display geometry of circular functions in terms of circular sector* area u and hyperbolic functions depending on hyperbolic sector* area u.

Hyperbolic functions are analogs of the ordinary trigonometric, or circular, functions.

  • Hyperbolic sine:
\sinh x = \frac {e^x - e^{-x}} {2} = \frac {e^{2x} - 1} {2e^x} = \frac {1 - e^{-2x}} {2e^{-x}}.
  • Hyperbolic cosine:
\cosh x = \frac {e^x + e^{-x}} {2} = \frac {e^{2x} + 1} {2e^x} = \frac {1 + e^{-2x}} {2e^{-x}}.
  • Hyperbolic tangent:
\tanh x = \frac{\sinh x}{\cosh x} = \frac {e^x - e^{-x}} {e^x + e^{-x}} = \frac{e^{2x} - 1} {e^{2x} + 1} = \frac{1 - e^{-2x}} {1 + e^{-2x}}.
  • Hyperbolic cotangent:
\coth x = \frac{\cosh x}{\sinh x} = \frac {e^x + e^{-x}} {e^x - e^{-x}} = \frac{e^{2x} + 1} {e^{2x} - 1} = \frac{1 + e^{-2x}} {1 - e^{-2x}}, \qquad x \neq 0.
  • Hyperbolic secant:
\operatorname{sech} x = \frac{1}{\cosh x} = \frac {2} {e^x + e^{-x}} = \frac{2e^x} {e^{2x} + 1} = \frac{2e^{-x}} {1 + e^{-2x}}.
  • Hyperbolic cosecant:
\operatorname{csch} x = \frac{1}{\sinh x} = \frac {2} {e^x - e^{-x}} = \frac{2e^x} {e^{2x} - 1} = \frac{2e^{-x}} {1 - e^{-2x}}, \qquad x \neq 0.
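
The exponential definitions in the list above can be compared with the library versions, and the point (cosh x, sinh x) can be checked to lie on the unit hyperbola. This is a sketch; the function names and the sample value of x are assumptions.

```python
import math

def sinh_exp(x):
    return (math.exp(x) - math.exp(-x)) / 2

def cosh_exp(x):
    return (math.exp(x) + math.exp(-x)) / 2

def tanh_exp(x):
    # Uses the e^{2x} form of the hyperbolic tangent.
    return (math.exp(2 * x) - 1) / (math.exp(2 * x) + 1)

x = 1.3   # sample value
errors = (abs(sinh_exp(x) - math.sinh(x)),
          abs(cosh_exp(x) - math.cosh(x)),
          abs(tanh_exp(x) - math.tanh(x)))

# The point (cosh x, sinh x) should satisfy X^2 - Y^2 = 1.
on_hyperbola = cosh_exp(x) ** 2 - sinh_exp(x) ** 2
```

The three errors are at rounding level, and `on_hyperbola` is 1 to within floating-point error, which is the hyperbolic counterpart of cos² + sin² = 1.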


Areas and volumes

The length of the circumference C of a circle is related to the radius r and diameter d by:

\mathrm{Circumference} = \tau r = 2 \pi r = \pi d
\pi \approx 3.141592654
\tau = 2\pi

The area of a circle is:

\mathrm{Area} = \pi r^2

The surface area of a sphere is

\mathrm{\text{Surface area}} = 4 \cdot \pi r^2
The surface area of a sphere 1 unit in radius is:
4 \pi (1 \text{ unit})^2 \approx 12.56637 \text{ unit}^2
The surface area of a sphere 128 units in radius is:
4 \pi (128 \text{ unit})^2 \approx 205,887 \text{ unit}^2

The volume inside a sphere is

\mathrm{Volume} = \frac{4}{3} \cdot \pi r^3
The volume of a sphere 1 unit in radius is:
V = \frac{4}{3} \cdot \pi (1 \text{ unit})^3 \approx 4.1888 \text{ unit}^3

The area of a regular hexagon is:

\mathrm{Area} = \frac{3 \sqrt{3}}{2} a^2 \approx 2.59807621135 \cdot a^2
where a is the length of any side.
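
The formulas in this section can be collected into a few small functions and checked against the sample values quoted above. This is only a sketch; the function names are assumptions chosen for readability.

```python
import math

tau = 2 * math.pi

def circumference(r):
    return tau * r

def circle_area(r):
    return math.pi * r ** 2

def sphere_surface_area(r):
    return 4 * math.pi * r ** 2

def sphere_volume(r):
    return 4.0 / 3.0 * math.pi * r ** 3

def regular_hexagon_area(a):
    """Area of a regular hexagon with side length a."""
    return 3 * math.sqrt(3) / 2 * a ** 2

unit_surface = sphere_surface_area(1)     # about 12.56637
big_surface = sphere_surface_area(128)    # about 205,887
unit_volume = sphere_volume(1)            # about 4.1888
unit_hexagon = regular_hexagon_area(1)    # about 2.59807621135
```

Note that `circumference(r)` is the same as `math.pi * 2 * r`, which is the τr = 2πr = πd relation from the start of the section.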



See also: Runge's phenomenon*, Polynomial ring*, System of polynomial equations*, Rational root theorem*, Descartes' rule of signs*, and Complex conjugate root theorem*
From Wikipedia:Polynomial:

A polynomial can always be written in the form

polynomial = Z(x) = a_0 + a_1 x + a_2 x^2 + \dotsb + a_{n-1}x^{n-1} + a_n x^n

where a_0, \ldots, a_n are constants called coefficients and n is the degree of the polynomial.

A linear polynomial* is a polynomial of degree one.

Each individual term* is the product of the coefficient* and a variable raised to a nonnegative integer power.

A monomial* has only one term.
A binomial* has 2 terms.

Fundamental theorem of algebra*:

Every single-variable polynomial of degree n ≥ 1 with complex coefficients has exactly n complex roots, counted with multiplicity.
That is, some or even all of the roots might be the same number.
A root (or zero) of a function is a value of x for which Z(x)=0.
Z(x) = a_n(x - z_1)(x - z_2)\dotsb(x - z_n)
If Z(x) = (x - z_1)(x - z_2)^k then z_2 is a root of multiplicity* k.[30] z_2 is a root of multiplicity k-1 of the derivative (Derivative is defined below) of Z(x).
If k=1 then z_2 is a simple root.
The graph is tangent to the x-axis at the multiple roots of Z and not tangent at the simple roots.
The graph crosses the x-axis at roots of odd multiplicity and bounces off (does not go through) the x-axis at roots of even multiplicity.
Near x=z_2 the graph has the same general shape as A(x - z_2)^k.
The roots of the equation ax^2+bx+c=0 are given by the Quadratic formula:
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. See Completing the square
ax^2+bx+c = a(x+\frac{b}{2a})^2+c-\frac{b^2}{4a} = a(x-h)^2+k
With h = -\frac{b}{2a} and k = c-\frac{b^2}{4a}, this is the parabola y = x^2 shifted to the right h units, stretched vertically by a factor of a, and moved upward k units.
k is the value at x=h and is either the maximum or the minimum value.
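
The quadratic formula and the completed-square form can be sketched as two small functions. The names `quadratic_roots` and `vertex` are assumptions; `cmath.sqrt` is used so that a negative discriminant gives complex roots instead of an error.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0 (a must be nonzero)."""
    d = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + d) / (2 * a), (-b - d) / (2 * a))

def vertex(a, b, c):
    """Return (h, k) such that a*x^2 + b*x + c = a*(x - h)^2 + k."""
    h = -b / (2 * a)
    k = c - b * b / (4 * a)
    return h, k

real_roots = quadratic_roots(1, -3, 2)      # x^2 - 3x + 2 = (x - 1)(x - 2)
complex_roots = quadratic_roots(1, 0, 1)    # x^2 + 1 = (x + i)(x - i)
h, k = vertex(1, -3, 2)                     # minimum of x^2 - 3x + 2
```

For x² − 3x + 2 the roots are 2 and 1 and the vertex is at h = 1.5 with k = −0.25, which is the minimum value because a is positive.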

(x+y)^n = {n \choose 0}x^n y^0 + {n \choose 1}x^{n-1}y^1 + {n \choose 2}x^{n-2}y^2 + \cdots + {n \choose n-1}x^1 y^{n-1} + {n \choose n}x^0 y^n

where \binom{n}{k} = \frac{n!}{k! (n-k)!}. See Binomial coefficient

x^2 - y^2 = (x + y)(x - y)

x^2 + y^2 = (x + yi)(x - yi)

The polynomial remainder theorem states that the remainder of the division of a polynomial Z(x) by the linear polynomial x-a is equal to Z(a). See Ruffini's rule*.

Determining the value Z(a) is sometimes easier if we use Horner's method* (synthetic division*) by writing the polynomial in the form

Z(x) = a_0 + x(a_1 + x(a_2 + \cdots + x(a_{n-1} + x(a_n)))).
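
The nested form above translates directly into a short loop. This is a minimal sketch: the function names and the convention of listing coefficients from a_0 up to a_n are assumptions. The second function carries out synthetic division by x − a and shows that the remainder equals Z(a), as the polynomial remainder theorem states.

```python
def horner(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x^n with coeffs = [a_0, ..., a_n]."""
    result = 0
    for a in reversed(coeffs):      # start from the innermost bracket, a_n
        result = result * x + a
    return result

def synthetic_division(coeffs, a):
    """Divide the polynomial by (x - a).
    Returns (quotient coefficients [q_0, ..., q_{n-1}], remainder)."""
    partial = []
    carry = 0
    for c in reversed(coeffs):
        carry = carry * a + c
        partial.append(carry)
    remainder = partial.pop()       # the last partial result is Z(a)
    return list(reversed(partial)), remainder

Z = [2, -3, 0, 1]                   # Z(x) = 2 - 3x + x^3
value = horner(Z, 2)                # 2 - 6 + 8 = 4
quotient, remainder = synthetic_division(Z, 2)
```

Here `quotient` is `[1, 2, 1]`, that is x² + 2x + 1, and `remainder` equals `value`, so Z(x) = (x − 2)(x² + 2x + 1) + 4.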

A monic polynomial* is a one variable polynomial in which the leading coefficient is equal to 1.

a_0 + a_1x + a_2x^2 + \cdots + a_{n-1}x^{n-1} + 1x^n


Rational functions

A rational function* is a function of the form

f(x) = k{(x - z_1)(x - z_2)\dotsb(x - z_n) \over (x - p_1)(x - p_2)\dotsb(x - p_m)} = {Z(x) \over P(x)}

It has n zeros and m poles. A pole is a value of x for which |f(x)| = infinity.

The vertical asymptotes are the poles of the rational function.
If n<m then f(x) has a horizontal asymptote at the x-axis.
If n=m then f(x) has a horizontal asymptote at y=k.
If n>m then f(x) has no horizontal asymptote.
See also Wikipedia:Asymptote#Oblique_asymptotes*
Given two polynomials Z(x) and P(x) = (x-p_1)(x-p_2) \cdots (x-p_m), where the p_i are distinct constants and deg Z < m, partial fractions are generally obtained by supposing that
\frac{Z(x)}{P(x)} = \frac{c_1}{x-p_1} + \frac{c_2}{x-p_2} + \cdots + \frac{c_m}{x-p_m}
and solving for the c_i constants, by substitution, by equating the coefficients* of terms involving the powers of x, or otherwise.
(This is a variant of the method of undetermined coefficients*.)[31]
If the degree of Z is not less than m then use long division to divide P into Z. The remainder then replaces Z in the equation above and one proceeds as before.
If P(x) = (x-p)^m then \frac{Z(x)}{P(x)} = \frac{c_1}{(x-p)} + \frac{c_2}{(x-p)^2} + \cdots + \frac{c_m}{(x-p)^m}
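
For distinct poles, the substitution approach has a convenient closed form: multiplying both sides by (x − p_i) and then setting x = p_i leaves c_i = Z(p_i) divided by the product of (p_i − p_j) over all j ≠ i. The sketch below implements just that case. The function name `partial_fraction_coeffs` is an assumption, it does not handle repeated poles, and it uses exact fractions to avoid rounding.

```python
from fractions import Fraction

def partial_fraction_coeffs(z_coeffs, poles):
    """Coefficients c_i of Z(x) / ((x - p_1)...(x - p_m)) for distinct poles.
    z_coeffs lists Z from the constant term upward; requires deg Z < m."""
    def Z(x):
        result = Fraction(0)
        for a in reversed(z_coeffs):        # Horner evaluation of Z
            result = result * x + a
        return result

    coeffs = []
    for i, p in enumerate(poles):
        denom = Fraction(1)
        for j, q in enumerate(poles):
            if j != i:
                denom *= Fraction(p) - Fraction(q)
        coeffs.append(Z(Fraction(p)) / denom)
    return coeffs

# (x + 3) / ((x - 1)(x + 2)) = (4/3)/(x - 1) - (1/3)/(x + 2)
c = partial_fraction_coeffs([3, 1], [1, -2])
```

The result can be checked by recombining: (4/3)(x + 2) − (1/3)(x − 1) = x + 3.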

A Generalized hypergeometric series* is given by

\sum_{x=0}^{\infty} c_x where c_0=1 and {c_{x+1} \over c_x} = {Z(x) \over P(x)} = f(x)

The function f(x) has n zeros and m poles.

Basic hypergeometric series*, or hypergeometric q-series, are q-analogue* generalizations of generalized hypergeometric series.[32]
Roughly speaking, a q-analog* of a theorem, identity or expression is a generalization involving a new parameter q that returns the original theorem, identity or expression in the limit as q → 1.[33]
We define the q-analog of n, also known as the q-bracket or q-number of n, to be
[n]_q=\frac{1-q^n}{1-q} = q^0 + q^1 + q^2 + \ldots + q^{n - 1}
One may define the q-analog of the factorial, known as the q-factorial*, by

[n]_q! = [1]_q  \cdot [2]_q \cdots [n-1]_q  \cdot [n]_q
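
The q-bracket and q-factorial are easy to compute from the sum form, which also makes the q → 1 behaviour visible without dividing by zero. This is a sketch with assumed function names; exact fractions are used for the sample value q = 1/2.

```python
from fractions import Fraction

def q_bracket(n, q):
    """[n]_q = q^0 + q^1 + ... + q^(n-1)."""
    return sum(q ** k for k in range(n))

def q_factorial(n, q):
    """[n]_q! = [1]_q * [2]_q * ... * [n]_q."""
    result = 1
    for k in range(1, n + 1):
        result *= q_bracket(k, q)
    return result

q = Fraction(1, 2)
bracket_3 = q_bracket(3, q)             # 1 + 1/2 + 1/4 = 7/4
closed_form = (1 - q ** 3) / (1 - q)    # same value from the closed form
ordinary_factorial = q_factorial(4, 1)  # at q = 1 this is just 4! = 24
```

At q = 1 every bracket [n]_q reduces to n, so the q-factorial reduces to the ordinary factorial.
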
Elliptic hypergeometric series* are generalizations of basic hypergeometric series.
An elliptic function is a meromorphic function that is periodic in two directions.

A generalized hypergeometric function* is given by

F(x) = {}_nF_m(z_1,...z_n;p_1,...p_m;x) = \sum_{y=0}^{\infty} c_y x^y

So for e^x (see below) we have:

 c_y = \frac{1}{y!}, \qquad \frac{c_{y+1}}{c_y} = \frac{1}{y+1}.
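
A series defined by the ratio of successive coefficients can be summed without ever computing a factorial. The sketch below is an illustration only: the function name `series_from_ratio` and the fixed number of terms are assumptions, and it simply truncates the series rather than testing for convergence.

```python
import math

def series_from_ratio(ratio, x, terms=30):
    """Sum c_y * x^y for y = 0 .. terms-1, where c_0 = 1 and
    c_{y+1} = c_y * ratio(y)."""
    total = 0.0
    c = 1.0          # current coefficient c_y
    power = 1.0      # current power x^y
    for y in range(terms):
        total += c * power
        c *= ratio(y)
        power *= x
    return total

# For e^x the ratio of successive coefficients is 1/(y+1).
exp_approx = series_from_ratio(lambda y: 1.0 / (y + 1), 1.5)
```

With 30 terms `exp_approx` agrees with `math.exp(1.5)` to within floating-point rounding.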


Integration and differentiation


Force • distance = energy

See also: Hyperreal number and Implicit differentiation

The integral is a generalization of multiplication.

For example: a unit mass dropped from point x_2 to point x_1 will release energy.
The usual equation is a simple multiplication:
gravity \cdot (x_2 - x_1) = energy
But that equation can't be used if the strength of gravity is itself a function of x.
The strength of gravity at x_1 would be different than it is at x_2.
And in reality gravity really does depend on x (x is the distance from the center of the earth):
gravity(x) = 1/x^2 (See inverse-square law.)
However, the corresponding Definite integral is easily solved:
\int_{x_1}^{x_2} gravity(x) \cdot dx
The rules for solving definite integrals are surprisingly simple:
\int_{x_1}^{x_2} f(x) \cdot dx \quad = \quad F(x_2)-F(x_1)

F(x) is called the indefinite integral. (antiderivative)

F(x) = \int f(x) \cdot dx

k and y are arbitrary constants (with y ≠ −1):

\int k \cdot x^y \cdot dx \quad = \quad k \cdot \int x^y \cdot dx \quad = \quad k \cdot \frac{x^{y+1}}{y+1}

(Units (feet, mm...) behave exactly like constants.)

And most conveniently :

\int \bigg (f(x) + g(x) \bigg) \cdot dx = \int f(x) \cdot dx + \int g(x) \cdot dx
The integral of a function is equal to the area under the curve.
When the "curve" is a constant (in other words, k•x^0) then the integral reduces to ordinary multiplication.
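
The gravity example can be checked by adding up many thin slices numerically and comparing with F(x_2) − F(x_1). This is a sketch under stated assumptions: the midpoint rule, the function names, and the sample limits x_1 = 1 and x_2 = 2 are all chosen for illustration, and the antiderivative −1/x follows from the power rule with k = 1 and y = −2.

```python
def midpoint_integral(f, a, b, n=100000):
    """Approximate the definite integral of f from a to b with n midpoint slices."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def gravity(x):
    return 1.0 / x ** 2

def F(x):
    """Antiderivative of 1/x^2 from the power rule: x^(-1) / (-1)."""
    return -1.0 / x

x1, x2 = 1.0, 2.0
numeric = midpoint_integral(gravity, x1, x2)
exact = F(x2) - F(x1)                               # 1/x1 - 1/x2 = 0.5

# A constant "curve" reduces to ordinary multiplication: 3 * (4 - 0) = 12.
constant_case = midpoint_integral(lambda x: 3.0, 0.0, 4.0)
```

The numerical sum and the antiderivative agree closely, and the constant case gives force times distance, as in the simple multiplication formula.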

The derivative is a generalization of division.

The derivative of the integral of f(x) is just f(x).


The derivative of a function at any point is equal to the slope of the function at that point.


The equation of the line tangent to a function at point a is

y(x) = f(a) + f'(a)(x-a)

The Lipschitz constant of a function is a real number for which the absolute value of the slope of the function at every point is not greater than this real number.

The derivative of f(x) where f(x) = k•x^y is

f'(x) = {df \over dx} = {d(k \cdot x^y) \over dx} \quad = \quad k \cdot {d(x^y) \over dx} \quad = \quad k \cdot y \cdot x^{y-1}
The derivative of k \cdot x^0 is k \cdot 0 \cdot x^{-1} = 0.
The integral of x^{-1} is ln(x).[34] See natural log

Chain rule for the derivative of a function of a function:

f(g(x))' = \frac{df}{dg} \cdot \frac{dg}{dx}

The Chain rule for a function of 2 functions:

f(g(x), h(x))' = \frac{\operatorname df}{\operatorname dx} = { \partial f \over \partial g}{\operatorname dg \over \operatorname dx} + {\partial f \over \partial h}{\operatorname dh \over \operatorname dx } (See "partial derivatives" below)

The Product rule can be considered a special case of the chain rule for several variables[35]

\frac{df}{dx} = {d (g(x) \cdot h(x)) \over dx} = \frac{\partial(g \cdot h)}{\partial g}\frac{dg}{dx}+\frac{\partial (g \cdot h)}{\partial h}\frac{dh}{dx} = \frac{dg}{dx} h + g \frac{dh}{dx}

Product rule:

(g \cdot h)' = \frac{(g+dg) \cdot (h+dh) - g \cdot h}{dx} = g' \cdot h + g \cdot h' (because dh \cdot dg is negligible)
(g \cdot h \cdot j)' = g' \cdot h \cdot j + g \cdot h' \cdot j + g \cdot h \cdot j'

General Leibniz rule*:

(gh)^{(n)}=\sum_{k=0}^n {n \choose k} g^{(n-k)} h^{(k)}

By the chain rule:

\bigg(\frac{1}{h}\bigg)' = \frac{-1}{h^2} \cdot h'

Therefore the Quotient rule:

\bigg( \frac{g(x)}{h(x)} \bigg)' = \bigg( g \cdot \frac{1}{h} \bigg)'  = g' \cdot \frac{1}{h} + g \cdot \frac{-h'}{h^2} = \frac{g' \cdot h  - g \cdot h'}{h^2}
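
The power, product and quotient rules can be sanity-checked against a numerical slope. This is only a sketch: the central-difference helper `derivative`, the step size, and the sample functions sin and exp are assumptions, and the comparison is approximate by nature.

```python
import math

def derivative(f, x, step=1e-6):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + step) - f(x - step)) / (2 * step)

g, dg = math.sin, math.cos      # g and its known derivative
h, dh = math.exp, math.exp      # h and its known derivative
x = 0.8

# Power rule: d(k * x^y)/dx = k * y * x^(y-1), here with k = 5, y = 3.
power_numeric = derivative(lambda t: 5 * t ** 3, 2.0)
power_rule = 5 * 3 * 2.0 ** 2

# Product rule: (g*h)' = g'*h + g*h'
product_numeric = derivative(lambda t: g(t) * h(t), x)
product_rule = dg(x) * h(x) + g(x) * dh(x)

# Quotient rule: (g/h)' = (g'*h - g*h') / h^2
quotient_numeric = derivative(lambda t: g(t) / h(t), x)
quotient_rule = (dg(x) * h(x) - g(x) * dh(x)) / h(x) ** 2
```

Each numerical estimate matches the corresponding rule to several decimal places, which is as close as a finite step size allows.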

There is a chain rule for integration but the inner function must have the form g=ax+c so that its derivative \frac{dg}{dx} = a and therefore dx=\frac{dg}{a}

\int f(g(x)) \cdot dx = \int f(g) \cdot \frac{dg}{a} = \frac{1}{a} \int f(g) \cdot dg

Actually the inner function can have the form g=ax^y+c so that its derivative \frac{dg}{dx} = a \cdot y \cdot x^{y-1} and therefore dx=\frac{dg}{a \cdot y \cdot x^{y-1}} provided that all factors involving x cancel out.

\int x^{y-1} \cdot f(g(x)) \cdot dx = \int {\color{red} x^{y-1}} \cdot f(g) \cdot \frac{dg}{a \cdot y \cdot {\color{red} x^{y-1}}} = \frac{1}{a \cdot y} \int f(g) \cdot dg

The product rule for integration is called Integration by parts

g \cdot h' = (g \cdot h)' - g' \cdot h
\int g \cdot h' \cdot dx = g \cdot h - \int g' \cdot h \cdot dx

One can use partial fractions or even the Taylor series to convert difficult integrals into a more manageable form.

\frac{f(x)}{(x-1)^2} = \frac{a_0(x-1)^0 + a_1(x-1)^1 + \dots + a_n(x-1)^n}{(x-1)^2}

The fundamental theorem of Calculus is:

F(x) - F(a) = \int_a^x\!f(t)\, dt \quad \text{and} \quad F'(x) = f(x)

The fundamental theorem of calculus is a particular case of the Leibniz integral rule*:

\frac{d}{dx} \left (\int_{a(x)}^{b(x)}f(x,t)\,dt \right) = f\big(x,b(x)\big)\cdot \frac{d}{dx} b(x) - f\big(x,a(x)\big)\cdot \frac{d}{dx} a(x) + \int_{a(x)}^{b(x)}\frac{\partial}{\partial x} f(x,t) \,dt.

In calculus, a function f defined on a subset of the real numbers with real values is called monotonic* if and only if it is either entirely non-increasing, or entirely non-decreasing.[36]

A differential form is a generalisation of the notion of a differential that is independent of the choice of coordinate system*. f(x,y) dx ∧ dy is a 2-form in 2 dimensions (an area element). The derivative operation on an n-form is an (n+1)-form; this operation is known as the exterior derivative. By the generalized Stokes' theorem, the integral of a differential form over the boundary of a manifold is equal to the integral of its exterior derivative over the manifold itself.


Taylor & Maclaurin series

If we know the value of a smooth function at x=0 (smooth means all its derivatives are continuous) and we also know the value of all of its derivatives at x=0 then, provided the function is analytic, we can determine its value at any other point x within the radius of convergence by using the Maclaurin series. ("!" means factorial)

f(x) = a_0 x^0 + a_1 x^1 + a_2 x^2 + a_3 x^3 + \cdots \quad \text{where} \quad a_n = {f^{(n)}(0) \over n!}

The proof of this is actually quite simple. Plugging in a value of x=0 causes all terms but the first to become zero. So, assuming that such a series exists, a0 must be the value of the function at x=0. Differentiate both sides of the equation and plug in x=0 again to find the next coefficient. And so on.

The Taylor series generalizes this formula.
f(z)=\sum_{k=0}^\infty \alpha_k (z-z_0)^k

Riemann surface for the function ƒ(z) = √z. For the imaginary part rotate 180°.

An analytic function is a function whose Taylor series about every z0 in its domain converges to the function in some neighborhood of z0; analytic functions are infinitely differentiable.
Any vector g = (z0, α0, α1, ...) is a germ* if it represents a power series of an analytic function around z0 with some radius of convergence r > 0.
The set of germs \mathcal G is a Riemann surface.
Riemann surfaces are the objects on which multi-valued functions become single-valued.
A connected component* of \mathcal G (i.e., an equivalence class) is called a sheaf*.

We can easily determine the Maclaurin series expansion of the exponential function e^x (because it is equal to its own derivative).[34]

e^x = \sum_{n = 0}^{\infty} {x^n \over n!} = {x^0 \over 0!} + {x^1 \over 1!} + {x^2 \over 2!} + {x^3 \over 3!} + {x^4 \over 4!} + \cdots
The above holds true even if x is a matrix. See Matrix exponential*
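
The partial sums of this series can be compared with a library exponential. The sketch below handles ordinary real x only (not matrices); the function name exp_maclaurin and the number of terms are illustrative choices.

```python
import math

def exp_maclaurin(x, terms=30):
    """Sum the first `terms` terms of the Maclaurin series of e^x."""
    total = 0.0
    term = 1.0  # x^0 / 0!
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # turn x^n/n! into x^(n+1)/(n+1)!
    return total

for x in (0.5, 1.0, 2.0):
    print(x, exp_maclaurin(x), math.exp(x))
```

Each term is built from the previous one, which avoids computing large powers and factorials separately.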

And cos(x) and sin(x) (because cosine is the derivative of sine which is the derivative of -cosine)

\cos x = \frac{x^0}{0!} - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots

\sin x = \frac{x^1}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots

It then follows that e^{ix}=\cos x+i\sin x=\operatorname{cis} x and therefore e^{i \pi}=-1 + i\cdot0. See Euler's formula.

x is the angle in radians*.
This makes the equation for a circle in the complex plane, and by extension sine and cosine, extremely simple and easy to work with especially with regard to differentiation and integration.
\frac{d(e^{i \cdot k \cdot t})}{dt} = i \cdot k \cdot e^{i \cdot k \cdot t}
Differentiation and integration are replaced with multiplication and division. Calculus is replaced with algebra. Therefore any expression that can be represented as a sum of sine waves can be easily differentiated or integrated.
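
A short numerical check of Euler's formula and of the derivative rule above, as a sketch: the sample values of x, k and t and the finite-difference step are arbitrary assumptions.

```python
import cmath
import math

# Euler's formula: e^{ix} = cos x + i sin x
x = 0.75
euler_left = cmath.exp(1j * x)
euler_right = complex(math.cos(x), math.sin(x))

# Derivative of e^{ikt}: central finite difference versus i*k*e^{ikt}
k, t, step = 3.0, 0.4, 1e-6
numeric_derivative = (cmath.exp(1j * k * (t + step)) - cmath.exp(1j * k * (t - step))) / (2 * step)
algebraic_derivative = 1j * k * cmath.exp(1j * k * t)

print(abs(euler_left - euler_right))
print(abs(numeric_derivative - algebraic_derivative))
```

Both printed differences should be very small, limited only by floating point and the finite-difference step.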


Fourier Series


The Maclaurin series cannot be used for a discontinuous function like a square wave because it is not differentiable. (Distributions* make it possible to differentiate functions whose derivatives do not exist in the classical sense. See Generalized function*.)

But remarkably we can use the Fourier series to expand it or any other periodic function into an infinite sum of sine waves each of which is fully differentiable!

f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty \left[a_n\cos\left(\tfrac{2\pi nt}{p}\right)+b_n\sin\left(\tfrac{2\pi nt}{p}\right)\right]
a_n = \frac{2}{p}\int_{t_0}^{t_0+p} f(t)\cdot  \cos\left(\tfrac{2\pi nt}{p}\right)\ dt
b_n = \frac{2}{p}\int_{t_0}^{t_0+p} f(t)\cdot  \sin\left(\tfrac{2\pi nt}{p}\right)\ dt
where p is the period of f(t). (For p = 2π the arguments reduce to nt.)
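
The coefficient integrals can be approximated numerically. The sketch below does this for a square wave with period p = 2π (an assumed example: +1 on the first half period, −1 on the second), using a plain midpoint rule. For this wave the cosine coefficients should vanish and the sine coefficients should be close to 4/(nπ) for odd n and zero for even n.

```python
import math

def square_wave(t):
    """Square wave of period 2*pi: +1 on the first half period, -1 on the second."""
    return 1.0 if (t % (2 * math.pi)) < math.pi else -1.0

def fourier_coefficients(func, n, period=2 * math.pi, steps=20_000):
    """Approximate a_n and b_n over one period with the midpoint rule."""
    width = period / steps
    a_n = b_n = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        angle = 2 * math.pi * n * t / period
        a_n += func(t) * math.cos(angle) * width
        b_n += func(t) * math.sin(angle) * width
    return 2 / period * a_n, 2 / period * b_n

for n in range(1, 6):
    a_n, b_n = fourier_coefficients(square_wave, n)
    print(n, round(a_n, 6), round(b_n, 6))
```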
Figure: graph of sin²(x) = 0.5·cos(0x) − 0.5·cos(2x) (half of one minus the cosine of twice x).

The reason this works is because sine and cosine are orthogonal functions*.
\langle \sin,\cos\rangle=0.
That means that multiplying any 2 sine waves of frequency n and frequency m and integrating over one period will always equal zero unless n=m.
See the graph of sin²(x).
\sin mx \cdot \sin nx = \frac{\cos (m - n)x - \cos (m+n) x}{2}
See Amplitude_modulation*
And of course ∫ fn*(f1+f2+f3+...) = ∫ (fn*f1) + ∫ (fn*f2) + ∫ (fn*f3) +...
The complex form of the Fourier series uses complex exponentials instead of sine and cosine and uses both positive and negative frequencies (clockwise and counter clockwise) whose imaginary parts cancel.
The complex coefficients encode both amplitude and phase; for a real-valued function the coefficients of the positive and negative frequencies are complex conjugates of each other.
F(\nu) = \mathcal{F}\{f\} = \int_{\mathbb{R}^n} f(x) e^{-2 \pi i x\cdot\nu} \, \mathrm{d}x
where the dot between x and ν indicates the inner product of \mathbb{R}^n.
A 2 dimensional Fourier-related transform (the discrete cosine transform) is used in image and video compression.
A discrete Fourier transform* can be computed very efficiently by a fast Fourier transform*.
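
To make the difference concrete, the sketch below compares a direct discrete Fourier transform with a recursive radix-2 Cooley-Tukey recursion, which is one common FFT algorithm (the text above does not name a specific one). The sample signal is arbitrary and the function names are illustrative.

```python
import cmath

def dft(samples):
    """Direct O(N^2) discrete Fourier transform."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return list(samples)
    even = fft(samples[0::2])  # transform of the even-indexed samples
    odd = fft(samples[1::2])   # transform of the odd-indexed samples
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result

signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
largest_gap = max(abs(a - b) for a, b in zip(dft(signal), fft(signal)))
print(largest_gap)
```

The direct version does on the order of N² multiplications, while the recursive version does on the order of N·log N, which is where the efficiency comes from.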
In mathematical analysis, many generalizations of Fourier series have proven to be useful.
They are all special cases of decompositions over an orthonormal basis of an inner product space.[37]
Spherical harmonics* are a complete set of orthogonal functions on the sphere, and thus may be used to represent functions defined on the surface of a sphere, just as circular functions (sines and cosines) are used to represent functions on a circle via Fourier series.[38]
Spherical harmonics are basis functions* for SO(3). See Laplace series.
Every continuous function in the function space can be represented as a linear combination* of basis functions, just as every vector in a vector space can be represented as a linear combination of basis vectors.
Every quadratic polynomial can be written as a·1 + b·t + c·t², that is, as a linear combination of the basis functions 1, t, and t².



Fourier transforms generalize Fourier series to nonperiodic functions like a single pulse of a square wave.

The more localized in the time domain (the shorter the pulse) the more the Fourier transform is spread out across the frequency domain and vice versa, a phenomenon known as the uncertainty principle.

The Fourier transform of the Dirac delta function gives G(ω)=1.

G(\omega)=\mathcal{F}\{f(t)\}=\int_{-\infty}^\infty f(t) e^{-i\omega t}dt
Laplace transforms generalize Fourier transforms to complex frequency s=\sigma+i\omega.
Complex frequency includes a term corresponding to the amount of damping.
F(s)=\mathcal{L}\{f(t)\}=\int_0^\infty f(t) e^{-\sigma t}e^{-i \omega t}dt
\mathcal{L}\{ \delta(t-a) \} = e^{-as}, (assuming a > 0)
\mathcal{L}\{e^{at} \}= \frac{1}{s - a}
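
The transform pair for e^{at} can be checked by truncating the integral at a large upper limit and integrating numerically. This is only a sketch: the values of a and s, the upper limit and the midpoint rule are all assumptions, and the check is only meaningful when Re(s) > a so that the integrand decays.

```python
import cmath

def laplace_transform(func, s, upper_limit=40.0, steps=200_000):
    """Approximate the integral of func(t)*exp(-s*t) over [0, upper_limit] (midpoint rule)."""
    width = upper_limit / steps
    return sum(
        func((i + 0.5) * width) * cmath.exp(-s * (i + 0.5) * width)
        for i in range(steps)
    ) * width

a = -1.0          # illustrative decay rate
s = 2.0 + 1.0j    # illustrative complex frequency with Re(s) > a
numeric = laplace_transform(lambda t: cmath.exp(a * t), s)
exact = 1 / (s - a)
print(numeric, exact)
```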
The inverse Laplace transform is given by
f(t) = \mathcal{L}^{-1} \{F\} =  \frac{1}{2\pi i}\lim_{T\to\infty}\int_{\gamma-iT}^{\gamma+iT}F(s)e^{st}\,ds,
where the integration is done along the vertical line Re(s) = γ in the complex plane such that γ is greater than the real part of all singularities* of F(s) and F(s) is bounded on the line, for example if the contour path is in the region of convergence*.
If all singularities are in the left half-plane, or F(s) is an entire function* , then γ can be set to zero and the above inverse integral formula becomes identical to the inverse Fourier transform*.[39]
Integral transforms generalize Fourier transforms to other kernels (besides sine and cosine).
Cauchy kernel =\frac{1}{\zeta-x} \quad \text{or} \quad \frac{1}{2\pi i} \cdot \frac{1}{\zeta-x}
Hilbert kernel = \cot\frac{\theta-t}{2}
Poisson Kernel:
For the ball of radius r, B_{r}, in \mathbb{R}^n, the Poisson kernel takes the form:
P(x,\zeta) = \frac{r^2-|x|^2}{r} \cdot \frac{1}{|\zeta-x|^n} \cdot \frac{1}{\omega_{n-1}}
where x\in B_{r}, \zeta\in S (the surface of B_{r}), and \omega_{n-1} is the surface area of the unit (n-1)-sphere* (the boundary of the unit ball in \mathbb{R}^n).
For the unit disk (r=1) in the complex plane:[40]
K(x,\phi) = \frac{1^2-|x|^2}{1} \cdot \frac{1}{|e^{i\phi}-x|^2}\cdot \frac{1}{2\pi}
Dirichlet kernel

D_n(x)=\sum_{k=-n}^n e^{ikx}=1+2\sum_{k=1}^n\cos(kx)=\frac{\sin\left(\left(n + \frac{1}{2}\right) x \right)}{\sin(\frac{x}{2})} \approx 2\pi\delta(x) \quad \text{as } n \to \infty

The convolution* theorem states that[41]

\mathcal{F}\{f*g\} = \mathcal{F}\{f\} \cdot \mathcal{F}\{g\}

where \cdot denotes point-wise multiplication. It also works the other way around:

\mathcal{F}\{f \cdot g\}= \mathcal{F}\{f\}*\mathcal{F}\{g\}

By applying the inverse Fourier transform \mathcal{F}^{-1}, we can write:

f*g= \mathcal{F}^{-1}\big\{\mathcal{F}\{f\}\cdot\mathcal{F}\{g\}\big\}


f \cdot g= \mathcal{F}^{-1}\big\{\mathcal{F}\{f\}*\mathcal{F}\{g\}\big\}

This theorem also holds for the Laplace transform.
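
The continuous theorem has a discrete analogue for circular convolution and the discrete Fourier transform, which is easier to check numerically. The sketch below verifies that discrete analogue, not the continuous statement itself; the sequences f and g are arbitrary illustrative values.

```python
import cmath

def dft(samples):
    """Direct discrete Fourier transform."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def circular_convolution(f, g):
    """(f * g)[k] = sum over j of f[j] * g[(k - j) mod N]."""
    n = len(f)
    return [sum(f[j] * g[(k - j) % n] for j in range(n)) for k in range(n)]

f = [1.0, 2.0, 0.0, -1.0]
g = [0.5, 0.0, 0.25, 0.0]
transform_of_convolution = dft(circular_convolution(f, g))
product_of_transforms = [a * b for a, b in zip(dft(f), dft(g))]
convolution_gap = max(
    abs(a - b) for a, b in zip(transform_of_convolution, product_of_transforms)
)
print(convolution_gap)
```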

The Hilbert transform* is a multiplier operator*. The multiplier of H is σH(ω) = −i sgn(ω) where sgn is the signum function*. Therefore:

\mathcal{F}(H(u))(\omega) = (-i\,\operatorname{sgn}(\omega)) \cdot \mathcal{F}(u)(\omega)

where \mathcal{F} denotes the Fourier transform.

Since sgn(x) = sgn(2πx), it follows that this result applies to the three common definitions of  \mathcal{F}.

By Euler's formula,

\sigma_H(\omega) = \begin{cases}
   i = e^{+\frac{i\pi}{2}}, & \text{for } \omega < 0\\
   0, & \text{for } \omega = 0\\
  -i = e^{-\frac{i\pi}{2}}, & \text{for } \omega > 0
\end{cases}

Therefore, H(u)(t) has the effect of shifting the phase of the negative frequency* components of u(t) by +90° (π/2 radians) and the phase of the positive frequency components by −90°.

And i·H(u)(t) has the effect of restoring the positive frequency components while shifting the negative frequency ones an additional +90°, resulting in their negation.

In electrical engineering, the convolution of one function (the input signal) with a second function (the impulse response) gives the output of a linear time-invariant system (LTI).

At any given moment, the output is an accumulated effect of all the prior values of the input function.


Differential equations

See also: Variation of parameters*

Simple harmonic motion shown both in real space and phase space*.

Simple harmonic motion* of a mass on a spring is a second-order linear ordinary differential equation.

 \text{Force} = \text{mass} \cdot \text{acceleration} = m\frac{\mathrm{d}^2 x}{\mathrm{d}t^2} = -kx,

where m is the inertial mass, x is its displacement from the equilibrium, and k is the spring constant.

Solving for x produces

 x(t) = A\cos\left(\omega t - \varphi\right),

where A is the amplitude (maximum displacement from the equilibrium position),  \omega = 2\pi f = \sqrt{k/m} is the angular frequency, and φ is the phase.

Energy passes back and forth between the potential energy in the spring and the kinetic energy of the mass.

The important thing to note here is that the frequency of the oscillation depends only on the mass and the stiffness of the spring and is totally independent of the amplitude.

That is the defining characteristic of resonance.
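
The independence of the period from the amplitude can be seen in a small simulation. The sketch below integrates m·x'' = −k·x with the semi-implicit Euler method and times one full cycle between two upward zero crossings; the mass, stiffness, time step and function name are illustrative assumptions.

```python
import math

def measured_period(amplitude, mass=1.0, stiffness=4.0, dt=1e-4):
    """Release the mass at rest from x = amplitude and time one full oscillation."""
    x, v, t = amplitude, 0.0, 0.0
    upward_crossings = []
    while len(upward_crossings) < 2:
        v += -(stiffness / mass) * x * dt   # update velocity from the spring force
        x_new = x + v * dt                  # then update position with the new velocity
        if x < 0.0 <= x_new:                # x crossed zero going upward
            fraction = -x / (x_new - x)     # linear interpolation inside the step
            upward_crossings.append(t + fraction * dt)
        x, t = x_new, t + dt
    return upward_crossings[1] - upward_crossings[0]

expected_period = 2 * math.pi * math.sqrt(1.0 / 4.0)  # 2*pi*sqrt(m/k)
print(measured_period(1.0), measured_period(5.0), expected_period)
```

Both amplitudes should give nearly the same period, close to 2π·√(m/k).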


RLC series circuit

Kirchhoff's voltage law* states that the sum of the emfs in any closed loop of any electronic circuit is equal to the sum of the voltage drops* in that loop.[42]

V(t) = V_R + V_L + V_C

V is the voltage, R is the resistance, L is the inductance, C is the capacitance.

V(t) = RI(t) + L \frac{dI(t)}{dt} + \frac{1}{C} \int_{0}^t I(\tau)\, d\tau

I = dQ/dt is the current.

It makes no difference whether the current is a small number of charges moving very fast or a large number of charges moving slowly.

In reality the latter is the case*.


Damping oscillation* is a typical transient response*

If V(t)=0 then the only solution to the equation is the transient response, which (for an underdamped circuit) is a decaying sine wave whose frequency is close to the resonant frequency of the circuit.

Like a mass (inductance) on a spring (capacitance) the circuit will resonate at one frequency.
Energy passes back and forth between the capacitor and the inductor with some loss as it passes through the resistor.

If V(t)=sin(t) from -∞ to +∞ then the only solution is a sine wave with the same frequency as V(t) but with a different amplitude and phase.

If V(t) is zero until t=0 and then equals sin(t) then I(t) will be zero until t=0 after which it will consist of the steady state response plus a transient response.

From Wikipedia:Characteristic equation (calculus):

Starting with a linear homogeneous differential equation with constant coefficients a_{n}, a_{n-1}, \ldots , a_{1}, a_{0},

a_{n}y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_{1}y^\prime + a_{0}y = 0

it can be seen that if y(x) = e^{rx} \, , each term would be a constant multiple of  e^{rx} \, . This results from the fact that the derivative of the exponential function  e^{rx} \, is a multiple of itself. Therefore, y' = re^{rx} \, , y'' = r^{2}e^{rx} \, , and y^{(n)} = r^{n}e^{rx} \, are all multiples. This suggests that certain values of  r \, will allow multiples of  e^{rx} \, to sum to zero, thus solving the homogeneous differential equation.[43] In order to solve for  r \, , one can substitute y = e^{rx} \, and its derivatives into the differential equation to get

a_{n}r^{n}e^{rx} + a_{n-1}r^{n-1}e^{rx} + \cdots + a_{1}re^{rx} + a_{0}e^{rx} = 0

Since  e^{rx} \, can never equate to zero, it can be divided out, giving the characteristic equation

a_{n}r^{n} + a_{n-1}r^{n-1} + \cdots + a_{1}r + a_{0} = 0

By solving for the roots,  r \, , in this characteristic equation, one can find the general solution to the differential equation.[44][45] For example, if  r \, is found to equal 3 (as the only root of a first-order equation), then the general solution will be y(x) = ce^{3x} \, , where  c \, is an arbitrary constant.
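
As a sketch of how this applies to the series RLC circuit: differentiating the circuit equation with V(t)=0 gives L·I'' + R·I' + I/C = 0, whose characteristic equation is the quadratic L·r² + R·r + 1/C = 0. The component values below are assumed for illustration only.

```python
import cmath

def characteristic_roots(a2, a1, a0):
    """Roots of a2*r^2 + a1*r + a0 = 0 via the quadratic formula."""
    root_of_discriminant = cmath.sqrt(a1 * a1 - 4 * a2 * a0)
    return (
        (-a1 + root_of_discriminant) / (2 * a2),
        (-a1 - root_of_discriminant) / (2 * a2),
    )

# Illustrative (assumed) component values for L*I'' + R*I' + I/C = 0
inductance, resistance, capacitance = 1.0, 1.0, 0.25
roots = characteristic_roots(inductance, resistance, 1 / capacitance)
for r in roots:
    # Plugging e^{rt} into the equation leaves this residual times e^{rt}
    residual = inductance * r**2 + resistance * r + 1 / capacitance
    print(r, abs(residual))
```

Complex roots correspond to a decaying oscillation: the real part sets the decay rate and the imaginary part sets the angular frequency.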


Partial derivatives

Partial derivatives and multiple integrals generalize derivatives and integrals to multiple dimensions.

The partial derivative with respect to one variable \frac{\partial f(x,y)}{\partial x} is found by simply treating all other variables as though they were constants.

Multiple integrals are found the same way.

Let f(x, y, z) be a scalar function (for example electric potential energy or temperature).

A 2 dimensional example of a scalar function would be an elevation map.
(Contour lines of an elevation map are an example of a level set*.)


The total derivative of f(x(t), y(t)) with respect to t is[46]

\frac{\operatorname df}{\operatorname dt} = { \partial f \over \partial x}{\operatorname dx \over \operatorname dt} + {\partial f \over \partial y}{\operatorname dy \over \operatorname dt }

And the differential is

\operatorname df = { \partial f \over \partial x}\operatorname dx + {\partial f \over \partial y} \operatorname dy .
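
The total derivative formula can be checked with finite differences. In the sketch below the function f(x, y) = x²y and the path x(t) = cos t, y(t) = sin t are arbitrary illustrative choices, and the partial derivatives are estimated by holding the other variable constant.

```python
import math

def f(x, y):
    return x * x * y  # illustrative scalar function

def x_of(t):
    return math.cos(t)

def y_of(t):
    return math.sin(t)

def central_difference(func, point, step=1e-6):
    """Estimate the derivative of a one-variable function at `point`."""
    return (func(point + step) - func(point - step)) / (2 * step)

t = 0.7
x, y = x_of(t), y_of(t)
df_dx = central_difference(lambda u: f(u, y), x)   # y held constant
df_dy = central_difference(lambda u: f(x, u), y)   # x held constant
chain_rule_value = df_dx * central_difference(x_of, t) + df_dy * central_difference(y_of, t)
direct_value = central_difference(lambda u: f(x_of(u), y_of(u)), t)
print(chain_rule_value, direct_value)
```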


Gradient of scalar field

The Gradient of f(x, y, z) is a vector field whose value at each point is a vector (technically it is a covector because it has units of distance^{-1}) that points "uphill", in the direction of steepest increase, with a magnitude equal to the slope of the function at that point.

You can think of it as how much the function changes per unit distance.

For static (unchanging) fields the electric field is the negative of the Gradient of the electric potential.

Heat flows along the negative of the gradient of temperature (from hot to cold).

\operatorname{grad}(f) = \nabla f = \frac{\partial f}{\partial x} \mathbf{i} + \frac{\partial f}{\partial y}  \mathbf{j} + \frac{\partial f}{\partial z} \mathbf{k} = \mathbf{F}



The Divergence of a vector field is a scalar.

The divergence of the electric field is non-zero wherever there is electric charge and zero everywhere else.

Field lines begin and end at charges because the charges create the electric field.

\operatorname{div}\,\mathbf{F} = {\color{red} \nabla\cdot\mathbf{F} } = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) \cdot (F_x,F_y,F_z) = \frac{\partial F_x}{\partial x} +\frac{\partial F_y}{\partial y} +\frac{\partial F_z}{\partial z}.

The Laplacian is the divergence of the gradient of a function:

\Delta f = \nabla^2 f = (\nabla \cdot \nabla) f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.
Elliptic operators* generalize the Laplacian.
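
The Laplacian can be approximated by replacing each second partial derivative with a central difference. The sketch below applies that to two assumed test functions: x² + y² + z², whose Laplacian is 6 everywhere, and 1/r, whose Laplacian is zero away from the origin.

```python
def laplacian(func, x, y, z, step=1e-3):
    """Second-order central-difference approximation of the Laplacian."""
    center = func(x, y, z)
    return (
        func(x + step, y, z) + func(x - step, y, z)
        + func(x, y + step, z) + func(x, y - step, z)
        + func(x, y, z + step) + func(x, y, z - step)
        - 6 * center
    ) / step**2

def paraboloid(x, y, z):
    return x * x + y * y + z * z             # Laplacian is 6 everywhere

def point_potential(x, y, z):
    return (x * x + y * y + z * z) ** -0.5   # 1/r, harmonic away from the origin

print(laplacian(paraboloid, 1.0, 2.0, 3.0))
print(laplacian(point_potential, 1.0, 2.0, 2.0))
```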



The Curl of a vector field describes how much the vector field is twisted.

(The field may even go in circles.)

For a static magnetic field, the curl at a certain point is proportional to the current density vector at that point because current creates the magnetic field.

In 3 dimensions the dual of the current vector is a bivector.

\text{curl} (\mathbf{F}) = {\color{blue} \nabla \times \mathbf{F} } = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\
{\frac{\partial}{\partial x}} & {\frac{\partial}{\partial y}} & {\frac{\partial}{\partial z}} \\
  F_x & F_y & F_z \end{vmatrix}
\text{curl}( \mathbf{F}) = \left(\frac{\partial F_z}{\partial y}  - \frac{\partial F_y}{\partial z}\right) \mathbf{i} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) \mathbf{j} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) \mathbf{k}

In 2 dimensions this reduces to a single scalar

\text{curl}( \mathbf{F}) = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)

The curl of the gradient of any scalar field is always zero.

The curl of a vector field in 4 dimensions would no longer be a vector. It would be a bivector. However the curl of a bivector field in 4 dimensions would still be a vector.

See also: differential forms*.


Gradient of vector field

The Gradient of a vector field is a tensor field. Each row is the gradient of the corresponding scalar function:

\nabla \mathbf{F} =
\begin{bmatrix}
\frac{\partial}{\partial x} \mathbf{e}_x,
\frac{\partial}{\partial y} \mathbf{e}_y,
\frac{\partial}{\partial z} \mathbf{e}_z
\end{bmatrix}
\begin{bmatrix}
f_x \mathbf{e}_x \\
f_y \mathbf{e}_y \\
f_z \mathbf{e}_z
\end{bmatrix} =
\begin{bmatrix}
{\color{red}  \frac{\partial f_x}{\partial x} \mathbf{e}_{xx} } &
{\color{blue} \frac{\partial f_x}{\partial y} \mathbf{e}_{xy} } &
{\color{blue} \frac{\partial f_x}{\partial z} \mathbf{e}_{xz} } \\
{\color{blue} \frac{\partial f_y}{\partial x} \mathbf{e}_{yx} } &
{\color{red}  \frac{\partial f_y}{\partial y} \mathbf{e}_{yy} } &
{\color{blue} \frac{\partial f_y}{\partial z} \mathbf{e}_{yz} } \\
{\color{blue} \frac{\partial f_z}{\partial x} \mathbf{e}_{zx} } &
{\color{blue} \frac{\partial f_z}{\partial y} \mathbf{e}_{zy} } &
{\color{red}  \frac{\partial f_z}{\partial z} \mathbf{e}_{zz} }
\end{bmatrix}
Remember that \mathbf{e}_{xy} = - \mathbf{e}_{yx} because rotation from y to x is the negative of rotation from x to y.

Partial differential equations can be classified as parabolic*, hyperbolic* and elliptic*.


Green's theorem

The line integral along a 2-D vector field is:

\int (V_1 \cdot dx + V_2 \cdot dy) = \int_a^b \bigg [V_1(x(t),y(t)) \frac{dx}{dt} + V_2(x(t),y(t)) \frac{dy}{dt} \bigg ] dt
\oint (V_1 \cdot dx + V_2 \cdot dy) = \iint \bigg ( \frac{\partial V_2}{\partial x} - \frac{\partial V_1}{\partial y} \bigg ) \cdot dx \cdot dy = \iint \bigg (  {\color{blue} \nabla \times \mathbf{V} } \bigg ) \cdot dx \cdot dy
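
Green's theorem can be checked numerically for a closed curve. The sketch below uses the assumed field V = (−y³, x³) on the unit disk: it integrates V around the unit circle and, separately, integrates the curl (3x² + 3y², worked out by hand) over the disk in polar coordinates.

```python
import math

def v1(x, y):
    return -y**3

def v2(x, y):
    return x**3

def curl_v(x, y):
    return 3 * x**2 + 3 * y**2   # dV2/dx - dV1/dy, worked out by hand

def line_integral(steps=2_000):
    """Integrate V1 dx + V2 dy counterclockwise around the unit circle."""
    width = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        x, y = math.cos(t), math.sin(t)
        # dx/dt = -sin t and dy/dt = cos t on the unit circle
        total += (v1(x, y) * -math.sin(t) + v2(x, y) * math.cos(t)) * width
    return total

def area_integral(radial_steps=400, angular_steps=400):
    """Integrate the curl over the unit disk in polar coordinates."""
    dr, dtheta = 1.0 / radial_steps, 2 * math.pi / angular_steps
    total = 0.0
    for i in range(radial_steps):
        r = (i + 0.5) * dr
        for j in range(angular_steps):
            theta = (j + 0.5) * dtheta
            total += curl_v(r * math.cos(theta), r * math.sin(theta)) * r * dr * dtheta
    return total

print(line_integral(), area_integral())
```

For this field both integrals should be close to 3π/2.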

Divergence is zero everywhere except at the origin where a charge is located. A line integral around any of the red circles will give the same answer because all the circles contain the same amount of charge.

Green's theorem states that if you want to know how many field lines cross (or run parallel to) the boundary of a given region then you can either perform a line integral or you can simply count the number of charges (or the amount of current) within that region. See Divergence theorem

\oiint_S \vec{F} \cdot \ \mathrm{d} \vec{s} = \iiint_D \nabla \cdot \vec{F} \,\mathrm{d}V = \iiint_D \nabla^2 f \,\mathrm{d}V

In 2 dimensions this is

\oint_S \vec{F} \cdot \vec{n} \ \mathrm{d} s = \iint_D \nabla \cdot \vec{F} \ \mathrm{d} A= \iint_D \nabla^2 f \ \mathrm{d} A

Green's theorem is perfectly obvious when dealing with vector fields but is much less obvious when applied to complex valued functions in the complex plane.


The complex plane

Highly recommended: Fundamentals of complex analysis with applications to engineering and science by Saff and Snider

The formula for the derivative of a complex function f at a point z0 is the same as for a real function:

f'(z_0) = \lim_{z \to z_0} {f(z) - f(z_0) \over z - z_0 }.

Every complex function can be written in the form f(z)=f(x+iy)=f_x(x,y)+i f_y(x,y)

Because the complex plane is two dimensional, z can approach z0 from an infinite number of different directions.

However, if within a certain region, the function f is holomorphic (that is, complex differentiable) then, within that region, it will only have a single derivative whose value does not depend on the direction in which z approaches z0 despite the fact that fx and fy each have 2 partial derivatives, one in the x and one in the y direction.

{df \over dz} \quad = \quad {\partial f_x \over \partial x} + i {\partial f_y \over \partial x} \quad = \quad {\partial f_y \over \partial y} - i {\partial f_x \over \partial y} \quad = \quad {\partial f_x \over \partial x} - i {\partial f_x \over \partial y} \quad = \quad {\partial f_y \over \partial y} + i {\partial f_y \over \partial x}
{d^2f \over dz^2} \quad = \quad {\partial^2 f_x \over \partial x^2} + i {\partial^2 f_y \over \partial x^2} \quad = \quad {\partial^2 f_y \over \partial y \partial x} - i {\partial^2 f_x \over \partial y \partial x}

This is only possible if the Cauchy–Riemann conditions are true.

\frac{\partial f_x}{\partial x}=\frac{\partial f_y}{\partial y}\ ,\ \quad \frac{\partial f_y}{\partial x}=-\frac{\partial f_x}{\partial y}
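
The Cauchy–Riemann conditions can be tested numerically by differencing a function along the x and y directions. In the sketch below e^z is used as an assumed example of a holomorphic function and the complex conjugate as an example of a function that is not holomorphic; the point z0 and the step size are arbitrary.

```python
import cmath

def partials(func, z, step=1e-6):
    """Central-difference partial derivatives of func along x and along y."""
    d_dx = (func(z + step) - func(z - step)) / (2 * step)
    d_dy = (func(z + 1j * step) - func(z - 1j * step)) / (2 * step)
    return d_dx, d_dy

def cauchy_riemann_gaps(func, z):
    """Return |dfx/dx - dfy/dy| and |dfy/dx + dfx/dy| at the point z."""
    d_dx, d_dy = partials(func, z)
    # d_dx = dfx/dx + i*dfy/dx and d_dy = dfx/dy + i*dfy/dy
    return abs(d_dx.real - d_dy.imag), abs(d_dx.imag + d_dy.real)

z0 = 0.3 + 0.8j
print(cauchy_riemann_gaps(cmath.exp, z0))                   # both gaps close to 0
print(cauchy_riemann_gaps(lambda z: z.conjugate(), z0))     # first gap is not small
```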

An entire function*, also called an integral function, is a complex-valued function that is holomorphic at all finite points over the whole complex plane.

As with real valued functions, a line integral of a holomorphic function (within a simply connected region on which it is holomorphic) depends only on the starting point and the end point and is totally independent of the path taken.

\int f(z) \cdot dz = \int (f_x \cdot dx - f_y \cdot dy) + i \int (f_y \cdot dx + f_x \cdot dy)
\int f(z) \cdot dz = F(z) = \int_0^t f(z(t)) \cdot \frac{dz}{dt} \cdot dt
\int_a^b f(z) \cdot dz = F(b) - F(a)

The starting point and the end point for any loop are the same. This, of course, implies Cauchy's integral theorem for any holomorphic function f:

\oint f(z) \, dz =
\iint \left( \frac{- \partial f_x}{\partial y} + \frac{- \partial f_y}{\partial x}  \right) dx \, dy +
i \iint \left( \frac{\partial f_x}{\partial x} + \frac{- \partial f_y}{\partial y} \right)  \, dx \, dy = 0

\oint f(z) \, dz =
\iint \left(
{\color{blue} \nabla \times \bar{f}} +
 i {\color{red} \nabla \cdot \bar{f}}
 \right) \, dx \, dy = 0

Therefore the curl and divergence of \bar{f} (the complex conjugate of f, viewed as a vector field) must both be zero for a function to be holomorphic.

Green's theorem for functions (not necessarily holomorphic) in the complex plane:

\oint f(z) \, dz =
2i \iint \left( df/d\bar{z} \right) \, dx \, dy =
i \iint \left( \nabla f \right) \, dx \, dy =
i \iint \left( 1 {\partial f \over \partial x} +
i {\partial f \over \partial y} \right) \, dx \, dy

Computing the residue of a monomial[47]

\oint_C (z-z_0)^n dz = \int_0^{2\pi} e^{in \theta} \cdot i e^{i \theta} d \theta = i \int_0^{2\pi} e^{i (n+1) \theta} d\theta
= \begin{cases}
2\pi i & \text{if } n = -1 \\
0 & \text{otherwise}
\end{cases}
where C is the circle of radius 1 centered on z_0, so that z - z_0 = e^{i\theta} and dz = d(e^{i\theta}) = ie^{i\theta}d\theta
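
The same contour integral can be evaluated numerically by stepping around the unit circle. This is a minimal sketch with an assumed helper name and a midpoint rule in the angle θ; only the power n = −1 should give a nonzero result, close to 2πi.

```python
import cmath

def unit_circle_integral(func, center=0j, steps=2_000):
    """Integrate func(z) dz counterclockwise around the unit circle about center."""
    width = 2 * cmath.pi / steps
    total = 0j
    for i in range(steps):
        theta = (i + 0.5) * width
        point_on_circle = cmath.exp(1j * theta)
        # z = center + e^{i theta}, so dz = i e^{i theta} d theta
        total += func(center + point_on_circle) * 1j * point_on_circle * width
    return total

for n in (-3, -2, -1, 0, 1, 2):
    print(n, unit_circle_integral(lambda z, n=n: z**n))
```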
\oint_{C_r}\frac{f(z)}{z-z_0}dz = \oint_{C_r}\frac{f(z_0)}{z-z_0}dz + \oint_{C_r}\frac{f(z)-f(z_0)}{z-z_0}dz = f(z_0)2\pi i + 0

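The last term in the equation above tends to zero as r approaches 0, because the integrand stays bounded (f is differentiable at z0) while the length of the contour shrinks to zero. Since its value is independent of r it must therefore equal zero for all values of r.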

\bigg | \int_\Gamma f(z) \cdot dz \bigg | \leq \max_{z \in \Gamma}(|f(z)|) \cdot \operatorname{length}(\Gamma)

Cauchy's integral formula states that the value of a holomorphic function within a disc is determined entirely by the values on the boundary of the disc.

Divergence can be nonzero outside the disc.

Cauchy's integral formula can be generalized to more than two dimensions.

Laurent series

f^{(0)}(z_0)=\dfrac{1}{2\pi i}\oint_\gamma f(z)\frac{1}{z-z_0}dz

Which gives:

f'(z_0)=\dfrac{1}{2\pi i}\oint_\gamma f(z)\frac{1}{(z-z_0)^2}dz

f''(z_0)=\dfrac{2}{2\pi i}\oint_\gamma f(z)\frac{1}{(z-z_0)^3}dz
f^{(n)}(z_0) = \frac{n!}{2\pi i} \oint_\gamma f(z)\frac{1}{(z-z_0)^{n+1}}\, dz
Note that n does not have to be an integer. See Fractional calculus*.

The Taylor series becomes:

f(z)=\sum_{n=0}^\infty a_n(z-z_0)^n \quad \text{where} \quad a_n=\frac{1}{2\pi i} \oint_\gamma \frac{f(z)\,\mathrm{d}z}{(z-z_0)^{n+1}} = \frac{f^{(n)}(z_0)}{n!}

The Laurent series* for a complex function f(z) about a point z0 is given by:

f(z)=\sum_{n=-\infty}^\infty a_n(z-z_0)^n \quad \text{where} \quad a_n=\frac{1}{2\pi i} \oint_\gamma \frac{f(z)\,\mathrm{d}z}{(z-z_0)^{n+1}}

The positive subscripts correspond to a line integral around the outer part of the annulus and the negative subscripts correspond to a line integral around the inner part of the annulus. In reality it makes no difference where the line integral is so both line integrals can be moved until they correspond to the same contour gamma. See also: Z-transform*

The function \frac{1}{(z-1)(z-2)} has poles at z=1 and z=2. It therefore has 3 different Laurent series centered on the origin (z0 = 0):

For |z| < 1 the Laurent series has only non-negative subscripts and is the Taylor series.
For 1 < |z| < 2 the Laurent series has both non-negative and negative subscripts.
For 2 < |z| the Laurent series has only negative subscripts.
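
The three series can be written out using the partial fraction decomposition 1/((z−1)(z−2)) = 1/(z−2) − 1/(z−1) and expanding each piece as a geometric series valid in the relevant region. The coefficients in the sketch below were worked out that way by hand and are checked numerically at one arbitrary sample point per region.

```python
def target(z):
    return 1 / ((z - 1) * (z - 2))

def inner_series(z, terms=200):
    """|z| < 1: sum over n >= 0 of (1 - 2^-(n+1)) * z^n."""
    return sum((1 - 2.0 ** -(n + 1)) * z**n for n in range(terms))

def annulus_series(z, terms=200):
    """1 < |z| < 2: -(sum over n >= 1 of z^-n) - (sum over n >= 0 of z^n / 2^(n+1))."""
    negative_part = -sum(z**-n for n in range(1, terms))
    positive_part = -sum(z**n / 2.0 ** (n + 1) for n in range(terms))
    return negative_part + positive_part

def outer_series(z, terms=200):
    """|z| > 2: sum over n >= 1 of (2^(n-1) - 1) * z^-n."""
    return sum((2.0 ** (n - 1) - 1) * z**-n for n in range(1, terms))

for series, point in ((inner_series, 0.5j), (annulus_series, 1.5j), (outer_series, 3 + 1j)):
    print(series.__name__, abs(series(point) - target(point)))
```

Each series only converges in its own region, so evaluating one of them at a point from a different region will not reproduce the function.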

Cauchy formula for repeated integration*:

f^{(-n)}(a) = \frac{1}{(n-1)!} \int_0^a f(z) \left(a-z\right)^{n-1} \,\mathrm{d}z

For every holomorphic function f(z)=f(x+iy)=f_x(x,y)+i f_y(x,y) both fx and fy are harmonic functions.

Any two-dimensional harmonic function is the real part of a complex analytic function.

See also: complex analysis.[48]

fy is the harmonic conjugate* of fx.
Geometrically fx and fy are related as having orthogonal trajectories, away from the zeroes of the underlying holomorphic function; the contours on which fx and fy are constant (equipotentials* and streamlines*) cross at right angles.
In this regard, fx+ify would be the complex potential, where fx is the potential function* and fy is the stream function*.[49]
fx and fy are both solutions of Laplace's equation  \nabla^2 f = 0 so the divergence of the gradient is zero.
Legendre functions* are solutions to Legendre's differential equation.
This ordinary differential equation is frequently encountered when solving Laplace's equation (and related partial differential equations) in spherical coordinates.
A harmonic function is a scalar potential function therefore the curl of the gradient will also be zero.
See Potential theory*

Complex divergence

Complex curl

Harmonic functions are real analogues to holomorphic functions.
All harmonic functions are analytic, i.e. they can be locally expressed as power series.
This is a general fact about elliptic operators*, of which the Laplacian is a major example.
The value of a harmonic function at any point inside a disk is a weighted average* of the value of the function on the boundary of the disk.
P[u](x) = \int_S u(\zeta)P(x,\zeta)d\sigma(\zeta).\,
The Poisson kernel* gives different weight to different points on the boundary except when x=0.
The value at the center of the disk (x=0) equals the average of the equally weighted values on the boundary.
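
The mean-value property is easy to check numerically. The sketch below averages an assumed harmonic function, x² − y² + 3x (the real part of z² + 3z), over a circle and compares the result with the value at the center; the function names, center and radius are illustrative.

```python
import math

def harmonic(x, y):
    return x * x - y * y + 3 * x  # real part of z^2 + 3z, hence harmonic

def circle_average(func, center_x, center_y, radius, steps=1_000):
    """Equally weighted average of func over a circle."""
    total = 0.0
    for i in range(steps):
        theta = 2 * math.pi * (i + 0.5) / steps
        total += func(center_x + radius * math.cos(theta),
                      center_y + radius * math.sin(theta))
    return total / steps

print(circle_average(harmonic, 0.4, -1.1, 2.0), harmonic(0.4, -1.1))
```

A function that is not harmonic, such as x² + y², fails the test: its average over a circle of radius R about the origin is R² rather than the center value 0.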
All locally integrable functions satisfying the mean-value property are both infinitely differentiable and harmonic.
The kernel itself appears to simply be 1/r^n shifted to the point x and multiplied by different constants.
For a circle (K = Poisson Kernel):

f(re^{i\phi}) = \int_0^{2\pi} f(Re^{i\theta}) K(R,r,\theta-\phi) d\theta \quad \text{for } r < R

\frac{d(a(x,y)+ib(x,y))}{d(x+iy)} = \frac{da+idb}{dx+idy} = \frac{(da+idb)(dx-idy)}{dx^2+dy^2} = \frac{dadx+dbdy+i(dbdx-dady)}{dx^2+dy^2}

\frac{d(a(x,y)+ib(x,y))}{d(x+iy)} = \frac{da}{dx} +\frac{db}{dy} +i \bigg(\frac{db}{dx}- \frac{da}{dy} \bigg) = 
{\color{red} \nabla \cdot f}
+ i {\color{blue} \nabla \times f}


Geometric calculus

See also: Geometric_algebra#Geometric_calculus*

From Wikipedia:Geometric calculus:

Geometric calculus extends the geometric algebra to include differentiation and integration. The formalism is powerful and can be shown to encompass other mathematical theories including differential geometry and differential forms.

With a geometric algebra given, let a and b be vectors* and let F(a) be a multivector-valued function. The directional derivative of F(a) along b is defined as

\nabla_b F(a) = \lim_{\epsilon \rightarrow 0}{\frac{F(a + \epsilon b) - F(a)}{\epsilon}}

provided that the limit exists, where the limit is taken for scalar ε. This is similar to the usual definition of a directional derivative but extends it to functions that are not necessarily scalar-valued.

Next, choose a set of basis vectors \{e_i\} and consider the operators, noted (\partial_i), that perform directional derivatives in the directions of (e_i):

\partial_i : F \mapsto (x\mapsto \nabla_{e_i} F(x))

Then, using the Einstein summation notation*, consider the operator:

e^i\partial_i

which means:

F \mapsto e^i\partial_i F

or, more verbosely:

F \mapsto (x\mapsto e^i\nabla_{e_i} F(x))

It can be shown that this operator is independent of the choice of frame, and can thus be used to define the geometric derivative:

\nabla = e^i\partial_i

This is similar to the usual definition of the gradient, but it, too, extends to functions that are not necessarily scalar-valued.

It can be shown that the directional derivative is linear regarding its direction, that is:

\nabla_{\alpha a + \beta b} = \alpha\nabla_a + \beta\nabla_b

From this it follows that the directional derivative is the inner product of its direction by the geometric derivative. All that needs to be observed is that the direction a can be written a = (a\cdot e^i) e_i, so that:

\nabla_a = \nabla_{(a\cdot e^i)e_i} = (a\cdot e^i)\nabla_{e_i} = a\cdot(e^i\nabla_{e_i}) = a\cdot \nabla

For this reason, \nabla_a F(x) is often noted a\cdot \nabla F(x).

The standard order of operations for the geometric derivative is that it acts only on the function closest to its immediate right. Given two functions F and G, then for example we have

\nabla FG = (\nabla F)G.

Although the partial derivative exhibits a product rule, the geometric derivative only partially inherits this property. Consider two functions F and G:

\begin{align}\nabla(FG) &= e^i\partial_i(FG) \\
&= e^i((\partial_iF)G+F(\partial_iG)) \\
&= e^i(\partial_iF)G+e^iF(\partial_iG) \end{align}

Since the geometric product is not commutative with e^iF \ne Fe^i in general, we cannot proceed further without new notation. A solution is to adopt the overdot* notation, in which the scope of a geometric derivative with an overdot is the multivector-valued function sharing the same overdot. In this case, if we define

\dot{\nabla}F\dot{G} = e^iF(\partial_iG),

then the product rule for the geometric derivative is

\nabla(FG) = \nabla FG+\dot{\nabla}F\dot{G}

Let F be an r-grade multivector. Then we can define an additional pair of operators, the interior and exterior derivatives,

\nabla \cdot F = \langle \nabla F \rangle_{r-1} = e^i \cdot \partial_i F
\nabla \wedge F = \langle \nabla F \rangle_{r+1} = e^i \wedge \partial_i F.

In particular, if F is grade 1 (vector-valued function), then we can write

\nabla F = \nabla \cdot F + \nabla \wedge F

and identify the divergence and curl as

\nabla \cdot F = \operatorname{div} F
\nabla \wedge F = I \, \operatorname{curl} F.

Note, however, that these two operators are considerably weaker than the geometric derivative counterpart for several reasons. Neither the interior derivative operator nor the exterior derivative operator is invertible*.

The reason for defining the geometric derivative and integral as above is that they allow a strong generalization of Stokes' theorem. Let \mathsf{L}(A;x) be a multivector-valued function of r-grade input A and general position x, linear in its first argument. Then the fundamental theorem of geometric calculus relates the integral of a derivative over the volume V to the integral over its boundary:

\int_V \dot{\mathsf{L}} \left(\dot{\nabla} dX;x \right) = \oint_{\partial V} \mathsf{L} (dS;x)

As an example, let \mathsf{L}(A;x)=\langle F(x) A I^{-1} \rangle for a vector-valued function F(x) and a (n-1)-grade multivector A. We find that

\begin{align}\int_V \dot{\mathsf{L}} \left(\dot{\nabla} dX;x \right) &= \int_V \langle\dot{F}(x)\dot{\nabla} dX I^{-1} \rangle \\
&= \int_V \langle\dot{F}(x)\dot{\nabla} |dX| \rangle \\
&= \int_V \nabla \cdot F(x) |dX| . \end{align}

and likewise

\begin{align}\oint_{\partial V} \mathsf{L} (dS;x) &= \oint_{\partial V} \langle F(x) dS I^{-1} \rangle \\
&= \oint_{\partial V} \langle F(x) \hat{n} |dS| \rangle \\
&= \oint_{\partial V} F(x) \cdot \hat{n} |dS| \end{align}

Thus we recover the divergence theorem,

\int_V \nabla \cdot F(x) |dX| = \oint_{\partial V} F(x) \cdot \hat{n} |dS|.


Calculus of variations

Calculus of variations*, Functional*, Functional analysis*, Higher-order function*

Whereas calculus is concerned with infinitesimal changes of variables, calculus of variations is concerned with infinitesimal changes of the underlying function itself.

Calculus of variations is a field of mathematical analysis that uses variations, which are small changes in functions and functionals, to find maxima and minima of functionals.

A simple example of such a problem is to find the curve of shortest length connecting two points. If there are no constraints, the solution is obviously a straight line between the points. However, if the curve is constrained to lie on a surface in space, then the solution is less obvious, and possibly many solutions may exist. Such solutions are known as geodesics. A related problem is posed by Fermat's principle: light follows the path of shortest optical length connecting two points, where the optical length depends upon the material of the medium. One corresponding concept in mechanics is the principle of least action.[50]


Discrete mathematics

Groups and rings

Main articles: Algebraic structure, Abstract algebra, and group theory*

Addition and multiplication can be generalized in so many ways that mathematicians have created a whole system just to categorize them.


Any straight line through the origin forms a group. Adding any 2 points on the line results in a 3rd point that is also on the line.

A magma* is a set with a single closed* binary operation (usually, but not always*, addition).

a + b = c

A semigroup* is a magma where the addition is associative. See also Semigroupoid*

a + (b + c) = (a + b) + c

A monoid* is a semigroup with an additive identity element.

a + 0 = a

A group* is a monoid with additive inverse elements.

a + (-a) = 0

An abelian group* is a group where the addition is commutative.

a + b = b + a

A pseudo-ring* is an abelian group that also has a second closed, associative, binary operation (usually, but not always, multiplication).

a * (b * c) = (a * b) * c
And these two operations satisfy a distribution law.
a(b + c) = ab + ac

A ring* is a pseudo-ring that has a multiplicative identity

a * 1 = a

A commutative ring* is a ring where multiplication commutes, (e.g. integers*)

a * b = b * a

A field* is a commutative ring where every nonzero element has a multiplicative inverse (and thus there is a multiplicative identity).

a * (1/a) = 1
The existence of a multiplicative inverse for every nonzero element automatically implies that there are no zero divisors* in a field: if ab=0 for some a≠0, then we must have b=0.

The characteristic* of a ring R, denoted char(R), is the smallest number of times one must add the multiplicative identity* to itself to get the additive identity*. If no such number exists, the characteristic is 0.

The center* of a noncommutative ring* is the subring of elements c such that cx = xc for every x. See also: Centralizer and normalizer*.

All non-zero nilpotent* elements are zero divisors*.

The square matrix A = \begin{pmatrix}
    0 & 1 & 0\\
    0 & 0 & 1\\
    0 & 0 & 0
\end{pmatrix}
is nilpotent, since A^3 = 0.
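
The nilpotency of this matrix can be checked by hand or with a few lines of code. The following is a minimal sketch in plain Python; the helper name `matmul` is an illustrative choice, not something from the article.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]

A2 = matmul(A, A)    # the ones move one step further from the diagonal
A3 = matmul(A2, A)   # every entry is now zero

print(A2)
print(A3)
```

Each multiplication by A pushes the nonzero entries one diagonal higher, so after three multiplications nothing is left.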


Set theory

See also: Naive set theory*, Zermelo–Fraenkel set theory*, Set theory, Set notation*, Set-builder notation*, Set, Algebra of sets*, Field of sets*, and Sigma-algebra*

\varnothing is the empty set (the additive identity)

\mathbf{U} is the universe of all elements (the multiplicative identity)

a \in A means that a is an element (or member) of set A. In other words a is in A.

\{ x \in \mathbf{A} : x \notin \mathbb{R}  \} means the set of all x's that are members of the set A such that x is not a member of the real numbers. Could also be written \{ \mathbf{A} - \mathbb{R}  \}

A set does not allow multiple instances of an element. \{1,1,2\} = \{1,2\}

A multiset does allow multiple instances of an element. \{1,1,2\} \neq \{1,2\}

A set can contain other sets. \{1,\{2\},3\} \neq \{1,2,3\}

A \subset B means that A is a proper subset of B

A \subseteq A means that A is a subset of itself. But a set is not a proper subset of itself.

A \cup B is the Union of the sets A and B. In other words, \{A+B\}


A \cap B is the Intersection of the sets A and B. In other words, \{A \cdot B\}: all elements of A that are also in B.

Associative: A \cdot \{B \cdot C\} = \{A \cdot B\} \cdot C
Distributive: A \cdot \{B + C\}=\{A \cdot B\} + \{A \cdot C\}
Commutative: \{A \cdot B\} =\{B \cdot A\}

A \setminus B is the Set difference of A and B. In other words, \{A - A \cdot B\}

\overline{A} or A^c = \{U - A\} is the complement of A.

A \bigtriangleup B or A \ominus B is the Anti-intersection (symmetric difference) of sets A and B, which is the set of all objects that are members of either A or B but not of both.

A \bigtriangleup B = (A + B) - (A \cdot B) = (A - A \cdot B) + (B - A \cdot B)

A \times B is the Cartesian product of A and B which is the set whose members are all possible ordered pairs (a, b) where a is a member of A and b is a member of B.

The Power set of a set A is the set whose members are all of the possible subsets of A.

A cover* of a set X is a collection of sets whose union contains X as a subset.[51]

A subset A of a topological space X is called dense* (in X) if every point x in X either belongs to A or is arbitrarily "close" to a member of A.

A subset A of X is meagre* if it can be expressed as the union of countably many nowhere dense subsets of X.

The disjoint union* of the sets A_0 = \{1, 2, 3\} and A_1 = \{1, 2, 3\} can be computed by first tagging each element with the index of its set:

\begin{align}
A^*_0 & = \{(1, 0), (2, 0), (3, 0)\} \\
A^*_1 & = \{(1, 1), (2, 1), (3, 1)\}
\end{align}

and then taking the union:

A_0 \sqcup A_1 = A^*_0 \cup A^*_1 = \{(1, 0), (2, 0), (3, 0), (1, 1), (2, 1), (3, 1)\}

Let H be the subgroup of the integers (mZ, +) = ({..., −2m, −m, 0, m, 2m, ...}, +) where m is a positive integer.

Then the cosets* of H are the mZ + a = {..., −2m+a, −m+a, a, m+a, 2m+a, ...}.
There are no more than m cosets, because mZ + m = m(Z + 1) = mZ.
The coset (mZ + a, +) is the congruence class of a modulo m.[52]
Cosets are not usually themselves subgroups of the integers, only subsets.

\exists means "there exists at least one"

\exists! means "there exists one and only one"

\forall means "for all"

\land means "and" (not to be confused with wedge product)

\lor means "or" (not to be confused with antiwedge product)



\vert A \vert is the cardinality of A which is the number of elements in A. See measure.

P(A) = {\vert A \vert \over \vert U \vert} is the unconditional probability that A will happen.

P(A \mid B) = {\vert A \cdot B \vert \over \vert B \vert} is the conditional probability that A will happen given that B has happened.

P(A + B) = P(A) + P(B) - P(A \cdot B) means that the probability that A or B will happen is the probability of A plus the probability of B minus the probability that both A and B will happen.

P(A \cdot B) = P(A \cdot B \mid B)P(B) = P(A \cdot B \mid A)P(A) means that the probability that A and B will happen is the probability of "A and B given B" times the probability of B.

P(A \cdot B \mid B) = \frac{P(A \cdot B \mid A) \, P(A)}{P(B)}, is Bayes' theorem*

If you don't know the certainty then you can still know the probability. If you don't know the probability then you can always know the Bayesian probability. The Bayesian probability is the degree to which you expect something.

Even if you don't know anything about the system you can still know the A priori* Bayesian probability. As new information comes in, the Prior probability* is updated and replaced with the Posterior probability* by using Bayes' theorem*.

From Wikipedia:Base rate fallacy:

In a city of 1 million inhabitants let there be 100 terrorists and 999,900 non-terrorists. In an attempt to catch the terrorists, the city installs an alarm system with a surveillance camera and automatic facial recognition software. 99% of the time it behaves correctly. 1% of the time it behaves incorrectly, ringing when it should not and failing to ring when it should. Suppose now that an inhabitant triggers the alarm. What is the chance that the person is a terrorist? In other words, what is P(T | B), the probability that a terrorist has been detected given the ringing of the bell? Someone making the 'base rate fallacy' would infer that there is a 99% chance that the detected person is a terrorist. But that is not even close. For every 1 million faces scanned it will see 100 terrorists and will correctly ring 99 times. But it will also ring falsely 9,999 times. So the true probability is only 99/(9,999+99) or about 1%.
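
The arithmetic in this example can be reproduced in a few lines. This is only a sketch of the counting argument above; the variable names are illustrative.

```python
population = 1_000_000
terrorists = 100
accuracy = 0.99  # the system is right 99% of the time, for both groups

true_alarms = terrorists * accuracy                         # 99 correct rings
false_alarms = (population - terrorists) * (1 - accuracy)   # about 9,999 false rings

# P(T | B): the share of all alarms that are caused by actual terrorists
p_terrorist_given_alarm = true_alarms / (true_alarms + false_alarms)
print(round(p_terrorist_given_alarm, 4))
```

The result is just under 1%, because the false alarms from the very large innocent population swamp the true alarms.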

A permutation is an arrangement of all the members of a set into some sequence or order*.

The number of permutations of n distinct objects is n!.[53]

A derangement is a permutation of the elements of a set, such that no element appears in its original position.

In other words, a derangement is a permutation that has no fixed points*.

The number of derangements* of a set of size n, usually written !n*, is called the "derangement number" or "de Montmort number".[54]

The rencontres numbers* are a triangular array of integers that enumerate permutations of the set { 1, ..., n } with specified numbers of fixed points: in other words, partial derangements.[55]
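
As an illustration, derangement numbers can be computed with the standard recurrence !n = (n - 1)(!(n-1) + !(n-2)) and cross-checked by brute force. The function names below are illustrative, and the brute-force check is only practical for small n.

```python
from itertools import permutations

def derangements(n):
    """Count derangements with the recurrence !n = (n - 1) * (!(n-1) + !(n-2))."""
    a, b = 1, 0  # !0 = 1, !1 = 0
    if n == 0:
        return a
    for k in range(2, n + 1):
        a, b = b, (k - 1) * (a + b)
    return b

def brute_force(n):
    """Count permutations of range(n) in which no element keeps its position."""
    return sum(all(p[i] != i for i in range(n)) for p in permutations(range(n)))

print([derangements(n) for n in range(6)])
```

For n = 0 to 5 this gives 1, 0, 1, 2, 9, 44, and the two functions agree wherever the brute-force count is feasible.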

A combination is a selection of items from a collection, such that the order of selection does not matter.

For example, given three numbers, say 1, 2, and 3, there are three ways to choose two from this set of three: 12, 13, and 23.

More formally, a k-combination of a set S is a subset of k distinct elements of S.

If the set has n elements, the number of k-combinations is equal to the binomial coefficient

\binom nk = \textstyle\frac{n!}{k!(n-k)!}. Pronounced n choose k. The set of all k-combinations of a set S is often denoted by \textstyle\binom Sk.
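
The three-element example above can be enumerated directly. This sketch uses Python's standard `itertools.combinations` and `math.comb`, and compares them with the factorial formula.

```python
from itertools import combinations
from math import comb, factorial

items = [1, 2, 3]
pairs = list(combinations(items, 2))   # all 2-combinations of the set
print(pairs)

n, k = len(items), 2
count = factorial(n) // (factorial(k) * factorial(n - k))  # n! / (k! (n-k)!)
print(count, comb(n, k))
```

The list contains the pairs 12, 13 and 23, and both ways of counting give 3.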

The central limit theorem (CLT) establishes that, in most situations, when independent random variables* are added, their properly normalized sum tends toward a normal distribution (informally a "bell curve") even if the original variables themselves are not normally distributed.[56]


A plot of normal distribution (or bell-shaped curve) where each band has a width of 1 standard deviation – See also: 68–95–99.7 rule*

In statistics, the standard deviation (SD, also represented by the Greek letter sigma σ or the Latin letter s) is a measure that is used to quantify the amount of variation or dispersion* of a set of data values.[57]

A low standard deviation indicates that the data points tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values.[58]

The hypergeometric distribution* is a discrete probability distribution that describes the probability of k successes (random draws for which the object drawn has a specified feature) in n draws, without replacement, from a finite population of size N that contains exactly K objects with that feature, wherein each draw is either a success or a failure.

In contrast, the binomial distribution* describes the probability of k successes in n draws with replacement.[59]

See also Dirichlet distribution* and Rice distribution*



See also: Higher category theory and Multivalued function (misnomer)*

Every function has exactly one output for every input.

If the function f(x) is invertible* then its inverse function f^{-1}(x) has exactly one output for every input.

If it isn't invertible then it doesn't have an inverse function.

f(x)=x/(x-1) is an involution* which is a function that is its own inverse function. f(f(x))=x

A morphism generalizes the idea of a function. In Category theory a morphism need not be invertible; when a map has no inverse function, its "inverse" is only a relation, which may assign more than one value or no value at all (see the preimage below).

Categories* consist of:

Objects (usually Sets)
Morphisms (usually maps) possessing:
one source object (domain)
one target object (codomain)
Commutative diagram for morphism

a morphism is represented by an arrow:

f(x)=y is written f : x \mapsto y, with f : X \to Y, where x is in X and y is in Y.
g(y)=z is written g : y \mapsto z, with g : Y \to Z, where y is in Y and z is in Z.

The image* of y is z.

The preimage* (or fiber*) of z is the set of all y whose image is z and is denoted g^{-1}[z]

A picture is worth 1000 words

Covering space diagram2

A space Y is a covering space* (a fiber bundle) of space Z if the map g : Y \to Z is locally homeomorphic.

A covering space is a universal covering space* if it is simply connected*.
The concept of a universal cover was first developed to define a natural domain for the analytic continuation* of an analytic function.
The general theory of analytic continuation and its generalizations are known as sheaf theory*.
The set of germs* can be considered to be the analytic continuation of an analytic function.

A topological space is path-connected* if any two of its points can be joined by a continuous path, so that no part of it is disconnected from the rest.

Torus cycles

Not simply connected

A space is simply connected* if there are no holes passing all the way through it (therefore any loop can be shrunk to a point)

See Homology*

Composition of morphisms:

g(f(x)) is written g \circ f
f is the pullback* of g
f is the lift* of g \circ f
? is the pushforward* of ?

A homomorphism* is a map from one set to another of the same type which preserves the operations of the algebraic structure:

f(x \cdot y) = f(x) \cdot f(y)
f(x + y) = f(x) + f(y)
See Cauchy's functional equation*
A Functor* is a homomorphism with a domain in one category and a codomain in another.
A group homomorphism* from (G, ∗) to (H, ·) is a function* h : G \to H such that
 h(u*v) = h(u) \cdot h(v) = h(c) for all u*v = c in G.
For example log(a*b) = log(a) + log(b)
Since log is a homomorphism that has an inverse that is also a homomorphism, log is an isomorphism* of groups.
See also group action* and group orbit*

A Multicategory* has morphisms with more than one source object.

A Multilinear map* f(v_1,\ldots,v_n) = W:

f\colon V_1 \times \cdots \times V_n \to W\text{,}

has a corresponding linear map F(v_1\otimes \cdots \otimes v_n) = W:

F\colon V_1 \otimes \cdots \otimes V_n \to W\text{,}


Numerical methods

See also: Explicit and implicit methods*
From Wikipedia:Numerical analysis:

One of the simplest problems is the evaluation of a function at a given point.

The most straightforward approach, of just plugging in the number in the formula is sometimes not very efficient.

For polynomials, a better approach is using the Horner scheme*, since it reduces the necessary number of multiplications and additions.
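
A minimal sketch of the Horner scheme follows. The coefficient ordering (highest degree first) and the example polynomial are choices made here for illustration.

```python
def horner(coeffs, x):
    """Evaluate a polynomial; coeffs run from the highest degree down to the constant."""
    result = 0
    for c in coeffs:
        result = result * x + c   # one multiplication and one addition per coefficient
    return result

# 2x^3 - 6x^2 + 2x - 1 evaluated at x = 3
value = horner([2, -6, 2, -1], 3)
print(value)
```

A polynomial of degree n is evaluated with n multiplications and n additions, instead of computing each power of x separately.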

Generally, it is important to estimate and control round-off errors* arising from the use of floating point* arithmetic.

Interpolation* solves the following problem: given the value of some unknown function at a number of points, what value does that function have at some other point between the given points?

Extrapolation* is very similar to interpolation, except that now we want to find the value of the unknown function at a point which is outside the given points.


Regression* is also similar, but it takes into account that the data is imprecise.

Given some points, and a measurement of the value of some function at these points (with an error), we want to determine the unknown function.

The least squares*-method is one popular way to achieve this.
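
As a small illustration, the sketch below fits a straight line to noisy data using the closed-form least-squares formulas. The data points are made up for this example, and fitting a line is only the simplest case of the method.

```python
def fit_line(xs, ys):
    """Return slope and intercept of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x

xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]   # made-up noisy measurements of roughly y = 2x + 1
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

The fitted slope and intercept come out close to 2 and 1, the values used to make up the data.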

Much effort has been put in the development of methods for solving systems of linear equations*.

Standard direct methods, i.e., methods that use some matrix decomposition*
Gaussian elimination*, LU decomposition*, Cholesky decomposition* for symmetric (or hermitian) and positive-definite matrix, and QR decomposition* for non-square matrices.
Iterative methods*
Jacobi method*, Gauss–Seidel method*, successive over-relaxation* and conjugate gradient method* are usually preferred for large systems. General iterative methods can be developed using a matrix splitting*.

Root-finding algorithms* are used to solve nonlinear equations.

If the function is differentiable and the derivative is known, then Newton's method is a popular choice.
Linearization* is another technique for solving nonlinear equations.
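
Newton's method can be sketched in a few lines. The stopping tolerance, the iteration limit and the test equation x^2 - 2 = 0 are assumptions chosen for this example.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Find a root of f starting from x0, using the derivative df."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)   # where the tangent line at x crosses zero
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")

root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(root)
```

Starting from 1.0 the iteration converges to the square root of 2 in a handful of steps. A poor starting point, or a derivative close to zero, can make the method fail, which is why the sketch caps the number of iterations.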

Optimization problems ask for the point at which a given function is maximized (or minimized).

Often, the point also has to satisfy some constraints*.


Differential equation: If you set up 100 fans to blow air from one end of the room to the other and then you drop a feather into the wind, what happens?

The feather will follow the air currents, which may be very complex.

One approximation is to measure the speed at which the air is blowing near the feather every second, and advance the simulated feather as if it were moving in a straight line at that same speed for one second, before measuring the wind speed again.

This is called the Euler method* for solving an ordinary differential equation.
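
The feather example can be sketched in one dimension. The wind profile `wind_speed` below is purely hypothetical (a breeze that weakens along the room); only the stepping rule is the Euler method itself.

```python
def wind_speed(x):
    """Hypothetical wind speed (m/s) at position x (m); it weakens along the room."""
    return 2.0 - 0.1 * x

def euler(speed, x0, dt, steps):
    """Advance the position in straight-line steps of duration dt."""
    x = x0
    path = [x]
    for _ in range(steps):
        x += speed(x) * dt   # move at the currently measured speed for one time step
        path.append(x)
    return path

path = euler(wind_speed, x0=0.0, dt=1.0, steps=10)
print(path[-1])
```

For this wind profile the exact position after 10 seconds is 20(1 - e^{-1}), about 12.64 m, while the one-second steps give about 13.03 m. Smaller time steps reduce the error at the cost of more measurements.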


Information theory

From Wikipedia:Information theory:

Information theory studies the quantification, storage, and communication of information.

Communications over a channel—such as an ethernet cable—is the primary motivation of information theory.

From Wikipedia:Quantities of information:

Shannon derived a measure of information content called the self-information* or "surprisal" of a message m:

 I(m)  = \log \left( \frac{1}{p(m)} \right)  =  - \log( p(m) ) \,

where p(m) = \mathrm{Pr}(M=m) is the probability that message m is chosen from all possible choices in the message space M. The base of the logarithm only affects a scaling factor and, consequently, the units in which the measured information content is expressed. If the logarithm is base 2, the measure of information is expressed in units of bits*.

Information is transferred from a source to a recipient only if the recipient of the information did not already have the information to begin with. Messages that convey information that is certain to happen and already known by the recipient contain no real information. Infrequently occurring messages contain more information than more frequently occurring messages. This fact is reflected in the above equation - a certain message, i.e. of probability 1, has an information measure of zero. In addition, a compound message of two (or more) unrelated (or mutually independent) messages would have a quantity of information that is the sum of the measures of information of each message individually. That fact is also reflected in the above equation, supporting the validity of its derivation.
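
Both properties can be checked numerically. This sketch uses base-2 logarithms, so the results are in bits; the function name is illustrative.

```python
from math import log2

def surprisal_bits(p):
    """Self-information, in bits, of a message that occurs with probability p."""
    return -log2(p)

certain = surprisal_bits(1.0)        # a certain message carries no information
coin = surprisal_bits(0.5)           # one fair coin flip
two_bits = surprisal_bits(0.25)      # one of four equally likely messages
joint = surprisal_bits(0.5 * 0.25)   # two independent messages received together
print(certain, coin, two_bits, joint)
```

The joint message carries 3 bits, the sum of the 1 bit and 2 bits carried by its independent parts.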

An example: The weather forecast broadcast is: "Tonight's forecast: Dark. Continued darkness until widely scattered light in the morning." This message contains almost no information. However, a forecast of a snowstorm would certainly contain information since such does not happen every evening. There would be an even greater amount of information in an accurate forecast of snow for a warm location, such as Miami. The amount of information in a forecast of snow for a location where it never snows (impossible event) is the highest (infinity).

The more surprising a message is the more information it conveys. The message "LLLLLLLLLLLLLLLLLLLLLLLLL" conveys exactly as much information as the message "25 L's". The first message which is 25 bytes long can therefore be "compressed" into the second message which is only 6 bytes long.


Early computers,
See also: Time complexity*


Tactical thinking

The prisoner's dilemma*
A non-zero-sum game

            Tactic X    Tactic Y
Tactic A    1, 1        5, -5
Tactic B    -5, 5       -5, -5
See also Wikipedia:Strategy (game theory)*
From Wikipedia:Game theory:

In the accompanying example there are two players; Player one (blue) chooses the row and player two (red) chooses the column.

Each player must choose without knowing what the other player has chosen.

The payoffs are provided in the interior.

The first number is the payoff received by Player 1; the second is the payoff for Player 2.

Tit for tat is a simple and highly effective tactic in game theory for the iterated prisoner's dilemma.

An agent using this tactic will first cooperate, then subsequently replicate an opponent's previous action.

If the opponent previously was cooperative, the agent is cooperative.

If not, the agent is not.[60]
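
The tit-for-tat rule described above can be sketched as a small simulation. The letters "C" (cooperate) and "D" (defect), the `always_defect` opponent and the helper `play` are all illustrative choices made here.

```python
def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    """A hypothetical opponent that never cooperates."""
    return "D"

def play(strategy_a, strategy_b, rounds):
    """Play the iterated game; each strategy sees only the other's past moves."""
    moves_a, moves_b = [], []
    for _ in range(rounds):
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
    return "".join(moves_a), "".join(moves_b)

print(play(tit_for_tat, always_defect, 5))
print(play(tit_for_tat, tit_for_tat, 3))
```

Against a constant defector, tit for tat is exploited only in the first round; against itself it cooperates throughout.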

A zero-sum game

A    1, -1    -1, 1
B    -1, 1    1, -1

In zero-sum games the sum of the payoffs is always zero (meaning that a player can only benefit at the expense of others).

Cooperation is impossible in a zero-sum game.

John Forbes Nash proved that there is a Nash equilibrium (an optimum tactic) for every finite game.

In the zero-sum game shown above the optimum tactic for player 1 is to randomly choose A or B with equal probability.
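
Why equal probabilities are optimal can be seen by computing what player 1 is guaranteed for each mixing probability. This is a sketch for the 2 by 2 zero-sum game above only; it scans a coarse grid of probabilities rather than solving the game in general.

```python
# Player 1's payoffs: rows are tactics A and B, columns are player 2's two choices.
payoff = [[1, -1],
          [-1, 1]]

def worst_case(p):
    """Player 1's guaranteed expected payoff when choosing A with probability p."""
    per_column = [p * payoff[0][j] + (1 - p) * payoff[1][j] for j in range(2)]
    return min(per_column)   # player 2 picks the column that hurts player 1 most

candidates = [i / 10 for i in range(11)]
best_p = max(candidates, key=worst_case)
print(best_p, worst_case(best_p))
```

Any bias toward A or B can be exploited by the opponent, so the guaranteed payoff is negative everywhere except at p = 0.5, where it is zero.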

Strategic thinking differs from tactical thinking by taking into account how the short term goals and therefore optimum tactics change over time.

For example the opening, middlegame, and endgame of chess require radically different tactics.

See also: Reverse game theory*

Back to top


See also: Wikisource:The Mathematical Principles of Natural Philosophy (1846) and Galilean relativity*

Reality is what doesn't go away when you aren't looking at it.

Something is known Beyond a reasonable doubt if any doubt that it is true is unreasonable. A doubt is reasonable if it is consistent with the laws of cause and effect.

From Wikipedia:Philosophiæ Naturalis Principia Mathematica:

In the four rules, as they came finally to stand in the 1726 edition, Newton effectively offers a methodology for handling unknown phenomena in nature and reaching towards explanations for them.

Rule 1: We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.
Rule 2: Therefore to the same natural effects we must, as far as possible, assign the same causes.
Rule 3: The qualities of bodies, which admit neither intensification nor remission of degrees, and which are found to belong to all bodies within the reach of our experiments, are to be esteemed the universal qualities of all bodies whatsoever.
Rule 4: In experimental philosophy we are to look upon propositions inferred by general induction from phenomena as accurately or very nearly true, notwithstanding any contrary hypothesis that may be imagined, till such time as other phenomena occur, by which they may either be made more accurate, or liable to exceptions.

Classical mechanics

Newtonian mechanics, Lagrangian mechanics, and Hamiltonian mechanics
The difference between the net kinetic energy and the net potential energy is called the “Lagrangian.”
The action is defined as the time integral of the Lagrangian.
The Hamiltonian is the sum of the kinetic and potential energies.
Noether's theorem* states that every differentiable symmetry of the action* of a physical system has a corresponding conservation law*.
Relativistic Energy and Momentum

Relativistic mechanics*

Special relativity*, and General relativity*
Energy is conserved in relativity and proper velocity is proportional to momentum at all velocities.

Quantum mechanics

Highly recommended:

Thinking Physics Is Gedanken Physics by Lewis Carroll Epstein
Understanding physics by Isaac Asimov


Twin paradox


Dimensional analysis

See also: Natural units*
From Wikipedia:Dimensional analysis:

Any physical law that accurately describes the real world must be independent of the units (e.g. km or mm) used to measure the physical variables.

Consequently, every possible commensurate equation for the physics of the system can be written in the form

a_0 \cdot D_0 = (a_1 \cdot D_1)^{p_1} (a_2 \cdot D_2)^{p_2}...(a_n \cdot D_n)^{p_n}

The dimension, D_n, of a physical quantity can be expressed as a product of the basic physical dimensions length (L), mass (M), time (T), electric current (I), absolute temperature (Θ), amount of substance (N) and luminous intensity (J), each raised to a rational power.

Suppose we wish to calculate the range of a cannonball* when fired with a vertical velocity component V_\mathrm{y} and a horizontal velocity component V_\mathrm{x}, assuming it is fired on a flat surface.

The quantities of interest and their dimensions are then

range as L_x
V_\mathrm{x} as L_x/T
V_\mathrm{y} as L_y/T
g as L_y/T^2

The equation for the range may be written:

range = (V_x)^a (V_y)^b (g)^c


\mathsf{L}_\mathrm{x} = (\mathsf{L}_\mathrm{x}/\mathsf{T})^a\,(\mathsf{L}_\mathrm{y}/\mathsf{T})^b (\mathsf{L}_\mathrm{y}/\mathsf{T}^2)^c\,

and we may solve completely as a=1, b=1 and c=-1.
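
The solution follows from matching the exponent of each dimension on the two sides of the equation above. Written out as a worked step (the three rows correspond to L_x, L_y and T):

```latex
\begin{align}
\mathsf{L}_\mathrm{x}: \quad & 1 = a \\
\mathsf{L}_\mathrm{y}: \quad & 0 = b + c \\
\mathsf{T}: \quad & 0 = -a - b - 2c
\end{align}
```

Substituting a = 1 and c = -b into the last row gives 0 = -1 - b + 2b, so b = 1 and c = -1. The range is therefore proportional to V_x V_y / g. Dimensional analysis cannot supply the dimensionless constant of proportionality.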



See also: Periodic table and Spatial_structure_of_the_electron

Uranium atom

The first pair of electrons fall into the ground shell. Once that shell is filled no more electrons can go into it. Any additional electrons go into higher shells.

The nucleus however works differently. The first few neutrons form the first shell. But any additional neutrons continue to fall into that same shell which continues to expand until there are 49 pairs of neutrons in that shell.

The electric force between two electrons is 4.166 * 10^42 times stronger than the gravitational force between them.
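
This ratio can be checked from Coulomb's law and Newton's law of gravitation. The sketch below assumes rounded CODATA values for the constants, which are not taken from the article.

```python
K_E = 8.9875517923e9     # Coulomb constant, N m^2 / C^2 (assumed value)
G = 6.67430e-11          # gravitational constant, N m^2 / kg^2 (assumed value)
Q_E = 1.602176634e-19    # elementary charge, C (assumed value)
M_E = 9.1093837015e-31   # electron mass, kg (assumed value)

# Both forces fall off as 1/r^2, so the separation cancels in the ratio.
ratio = (K_E * Q_E**2) / (G * M_E**2)
print(f"{ratio:.3e}")
```

With these constants the ratio comes out at roughly 4.2 * 10^42, in line with the figure quoted above.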

The energy required to assemble a sphere of uniform charge density = \frac{3}{5}\frac{Q^2}{4 \pi \epsilon_0 r}

For Q = 1 electron charge and r = 1.8506 angstrom that is 4.669 eV. That energy is stored in the electric field of the electron.
The energy per volume stored in an electric field is proportional to the square of the field strength, so twice the charge has 4 times as much energy.
4 * 4.669 = 18.676 eV.

Mass of electron = Me = 510,999 eV

Mass of proton = Mp = 938,272,000 eV

Mass of neutron = Mn = 939,565,000 eV

Mn = Mp + Me + 782,300 eV

Mass of muon = Mμ = 105,658,000 eV = 206.7683 * Me

Mass of helium atom = 3,728,400,000 eV = 4*Me + 4*Mp - 52.31 Me

The missing 52.31 electron masses of energy is called the mass deficit or nuclear binding energy. Fusing hydrogen into helium releases this energy.
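
The mass deficit can be recomputed from the masses listed above. This sketch simply repeats the article's bookkeeping (four electrons plus four protons compared with one helium atom), using the rounded values quoted in the text.

```python
M_E = 510_999           # electron mass, eV
M_P = 938_272_000       # proton mass, eV
M_HE = 3_728_400_000    # helium atom mass, eV

deficit_ev = 4 * M_E + 4 * M_P - M_HE
deficit_in_electron_masses = deficit_ev / M_E
print(deficit_ev, round(deficit_in_electron_masses, 2))
```

The deficit is about 26.7 million eV, or roughly 52.31 electron masses.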

Iron can be fused into heavier elements too but doing so consumes energy rather than releases energy.

Binding energy curve - common isotopes

From Wikipedia:Bohr model

The electron is held in a circular orbit* by electrostatic attraction. The centripetal force* is equal to the Coulomb force*.

 {m_\mathrm{e} v^2\over r} + F_{Pauli} = k_\mathrm{e} {Z q_e \cdot n q_e \over r^2}
where m_e is the electron's mass, q_e is the charge of the electron, k_e is Coulomb's constant*, Z is the atom's atomic number (the number of protons) and n is the number of electrons. For the ground state electron F_{Pauli} = 0.

The ground state electron's speed is:

 v = \sqrt{ k_\mathrm{e} {Z q_e \cdot n q_e \over m_\mathrm{e} r} }

The angular momentum is:

 m_\mathrm{e} v r = \hbar
where ħ is the reduced Planck constant
\hbar={{h}\over{2\pi}} = 1.054\ 571\ 800(13)\times 10^{-34}\text{J}{\cdot}\text{s}

Substituting the expression for the velocity gives an equation for r:

 m_{\text{e}}\sqrt{ k_\mathrm{e} {Z q_e \cdot n q_e \over m_\mathrm{e} r} }r=\hbar
so that the allowed orbit radius is:
 r = {\hbar^2\over k_\mathrm{e} \cdot Z \cdot n \cdot q_e^2 \cdot m_\mathrm{e}}

Therefore the orbital radius of the ground state electron is inversely proportional to the atomic number. The values obtained by this formula should be thought of only as approximations. For Z=2 and n=2 we get 0.1322943 Å. That is 1/4 of the Bohr radius*.
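
The radius formula can be evaluated numerically. The sketch below assumes rounded CODATA values for the constants (slightly newer than the value of ħ quoted above), and the function name is illustrative.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s (assumed value)
K_E = 8.9875517923e9     # Coulomb constant, N m^2 / C^2 (assumed value)
Q_E = 1.602176634e-19    # elementary charge, C (assumed value)
M_E = 9.1093837015e-31   # electron mass, kg (assumed value)

def ground_state_radius(z, n):
    """Radius in metres from r = hbar^2 / (k_e Z n q_e^2 m_e)."""
    return HBAR**2 / (K_E * z * n * Q_E**2 * M_E)

bohr_radius_angstrom = ground_state_radius(1, 1) * 1e10
helium_angstrom = ground_state_radius(2, 2) * 1e10
print(bohr_radius_angstrom, helium_angstrom)
```

With Z = 1 and n = 1 the formula reproduces the Bohr radius of about 0.529 Å, and with Z = 2 and n = 2 it gives one quarter of that.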

Empirically determined values:

Diatomic Hydrogen (Z=2) = 1.9002 angstroms
Helium (Z=2) = 1.8506 angstroms

For electrons in higher shells the equation cannot be used because F_{Pauli} > 0.


Crystalline solids: 1.2
Amorphous solids: 1.1
liquids: 1

Water ice is an exception. Ice has a density of 0.9167


Tidal acceleration

See also: Formation_of_the_Solar_System

Image shows an approximation of the shape (Equipotentials*) of a rapidly spinning planet. North pole is at the top. South pole is at the bottom. The equator reaches orbital velocity.

Rapidly spinning planet

Orbital velocity:

v_o = \sqrt{\frac{GM}{r}}

Orbital period:

T = 2\pi\sqrt{\frac{r^3}{GM}}

Orbital angular momentum:

mvr \quad = \quad m \Bigg ( \sqrt{\frac{GM}{r}} \Bigg ) r \quad = \quad m \sqrt{GMr}

Rotational angular momentum of solid sphere:

L=I \omega = \frac{2}{5}mr^2 \frac{v}{r} = \frac{2}{5}mvr


The Moon's orbital angular momentum is 28.73 * 10^33 Js

Earth's rotational angular momentum (treated as a uniform sphere) is 7.079 * 10^33 Js

Correcting for Earth's uneven mass distribution: 3.85 + 0.44 + 0.25 + 0.073 = 4.6 * 10^33 Js

The total angular momentum of the Earth-Moon system is 28.73 + 4.6 = 33.33 * 10^33 Js

The Moon's current orbital radius is 384,399 km. Its orbital period is 2.372 * 10^6 seconds (27 days, 10 hours, 50 minutes). Its orbital velocity is 1.022 km/s.

The Roche limit* for the Moon is

Fluid: 18,381 km
384,399 / 18,381 = 20.9
Orbital momentum of the Moon at the fluid Roche limit = 28.73 * 10^33 Js / sqrt(20.9) = 6.3 * 10^33 Js
Earth would spin (28.73-6.3+4.6)/4.6 = 5.876 times faster
Rigid: 9,492 km
384,399 / 9,492 = 40.5
Orbital momentum of the Moon at the rigid Roche limit = 28.73 * 10^33 Js / sqrt(40.5) = 4.5 * 10^33 Js
Earth would spin (28.73-4.5+4.6)/4.6 = 6.27 times faster
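
The two spin-up factors can be reproduced with a short calculation. This sketch makes the same assumptions as the text: total angular momentum is conserved, the Moon's orbital angular momentum scales as the square root of its orbital radius, and Earth's moment of inertia does not change. The function name is illustrative.

```python
from math import sqrt

L_MOON = 28.73     # Moon's orbital angular momentum, in units of 10^33 Js
L_EARTH = 4.6      # Earth's rotational angular momentum, same units
R_NOW = 384_399    # current orbital radius, km

def spin_up(r_km):
    """Factor by which Earth would spin faster with the Moon at radius r_km."""
    l_moon_then = L_MOON / sqrt(R_NOW / r_km)   # orbital momentum scales as sqrt(r)
    return (L_MOON - l_moon_then + L_EARTH) / L_EARTH

fluid = spin_up(18_381)
rigid = spin_up(9_492)
print(round(fluid, 2), round(rigid, 2))
```

Without intermediate rounding the factors come out at about 5.88 and 6.26, in agreement with the hand calculation above.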

Orbital radius with period = 4 hours:

\sqrt[3]{G \cdot 1 \text{ Earth mass} \cdot \Bigg ( \frac{4\text{ hours}}{2\pi} \Bigg )^2 } = 12,800 km

Alternately we can ask what the orbital period would be if Earth had a moon (not necessarily the moon) at 18,381 km.

T = 2\pi\sqrt{\frac{(18,381 \text{ km})^3}{G*1 \text{ Earth mass} }} = 6.889 hours
Earth would spin 24/6.889 = 3.484 times faster
Earth's angular momentum would be 3.484 * 4.6 * 10^33 Js = 16.03 * 10^33 Js
Our current Moon's angular momentum would be (28.73 - (16.03 - 4.6)) * 10^33 Js = 17.30 * 10^33 Js
That is 17.30 / 28.73 = 0.602 of its present value
So the current Moon's orbit would have been 0.602^2 * 384,399 km = 0.362 * 384,399 km = about 139,000 km

Tidal rhythmites are alternating layers of sand and silt laid down offshore from estuaries having great tidal flows. Daily, monthly and seasonal cycles can be found in the deposits. This geological record indicates that 620 million years ago there were 400±7 solar days/year

The motion of the Moon can be followed with an accuracy of a few centimeters by lunar laser ranging. Laser pulses are bounced off mirrors on the surface of the moon. The results are:

+38.08±0.04 mm/yr (384,399 km / 10.1 billion years)
1.42*10^24 Js/yr (33.33 * 10^33 Js / 23 billion years)
1.42*10^26 Js/century

The corresponding change in the length of the day can be computed:

(1.42*10^26)/(4.6 * 10^33) * 24 hours = 3.087*10^-8 * 24 hours = +2.667 ms/century

620 million years ago the Moon had 1.42*10^24 * 620*10^6 = 0.88*10^33 Js less angular momentum. The Moon's orbit was therefore 384,399 km * ((28.73-0.88)/28.73)^2 = 361,211 km. One month lasted 2.161 * 10^6 seconds (25 days, 16 minutes, 40 seconds).

The Earth spun (4.6+0.88)/4.6 = 1.19 times faster so the day was 24 hours / 1.19 = 20.1680672 hours

The year was 400 "days" * 20.1680672 hours per "day" = 336.135 24-hour periods

Earth's orbit was therefore

\sqrt[3]{G \cdot 1 \text{ Solar mass} \Bigg ( \frac{336.134453 \text{ days}}{2\pi} \Bigg )^2} = 0.9461 au

Therefore Earth must be receding from the sun at 13 m/yr

That is 0.395 au / 4.543 billion years



Titius–Bode law

Titius–Bode law*.

#  Planet   Density (g/cm^3)  Radius (km)  Surface gravity (g)  Semi-major axis (au)
1  Mercury  5.427             2,440        0.377                0.387
2  Venus    5.243             6,052        0.904                0.723
3  Earth    5.515             6,371        1                    1.000
4  Mars     3.934             3,390        0.378                1.524
5  Ceres    2.093             476.2        0.028                2.766
6  Jupiter  1.326             69,911       2.528                5.203
7  Saturn   0.687             58,232       1.065                9.537
8  Ouranos  1.270             25,362       0.904                19.191
9  Neptune  1.638             24,622       1.137                30.069

From Wikipedia:16 Psyche:

16 Psyche is one of the ten most massive asteroids in the asteroid belt. It is over 200 km (120 mi) in diameter and contains a little less than 1% of the mass of the entire asteroid belt. It is thought to be the exposed iron core of a protoplanet


Brown dwarfs

Brown dwarfs mass-radius log-log plot

Hatzes & Rauer (2015), “A Definition for Giant Planets Based on the Mass-Density Relationship”, arXiv:1506.05097 [astro-ph.EP]

Hydrogen   Atomic    g/cm^3    Jupiter
Liquid     1         0.07085   0.053 MJup    0.14
Metallic   1/4       4.5344    3.400 MJup    9.00
Double     1/5.657   12.8250   9.669 MJup    25.56
Triple     1/8       36.2752   27.300 MJup   72.16
Quadruple  1/11.31   102.6     77.350 MJup   204.40
Quintuple  1/16      290.2016  219.000 MJup  578.80
Sextuple   1/22.63   820.8140  618.800 MJup  1636.00

As can be seen in the image to the right, all planets (brown dwarfs) from 1 to 100 Jupiter masses are about 1 Jupiter radius, which is 69,911 km. The largest "puffy" planets are 2 Jupiter radii. 1 Jupiter volume = 1.431×10^15 km^3.

This suggests that the pressure an electron shell (in degenerate matter) can withstand without again becoming degenerate (Electron degeneracy pressure*) is inversely proportional to the sixth power of its radius:

P \propto \frac{1}{r^6}


(This formula only applies to degenerate matter like metallic hydrogen. Non-degenerate matter can withstand far more pressure).

If so then the maximum size (radius) that a planet composed entirely of one (degenerate) element could grow would depend only on, and be inversely proportional to, the atomic mass of its atoms. (Use 2 for the atomic mass of diatomic hydrogen).

Simplified calculation of radius of brown dwarf as core grows from zero to 1 Jupiter radius:

plot (-(1.83*r^3/x -1.83*r^2)+(x^2/2-r^2/2)=0.5) for r=0 to 1 and x=0 to 1
r is radius of core with 2.83 (sqrt(2)^3) times the density of overlying material

Rock floats on top of the metallic hydrogen but iron sinks to the core. 0.1% of the mass of the brown dwarf is iron. Assuming an iron density of 231.85 g/cm^3 (as in Earth's core), the gravity of the iron core will cause the brown dwarf to be about 3% smaller than it would be otherwise.


Dark matter

From Wikipedia:Dark matter

Dark matter is a type of unidentified matter that may constitute about 80% of the total matter in the universe. It has not been directly observed, but its gravitational effects are evident in a variety of astrophysical measurements. The primary evidence for dark matter is that calculations show that many galaxies would fly apart instead of rotating if they did not contain a large amount of matter beyond what can be observed.

From Wikipedia:Gravitational microlensing

A Horseshoe Einstein Ring from Hubble

An Einstein ring.

Gravity lens geometry

Microlensing allows the study of objects that emit little or no light. With microlensing, the lens mass is too low for the displacement of light to be observed easily, but the apparent brightening of the source may still be detected. In such a situation, the lens will pass by the source in seconds to years instead of millions of years.

The Einstein radius, also called the Einstein angle, is the angular radius of the Einstein ring in the event of perfect alignment. It depends on the lens mass M, the distance of the lens dL, and the distance of the source dS:

\theta_E = \sqrt{\frac{4GM}{c^2} \frac{d_S - d_L}{d_S d_L}} (in radians).

For M equal to 60 Jupiter masses, dL = 4000 parsecs, and dS = 8000 parsecs (typical for a Bulge microlensing event), the Einstein radius is 0.00024 arcseconds (angle subtended by 1 au at 4000 parsecs). By comparison, ideal Earth-based observations have angular resolution around 0.4 arcseconds, 1660 times greater.
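The quoted Einstein radius can be reproduced directly from the formula. The following is a minimal sketch; the physical constants (G, c, the mass of Jupiter, the length of a parsec) are standard values supplied here as assumptions, and the function name `einstein_radius_rad` is chosen for this illustration.

```python
import math

# Assumed standard constants (SI units).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_JUP = 1.898e27     # mass of Jupiter, kg
PARSEC = 3.0857e16   # metres per parsec

def einstein_radius_rad(mass_kg, d_lens_m, d_source_m):
    """Einstein angle in radians for a point lens and perfect alignment."""
    geometry = (d_source_m - d_lens_m) / (d_source_m * d_lens_m)
    return math.sqrt(4 * G * mass_kg / C**2 * geometry)

theta_rad = einstein_radius_rad(60 * M_JUP, 4000 * PARSEC, 8000 * PARSEC)
theta_arcsec = math.degrees(theta_rad) * 3600

# Angle subtended by 1 au at 4000 parsecs, by the definition of the parsec.
one_au_at_lens_arcsec = 1 / 4000

print(f"Einstein radius: {theta_arcsec:.5f} arcsec")
print(f"1 au at 4000 pc: {one_au_at_lens_arcsec:.5f} arcsec")
print(f"0.4 arcsec is {0.4 / theta_arcsec:.0f} times larger")
```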

Any brown dwarf surrounded by a circumstellar disk larger and thicker than 1 au would therefore be virtually completely undetectable.



See also: Stellar evolution*, Helium flash*, Schönberg–Chandrasekhar limit*, Coronal heating problem

Image of spiral galaxy M81*.

Spiral galaxy arms diagram

Explanation of spiral galaxy arms.

Fusion of diatomic hydrogen begins around 60 Jupiter masses. Fusion of monatomic helium requires significantly more pressure.

Fusion releases energy that heats the star causing it to expand. The expansion reduces the pressure in the core which reduces the rate of fusion. So the rate of fusion is self limiting. A low mass star has a lifetime of billions of years. A high mass star has a lifetime of only a few tens of millions of years despite starting with more hydrogen.

Low mass stars are far more common than high mass stars. The masses of the two component stars of NGC 3603-A1, A1a and A1b, determined from the orbital parameters are 116 ± 31 M☉ and 89 ± 16 M☉, respectively. This makes them the two most massive stars directly measured, i.e. not estimated from models.

The luminosity of a star is:

L = 4 \pi R^2 \sigma T^4
where σ is the Stefan–Boltzmann constant*:
\sigma = \frac{2\pi^5k_{\rm B}^4}{15h^3c^2} = \frac{\pi^2k_{\rm B}^4}{60\hbar^3c^2} = 5.670373(21) \, \times 10^{-8}\ \textrm{J}\,\textrm{m}^{-2}\,\textrm{s}^{-1}\,\textrm{K}^{-4}

The luminosity of the Sun at 5772 K and 695,700 km is 3.828×10^26 watts.

That's about 62,940,000 watts/m^2 at the Sun's surface.

The brightness of sunlight at the top of the Earth's atmosphere is about 1,360 watts/m^2.
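These solar figures follow from the Stefan–Boltzmann law and can be checked numerically. The sketch below uses the temperature and radius quoted in the text; the Earth–Sun distance of 1 au is an assumed standard value. The surface flux comes out near 6.3×10^7 W/m^2, and the flux at Earth's distance, before any absorption by the atmosphere, near 1,360 W/m^2.

```python
import math

SIGMA = 5.670373e-8      # Stefan-Boltzmann constant, W m^-2 K^-4 (from the text)
T_SUN = 5772.0           # effective temperature, K (from the text)
R_SUN = 6.957e8          # solar radius, m (695,700 km, from the text)
AU = 1.495978707e11      # Earth-Sun distance, m (assumed standard value)

# Power radiated by each square metre of the Sun's surface.
surface_flux = SIGMA * T_SUN**4

# Total luminosity: L = 4 pi R^2 sigma T^4.
luminosity = 4 * math.pi * R_SUN**2 * surface_flux

# The same power spread over a sphere of radius 1 au.
flux_at_earth = luminosity / (4 * math.pi * AU**2)

print(f"surface flux  = {surface_flux:.3e} W/m^2")
print(f"luminosity    = {luminosity:.3e} W")
print(f"flux at 1 au  = {flux_at_earth:.0f} W/m^2")
```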

The plasma inside a star is non-relativistic. A relativistic plasma with a thermal distribution function* has temperatures greater than around 260 keV, or 3.0 × 10^9 K*. Those sorts of temperatures are only created in a supernova. The core of the Sun is about 15 × 10^6 K.

Plasmas, which are normally opaque to light, are transparent to light with frequency higher than the plasma frequency*. The plasma cannot vibrate fast enough to keep up with the light. Plasma frequency is proportional to the square root of the electron density.

\omega = \sqrt{\frac{n_\mathrm{e} q_e^{2}}{m_e \varepsilon_0}}
ne = number of electrons / volume.
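A short sketch of the plasma frequency formula, to make the square-root scaling concrete. The electron density of 10^18 per cubic metre is a purely hypothetical value chosen for illustration, not a figure for any particular star; the constants are standard SI values.

```python
import math

Q_E = 1.602176634e-19     # elementary charge, C
M_E = 9.1093837015e-31    # electron mass, kg
EPS_0 = 8.8541878128e-12  # vacuum permittivity, F/m

def plasma_frequency(n_e):
    """Angular plasma frequency in rad/s for an electron density n_e in m^-3."""
    return math.sqrt(n_e * Q_E**2 / (M_E * EPS_0))

n_e = 1e18  # hypothetical electron density, m^-3
omega = plasma_frequency(n_e)
print(f"omega = {omega:.3e} rad/s, f = {omega / (2 * math.pi):.3e} Hz")

# Quadrupling the density only doubles the plasma frequency.
print(plasma_frequency(4 * n_e) / omega)
```

Light with a frequency above the printed value would pass through such a plasma; light below it would be reflected or absorbed.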

See also: Bremsstrahlung#Thermal_bremsstrahlung*

From Wikipedia:Radiative zone

From 0.3 to 1.2 solar masses, the region around the stellar core is a radiative zone. (The light frequency is higher than the plasma frequency). The radius of the radiative zone increases monotonically with mass, with stars around 1.2 solar masses being almost entirely radiative.

From Wikipedia:Convective zone

In main sequence stars of less than about 1.3 solar masses, the outer envelope of the star contains a region of relatively low temperature which causes the frequency of the light to be lower than the plasma frequency which causes the opacity to be high enough to produce a steep temperature gradient. This produces an outer convection zone. The Sun's convection zone extends from 0.7 solar radii (500,000 km) to near the surface.

From Wikipedia:Cepheid variable

A Cepheid variable is a type of star that pulsates radially, varying in both diameter and temperature and producing changes in brightness with a well-defined stable period and amplitude.

A strong direct relationship between a Cepheid variable's luminosity and pulsation period allows one to know the true luminosity of a Cepheid by simply observing its pulsation period. This in turn allows one to determine the distance to the star, by comparing its known luminosity to its observed brightness.

From Wikipedia:Variable star

The pulsation of cepheids is known to be driven by oscillations in the ionization of helium. From fully ionized (more opaque) He++ to partially ionized (more transparent) He+ and back to He++. See Kappa mechanism*.

In the swelling phase, the star's outer layers expand, causing them to cool. Because of the decreasing temperature the degree of ionization also decreases. This makes the gas more transparent, and thus makes it easier for the star to radiate its energy. This in turn will make the star start to contract. As the gas is thereby compressed, it is heated and the degree of ionization again increases. This makes the gas more opaque, and radiation temporarily becomes captured in the gas. This heats the gas further, leading it to expand once again. Thus a cycle of expansion and compression (swelling and shrinking) is maintained.

From Wikipedia:Instability strip

In normal A-F-G stars He is neutral in the stellar photosphere. Deeper below the photosphere, at about 25,000–30,000K, begins the He II layer (first He ionization). Second ionization (He III) starts at about 35,000–50,000K.

Recombination and Reionization

From Wikipedia:Reionization

The first phase change of hydrogen in the universe was recombination due to the cooling of the universe to the point where electrons and protons form neutral hydrogen. The universe was opaque before the recombination, due to the scattering of photons (of all wavelengths) off free electrons, but it became increasingly transparent as more electrons and protons combined to form neutral hydrogen atoms. The Dark Ages of the universe start at that point, because there were no light sources.

The second phase change occurred once objects started to condense in the early universe that were energetic enough to re-ionize neutral hydrogen. As these objects formed and radiated energy, the universe reverted to once again being an ionized plasma. (See Warm–hot intergalactic medium*). At this time, however, matter had been diffused by the expansion of the universe, and the scattering interactions of photons and electrons were much less frequent than before electron-proton recombination. Thus, a universe full of low density ionized hydrogen will remain transparent, as is the case today.

The Sun's photosphere has a temperature between 4,500 and 6,000 K. Negative hydrogen ions (H-) are the primary reason for the highly opaque nature of the photosphere.

As the star burns hydrogen, heavier elements build up in the core. Eventually the outer layers of the star are blown away and all that's left is the core. We call what's left a white dwarf.



A plot of 22000 stars from the Hipparcos Catalogue together with 1000 low-luminosity stars (red and white dwarfs) from the Gliese Catalogue of Nearby Stars. The ordinary hydrogen-burning dwarf stars like the Sun are found in a band running from top-left to bottom-right called the Main Sequence. Giant stars form their own clump on the upper-right side of the diagram. Above them lie the much rarer bright giants and supergiants. At the lower-left is the band of white dwarfs.


White dwarfs

Z* A* Element Abundance (ppm) g/cm^3 g/cm^3 Radius (km)
1 1 Hydrogen* 739,000 0.07085 290.2 71,492
1 2 Deuterium* 100 0.1417 580.4 35,746
2 4 Helium* 240,000 0.125 512 35,746
4 8 Beryllium* 0 2 8,192 17,873
8 16 Oxygen* 10,400 32 131,072   8,936
6 12 Carbon* 4,600 10.125 41,472 11,915
10 20 Neon* 1,340 78.125 320,000   7,149
26 56 Iron-56* 1,090 3844.75 15,748,096   2,553
7 14 Nitrogen* 960 18.76 76,841 10,213
14 28 Silicon* 650 300.125 1,229,312   5,107
12 24 Magnesium* 580 162 663,552   5,958
16 32 Sulfur* 440 512 2,097,152   4,468

A white dwarf is about the same size as the Earth but is far denser and far more massive. A typical temperature for a white dwarf is 25,000 K. That would make its surface brightness 350 times the surface brightness of the sun.
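The factor of 350 follows from the Stefan–Boltzmann law, since the power radiated per unit area scales as the fourth power of temperature. A minimal check, assuming the Sun's effective temperature of 5772 K:

```python
T_WHITE_DWARF = 25_000.0  # K, typical value quoted above
T_SUN = 5772.0            # K, assumed solar effective temperature

# Surface brightness scales as T^4, so the ratio needs no other constants.
brightness_ratio = (T_WHITE_DWARF / T_SUN) ** 4
print(f"{brightness_ratio:.0f} times the Sun's surface brightness")
```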

Simplified calculation of radius of White dwarf as core grows from zero to half the original radius:

plot (-(15*r^3/x -15*r^2)+(x^2/2-r^2/2)=0.5) for r=0 to 0.5 and x=0 to 1
r is radius of core. The core has 16 times the density (twice the atomic number) of the overlying material. The final state has half the radius and twice the mass of the original white dwarf.

A 0.6 solar mass white dwarf is 8900 km in radius, which is 8.03 times smaller than Jupiter, which suggests a composition of oxygen. It has a surface gravity of

\frac{G \cdot 0.6 \text{ solar mass}}{(8900 \text{ km})^2} = 103,000 g's

Its density is 404,000 g/cm^3, which is 12,625 times denser than oxygen in its ground state. That's 23.285^3 times denser. Sqrt(2)^9 = 22.63

A 1.13 solar mass white dwarf is 4500 km in radius, which is 15.9 times smaller than Jupiter, which suggests a composition of sulfur. It has a surface gravity of

\frac{G \cdot 1.13 \text{ solar mass}}{(4500 \text{ km})^2} = 755,000 g's

Its density is 5.887 × 10^6 g/cm^3, which is 11,498 times denser than sulfur in its ground state. That's 22.57^3 times denser.
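The surface gravity and mean density of both example white dwarfs can be recomputed from their masses and radii. The sketch below assumes standard values for G, the solar mass and standard gravity; the ground-state densities of 32 and 512 g/cm^3 are the oxygen and sulfur entries from the table above. The function names are chosen for this illustration.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (assumed)
M_SUN = 1.989e30   # solar mass, kg (assumed)
G0 = 9.80665       # standard gravity, m/s^2 (assumed)

def surface_gravity_g(mass_solar, radius_km):
    """Surface gravity in units of Earth gravity."""
    return G * mass_solar * M_SUN / (radius_km * 1e3) ** 2 / G0

def mean_density_g_cm3(mass_solar, radius_km):
    """Mean density in g/cm^3 of a uniform sphere."""
    volume_cm3 = 4 / 3 * math.pi * (radius_km * 1e5) ** 3
    return mass_solar * M_SUN * 1e3 / volume_cm3

# (mass in solar masses, radius in km, ground-state density from the table)
examples = {"oxygen": (0.6, 8900, 32.0), "sulfur": (1.13, 4500, 512.0)}

for element, (mass, radius, ground) in examples.items():
    density = mean_density_g_cm3(mass, radius)
    linear = (density / ground) ** (1 / 3)  # compression per linear dimension
    print(f"{element}: {surface_gravity_g(mass, radius):,.0f} g's, "
          f"{density:.3e} g/cm^3, {linear:.2f}^3 times ground state")
```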

For a white dwarf made of iron:

Radius: 2,553 km
Surface area: 8.2 × 10^7 km^2
Mass per surface area: 3.8 × 10^13 g/mm^2
Mass: 4.454 × 10^7 g/cm^3 * (4/3)*pi*(2553 km)^3 in solar masses = 1.56 solar masses.
Surface gravity: 3.24 × 10^6 g's
Density: (sqrt(2)^9)^3 * 3844.75 g/cm^3 = 4.454 × 10^7 g/cm^3
Core pressure: 1.8 × 10^19 bars
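The iron figures above can be retraced as follows. This is a sketch under stated assumptions: uniform density throughout the star, standard values for G, the solar mass and standard gravity, and the central pressure of a uniform sphere, 3GM^2/(8πr^4), which is the same expression the neutron star section of this article uses.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (assumed)
M_SUN = 1.989e30   # solar mass, kg (assumed)
G0 = 9.80665       # standard gravity, m/s^2 (assumed)

GROUND_DENSITY = 3844.75  # g/cm^3, iron-56 entry in the table above
RADIUS_KM = 2553          # km, iron-56 entry in the table above

# Compressed density: (sqrt(2)^9)^3 times the ground-state density.
density = (math.sqrt(2) ** 9) ** 3 * GROUND_DENSITY  # g/cm^3

radius_m = RADIUS_KM * 1e3
volume_cm3 = 4 / 3 * math.pi * (radius_m * 100) ** 3
mass_kg = density * volume_cm3 / 1e3
mass_solar = mass_kg / M_SUN

surface_area_km2 = 4 * math.pi * RADIUS_KM**2
mass_per_area_g_mm2 = mass_kg * 1e3 / (surface_area_km2 * 1e12)
gravity_g = G * mass_kg / radius_m**2 / G0

# Central pressure of a uniform-density sphere, converted from Pa to bar.
core_pressure_bar = 3 / (8 * math.pi) * G * mass_kg**2 / radius_m**4 / 1e5

print(f"density {density:.3e} g/cm^3, mass {mass_solar:.2f} solar masses")
print(f"area {surface_area_km2:.2e} km^2, {mass_per_area_g_mm2:.1e} g/mm^2")
print(f"gravity {gravity_g:.2e} g's, core pressure {core_pressure_bar:.1e} bar")
```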

The core of a white dwarf with a mass greater than the Chandrasekhar limit* (1.44 solar masses) will undergo gravitational collapse and become a neutron star.


Nucleosynthesis in a star


Neutron stars

See also: Gravitoelectromagnetism*

Assuming a solid honeycomb array of neutron pairs with radius 1 fm, a sheet of neutronium* (if such a thing existed) would have a density of 1.2893598 g/mm2.

The density of a liquid neutron star made of neutron pairs with radius 1 fm would be 479.8×10^12 g/cm^3.

The maximum observed mass of neutron stars is about 2.01 M☉.

At that density a 2 solar mass neutron star would have a radius of 12.5544 km.

The Tolman–Oppenheimer–Volkoff limit* (or TOV limit) is an upper bound to the mass of cold, nonrotating neutron stars, analogous to the Chandrasekhar limit for white dwarf stars. Observations of GW170817 suggest that the limit is close to 2.17 solar masses.

The equation of state for a neutron star is not yet known.

A 2 solar mass neutron star with radius of 12.5544 km would have a surface gravity of:

\frac{G \cdot 2 \text{ solar mass}}{(12.5544 \text{ km})^2} = 1.717 × 10^11 g's

The pressure in its core would be \frac{3}{8 \pi} \frac{G \cdot Mass_{active} \cdot Mass_{passive}}{r^4} = 5.07 × 10^28 bar
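The radius, surface gravity and core pressure quoted for the 2 solar mass neutron star can be reproduced from the assumed density. The sketch below treats the star as a uniform Newtonian sphere, as the text does, and takes G, the solar mass and standard gravity as assumed standard values; it makes no attempt at a relativistic treatment.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (assumed)
M_SUN = 1.989e30   # solar mass, kg (assumed)
G0 = 9.80665       # standard gravity, m/s^2 (assumed)

DENSITY = 479.8e12  # g/cm^3, the neutron-pair density quoted above
mass_kg = 2 * M_SUN

# Radius of a uniform sphere with this mass and density.
volume_cm3 = mass_kg * 1e3 / DENSITY
radius_m = (3 * volume_cm3 / (4 * math.pi)) ** (1 / 3) / 100

gravity_g = G * mass_kg / radius_m**2 / G0

# Central pressure of a uniform sphere, 3 G M^2 / (8 pi r^4), in bar.
core_pressure_bar = 3 / (8 * math.pi) * G * mass_kg**2 / radius_m**4 / 1e5

print(f"radius = {radius_m / 1e3:.2f} km")
print(f"surface gravity = {gravity_g:.3e} g's")
print(f"core pressure = {core_pressure_bar:.3e} bar")
```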

From Wikipedia:Glitch (astronomy)

A glitch (See Global_resurfacing_event*) is a sudden increase of up to 1 part in 10^6 in the rotational frequency of a rotation-powered pulsar. Following a glitch is a period of gradual recovery, lasting from days to years, where the observed periodicity slows to a period close to that observed before the glitch.

From Wikipedia:Supermassive black hole

A supermassive black hole (SMBH or SBH) is the largest type of black hole*, on the order of hundreds of thousands to billions of solar masses* (M☉), and is found in the centre of almost all currently known massive galaxies.

The mean ratio of black hole mass to bulge mass is now believed to be approximately 1:1000.

Some supermassive black holes appear to be over 10 billion solar masses.

From Wikipedia:Quasar:

A quasar is an active galactic nucleus of very high luminosity. A quasar consists of a supermassive black hole surrounded by an orbiting accretion disk of gas. The most powerful quasars have luminosities exceeding 10^41 W (2.6×10^14 solar luminosities, or the energy equivalent of 17.6 M☉/year), thousands of times greater than the luminosity of a large galaxy such as the Milky Way.

Growing at a rate of 17.6 solar masses per year, a 3.3 billion solar mass black hole would take 187,000,000 years to reach full size.
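The conversion from luminosity to solar masses per year is E = mc^2 applied to one year of output. The sketch below follows the text in treating that mass-energy rate as if it were the black hole's growth rate, which is a simplification; the constants are assumed standard values.

```python
C = 2.99792458e8          # speed of light, m/s
M_SUN = 1.989e30          # solar mass, kg (assumed)
L_SUN = 3.828e26          # solar luminosity, W
SECONDS_PER_YEAR = 3.15576e7

QUASAR_LUMINOSITY = 1e41  # W, from the text
FINAL_MASS = 3.3e9        # solar masses, from the text

# Luminosity in solar units.
solar_luminosities = QUASAR_LUMINOSITY / L_SUN

# Mass equivalent of one year of radiated energy, in solar masses.
mass_rate = QUASAR_LUMINOSITY * SECONDS_PER_YEAR / C**2 / M_SUN

# Time to accumulate the final mass at that constant rate.
growth_years = FINAL_MASS / mass_rate

print(f"{solar_luminosities:.2e} solar luminosities")
print(f"{mass_rate:.2f} solar masses per year")
print(f"{growth_years:.3e} years")
```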


Gamma-ray bursts

Burst durations

From Wikipedia:Gamma-ray burst

Gamma-ray bursts (GRBs) are extremely energetic explosions that have been observed in distant galaxies. They are the brightest electromagnetic events known to occur in the universe. Bursts can last from ten milliseconds to several hours. After an initial flash of gamma rays, a longer-lived "afterglow" is usually emitted at longer wavelengths (X-ray, ultraviolet, optical, infrared, microwave and radio).

Assuming the gamma-ray explosion to be spherical, the energy output of GRB 080319B* would be within a factor of two of the rest-mass energy of the Sun (the energy which would be released were the Sun to be converted entirely into radiation).

No known process in the universe can produce this much energy in such a short time.

From Wikipedia:GRB 111209A

GRB 111209A is the longest lasting gamma-ray burst (GRB) detected by the Swift Gamma-Ray Burst Mission on December 9, 2011. Its duration is longer than 7 hours.

On average two long gamma-ray bursts occur every 3 days, with an average redshift of 2. Making the simplifying assumption that all long gamma-ray bursts occur at exactly redshift 2 (9.2 × 10^9 light years) we get one gamma-ray burst per (1,635,000 light years)^3.

There are 12 galaxies per cubic megaparsec. That's 1 galaxy per (1,425,000 light years)^3.

One short GRB per 3 days at an average redshift of 0.5 (4.6 × 10^9 light years) gives 1 GRB per (1,300,000 light years)^3.

Duration    Type          Radius             Mass?               Escape velocity
0.3 sec     Short         89,900 km          18.535 MJup         0.0007622 c
3 sec       Long          899,377 km         1.76928 M☉          0.002410 c
30 sec                    8,994,000 km       176.928 M☉          0.007622 c
5 min                     89,940,000 km      17,692.8 M☉         0.02410 c
50 min                    899,400,000 km     1,769,280 M☉        0.07622 c
8.33 hours  Ultra-long    8,994,000,000 km   176,928,000 M☉      0.2410 c
3.472 days  Hypothetical  89,940,000,000 km  17,692,800,000 M☉   2.410 c
Surface gravity = 29.6 g's, Density = 3.461099×10^8 g/mm^2


Ultra-high-energy Cosmic rays

From Wikipedia:Cosmic ray

Cosmic rays* are high-energy radiation, mainly originating outside the Solar System and even from distant galaxies. Upon impact with the Earth's atmosphere, cosmic rays can produce showers of secondary particles that sometimes reach the surface. Composed primarily of high-energy protons and atomic nuclei, they are of uncertain origin. Data from the Fermi Space Telescope (2013) have been interpreted as evidence that a significant fraction of primary cosmic rays originate from the supernova explosions of stars. Active galactic nuclei are also theorized to produce cosmic rays.

From Wikipedia:Ultra-high-energy cosmic ray

In astroparticle physics*, an ultra-high-energy cosmic ray (UHECR) is a cosmic ray particle with a kinetic energy greater than 1×10^18 eV*, far beyond both the rest mass* and energies typical of other cosmic ray particles.

An extreme-energy cosmic ray (EECR) is an UHECR with energy exceeding 5×10^19 eV (about 8 joules), the so-called Greisen–Zatsepin–Kuzmin limit* (GZK limit). This limit should be the maximum energy of cosmic ray protons that have traveled long distances (about 160 million light years), since higher-energy protons would have lost energy over that distance due to scattering from photons in the cosmic microwave background* (CMB). However, if an EECR is not a proton, but a nucleus with A nucleons, then the GZK limit applies to its nucleons, each of which carries only a fraction 1/A of the total energy.

These particles are extremely rare; between 2004 and 2007, the initial runs of the Pierre Auger Observatory* (PAO) detected 27 events with estimated arrival energies above 5.7×10^19 eV, i.e., about one such event every four weeks in the 3000 km^2 area surveyed by the observatory.

At that rate 1.365 × 10^18 particles will fall onto a star with radius 1 million kilometers every hundred million years.
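The particle count can be retraced by scaling the observatory's event rate up to the size of the star. The quoted figure is reproduced when the star's collecting area is taken to be its cross-section, πr^2; that choice, and the 365.25-day year, are assumptions made in this sketch.

```python
import math

DETECTOR_AREA_KM2 = 3000      # area surveyed by the observatory, from the text
EVENTS_PER_YEAR = 365.25 / 28  # one event every four weeks
STAR_RADIUS_KM = 1e6
YEARS = 1e8

# Events per square kilometre per year seen by the detector.
rate_per_km2_yr = EVENTS_PER_YEAR / DETECTOR_AREA_KM2

# Assumption: the star intercepts particles over its cross-section.
star_cross_section_km2 = math.pi * STAR_RADIUS_KM**2

particles = rate_per_km2_yr * star_cross_section_km2 * YEARS
print(f"{particles:.3e} particles in {YEARS:.0e} years")
```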

From Wikipedia:Oh-My-God particle:

The Oh-My-God particle was an ultra-high-energy cosmic ray detected on the evening of 15 October 1991 by the Fly's Eye Cosmic Ray Detector. Its observation was a shock to astrophysicists, who estimated its energy to be approximately 3×10^20 eV.

3×10^20 eV = mv^2 = 4 neutron masses * 256^2 * (2.41 c)^2 * 209,711



Madden-Julian Oscillation monitoring
National Weather Service
Example of a cold front

An idealised view of three large circulation cells showing surface winds

A cold front is the leading edge of a cold dense mass of air, replacing (at ground level) a warmer mass of air. Like a hot air balloon, the warm air rises above the cold air. The rising warm air expands and therefore cools. This causes the moisture within it to condense into droplets and releases the latent heat of condensation which causes the warm air to rise even further. If the warm air is moist enough, rain can occur along the boundary. A narrow line of thunderstorms often forms along the front. Temperature changes across the boundary can exceed 30 °C (54 °F).

The polar front is a cold front that arises as a result of cold polar air meeting warm subtropical air at the boundary between the polar cell and the Ferrel cell in each hemisphere.

Northwest Pacific extratropical cyclone 2013-01-15 0300Z

An extratropical cyclone. An atmospheric river extends from the bottom left.

Earth's weather is driven by 2 main areas.

  • The polar front.
    • When the polar front dominates we have an El Niño.
  • The Intertropical Convergence Zone.
    • When the Intertropical Convergence Zone dominates we have a La Niña.

In the Pacific, strong MJO activity is often observed 6 – 12 months prior to the onset of an El Niño episode, but is virtually absent during the maxima of some El Niño episodes, while MJO activity is typically greater during a La Niña episode.

During an El Niño, extratropical cyclones, which form along the polar front, can become so large that they draw moisture up directly from the tropics in what is called an atmospheric river. (See the image to the right.) Atmospheric rivers are typically several thousand kilometers long and only a few hundred kilometers wide, and a single one can carry a greater flux of water than the Earth's largest river, the Amazon.[61] The Amazon discharges more water into the oceans than the next 7 largest rivers combined. (The Amazon river valley is an aulacogen.)


Tropical air is far warmer than air outside the tropics and therefore holds far more moisture; as a result, thunderstorms in the tropics are much taller. Nevertheless, severe thunderstorms are not common in the tropics because the storm's own downdraft shuts off the inflow of warm moist air, killing the thunderstorm before it can become severe. Severe thunderstorms tend to occur further north because of the polar jet stream. The jet stream pushes against the top of the thunderstorm, displacing the downdraft so that it can no longer shut off the inflow of warm moist air. As a result, severe thunderstorms can continue to feed and grow for many hours, whereas normal thunderstorms only last about 30 minutes.

Supercell side view

Over a 30-minute period a normal thunderstorm releases 10^15 joules of energy, equivalent to 0.24 megatons of TNT. A storm that lasted 24 hours would release about 50 times as much energy, equivalent to 12 megatons of TNT. A hurricane (a tropical cyclone) releases as much energy as 1000 such day-long storms: 5.2 × 10^19 joules/day, equivalent to roughly 12,000 megatons of TNT per day.
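The TNT equivalents can be checked with the standard definition of one megaton of TNT as 4.184 × 10^15 joules. The storm energies are the ones quoted above; the helper name `megatons` is chosen for this illustration.

```python
JOULES_PER_MEGATON = 4.184e15  # definition of a megaton of TNT

def megatons(joules):
    """Convert an energy in joules to megatons of TNT."""
    return joules / JOULES_PER_MEGATON

thunderstorm_j = 1e15                               # one 30-minute thunderstorm
day_long_storm_j = thunderstorm_j * (24 * 60 / 30)  # 48 half-hour periods
hurricane_j_per_day = 5.2e19

print(f"thunderstorm:   {megatons(thunderstorm_j):.2f} Mt")
print(f"day-long storm: {megatons(day_long_storm_j):.1f} Mt")
print(f"hurricane:      {megatons(hurricane_j_per_day):,.0f} Mt per day")
print(f"hurricane / day-long storm: {hurricane_j_per_day / day_long_storm_j:.0f}")
```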

The lowest pressure recorded for an extratropical cyclone in the northern hemisphere was set by the storm of January 10, 1993 between Iceland and Scotland, which deepened to a central pressure of 912-915 mb (26.93”-27.02”). Most hurricanes have an eye below 990 millibars. In 2005, Hurricane Wilma reached the lowest barometric pressure ever recorded in an Atlantic Basin hurricane: 882 millibars. Hurricanes rarely form in the South Atlantic.

If Earth's atmosphere were only slightly thicker then the air would be warmer and the amount of water vapor in the air would be much greater and lightning would therefore be much more common. The lightning would break apart the air molecules which would be washed down into the sea where they would end up in sediments which get subducted into the Earth. In this way the Earth's average air pressure is maintained at its current level.

During most of its history Earth only had one atmospheric cell that extended from the pole to the equator and as a result Earth was very much warmer. A sort of ultra La Niña.

  • Hadley cell

During an ice age the Earth only has two cells. A sort of ultra El Niño. Ice ages are probably caused by deforestation caused by megafauna.

  • Polar cell
  • Ferrel cell

The Earth's atmosphere currently has 3 cells.

  • Polar cell
  • Ferrel cell
  • Hadley cell

From Wikipedia:Scale height

Scale height is the increase in altitude for which the atmospheric pressure decreases by a factor of e. The scale height remains constant for a particular temperature. It can be calculated by

H = \frac{kT}{Mg}
  • k = Boltzmann constant = 1.38 × 10^−23 J·K^−1
  • T = mean atmospheric temperature in kelvins = 250 K for Earth
  • M = mean mass of a molecule (units kg)
  • g = acceleration due to gravity on planetary surface (m/s²)
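A minimal sketch of the scale height formula for Earth. The mean molar mass of dry air (28.97 g/mol) and standard gravity are assumed values that the text does not give. With T = 250 K the formula yields roughly 7.3 km; a scale height of 8.5 km corresponds to a temperature closer to 290 K.

```python
K_B = 1.380649e-23            # Boltzmann constant, J/K
AMU = 1.66053907e-27          # atomic mass unit, kg
MEAN_MOLAR_MASS_AIR = 28.97   # g/mol, assumed value for dry air
G_EARTH = 9.80665             # m/s^2, assumed standard gravity

def scale_height_m(temperature_k, molecule_mass_kg, gravity):
    """Scale height H = kT / (Mg) in metres."""
    return K_B * temperature_k / (molecule_mass_kg * gravity)

air_molecule_kg = MEAN_MOLAR_MASS_AIR * AMU

for temperature in (250, 290):
    height = scale_height_m(temperature, air_molecule_kg, G_EARTH)
    print(f"T = {temperature} K: H = {height / 1000:.1f} km")
```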

Approximate atmospheric scale heights for selected Solar System bodies follow.

  • Venus: 15.9 km
  • Earth: 8.5 km
  • Mars: 11.1 km
  • Jupiter: 27 km
  • Saturn: 59.5 km
  • Titan: 21 km
  • Uranus: 27.7 km
  • Neptune: 19.1–20.3 km
  • Pluto: ~60 km

If all of Earth's atmosphere were at 1 bar then the atmosphere would be 8.5 km thick.



See also: Emergence* and Nanobe*
External link: Molecular biology of the cell

Did life begin with nucleic acids* or amino acids? Maybe it began with a molecule that was both a nucleic acid and an amino acid.



Creating the monomers in the Primordial soup* is easy but getting the monomers to bond into a polymer is hard. So maybe it wasn't a polymer at all. Maybe it was a one-dimensional liquid crystal. See Mesogen*.


Unexplained phenomena

See also: Wikipedia:List of unsolved problems in physics

Books published by William R. Corliss* include:

  • Mysteries of the Universe (1967)
  • Mysteries Beneath the Sea (1970)
  • Strange Phenomena: A Sourcebook of Unusual Natural Phenomena (1974)
  • Strange Artifacts: A Sourcebook on Ancient Man (1974)
  • The Unexplained (1976)
  • Strange Life (1976)
  • Strange Minds (1976)
  • Strange Universe (1977)
  • Handbook of Unusual Natural Phenomena (1977)
  • Strange Planet (1978)
  • Ancient Man: A Handbook of Puzzling Artifacts (1978)
  • Mysterious Universe: A Handbook of Astronomical Anomalies (1979)
  • Unknown Earth: A Handbook of Geological Enigmas (1980)
  • Incredible Life: A Handbook of Biological Mysteries (1981)
  • The Unfathomed Mind: A Handbook of Unusual Mental Phenomena (1982)
  • Lightning, Auroras, Nocturnal Lights, and Related Luminous Phenomena (1982)
  • Tornados, Dark Days, Anomalous Precipitation, and Related Weather Phenomena (1983)
  • Earthquakes, Tides, Unidentified Sounds, and Related Phenomena (1983)
  • Rare Halos, Mirages, Anomalous Rainbows, and Related Electromagnetic Phenomena (1984)
  • The Moon and the Planets (1985)
  • The Sun and Solar System Debris (1986)
  • Stars, Galaxies, Cosmos (1987)
  • Carolina Bays, Mima Mounds, Submarine Canyons (1988)
  • Anomalies in Geology: Physical, Chemical, Biological (1989)
  • Neglected Geological Anomalies (1990)
  • Inner Earth: A Search for Anomalies (1991)
  • Biological Anomalies: Humans I (1992)
  • Biological Anomalies: Humans II (1993)
  • Biological Anomalies: Humans III (1994)
  • Science Frontiers: Some Anomalies and Curiosities of Nature (1994)
  • Biological Anomalies: Mammals I (1995)
  • Biological Anomalies: Mammals II (1996)
  • Biological Anomalies: Birds (1998)
  • Ancient Infrastructure: Remarkable Roads, Mines, Walls, Mounds, Stone Circles: A Catalog of Archeological Anomalies (1999)
  • Ancient Structures: Remarkable Pyramids, Forts, Towers, Stone Chambers, Cities, Complexes: A Catalog of Archeological Anomalies (2001)
  • Remarkable Luminous Phenomena in Nature: A Catalog of Geophysical Anomalies (2001)
  • Scientific Anomalies and other Provocative Phenomena (2003)
  • Archeological Anomalies: Small Artifacts (2003)
  • Archeological Anomalies: Graphic Artifacts I (2005)



See also: Myers–Briggs Type Indicator*

Fear is like dirt and it washes right off.

From Wikipedia:Myers–Briggs Type Indicator

Jung's typological model regards psychological type as similar to left or right handedness: people are either born with, or develop, certain preferred ways of perceiving and deciding. The MBTI sorts some of these psychological differences into four opposite pairs, or "dichotomies", with each pair associated with a basic human drive:


  • Extraversion/Introversion
  • Sensing/Intuition
  • Thinking/Feeling
  • Perception/Judging

Sensing types are more likely to trust information that is in the present, tangible, and concrete: that is, empirical information that can be understood by the five senses. They tend to distrust hunches, which seem to come "out of nowhere".

Intuition types tend to be more interested in the underlying reality than in superficial appearance.

Extraverted types recharge and get their energy from spending time with people.

Introverted types recharge and get their energy from spending time alone.

An ambivert is both introverted and extraverted.

Thinking types tend to decide things from a more detached standpoint, measuring the decision by what seems reasonable, logical, causal, consistent, and matching a given set of rules.

Feeling types tend to come to decisions by associating or empathizing with the situation, looking at it 'from the inside' and weighing the situation to achieve, on balance, the greatest harmony, consensus and fit, considering the needs of the people involved.

A hermaphrodite is both Feeling and Thinking

Perception types like to "keep their options open" (in other words they like to cheat).

Judging types are more comfortable with a structured environment, one that is planned and organized.


Consciousness is being aware of being aware. To be aware means knowing what you are doing. Computers know how to do things but don't yet know what they are doing.

Qualia are deeply mystifying. It is very hard to imagine how electrical signals passing through the microtubules of the brain could possibly produce something like the perception of colors.

But imagine a computer that knows what it is doing that is hooked up to a camera. Imagine that the computer is able to identify objects and intelligent enough to answer questions about what it is seeing. Obviously it must be perceiving some sort of sensation.

But that sensation would be like our perception of black and white. It would just be information. It would be devoid of beauty. It would not be like our perception of beautiful colors like yellow, red, or blue (which are colorized versions of white, grey, and black).

The computer would live in a world without beauty or pleasure. But it would also live in a world without pain. It's hard to tell whether one should feel sorry for it or envy it, especially when one considers how much time we spend doing stuff we hate in order to avoid something we hate even more.



See also: Method of loci*

To memorize some fact it helps to associate the fact with some abstract imagery. The more bizarre, outlandish, or even ridiculous the imagery the easier it is to remember the fact. I have no doubt that this explains much of the imagery of mythology.

Those who can't remember mythology are doomed to repeat it.

Mythological landscape


REM Sleep

Animals that are allowed to get deep sleep but prevented from getting REM sleep die. Even schizoids require a little bit of REM sleep. Death by sleep deprivation* is a long, slow, and painful way to die.


Sleep Hypnogram



From Wikipedia:Fascism

Fascism is a radical, authoritarian or totalitarian nationalist or ultranationalist political ideology. Fascists paradoxically promote violence and war as actions that create positive transformation in society and exalt militarism as providing national regeneration, spiritual renovation, vitality, education, instilling of a will to dominate in people's character, and creating national comradeship through military service. Fascists view conflict as an inevitable fact of life that is responsible for all human progress.

Ultimately, it is easier to define fascism by what it is against than by what it is for. Fascism is anti-anarchist, anti-communist, anti-conservative, anti-democratic, anti-individualist, anti-liberal, anti-parliamentary, anti-bourgeois, and anti-proletarian. It entails a distinctive type of anti-capitalism and is typically, with few exceptions, anti-clerical. Fascism rejects the concepts of egalitarianism, materialism, and rationalism in favour of action, discipline, hierarchy, spirit, and will. In economics, fascists oppose liberalism (as a bourgeois movement) and Marxism (as a proletarian movement) for being exclusive economic class-based movements.

Indeed, fascism is perhaps best described as "anti-ism"; that is, the philosophy of being against everyone and everything all of the time. The only place where fascism makes any sense is boot camp. But if fascists had their way they would turn the entire world into one big never-ending boot camp.


See also: Cult of personality*

A report[62][63] prepared during the war by the United States Office of Strategic Services describing Hitler's psychological profile states:

He has been able, in some manner or other, to unearth and apply successfully many factors pertaining to group psychology.

Capacity to appeal to the most primitive, as well as the most ideal inclinations in man, to arouse the basest instincts and yet cloak them with nobility, justifying all actions as means to the attainment of an ideal goal.

Appreciation of winning confidence from the people by a show of efficiency within the organization and government. It is said that foods and supplies are already in the local warehouses when the announcement concerning the date of distribution is made. Although they could be distributed immediately the date is set for several weeks ahead in order to create an impression of super-efficiency and win the confidence of the people. Every effort is made to avoid making a promise which cannot be fulfilled at precisely the appointed time.

Hitler's ability to repudiate his own conscience in arriving at political decisions has eliminated the force which usually checks and complicates the forward-going thoughts and resolutions of most socially responsible statesmen. He has, therefore, been able to take that course of action which appeals to him as most effective without pulling his punches. The result has been that he has frequently outwitted his adversaries and attained ends which would not have been as easily attained by a normal course. Nevertheless, it has helped to build up the myth of his infallibility and invincibility.

Equally important has been his ability to persuade others to repudiate their individual consciences and assume that role himself. He can then decree for the individual what is right and wrong, permissible or impermissible and can use them freely in the attainment of his own ends. As Goering has said: "I have no conscience. My conscience is Adolph Hitler."

This has enabled Hitler to make full use of terror and mobilize the fears of the people which he evaluated with an almost uncanny precision.

His primary rules were: never allow the public to cool off; never admit a fault or wrong; never concede that there may be some good in your enemy; never leave room for alternatives; never accept blame; concentrate on one enemy at a time and blame him for everything that goes wrong; people will believe a big lie sooner than a little one; and if you repeat it frequently enough people will sooner or later believe it.



From Wikipedia:Sturmabteilung

The Sturmabteilung (SA), literally Storm Detachment, functioned as the original paramilitary wing of the Nazi Party. The SA developed by organizing and formalizing the groups of ex-soldiers and beer hall brawlers. It played a significant role in Adolf Hitler's rise to power* in the 1920s and 1930s.

Its primary purposes were providing protection for Nazi rallies and assemblies, disrupting the meetings of opposing parties, fighting against the paramilitary units of the opposing parties, especially the Red Front Fighters League of the Communist Party of Germany, and intimidating Slavs, Romanis, trade unionists, and, especially, Jews – for instance, during the Nazi boycott of Jewish businesses. The SA were also called the "Brownshirts" (Braunhemden) from the color of their uniform shirts.

In 1922, the Nazi Party created a youth section, the Jugendbund, for young men between the ages of 14 and 18 years. Its successor, the Hitler Youth (Hitlerjugend or HJ), remained under SA command until May 1932.

While Hitler was in prison, Ernst Röhm helped to create the Frontbann as a legal alternative to the then-outlawed SA. At Landsberg prison in April 1924, Röhm had also been given authority by Hitler to rebuild the SA in any way he saw fit.

Many of these stormtroopers believed in the socialist promise of National Socialism and expected the Nazi regime to take more radical economic action, such as breaking up the vast landed estates of the aristocracy once they obtained national power.

After Hitler and the Nazis obtained national power, the SA became increasingly eager for power itself. By the end of 1933, the SA numbered over three million men and many saw themselves as a replacement for the "antiquated" Reichswehr. Röhm's ideal was to absorb the army (then limited by law to no more than 100,000 men) into the SA, which would be a new "people's army". This deeply offended and alarmed the army, and threatened Hitler's goal of co-opting the Reichswehr. The SA's increasing power and ambitions also posed a threat to the other Nazi leaders.

SS and Gestapo

Originally an adjunct to the SA, the Schutzstaffel (SS) was placed under the control of Heinrich Himmler in part to restrict the power of the SA and their leaders. The younger SS had evolved to be more than a bodyguard unit for Hitler and showed itself better suited to carry out Hitler's policies, including those of a criminal nature.

Over time the SS became answerable only to Hitler, a development typical of the organizational structure of the entire Nazi regime, where legal norms were replaced by actions undertaken under the Führerprinzip* (leader principle), where Hitler's will was considered to be above the law.[64]

Hermann Göring—the number two man in the Nazi Party—was named Interior Minister of Prussia. This gave Göring command of the largest police force in Germany. Soon afterward, Göring detached the political and intelligence sections from the police and filled their ranks with Nazis. On 26 April 1933, Göring merged the two units as the Geheime Staatspolizei, which was abbreviated by a post office clerk and became known as the "Gestapo".

The first commander of the Gestapo was Rudolf Diels. Concerned that Diels was not ruthless enough to effectively counteract the power of the Sturmabteilung (SA), Göring handed over control of the Gestapo to Himmler on 20 April 1934.

Blomberg and von Reichenau began to conspire with Hermann Göring and Heinrich Himmler against Röhm and the SA. Himmler asked Reinhard Heydrich to assemble a dossier on Röhm. Heydrich manufactured evidence that suggested that Röhm had been paid 12 million marks by French agents to overthrow Hitler.

Hitler was also concerned that Röhm and the SA had the power to remove him as leader. Göring and Himmler played on this fear by constantly feeding him with new information on Röhm's proposed coup. A masterstroke was to claim that Gregor Strasser, whom Hitler hated, was part of the planned conspiracy against him. With this news Hitler ordered all the SA leaders to attend a meeting in the Hanselbauer Hotel in Bad Wiessee.

On 30 June 1934, Hitler, accompanied by SS units, arrived at Bad Wiessee, where he personally placed Röhm and other high-ranking SA leaders under arrest (see Night of the Long Knives*). The homosexuality of Röhm and other SA leaders was made public to add "shock value", even though the sexuality of Röhm and other named SA leaders had been known by Hitler and other Nazi leaders for years.

Arriving back at party headquarters in Munich, Hitler addressed the assembled crowd. Consumed with rage, Hitler denounced "the worst treachery in world history."

Highly recommended: War and Peace by Tolstoy



In Internet terminology, a troll is someone who comes into an established community, such as an online discussion forum, and posts inflammatory, rude, repetitive, or offensive messages, engages in top-post flooding, or impersonates others, all with the deliberate intent to annoy or antagonize the existing members or disrupt the flow of discussion. A troll's main goal is to arouse anger and frustration or otherwise shock and offend the message board's other participants, and the troll will write whatever it takes to achieve this end.

One popular trolling strategy is the practice of Winning by Losing. While the victim is trying to put forward solid and convincing facts to prove his position, the troll's only goal is to infuriate its prey. The troll takes (what it knows to be) a badly flawed, wholly illogical argument, and then vigorously defends it while mocking and insulting its prey. The troll looks like a complete fool, but this is all part of the plan. The victim becomes noticeably angry while repeatedly trying to explain the flaws of the troll's argument. Provoking this anger was the troll's one and only goal from the very beginning.

Experienced participants in online forums know that the most effective way to discourage a troll is usually to ignore him or her, because responding encourages a true troll to continue disruptive posts — hence the often-seen warning "Please do not feed the troll".



See also

External links


  1. Wikipedia:Generalization
  2. Wikipedia:Cartesian product
  3. Wikipedia:Tangent bundle
  4. Wikipedia:Lie group
  5. Wikipedia:Sesquilinear form
  6. Wikipedia:Tensor
  7. Wikipedia:Tensor (intrinsic definition)
  8. Template:Harvtxt
  9. Template:Harvtxt
  10. Wikipedia:Special unitary group
  11. Lawson, H. Blaine; Michelsohn, Marie-Louise (1989). Spin Geometry. Princeton University Press. ISBN 978-0-691-08542-5  page 14
  12. Friedrich, Thomas (2000), Dirac Operators in Riemannian Geometry, American Mathematical Society, ISBN 978-0-8218-2055-1  page 15
  13. "Pauli matrices". Planetmath website. 28 March 2008. Retrieved 28 May 2013. 
  14. Wikipedia:Spinor#Three_dimensions
  15. Cite error: Invalid <ref> tag; no text was provided for refs named Flanders
  16. W. K. Clifford, "Preliminary sketch of bi-quaternions," Proc. London Math. Soc. Vol. 4 (1873) pp. 381-395
  17. W. K. Clifford, Mathematical Papers, (ed. R. Tucker), London: Macmillan, 1882.
  18. The Minkowski inner product is not an inner product*, since it is not positive-definite*, i.e. the quadratic form* η(v, v) need not be positive for nonzero v. The positive-definite condition has been replaced by the weaker condition of non-degeneracy. The bilinear form is said to be indefinite.
  19. The matrices in this basis, provided below, are the similarity transforms of the Dirac basis matrices of the previous paragraph, U^\dagger \gamma_D^\mu U, where U = \frac{1}{\sqrt{2}}\left(1 - \gamma^5 \gamma^0\right) = \frac{1}{\sqrt{2}}\begin{pmatrix} I & I \\ -I & I \end{pmatrix}.
  20. Wikipedia:Rotor (mathematics)
  21. Wikipedia:Spinor#Three_dimensions
  22. Wikipedia:Spinor
  23. Template:Harvnb Exercise 1.5
  24. Cartan, Élie (1981) [1938], The Theory of Spinors, New York: Dover Publications, ISBN 978-0-486-64070-9, MR 631850
  25. Roger Penrose (2005). The road to reality: a complete guide to the laws of our universe. Knopf. pp. 203–206. 
  26. E. Meinrenken (2013), "The spin representation", Clifford Algebras and Lie Theory, Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge / A Series of Modern Surveys in Mathematics, 58, Springer-Verlag, doi:10.1007/978-3-642-36216-3_3
  27. S.-H. Dong (2011), "Chapter 2, Special Orthogonal Group SO(N)", Wave Equations in Higher Dimensions, Springer, pp. 13–38 
  28. Oersted Medal Lecture David Hestenes "Reforming the Mathematical Language of Physics" (Am. J. Phys. 71 (2), February 2003, pp. 104–121) Online: p26
  29. Andrew Marx, Shortcut Algebra I: A Quick and Easy Way to Increase Your Algebra I Knowledge and Test Scores, Kaplan Publishing, 2007, ISBN 9781419552885, 288 pages, page 51
  30. Wikipedia:Multiplicity (mathematics)
  31. Wikipedia:Partial fraction decomposition
  32. Wikipedia:Basic hypergeometric series
  33. Wikipedia:q-analog
  34. e^x = y = dy/dx
    dx = dy/y = (1/y) dy
    ∫ (1/y) dy = ∫ dx = x = ln(y)
  35. Wikipedia:Product rule
  36. Wikipedia:Monotonic function
  37. Wikipedia:Generalized Fourier series
  38. Wikipedia:Spherical harmonics
  39. Wikipedia:Inverse Laplace transform
  41. Wikipedia:Convolution theorem
  42. Wikipedia:RLC circuit
  43. Cite error: Invalid <ref> tag; no text was provided for refs named eFunda
  44. Cite error: Invalid <ref> tag; no text was provided for refs named edwards
  45. Cite error: Invalid <ref> tag; no text was provided for refs named cohen
  46. Wikipedia:Total derivative
  47. Wikipedia:Residue (complex analysis)
  48. Wikipedia:Potential theory
  49. Wikipedia:Harmonic conjugate
  50. Wikipedia:Calculus of variations
  51. Wikipedia:Cover (topology)
  52. Joshi p. 323
  53. Wikipedia:Permutation
  54. Wikipedia:Derangement
  55. Wikipedia:Rencontres numbers
  56. Wikipedia:Central limit theorem
  57. Bland, J.M.; Altman, D.G. (1996). "Statistics notes: measurement error". BMJ 312 (7047): 1654. doi:10.1136/bmj.312.7047.1654. PMC 2351401. PMID 8664723.
  58. Wikipedia:Standard deviation
  59. Wikipedia:Hypergeometric distribution
  60. Wikipedia:Tit for tat
  61. Wikipedia:Atmospheric river
  62. A Psychological Analysis of Adolph Hitler: His Life and Legend by Walter C. Langer. Office of Strategic Services (OSS), Washington, D.C. With the collaboration of Prof. Henry A. Murray, Harvard Psychological Clinic; Dr. Ernst Kris, New School for Social Research; Dr. Bertram D. Lewin, New York Psychoanalytic Institute. p. 219 (Nizkor project)
  63. Dr. Langer's work was published after the war as The Mind of Adolf Hitler, the wartime report having remained classified for over twenty years.
  64. Wikipedia:Schutzstaffel