Why the cross product?

This is of course a ridiculously simple thing that many encounter in their first year of university mathematics, and perhaps even in high school. But, like many things I encountered in those days, I kind of just accepted its existence and moved on. The calculations were easy enough, so I never really had to think about it.

These days I am having a lot of fun examining the undergraduate maths I haven’t touched in years, ever since being forced to do a lot of linear algebra for a research project. From a more mature perspective (I guess), there is so much I missed out on in my undergraduate studies by not questioning the things I was learning. I have a simple question I ask myself now when encountering a mathematical concept: “What the hell is this and why should it be this way?” It might seem a bit belligerent, but I think a bit of aggression is not entirely unwarranted when encountering mathematics. To me, a mathematical concept or entity is not satisfying until I can say where it came from and what it is good for, and I am slightly ashamed of the years I spent just coasting on the definitions of others, never questioning why things should be this way. Perhaps we too easily accept the work that has gone before, in all its abstract glory. Sure, things were done that way for a reason, but you would be so much better off if you knew that reason…

Hence, the cross or vector product in Euclidean space. For now I am going to accept the existence of the dot/scalar product as is, although if you would like a really nice explanation of it, I would suggest 3Blue1Brown’s excellent video. (The entire series of videos is worth watching, and should make you appreciate the wonderful world of linear transformations.) The video on the cross product did not entirely satisfy me though, which is why I’ve written this post.

Firstly, why should there be such a thing as a vector product? In other words, how could such a concept arise naturally? I mean, the definition is certainly not obvious. For the moment, let us leave aside the size of the product and simply ask: when will I need to get a vector perpendicular to two other (linearly independent) vectors? As with so much of mathematics, a physical intuition is the best answer. Suppose we have some kind of flow (for now, we can suppose we are dealing with a liquid) through some surface. The flow is not completely uniform, and the surface is not necessarily flat. Supposing we have a function describing the flow at each point in space (in other words, a vector field), can we determine how much is flowing through the surface?

As with so much of calculus, we are going to approximate the surface linearly (assuming the surface is nice enough to do this, of course; let us assume that it is given by a smooth enough function). Therefore, we consider an approximation to the surface given by tiny parallelograms. By making these small enough, we can estimate the flow through the surface arbitrarily well. Thus, in order to find the total flow, we need some way of determining how much is flowing out of each of our parallelograms. Any part of the flow that is parallel to the plane containing our parallelogram is not relevant, because it is not flowing out. Therefore, we only need to consider the component of the flow perpendicular to our parallelogram, in the direction of what we would call a normal vector.

Suppose we know how to determine our tiny parallelograms (for instance, by using partial derivatives of the surface at certain points), and that we know each parallelogram is defined by two vectors, say, $\vec{v} = (v_1 ,v_2 ,v_3 )$ and $\vec{w} = (w_1 ,w_2 ,w_3 )$ (supposing that these vectors are not parallel). To get a normal vector, we have to get a vector which is perpendicular to both of these. Fortunately, we can assume that we already know about the dot product, and can use it to test for perpendicularity. If there is a vector $\vec{z} = (z_1 ,z_2 ,z_3 )$ perpendicular to both of the above, it will have to satisfy the equations

$\begin{array}{rcl} z_1 v_1 +z_2 v_2 +z_3 v_3 &=& 0\\ z_1 w_1 +z_2 w_2 +z_3 w_3& =&0. \end{array}$

We now have two equations with three unknowns. But if we decide not to care about the length of the vector, we can set one of its components equal to $1$ (provided that component is not forced to be zero) and solve for the other two. Setting $z_1 = 1$ , we get the equations

$\begin{array}{rcl} v_1 +z_2 v_2 +z_3 v_3 &=& 0\\ w_1 +z_2 w_2 +z_3 w_3& =&0. \end{array}$

Now it becomes a simple case of solving two equations in two unknowns, and we get

$\begin{array}{rcl} z_1 &=& 1\\ z_2 & = & \frac{w_1 v_3 -v_1 w_3}{v_2 w_3 -w_2 v_3} \\ z_3 & = & -\frac{w_1 v_2 - v_1 w_2}{w_3 v_2 - v_3 w_2}.\end{array}$

So far it’s not very pretty, but we can make it look nicer by multiplying every component by $v_2 w_3 -w_2 v_3$ (any nonzero multiple of a normal vector is still a normal vector):

$\begin{array}{rcl} z_1 &=& v_2 w_3 - w_2 v_3 \\ z_2 &=& - (v_1 w_3 - w_1 v_3) \\ z_3 &=& v_1 w_2 - w_1 v_2. \end{array}$

This is now starting to look very familiar. In order to get the unit normal vector that we require, we simply need to take

$\vec{n} = \frac{\vec{z}}{|\vec{z}|}.$
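The derivation above is easy to check numerically. What follows is only a minimal sketch in Python: the function names (`cross`, `dot`, `normal_with_first_component_one`, `unit`) and the example vectors are my own choices for illustration, not anything standard. It solves the two perpendicularity equations with $z_1 = 1$ , rescales by $v_2 w_3 - w_2 v_3$ , and confirms that the result is perpendicular to both inputs.

```python
from fractions import Fraction
import math


def dot(a, b):
    """Dot product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))


def cross(v, w):
    """The components z_1, z_2, z_3 in the 'nicer' rescaled form."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - w2 * v3,
            -(v1 * w3 - w1 * v3),
            v1 * w2 - w1 * v2)


def normal_with_first_component_one(v, w):
    """Solve the two perpendicularity equations with z_1 fixed to 1.

    Only works when v_2 w_3 - w_2 v_3 is nonzero, since we divide by it.
    Exact rational arithmetic avoids rounding noise in the comparison.
    """
    v1, v2, v3 = (Fraction(c) for c in v)
    w1, w2, w3 = (Fraction(c) for c in w)
    d = v2 * w3 - w2 * v3
    if d == 0:
        raise ValueError("z_1 cannot be set to 1 for these vectors")
    z2 = (w1 * v3 - v1 * w3) / d
    z3 = -(w1 * v2 - v1 * w2) / d
    return (Fraction(1), z2, z3)


def unit(z):
    """Scale z to length 1 (z must be nonzero)."""
    length = math.sqrt(dot(z, z))
    return tuple(c / length for c in z)


# Example vectors, chosen arbitrarily for illustration.
v, w = (1, 2, 3), (4, 5, 6)

z_one = normal_with_first_component_one(v, w)
scale = v[1] * w[2] - w[1] * v[2]
z = cross(v, w)

# Rescaling the z_1 = 1 solution reproduces the nicer formula.
assert tuple(scale * c for c in z_one) == z
# Both perpendicularity equations hold.
assert dot(z, v) == 0 and dot(z, w) == 0

n = unit(z)
print(z, n)
```

Using `Fraction` is just a convenience so that the rescaled solution can be compared with `==`; with floats one would compare with a tolerance instead.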

(Exercise: Show that the cross product corresponds to a unique linear map from $\mathbb{R}^3 \otimes \mathbb{R}^3$ to $\mathbb{R}^3$ .)

We’re not quite done with the original, unnormalised form of $\vec{z}$ , though. Suppose once again that we are trying to calculate the flow through some surface by approximating the surface with small parallelograms. To find the flow through each parallelogram, we have to find the component of the flow perpendicular to the parallelogram and multiply it by the area. As if by magic though, we already have the area of the parallelogram! Specifically, it is given by the magnitude of the normal vector $\vec{z}$ that we created earlier.
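Since $|\vec{z}|$ is the area and $\vec{z}/|\vec{z}|$ is the unit normal, the flow through one small parallelogram is approximately the dot product of the flow vector with $\vec{z}$ itself. Here is a rough sketch of that recipe in Python. Everything specific in it is an assumption made for illustration: the surface (the upper half of the unit sphere), the flow field $F(x,y,z) = (x,y,z)$ , the function names, and the use of finite differences to get the parallelogram edges.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - w2 * v3,
            -(v1 * w3 - w1 * v3),
            v1 * w2 - w1 * v2)


def hemisphere(u, t):
    """Hypothetical example surface: upper unit hemisphere.

    u is the angle from the z-axis, t is the angle around it.
    """
    return (math.sin(u) * math.cos(t),
            math.sin(u) * math.sin(t),
            math.cos(u))


def radial_flow(p):
    """Hypothetical example flow field F(x, y, z) = (x, y, z)."""
    return p


def flux(surface, flow, u_range, t_range, n=200):
    """Sum flow . z over an n-by-n grid of small parallelograms."""
    (u0, u1), (t0, t1) = u_range, t_range
    du, dt = (u1 - u0) / n, (t1 - t0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, t = u0 + i * du, t0 + j * dt
            p = surface(u, t)
            # Two edges of the small parallelogram at p (finite differences).
            edge_u = tuple(b - a for a, b in zip(p, surface(u + du, t)))
            edge_t = tuple(b - a for a, b in zip(p, surface(u, t + dt)))
            # |z| is the area and z/|z| the normal, so flow . z is
            # (perpendicular component of the flow) * (area).
            z = cross(edge_u, edge_t)
            total += dot(flow(p), z)
    return total


approx = flux(hemisphere, radial_flow, (0.0, math.pi / 2), (0.0, 2 * math.pi))
print(approx, 2 * math.pi)
```

For this particular surface and field the flow is everywhere perpendicular to the surface with magnitude $1$ , so the exact answer is just the area of the hemisphere, $2\pi$ . The sum should approach that value as `n` grows. Note that the order of the two edges in `cross` decides which side of the surface counts as "out".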

This is something that we are usually taught, and just assume. But why is that magnitude the area of the parallelogram? It is easy enough to get the surface area of a parallelogram in two dimensions; supposing that the parallelogram is determined by the vectors $(a,b)$ and $(c,d)$ , the area is given by $|ad-bc|$ (show this!) . This looks familiar, at least, and conforms to the determinant method we usually use to get the area.
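One quick way to see that the two-dimensional formula and the cross product are talking about the same thing is to place the plane inside three-dimensional space by appending a zero coordinate. The sketch below does this in Python; the function names and the sample vectors are made up for illustration.

```python
import math


def area_2d(p, q):
    """Area of the parallelogram spanned by p = (a, b) and q = (c, d)."""
    a, b = p
    c, d = q
    return abs(a * d - b * c)


def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - w2 * v3,
            -(v1 * w3 - w1 * v3),
            v1 * w2 - w1 * v2)


def area_via_embedding(p, q):
    """Embed p and q in the plane z = 0 and take the cross product's length."""
    z = cross((p[0], p[1], 0), (q[0], q[1], 0))
    return math.sqrt(sum(c * c for c in z))


# Arbitrary example: base 3 along the x-axis, height 2.
p, q = (3, 0), (1, 2)
print(area_2d(p, q), area_via_embedding(p, q))
```

For vectors in the plane $z = 0$ the first two components of the cross product vanish, and the third is exactly $ad - bc$ .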

In three dimensions, the calculation is not very hard either, but let us forget about the cross product for a moment. To get the area of any parallelogram we, of course, only need to multiply the base times the height. In this case, we can take one of the vectors as the base, and find its length in the usual way. To find the height, we just need the dot product and Pythagoras’ theorem.

Specifically, suppose we have the vectors $\vec{x} = (a,b,c)$ and $\vec{y} = (d,e,f)$ , since I’m tired of using subscripts. Taking $\vec{y}$ as the base, we have the first term, $\sqrt{d^2 + e^2 +f^2}$ . To get the height, we first take the length of $\vec{x}$ projected onto $\vec{y}$ :

$\frac{\vec{x}\cdot \vec{y}}{|\vec{y}|} = \frac{ad+be+cf}{|\vec{y}|} = \frac{ad+be+cf}{\sqrt{d^2 + e^2 +f^2}} .$

Using Pythagoras to evaluate the height $h$ , we get

$h^2 = |\vec{x}|^2 - \frac{(ad+be+cf)^2}{d^2 + e^2 +f^2}.$

The square of the area is then given by

$\begin{array}{ll} & (d^2 +e^2 +f^2) \left( a^2 +b^2 +c^2 - \frac{(ad+be+cf)^2}{d^2+e^2+f^2 }\right) \\ &= (a^2 +b^2 +c^2)(d^2 +e^2 +f^2) - (ad+be+cf)^2\\ & = a^2 e^2 + a^2 f^2 +b^2d^2 + b^2 f^2 +c^2 d^2 +c^2 e^2 -2adbe -2adcf - 2becf \\ &= (bf-ec)^2 + (af-dc)^2 + (ae-bd)^2 \end{array}$
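The algebra above can be sanity-checked numerically. This is a small Python sketch, with function names of my own choosing, that computes the area once as base times height (using the projection and Pythagoras, as in the text) and once from the three squared terms in the last line of the calculation.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def area_base_height(x, y):
    """Base |y| times height h, with h found via Pythagoras."""
    base = math.sqrt(dot(y, y))
    projection = dot(x, y) / base          # length of x projected onto y
    # max(..., 0.0) guards against tiny negative values from rounding.
    height = math.sqrt(max(dot(x, x) - projection ** 2, 0.0))
    return base * height


def area_three_squares(x, y):
    """Square root of (bf-ec)^2 + (af-dc)^2 + (ae-bd)^2."""
    a, b, c = x
    d, e, f = y
    return math.sqrt((b * f - e * c) ** 2
                     + (a * f - d * c) ** 2
                     + (a * e - b * d) ** 2)


# Compare the two on a handful of random vectors (fixed seed for repeatability).
rng = random.Random(0)
for _ in range(5):
    x = tuple(rng.uniform(-5, 5) for _ in range(3))
    y = tuple(rng.uniform(-5, 5) for _ in range(3))
    print(area_base_height(x, y), area_three_squares(x, y))
```

For $\vec{x} = (1,2,3)$ and $\vec{y} = (4,5,6)$ both routes give $\sqrt{54}$ , which is the length of $(-3, 6, -3)$ .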

We can see from this that the usual determinant form of evaluating the area agrees with the more elementary version above. At least we know that the form of the cross product is justified. There is still, to my mind, one deeper issue that needs to be resolved. We now know, formally, that the area of the parallelogram formed by two independent vectors is given by the size of the cross product, but we can also notice that the determinant form of the cross product consists of evaluating the size of three two-dimensional parallelograms. Why should this be? Left to the reader…

Trying to be a mathematician

Mathematics, optimisation and such
