Why is matrix arithmetic associative?

I’m trying to get my head around homotopy type theory at the moment. One aspect of this involves paying far closer attention to how you prove a theorem rather than merely noting the end result and forgetting how you got there.

Here’s a fairly trivial example of what I mean, involving matrices over a ring R. How do we know that addition and multiplication of matrices are associative?

For addition, it’s straightforward – matrix addition is associative because addition in R is associative:

((A+B)+C)ij = (A+B)ij+Cij = (Aij+Bij)+Cij
= Aij+(Bij+Cij) = Aij+(B+C)ij = (A+(B+C))ij
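To make that dependency concrete, here’s a minimal Python sketch (the function names and the mod-5 example ring are mine, purely for illustration): matrix addition is defined entrywise, so any associativity check unfolds into associativity checks on entries in R.

```python
# Minimal sketch: entrywise matrix addition over an arbitrary "ring" whose
# addition is passed in as a function. Associativity of the matrix version
# reduces, entry by entry, to associativity of `add` in R.

def mat_add(A, B, add):
    return [[add(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]

# Example ring: integers mod 5 (a stand-in for R).
add_mod5 = lambda x, y: (x + y) % 5

A = [[1, 2], [3, 4]]
B = [[4, 3], [2, 1]]
C = [[2, 2], [2, 2]]

lhs = mat_add(mat_add(A, B, add_mod5), C, add_mod5)   # (A+B)+C
rhs = mat_add(A, mat_add(B, C, add_mod5), add_mod5)   # A+(B+C)
print(lhs == rhs)  # True -- inherited from associativity of addition mod 5
```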

But for matrix multiplication to be associative we need a lot more than mere associativity of multiplication in R. We also need addition in R to be commutative, and multiplication to distribute over addition:

(A.(B.C))ij = Σk Aik.(B.C)kj = Σk Aik.(Σl Bkl.Clj)
= Σk Σl Aik.(Bkl.Clj) = Σk Σl (Aik.Bkl).Clj = Σl Σk (Aik.Bkl).Clj
= Σl (Σk Aik.Bkl).Clj = Σl (A.B)il.Clj = ((A.B).C)ij

Note that we need distributivity to “bubble” the Σs to the front of the expression and back again, and we need commutativity of addition to swap Σk and Σl round in the central step – that swap reorders the summands, which is only legitimate because the order of summands doesn’t matter.
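You can watch this happen symbolically. Here’s a sketch using sympy (assuming you have it installed): the entries are declared as noncommutative symbols, so multiplication order in R is preserved, but sympy’s addition is built-in commutative and associative – exactly what licenses the Σ-swap – and expansion does the distributing.

```python
import sympy as sp

n = 2
# Entries are noncommutative symbols: multiplication in R need not commute.
# Addition in sympy is always commutative/associative, which is precisely
# what the swap of the two sums in the proof above relies on.
A = sp.Matrix(n, n, lambda i, j: sp.Symbol(f'a{i}{j}', commutative=False))
B = sp.Matrix(n, n, lambda i, j: sp.Symbol(f'b{i}{j}', commutative=False))
C = sp.Matrix(n, n, lambda i, j: sp.Symbol(f'c{i}{j}', commutative=False))

lhs = ((A * B) * C).expand()   # expand = "bubble the sums out" via distributivity
rhs = (A * (B * C)).expand()
print(lhs == rhs)  # True, without ever assuming x.y = y.x in R
```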

Moreover, we need distributivity to prove that the identity matrix is in fact a multiplicative unit, since the proof relies on 0.r = 0 in R, and you need distributivity to prove that: 0.r = (0+0).r = 0.r + 0.r, then cancel a 0.r from both sides – which quietly uses additive inverses too.
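Again as a sketch (same assumed sympy setup as above): multiplying by the identity matrix only collapses back to A because 1.r = r and 0.r = 0 get applied entry by entry – that’s the ring reasoning hiding inside “I is a unit”.

```python
import sympy as sp

n = 2
A = sp.Matrix(n, n, lambda i, j: sp.Symbol(f'a{i}{j}', commutative=False))
Id = sp.eye(n)  # identity matrix: 1s on the diagonal, 0s elsewhere

# Each entry of Id*A is a sum like 1*a0j + 0*a1j; collapsing it to a0j
# quietly uses 1.r = r and 0.r = 0 in R.
print((Id * A).expand() == A)  # True
print((A * Id).expand() == A)  # True
```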

“So what?” you might cry. Well I’m not entirely sure, but I certainly am bewildered that in all my years fiddling with matrices I’ve only just noticed all this. It suggests that the extra discipline type theory imposes on mathematics at the very least throws up all sorts of extra subtleties that would otherwise be missed.

TL;DR version: you need almost all the ringyness of a ring in order for matrix multiplication to be associative.

Tagged: #maths