This kind of stuff just infuriates me. I can't see any good reason for having so many competing notational standards for the same thing:
The notation used here is commonly used in statistics and engineering, while the tensor index notation is preferred in physics.
Two competing notational conventions split the field of matrix calculus into two separate groups. The two groups can be distinguished by whether they write the derivative of a scalar with respect to a vector as a column vector or a row vector. Both of these conventions are possible even when the common assumption is made that vectors should be treated as column vectors when combined with matrices (rather than row vectors). A single convention can be somewhat standard throughout a single field that commonly uses matrix calculus (e.g. econometrics, statistics, estimation theory and machine learning). However, even within a given field different authors can be found using competing conventions. Authors of both groups often write as though their specific convention is standard.
Seriously? So if I want to read a paper that uses Matrix Calculus, it's not enough to just understand Matrix Calculus in general. No, first I have to decipher which of a legion of possible notations the author used, and then keep that state in mind when thinking about that paper in relation to another, which might use yet another notation.
I understand that ultimately nobody is in a position to mandate the adoption of a universal standard, but part of me wishes there were (this is, of course, not a problem that is limited to Matrix Calculus).
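To make the split described in the quoted passage concrete, here is a rough sketch in plain Python. It is only an illustration under my own assumptions: the helper names (`jacobian_numerator_layout`, `transpose`, `shape`) and the example functions are made up for this post, and the derivative is approximated with finite differences rather than computed symbolically. The two conventions are often called "numerator layout" (the derivative of a scalar with respect to a vector comes out as a row) and "denominator layout" (it comes out as a column). As far as the numbers go, one is just the transpose of the other.

```python
def jacobian_numerator_layout(f, x, eps=1e-6):
    """Forward-difference Jacobian with entry [i][j] = d y_i / d x_j.

    Rows follow the outputs and columns follow the inputs, which is the
    shape usually called "numerator layout".
    """
    y0 = f(x)
    m, n = len(y0), len(x)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_step = list(x)
        x_step[j] += eps          # perturb one input at a time
        y1 = f(x_step)
        for i in range(m):
            jac[i][j] = (y1[i] - y0[i]) / eps
    return jac


def transpose(mat):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*mat)]


def shape(mat):
    """Return (rows, columns) of a list-of-lists matrix."""
    return (len(mat), len(mat[0]))


# Hypothetical example: a scalar function of a 3-vector, f(x) = a . x,
# returned as a length-1 list so the same helper applies.
a = [2.0, -1.0, 0.5]


def scalar_fn(x):
    return [sum(a_k * x_k for a_k, x_k in zip(a, x))]


# Hypothetical example: a linear map y = A x from 3 inputs to 2 outputs.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]


def linear_map(x):
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


x = [1.0, 2.0, 3.0]

grad_row = jacobian_numerator_layout(scalar_fn, x)   # 1 x 3: row vector
grad_col = transpose(grad_row)                       # 3 x 1: column vector

jac_num = jacobian_numerator_layout(linear_map, x)   # 2 x 3, approximately A
jac_den = transpose(jac_num)                         # 3 x 2, approximately A^T

print("scalar derivative, numerator layout:  ", shape(grad_row))
print("scalar derivative, denominator layout:", shape(grad_col))
print("Jacobian of A x, numerator layout:    ", shape(jac_num))
print("Jacobian of A x, denominator layout:  ", shape(jac_den))
```

So the same derivative of `A x` shows up as `A` in one paper and as its transpose in another, and a product like the chain rule has to be written in the opposite order depending on which one the author picked. That is exactly the bookkeeping I'm complaining about.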