Matrices in Machine Learning Part 02


Transpose of a Matrix

Definition. (Transpose of a Matrix). Let [katex]A=\left[a_{i j}\right][/katex] be an [katex]m \times n[/katex] matrix over a field F. The transpose of A, denoted by [katex]A^T[/katex], is the [katex]n \times m[/katex] matrix obtained by interchanging the rows and columns of A. Thus [katex]A^T=\left[b_{i j}\right][/katex], where [katex]b_{i j}=a_{j i}[/katex]. In other words, the (i, j)th element of [katex]A^T[/katex] is the (j, i)th element of A.


Thus the first row of [katex]A^T[/katex] is the first column of A, the second row of [katex]A^T[/katex] is the second column of A, and so on.

Theorem. If the matrices A and B are conformable for the sum A+B and the product AB, then (i) [katex](A \pm B)^T=A^T \pm B^T[/katex] (ii) [katex]\left(A^T\right)^T=A[/katex] (iii) [katex](k A)^T=k A^T[/katex], where [katex]k \in F[/katex] (iv) [katex](A B)^T=B^T A^T[/katex]

Proof. (i) Let [katex]A=\left[a_{i j}\right][/katex] and [katex]B=\left[b_{i j}\right][/katex] be [katex]m \times n[/katex] matrices. Then A+B is an [katex]m \times n[/katex] matrix and [katex](A+B)^{T}[/katex] is an [katex]n \times m[/katex] matrix. Now [katex]A^{T}[/katex] is an [katex]n \times m[/katex] matrix and [katex]B^{T}[/katex] also is an [katex]n \times m[/katex] matrix. Therefore, [katex]A^{T}+B^{T}[/katex] is an [katex]n \times m[/katex] matrix.

Thus each of [katex](A+B)^{T}[/katex] and [katex]A^{T}+B^{T}[/katex] is an [katex]n \times m[/katex] matrix. Now

\begin{aligned}(i, j) \text { th element of }(A+B)^{T} &=(j, i) \text { th element of } A+B \\
&=a_{ji}+b_{ji}. \end{aligned}
\begin{aligned}(i, j) \text { th element of } A^{T}+B^{T} &=(i, j) \text { th element of } A^{T}+(i, j) \text { th element of } B^{T}\\&=(j, i) \text { th element of } A+(j, i) \text { th element of } B\\
&=a_{ji}+b_{ji}.\end{aligned}

Thus, for all i and j,

the (i, j)th element of [katex](A+B)^{T}[/katex] equals the (i, j)th element of [katex]A^{T}+B^{T}[/katex].

Hence

(A+B)^{T}=A^{T}+B^{T} .

Similarly

(A-B)^{T}=(A+(-B))^{T}=A^{T}+(-B)^{T}=A^{T}-B^{T} .

We can similarly prove that

\left(A_{1}+A_{2}+\cdots+A_{n}\right)^{T}=A_{1}^{T}+A_{2}^{T}+\cdots+A_{n}^{T} .

(ii) The (i, j)th element of [katex]\left(A^{T}\right)^{T}[/katex] is the (j, i)th element of [katex]A^{T}[/katex], which is the (i, j)th element of A.

Thus

\left(A^{T}\right)^{T}=A .

(iii)

\begin{aligned}
(i, j) \text { th element of }(k A)^{T}&=(j, i) \text { th element of } k A \\
&=k a_{j i} \\
&=(i, j) \text { th element of } k A^{T} .
\end{aligned}

Hence [katex](k A)^{T}=k A^{T}[/katex].

(iv) Let [katex]A=\left[a_{i j}\right][/katex] be an [katex]m \times n[/katex] matrix and [katex]B=\left[b_{i j}\right][/katex] be an [katex]n \times p[/katex] matrix.

Then AB is of order [katex]m \times p[/katex] and so [katex](A B)^{T}[/katex] is of order [katex]p \times m[/katex].

Now [katex]B^{T}[/katex] is of order [katex]p \times n[/katex] and [katex]A^{T}[/katex] is of order [katex]n \times m[/katex]. Hence [katex]B^{T} A^{T}[/katex] is of order [katex]p \times m[/katex]. So

\begin{aligned}
(i, j) \text { th element of }(A B)^{T} &=(j, i) \text { th element of } A B \\
&=\sum_{\lambda=1}^{n} a_{j \lambda} b_{\lambda i} \\
&=\sum_{\lambda=1}^{n}\left((i, \lambda) \text { th element of } B^{T}\right)\left((\lambda, j) \text { th element of } A^{T}\right) \\
&=(i, j) \text { th element of } B^{T} A^{T} .
\end{aligned}

Hence

(A B)^{T}=B^{T} A^{T}.

This result can be generalized for the product of a finite number of matrices [katex]A_{1}, A_{2}, \cdots, A_{n}[/katex]. Thus

\left(A_{1} A_{2} \cdots A_{n}\right)^{T}=A_{n}^{T} \cdots A_{2}^{T} A_{1}^{T} .
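The transpose identities above are easy to check numerically. A minimal sketch with NumPy (the particular matrices are arbitrary illustrative values, not from the text):

```python
import numpy as np

# Arbitrary matrices: A and C are 2x3 (conformable for the sum),
# B is 3x2 (so the product AB is defined).
A = np.array([[1, 2, 3],
              [4, 5, 6]])
C = np.array([[7, 8, 9],
              [1, 0, 2]])
B = np.array([[1, 0],
              [2, 1],
              [0, 3]])
k = 5

# (i)   (A + C)^T = A^T + C^T
assert np.array_equal((A + C).T, A.T + C.T)

# (ii)  (A^T)^T = A
assert np.array_equal(A.T.T, A)

# (iii) (kA)^T = k A^T
assert np.array_equal((k * A).T, k * A.T)

# (iv)  (AB)^T = B^T A^T  -- note the reversed order of the factors
assert np.array_equal((A @ B).T, B.T @ A.T)
```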

Definition. Let A be a square matrix of order n. We define

A^{n}=A \cdot A \cdot A \cdots A \text { ( } n \text { factors). }

It can be proved by induction that, for all [katex]m, n \in N[/katex]

\begin{aligned}
A^{m} \cdot A^{n} &=A^{m+n}, \\
\left(A^{m}\right)^{n} &=A^{m n} .
\end{aligned}

We also define [katex]A^{0}=I_{n}[/katex].

Also, if A and B are conformable for multiplication then [katex](A B)^{n} \neq A^{n} B^{n}[/katex] in general. However, [katex](A B)^{n}=A^{n} B^{n}[/katex] holds if AB=BA.
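A small numerical demonstration of this point, with illustrative matrices chosen here (not from the text): a non-commuting pair breaks [katex](AB)^n = A^n B^n[/katex], while a commuting pair satisfies it.

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[1, 0],
              [1, 1]])

# A and B do not commute, and (AB)^2 differs from A^2 B^2.
assert not np.array_equal(A @ B, B @ A)
assert not np.array_equal(np.linalg.matrix_power(A @ B, 2),
                          np.linalg.matrix_power(A, 2) @ np.linalg.matrix_power(B, 2))

# A polynomial in A always commutes with A, and then equality holds.
C = A @ A + 3 * A
assert np.array_equal(A @ C, C @ A)
assert np.array_equal(np.linalg.matrix_power(A @ C, 2),
                      np.linalg.matrix_power(A, 2) @ np.linalg.matrix_power(C, 2))
```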

Periodic matrix

A square matrix A for which [katex]A^{k+1}=A[/katex], ( k being a positive integer), is called a periodic matrix.

If k is the least positive integer for which [katex]A^{k+1}=A[/katex], then A is said to be of period k.

Idempotent matrix

If k=1, so that [katex]A^{2}=A[/katex], then A is called an idempotent matrix.

Example. Consider

\begin{aligned}
& A=\left[\begin{array}{rrr}2 & -2 & -4 \\-1 & 3 & 4 \\1 & -2 & -3\end{array}\right] \\
& A^{2}=\left[\begin{array}{rrr}2 & -2 & -4 \\-1 & 3 & 4 \\1 & -2 & -3\end{array}\right]\left[\begin{array}{rrr}2 & -2 & -4 \\-1 & 3 & 4 \\1 & -2 & -3\end{array}\right] \\
& =\left[\begin{array}{rrr}2 & -2 & -4 \\-1 & 3 & 4 \\1 & -2 & -3\end{array}\right]=A \text {. }
\end{aligned}

Hence A is idempotent.
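The matrix multiplication above can be confirmed in one line with NumPy:

```python
import numpy as np

# The idempotent matrix from the example above.
A = np.array([[ 2, -2, -4],
              [-1,  3,  4],
              [ 1, -2, -3]])

assert np.array_equal(A @ A, A)  # A^2 = A, so A is idempotent
```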

Nilpotent Matrix

Definition. A square matrix A for which [katex]A^{p}=0[/katex], (p being a positive integer), is called nilpotent. If p is the least positive integer for which [katex]A^{p}=0[/katex], then A is said to be nilpotent of nilpotency index p.

Example. Let

\begin{aligned}
& A=\left[\begin{array}{rrr}1 & 1 & 3 \\5 & 2 & 6 \\-2 & -1 & -3\end{array}\right] \text { Then } \\
& A^{2}=\left[\begin{array}{rrr}1 & 1 & 3 \\5 & 2 & 6 \\-2 & -1 & -3\end{array}\right]\left[\begin{array}{rrr}1 & 1 & 3 \\5 & 2 & 6 \\-2 & -1 & -3\end{array}\right] \\
& =\left[\begin{array}{rrr}0 & 0 & 0 \\3 & 3 & 9 \\-1 & -1 & -3\end{array}\right] \text {. } \\
& A^{3}=A^{2} A=\left[\begin{array}{rrr}0 & 0 & 0 \\3 & 3 & 9 \\-1 & -1 & -3\end{array}\right]\left[\begin{array}{rrr}1 & 1 & 3 \\5 & 2 & 6 \\-2 & -1 & -3\end{array}\right] \\
& =\left[\begin{array}{lll}0 & 0 & 0 \\0 & 0 & 0 \\0 & 0 & 0\end{array}\right]=0 \text {. }
\end{aligned}

Hence A is nilpotent of nilpotency index 3.
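A quick check of both powers computed above:

```python
import numpy as np

# The nilpotent matrix from the example above.
A = np.array([[ 1,  1,  3],
              [ 5,  2,  6],
              [-2, -1, -3]])

A2 = A @ A
A3 = A2 @ A
assert not np.array_equal(A2, np.zeros((3, 3)))  # A^2 != 0, so the index exceeds 2
assert np.array_equal(A3, np.zeros((3, 3)))      # A^3 = 0: nilpotent of index 3
```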

Involutory matrix

Definition. A square matrix A such that [katex]A^{2}=I[/katex] is called an involutory matrix.

Example. Let

\begin{aligned}
A &=\left[\begin{array}{rrr}
1 & -1 & 0 \\
0 & -1 & 0 \\
0 & 0 & 1
\end{array}\right] \text {. Then } \\
A^{2} &=\left[\begin{array}{rrr}
1 & -1 & 0 \\
0 & -1 & 0 \\
0 & 0 & 1
\end{array}\right]\left[\begin{array}{rrr}
1 & -1 & 0 \\
0 & -1 & 0 \\
0 & 0 & 1
\end{array}\right]=\left[\begin{array}{lll}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{array}\right]=I
\end{aligned}

so that A is involutory.
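Numerically, [katex]A^{2}=I[/katex] means an involutory matrix is its own inverse:

```python
import numpy as np

# The involutory matrix from the example above.
A = np.array([[1, -1, 0],
              [0, -1, 0],
              [0,  0, 1]])

assert np.array_equal(A @ A, np.eye(3))  # A^2 = I, so A is its own inverse
```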

Symmetric matrix

Definition. A square matrix A for which [katex]A^{T}=A[/katex] is called a symmetric matrix.

Skew Symmetric matrix

A square matrix A for which [katex]A^{T}=-A[/katex] is called a skew symmetric matrix.

Note that the product of two symmetric matrices may not be symmetric. That is,

if [katex]A^{T}=A, B^{T}=B[/katex], then [katex](A B)^{T}=B^{T} A^{T}=B A \neq A B[/katex] in general.

Example. For

A=\left[\begin{array}{rrr}
1 & 2 & 3 \\
2 & 4 & -3 \\
3 & -3 & 6
\end{array}\right] \text {, we have } A^{T}=A \text { Hence } A \text { is symmetric. }

Also for the matrix

A=\left[\begin{array}{rrr}
0 & -2 & 3 \\
2 & 0 & 4 \\
-3 & -4 & 0
\end{array}\right], A^{T}=-A \text {. Hence } A \text { is skew symmetric. }
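Both examples, plus the warning that a product of symmetric matrices need not be symmetric, can be checked with NumPy (the extra matrix T used for the product is an arbitrary symmetric permutation matrix, not from the text):

```python
import numpy as np

# Symmetric example from the text.
S = np.array([[1,  2,  3],
              [2,  4, -3],
              [3, -3,  6]])
assert np.array_equal(S.T, S)

# Skew-symmetric example from the text.
K = np.array([[ 0, -2, 3],
              [ 2,  0, 4],
              [-3, -4, 0]])
assert np.array_equal(K.T, -K)

# The product of two symmetric matrices need not be symmetric.
T = np.array([[0, 1, 0],   # symmetric (a permutation matrix)
              [1, 0, 0],
              [0, 0, 1]])
assert not np.array_equal((S @ T).T, S @ T)
```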

Conjugate matrix

Definition. Let [katex]A=\left[a_{i j}\right][/katex] be a matrix with complex entries. Then the matrix [katex]\left[\bar{a}_{i j}\right][/katex], obtained from A by replacing each [katex]a_{i j}[/katex] by its complex conjugate, is called the conjugate of A and is denoted by [katex]\bar{A}[/katex] (read "A conjugate"). [katex](\bar{A})^{T}[/katex] is called the Hermitian transpose of A.

Hermitian

A square matrix A such that [katex](\bar{A})^{T}=A[/katex] is called Hermitian (after the French mathematician Charles Hermite, 1822-1901). [katex](\bar{A})^{T}[/katex] is also denoted by [katex]A^{H}[/katex]. Thus A is Hermitian if [katex]A^{H}=A[/katex].

A square matrix [katex]A=\left[a_{i j}\right][/katex] over C for which [katex](\bar{A})^{T}=-A[/katex] is called skew-Hermitian.

As for the product of two symmetric matrices, the product of two Hermitian matrices A and B need not be Hermitian. That is,

(A B)^{H}=B^{H} A^{H}=B A \neq A B \text { in general. }

However, whenever A and B are conformable for the product [katex]B^{T} A B[/katex], we have

\left(B^{T} A B\right)^{T}=B^{T} A^{T}\left(B^{T}\right)^{T}=B^{T} A^{T} B .

So, for symmetric (Hermitian) A, [katex]B^{T} A B[/katex] (respectively [katex]B^{H} A B[/katex]) is symmetric (Hermitian).

Example. Let

\begin{aligned}
A &=\left[\begin{array}{ccc}
1 & 1-i & 2 \\
1+i & 3 & i \\
2 & -i & 0
\end{array}\right] . \text { Then } \bar{A}=\left[\begin{array}{ccc}
1 & 1+i & 2 \\
1-i & 3 & -i \\
2 & i & 0
\end{array}\right] \\
\text { and }(\bar{A})^{T}=& {\left[\begin{array}{ccc}
1 & 1-i & 2 \\
1+i & 3 & i \\
2 & -i & 0
\end{array}\right]=A \text {. Thus } A \text { is Hermitian. } }
\end{aligned}

Example. Let

A=\left[\begin{array}{ccc}
i & 1-i & 2 \\
-1-i & 3 i & i \\
-2 & i & 0
\end{array}\right] \text {. It is easy to verify that }(\bar{A})^{T}=-A \text {. }

Hence A is skew Hermitian.
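In NumPy the Hermitian transpose is `A.conj().T`, which makes both examples straightforward to verify:

```python
import numpy as np

# Hermitian example from the text.
A = np.array([[1,      1 - 1j, 2 ],
              [1 + 1j, 3,      1j],
              [2,      -1j,    0 ]])
assert np.array_equal(A.conj().T, A)   # A^H = A: Hermitian

# Skew-Hermitian example from the text.
B = np.array([[ 1j,     1 - 1j, 2 ],
              [-1 - 1j, 3j,     1j],
              [-2,      1j,     0 ]])
assert np.array_equal(B.conj().T, -B)  # B^H = -B: skew-Hermitian
```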

Example. Show that for any square matrix [katex]A, A+A^{T}[/katex] is symmetric and [katex]A-A^{T}[/katex] is skew symmetric.

Solution. Here we use the results [katex](A \pm B)^{T}=A^{T} \pm B^{T}[/katex] and [katex]\left(A^{T}\right)^{T}=A[/katex]. Thus

\left(A+A^{T}\right)^{T}=A^{T}+\left(A^{T}\right)^{T}=A^{T}+A=A+A^{T} .

Hence [katex]A+A^{T}[/katex] is symmetric.

Again, [katex]\left(A-A^{T}\right)^{T}=A^{T}-\left(A^{T}\right)^{T}=A^{T}-A=-\left(A-A^{T}\right)[/katex].

Hence [katex]A-A^{T}[/katex] is skew symmetric.
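A consequence worth noting: every square matrix splits as [katex]A=\frac{1}{2}(A+A^{T})+\frac{1}{2}(A-A^{T})[/katex], the sum of a symmetric and a skew-symmetric part. A quick numerical check with an arbitrary matrix:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 10.]])

sym  = (A + A.T) / 2   # symmetric part, by the result just proved
skew = (A - A.T) / 2   # skew-symmetric part

assert np.array_equal(sym.T, sym)
assert np.array_equal(skew.T, -skew)
assert np.array_equal(sym + skew, A)
```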

We have the following relations between two matrices A and B and their Hermitian transposes.

Theorem. Let A and B be square matrices of the same order. Then

(i) [katex]\left(A^{H}\right)^{H}=A[/katex],

(ii) [katex](A \pm B)^{H}=A^{H} \pm B^{H}[/katex]

(iii) [katex](c A)^{H}=(\bar{c}) A^{H}[/katex],

(iv) [katex](A B)^{H}=B^{H} A^{H}[/katex].

Proof. Left to the reader.
