August 13 2019
Well, today we are going to have a look at some of the concepts of Mathematics that are extensively used to simplify Machine learning problems. One of them is vectors. So let's see how vector algebra is applied in ML problems.
Elements from a vector space V are called vectors (abstract algebra)
or
A vector (denoted by $\overrightarrow{x}$) is a quantity having both magnitude and direction.
Above are the most common definitions of a vector, which you may have read in books or internet sources like Wikipedia. In Machine learning, vectors are specifically called feature vectors. Later in this post you will get to know why we call them feature vectors in particular, but before that let's check vector representation, revise some basic algebraic operations on vectors, and see the real meaning of these operations when we encounter a real dataset.
In mathematics vectors can be represented as
$\overrightarrow{v} = <a, b>$
or
$\overrightarrow{v} = (a, b)$
where $a$ and $b$ are scalar quantities representing different components of the vector quantity $\overrightarrow{v}$.
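A minimal sketch of the two most common representations in Python follows; the component values $2$, $4$ and $5$ are illustrative stand-ins for $a$, $b$ and $c$:

```python
import numpy as np

# a vector with three components a, b and c (here 2, 4 and 5)
v = [2, 4, 5]            # vector as a plain Python list
x = np.array([2, 4, 5])  # vector as a numpy array

print(type(v))  # <class 'list'>
print(type(x))  # <class 'numpy.ndarray'>
```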
The above code snippet shows some of the ways to represent vectors in computer programming using Python. There are two vectors, namely $\overrightarrow{v}$ and $\overrightarrow{x}$, both having three components $a$, $b$ and $c$.
Now both of these are actually the same in a theoretical sense; the only difference between the vector variables $v$ and $x$ is their object type, because the latter is created using the numpy library. numpy being a powerful utility library, we use it for most operations (multidimensional arrays, generating random numbers, etc.) due to its execution speed and shorthand syntax. From now on we will use numpy for all the operations in the programming examples in Python.
A note about vector spaces: for instance, the vectors $x$ and $v$ that we just created have their components ($a$, $b$ and $c$) as scalar quantities, so we can say that
$v, x \in \mathbb{R}^3$ (a vector space)
where $\mathbb{R}$ states that the components of vectors $v$ and $x$ are real number values, and the superscript $3$ denotes that vectors $v$ and $x$ have 3 components each. The above expression can also be generalized for any vector $\overrightarrow{v} \in \mathbb{R}^n$. This is one definition of a vector space. Here is a theoretical explanation of vector spaces if you want to know more about them.
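As a quick sanity check, a tiny sketch reusing the numpy array from above: numpy exposes the number of components, the $n$ in $\mathbb{R}^n$, through the array's shape attribute.

```python
import numpy as np

x = np.array([2, 4, 5])
print(x.shape)     # (3,) -> x has 3 components, so x ∈ ℝ³
print(x.shape[0])  # 3, the n in ℝⁿ
```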
In fig.1 two vectors $v, x \in \mathbb{R}^2$ are visualized in the Cartesian plane, having components $(a', b')$ and $(a'', b'')$ respectively; their directions are shown with arrows, and the axes of the Cartesian plane represent the components. It is easier to represent a vector with 2 components in real space than a vector with 3 components.
A vector can be thought of as a line originating from the origin $(0,0)$ and growing in some specific direction. It is important to note that the length of any vector is the same as the magnitude of that vector; we will discuss magnitude in the Vector subtraction section.
The Cartesian system makes it quite easy for us to visualize vectors, but remember that this is just an analogy for mapping 2d vectors to the Cartesian system. Also, most of the time in real Machine learning problems we have vectors with higher dimensions, so use this analogy just to remember that this is how you started visualizing vectors with lower dimensions!
*If you want to play around with vectors having 3 components, use this tool: 3d Vector Plotter.
Adding a scalar to a vector is defined as adding the scalar to each component of the vector.
Consider a vector $\overrightarrow{v} = <2,3,4>$.
Adding the scalar $s = 3$ to $\overrightarrow{v}$ will result in:
$\overrightarrow{v} + s = <2+s, 3+s, 4+s> = <5,6,7>$.
Adding a scalar to a vector results in another vector quantity.
Unlike addition with a scalar, addition of a vector to a vector has one condition: two vectors can be added iff both have the same number of components, i.e. they reside in the same vector space $\mathbb{R}^n$. In vector-to-vector addition we just add the corresponding components of the two (or more) vectors and get another vector quantity as the result.
Consider two vectors $\overrightarrow{v}$ and $\overrightarrow{x}$ where $\overrightarrow{v} = <2,4,5>$ and $\overrightarrow{x} = <3,5,1>$, so
$\overrightarrow{c} = \overrightarrow{v} + \overrightarrow{x} = <2+3, 4+5, 5+1>$
$\overrightarrow{c} = <5,9,6>$
Talking about the graphical representation of vector addition, fig.2 gives an idea about this. Vectors $\overrightarrow{v}$ and $\overrightarrow{x}$ are added to each other and the result is another vector $\overrightarrow{v}+\overrightarrow{x}$. Note how the resultant vector is represented: it starts from the origin and goes in a specific direction. Some of you may be familiar with another representation of vector addition, more formally known as the Triangle Law of vector addition, which you may have seen during your physics or linear algebra class in high school. The figures fig.3(a) and fig.3(b) below represent the Triangle Law of vector addition.
In fig.3(a) the tail of vector $\overrightarrow{x}$ is aligned to the head of $\overrightarrow{v}$, and it can be noted that the resultant vector $\overrightarrow{v}+\overrightarrow{x}$ changes neither in magnitude nor in direction.
Unlike fig.3(a), fig.3(b) represents the addition in the opposite order, aligning the tail of $\overrightarrow{v}$ to the head of $\overrightarrow{x}$, but the resultant vector $\overrightarrow{x}+\overrightarrow{v}$ is the same as before.
From this we can conclude that vector addition is commutative, i.e.
$\overrightarrow{v}+\overrightarrow{x}=\overrightarrow{x}+\overrightarrow{v}$
Along with being commutative, vector addition is also associative, i.e.
$(\overrightarrow{v}+\overrightarrow{x})+\overrightarrow{c}=\overrightarrow{v}+(\overrightarrow{x}+\overrightarrow{c})$
As stated above, vector addition is commutative, so in the adjacent figure fig.3(c), if the addition $\overrightarrow{v} + \overrightarrow{x}$ is mirrored we obtain a hypothetical parallelogram $ABCD$. The result of $\overrightarrow{v} + \overrightarrow{x}$ from both additions coincides with the diagonal $AC$ of the parallelogram; this is known as the Parallelogram Law of vector addition.
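Here is a minimal sketch of these operations with numpy; the two-component vector $\overrightarrow{c}$ at the end is a hypothetical example used only to trigger the shape error:

```python
import numpy as np

v = np.array([2, 4, 5])
x = np.array([3, 5, 1])

# scalar addition: the scalar is added to every component
s = 3
print(v + s)                         # [5 7 8]

# vector addition: corresponding components are added
print(v + x)                         # [5 9 6]

# vector addition is commutative
print(np.array_equal(v + x, x + v))  # True

# c lives in R^2 while x lives in R^3, so adding them is invalid
c = np.array([1, 2])
try:
    print(c + x)
except ValueError as e:
    print(e)  # operands could not be broadcast together with shapes (2,) (3,)
```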
The above code snippet shows scalar and vector addition in Python. Although most of the code is self-explanatory, there are some important points to notice. Recall that addition between vectors from different vector spaces is not valid in vector algebra, so when we try to add $\overrightarrow{c}$ to $\overrightarrow{x}$, numpy throws a ValueError exception indicating the different shapes of $\overrightarrow{c}$ and $\overrightarrow{x}$.
As stated above, the two terms magnitude and length of a vector can be used interchangeably; now let's see how to compute it. The magnitude of a vector is represented by $|\overrightarrow{v}|$.
Consider $\overrightarrow{v} = <3,4>$
$|\overrightarrow{v}| = \sqrt{(3-0)^2+(4-0)^2} \\ |\overrightarrow{v}| = 5$
The magnitude of any vector is defined as the length of the vector from the origin. The origin being $(0,0)$, $|\overrightarrow{v}|$ can be directly written as
$|\overrightarrow{v}| = \sqrt{(3)^2+(4)^2} \\ |\overrightarrow{v}| = 5$
The magnitude of any vector with $n$ components can be generalized as
$|\overrightarrow{v}| = \sqrt{\sum\limits_{i=1}^{n} v_{i}^2}$
At this point you may argue that this is the same way we calculate the distance of a point from the origin in the $xy$ plane in coordinate geometry, which is formally known as the Euclidean distance; in vector algebra it is known as a norm. There are different types of norms available in vector algebra to calculate the length of a vector. The one we just discussed is known as the L-2 norm of $\overrightarrow{v}$ and it is written as $||\overrightarrow{v}||_2$
An important application of norms is to find the distance between two vectors; it is known as the L-2 norm of $(\overrightarrow{v}-\overrightarrow{x})$ and is calculated as
$||\overrightarrow{v}-\overrightarrow{x}||_2=\sqrt{\sum\limits_{i=1}^{n} (v_{i}-x_{i})^2}$
For example
$\overrightarrow{v} = <6,9>$ and $\overrightarrow{x} = <2,6>$
$||\overrightarrow{v}-\overrightarrow{x}||_2 = \sqrt{(v_1-x_1)^2+(v_2-x_2)^2} \\
||\overrightarrow{v}-\overrightarrow{x}||_2 = \sqrt{(6-2)^2+(9-6)^2} \\
= \sqrt{16+9} = 5$
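Here is a minimal numpy sketch of both computations, reusing the example values from above as illustrative vectors $\overrightarrow{a}$ and $\overrightarrow{b}$:

```python
import numpy as np

a = np.array([6, 9])
b = np.array([2, 6])

# L-2 norm (magnitude) of a
print(np.linalg.norm(a))       # 10.816...

# L-2 norm of (a - b), i.e. the Euclidean distance between a and b
c = a - b
print(np.linalg.norm(c))       # 5.0
print(np.linalg.norm(a - b))   # same result, passing a - b directly
```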
The above code snippet shows $||\overrightarrow{a}||_2$ and $||\overrightarrow{a}-\overrightarrow{b}||_2$. Notice that numpy.linalg.norm takes only one argument of array type, i.e. a vector, so we can either pass another vector $\overrightarrow{c}=\overrightarrow{a}-\overrightarrow{b}$ or pass $\overrightarrow{a}-\overrightarrow{b}$ as it is.
In vector algebra, multiplication is more formally known as "vector product". There are mainly two types
of vector product:
$(i)$ Dot Product.
$(ii)$ Cross Product.
In a Machine Learning context we rarely use the cross product, so we will discuss only the dot product here.
The dot product between two vectors is usually represented as $\overrightarrow{v}\cdot\overrightarrow{x}$
$\overrightarrow{v}\cdot\overrightarrow{x}=\sum\limits_{i=1}^{n}v_{i}x_{i}$
Let $\overrightarrow{v}=<3,6>$ and $\overrightarrow{x}=<2,8>$
$\overrightarrow{v}\cdot\overrightarrow{x}= 3{\times}2+6{\times}8=54$
Sometimes we have vectors in row or column form:
$\begin{bmatrix} v_1 & v_2 & v_3\end{bmatrix}$ or $\begin{bmatrix} v_1 \\ v_2 \\
v_3\end{bmatrix}$
In that case $\overrightarrow{v}\cdot\overrightarrow{x} = vx^T$, where $T$ stands for the transpose of vector $x$.
Let $\overrightarrow{v}$ and $\overrightarrow{x}$ be the row vectors $[1 \; 3 \; 4]$ and $[3 \; 1 \; 7]$ respectively.
So, $\overrightarrow{v}\cdot\overrightarrow{x} =
\begin{bmatrix} 1 & 3 & 4\end{bmatrix}\begin{bmatrix} 3 \\ 1 \\ 7\end{bmatrix}
= 1\times3 + 3\times1 + 4\times7 = 34$
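The snippet below is a minimal sketch of these cases; it assumes the vectors are stored as explicit 2-D row and column arrays (shapes $(1, n)$ and $(n, 1)$) rather than flat 1-D arrays, which is what makes the np.inner / np.dot distinction visible:

```python
import numpy as np

# v and x as explicit row vectors, shape (1, 3)
v = np.array([[1, 3, 4]])
x = np.array([[3, 1, 7]])

# np.inner sums products over the last axis, so two row
# vectors of the same shape multiply fine
print(np.inner(v, x))         # [[34]]

# np.dot follows matrix-multiplication rules: (1, 3) x (1, 3)
# shapes are not aligned, so this raises a ValueError
# print(np.dot(v, x))         # ValueError: shapes not aligned

# with x reshaped into a column vector, shape (3, 1),
# np.dot works but np.inner raises a ValueError instead
x_col = x.T
print(np.dot(v, x_col))       # [[34]]
# print(np.inner(v, x_col))   # ValueError
```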
The above code snippet shows the dot product of vectors of different shapes. When both vectors are row vectors they can be multiplied with the $np.inner$ method, but when we try $np.dot$ on them it throws an error regarding mismatched dimensions. On the other hand, when we have the vectors in row and column form we cannot do $np.inner$ on them, as this gives an error; we need to do $np.dot$ to calculate the dot product between them. You can read the documentation of numpy's vector products to understand how they work.