We know from the theorem on $n$th order linear homogeneous differential equations $\cal{L} y=0$ that the general solution is a linear combination of $n$ linearly independent solutions $$y=C_1 y_1 + C_2 y_2 +\dots + C_n y_n.$$ What does the term linearly independent mean, and how do we find out whether a set of particular solutions is linearly independent?
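For example, the second order equation $y''+y=0$ has the particular solutions $y_1=\cos x$ and $y_2=\sin x$, and its general solution is $$y=C_1\cos x + C_2\sin x,$$ provided we can verify that $\cos x$ and $\sin x$ really are linearly independent. That is exactly the question this section answers.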
Let's examine a close geometric analogy. Consider the set of three vectors in the plane $$\{\vec{v}_1=\hat{x}, \vec{v}_2=\hat{y}, \vec{v}_3=3\hat{x}-2\hat{y}\}.$$ Notice that the third vector is a linear combination of the first two: $$\vec{v}_3=3\vec{v}_1-2\vec{v}_2$$ or $$3\vec{v}_1-2\vec{v}_2-\vec{v}_3=0.$$ We say that these three vectors are linearly dependent (alternatively, NOT linearly independent). Geometrically, this is equivalent to the statement that these three vectors lie in a two-dimensional plane.
Why is linear independence important? If we want to expand another vector $\vec{v}_4$, it is sufficient to expand it in terms of the linearly independent vectors $\vec{v}_1$ and $\vec{v}_2$: $$\vec{v}_4=D_1 \vec{v}_1 + D_2 \vec{v}_2$$ and the coefficients $D_1$ and $D_2$ are unique. We say that $\vec{v}_1$ and $\vec{v}_2$ form a basis for the two-dimensional vector space. We do not need to include the vector $\vec{v}_3$ in the expansion, $$\vec{v}_4=D_1 \vec{v}_1 + D_2 \vec{v}_2 + D_3 \vec{v}_3$$ but if we did include it, the coefficients $D_1$, $D_2$, and $D_3$ would not be uniquely specified. Many combinations of $D$s would work.
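To make the non-uniqueness concrete, take for example $\vec{v}_4=5\hat{x}+\hat{y}$. In the basis $\{\vec{v}_1,\vec{v}_2\}$ the only possibility is $$\vec{v}_4=5\,\vec{v}_1+1\,\vec{v}_2,$$ but if we also allow $\vec{v}_3=3\hat{x}-2\hat{y}$, then both $$\vec{v}_4=5\,\vec{v}_1+1\,\vec{v}_2+0\,\vec{v}_3 \qquad\text{and}\qquad \vec{v}_4=2\,\vec{v}_1+3\,\vec{v}_2+1\,\vec{v}_3$$ work, along with infinitely many other choices of the $D$s.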
We will now extend this definition of linear independence of vectors that are arrows in space to linear independence of functions that are solutions of a linear ODE. There is deep mathematics underlying the analogy. The solutions of a linear ODE form a vector space. You can explore this analogy more deeply in this section of the book.
Definition: A set of $n$ functions $\{y_1,\dots ,y_n\}$ on an interval $I$ is linearly dependent if there exist constants $C_1, C_2, \dots , C_n$, not all zero, such that $$ C_1 y_1+C_2 y_2 +\dots + C_n y_n=0 $$ at every point of $I$. Otherwise the functions are linearly independent.
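For example, the set $\{\sin^2 x,\ \cos^2 x,\ 1\}$ is linearly dependent on any interval, because the (not all zero) constants $C_1=1$, $C_2=1$, $C_3=-1$ give $$1\cdot\sin^2 x + 1\cdot\cos^2 x - 1\cdot 1 = 0$$ for every $x$. In contrast, the set $\{x,\ x^2\}$ is linearly independent on any interval: if $C_1 x + C_2 x^2=0$ at every point of an interval, then $C_1=C_2=0$.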
It is cumbersome to use the definition above to find out whether a set of functions is linearly independent. If the functions are all solutions of the same linear ODE, then there is a much quicker method, using a mathematical object called the Wronskian.

Definition: If each of the $n$ functions $\{y_1,\dots ,y_n\}$ on an interval $I$ has at least $n-1$ derivatives, then the determinant $W(y_1,\dots ,y_n)$, defined below, is called the Wronskian of the set of functions. \begin{equation} W(y_1,\dots ,y_n)\doteq \begin{vmatrix} y_1&y_2&\dots&y_n\\ y_1^{\prime}&y_2^{\prime}&\dots&y_n^{\prime}\\ \vdots&\vdots&&\vdots\\ y_1^{(n-1)}&y_2^{(n-1)}&\dots&y_n^{(n-1)}\\ \end{vmatrix} \end{equation}
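To see how the computation goes, take the two functions $y_1=\cos x$ and $y_2=\sin x$ (both solutions of $y''+y=0$). Then $$W(\cos x,\sin x)= \begin{vmatrix} \cos x & \sin x\\ -\sin x & \cos x \end{vmatrix} =\cos^2 x+\sin^2 x=1.$$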
Theorem: If $\{y_1,\dots ,y_n\}$ are solutions of $\cal{L}(y)=0$ on $I$, then they are linearly independent $\Longleftrightarrow$ $W(y_1,\dots ,y_n)$ is not identically zero on $I$.
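For example, $y_1=e^{m_1 x}$ and $y_2=e^{m_2 x}$ (solutions of a constant coefficient second order equation whose characteristic roots are $m_1$ and $m_2$) have $$W(e^{m_1 x},e^{m_2 x})= \begin{vmatrix} e^{m_1 x} & e^{m_2 x}\\ m_1 e^{m_1 x} & m_2 e^{m_2 x} \end{vmatrix} =(m_2-m_1)\,e^{(m_1+m_2)x},$$ which is not identically zero as long as $m_1\neq m_2$. By the theorem, distinct characteristic roots give linearly independent solutions. Likewise, the Wronskian $W(\cos x,\sin x)=1$ computed above tells us that $\cos x$ and $\sin x$ are linearly independent solutions of $y''+y=0$.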
Note: This theorem is only valid if the functions $\{y_1,\dots ,y_n\}$ are all solutions of the same $n$th order linear homogeneous ODE.
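A standard counterexample shows why: on the interval $(-1,1)$, the functions $y_1=x^2$ and $y_2=x|x|$ have $y_2^{\prime}=2|x|$, so $$W(x^2,\,x|x|)=x^2\cdot 2|x|-x|x|\cdot 2x=0$$ at every point, yet the two functions are linearly independent (a combination $C_1 x^2 + C_2 x|x|$ that vanishes for all $x$ in $(-1,1)$ forces $C_1=C_2=0$, as you can check at $x=\pm\tfrac{1}{2}$). The resolution is that these two functions are not both solutions of one second order linear homogeneous ODE with continuous coefficients on $(-1,1)$, so the hypothesis of the theorem fails.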
Just as with vectors that are arrows in space, it is often convenient, but not necessary, to choose the linearly independent basis functions to be orthonormal, i.e. orthogonal and normalized. In the motivation section above, I chose to focus on $\vec{v}_1=\hat{x}$ and $\vec{v}_2=\hat{y}$ as the basis because that is the conventional orthonormal basis, but everything I said would have worked perfectly well if I had chosen $\vec{v}_1=\hat{x}$ and $\vec{v}_3=3\hat{x}-2\hat{y}$ as the basis instead. It just would have been a little harder for you to follow the algebra. In the same way, it will often simplify the algebra if we choose the linearly independent basis functions to be orthonormal, but we'll need to generalize the idea of the dot product to these functions. See this section of the book.