Linear algebra (Osnabrück 2024-2025)/Part II/Lecture 43

Polynomials in several variables and their zero sets

As an application of the diagonalizability of symmetric matrices (and of the principal axis transformation), we discuss how polynomial equations of low degree in several variables can be brought into a very simple form. For this, we briefly introduce polynomials in several variables.

Definition

For a set of variables ${}X_{1},\ldots ,X_{n}$ and an ${}n$ -tuple ${}(\nu _{1},\ldots ,\nu _{n})\in \mathbb {N} ^{n}$ , an expression of the form

{}X_{1}^{\nu _{1}}\cdots X_{n}^{\nu _{n}}

is called a monomial in the ${}X_{i}$ .

The degree of a monomial is the sum of the exponents, that is, ${}\nu _{1}+\nu _{2}+\cdots +\nu _{n}$ .

Definition

A polynomial ${}F$ in the variables ${}X_{1},\ldots ,X_{n}$ over a field ${}K$ is a finite linear combination of monomials

{}F=\sum _{\nu }c_{\nu }X^{\nu }\,,

where ${}c_{\nu }\in K$ and ${}X^{\nu }$ abbreviates the monomial ${}X_{1}^{\nu _{1}}\cdots X_{n}^{\nu _{n}}$ .

The degree of a polynomial is the maximum of the degrees of the monomials involved (meaning those monomials that occur with a coefficient different from ${}0$ ). A polynomial ${}F=F(X_{1},\ldots ,X_{n})$ in ${}n$ variables over ${}K$ defines, by substituting values for the variables, a function

K^{n}\longrightarrow K,(x_{1},\ldots ,x_{n})\longmapsto F(x_{1},\ldots ,x_{n}).

These are important functions in higher-dimensional analysis. The variable ${}X_{i}$ , interpreted in this way, is simply the ${}i$ -th projection, and the addition and multiplication of polynomials correspond to the addition and multiplication of functions, where the values in ${}K$ are added or multiplied.
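The passage from a polynomial to the function it defines can be sketched in a few lines of Python. The encoding of a polynomial as a dictionary from exponent tuples to coefficients, and the names `degree` and `evaluate`, are illustrative assumptions and not notation from the lecture.

```python
# A polynomial in n variables is stored as a dict mapping exponent tuples
# (nu_1, ..., nu_n) to coefficients c_nu. This encoding is an illustrative
# choice, not notation from the lecture.

def degree(poly):
    """Maximum total degree of the monomials with nonzero coefficient."""
    degrees = [sum(nu) for nu, c in poly.items() if c != 0]
    if not degrees:
        raise ValueError("the zero polynomial has no monomials")
    return max(degrees)

def evaluate(poly, point):
    """Substitute the point (x_1, ..., x_n) for the variables."""
    total = 0
    for nu, c in poly.items():
        term = c
        for x, exponent in zip(point, nu):
            term *= x ** exponent
        total += term
    return total

# F = 3X^2 - 4XY + 5Y^2 + 6X + 2Y - 7 in the variables X, Y
F = {(2, 0): 3, (1, 1): -4, (0, 2): 5, (1, 0): 6, (0, 1): 2, (0, 0): -7}
print(degree(F))            # 2
print(evaluate(F, (1, 1)))  # 3 - 4 + 5 + 6 + 2 - 7 = 5
```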

Definition

For a field ${}K$ and a set of variables ${}X_{1},\ldots ,X_{n}$ , the polynomial ring

K[X_{1},\ldots ,X_{n}]

consists of all polynomials ${}F(X_{1},\ldots ,X_{n})$ in these variables. This set is made into a commutative ring by componentwise addition and by the multiplication that extends the rule

{}X_{1}^{r_{1}}\cdots X_{n}^{r_{n}}\cdot X_{1}^{s_{1}}\cdots X_{n}^{s_{n}}:=X_{1}^{r_{1}+s_{1}}\cdots X_{n}^{r_{n}+s_{n}}\,

such that distributivity holds.
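As a minimal sketch of this multiplication rule, assume polynomials are stored as dictionaries from exponent tuples to coefficients (a hypothetical encoding chosen for illustration). The product is obtained by adding exponents monomial by monomial and collecting coefficients, which is exactly what distributivity forces.

```python
def multiply(f, g):
    """Product of two polynomials given as {exponent tuple: coefficient}."""
    product = {}
    for nu, a in f.items():
        for mu, b in g.items():
            # X^nu * X^mu = X^(nu + mu): exponents add componentwise
            exponent = tuple(r + s for r, s in zip(nu, mu))
            product[exponent] = product.get(exponent, 0) + a * b
    # drop monomials whose coefficients cancelled
    return {nu: c for nu, c in product.items() if c != 0}

# (Y - X) * (Y + X) = Y^2 - X^2 in the variables X, Y
y_minus_x = {(0, 1): 1, (1, 0): -1}
y_plus_x = {(0, 1): 1, (1, 0): 1}
print(multiply(y_minus_x, y_plus_x))
```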

Definition

Let ${}K$ be a field, and let ${}F\in K[X_{1},\ldots ,X_{n}]$ denote a polynomial in ${}n$ variables. The set

{\left\{P\in K^{n}\mid F(P)=0\right\}}

is called the vanishing set (or zero set) of ${}F$ .

Thus, the vanishing set of ${}F$ is just the fiber over ${}0$ of the function

F\colon K^{n}\longrightarrow K

given by ${}F$ . For ${}n=1$ , this is just a finite collection of points, the zeroes of ${}F$ (in case ${}F=0$ , it is all of ${}K$ ); however, for ${}n\geq 2$ , these vanishing sets are interesting and complicated geometric objects. The study of these objects is called algebraic geometry. In the case ${}n=2$ , we talk about algebraic curves.
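Membership in a vanishing set is just the test ${}F(P)=0$ . A small sketch, assuming rational coordinates so that the test is exact; the names `circle` and `in_vanishing_set` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction

def circle(x, y):
    """F = X^2 + Y^2 - 1, viewed as a function Q^2 -> Q."""
    return x * x + y * y - 1

def in_vanishing_set(F, point):
    """A point P lies in the vanishing set exactly when F(P) = 0."""
    return F(*point) == 0

on_circle = (Fraction(3, 5), Fraction(4, 5))    # 9/25 + 16/25 = 1
off_circle = (Fraction(1, 2), Fraction(1, 2))   # 1/4 + 1/4 = 1/2

print(in_vanishing_set(circle, on_circle))
print(in_vanishing_set(circle, off_circle))
```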

For arbitrary ${}n$ , a polynomial of degree ${}\leq 1$ has the form

{}F=a_{1}X_{1}+\cdots +a_{n}X_{n}+b\,,

and the corresponding vanishing set is just the solution set of the inhomogeneous linear equation

{}a_{1}x_{1}+\cdots +a_{n}x_{n}=-b\,,

that is, it is an affine-linear space.

Real quadrics

The polynomials of degree two and their zero-sets can be understood completely with methods from linear algebra.

Definition

A quadratic polynomial ${}F\in K[X_{1},\ldots ,X_{n}]$ over a field ${}K$ is a polynomial of degree ${}2$ ; that is, an expression of the form

{}F=\sum _{i\leq j}a_{ij}X_{i}X_{j}+\sum _{i=1}^{n}b_{i}X_{i}+c\,,

where ${}a_{ij},b_{i},c\in K$ .

Example

For a quadratic polynomial ${}aX^{2}+bX+c$ in one variable ${}X$ , with ${}a,b,c\in K$ and ${}a\neq 0$ , the zeroes can be found by completing the square. That is, we write (supposing that the characteristic of the field is not ${}2$ )

{}aX^{2}+bX+c=a{\left(X^{2}+{\frac {b}{a}}X+{\frac {c}{a}}\right)}=a{\left({\left(X+{\frac {b}{2a}}\right)}^{2}-{\frac {b^{2}}{4a^{2}}}+{\frac {c}{a}}\right)}\,.

This equals ${}0$ if and only if

{}X=\pm {\sqrt {{\frac {b^{2}}{4a^{2}}}-{\frac {c}{a}}}}-{\frac {b}{2a}}\,,

provided that the square root

{}{\sqrt {{\frac {b^{2}}{4a^{2}}}-{\frac {c}{a}}}}={\frac {1}{2a}}{\sqrt {b^{2}-4ac}}\,

exists in the field. Depending on this, there are no solutions, one solution, or two solutions.
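For ${}K=\mathbb {R}$ , the case distinction can be sketched as follows. The function name `real_roots` is an assumption made for this illustration, and floating-point arithmetic stands in for exact real numbers.

```python
import math

def real_roots(a, b, c):
    """Real zeroes of a*X^2 + b*X + c (a != 0), via completing the square."""
    if a == 0:
        raise ValueError("a must be nonzero")
    shift = b / (2 * a)                       # (X + b/(2a))^2 is the completed square
    radicand = b * b / (4 * a * a) - c / a    # has to be a square in the field
    if radicand < 0:
        return []                             # no square root in R
    if radicand == 0:
        return [-shift]
    root = math.sqrt(radicand)
    return [-root - shift, root - shift]

print(real_roots(1, 0, 1))    # X^2 + 1: no real zero
print(real_roots(1, -2, 1))   # (X - 1)^2: one zero
print(real_roots(1, 0, -1))   # X^2 - 1: two zeroes
```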

We develop a relation between quadratic polynomials and bilinear forms.

Definition

For a bilinear form ${}\left\langle -,-\right\rangle$ on a ${}K$ -vector space ${}V$ , the mapping

V\longrightarrow K,v\longmapsto \left\langle v,v\right\rangle

is called the corresponding quadratic form.

Given a basis ${}v_{1},\ldots ,v_{n}$ , a bilinear form is described by its Gram matrix

{}G={\left(g_{ij}\right)}_{ij}\,;

the corresponding quadratic form ${}V\rightarrow K$ is described by the quadratic polynomial

{}\left(X_{1},\,\ldots ,\,X_{n}\right)G{\begin{pmatrix}X_{1}\\\vdots \\X_{n}\end{pmatrix}}=\sum _{1\leq i,j\leq n}g_{ij}X_{i}X_{j}=\sum _{i}g_{ii}X_{i}^{2}+\sum _{i<j}{\left(g_{ij}+g_{ji}\right)}X_{i}X_{j}\,.

(Here ${}X_{i}$ is the ${}i$ -th coordinate function, that is, the ${}i$ -th element of the dual basis.) In the symmetric case, this is

\sum _{i}g_{ii}X_{i}^{2}+\sum _{i<j}2g_{ij}X_{i}X_{j}.

For every pure-quadratic polynomial in ${}n$ variables, we can form in this way a symmetric Gram matrix. The theory of real-symmetric bilinear forms makes it possible to get rid of the mixed terms by a suitable coordinate transformation (a base change).
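The construction of the symmetric Gram matrix from the coefficients of a pure-quadratic polynomial can be sketched as follows. The helper names and the 0-based index pairs are assumptions made for this illustration; exact rational arithmetic is used, and halving the mixed coefficients requires that ${}2$ is invertible in the field.

```python
from fractions import Fraction

def symmetric_gram_matrix(n, coefficients):
    """Symmetric matrix G with x^T G x = sum_{i<=j} a_ij x_i x_j.

    `coefficients` maps index pairs (i, j) with i <= j (0-based) to a_ij.
    """
    G = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), a in coefficients.items():
        if i == j:
            G[i][i] = Fraction(a)
        else:
            # a mixed term a_ij X_i X_j is split evenly between g_ij and g_ji
            G[i][j] = G[j][i] = Fraction(a) / 2
    return G

def quadratic_form(G, x):
    """Evaluate x^T G x."""
    n = len(G)
    return sum(G[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# 3X^2 - 4XY + 5Y^2
G = symmetric_gram_matrix(2, {(0, 0): 3, (0, 1): -4, (1, 1): 5})
print(G)
print(quadratic_form(G, (1, 2)))  # 3 - 8 + 20 = 15
```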

Example

We make a list of real quadratic polynomials in the two variables ${}X$ and ${}Y$ , together with their corresponding vanishing sets; we restrict to coefficients from ${}0,1,-1$ . If only one variable ${}X$ occurs, then we have essentially the following three possibilities.

${}X^{2}$ : the vanishing set is a "doubled line".

${}X^{2}-1$ : this means ${}X=\pm 1$ , the vanishing set consists of two parallel lines.

${}X^{2}+1$ : the vanishing set is empty.

In these cases (where the second variable ${}Y$ does not occur explicitly), the vanishing set is simply the product set of a zero-dimensional vanishing set (finitely many points) and of a line.

Now we consider polynomials where both variables occur.

${}Y^{2}-X$ : the vanishing set is a parabola.

${}Y^{2}-X^{2}$ : this means ${}(Y-X)(Y+X)=0$ , the vanishing set consists of two lines crossing each other.

${}Y^{2}+X^{2}$ : the only solution is the point ${}(0,0)$ , the vanishing set is just a single point.

${}Y^{2}-X^{2}-1$ : this means ${}(Y-X)(Y+X)=1$ , the vanishing set is a hyperbola.

${}Y^{2}+X^{2}-1$ : the vanishing set is the unit circle.

${}Y^{2}+X^{2}+1$ : again, the vanishing set is empty.

The polynomial ${}XY-1$ does not appear directly in this list. However, with the substitution ${}X=U+V$ and ${}Y=U-V$ , the equation ${}XY=1$ becomes

{}(U+V)(U-V)=U^{2}-V^{2}=1\,.

In this form it is in the list. The following theorem tells us that, up to scaling of the individual variables, the list is complete.

A *hyperbolic paraboloid*, also called a *saddle surface*.

An *ellipsoid*. Its surface is a quadric.

Example

We make a list of real quadratic polynomials in the three variables ${}X,Y$ and ${}Z$ , together with their corresponding vanishing sets, where we restrict the coefficients to ${}0,1,-1$ . Moreover, we consider only such polynomials where all variables occur and the vanishing set is not empty.

${}Y^{2}+X^{2}-Z$ : the vanishing set is a paraboloid.

${}Y^{2}-X^{2}-Z$ : the vanishing set is a saddle surface.

${}X^{2}+Y^{2}+Z^{2}$ : the only solution is the point ${}(0,0,0)$ , the vanishing set is just one point.

${}X^{2}+Y^{2}+Z^{2}-1$ : the vanishing set is a sphere, that is, the surface of a ball.

${}X^{2}+Y^{2}-Z^{2}$ : the vanishing set is the solution set of the equation ${}Z^{2}=X^{2}+Y^{2}$ . This is a (double) cone.

${}X^{2}+Y^{2}-Z^{2}-1$ : the vanishing set is a one-sheeted hyperboloid.

${}X^{2}+Y^{2}-Z^{2}+1$ : the vanishing set is a two-sheeted hyperboloid.

Theorem

Every real pure-quadratic polynomial

{}F=\sum _{i\leq j}a_{ij}X_{i}X_{j}\,

has, with respect to a suitable orthonormal basis (with respect to the standard inner product), the form

{}F=\sum _{1\leq i\leq n}r_{i}U_{i}^{2}\,.

The ${}r_{i}$ are the eigenvalues of the square symmetric matrix

{}M={\left(\alpha _{ij}\right)}_{1\leq i,j\leq n}\,,

where

{}\alpha _{ij}={\begin{cases}a_{ij}{\text{ for }}i=j\,,\\{\frac {a_{ij}}{2}}{\text{ for }}i<j\,,\\{\frac {a_{ji}}{2}}{\text{ for }}i>j\,.\end{cases}}\,

Proof

Using the matrix ${}M$ mentioned in the theorem we can write the polynomial in the form

\left(X_{1},\,\ldots ,\,X_{n}\right)M{\begin{pmatrix}X_{1}\\\vdots \\X_{n}\end{pmatrix}}.

By definition, the matrix ${}M$ is symmetric. Due to Theorem 42.12 , there exists an orthonormal basis ${}v_{1},\ldots ,v_{n}$ of ${}\mathbb {R} ^{n}$ , such that the new Gram matrix

{}D={B^{\text{tr}}}MB\,

is a diagonal matrix, where

{}B=M_{\mathfrak {e}}^{\mathfrak {v}}\,

denotes the base change matrix from the new basis to the standard basis. We express this briefly as

{}{\begin{pmatrix}v_{1}\\\vdots \\v_{n}\end{pmatrix}}={B^{\text{tr}}}{\begin{pmatrix}e_{1}\\\vdots \\e_{n}\end{pmatrix}}\,

in the sense of remark *****. Let ${}U_{1},\ldots ,U_{n}$ be the dual basis of the new orthonormal basis. Thus, the ${}U_{i}$ describe the new coordinate functions, and we consider them as new variables. According to Lemma 14.14 , we have the relation

{}{\begin{pmatrix}X_{1}\\\vdots \\X_{n}\end{pmatrix}}=B{\begin{pmatrix}U_{1}\\\vdots \\U_{n}\end{pmatrix}}\,.

Therefore,

{}{\begin{aligned}F&=\left(X_{1},\,\ldots ,\,X_{n}\right)M{\begin{pmatrix}X_{1}\\\vdots \\X_{n}\end{pmatrix}}\\&=\left(X_{1},\,\ldots ,\,X_{n}\right){{B^{-1}}^{\text{tr}}}DB^{-1}{\begin{pmatrix}X_{1}\\\vdots \\X_{n}\end{pmatrix}}\\&=\left(U_{1},\,\ldots ,\,U_{n}\right)D{\begin{pmatrix}U_{1}\\\vdots \\U_{n}\end{pmatrix}}.\end{aligned}}

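Calculating this yields ${}\sum _{i=1}^{n}r_{i}U_{i}^{2}$ , where the ${}r_{i}$ are the diagonal entries of ${}D$ . Since ${}B$ is the base change matrix between two orthonormal bases, it is orthogonal, so ${}{B^{\text{tr}}}=B^{-1}$ and ${}D=B^{-1}MB$ is similar to ${}M$ . Because of Lemma 21.10 , the eigenvalues of ${}M$ and of ${}D$ are the same (and they have the same multiplicities).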

\Box

Example

We consider the pure-quadratic polynomial

{}F=3X^{2}-4XY+5Y^{2}\,.

In order to apply Theorem 43.10 , we have to find the eigenvalues of the matrix

{}M={\begin{pmatrix}3&-2\\-2&5\end{pmatrix}}\,.

The characteristic polynomial is

{}(X-3)(X-5)-4=X^{2}-8X+11=(X-4)^{2}-5\,.

Therefore, the eigenvalues are

x_{1}={\sqrt {5}}+4{\text{ and }}x_{2}=-{\sqrt {5}}+4.

Therefore, in a suitable orthonormal basis, the polynomial has the form

{}F={\left({\sqrt {5}}+4\right)}U^{2}+{\left(-{\sqrt {5}}+4\right)}V^{2}\,.
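The diagonalization in this example can be checked numerically. The following sketch computes the eigenvalues of a symmetric ${}2\times 2$ matrix in closed form, builds an orthonormal eigenbasis, and evaluates the entries of the new Gram matrix; the function names are assumptions made for this illustration, and the results are floating-point approximations.

```python
import math

def diagonalize_symmetric_2x2(a, b, d):
    """Eigenvalues and an orthonormal eigenbasis of [[a, b], [b, d]], b != 0."""
    if b == 0:
        raise ValueError("matrix is already diagonal")
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    eigenvalues = (mean + radius, mean - radius)
    basis = []
    for lam in eigenvalues:
        # (b, lam - a) solves (a - lam) x + b y = 0, the first row of M - lam*I
        length = math.hypot(b, lam - a)
        basis.append((b / length, (lam - a) / length))
    return eigenvalues, basis

def gram_entry(M, v, w):
    """Entry v^T M w of the Gram matrix with respect to the new basis."""
    return sum(v[i] * M[i][j] * w[j] for i in range(2) for j in range(2))

M = [[3, -2], [-2, 5]]
(r1, r2), (v1, v2) = diagonalize_symmetric_2x2(3, -2, 5)
print(r1, r2)                  # approximately 4 + sqrt(5) and 4 - sqrt(5)
print(gram_entry(M, v1, v2))   # approximately 0: no mixed term remains
```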

Example

We want to bring the pure-quadratic form

{\frac {3}{2}}X^{2}+2Y^{2}+2XY-2YZ

into a standard form in the sense of Theorem 43.10 . The corresponding symmetric matrix is

{\begin{pmatrix}{\frac {3}{2}}&1&0\\1&2&-1\\0&-1&0\end{pmatrix}}.

We have to determine the eigenvalues of this matrix. The characteristic polynomial of the matrix is

{}{\begin{aligned}\det {\begin{pmatrix}X-{\frac {3}{2}}&-1&0\\-1&X-2&1\\0&1&X\end{pmatrix}}&={\left(X-{\frac {3}{2}}\right)}{\left(X^{2}-2X-1\right)}-X\\&=X^{3}-{\frac {7}{2}}X^{2}+X+{\frac {3}{2}}\\&=(X-1){\left(X^{2}-{\frac {5}{2}}X-{\frac {3}{2}}\right)}\\&=(X-1)(X-3){\left(X+{\frac {1}{2}}\right)};\end{aligned}}

hence, the eigenvalues are

1,3,-{\frac {1}{2}}.

In the new variables ${}U,V,W$ for an orthonormal basis consisting of eigenvectors to these eigenvalues, we have

{}F=U^{2}+3V^{2}-{\frac {1}{2}}W^{2}\,.
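The eigenvalues found above can be verified exactly with rational arithmetic. This sketch evaluates ${}\det(xI-M)$ at the three candidates; the helper name `characteristic_value` is an assumption made for this illustration.

```python
from fractions import Fraction

def characteristic_value(M, x):
    """det(x*I - M) for a 3x3 matrix M, expanded along the first row."""
    A = [[(x if i == j else 0) - M[i][j] for j in range(3)] for i in range(3)]
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

M = [[Fraction(3, 2), 1, 0],
     [1, 2, -1],
     [0, -1, 0]]

candidates = [Fraction(1), Fraction(3), Fraction(-1, 2)]
values = [characteristic_value(M, x) for x in candidates]
print(values)  # each value is 0 exactly when the candidate is an eigenvalue
```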

Theorem

Every real quadratic polynomial

{}F=\sum _{i\leq j}a_{ij}X_{i}X_{j}+\sum _{i=1}^{n}b_{i}X_{i}+c\,

has, with respect to a suitable orthonormal basis (with respect to the standard inner product, and allowing translations), the form ( ${}k\leq n$ and ${}r_{i}\neq 0$ )

{}F=\sum _{1\leq i\leq k}r_{i}U_{i}^{2}+s\,,

or the form ( ${}k\leq n-1$ and ${}r_{i}\neq 0$ )

{}F=\sum _{1\leq i\leq k}r_{i}U_{i}^{2}+sU_{k+1}\,.

Proof

We apply the transformation described in Theorem 43.10 to the pure-quadratic part ${}\sum _{i\leq j}a_{ij}X_{i}X_{j}$ . In the new variables ${}V_{j}$ (which are dual to the orthonormal basis), the polynomial now has the form

{}F=\sum _{1\leq i\leq k}e_{i}V_{i}^{2}+\sum _{j=1}^{n}f_{j}V_{j}+g\,,

with a certain ${}k$ between ${}1$ and ${}n$ , and ${}e_{i}\neq 0$ . The summands

e_{i}V_{i}^{2}+f_{i}V_{i}

can be brought, by completing the square and using new variables ${}U_{i}=V_{i}+h_{i}$ , to the form

e_{i}U_{i}^{2}+g_{i}.

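Besides the pure-quadratic term, either a constant or a linear polynomial in the remaining variables ${}V_{k+1},\ldots ,V_{n}$ remains. In the second case, we write this linear polynomial (including its constant term) as ${}sU_{k+1}$ , where ${}s\neq 0$ is chosen such that ${}U_{k+1}$ is a normalized coordinate function up to a translation.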

\Box
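The completing-the-square step used in the proof can be made explicit. In this sketch, the function `complete_square` (a hypothetical helper) returns the shift ${}h_{i}$ and the constant ${}g_{i}$ for a single summand ${}e_{i}V_{i}^{2}+f_{i}V_{i}$ , using exact rational arithmetic.

```python
from fractions import Fraction

def complete_square(e, f):
    """Return (h, g) with e*V^2 + f*V = e*(V + h)^2 + g, for e != 0."""
    e, f = Fraction(e), Fraction(f)
    if e == 0:
        raise ValueError("e must be nonzero")
    h = f / (2 * e)         # new variable U = V + h
    g = -f * f / (4 * e)    # constant that is split off
    return h, g

h, g = complete_square(3, 6)   # 3V^2 + 6V = 3(V + 1)^2 - 3
print(h, g)
```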

The representations occurring in this theorem are called the standard form of the quadratic form. In a standard form, we only have purely-quadratic terms and at most one variable in first degree. The theorem tells us that every quadratic form can be brought, using suitable orthonormal (Cartesian) coordinates, into such a standard form. Regarding the vanishing set, such a coordinate transformation means that we apply an affine-linear isometry.

Remark

A quadratic form in standard form

\sum _{1\leq i\leq k}r_{i}U_{i}^{2}+s\,\,{\text{  or }}\,\,\sum _{1\leq i\leq k}r_{i}U_{i}^{2}+sU_{k+1}

in the sense of Theorem 43.13 can be simplified further, if we allow distortions. In the new coordinates

{}Z_{i}={\sqrt {\vert {r_{i}}\vert }}U_{i}\,

or

{}U_{i}={\frac {1}{\sqrt {\vert {r_{i}}\vert }}}Z_{i}\,

for ${}i=1,\ldots ,k$ , the quadratic form has a representation of the form

\sum _{1\leq i\leq k}\pm Z_{i}^{2}+s\,\,{\text{  or }}\,\,\sum _{1\leq i\leq k}\pm Z_{i}^{2}+sZ_{k+1},

where the coefficients are ${}1$ or ${}-1$ . This is called the normalized standard form of the quadratic form. By swapping variables, we may achieve that the first variables have the coefficient ${}1$ , and the later variables have the coefficient ${}-1$ . In doing these transformations, the vanishing sets are distorted. For example, an ellipse might become a circle, or a parabola might be compressed. Since the vanishing set does not change when the form is multiplied by ${}-1$ , we may also assume that the number of variables with coefficient ${}1$ is at least the number of variables with coefficient ${}-1$ .
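The rescaling can be sketched as follows: for nonzero coefficients ${}r_{i}$ , the sign of ${}r_{i}$ is the coefficient of ${}Z_{i}^{2}$ , and ${}{\sqrt {\vert {r_{i}}\vert }}$ is the factor relating ${}Z_{i}$ to ${}U_{i}$ . The function name `normalize` is an assumption made for this illustration, and the scale factors are floating-point approximations.

```python
import math

def normalize(coefficients):
    """For nonzero r_i, return the signs and the scale factors sqrt(|r_i|)."""
    if any(r == 0 for r in coefficients):
        raise ValueError("all r_i must be nonzero")
    signs = [1 if r > 0 else -1 for r in coefficients]
    scales = [math.sqrt(abs(r)) for r in coefficients]
    return signs, scales

# r = (1, 3, -1/2), the eigenvalues from the three-variable example above
signs, scales = normalize([1, 3, -0.5])
print(signs)   # [1, 1, -1]
print(scales)
```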

Example

We consider the quadratic polynomial

{}F=3X^{2}-4XY+5Y^{2}+6X+2Y-7\,,

and we want to transform it according to Theorem 43.13 into a standard form. In Example 43.11 we have already studied the pure-quadratic part ${}3X^{2}-4XY+5Y^{2}$ with the help of the symmetric matrix

{}M={\begin{pmatrix}3&-2\\-2&5\end{pmatrix}}\,,

whose eigenvalues are

x_{1}={\sqrt {5}}+4{\text{ and }}x_{2}=-{\sqrt {5}}+4.

In order to transform ${}F$ itself to standard form, we need the eigenvectors, and we have to perform the change of variables explicitly. The eigenvectors are