1

I get very confused about the concept of an inner product. When an inner product is defined on a vector space $\mathbb{V}$, don't we define it as an operation between two vectors from $\mathbb{V}$ itself?

But very often I hear people [cannot give a reference, :-(] say it is an operation between a vector from $\mathbb{V}$ and a dual vector from $\mathbb{V}^*$. If somebody could explain this simply and systematically, I would highly appreciate it.

I would appreciate it even more if someone could explain the examples of inner products in special relativity, i.e. $A\cdot B= A_\mu B^\mu$, and in quantum mechanics, i.e. $\langle\alpha|\beta\rangle$.

Qmechanic

3 Answers

8

But very often I hear people [cannot give a reference, :-(] say it is an operation between a vector from $\mathbb{V}$ and a dual vector from $\mathbb{V}^*$. If somebody could explain this simply and systematically, I would highly appreciate it.

My advice: just stop listening to them, or at least don't take their abuse of language so seriously. Because that's what it is: an abuse of language. One can try to justify why it's not so bad an abuse of language, but at the end of the day, it is one. So, if it's confusing you, then don't listen to them.


At a very general level, what we’re interested in is various ways of combining vectors to produce other vectors.

Definition.

Let $E,F,G$ be vector spaces over a field $\Bbb{F}$, and $\beta:E\times F\to G$ a bilinear map.

Then we (may if we decide to) call $\beta$ a “product”, and for each $(e,f)\in E\times F$, call the value $\beta(e,f)\in G$ the product of $e$ and $f$ (relative to $\beta$). It is also not uncommon to write $e\cdot_{\beta}f$ or simply $e\cdot f$ or $ef$ if the meaning of $\beta$ is understood from context.
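If a concrete toy example helps, here is a minimal numpy sketch (the array, the dimensions and all names are made up purely for illustration) of a bilinear map encoded as a 3-index array, together with a check of linearity in the first slot:

```python
import numpy as np

# A bilinear map beta: E x F -> G can be encoded by a 3-index array B,
# with beta(e, f)_k = sum_{i,j} B[k, i, j] e[i] f[j].
rng = np.random.default_rng(0)
B = rng.standard_normal((2, 3, 4))   # dim G = 2, dim E = 3, dim F = 4 (arbitrary)

def beta(e, f):
    return np.einsum('kij,i,j->k', B, e, f)

e1, e2 = rng.standard_normal(3), rng.standard_normal(3)
f = rng.standard_normal(4)

# Linearity in the first slot (the second slot works the same way):
print(np.allclose(beta(2*e1 + e2, f), 2*beta(e1, f) + beta(e2, f)))   # True
```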

Here are some common examples:

  1. multiplication in the field: $\cdot: \Bbb{F}\times\Bbb{F}\to\Bbb{F}$, $(s,t)\mapsto s\cdot t=st$.
  2. scalar multiplication in a vector space: $\cdot:\Bbb{F}\times E\to E$, $(t,x)\mapsto t\cdot x=tx$.
  3. evaluation of a linear map on a vector: $\text{ev}:\text{Hom}(E,F)\times E\to F$, $(T,x)\mapsto T(x)$. A special case is when $E=F$, so you get the evaluation map in $E$, i.e $\text{End}(E)\times E\to E$.
  4. Duality pairing $E^*\times E\to\Bbb{F}$. Notice that this is a special case of the above when $F=\Bbb{F}$ is the underlying field, but it is sufficiently important that I mention it explicitly as a separate bullet point.
  5. composition of linear maps: $\circ:\text{Hom}(F,G)\times \text{Hom}(E,F)\to \text{Hom}(E,G)$, $(T,S)\mapsto T\circ S$. A special case is when $E=F=G$ so you have composition of endomorphisms $\text{End}(E)\times\text{End}(E)\to\text{End}(E)$.
  6. A Lie algebra bracket, $[\cdot,\cdot]:\mathfrak{g}\times\mathfrak{g}\to\mathfrak{g}$. Common examples include $\Bbb{R}^3$ with the cross product, or $\text{End}(E)$ with the “commutator”.
  7. A real inner-product. This means you start with a real vector space $V$ and a bilinear map $\langle\cdot,\cdot\rangle:V\times V\to\Bbb{R}$, which is symmetric, and positive-definite.
  8. A complex inner-product. Here you start with a complex vector space $V$, and you don’t consider bilinear maps, but rather sesquilinear maps, i.e $\langle\cdot,\cdot\rangle:V\times V\to\Bbb{C}$ a map which is linear in the first slot, conjugate-linear in the second slot, and is ‘conjugate-symmetric’ (i.e $\langle x,y\rangle=\overline{\langle y,x\rangle}$) and for each $x\in V$, $\langle x,x\rangle$ is real and non-negative, with equality if and only if $x=0$.
  9. Bilinear symmetric forms. This means you consider any vector space $E$ over a field $\Bbb{F}$ of characteristic not $2$ (for simplicity you may want to just focus on $\Bbb{R}$), and a bilinear map $E\times E\to\Bbb{F}$ which is symmetric.
  10. Non-degenerate bilinear symmetric forms, also known as pseudo inner-products or semi inner-products. This means a bilinear symmetric map $\beta:E\times E\to\Bbb{F}$ such that the mapping $x\mapsto \beta(x,\cdot )$ from $E\to E^*$ is a linear isomorphism.
  11. A symplectic form (on a real vector space, rather than a manifold). This means an anti-symmetric bilinear map $\omega:E\times E\to\Bbb{R}$ which is non-degenerate (i.e. $x\mapsto \omega(x,\cdot)$ is a linear isomorphism from $E$ to $E^*$).
  12. The canonical map $\otimes:E\times F\to E\otimes F$ of a product space into the tensor product.

Ok enough examples, but hopefully these are all familiar (and if not familiar, then I hope at least the definitions make sense). For your question, I want to focus mainly on items 4, 7 and 10 (and a little on 8, but that's essentially the same as 7, with a few extra complex conjugates).

  • Item 4 is one of the most important examples of a “product” (i.e. a bilinear map) because, structurally speaking, the only thing you need in order to define (4) is a single vector space $E$. You give me $E$ and I can immediately define $E^*$ and the bilinear evaluation map $E^*\times E\to\Bbb{F}$. I don’t need any “extra structure” (it is automatically there), compared to an inner product (see below), which is an extra piece of structure that you have to prescribe by hand.

  • Item 7 defines what a (real) inner product is. This is the usual intuitive notion encompassing Euclidean geometry (it is a rule for measuring lengths of vectors and angles between vectors). And yes, it is by definition a thing which takes two vectors from the same space and spits out a number. Anyone who says otherwise is abusing language (either because they’re already too smart and so expect everyone else to have a similar mathematical maturity, or they don’t know what they’re talking about so will interchange words without knowing why, or they’re just lazy), so regardless of their reason, don’t take their words too seriously (especially not if it is confusing you).

  • Item 10: in special relativity, this is the type of thing we consider. Specifically, a pseudo inner product (on a finite-dimensional space) with Lorentzian signature, meaning a symmetric, non-degenerate bilinear map $g:E\times E\to\Bbb{R}$ such that there exists a basis $\{e_0,\dots, e_n\}$ for which $[g(e_i,e_j)]=\text{diag}(-1,+1,\dots, +1)$, with one minus sign and $n$ plus signs (or the other way around, depending on your conventions). Be very careful: this one minus sign describes a different type of beast geometrically compared to item 7, but broadly speaking they’re similar insofar as both are symmetric, bilinear and non-degenerate (i.e. item 7 is a very special case of item 10). So, in special relativity, when people write $A\cdot B$, or more properly $g(A,B)$, its expansion relative to an arbitrary basis is simply \begin{align} g(A,B)=g(A^ie_i,B^je_j)=g(e_i,e_j)A^iB^j\equiv g_{ij}A^iB^j. \end{align} This is all you should write at this stage; writing it as $A_iB^i$ is, for now, garbage. (A small numerical sketch of items 4, 7 and 10 follows right after these bullet points.)
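If numbers help, here is the promised numpy sketch of items 4, 7 and 10 side by side (the components and the dimension are arbitrary; this is just an illustration, not part of the argument):

```python
import numpy as np

A = np.array([1.0, 2.0, 0.5, -1.0])   # components A^i in some chosen basis
B = np.array([0.5, -1.0, 3.0, 2.0])   # components B^i in the same basis

# Item 7: a Euclidean (positive-definite) inner product, g_{ij} = delta_{ij}.
euclid = np.eye(4)
print(A @ euclid @ B)                  # g_{ij} A^i B^j

# Item 10: a Lorentzian pseudo inner product, g = diag(-1, +1, +1, +1).
g = np.diag([-1.0, 1.0, 1.0, 1.0])
print(A @ g @ B)                       # g_{ij} A^i B^j, can be negative or zero

# Item 4: the duality pairing needs no metric at all; a covector is just
# a list of components w_i, and the pairing is w_i B^i.
w = np.array([2.0, 0.0, -1.0, 1.0])    # some covector, chosen by hand
print(w @ B)
```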

But now, I should mention the following algebraic fact. It really just boils down to a different way of looking at the same thing, i.e. it is the process of currying. Suppose again we’re back to our general case of having a bilinear map $\beta:E\times F\to G$. One way to view this is that we take a pair of vectors $(e,f)$ and produce a vector $\beta(e,f)$. Another thing you can do, however, is think that given $e\in E$, we get an operator $\beta(e,\cdot)\in\text{Hom}(F,G)$, so if this operator is now fed a vector $f\in F$, then we get an output in $G$. In other words, we have factorized \begin{align} E\times F\xrightarrow{\beta}G \end{align} into the composition \begin{align} E\times F\xrightarrow{\text{currying}}\text{Hom}(F,G)\times F\xrightarrow{\text{evaluation}} G\\ (e,f)\mapsto \bigg(\beta(e,\cdot),f\bigg)\mapsto \beta(e,f). \end{align} This may seem like a silly idea, and that I’m just unnecessarily introducing complexity into the problem. Well, the point is that (as we’ve discovered time and time again in math) sometimes, in order to understand a ‘space’, you don’t just study the elements of that space, you also study all the possible functions on that space. That’s why we care so much about Hom spaces and in particular dual spaces (and, more generally, at the manifold level we care not just about what the points of the manifold are, but about what the ring of smooth functions is, because that in a sense gives us the smooth structure… anyway, this was just a side remark); in fact, the importance of studying dual spaces alongside the given vector spaces is literally in the name of the subject: functional analysis.
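The currying step itself is almost trivial to write down; a throwaway Python sketch (the metric below is just one convenient choice of bilinear map, nothing more):

```python
import numpy as np

g = np.diag([-1.0, 1.0, 1.0, 1.0])

def beta(e, f):
    return e @ g @ f                 # a bilinear map E x F -> R

def curry(bilinear):
    # Send e to the operator beta(e, .), an element of Hom(F, G).
    return lambda e: (lambda f: bilinear(e, f))

A = np.array([1.0, 2.0, 0.5, -1.0])
B = np.array([0.5, -1.0, 3.0, 2.0])

beta_A = curry(beta)(A)              # this is the functional beta(A, .)
print(np.isclose(beta_A(B), beta(A, B)))   # True: the two viewpoints agree
```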

Now, in the special case that we consider bilinear symmetric functionals $\beta:E\times E\to\Bbb{R}$, this becomes the factorization of $\beta$ into its musical map followed by the duality pairing: \begin{align} E\times E\xrightarrow{\beta}\Bbb{R},\quad\text{factorizes into}\quad E\times E\xrightarrow{\text{currying}}E^*\times E\xrightarrow{\text{duality pairing (4)}}\Bbb{R}. \end{align} In other words, rather than just looking at $\beta(x,y)$, you can first think of converting the vector $x\in E$ into a covector $\beta(x,\cdot)\in E^*$ and then applying this covector on the vector $y\in E$ to produce the number $[\beta(x,\cdot)](y):=\beta(x,y)$.

So, that is what is actually going on in the second equality of the special relativity formula \begin{align} g(A,B)=g_{\mu\nu}A^{\mu}B^{\nu}=A_{\mu}B^{\mu}. \end{align} You’re first taking the vector $A$ and using the metric $g$ to convert it into a covector $g(A,\cdot)$, which by abuse of notation people still denote by $A$, and it is this covector which is being evaluated on $B$. In physics, people disambiguate the vector from the covector by the index placement (and 99% of the time there will be only a single metric $g$ involved, so once you get the idea no confusion can arise; but when multiple metrics are involved, you have to be very careful when doing these musical isomorphisms).
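In components this is just “lowering an index” with the metric, which is easy to check numerically; a minimal sketch (metric and components chosen arbitrarily):

```python
import numpy as np

g = np.diag([-1.0, 1.0, 1.0, 1.0])   # metric components g_{mu nu}
A = np.array([1.0, 2.0, 0.5, -1.0])  # A^mu
B = np.array([0.5, -1.0, 3.0, 2.0])  # B^mu

A_lower = g @ A                       # A_mu = g_{mu nu} A^nu, i.e. the covector g(A, .)
print(np.isclose(A_lower @ B, A @ g @ B))   # A_mu B^mu == g_{mu nu} A^mu B^nu -> True
```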


Finally, about point 8 (complex inner products), well there’s not much to say really. We just add in a few complex conjugates to the real definition to ensure it does what we want it to. In quantum mechanics, the thing we have is a complex inner product. And all this ket/bra stuff is just the distinction between a vector $x\in E$ and its associated dual $\langle x,\cdot \rangle\in E^*$.

The only technical thing to add is that here we often deal with infinite-dimensional complex vector spaces, so this mapping $x\mapsto \langle x,\cdot\rangle$ is not necessarily bijective. In finite dimensions, it is bijective for easy reasons (it is injective by positive-definiteness, so by the rank-nullity theorem it is bijective). In the infinite-dimensional case this is no longer true in general; it does hold (onto the continuous dual) if you assume completeness, i.e. you have a Hilbert space. In this case, the result is known as the Riesz representation theorem.
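For the complex case, here is a small numpy sketch in the convention of item 8 above (linear in the first slot, conjugate-linear in the second; the vectors are arbitrary, chosen just for illustration):

```python
import numpy as np

def inner(x, y):
    # Linear in x, conjugate-linear in y, conjugate-symmetric.
    return np.sum(x * np.conj(y))

x = np.array([1 + 2j, 0.5 - 1j])
y = np.array([-1j, 2 + 0.5j])

print(np.isclose(inner(x, y), np.conj(inner(y, x))))   # conjugate symmetry
print(inner(x, x).imag, inner(x, x).real >= 0)          # <x, x> is real and >= 0

# The "bra" built from x is just the curried functional <x, .>:
bra_x = lambda ket: inner(x, ket)
print(np.isclose(bra_x(y), inner(x, y)))                # True
```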


Summary.

  • An inner product is indeed defined between two vectors in the same space. Anyone who says otherwise is abusing language.
  • In special relativity, $g(A,B)=g_{\mu\nu}A^{\mu}B^{\nu}$ is the ‘correct’ formula. With some enlightened interpretation, by first converting a vector into a covector and then making an evaluation, you can make sense of saying this is equal to $A_{\mu}B^{\mu}$. Also, calling $g$ here an “inner product” is technically wrong, because many people insist on positive-definiteness in the definition; but if you really wish to enlarge your definitions and, depending on context, be more specific (e.g. “positive-definite inner product” vs “Lorentzian inner product”), then be my guest.
  • In quantum mechanics, $\langle\alpha|\beta\rangle$ is just the (complex) inner product of two vectors $\alpha,\beta\in E$. You can again play the same game of converting a vector $\alpha\in E$ into a ‘covector’ $\langle\alpha|\cdot\rangle\in E^*$ and then applying it to $\beta$ to get a number. Though… in QM, rather than vectors and covectors, people speak of kets and bras (we just love our puns).
Amit
peek-a-boo
1

There are differences between real vectors and complex vectors. In particular, real vectors in a space where the dot product establishes a measure of length are combined in a dot product by multiplying components and adding. For a complex vector, however, one takes the complex conjugate of one of the vectors before multiplying term by term and adding the terms. That is because such an operation gives a real value for $A\cdot A$ that is always non-negative (a rather important feature for measuring lengths).

Sometimes, too, the nomenclature is that a dot product results from a pair of vectors, one of which is an ($N$ by $1$) column vector and the other a ($1$ by $N$) row vector. In that picture, the dot product of vectors $A$ and $B$ is the matrix product of the conjugate transpose of $A$ with the untransposed $B$. The conjugate transpose is therefore in a different vector space, because it is a row vector and not a column vector. The 'bra' vector $\langle A|$ is conjugated; the 'ket' vector $|B\rangle$ is not.

That picture treats vectors as non-square matrices, and the dot product as one matrix times the conjugate transpose of another, with the row and column matrices making either a $1$ by $1$ result (a dot product) or an $N$ by $N$ result (a direct product), depending on the order (row first or column first).
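A minimal numpy illustration of that row-versus-column picture (the component values are arbitrary; this is just a sketch):

```python
import numpy as np

A = np.array([[1 + 1j], [2j], [3.0]])   # a "ket": 3-by-1 column vector
B = np.array([[0.5], [1 - 1j], [-2j]])  # another ket

bra_A = A.conj().T                       # the "bra": a 1-by-3 conjugate transpose

print(bra_A @ B)    # 1-by-1 result: the inner product <A|B>
print(B @ bra_A)    # 3-by-3 result: the outer ("direct") product |B><A|
```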

Whit3rd
1

Note: this is not a math answer, go to math.stackexchange.com for that. This is a physics answer.

So my absolute favorite definition of the inner product is from a Kip Thorne lecture on "coordinate-free physics". This is a movement to express the laws of physics as relations between geometric objects without regard to coordinates. Geometric objects are things like scalars, pseudo-scalars, vectors, axial vectors, and rank-$n$ tensors. (Note: numbers are not scalars; the contrary is dangerous misinformation. $\pi$ is a number; you can't rotate $\pi$. Scalars are things that can be represented by a number and can be rotated, yet don't change under rotation.)

When we are taught physics, we use coordinates. A vector is an arrow, magnitude and direction, for example:

$$ \hat z $$

we all know what the unit vector in the $z$ direction is: $(0, 0, 1)$. It's a little arrow of length $1$ that points up.

Or is it?

Really, vectors are geometric objects spanned by $3$ orthonormal things that are closed under rotations; the most important representation is $Y_{l=1}^m(\theta, \phi)$.

So $\hat z$ is:

$$ Y_1^0(\theta, \phi) = \frac 1 2 \sqrt{\frac 3 {\pi}} \frac z r $$

Not quite an arrow, but it's a thing that looks like a dipole, as it should (arrows are too skinny), and it can be rotated to mix with the other $Y_{l=1}^m$.

The point is that arrows are not irreducible geometric objects of dimension $3$ that are closed under rotations, while the $Y_1^m$ are.

That was just a warm-up. Now on to inner products.

So what's an inner product? We're taught that the inner product of two vectors is the product of their magnitudes times the cosine of the angle between them:

$$ |\vec a||\vec b|\cos\theta $$

The problem here is: what is $\theta$? With two arrows, it makes sense. But what if the arrows are complex? Or worse, in Hilbert space?

In terms of components, we're taught:

$$ \vec a \cdot \vec b = a_xb_x + a_yb_y + a_zb_z$$

It works in computer code, but requires that the vectors be represented in a coordinate system, and as we all know: physics doesn't care about your coordinate system.

Moreover, per the comments, what happens if the vectors are complex:

$$ \vec a \cdot \vec b = a_x^*b_x + a_y^*b_y + a_z^*b_z$$

That's the whole dual-space thing. E.g., if $b$ is a column vector, then $a$ needs to be a row vector; those are different spaces.

So what Kip says is: forget the coordinates. With a linear vector space (addition and negation are defined), all you need is a norm, which is pretty basic. No norm, forget about it.

With that, the inner product is:

$$ \vec a \cdot \vec b \equiv \frac 1 4 \left[|\vec a + \vec b|^2 - |\vec a - \vec b|^2\right] $$

It's a scalar that is manifestly invariant under rotations (the cosine and component formulas are not manifestly invariant).

That formula should always work, but you know mathematicians: they always have a special case. Still, as far as I can tell, if you have a norm and can add and subtract, it works. No coordinates or angles required. No dual spaces either... though one may appear in the norm; but here the norm isn't defined as the square root of the dot product of a vector with itself, it is more fundamental than the dot product.
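For real vectors this is just the polarization identity, and it is easy to spot-check numerically; a quick sketch with randomly chosen vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = rng.standard_normal(3), rng.standard_normal(3)

lhs = 0.25 * (np.linalg.norm(a + b)**2 - np.linalg.norm(a - b)**2)
print(np.isclose(lhs, np.dot(a, b)))   # True: the dot product recovered from the norm alone
```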

David Hammen
JEB