# Dual norm

In functional analysis, the dual norm is a measure of the "size" of each continuous linear functional defined on a normed vector space.

## Definition

Let ${\displaystyle X}$  be a normed vector space with norm ${\displaystyle |\cdot |}$  and let ${\displaystyle X^{*}}$  be the dual space. The dual norm of a continuous linear functional ${\displaystyle f}$  belonging to ${\displaystyle X^{*}}$  is defined to be the real number

${\displaystyle \|f\|:=\sup\{|f(x)|:x\in X,|x|\leq 1\}}$

where ${\displaystyle \sup }$  denotes the supremum.[1]

The map ${\displaystyle f\mapsto \|f\|}$  defines a norm on ${\displaystyle X^{*}}$ . (See Theorems 1 and 2 below.)

The dual norm is a special case of the operator norm defined for each (bounded) linear map between normed vector spaces.

The topology on ${\displaystyle X^{*}}$  induced by the dual norm ${\displaystyle \|\cdot \|}$  is at least as strong as the weak-* topology on ${\displaystyle X^{*}}$ .

If the ground field of ${\displaystyle X}$  is complete then ${\displaystyle X^{*}}$  is a Banach space.

## The double dual of a normed linear space

The double dual (or second dual) ${\displaystyle X^{**}}$  of ${\displaystyle X}$  is the dual of the normed vector space ${\displaystyle X^{*}}$ . There is a natural map ${\displaystyle \varphi :X\to X^{**}}$ . Indeed, for each ${\displaystyle v}$  in ${\displaystyle X}$  and each ${\displaystyle w^{*}}$  in ${\displaystyle X^{*}}$  define

${\displaystyle \varphi (v)(w^{*}):=w^{*}(v).}$

The map ${\displaystyle \varphi }$  is linear, injective, and distance preserving.[2] In particular, if ${\displaystyle X}$  is complete (i.e. a Banach space), then ${\displaystyle \varphi }$  is an isometry onto a closed subspace of ${\displaystyle X^{**}}$ .[3]
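The definition of ${\displaystyle \varphi }$  can be illustrated in finite dimensions. The following is a minimal sketch, assuming ${\displaystyle X=\mathbb {R} ^{3}}$  and representing a functional as a Python callable; the helper names `functional_from` and `phi` are hypothetical and chosen only for this illustration. It evaluates ${\displaystyle \varphi (v)(w^{*})=w^{*}(v)}$  and checks linearity in ${\displaystyle v}$  on one example.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
Functional = Callable[[Vector], float]


def functional_from(coeffs: Vector) -> Functional:
    """Return the linear functional x -> sum_i coeffs[i] * x[i] on R^n."""
    return lambda x: sum(c * xi for c, xi in zip(coeffs, x))


def phi(v: Vector) -> Callable[[Functional], float]:
    """Canonical map X -> X**: phi(v) evaluates a functional at v."""
    return lambda w_star: w_star(v)


u, v = [1.0, 2.0, -1.0], [0.5, 0.0, 4.0]
w_star = functional_from([2.0, -3.0, 1.0])

# phi(v)(w*) is by definition w*(v).
value = phi(v)(w_star)

# Linearity in v: phi(2u + 3v)(w*) equals 2*phi(u)(w*) + 3*phi(v)(w*).
combo = [2.0 * a + 3.0 * b for a, b in zip(u, v)]
lhs = phi(combo)(w_star)
rhs = 2.0 * phi(u)(w_star) + 3.0 * phi(v)(w_star)
```

Representing elements of ${\displaystyle X^{**}}$  as functions that take a functional and return a scalar mirrors the definition directly; it says nothing about surjectivity, which is automatic only in finite dimensions.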

In general, the map ${\displaystyle \varphi }$  is not surjective. For example, if ${\displaystyle X}$  is the Banach space ${\displaystyle L^{\infty }}$  consisting of bounded functions on the real line with the supremum norm, then ${\displaystyle \varphi }$  fails to be surjective (see ${\displaystyle L^{p}}$  space). If ${\displaystyle \varphi }$  is surjective, then ${\displaystyle X}$  is said to be a reflexive Banach space. If ${\displaystyle 1<p<\infty }$  then the space ${\displaystyle L^{p}}$  is a reflexive Banach space.

## Mathematical Optimization

Let ${\displaystyle \|\cdot \|}$  be a norm on ${\displaystyle \mathbb {R} ^{n}.}$  The associated dual norm, denoted ${\displaystyle \|\cdot \|_{*},}$  is defined as

${\displaystyle \|z\|_{*}=\sup\{z^{\intercal }x\;|\;\|x\|\leq 1\}.}$

(This can be shown to be a norm.) The dual norm can be interpreted as the operator norm of ${\displaystyle z^{\intercal }}$ , viewed as a ${\displaystyle 1\times n}$  matrix, with the norm ${\displaystyle \|\cdot \|}$  on ${\displaystyle \mathbb {R} ^{n}}$  and the absolute value on ${\displaystyle \mathbb {R} }$ :

${\displaystyle \|z\|_{*}=\sup\{|z^{\intercal }x|\;|\;\|x\|\leq 1\}.}$
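When no closed form is at hand, the supremum in the definition can be approximated numerically. The sketch below is an illustration only: it assumes ${\displaystyle n=2}$ , uses a plain grid search over directions (an approximation, not an exact method), and the function name `dual_norm_2d` is hypothetical. It is applied to the ${\displaystyle \ell _{1}}$ -norm, whose dual is the ${\displaystyle \ell _{\infty }}$ -norm.

```python
import math
from typing import Callable, Sequence


def l1_norm(x: Sequence[float]) -> float:
    return sum(abs(xi) for xi in x)


def dual_norm_2d(z: Sequence[float],
                 norm: Callable[[Sequence[float]], float],
                 steps: int = 3600) -> float:
    """Approximate sup{ z^T x : norm(x) <= 1 } in R^2 by a grid search.

    The supremum of a linear function over the unit ball is attained on
    its boundary, so it is enough to scan directions and rescale each
    one to have norm exactly 1.
    """
    best = 0.0
    for k in range(steps):
        theta = 2.0 * math.pi * k / steps
        d = (math.cos(theta), math.sin(theta))
        scale = norm(d)
        x = (d[0] / scale, d[1] / scale)
        best = max(best, z[0] * x[0] + z[1] * x[1])
    return best


z = (3.0, -4.0)
estimate = dual_norm_2d(z, l1_norm)   # dual of the l1 norm
exact = max(abs(zi) for zi in z)      # the l-infinity norm of z
```

For this choice of norm the maximiser is a vertex of the unit ball, which the grid happens to contain, so `estimate` agrees with `exact` up to rounding; for a general norm the grid search gives only a lower bound that improves as `steps` grows.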

From the definition of dual norm we have the inequality

${\displaystyle z^{\intercal }x=\|x\|\left(z^{\intercal }{\frac {x}{\|x\|}}\right)\leq \|x\|\|z\|_{*}}$

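which holds for all x and z (for ${\displaystyle x=0}$  both sides are zero, so the division by ${\displaystyle \|x\|}$  is needed only when ${\displaystyle x\neq 0}$ ).[4] The dual of the dual norm is the original norm: we have ${\displaystyle \|x\|_{**}=\|x\|}$  for all x. (This need not hold in infinite-dimensional vector spaces.)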
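Both statements can be checked numerically on a concrete pair of norms. The following sketch assumes ${\displaystyle \|\cdot \|}$  is the ${\displaystyle \ell _{1}}$ -norm on ${\displaystyle \mathbb {R} ^{4}}$ , so that ${\displaystyle \|\cdot \|_{*}}$  is the ${\displaystyle \ell _{\infty }}$ -norm; the helper names are hypothetical. It tests the inequality on random pairs and computes ${\displaystyle \|x\|_{**}}$  by enumerating the vertices of the ${\displaystyle \ell _{\infty }}$  unit ball.

```python
import itertools
import random


def l1(x):
    return sum(abs(xi) for xi in x)


def linf(z):
    return max(abs(zi) for zi in z)


def dot(z, x):
    return sum(zi * xi for zi, xi in zip(z, x))


rng = random.Random(0)
n = 4

# The inequality z^T x <= ||x||_1 * ||z||_inf on random pairs
# (a small tolerance absorbs floating-point rounding).
holds = True
for _ in range(1000):
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    z = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    holds = holds and dot(z, x) <= l1(x) * linf(z) + 1e-12

# The bidual norm ||x||_** = sup{ z^T x : ||z||_inf <= 1 }.
# A linear function attains its maximum over the cube at a vertex,
# so enumerating the 2^n sign vectors gives the exact supremum.
x = [0.5, -2.0, 0.0, 1.25]
bidual = max(dot(s, x) for s in itertools.product((-1.0, 1.0), repeat=n))
```

Here `bidual` coincides with `l1(x)`, in line with ${\displaystyle \|x\|_{**}=\|x\|}$ . The random test is a sanity check, not a proof.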

The dual of the Euclidean norm is the Euclidean norm, since

${\displaystyle \sup\{z^{\intercal }x\;|\;\|x\|_{2}\leq 1\}=\|z\|_{2}.}$

(This follows from the Cauchy–Schwarz inequality; for nonzero z, the value of x that maximises ${\displaystyle z^{\intercal }x}$  over ${\displaystyle \|x\|_{2}\leq 1}$  is ${\displaystyle {\tfrac {z}{\|z\|_{2}}}}$ .)

The dual of the ${\displaystyle \ell _{\infty }}$ -norm is the ${\displaystyle \ell _{1}}$ -norm:

${\displaystyle \sup\{z^{\intercal }x\;|\;\|x\|_{\infty }\leq 1\}=\sum _{i=1}^{n}|z_{i}|=\|z\|_{1},}$

and the dual of the ${\displaystyle \ell _{1}}$ -norm is the ${\displaystyle \ell _{\infty }}$ -norm.
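The maximisers behind these two identities are easy to write down. The sketch below, with hypothetical variable names, builds them for one vector: the sign vector of ${\displaystyle z}$  for the ${\displaystyle \ell _{\infty }}$  unit ball, and a signed coordinate vector for the ${\displaystyle \ell _{1}}$  unit ball.

```python
def sign(t: float) -> float:
    return 1.0 if t >= 0 else -1.0


z = [3.0, -1.5, 0.0, 2.0]

# Dual of l-infinity: x = sign(z) lies in the l-infinity unit ball
# and gives z^T x = sum |z_i| = ||z||_1.
x_inf = [sign(zi) for zi in z]
dual_of_linf = sum(zi * xi for zi, xi in zip(z, x_inf))

# Dual of l1: put all the mass of x on a coordinate where |z_i| is
# largest; this x has l1 norm 1 and gives z^T x = max |z_i| = ||z||_inf.
i_max = max(range(len(z)), key=lambda i: abs(z[i]))
x_one = [sign(z[i]) if i == i_max else 0.0 for i in range(len(z))]
dual_of_l1 = sum(zi * xi for zi, xi in zip(z, x_one))
```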

More generally, Hölder's inequality shows that the dual of the ${\displaystyle \ell _{p}}$ -norm is the ${\displaystyle \ell _{q}}$ -norm, where q satisfies ${\displaystyle {\tfrac {1}{p}}+{\tfrac {1}{q}}=1}$ , i.e., ${\displaystyle q={\tfrac {p}{p-1}}.}$
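For ${\displaystyle 1<p<\infty }$  the equality case of Hölder's inequality gives an explicit maximiser, ${\displaystyle x_{i}=\operatorname {sign} (z_{i})|z_{i}|^{q-1}/\|z\|_{q}^{q-1}}$ . The following sketch checks this for ${\displaystyle p=3}$ ; the function names `lp_norm` and `holder_maximiser` are hypothetical, and the code assumes ${\displaystyle z\neq 0}$ .

```python
def lp_norm(x, p):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


def holder_maximiser(z, p):
    """Return x with ||x||_p = 1 attaining z^T x = ||z||_q, for 1 < p < inf.

    Assumes z is not the zero vector.
    """
    q = p / (p - 1.0)
    zq = lp_norm(z, q)
    return [
        (1.0 if zi >= 0 else -1.0) * abs(zi) ** (q - 1.0) / zq ** (q - 1.0)
        for zi in z
    ]


p = 3.0
q = p / (p - 1.0)          # conjugate exponent, here 1.5
z = [1.0, -2.0, 0.5]

x = holder_maximiser(z, p)
attained = sum(zi * xi for zi, xi in zip(z, x))
```

Up to rounding, `lp_norm(x, p)` equals 1 and `attained` equals `lp_norm(z, q)`, so the supremum defining the dual of the ${\displaystyle \ell _{p}}$ -norm is reached at this ${\displaystyle x}$ .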

As another example, consider the ${\displaystyle \ell _{2}}$ - or spectral norm on ${\displaystyle \mathbb {R} ^{m\times n}}$ . The associated dual norm is

${\displaystyle \|Z\|_{2*}=\sup\{\mathrm {\bf {tr}} (Z^{\intercal }X)\;|\;\|X\|_{2}\leq 1\},}$

which turns out to be the sum of the singular values,

${\displaystyle \|Z\|_{2*}=\sigma _{1}(Z)+\cdots +\sigma _{r}(Z)=\mathrm {\bf {tr}} ({\sqrt {Z^{\intercal }Z}}),}$

where ${\displaystyle r=\mathrm {\bf {rank}} Z.}$  This norm is sometimes called the nuclear norm.[5]
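The identity can be checked numerically with a singular value decomposition. The sketch below assumes NumPy is available and uses a random ${\displaystyle 4\times 3}$  matrix as test data; it shows that ${\displaystyle X=UV^{\intercal }}$ , built from the thin SVD ${\displaystyle Z=U\Sigma V^{\intercal }}$ , has spectral norm 1 and attains the sum of the singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.standard_normal((4, 3))

# Thin SVD: Z = U @ diag(s) @ Vt with s the singular values.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
nuclear = s.sum()

# X = U @ Vt has all its nonzero singular values equal to 1, so its
# spectral norm is 1, and tr(Z^T X) = tr(V diag(s) U^T U V^T) = sum(s).
X = U @ Vt
spectral_of_X = np.linalg.norm(X, 2)
attained = np.trace(Z.T @ X)
```

This only exhibits a feasible ${\displaystyle X}$  that reaches the value; that no feasible ${\displaystyle X}$  exceeds it is the content of the duality statement and is not tested here.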

## Examples

### Dual norm for matrices

The Frobenius norm defined by

${\displaystyle \|A\|_{\text{F}}={\sqrt {\sum _{i=1}^{m}\sum _{j=1}^{n}\left|a_{ij}\right|^{2}}}={\sqrt {\operatorname {trace} (A^{*}A)}}={\sqrt {\sum _{i=1}^{\min\{m,n\}}\sigma _{i}^{2}}}}$

is self-dual, i.e., its dual norm is ${\displaystyle \|\cdot \|'_{\text{F}}=\|\cdot \|_{\text{F}}.}$
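Self-duality of the Frobenius norm mirrors the Euclidean case: under the trace pairing ${\displaystyle \operatorname {trace} (Z^{\intercal }X)}$ , the maximiser is ${\displaystyle X=Z/\|Z\|_{\text{F}}}$ . A minimal sketch, assuming NumPy and a real matrix chosen so that the norm is exactly 5:

```python
import numpy as np

Z = np.array([[1.0, -2.0], [0.0, 2.0], [4.0, 0.0]])
fro = np.linalg.norm(Z, "fro")      # sqrt(1 + 4 + 4 + 16) = 5

# X = Z / ||Z||_F has Frobenius norm 1 and attains tr(Z^T X) = ||Z||_F.
X = Z / fro
attained = np.trace(Z.T @ X)

# Cauchy-Schwarz for the trace inner product bounds any other candidate.
rng = np.random.default_rng(1)
Y = rng.standard_normal(Z.shape)
Y /= np.linalg.norm(Y, "fro")
bounded = np.trace(Z.T @ Y) <= fro + 1e-12
```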

The spectral norm, a special case of the induced norm when ${\displaystyle p=2}$ , is defined as the maximum singular value of a matrix, i.e.,

${\displaystyle \|A\|_{2}=\sigma _{\max }(A).}$

Its dual norm is the nuclear norm, which is defined by

${\displaystyle \|B\|'_{2}=\sum _{i}\sigma _{i}(B),}$

for any matrix ${\displaystyle B}$ , where ${\displaystyle \sigma _{i}(B)}$  denote the singular values of ${\displaystyle B}$ .

## Some basic results about the operator norm

More generally, let ${\displaystyle X}$  and ${\displaystyle Y}$  be topological vector spaces, and ${\displaystyle L(X,Y)}$ [6] be the collection of all bounded linear mappings (or operators) of ${\displaystyle X}$  into ${\displaystyle Y}$ . In the case where ${\displaystyle X}$  and ${\displaystyle Y}$  are normed vector spaces, ${\displaystyle L(X,Y)}$  can be normed in a natural way.

Theorem 1. Let ${\displaystyle X}$  and ${\displaystyle Y}$  be normed spaces, and associate to each ${\displaystyle f\in L(X,Y)}$  the number:
${\displaystyle \|f\|=\sup\{\|f(x)\|:x\in X,\|x\|\leq 1\}.}$
This turns ${\displaystyle L(X,Y)}$  into a normed space. Moreover if ${\displaystyle Y}$  is a Banach space, so is ${\displaystyle L(X,Y)}$ .[7]

Proof. A subset of a normed space is bounded if and only if it lies in some multiple of the unit ball; thus ${\displaystyle \|f\|<\infty }$  for every ${\displaystyle f\in L(X,Y)}$ . If ${\displaystyle \alpha }$  is a scalar, then ${\displaystyle (\alpha f)(x)=\alpha \cdot fx}$ , so that

${\displaystyle \|\alpha f\|=|\alpha |\|f\|}$

The triangle inequality in ${\displaystyle Y}$  shows that

${\displaystyle {\begin{aligned}\|(f_{1}+f_{2})x\|&=\|f_{1}x+f_{2}x\|\\&\leq \|f_{1}x\|+\|f_{2}x\|\\&\leq (\|f_{1}\|+\|f_{2}\|)\|x\|\\&\leq \|f_{1}\|+\|f_{2}\|\end{aligned}}}$

for every ${\displaystyle x\in X}$  with ${\displaystyle \|x\|\leq 1}$ . Thus

${\displaystyle \|f_{1}+f_{2}\|\leq \|f_{1}\|+\|f_{2}\|}$

If ${\displaystyle f\neq 0}$ , then ${\displaystyle fx\neq 0}$  for some ${\displaystyle x\in X}$ ; hence ${\displaystyle \|f\|>0}$ . Thus, ${\displaystyle L(X,Y)}$  is a normed space.[8]

Assume now that ${\displaystyle Y}$  is complete, and that ${\displaystyle \{f_{n}\}}$  is a Cauchy sequence in ${\displaystyle L(X,Y)}$ . Since

${\displaystyle \|f_{n}x-f_{m}x\|\leq \|f_{n}-f_{m}\|\|x\|}$

and it is assumed that ${\displaystyle \|f_{n}-f_{m}\|\to 0}$  as ${\displaystyle n,m\to \infty }$ , ${\displaystyle \{f_{n}x\}}$  is a Cauchy sequence in ${\displaystyle Y}$  for every ${\displaystyle x\in X}$ . Hence

${\displaystyle fx=\lim _{n\to \infty }f_{n}x}$

exists. It is clear that ${\displaystyle f:X\to Y}$  is linear. If ${\displaystyle \varepsilon >0}$ , then ${\displaystyle \|f_{n}-f_{m}\|\|x\|\leq \varepsilon \|x\|}$  for sufficiently large n and m. Letting ${\displaystyle n\to \infty }$ , it follows that

${\displaystyle \|fx-f_{m}x\|\leq \varepsilon \|x\|}$

for sufficiently large m. Hence ${\displaystyle \|fx\|\leq (\|f_{m}\|+\varepsilon )\|x\|}$ , so that ${\displaystyle f\in L(X,Y)}$  and ${\displaystyle \|f-f_{m}\|\leq \varepsilon }$ . Thus ${\displaystyle f_{m}\to f}$  in the norm of ${\displaystyle L(X,Y)}$ . This establishes the completeness of ${\displaystyle L(X,Y).}$ [9]
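The norm of Theorem 1 can be computed explicitly in simple finite-dimensional cases. The sketch below is an illustration under stated assumptions: ${\displaystyle X=\mathbb {R} ^{3}}$  and ${\displaystyle Y=\mathbb {R} ^{2}}$  both carry the ${\displaystyle \ell _{\infty }}$ -norm (a choice made here, not one taken from the theorem), ${\displaystyle f}$  is given by a matrix `A`, and the helper names are hypothetical. For this pair of norms the operator norm is known to equal the largest absolute row sum, which the code compares with a direct evaluation of the supremum.

```python
import itertools


def linf(y):
    return max(abs(yi) for yi in y)


def apply(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]
n = len(A[0])

# x -> ||A x||_inf is convex, so its supremum over the cube
# {x : ||x||_inf <= 1} is attained at one of the 2^n vertices.
op_norm = max(linf(apply(A, s))
              for s in itertools.product((-1.0, 1.0), repeat=n))

# Known closed form for this pair of norms: the largest absolute row sum.
max_row_sum = max(sum(abs(a) for a in row) for row in A)
```

Enumerating vertices costs ${\displaystyle 2^{n}}$  evaluations and is meant only to make the supremum tangible; the row-sum formula is what one would use in practice.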

When ${\displaystyle Y}$  is the scalar field (i.e. ${\displaystyle Y=\mathbb {C} }$  or ${\displaystyle Y=\mathbb {R} }$ ), ${\displaystyle L(X,Y)}$  is the dual space ${\displaystyle X^{*}}$  of ${\displaystyle X}$ .

Theorem 2. Suppose ${\displaystyle B}$  is the closed unit ball of a normed space ${\displaystyle X}$ . For every ${\displaystyle x^{*}\in X^{*}}$  define:
${\displaystyle \|x^{*}\|=\sup\{|\langle {x,x^{*}}\rangle |:x\in B\}}$
Then
(a) This norm makes ${\displaystyle X^{*}}$  into a Banach space.[10]
(b) Let ${\displaystyle B^{*}}$  be the closed unit ball of ${\displaystyle X^{*}}$ . For every ${\displaystyle x\in X}$ ,
${\displaystyle \|x\|=\sup\{|\langle {x,x^{*}}\rangle |:x^{*}\in B^{*}\}.}$
Consequently, ${\displaystyle x^{*}\mapsto \langle {x,x^{*}}\rangle }$  is a bounded linear functional on ${\displaystyle X^{*}}$  of norm ${\displaystyle \|x\|}$ .
(c) ${\displaystyle B^{*}}$  is weak*-compact.

Proof. Since ${\displaystyle L(X,Y)=X^{*}}$ , when ${\displaystyle Y}$  is the scalar field, (a) is a corollary of Theorem 1. Fix ${\displaystyle x\in X}$ . There exists[11] ${\displaystyle y^{*}\in B^{*}}$  such that

${\displaystyle \langle {x,y^{*}}\rangle =\|x\|.}$

On the other hand,

${\displaystyle |\langle {x,x^{*}}\rangle |\leq \|x\|\|x^{*}\|\leq \|x\|}$

for every ${\displaystyle x^{*}\in B^{*}}$ . (b) follows from the above. Since the open unit ball ${\displaystyle U}$  of ${\displaystyle X}$  is dense in ${\displaystyle B}$ , the definition of ${\displaystyle \|x^{*}\|}$  shows that ${\displaystyle x^{*}\in B^{*}}$  if and only if ${\displaystyle |\langle {x,x^{*}}\rangle |\leq 1}$  for every ${\displaystyle x\in U}$ . The proof for (c)[12] now follows directly.[13]

## Notes

1. ^ Rudin 1991, p. 87
2. ^ Rudin 1991, section 4.5, p. 95
3. ^ Rudin 1991, p. 95
4. ^ This inequality is tight, in the following sense: for any x there is a z for which the inequality holds with equality. (Similarly, for any z there is an x that gives equality.)
5. ^
6. ^ Each ${\displaystyle L(X,Y)}$  is a vector space, with the usual definitions of addition and scalar multiplication of functions; this only depends on the vector space structure of ${\displaystyle Y}$ , not ${\displaystyle X}$ .
7. ^ Rudin 1991, p. 92
8. ^ Rudin 1991, p. 93
9. ^ Rudin 1991, p. 93
10. ^ Aliprantis 2005, p. 230
11. ^ Rudin 1991, Theorem 3.3 Corollary, p. 59
12. ^ Rudin 1991, Theorem 3.15 (the Banach–Alaoglu theorem), p. 68
13. ^ Rudin 1991, p. 94

## References

• Aliprantis, Charalambos D.; Border, Kim C. (2007). Infinite Dimensional Analysis: A Hitchhiker's Guide (3rd ed.). Springer. ISBN 9783540326960.
• Boyd, Stephen; Vandenberghe, Lieven (2004). Convex Optimization. Cambridge University Press. ISBN 9780521833783.
• Kolmogorov, A.N.; Fomin, S.V. (1957). Elements of the Theory of Functions and Functional Analysis, Volume 1: Metric and Normed Spaces. Rochester: Graylock Press.
• Narici, Lawrence; Beckenstein, Edward (2011). Topological Vector Spaces. Pure and applied mathematics (Second ed.). Boca Raton, FL: CRC Press. ISBN 978-1584888666. OCLC 144216834.
• Rudin, Walter (January 1, 1991). Functional Analysis. International Series in Pure and Applied Mathematics. 8 (Second ed.). New York, NY: McGraw-Hill Science/Engineering/Math. ISBN 978-0-07-054236-5. OCLC 21163277.
• Schaefer, Helmut H.; Wolff, Manfred P. (1999). Topological Vector Spaces. GTM. 8 (Second ed.). New York, NY: Springer New York Imprint Springer. ISBN 978-1-4612-7155-0. OCLC 840278135.
• Trèves, François (August 6, 2006) [1967]. Topological Vector Spaces, Distributions and Kernels. Mineola, N.Y.: Dover Publications. ISBN 978-0-486-45352-1. OCLC 853623322.