By Shun-ichi Amari

**Additional info for Information Geometry and Its Applications**

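We further introduce new parameters

$$\theta^1 = \frac{\mu}{\sigma^2}, \qquad \theta^2 = -\frac{1}{2\sigma^2}.$$

Then (…23) is written in the standard form,

$$p(x, \theta) = \exp\{\theta \cdot x - \psi(\theta)\}.$$

The convex function $\psi(\theta)$ is given by

$$\psi(\theta) = \frac{\mu^2}{2\sigma^2} + \log\left(\sqrt{2\pi}\,\sigma\right) = -\frac{(\theta^1)^2}{4\theta^2} - \frac{1}{2}\log\left(-\theta^2\right) + \frac{1}{2}\log \pi.$$

Here $\theta \cdot x$ pairs $\theta = (\theta^1, \theta^2)$ with the sufficient statistics $(x_1, x_2) = (x, x^2)$. Because $x_2 = x_1^2$, we use the dominating measure $d\mu(x) = \delta(x_2 - x_1^2)\,dx$, where $\delta$ is the delta function. The dual coordinates $\eta$ are given from (…10) as

$$\eta_1 = \mu, \qquad \eta_2 = \mu^2 + \sigma^2.$$

**Discrete Distribution**

Distributions of a discrete random variable $x$ over $X = \{0, 1, \ldots, n\}$ form a probability simplex $S_n$.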
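The change of parameters for the Gaussian family can be checked numerically. The sketch below is not from the book: the function names are illustrative, and it assumes the standard exponential-family relation that the dual coordinates are the gradient of $\psi$. It compares the two expressions for $\psi$ and recovers $\eta_1 = \mu$, $\eta_2 = \mu^2 + \sigma^2$ by central differences.

```python
import math

def natural_params(mu, sigma):
    """Map (mu, sigma) to theta^1 = mu / sigma^2 and theta^2 = -1 / (2 sigma^2)."""
    s2 = sigma ** 2
    return mu / s2, -1.0 / (2.0 * s2)

def psi(theta1, theta2):
    """Convex function psi(theta) of the Gaussian family (requires theta2 < 0)."""
    return (-theta1 ** 2 / (4.0 * theta2)
            - 0.5 * math.log(-theta2)
            + 0.5 * math.log(math.pi))

def psi_from_moments(mu, sigma):
    """The same function written in terms of mu and sigma."""
    return mu ** 2 / (2.0 * sigma ** 2) + math.log(math.sqrt(2.0 * math.pi) * sigma)

def eta_numeric(theta1, theta2, h=1e-6):
    """Central-difference gradient of psi (assumed here to give the eta coordinates)."""
    d1 = (psi(theta1 + h, theta2) - psi(theta1 - h, theta2)) / (2.0 * h)
    d2 = (psi(theta1, theta2 + h) - psi(theta1, theta2 - h)) / (2.0 * h)
    return d1, d2

# Arbitrary test values chosen for illustration.
mu, sigma = 1.5, 0.8
t1, t2 = natural_params(mu, sigma)
eta1, eta2 = eta_numeric(t1, t2)

print("psi(theta)      :", psi(t1, t2))
print("psi(mu, sigma)  :", psi_from_moments(mu, sigma))
print("eta (numeric)   :", eta1, eta2)
print("eta (expected)  :", mu, mu ** 2 + sigma ** 2)
```

The two values of $\psi$ should agree to rounding error, and the numerical gradient should match the first and second moments of $x$.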

[…] (…18), for which we hereafter use the abbreviation

$$\partial_i = \frac{\partial}{\partial \theta^i}, \qquad \partial^i = \frac{\partial}{\partial \eta_i}. \qquad (\ldots 19)$$

Here, the position of the index $i$ is important. If it is lower, as in $\partial_i$, the differentiation is with respect to $\theta^i$, whereas if it is upper, as in $\partial^i$, the differentiation is with respect to $\eta_i$.

The Fisher information matrix plays a fundamental role in statistics. We prove the following theorem, which connects geometry and statistics.

Theorem …1 The Riemannian metric in an exponential family is the Fisher information matrix defined by

$$g_{ij}(\theta) = E\left[\partial_i \log p(x, \theta)\, \partial_j \log p(x, \theta)\right].$$
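The statement that the metric coincides with the Fisher information can be illustrated on the Gaussian family. The sketch below rests on two assumptions not spelled out in this excerpt: that the metric is the Hessian of $\psi$ in the $\theta$ coordinates, and that the Fisher information of an exponential family equals the covariance matrix of its sufficient statistics, here $(x, x^2)$. Function names are hypothetical.

```python
import math

def psi(t1, t2):
    """Convex function of the Gaussian family in natural parameters (t2 < 0)."""
    return -t1 ** 2 / (4.0 * t2) - 0.5 * math.log(-t2) + 0.5 * math.log(math.pi)

def hessian_psi(t1, t2, h=1e-4):
    """Finite-difference Hessian of psi, used as the candidate metric g_ij."""
    g11 = (psi(t1 + h, t2) - 2.0 * psi(t1, t2) + psi(t1 - h, t2)) / h ** 2
    g22 = (psi(t1, t2 + h) - 2.0 * psi(t1, t2) + psi(t1, t2 - h)) / h ** 2
    g12 = (psi(t1 + h, t2 + h) - psi(t1 + h, t2 - h)
           - psi(t1 - h, t2 + h) + psi(t1 - h, t2 - h)) / (4.0 * h ** 2)
    return [[g11, g12], [g12, g22]]

def fisher_closed_form(mu, sigma):
    """Covariance matrix of (x, x^2) for x ~ N(mu, sigma^2)."""
    s2 = sigma ** 2
    return [[s2, 2.0 * mu * s2],
            [2.0 * mu * s2, 4.0 * mu ** 2 * s2 + 2.0 * s2 ** 2]]

# Arbitrary test values chosen for illustration.
mu, sigma = 0.5, 1.0
t1, t2 = mu / sigma ** 2, -1.0 / (2.0 * sigma ** 2)

G_hess = hessian_psi(t1, t2)
G_cov = fisher_closed_form(mu, sigma)

for row_h, row_c in zip(G_hess, G_cov):
    print(row_h, row_c)
```

Both matrices should agree up to finite-difference error, which is one concrete instance of the theorem rather than a proof of it.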

[…] (…122)

Fig. 8 Geodesic projection of P to S

[…] for any neighboring $Q$. This shows that $\hat{P}_S^*$ is a critical point of $D_\psi[P : Q]$, $Q \in S$, proving the theorem. The dual part is proved similarly.

It should be noted that the projection theorem gives a necessary condition for the point $\hat{P}_S^*$ to minimize the divergence, but not a sufficient one. The projection or dual projection can give a maximum or a saddle point of the divergence. The following theorem gives a sufficient condition for the minimality of the projection and its uniqueness.
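A small example shows why a critical point need not be a minimum. The sketch below is an illustration under simplifying assumptions, not the book's construction: it takes $\psi(\theta) = \frac{1}{2}|\theta|^2$, for which the divergence reduces to half the squared Euclidean distance, and uses the unit circle as a hypothetical curved submanifold $S$. The point $P = (2, 0)$ then has two critical points on $S$, one a minimum and one a maximum.

```python
import math

def divergence(p, q):
    """Divergence for psi = 0.5 * |theta|^2: half the squared Euclidean distance."""
    return 0.5 * ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def on_circle(angle):
    """Point of the unit circle (the illustrative submanifold S) at a given angle."""
    return (math.cos(angle), math.sin(angle))

def first_derivative(p, angle, h=1e-5):
    """Central-difference derivative of the divergence along the circle."""
    return (divergence(p, on_circle(angle + h))
            - divergence(p, on_circle(angle - h))) / (2.0 * h)

def second_derivative(p, angle, h=1e-4):
    """Central-difference second derivative of the divergence along the circle."""
    return (divergence(p, on_circle(angle + h))
            - 2.0 * divergence(p, on_circle(angle))
            + divergence(p, on_circle(angle - h))) / h ** 2

P = (2.0, 0.0)
near, far = 0.0, math.pi  # the two points where the line from P meets S orthogonally

for name, angle in (("near", near), ("far", far)):
    print(name,
          "D =", divergence(P, on_circle(angle)),
          "D' =", first_derivative(P, angle),
          "D'' =", second_derivative(P, angle))
```

Both angles give a vanishing first derivative, so both satisfy the orthogonality condition, but the second derivative is positive at the near point and negative at the far point. Only the near point minimizes the divergence.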