“Simply calculate the resultant”

I’ve been reading through Erwin Kreyszig’s excellent Differential Geometry. The second section is all about curves in three-dimensional Euclidean space, and it includes this neat problem:

Represent the folium of Descartes and the curve \mathbf{x}(t) = \left ( \frac{at}{t^2 + 1}, \frac{b(t^2 - 1)}{t^2 + 1}, 0 \right ), a,b \neq 0, in the form (6.5).

Problem 8.5

Where the “folium of Descartes” has already been given as

\mathbf{x}(t) = \left ( \frac{3t}{1 + t^3}, \frac{3t^2}{1 + t^3}, 0 \right ),

and “in the form (6.5)” means that two implicit functions should be given in terms of the coordinates of \mathbf{x}, as in

F(x_1, x_2, x_3) = 0, G(x_1, x_2, x_3) = 0.

I had a lot of fun working on this problem, and I’d like to share my little journey here. I’m going to focus on the folium of Descartes first.

A First Attempt

After some naive exploration, I decided to choose a technique that was guaranteed to work: cheating. I figured that a curve which is famous enough to have a name like “the folium of Descartes” would have a Wikipedia article with the answer, and I was right.

With a little adjustment of notation and a quick proof-by-calculation, we have the answer already!

x_1^3 = \frac{27t^3}{(1 + t^3)^3}

x_2^3 = \frac{27t^6}{(1 + t^3)^3}

x_1^3 + x_2^3 = \frac{27(t^3 + t^6)}{(1 + t^3)^3} = \frac{27t^3(1 + t^3)}{(1 + t^3)^3} = \frac{27}{(1 + t^3)^2}

x_1^3 x_2^3 = \frac{9t^3}{(1 + t^3)^2}

F(x_1, x_2, x_3) = x_1^3 + x_2^3 - 3x_1x_2 = 0

G(x_1, x_2, x_3) = x_3 = 0

At least, it was enough for me to ignore it and move on. The book isn’t very concerned with representations of the form (6.5), so I figured that my time was best spent elsewhere. Looking back now, I’m not sure that this “proof” is really enough. It certainly proves that F = G = 0 is a necessary condition for a point (x_1, x_2, x_3) to be on the curve, but it doesn’t do much in the way of proving that condition to be sufficient.

And even if this was a full proof, it would still be useless since there’s no sense of generality to it. I was only able to write it out because I looked up the answer to this particular problem.

The Book’s Answer

I didn’t learn until later that Differential Geometry has very informative solutions to certain problems in the back, so I was excited to learn that a more general technique for solving Problem 8.5 was provided. Here’s the book’s solution:

We have to eliminate the parameter t. The functions x_1(t) and x_2(t) may be written in the form F(x_1, t) = 0 and G(x_2, t) = 0. If F and G are polynomials in t we can always find the desired representation by equating to zero the resultant of these two polynomials. In the first case we obtain x_1^3 + x_2^3 - 3x_1x_2 = 0, x_3 = 0, and in the second case we find 4b^2x_1^2 + a^2x_2^2 - a^2b^2 = 0, x_3 = 0.

As a first thing, the F and G used here are not the same as in (6.5). I guess the book just likes those letters. The implicit functions as in (6.5) are given without name in this solution, at the end. I’ll be using the same notation for the rest of this. Aside from that, it seems like this should be a very simple algorithm, so let’s try to work through it, still just with the folium of Descartes.

It’s requested that we write x_1(t) and x_2(t) implicitly as polynomials, which is simple enough:

x_1(t) = \frac{3t}{1 + t^3} becomes F(x_1, t) = x_1t^3 - 3t + x_1 = 0

x_2(t) = \frac{3t^2}{1 + t^3} becomes G(x_2, t) = x_2t^3 - 3t^2 + x_2 = 0.

Then, we simply calculate the resultant of these two polynomials and set it equal to zero (I recommend that a reader consult the Wikipedia article on resultants if they want to understand why this equation corresponds to common roots of F and G). After looking up the “resultant”, we see that it is the determinant of the “Sylvester matrix”. I’ll get more into the construction of the Sylvester matrix later, but for now I’ll just say that the Sylvester matrix of these two polynomials is

\begin{pmatrix} x_1 & 0 & -3 & x_1 & 0 & 0 \\ 0 & x_1 & 0 & -3 & x_1 & 0 \\ 0 & 0 & x_1 & 0 & -3 & x_1 \\ x_2 & -3 & 0 & x_2 & 0 & 0 \\ 0 & x_2 & -3 & 0 & x_2 & 0 \\ 0 & 0 & x_2 & -3 & 0 & x_2 \end{pmatrix}.

Yikes. I’ve made a solemn pact with myself that I will never calculate the determinant of a matrix of order greater than 3 by hand, so seeing this for the first time was very frightening to me.

How to Calculate this Resultant

Computing is for computers, and computers have been able to calculate a determinant for ages now. As an example, here’s the determinant of the 10×10 matrix with each entry equal to one, using Haskell:

ghci> import Data.Matrix
ghci> detLaplace $ fromList 10 10 $ repeat 1
0

That was an awful example, but it still proves my point. Imagine if that matrix wasn’t singular! This would be much more convenient than a pencil-and-paper approach. The trouble comes when we notice that this example uses a matrix whose entries are numbers, while the matrix we’re interested in has more complicated entries. Haskell supports a wide variety of data types, so we should be able to continue with a computer-based approach if we look very carefully at what our entries are.

To have a Sylvester matrix in the first place, we need two univariate polynomials in the same variable and with coefficients from the same ring. This seems problematic, because the two polynomials we have, F and G, are multivariate polynomials in different variables and with integer coefficients. In symbols, we have F \in \mathbb{Z}[x_1, t] and G \in \mathbb{Z}[x_2,t]. However, we can view these as univariate polynomials in t by considering x_1 and x_2 as coefficients themselves. With this perspective, each polynomial has polynomial coefficients. In symbols, F \in (\mathbb{Z}[x_1])[t] and G \in (\mathbb{Z}[x_2])[t]. Now all that’s left to resolve is the requirement that the two univariate polynomials have coefficients in the same ring. This is no problem, since we have \mathbb{Z}[x_1], \mathbb{Z}[x_2] \subset \mathbb{Z}[x_1,x_2]. So finally, we can write F, G \in (\mathbb{Z}[x_1,x_2])[t].

The entries of the Sylvester matrix are made up of the coefficients of the two univariate polynomials, so now we have a better understanding of what our entries are: multivariate polynomials in x_1 and x_2 with integer coefficients. With a data type which can represent these, we should have no problem calculating our resultant. Luckily, the Haskell package polynomial-algebra does exactly this.

The package is seriously lacking when it comes to documentation, so I had to spend a lot of time reading through Balazs Komuves’ ingenious source code. For this I have no regrets; this library embodies my favorite part of Haskell very well: a complicated implementation can be hidden behind a clean front-end which interfaces seamlessly with the other parts of Haskell. By the time you’re actually using polynomials you can almost completely forget about the details of their in-memory representation.

(Edit: This is just false. The documentation is all on Hackage just like any other Haskell package. No idea how I didn’t see it…)

To assemble our matrix is very simple. Our coefficients can be implemented with Math.Algebra.Polynomial.Multivariate.Compact. This module allows us to represent polynomials with variables which have names like x_1,x_2,\ldots,x_N for fixed N. Of particular use is the type constructor ZPoly, which takes a generic variable name and the number of variables and gives us a polynomial ring in those variables with integer coefficients. So, for example, the type ZPoly "x" 2 corresponds to \mathbb{Z}[x_1,x_2]. To promote strings and natural numbers to types, we use the DataKinds language extension.

{-# LANGUAGE DataKinds #-}

import Data.Matrix
import Math.Algebra.Polynomial.Class
import Math.Algebra.Polynomial.Pretty
import Math.Algebra.Polynomial.Multivariate.Compact

x1 :: ZPoly "x" 2
x1 = variableP $ Index 1

x2 :: ZPoly "x" 2
x2 = variableP $ Index 2

-- note: we have automatic conversion from
--       integers to polynomials
m :: Matrix (ZPoly "x" 2)
m = fromLists [ [x1,  0, -3, x1,  0,  0]
              , [ 0, x1,  0, -3, x1,  0]
              , [ 0,  0, x1,  0, -3, x1]
              , [x2, -3,  0, x2,  0,  0]
              , [ 0, x2, -3,  0, x2,  0]
              , [ 0,  0, x2, -3,  0, x2]
              ]

Since ZPoly "x" 2 is an instance of the Num class, we can calculate determinants right away!

ghci> pretty $ detLaplace m
"- 27*x2^3 + 81*x1*x2 - 27*x1^3"

This is so cool! Up to rescaling and reordering (which is no problem) this is exactly what the book says! The only downside is that the matrix takes a while to type in. But this can be fixed.

Constructing the Sylvester Matrix

To construct the Sylvester matrix of two polynomials

p(t) = p_0 + p_1t + p_2t^2 + \ldots + p_dt^d

q(t) = q_0 + q_1t + q_2t^2 + \ldots + q_et^e

we play a cute little game. We make the first row by listing all of the coefficients of p backwards, followed by e-1 zeros:

\begin{pmatrix}  p_d & p_{d-1} & \ldots & p_1 & p_0 & 0 & \ldots & 0 \end{pmatrix} .

Then we repeat this same row e-1 times, but rotated to the right by one more each time. The eth row is

\begin{pmatrix}  0 & \ldots & 0 & p_d & p_{d-1} & \ldots & p_1 & p_0 \end{pmatrix} .

Then for the next rows we do the same thing, but with q playing the role of p and d playing the role of e. At the end, we have a square matrix of order d+e.

How to Calculate any Reslutant

With the construction in mind, we can write some Haskell to do the work for us. This time we’ll also make use of the Math.Algebra.Polynomial.Univariate module, which of course provides support for univariate polynomials. The type Univariate (ZPoly "x" 2) "t" corresponds to (\mathbb{Z}[x_1,x_2])[t]. After we’ve built our Sylvester matrix function, the resultant function follows easily.

{-# LANGUAGE DataKinds #-}

import GHC.TypeLits
import Data.Matrix
import Math.Algebra.Polynomial.Class
import Math.Algebra.Polynomial.Pretty
import Math.Algebra.Polynomial.Multivariate.Compact
import Math.Algebra.Polynomial.Univariate

x1 :: ZPoly "x" 2
x1 = variableP $ Index 1

x2 :: ZPoly "x" 2
x2 = variableP $ Index 2

polyF :: Univariate (ZPoly "x" 2) "t"
polyF = fromListP [ (U 3, x1)  -- (power, coefficient)
              , (U 1, -3)
              , (U 0, x1)
              ]

polyG :: Univariate (ZPoly "x" 2) "t"
polyG = fromListP [ (U 3, x2)
              , (U 2, -3)
              , (U 0, x2)
              ]

orderP :: Polynomial p => p -> Int
orderP = maximum . map (totalDegM . fst) . toListP

coefficientListP :: (Ring c, KnownSymbol v) => Univariate c v -> [c]
coefficientListP p = [coeffOfP (U n) p | n <- [0..d]]
  where
    d = orderP p

--https://rosettacode.org/wiki/Shift_list_elements_to_left_by_3#Haskell
rotate :: Int -> [a] -> [a]
rotate n xs = take <*> (flip drop (cycle xs) . mod n) $ length xs

sylvesterMatrix :: (Ring c, KnownSymbol v) => Univariate c v -> Univariate c v -> Matrix c
sylvesterMatrix p q = fromLists $ pLists ++ qLists
  where
    d      = orderP p
    e      = orderP q
    pList  = (reverse $ coefficientListP p) ++ (take (e-1) $ repeat 0)
    qList  = (reverse $ coefficientListP q) ++ (take (d-1) $ repeat 0)
    pLists = [rotate (-n) pList | n <- [0..e-1]]
    qLists = [rotate (-n) qList | n <- [0..d-1]]

resultant :: (Ring c, KnownSymbol v) => Univariate c v -> Univariate c v -> c
resultant p q = detLaplace $ sylvesterMatrix p q
ghci> mapPos (\(r,c) v -> pretty v) $ sylvesterMatrix polyF polyG
┌                                                             ┐
│      "x1"        "" "- 3*(1)"      "x1"        ""        "" │
│        ""      "x1"        "" "- 3*(1)"      "x1"        "" │
│        ""        ""      "x1"        "" "- 3*(1)"      "x1" │
│      "x2" "- 3*(1)"        ""      "x2"        ""        "" │
│        ""      "x2" "- 3*(1)"        ""      "x2"        "" │
│        ""        ""      "x2" "- 3*(1)"        ""      "x2" │
└                                                             ┘
ghci> pretty $ resultant polyF polyG
"- 27*x2^3 + 81*x1*x2 - 27*x1^3"

It works! Let’s see if that’s true for the problem’s second curve as well. In this case we have

x_1(t) = \frac{at}{t^2 + 1} becomes F(x_1,t) = x_1t^2 - at + x_1 = 0

x_2(t) = \frac{b(t^2 - 1)}{t^2 + 1} becomes G(x_2,t) = (x_2 - b)t^2 + x_2 + b = 0.

We have to look at the coefficients carefully again, but I’ll just skip to saying that they come from \mathbb{Z}[x_1, x_2, a, b]. To get this ring in Haskell, we’ll have to use the Math.Algebra.Polynomial.Multivariate.Generic module. It allows for polynomials with variables which have arbitrary names. This also provides a ZPoly type constructor, this time taking the type of variable and giving us a polynomial ring over those variables with integer coefficients. Note that the Generic and Compact modules cannot be used in the same file (without qualified names) or else there will be conflict over things like ZPoly.

{-# LANGUAGE DataKinds #-}

import Math.Algebra.Polynomial.Class
import Math.Algebra.Polynomial.Pretty
import Math.Algebra.Polynomial.Univariate
import Math.Algebra.Polynomial.Multivariate.Generic

data Coefficient = X1 | X2 | A | B
  deriving (Eq, Ord, Show)

instance Pretty Coefficient where
  pretty c = case c of
    X1 -> "x1"
    X2 -> "x2"
    A  -> "a"
    B  -> "b"

x1 :: ZPoly Coefficient
x1 = variableP X1

x2 :: ZPoly Coefficient
x2 = variableP X2

a :: ZPoly Coefficient
a = variableP A

b :: ZPoly Coefficient
b = variableP B

polyF :: Univariate (ZPoly Coefficient) "t"
polyF = fromListP [ (U 2, x1)
              , (U 1, -a)
              , (U 0, x1)
              ]

polyG :: Univariate (ZPoly Coefficient) "t"
polyG = fromListP [ (U 2, x2 - b)
              , (U 0, x2 + b)
              ]
ghci> mapPos (\(r,c) v -> pretty v) $ sylvesterMatrix polyF polyG
┌                                     ┐
│     "x1"    "- a"     "x1"       "" │
│       ""     "x1"    "- a"     "x1" │
│ "x2 - b"       "" "x2 + b"       "" │
│       "" "x2 - b"       "" "x2 + b" │
└                                     ┘
ghci> pretty $ resultant polyF polyG
"4*x1^2*b^2 + x2^2*a^2 - a^2*b^2"

Once again, our Haskell agrees (up to reordering) with the book, and we haven’t had to calculate a single determinant by hand.

Problems with the Resultant Method

For the second curve, we can notice that (x_1,x_2,x_3) = (0,b,0) is a valid point according to our implicit functions. But have a closer look at the parametric representation of the second coordinate: x_2(t) = \frac{b(t^2 - 1)}{t^2 + 1}. We can see that x_2 < b for all values of t. For a visual, I’ve plotted x_2(t) for b=1.

This is an issue with the resultant, having to do with something called a root at infinity. In fact, we can see that \lim_{t \to \pm \infty} \mathbf{x}(t) = (0,b,0). Still, such points do not belong to the parametric curve, so the parametric representation and the representation of the form (6.5) correspond to different curves.

Closing

I still haven’t given anything in the way of a proof that the representations are equivalent. In the case of the second curve, such a proof is impossible, as we have seen. Another very spooky thing that I’ve left out is the problems potentially created by only considering parameter values from \mathbb{R}, a field which is not algebraically closed. But I’m happy enough with my implementation of resultants in Haskell. I’ll leave the rest as an exercise for the reader.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: