Implementing natural numbers in OCaml

In this post we’re going to implement natural numbers (positive integers) in OCaml to see how we can define numbers from first principle, namely without using OCaml’s built in Integer type. We’ll then write a simple UI so that we have a basic (but inefficient) calculator. You can find all the code for this post on Github.

Definition

We’ll start with a recursive definition of natural numbers:

\[n \in \mathcal{N} \iff n = \begin{cases}0 \\ S(m) \hspace{5mm} \text{for }m \in \mathcal{N} \end{cases}\]

We used the function \(S(m)\) which is called the successor function. This simply returns the next natural number (for example \(S(0)=1\), and \(S(4)=5\)).

This definition means that a natural number is either \(0\) or the successor of another natural number. For example \(0\) is a natural number (the first case in the definition), but \(1\) is also a natural number, as it’s the successor of \(0\) (you would write \(1=S(0)\)). 2 can then be written as \(2 = S(S(0))\) , and so on. By using recursion (the definition of a natural number includes another natural number) we can “bootstrap” building numbers without using many other definitions.

We now write this definition as a type in OCaml, which looks a lot like the mathematical definition above:

type nat =
  | Zero
  | Succ of nat

The vertical lines denote the two cases. Here you would write 1 as Succ Zero, 2 as Succ Succ Zero, and so on.

However we haven’t said what these numbers are (what is zero? What are numbers? ). To do that we need to define how they act.

Some operators

We’ll start off by defining how we can increment and decrement them.

let incr n =
  Succ n

let decr n =
  match n with
  | Zero -> Zero
  | Succ nn -> nn

The increment function simply adds a Succ before the number, so this corresonds to adding 1. So incr (Succ Zero) returns Succ Succ Zero. The decrement function checks whether the number n is Zero or the successor of a number. In the first case it simply returns Zero (So this means that decr Zero returns Zero. However this could be extended to include negative numbers). In the second case the function returns the number that precedes it. So decr (Succ Succ Succ Zero) returns Succ Succ Zero.

Addition

We can now define addition as a recursive function which we denote by ++ (in OCaml we define infix operators using parentheses). So the addition function takes two elements n and m of type nat and returns an element of type nat. Note the rec added before the function name which means that it’s recursive.

let rec (++) n m =
  match m with
  | Zero -> n
  | Succ mm -> (Succ n) ++ mm

Because we defined the function to be an infix operator we put it in between the arguments (ex: Zero ++ (Succ Zero)). This function checks whether m is Zero or the successor of a number. If it’s a successor of mm it returns the sum of mm and Succ n.

Let check that this definition behaves correctly by calculating 1+1 which we write as (Succ Zero) ++ (Succ Zero). The first call to the function finds that the second argument is the successor of Zero, so returns the sum (Succ Succ Zero) ++ Zero. This calls the functions a second time which finds that the second argument is Zero. As a result the function return Succ Succ Zero which is 2 !

So in summary 1+1 is written as (Succ Zero) ++ (Succ Zero) = (Succ Succ Zero) ++ Zero = Succ Succ Zero. Math still works!

Subtraction

We now define subtraction:

let rec (--) n m =
  match m with
  | Zero -> n
  | Succ mm -> (decr n) -- mm

This decrements both arguments until the second one is Zero. Note that if m is bigger than n then n -- m will still equal Zero.

Multiplication

Moving on, we define multiplication:

let (+*) n m =
  let rec aux n m acc =
    match m with
    | Zero -> acc
    | Succ mm -> aux n mm (n ++ acc)
  in
  aux n m Zero

Here we use an auxiliary function (aux) which builds up the result in the accumulator acc by adding n to it m times. So applying this function to \(3\) and \(2\) gives: \(3*2 = 3*1 + 3 = 3*0 + 6 = 6\). And in code this is:

(Succ (Succ (Succ Zero))) +* (Succ (Succ Zero))
Which returns ((Succ (Succ (Succ Zero))) +* (Succ Zero)) ++ (Succ (Succ (Succ Zero)))
Which returns ((Succ (Succ (Succ Zero))) +* Zero) ++ (Succ (Succ (Succ (Succ (Succ (Succ Zero))))))
which returns (Succ (Succ (Succ (Succ (Succ (Succ Zero)))))) (namely \(6\))

Division

We also define the ‘strictly less than’ operator which we then use to define integer division.

let rec (<<) n m =
  match (n, m) with
  | (p, Zero) -> false
  | (Zero, q) -> true
  | (p, q) -> (decr n) << (decr m)

let (//) n m =
  let rec aux p acc =
    let lt = p << m in
    match lt with
    | true -> acc
    | false -> aux (p -- m) (Succ acc)
  in
  aux n Zero

Like in the case of multiplication, the division function defines an auxiliary function that builds up the result in the accumulator acc. This function checks whether the first argument p is less than m. If it isn’t, then increment the accumulator by 1 and call aux again but with p-m as the first argument. Once p is less than m then return the accumulator. So this auxiliary function counts the number of times that m fits into p, which is exactly what integer division is. We run this function with n as first argument and with the accumulator as Zero.

Finally we can define the modulo operator. As we use previous definitions of division, multiplication, and subtraction, this definition is abstracted away from our implementation of natural numbers. This function gives the remainder when dividing n by m.

let (%) n m =
  let p = n // m in
  n -- (p +* m)

A basic UI

We’ve defined the natural numbers and the basic operators, but it’s a bit unwieldy to use them in their current form. So we’ll write some code to convert them to the usual number system (represented as strings) and back.

From type `nat` to string representation

We’ll write some code to convert numbers to base 10 and then represent them in the usual Arabic numerals.

let ten = Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ Zero)))))))))

let base10 n =
  let rec aux q acc =
    let r = q % ten in
    let p = q // ten in
    match p with
    | Zero -> r::acc
    | pp -> aux p (r::acc)
  in
  aux n []

This function returns a list where each element corresponds to the number of 1s, 10s, 100s etc in the number. So if n is Succ Succ Succ Succ Succ Succ Succ Succ Succ Succ Succ Succ Zero (ie: 12), then base10 n returns [Succ Zero; Succ Succ Zero].

We then define the 10 digits (with a hack for the cases bigger than 9) and put it all together in the function string_of_nat.

let print_nat_digits = function
  | Zero -> "0"
  | Succ Zero -> "1"
  | Succ Succ Zero -> "2"
  | Succ Succ Succ Zero -> "3"
  | Succ Succ Succ Succ Zero -> "4"
  | Succ Succ Succ Succ Succ Zero -> "5"
  | Succ Succ Succ Succ Succ Succ Zero -> "6"
  | Succ Succ Succ Succ Succ Succ Succ Zero -> "7"
  | Succ Succ Succ Succ Succ Succ Succ Succ Zero -> "8"
  | Succ Succ Succ Succ Succ Succ Succ Succ Succ Zero -> "9"
  | _ -> "bigger than 9"

let string_of_nat n =
  let base_10_rep = base10 n in
  let list_strings = List.map print_nat_digits base_10_rep in
  String.concat "" list_strings

string_of_nat converts the number of type nat to base 10, then maps each of the list element to a string and concatenates those strings.

So string_of_nat (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ Zero)))))))))))) returns "12" which is easier to read!

From string representation to type `nat`

We then define some code to go the other way around: from string representation to natural numbers.

let string_to_list s =
  let rec loop acc i =
    if i = -1 then acc
    else
      loop ((String.make 1 s.[i]) :: acc) (pred i)
  in loop [] (String.length s - 1)


let nat_of_listnat l =
  let lr = List.rev l in
  let rec aux n b lr =
    match lr with
    | [] -> n
    | h::t -> aux (n ++ (b+*h)) (b+*ten) t
  in
  aux Zero (Succ Zero) lr

let nat_of_string_digits = function
  | "0" -> Zero
  | "1" -> Succ Zero
  | "2" -> Succ (Succ Zero)
  | "3" -> Succ (Succ (Succ Zero))
  | "4" -> Succ (Succ (Succ (Succ Zero)))
  | "5" -> Succ (Succ (Succ (Succ (Succ Zero))))
  | "6" -> Succ (Succ (Succ (Succ (Succ (Succ Zero)))))
  | "7" -> Succ (Succ (Succ (Succ (Succ (Succ (Succ Zero))))))
  | "8" -> Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ Zero)))))))
  | "9" -> Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ Zero))))))))
  | _ -> raise (Failure "string must be less than 10")

(* Converts string to nat *)
let nat_of_string s =
  let liststring = string_to_list s in
  let listNatbase = List.map nat_of_string_digits liststring in
  nat_of_listnat listNatbase


(*
  final (infix) functions for adding, subtracting, multiplying, and dividing
  which take strings as arguments and return a string
*)
let (+++) n m =
 string_of_nat ((nat_of_string n) ++ (nat_of_string m))

let (---) n m =
 string_of_nat ((nat_of_string n) -- (nat_of_string m))

let (+**) n m =
 string_of_nat ((nat_of_string n) +* (nat_of_string m))

let (///) n m =
 string_of_nat ((nat_of_string n) // (nat_of_string m))

let (%%) n m =
string_of_nat ((nat_of_string n) % (nat_of_string m))

So putting it all together, we have a working calculator for natural numbers!

Let’s try it out:

"3" +++ "17" returns "20"
"182" --- "93" returns "89"
"12" +** "3" returns "36"
"41" /// "3" returns "13"
"41" %% "3" returns "2"

Conclusion

We have built up natural numbers from first principles and now have a working calculator. However these operators start getting very slow for numbers of around 7 digits or more, so sticking with built-in integers sounds preferable..

All the code for this post is on Github

Thanks to James Jobanputra for useful feedback on this post