Arithmetic on bigInts
L. Allison, FIT, Monash University, Australia 3800.
16 August 2021
1. Introduction
An arbitrarily big integer,
a bigInt,
can be represented by a sign (±) and
a vector (array) of digits in some base:
        
   sign D[d-1], ..., D[0] 
where the base is chosen so that each digit fits in an integer
as can be manipulated by basic computer operations.
Addition (subtraction) of two bigInts can be performed digit by digit
taking account of any carries (borrows), e.g., in base ten,
9+9+1
= 19
= 9 carry 1;
it is convenient if
'digit1+digit2+carry'
cannot overflow.
Multiplication
of two bigInts can be performed
by various algorithms that ultimately rely on
basic computer operations to multiply individual digits
and add any carries.
Consider for example,
in base ten,
        
   9×9+8
   = 81+8
   = 89
   = 9 carry 8
and,
in base 16,
        
   1111×1111+1110
   = 11101111
   = 1111 carry 1110 
        
   (15×15+14
   = 239
   = 14×16+15
   = 15 carry 14).
We see that the product of two b-bit "digits" can produce a 2b-bit result.
If a 2b-bit integer fits in one computer Int then two digits can be
multiplied by a basic computer multiply.
Failing that, two digits must be multiplied
by a small software procedure of a few computer operations.
Division of bigInts can be performed by multiplication
combined with other operations on bigInts.
Timings of long division and long multiplication are reasonably consistent with quadratic time-complexity O(d2) (d~log N), div having a constant several times bigger (but this version is operating bit-wise not digit-wise like '×'). Note that N.toString() does a special 'divide by 10' repeatedly (log10(N) times) and takes maybe O(d2)-time overall but is very lumpy (try 64K bits or more), while N.toBinary() is very fast of course. (And the garbage collector may kick in at any time.)
2. Addition and Subtraction
Addition and subtraction of d-digit integers take O(d)-time.
3. Multiplication
3.1. Long-Multiplication
      1 2 3
    x 4 5 6
  ----------.
  4 9 2     | 4
    6 1 5   | 5
      7 3 8 | 6
  ---------
  5 6 0 8 8
Figure 2: Long-Multiplication.
For d-digit integers, long-multiplication takes O(d2)-time.
3.2. Karatsuba ×
Multiply two d-digit integers by multiplying three half-length, (d/2)-digit, integers [Kar62].
-  A;B and C;D are big integers.
- A;B × C;D = A×C<< + (A×D+B×C)< + B×D
- now (A×D+B×C) = −(A−B)×(C−D) + A×C + B×D (also = (A+B)×(C+D) − A×C − B×D), hence
- A;B × C;D = A×C<< + A×C< − (A−B)×(C−D)< + C×D< + C×D
- So to multiply A;B × C;D we only need to do three multiplications of half-size integers, A×C, C×D and (A−B)×(C−D), and some addition, subtraction and left-shift (<).
The algorithm has O(dlog2(3)) time complexity for d-digit multiplication, log2(3)≈1.58, so it beats long-multiplication for big enough bigInts.
2.2.1 The polynomial view:
An integer corresponds to a polynomial, the integer's digits being the coefficients of the polynomial. Multiplying two integers is mirrored by multiplying their corresponding polynomials to give the product polynomial. Recall that a polynomial of degree 'd' is defined by its values at d+1 points. In particular, a quadratic is defined by its values at three points, e.g., 0, 1 and −1, say.
-  f(x) × g(x) = (Ax + B) × (Cx + D)
   
   = Wx2 + Yx + Z = p(x).
 
 
-  p(0)    = Z = f(0) × g(0) = (0+B)×(0+D) = BD
- p(1) = W + Y + Z = (A+B)×(C+D)
- p(−1) = W − Y + Z = (−A+B)×(−C+D),
 
 
- p(1)−p(−1) = 2Y = (A+B)×(C+D) − (−A+B)×(−C+D) = 2(AD + BC)
- p(1)+p(−1) = 2(W + Z) = (A+B)×(C+D) + (−A+B)×(−C+D) = 2(AC + BD)
- W = ×AC, Y = AD+BC, Z = ×BD, so it is right
- and, looking at p(−1), Y = W+Z−(A−B)×(C−D)
- p(1) = W + Y + Z = (A+B)×(C+D)
3.3. Toom-Cook × (Toom3)
Multiply two d-digit integers by multiplying five (d/3)-digit integers, version after M. Bodrato [Bod07].
-  A;B;C × D;E;F
- f(x) × g(x) = (Ax2 + Bx + C) × (Dx2 + Ex + F) = Ux4 + Vx3 + Wx2 + Yx + Z = p(x)
 
 
- f(x) × g(x) = (Ax2 + Bx + C) × (Dx2 + Ex + F) = Ux4 + Vx3 + Wx2 + Yx + Z = p(x)
-  p(0)   = Z = f(0) × g(0) = C×F
- p(1) = U+V+W+Y+Z = (A+B+C) × (D+E+F)
- p(−1) = U−V+W−Y+Z = (A−B+C) × (D−E+F)
- p(−2) = 16U−8V+4W−2Y+Z = (4A−2B+C) × (4D−2E+F)
- p(∞): U = A×D
 
 
- Z = p(0) = CF ✓
- U = p(∞) = AD ✓
- r = (p(-2) − p(1))/3
 = (((4A−2B+C)×(4D−2E+F)) − ((A+B+C)×(D+E+F)))/3
 = 5AD−3AE+AF−3BD+BE−BF+CD−CE
- s = (p(1) − p(-1))/2
 = ((A+B+C)×(D+E+F)) − ((A−B+C)×(D−E+F)))/2
 = AE+BD+BF+CE
- t = p(-1) − Z
 = (A−B+C)×(D−E+F)−CF
 = AD−AE+AF−BD+BE−BF+CD−CE
- V = (t − r)/2 + 2.U
 = AE+BD ✓
- W = t + s − U
 = AF+BE+CD ✓
- Y = s − V
 = BF+CE ✓
- note five '×' (and some shifts, '+'s and '−'s) to calculate p(0), p(1), p(−1), p(−2) and p(∞) and from them U, V, W, Y and Z.
- p(1) = U+V+W+Y+Z = (A+B+C) × (D+E+F)
The algorithm has O(dlog3(5)) time complexity (log3(5)≈1.465). This beats Karatsuba asymptotically but the more extensive house-keeping makes for a larger constant of proportionality.
3.4. Faster Multiplication
The Schonhage-Strassen algorithm [SS71], running in O(n log n log log d)-time, was the asymptotically fastest multiplication algorithm for more than four decades. Harvey and van der Hoeven [HV19] finally removed the log log d term (as S and S had speculated would be possible); the improvement is of great theoretical interest but little practical use.
      1   2   3   123 x 456
      <--------
  6 | 7   3   8
    |           .
  5 | 6   1   5   .
    |           .   .
  4 | 4   9   2   .  8
    v   .   .   .  8
          .   .  0
            .  6
             5
  123 x 456 = 56088
Figure 4: Another Angle on Long-Multiplication.
Long multiplication is closely related to vector convolution, M conv N, where we pair-wise multiply elements and add up diagonals and that is all, the carries are dealt with afterwards (this is linear convolution below):
      1   2   3   123 conv 456
      <--------
  6 | 6  12  18
    |           .
  5 | 5  10  15   .
    |           .   .
  4 | 4   8  12   . 18
    v   .   .   . 27
          .   . 28
            . 13
             4
  4,13,28,27,18 >carries> 5,6,0,8,8, i.e., 56088
Figure 5: Linear Convolution
Trouble is, the obvious way to do the above still takes quadratic time, O(d2), but there is a faster method.
The Schonhage-Strassen algorithm [SS71] relies on
the Fast Fourier Transform (FFT) algorithm [CT65] which
calculates the Discrete Fourier Transform (F) of a vector
and on the fact that
       
  conv(X, Y) = F-1( F(X) . F(Y) ) 
       
  where F(X) . F(Y) is element-wise multiplication.
The FFT and FFT-1 algorithms carry out O(d log d)
arithmetic operations.
4. Division
To divide M≥0 by N>0, M ÷ N, find quotient Q≥0 and remainder r∈[0, N−1] such that M = Q×N + r, e.g., for 14÷4, Q=3 and r=2.
4.1. Long-Division
Long-division has O(d2) time complexity.
  e.g., 3219876 ÷ 394
           8172 rem 108
       .-------
  394  |3219876
   x8 = 3152
        ----
          678
   x1 =   394
          ---
          2847
   x7 =   2758
          ----
            896
   x2 =     788
            ---
            108
Figure 3: Long-Division
Long-division can be carried out in any base. If the base of the stored integers is a power of two (2k) the algorithm can still act in terms of binary (21) and this is the simplest version to implement because the only possible digit-multipliers are zero and one, however the time-complexity does decrease as the base increases because there are fewer digits.
4.2. Recursive Division
There are fast recursive algorithms [BZ98] to divide bigInts, e.g.,
To divide M by N, giving a quotient, Q, and a remainder:
Initialise Q := 0.
If M < N return 0 remainder M.
If M is "short", or if N is "nearly" as long as M, use long-division.
Otherwise M > N, M is "long" and M is longer than N.
If N is "long"
set half-length integers N1 and N0 such that N=N1;N0 and
set N1' =N1+1
(note that (N1'<<|N0|) > N),
otherwise N is "short",
set N1' = N.
Set half-length integers M1 and M0 such that M=M1;M0;
note that
M1 <<|M0| ≤ M.
Set
Q1 := (M1 ÷ N1') <<|M0|-|N0|
(←recursion);
note that
Q1 ≤ M ÷ N
and
Q1 > 0.
Set
Q := Q + Q1,
and
M := M−Q1×N.
Repeat,
accumulating the quotient Q,
until M < N, which leaves the remainder in M.
(<<k is left-shift by k digit places and
|X| is the # of digits in X.)
Naturally this division algorithm avails itself of the fastest multiplication algorithm.
5. Conclusion
Applications of arithmetic on bigInts include testing primality for public-key encryption, and compression of big integers [AKS21].
To make the very fastest implementation of operations on large integers it is best to use a programming language that is close to the underlying hardware and C, C++ or assembler are obvious choices. Javascript is not that kind of choice but it does run in web browsers which is convenient for interactive demonstrations of algorithms.
References
- [AKS21] L. Allison, A. S. Konagurthu and D. F. Schmidt 'On universal codes for integers: Wallace Tree, Elias Omega and beyond', Data Compression Conference (DCC), IEEE, pp.313-322, March 2021 [doi:10.1109/DCC50243.2021.00039]
- And, an application of arithmetic on bigInts to encoding and decoding large integer values for data compression and machine learning.
- [Bod07] M. Bodrato. 'Toward optimal Toom-Cook multiplication for univariate and multivariate polynomials in characteristic 2 and 0', Int. Workshop on Arith. & Finite Fields (WAIFI), Springer, 2007, [doi:10.1007/978-3-540-73074-3_10].
- Also see Toom-Cook@[wikip]['21].
- [BZ98] C. Burnikel and J. Ziegler,
  'Fast recursive division',
  Im Stadtwald, D-Saarbrucken, 1998
  [www]['21].
- [CT65] J. W. Cooley and J. W. Tukey, 'An algorithm for the machine calculation of complex Fourier series', Math. Comp., 19, 297-301, 1965 [doi:10.1090/S0025-5718-1965-0178586-1]
- Also see the FFT@[wikip]['21].
- [HV19] D. Harvey and J. Van Der Hoeven,
  'Integer multiplication in time O(n log n)',
  HALarchive, 2019
  [www]['21].
- [Kar62] A. Karatsuba and Yu. Ofman, 'Multiplication of many-digital numbers by automatic computers', Proc. USSR Acad. Scis, 145, pp.293-294, 1962
- Translation in the acad. j.
  Physics-Doklady, 7, pp.595-596, 1963.
 Also see algorithm@[wikip]['21].
- [SS71] A. Schonhage and V. Strassen, 'Schnelle multiplikation grosser zahlen', Computing, 7, pp.281-292, 1971 [doi:10.1007/BF02242355]
- O(n log n log log n)-time. Also see algorithm@[wikip]['21].