Sieve of Eratosthenes

The Sieve of Eratosthenes is a simple, ancient algorithm for finding all prime numbers up to any given limit. The algorithm requires an array of boolean values, and accesses its values with two nested loops.

We implemented this algorithm in different programming languages, and measured their performance and memory usage for counting the number of primes up to 10'000'000. Our test computer has the following specs:

Performance comparison

Time Memory
Implementation Compile Startup Run 1 Run N Array RSS Virtual
C
sieve.c 72 ms 1 ms 77 ms 65 ms 10 MB 11 MB 13 MB
sieve-bitarray.c 68 ms 1 ms 34 ms 32 ms 1.25 MB ? ?
Java
sieve.java 460 ms 48 ms 88 ms 75 ms 10 MB 92 MB 5000 MB
JavaScript
sieve-array.js 510 ms 310 ms 310 ms 80 MB 120 MB 670 MB
sieve-uint8array.js 510 ms 91 ms 86 ms 10 MB 50 MB 600 MB
Python
sieve.py 15 ms 2550 ms 2550 ms 80 MB 89 MB 98 MB
sieve-numpy.py 96 ms 105 ms 105 ms 10 MB 140 MB 170 MB
sieve-python-c 615 ms¹ 13 ms 77 ms 65 ms 10 MB 18.5 MB 27 MB
Perl
sieve.pl 5 ms 4200 ms 4200 ms 93.5 MB 325 MB 330 MB
sieve-inline-c.pl 750 ms² 40 ms 65 ms 65 ms 10 MB 28.5 MB 34.4 MB
Go
sieve.go 225 ms 1 ms 83 ms 75 ms 10 MB 16 MB 105 MB
Octave
sieve_while.m 165 ms ~160 s ~160 s 10 MB 85 MB 620 MB
sieve_vectorized.m 165 ms 105 ms 105 ms 10 MB 85 MB 620 MB
1. Time to compile the module. The Python script is compiled at startup.
2. Time to compile the C code on the first run. The Perl code is compiled at startup.

sieve.c

This is the standard implementation in C, using a char array to store whether a number is a prime or not. The first run is slightly slower than subsequent runs, which is probably due to memory wiring by the kernel, and memory caches in the CPU.

We compiled the program using GCC 8.3.0.

sieve-bitarray.c

Using just one bit (instead of one byte) per number, this implementation requires 8 times less memory than the above implementation. It therefore involves less memory transfer between the RAM and the CPU, but more overhead shifting and masking these values. On most computers, it is faster than the standard implementation.

We compiled the program using GCC 8.3.0.

sieve.java

This is the standard implementation in Java, using an array of booleans. Since the JVM checks array bounds, the runs take slightly longer than with the C implementation. The compilation from byte code to machine code may be included in the startup time, or in the first run.

We compiled the program using javac of OpenJDK 1.8.0_212, and run it using OpenJDK 11.0.4.

sieve-array.js

This is a JavaScript implementation using a regular array. Since such arrays are dynamic and can hold any type of value, they require significantly more memory, and are quite a bit slower.

We run the script using Node.js v10.15.2.

sieve-uint8array.js

This is a JavaScript implementation using a Uint8Array, which is comparable to a char array in C. The array requires the same amount of memory as the C implementation, and the algorithm runs just slightly slower.

We run the script using Node.js v10.15.2.

sieve.py

This is a Python implementation using a standard array as a local variable. It is about 35 times slower than the corresponding C implementation, mostly because it is executed as intermediate code, and not compiled machine code.

Note that the implementation is significantly slower (about 4700 ms) when using global variables.

We run the script using Python 3.7.3. The array size was determined using sys.getsizeof.

sieve-numpy.py

This is a Python implementation using a numpy array. While the outer loop is still written in Python, the inner loop is written as slice assignment, and executed by numpy. This is significantly faster than the above Python implementation.

We run the script using Python 3.7.3. The array size was determined using sys.getsizeof.

sieve-python-c

This is a C implementation as Python module. The speed is about the same as the C implementation, except for a little startup overhead.

sieve.pl

This is a Perl implementation. It has a remarkably low startup time, but is slow afterwards.

We run the script using Perl v5.28.1. The array size was determined using Devel::Size::total_size.

sieve-inline-c.pl

This is a Perl script running the C implementation through the Inline::C module.

We run the script using Perl v5.28.1. The array size was determined using Devel::Size::total_size.

sieve.go

This is a Go implementation.

sieve_while.m

This is a Matlab/Octave implementation using a while loop.

We run the script using GNU Octave version 4.4.1.

sieve_vectorized.m

This is a Matlab/Octave implementation using vector indexing.

We run the script using GNU Octave version 4.4.1.

sieve-arduino

This C++ implementation targets an Arduino compatible microcontroller. These devices have very limited memory, which makes it impossible to look for more than a few thousand prime numbers. In addition, these CPUs are about a thousand times less powerful than the reference computer. For these reasons, it is not possible to compare this to the other implementations.