Legendre prime counting function: Difference between revisions

Line 1,756:

0.003234 seconds (421 allocations: 14.547 KiB)

</pre>

=={{header|Kotlin}}==

There doesn't seem to be much point to contributing memoized (Hash table) versions as they seem to be the result of the task author not realizing that there is a simple way of keeping the computation complexity from having exponential growth with counting range by limiting the "splitting" of the "phi" calculation when the result must be just one. Accordingly, the following Kotlin versions are [translated and improved from the non-memoizing Nim versions](https://rosettacode.org/wiki/Legendre_prime_counting_function#Non-Memoized_Versions), which explanations can be referenced for an understanding of how they work:

'''The Basic Recursive Legendre Algorithm'''

<syntaxhighlight lang="kotlin">fun countPrimes(lmt: Long): Long {

if (lmt < 3) return if (lmt < 2) 0 else 1

val sqrtlmt = Math.sqrt(lmt.toDouble()).toInt()

fun makePrimes(): IntArray {

val mxndx = (sqrtlmt - 3) / 2

val arr = IntArray(mxndx + 1, { it + it + 3})

var i = 0

while (true) {

val sqri = (i + i) * (i + 3) + 3

if (sqri > mxndx) break

if (arr[i] != 0) {

val bp = i + i + 3

for (c in sqri .. mxndx step bp) arr[c] = 0

}

i++

}

return arr.filter { it != 0 }.toIntArray()

}

val oprms = makePrimes()

fun phi(x: Long, a: Int): Long {

if (a <= 0) return x - (x shr 1)

val na = a - 1; val p = oprms[na].toLong()

if (x <= p) return 1

return phi(x, na) - phi(x / p, na)

}

return phi(lmt, oprms.size) + oprms.size.toLong()

}

fun main() {

val strt = System.currentTimeMillis()

for (i in 0 .. 9) {

val arg = Math.pow(10.toDouble(), i.toDouble()).toLong()

println("π(10**$i) = ${countPrimes(arg)}")

}

val stop = System.currentTimeMillis()

println("This took ${stop - strt} milliseconds.")

}</syntaxhighlight>

<pre>π(10**0) = 0

π(10**1) = 4

π(10**2) = 25

π(10**3) = 168

π(10**4) = 1229

π(10**5) = 9592

π(10**6) = 78498

π(10**7) = 664579

π(10**8) = 5761455

π(10**9) = 50847534

This took 480 milliseconds.</pre>

The time is as run on an Intel i5-6500 (3.6 GHz single-threaded boost).

Although this is about ten times faster than if it were using memoization (and uses much less memory), it is only reasonably usable for trivial ranges such as this (trivial in terms of prime counting functions).

'''The Less Recursive "Bottom-Up" with a TinyPhi LUT Algorithm'''

Just substitute the following Kotlin code for the `countPrimes` function above to enjoy the gain in speed:

<syntaxhighlight lang="kotlin">val cTinyPhiPrimes = intArrayOf(2, 3, 5, 7, 11, 13)

val cTinyPhiDeg = cTinyPhiPrimes.size - 1

val cTinyPhiOddCirc = cTinyPhiPrimes.reduce(Int::times) / 2

val cTinyPhiTot = cTinyPhiPrimes.fold(1) { s, p -> s * (p - 1) }

fun makeTPLUT(): IntArray {

val arr = IntArray(cTinyPhiOddCirc) { _ -> 1 }

for (bp in cTinyPhiPrimes.drop(1)) {

arr[bp shr 1] = 0

for (c in ((bp * bp) shr 1) until cTinyPhiOddCirc step bp) arr[c] = 0 }

var acc = 0

for (i in 0 until cTinyPhiOddCirc) { acc += arr[i]; arr[i] = acc }

return arr

}

val cTinyPhiLUT = makeTPLUT()

fun tinyPhi(x: Long): Long {

val ndx = (x - 1) shr 1; val numtots = ndx / cTinyPhiOddCirc.toLong()

val rem = (ndx - numtots * cTinyPhiOddCirc.toLong()).toInt()

return numtots * cTinyPhiTot.toLong() + cTinyPhiLUT[rem].toLong()

}

fun countPrimes(lmt: Long): Long {

if (lmt < 169) {

if (lmt < 3) return if (lmt < 2) 0 else 1

// adjust for the missing "degree" base primes

if (lmt <= 13)

return ((lmt - 1).toLong() shr 1) + (if (lmt < 9) 1 else 0);

return 5.toLong() + cTinyPhiLUT[(lmt - 1).toInt() shr 1].toLong();

}

val sqrtlmt = Math.sqrt(lmt.toDouble()).toInt()

fun makePrimes(): IntArray {

val mxndx = (sqrtlmt - 3) / 2

val arr = IntArray(mxndx + 1, { it + it + 3})

var i = 0

while (true) {

val sqri = (i + i) * (i + 3) + 3

if (sqri > mxndx) break

if (arr[i] != 0) {

val bp = i + i + 3

for (c in sqri .. mxndx step bp) arr[c] = 0

}

i++

}

return arr.filter { it != 0 }.toIntArray()

}

val oprms = makePrimes()

fun lvl(pilmt: Int, m: Long): Long {

var acc = 0.toLong()

for (pi in cTinyPhiDeg until pilmt) {

val p = oprms[pi].toLong(); val nm = m * p

if (lmt <= nm * p) return acc + (pilmt - pi).toLong()

val q = (lmt.toDouble() / nm.toDouble()).toLong(); acc += tinyPhi(q)

if (pi > cTinyPhiDeg) acc -= lvl(pi, nm)

}

return acc

}

return tinyPhi(lmt) - lvl(oprms.size, 1) + oprms.size.toLong()

}</syntaxhighlight>

Use of the above function gets a gain in speed of about a further twenty times over the above version due to less recursion and the use of the "Tiny_Phi" Look Up Table (LUT) for smaller "degrees" of primes which greatly reduces the number of integer divisions, which have also been converted to floating point divisions because they are faster although with a lost of precision for counting ranges that one would likely not use this implementation for. This version of prime counting function might be considered to be reasonably useful for counting primes up to a trillion (1e12) in a few tens of seconds.

'''The Non-Recursive Partial Sieving Algorithm'''

The following Kotlin code enjoys the further gain in speed:

<syntaxhighlight lang="kotlin">import java.util.BitSet

fun countPrimes(lmt: Long): Long {

if (lmt < 3) return if (lmt < 2) 0 else 1 // odds only!

fun half(x: Int): Int = (x - 1) shr 1

fun divide(nm: Long, d: Long): Int = (nm.toDouble() / d.toDouble()).toInt()

val sqrtlmt = Math.sqrt(lmt.toDouble()).toLong()

val mxndx = ((sqrtlmt - 1) shr 1).toInt()

val cmpsts = BitSet(mxndx + 1)

val smalls = IntArray(mxndx + 1) { it }

val roughs = IntArray(mxndx + 1) { it + it + 1 }

val larges = LongArray(mxndx + 1) { (lmt / (it + it + 1).toLong() - 1) shr 1 }

// partial sieve loop, adjusting larges/smalls, compressing larges/roughs...

var nobps = 0; var rilmt = mxndx; var bp = 3.toLong()

while (true) {

val i = (bp shr 1).toInt(); val sqri = (i + i) * (i + 1)

if (sqri > mxndx) break

if (!cmpsts.get(i)) { // condition for partial sieving pass for bp is prime

cmpsts.set(i) // cull bp!

for (c in sqri .. mxndx step bp.toInt()) cmpsts.set(c) // cull bp mults!

// now adjust `larges` for latest partial sieve pass...

var ori = 0 // compress input rough index to output one

for (iri in 0 .. rilmt) {

val r = roughs[iri]; val rci = (r shr 1)

if (cmpsts.get(rci)) continue // skip roughs culled in this sieve pass!

val d = bp * r.toLong()

larges[ori] = larges[iri] -

( if (d <= sqrtlmt)

larges[smalls[(d shr 1).toInt()] - nobps]

else smalls[half(divide(lmt, d))].toLong() ) +

nobps.toLong() // base primes count over subtracted!

roughs[ori++] = r

}

var si = mxndx // and adjust `smalls` for latest partial sieve pass...

for (bpm in (sqrtlmt / bp - 1) or 1 downTo bp step 2) {

val c = smalls[(bpm shr 1).toInt()] - nobps

val ei = ((bpm * bp) shr 1).toInt()

while (si >= ei) smalls[si--] -= c

}

nobps++; rilmt = ori - 1

}

bp += 2

}

// combine results to here; correcting for over subtraction in combining...

var ans = larges[0]; for (i in 1 .. rilmt) ans -= larges[i]

ans += (rilmt.toLong() + 1 + 2 * (nobps.toLong() - 1)) * rilmt.toLong() / 2

// add final adjustment for pairs of current roughs to cube root of range...

var ri = 0

while (true) { // break when reaches cube root of counting range...

val p = roughs[++ri].toLong(); val q = lmt / p

val ei = smalls[half(divide(q, p))] - nobps

if (ei <= ri) break // break here when no more pairs!

for (ori in ri + 1 .. ei)

ans += smalls[half(divide(q, roughs[ori].toLong()))].toLong()

ans -= (ei - ri).toLong() * (nobps.toLong() + ri.toLong() - 1)

}

return ans + 1 // add one for only even prime of two!

}

fun main() {

val strt = System.currentTimeMillis()

for (i in 0 .. 9) {

val arg = Math.pow(10.toDouble(), i.toDouble()).toLong()

println("π(10**$i) = ${countPrimes(arg)}")

}

val stop = System.currentTimeMillis()

println("This took ${stop - strt} milliseconds.")

}</syntaxhighlight>

The above code enjoys quite a large gain in speed of about a further ten times (and increasing with range) over the immediately preceding version for larger "non-trivial" ranges (since there is an appreciable environment overhead in initialization and JIT compilation) due to not using recursion at all and the greatly reduced computational complexity of O(n**(3/4)/((log n)**2)) instead of the almost-linear-for-large-counting-ranges O(n/((log n)**2)) for the previous two versions, although the previous two versions differ in performance by a constant factor due to the overheads of recursion and division; for instance, this version can calculate the number of primes to 1e11 in about 82 milliseconds. This version of prime counting function might be considered to be reasonably useful for counting primes up to a 1e16 in about two minutes as long as the computer used has the required free memory of about 800 Megabytes. This Kotlin version is about twice as slow as the Nim version from which it was translated for large ranges where the initialization overhead is not significant due to being run under the Java Virtual Machine (JVM) environment and JIT compiled.

=={{header|Mathematica}}/{{header|Wolfram Language}}==