Jump to content

Legendre prime counting function: Difference between revisions

→‎{{header|Haskell}}: make memoized code compile; add non-memoized versions...
m (→‎Non-Memoized Versions: Nim - tuned comments on partial sieving implementation...)
(→‎{{header|Haskell}}: make memoized code compile; add non-memoized versions...)
Line 651:
=={{header|Haskell}}==
 
With Memoization:
Memoization utilities:
<syntaxhighlight lang="haskell">data{-# MemoOPTIONS_GHC a-O2 =-fllvm Node-Wno-incomplete-patterns a (Memo a) (Memo a)#-}
{-# LANGUAGE DeriveFunctor #-}
 
import Data.Time.Clock.POSIX ( getPOSIXTime ) -- for timing
 
import Data.Int ( Int64 )
import Data.Bits ( Bits( shiftL, shiftR ) )
 
data Memo a = EmptyNode | Node a (Memo a) (Memo a)
deriving Functor
 
Line 673 ⟶ 681:
memoList = memo . mkList
where
mkList [] = EmptyNode -- never used; makes complete
mkList (x:xs) = Node x (mkList l) (mkList r)
where (l,r) = split xs
split [] = ([],[])
split [x] = ([x],[])
split (x:y:xs) = let (l,r) = split xs in (x:l, y:r)</syntaxhighlight>
 
isqrt :: Integer -> Integer
Computation of Legendre function:
<syntaxhighlight lang="haskell">isqrt :: Integer -> Integer
isqrt n = go n 0 (q `shiftR` 2)
where
Line 690 ⟶ 698:
else go z (r `shiftR` 1) (q `shiftR` 2)
 
primes :: [Integer]
primes = 2 : _Y ((3:) . gaps 5 . _U . map(\p-> [p*p, p*p+2*p..])) where
_Y g = g (_Y g) -- = g (g (g ( ... ))) non-sharing multistage fixpoint combinator
gaps k s@(c:cs) | k < c = k : gaps (k+2) s -- ~= ([k,k+2..] \\ s)
| otherwise = gaps (k+2) cs -- when null(s\\[k,k+2..])
_U ((x:xs):t) = x : (merge xs . _U . pairs) t -- tree-shaped folding big union
pairs (xs:ys:t) = merge xs ys : pairs t
merge xs@(x:xt) ys@(y:yt) | x < y = x : merge xt ys
| y < x = y : merge xs yt
| otherwise = x : merge xt yt
 
phi :: Integer -> Integer -> Integer
phi = memoize2 phiM
where
phiM x 0 = x
phiM x a = phi x (a-1) - phi (x `div` p a) (a - 1)
 
p = memoList (undefined : primes)
 
Line 703 ⟶ 723:
where a = legendrePi (floor (sqrt (fromInteger n)))
 
main :: IO ()
main = mapM_ (\n -> putStrLn $ show n ++ "\t" ++ show (legendrePi (10^n))) [1..7]</syntaxhighlight>
main = do
strt <- getPOSIXTime
mapM_ (\n -> putStrLn $ show n ++ "\t" ++ show (legendrePi (10^n))) [0..9]
stop <- getPOSIXTime
let elpsd = round $ 1e3 * (stop - strt) :: Int64
putStrLn $ "This last took " ++ show elpsd ++ " milliseconds."</syntaxhighlight>
 
<pre>0 0
1 4
2 25
3 168
4 1229
5 9592
6 78498
7 664579
8 5761455
9 50847534
This last took 20383 milliseconds.</pre>
 
The above code reveals all of the problems with the usual Haskell developer with overuse of the multi-precision `Integer` type when there is no way that the range of a 64-bit integer will ever be exceeded, overuse of lists, developed and tested using the interpreter, etc., such that, even compiled using the LLVM back-end, the above code is about ten times slower that any other compiled language for a memoized version.
 
===Non Memoized Haskell Versions===
 
There doesn't seem to be much point to improving the above memoized version even though one could gain a factor of about ten in doing so as the need for memoization seems to be the result of the task author not realizing that there is a simple way of keeping the computation complexity from having exponential growth with counting range by limiting the "splitting" of the "phi" calculation when the result must be just one. Accordingly, the following Haskell versions are [translated from the non-memoizing Nim versions](https://rosettacode.org/wiki/Legendre_prime_counting_function#Non-Memoized_Versions), which explanations can be referenced for an understanding of how they work:
 
'''The Basic Recursive Legendre Algorithm'''
 
<syntaxhighlight lang="haskell">{-# OPTIONS_GHC -O2 -fllvm #-}
{-# LANGUAGE FlexibleContexts, BangPatterns #-}
 
import Data.Time.Clock.POSIX ( getPOSIXTime ) -- for timing
 
import Data.Int ( Int64 )
import Data.Word ( Word32, Word64 )
import Data.Bits ( Bits( (.&.), (.|.), shiftL, shiftR ) )
import Control.Monad ( unless, when, forM_ )
import Control.Monad.ST ( ST, runST )
import Data.Array.ST ( runSTUArray )
import Data.Array.Base ( UArray(..), IArray(unsafeAt), listArray, elems, assocs,
MArray( unsafeNewArray_, newArray, unsafeRead, unsafeWrite ),
STUArray, unsafeFreezeSTUArray, castSTUArray )
 
countPrimes :: Word64 -> Int64
countPrimes n =
if n < 3 then (if n < 2 then 0 else 1) else
let sqrtn = truncate $ sqrt $ fromIntegral n
qdrtn = truncate $ sqrt $ fromIntegral sqrtn
rtlmt = (sqrtn - 3) `div` 2
qrtlmt = (qdrtn - 3) `div` 2
oddPrimes@(UArray _ _ psz _) = runST $ do -- UArray of odd primes...
cmpsts <- newArray (0, rtlmt) False :: ST s (STUArray s Int Bool)
forM_ [ 0 .. qrtlmt ] $ \ i -> do
t <- unsafeRead cmpsts i
unless t $ do
let sqri = (i + i) * (i + 3) + 3
bp = i + i + 3
forM_ [ sqri, sqri + bp .. rtlmt ] $ \ c ->
unsafeWrite cmpsts c True
fcmpsts <- unsafeFreezeSTUArray cmpsts
let !numoprms = sum $ [ 1 | False <- elems fcmpsts ]
prms = [ fromIntegral $ i + i + 3 | (i, False) <- assocs fcmpsts ]
return $ listArray (0, numoprms - 1) prms :: ST s (UArray Int Word32)
phi x a =
if a < 1 then x - (x `shiftR` 1) else
let na = a - 1
p = fromIntegral $ unsafeAt oddPrimes na in
if x < p then 1 else
phi x na - phi (x `div` p) na
 
in fromIntegral (phi n psz) + fromIntegral psz
 
main :: IO ()
main = do
strt <- getPOSIXTime
mapM_ (\n -> putStrLn $ show n ++ "\t" ++ show (countPrimesx (10^n))) [ 0 .. 9 ]
stop <- getPOSIXTime
let elpsd = round $ 1e3 * (stop - strt) :: Int64
putStrLn $ "This took " ++ show elpsd ++ " milliseconds."</syntaxhighlight>
{{out}}
<pre>0 0
1 4
2 25
3 168
4 1229
5 9592
6 78498
7 664579
8 5761455
9 50847534
This took 273 milliseconds.</pre>
 
The time is as run on an Intel i5-6500 (3.6 GHz single-threaded boost).
 
Although this is much faster than if it were using memoization (and uses much less memory), it is only reasonably usable for trivial ranges such as this (trivial in terms of prime counting functions).
 
'''The Less Recursive "Bottom-Up" with a TinyPhi LUT Algorithm'''
 
Just substitute the following Haskell code for the `countPrimes` function above to enjoy the gain in speed:<syntaxhighlight lang="haskell">cTinyPhiPrimes :: [Int]
cTinyPhiPrimes = [ 2, 3, 5, 7, 11, 13 ]
cC :: Int
cC = length cTinyPhiPrimes - 1
cTinyPhiOddCirc :: Int
cTinyPhiOddCirc = product cTinyPhiPrimes `div` 2
cTinyPhiTot :: Int
cTinyPhiTot = product [ p - 1 | p <- cTinyPhiPrimes ]
cTinyPhiLUT :: UArray Int Word32
cTinyPhiLUT = runSTUArray $ do
ma <- newArray (0, cTinyPhiOddCirc - 1) 1
forM_ (drop 1 cTinyPhiPrimes) $ \ bp -> do
let i = (bp - 1) `shiftR` 1
let sqri = (i + i) * (i + 1)
unsafeWrite ma i 0
forM_ [ sqri, sqri + bp .. cTinyPhiOddCirc - 1 ] $ \ c -> unsafeWrite ma c 0
let tot i acc =
if i >= cTinyPhiOddCirc then return ma else do
v <- unsafeRead ma i
if v == 0 then do unsafeWrite ma i acc; tot (i + 1) acc
else do let nacc = acc + 1
unsafeWrite ma i nacc; tot (i + 1) nacc
tot 0 0
tinyPhi :: Word64 -> Int64
tinyPhi n =
let on = (n - 1) `shiftR` 1
numcyc = on `div` fromIntegral cTinyPhiOddCirc
rem = fromIntegral $ on - numcyc * fromIntegral cTinyPhiOddCirc
in fromIntegral numcyc * fromIntegral cTinyPhiTot +
fromIntegral (unsafeAt cTinyPhiLUT rem)
 
countPrimes :: Word64 -> Int64
countPrimes n =
if n < 3 then (if n < 2 then 0 else 1) else
let sqrtn = truncate $ sqrt $ fromIntegral n
qdrtn = truncate $ sqrt $ fromIntegral sqrtn
rtlmt = (sqrtn - 3) `div` 2
qrtlmt = (qdrtn - 3) `div` 2
oddPrimes@(UArray _ _ psz _) = runST $ do -- UArray of odd primes...
cmpsts <- newArray (0, rtlmt) False :: ST s (STUArray s Int Bool)
forM_ [ 0 .. qrtlmt ] $ \ i -> do
t <- unsafeRead cmpsts i
unless t $ do
let sqri = (i + i) * (i + 3) + 3
bp = i + i + 3
forM_ [ sqri, sqri + bp .. rtlmt ] $ \ c ->
unsafeWrite cmpsts c True
fcmpsts <- unsafeFreezeSTUArray cmpsts
let !numoprms = sum $ [ 1 | False <- elems fcmpsts ]
prms = [ fromIntegral $ i + i + 3 | (i, False) <- assocs fcmpsts ]
return $ listArray (0, numoprms - 1) prms :: ST s (UArray Int Word32)
lvl pi pilmt !m !acc =
if pi >= pilmt then acc else
let p = fromIntegral $ unsafeAt oddPrimes pi
nm = m * p in
if n <= nm * p then acc + fromIntegral (pilmt - pi) else
let !q = fromIntegral $ n `div` nm
!nacc = acc + tinyPhi q
!sacc = if pi <= cC then 0 else lvl cC pi nm 0
in lvl (pi + 1) pilmt m $ nacc - sacc
 
in tinyPhi n - lvl cC psz 1 0 + fromIntegral psz</syntaxhighlight>
 
Use of the above function gets a gain in speed of about a further fifteen times over the above version due to less recursion and the use of the "Tiny_Phi" Look Up Table (LUT) for smaller "degrees" of primes which greatly reduces the number of integer divisions. This version of prime counting function might be considered to be reasonably useful for counting primes up to a trillion (1e12) in a little over ten seconds.
 
'''The Non-Recursive Partial Sieving Algorithm'''
 
Just substitute the following Haskell code for the `countPrimes` function above to enjoy the further gain in speed:
<syntaxhighlight lang="haskell">countPrimesxx :: Word64 -> Int64
countPrimesxx n =
if n < 3 then (if n < 2 then 0 else 1) else
let
{-# INLINE divide #-}
divide :: Word64 -> Word64 -> Int
divide nm d = truncate $ (fromIntegral nm :: Double) / fromIntegral d
{-# INLINE half #-}
half :: Int -> Int
half x = (x - 1) `shiftR` 1
rtlmt = floor $ sqrt (fromIntegral n :: Double)
mxndx = (rtlmt - 1) `div` 2
(!nbps, !nrs, !smalls, !roughs, !larges) = runST $ do
mss <- unsafeNewArray_ (0, mxndx) :: ST s (STUArray s Int Word32)
let msscst =
castSTUArray :: STUArray s Int Word32 -> ST s (STUArray s Int Int64)
mdss <- msscst mss -- for use in adjing counts LUT
forM_ [ 0 .. mxndx ] $ \ i -> unsafeWrite mss i (fromIntegral i)
mrs <- unsafeNewArray_ (0, mxndx) :: ST s (STUArray s Int Word32)
forM_ [ 0 .. mxndx ] $ \ i -> unsafeWrite mrs i (fromIntegral i * 2 + 1)
mls <- unsafeNewArray_ (0, mxndx) :: ST s (STUArray s Int Int64)
forM_ [ 0 .. mxndx ] $ \ i ->
let d = fromIntegral (i + i + 1)
in unsafeWrite mls i (fromIntegral (divide n d - 1) `div` 2)
cmpsts <- unsafeNewArray_ (0, mxndx) :: ST s (STUArray s Int Bool)
 
let loop i !cbpi !rlmti =
let sqri = (i + i) * (i + 1) in
if sqri > mxndx then do
fss <- unsafeFreezeSTUArray mss
frs <- unsafeFreezeSTUArray mrs
fls <- unsafeFreezeSTUArray mls
return (cbpi, rlmti + 1, fss, frs, fls)
else do
v <- unsafeRead cmpsts i
if v then loop (i + 1) cbpi rlmti else do
unsafeWrite cmpsts i True -- cull current bp so not a "k-rough"!
let bp = i + i + 1
-- partial cull by current base prime, bp...
cull c = if c > mxndx then return () else do
unsafeWrite cmpsts c True; cull (c + bp)
 
-- adjust `larges` according to partial sieve...
part ri nri = -- old "rough" index to new one...
if ri > rlmti then return (nri - 1) else do
r <- unsafeRead mrs ri -- "rough" always odd!
t <- unsafeRead cmpsts (fromIntegral r `shiftR` 1)
if t then part (ri + 1) nri else do -- skip newly culled
olv <- unsafeRead mls ri
let m = fromIntegral r * fromIntegral bp
adjv <- if m <= fromIntegral rtlmt then do
let ndx = fromIntegral m `shiftR` 1
sv <- unsafeRead mss ndx
unsafeRead mls (fromIntegral sv - cbpi)
else do
sv <- unsafeRead mss (half (divide n m))
return (fromIntegral sv)
unsafeWrite mls nri (olv - (adjv - fromIntegral cbpi))
unsafeWrite mrs nri r; part (ri + 1) (nri + 1)
!pm0 = ((rtlmt `div` bp) - 1) .|. 1 -- max base prime mult
 
adjc lmti pm = -- adjust smalls according to partial sieve:
if pm < bp then return () else do
c <- unsafeRead mss (pm `shiftR` 1)
let ac = c - fromIntegral cbpi -- correction
bi = (pm * bp) `shiftR` 1 -- start array index
adj si = if si > lmti then adjc (bi - 1) (pm - 2)
else do ov <- unsafeRead mss si
unsafeWrite mss si (ov - ac)
adj (si + 1)
adj bi
dadjc lmti pm =
if pm < bp then return () else do
c <- unsafeRead mss (pm `shiftR` 1)
let ac = c - fromIntegral cbpi -- correction
bi = (pm * bp) `shiftR` 1 -- start array index
ac64 = fromIntegral ac :: Int64
dac = (ac64 `shiftL` 32) .|. ac64
dbi = (bi + 1) `shiftR` 1
dlmti = (lmti - 1) `shiftR` 1
dadj dsi = if dsi > dlmti then return ()
else do dov <- unsafeRead mdss dsi
unsafeWrite mdss dsi (dov - dac)
dadj (dsi + 1)
when (bi .&. 1 /= 0) $ do
ov <- unsafeRead mss bi
unsafeWrite mss bi (ov - ac)
dadj dbi
when (lmti .&. 1 == 0) $ do
ov <- unsafeRead mss lmti
unsafeWrite mss lmti (ov - ac)
adjc (bi - 1) (pm - 2)
cull sqri; nrlmti <- part 0 0
dadjc mxndx pm0
loop (i + 1) (cbpi + 1) nrlmti
loop 1 0 mxndx
 
!ans0 = unsafeAt larges 0 - -- combine all counts; each includes nbps...
sum [ unsafeAt larges i | i <- [ 1 .. nrs - 1 ] ]
-- adjust for all the base prime counts subracted above...
!adj = (nrs + 2 * (nbps - 1)) * (nrs - 1) `div` 2
!adjans0 = ans0 + fromIntegral adj
 
loopr ri !acc = -- o final phi calculation for pairs of larger primes...
let r = fromIntegral (unsafeAt roughs ri)
q = n `div` r
lmtsi = half (fromIntegral (q `div` r))
lmti = fromIntegral (unsafeAt smalls lmtsi) - nbps
addcnt pi !ac =
if pi > lmti then ac else
let p = fromIntegral (unsafeAt roughs pi)
ci = half (fromIntegral (divide q p))
in addcnt (pi + 1) (ac + fromIntegral (unsafeAt smalls ci))
in if lmti <= ri then acc else -- break when up to cube root of range!
-- adjust for the `nbps`'s over added in the `smalls` counts...
let !adj = fromIntegral ((lmti - ri) * (nbps + ri - 1))
in loopr (ri + 1) (addcnt (ri + 1) acc - adj)
in loopr 1 adjans0 + 1 -- add one for only even prime of two!</syntaxhighlight>
 
The above function enjoys quite a large gain in speed of about a further ten times over the immediately preceding version due to not using recursion at all and the greatly reduced computational complexity of O(n**(3/4)/((log n)**2)) instead of the almost-linear-for-large-counting-ranges O(n/((log n)**2)) for the previous two versions, although the previous two versions differ in performance by a constant factor due to the overheads of recursion and division. This version of prime counting function might be considered to be reasonably useful for counting primes up to 1e16 in just over a minute as long as the computer used has the required free memory of about 800 Megabytes, counting the primes to 1e11 in under 30 milliseconds. This Haskell version is about twenty percent slower than the Nim version from which it was translated because Nim's default GCC back-end is better at optimization (at least for this algorithm) than the LLVM back-end used by Haskell here.
<pre>λ> main
1 4
2 25
3 168
4 1229
5 9592
6 78498
7 664579
8 5761455
9 50847534</pre>
 
=={{header|J}}==
474

edits

Cookies help us deliver our services. By using our services, you agree to our use of cookies.