File size distribution: Difference between revisions

m (→‎{{header|Haskell}}: fix non terminating condition)
Line 391:
</pre>
=={{header|Haskell}}==
Uses a grouped frequency distribution. Program arguments are optional: a starting directory and an initial frequency distribution group size. Groups with a count of 0 are removed. After the first frequency distribution is computed, any group that holds more than 25% of the total file count is broken down further, when possible.
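As a rough illustration of the grouping step described above, here is a minimal sketch that buckets a list of file sizes into fixed-width groups and drops the empty ones. The names `Group`, `groupSizes` and the `width` parameter are hypothetical and are not taken from the program below; the real program derives its group width from the range of sizes and the requested number of groups.

```haskell
-- A group is ((lower bound, upper bound), number of files in that range).
-- Hypothetical type for illustration only.
type Group = ((Integer, Integer), Int)

-- Bucket file sizes into consecutive groups of the given width,
-- starting at 0, and discard groups that contain no files.
groupSizes :: Integer -> [Integer] -> [Group]
groupSizes width sizes
  | width <= 0 || null sizes = []   -- avoid an endless 'iterate' and 'maximum []'
  | otherwise = filter ((> 0) . snd) (map count bounds)
  where
    -- takeWhile stops the infinite list once the largest size is covered
    starts = takeWhile (<= maximum sizes) (iterate (+ width) 0)
    bounds = [(lo, lo + width - 1) | lo <- starts]
    count b@(lo, hi) = (b, length (filter (\s -> s >= lo && s <= hi) sizes))
```

For example, `groupSizes 10 [0, 5, 9, 10, 35]` yields three groups, because the empty 20..29 range is removed.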
<lang haskell>{-# LANGUAGE TupleSections, LambdaCase #-}
 
Line 418:
range = maximum xs - minimum xs
groupSize = succ $ ceiling $ realToFrac range / realToFrac totalGroups
groups = takeWhile (<=groupSize + maximum xs) $ iterate (+groupSize) 0
groupMinMax = (,0) <$> zip groups (pred <$> tail groups)
 
Line 426:
if d >= min && d <= max
then ((min, max), succ count)
else g) gs
) gs
 
 
fileSizes :: [Item] -> [Integer]
Line 460 ⟶ 462:
 
expandGroups :: Int -> [Integer] -> Integer -> [FrequencyGroup] -> [FrequencyGroup]
expandGroups gsize fileSizes groupThreshold = loop 15
where
loop 0 gs = gs -- break out in case we can't go below threshold
loop n gs
| all ((<= groupThreshold) . snd) gs = gs
| otherwise = loop (pred n) $ expand gs
 
expand = ((\g@((min, max), count) ->
if count > groupThreshold then
Line 505 ⟶ 510:
percentage :: Double
percentage = (realToFrac count / realToFrac filesCount) * 100
bars = replicate (round $ percentage * 4) '█' -- % of 25% (max group size)
 
parseArgs :: [String] -> Either String (FilePath, Int)
Line 558 ⟶ 563:
. initialGroups gsize</lang>
{{out}}
<pre style="height: 50rem;">$ filedist ~/Music
Using 4 worker threads
Total files: 688
Total folders: 663
Total size: 986MB
 
Distribution:
0B <-> 80B = 7 1.017%: █
81B <-> 161B = 74 10.756%: ███████████
162B <-> 242B = 112 16.279%: ████████████████
243B <-> 323B = 99 14.390%: ██████████████
322B <-> 643B = 23 3.343%: ███
644B <-> 965B = 2 0.291%:
966B <-> 1KB = 1 0.145%:
3KB <-> 6KB = 12 1.744%: ██
6KB <-> 10KB = 22 3.198%: ███
10KB <-> 13KB = 12 1.744%: ██
14KB <-> 27KB = 15 2.180%: ██
27KB <-> 41KB = 6 0.872%: █
41KB <-> 54KB = 22 3.198%: ███
54KB <-> 108KB = 99 14.390%: ██████████████
108KB <-> 163KB = 23 3.343%: ███
163KB <-> 217KB = 8 1.163%: █
236KB <-> 473KB = 3 0.436%:
709KB <-> 946KB = 44 6.395%: ██████
3MB <-> 5MB = 4 0.581%: █
5MB <-> 7MB = 21 3.052%: ███
7MB <-> 13MB = 72 10.465%: ██████████
13MB <-> 20MB = 6 0.872%: █
20MB <-> 27MB = 1 0.145%:
 
$ filedist ~/Music 10
Using 4 worker threads
Total files: 688
Total folders: 663
Total size: 986MB

Distribution:
0B <-> 88B = 7 1.017%: █
89B <-> 177B = 75 10.901%: ███████████
178B <-> 266B = 156 22.674%: ███████████████████████
267B <-> 355B = 57 8.285%: ████████
356B <-> 444B = 20 2.907%: ███
801B <-> 889B = 2 0.291%:
959B <-> 2KB = 1 0.145%:
4KB <-> 5KB = 1 0.145%:
5KB <-> 6KB = 1 0.145%:
6KB <-> 7KB = 11 1.599%: ██
7KB <-> 7KB = 10 1.453%: █
7KB <-> 8KB = 4 0.581%: █
8KB <-> 9KB = 7 1.017%: █
9KB <-> 19KB = 21 3.052%: ███
19KB <-> 28KB = 6 0.872%: █
28KB <-> 38KB = 4 0.581%: █
38KB <-> 47KB = 12 1.744%: ██
47KB <-> 57KB = 16 2.326%: ██
57KB <-> 66KB = 23 3.343%: ███
66KB <-> 75KB = 26 3.779%: ████
75KB <-> 85KB = 15 2.180%: ██
85KB <-> 94KB = 17 2.471%: ██
95KB <-> 189KB = 42 6.105%: ██████
189KB <-> 284KB = 4 0.581%: █
284KB <-> 378KB = 2 0.291%:
851KB <-> 946KB = 44 6.395%: ██████
3MB <-> 5MB = 5 0.727%: █
5MB <-> 8MB = 41 5.959%: ██████
8MB <-> 11MB = 35 5.087%: █████
11MB <-> 13MB = 16 2.326%: ██
13MB <-> 16MB = 3 0.436%:
16MB <-> 19MB = 3 0.436%:
24MB <-> 27MB = 1 0.145%:
 
</pre>
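The change in this revision bounds the group-expansion loop so it cannot run forever when a group can never be brought under the threshold (for instance, when many files share the exact same size). The following is a minimal sketch of that idea, not the program's own code: `splitGroup`, `expandBounded` and the halving strategy are assumptions made for illustration, since the actual `expand` function is only partly shown in the diff.

```haskell
-- ((lower bound, upper bound), number of files in that range).
-- Hypothetical type for illustration only.
type Group = ((Integer, Integer), Int)

-- Split a group into two halves, recount from the raw sizes,
-- and drop any half that ends up empty.
splitGroup :: [Integer] -> Group -> [Group]
splitGroup sizes g@((lo, hi), _)
  | hi <= lo  = [g]   -- a single-value range cannot be split further
  | otherwise = filter ((> 0) . snd) [count (lo, mid), count (mid + 1, hi)]
  where
    mid = (lo + hi) `div` 2
    count b@(a, z) = (b, length (filter (\s -> s >= a && s <= z) sizes))

-- Keep splitting oversized groups, but give up after a fixed number
-- of passes so that unsplittable groups cannot cause endless recursion.
expandBounded :: Int -> Int -> [Integer] -> [Group] -> [Group]
expandBounded passes threshold sizes = loop passes
  where
    loop n gs
      | n <= 0                        = gs   -- pass budget exhausted
      | all ((<= threshold) . snd) gs = gs   -- every group is small enough
      | otherwise                     = loop (n - 1) (concatMap step gs)
    step g
      | snd g > threshold = splitGroup sizes g
      | otherwise         = [g]
```

With ten files of identical size 5 and a threshold of 2, `expandBounded 15 2 (replicate 10 5) [((0, 9), 10)]` narrows the range down to `(5, 5)` and then stops once the pass budget runs out, instead of recursing indefinitely.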
 