Parallel calculations: Difference between revisions

Takes the list of numbers and converts them to a <tt>HyperSeq</tt> that is stored in a variable and evaluated concurrently. <tt>HyperSeq</tt>s overload <tt>map</tt> and <tt>grep</tt> to convert and pick values in worker threads. The runtime will pick the number of OS-level threads and assign worker threads to them while avoiding stalling in any part of the program. A <tt>HyperSeq</tt> is lazy, so the computation of values will happen in chunks as they are requested.
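As a minimal sketch of that pattern (the input list and the <tt>grep</tt> predicate here are illustrative placeholders, not part of the task):

```raku
# Hypothetical example: store the HyperSeq in a variable; map/grep on it
# run in worker threads, lazily, in chunks as values are requested.
my @numbers = (^10_000).pick(100);
my $hyper   = @numbers.hyper;           # convert to a HyperSeq
my @primes  = $hyper.grep(&is-prime);   # filtered concurrently
say @primes.elems;
```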
 
The hyper (and race) methods can take two parameters that tweak how the parallelization occurs: :degree and :batch. :degree is the number of worker threads to allocate to the job; by default it is set to the number of physical cores available. If you have a hyper-threading processor and the tasks are not CPU-bound, it may be useful to raise that number, but the default is reasonable. :batch is how many sub-tasks are parceled out at a time to each worker thread; the default is 64. For small numbers of CPU-intensive tasks a lower number will likely be better, but too low may make the dispatch overhead cancel out the benefit of threading; conversely, too high will over-burden some threads and starve others. Over long-running processes with many hundreds or thousands of sub-tasks, the scheduler will automatically adjust the batch size up or down to try to keep the pipeline filled. For small numbers of CPU-intensive tasks (such as this one) it is useful to give it a smaller starting batch size.
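For example, tuning both knobs might look like the following (the values and the <tt>@numbers</tt> variable are illustrative; <tt>prime-factors</tt> is the routine referenced below):

```raku
# :degree caps the number of worker threads; :batch sets the starting
# chunk size. A small batch suits a few CPU-intensive sub-tasks.
my @factored = @numbers.hyper(:degree(4), :batch(3)).map: { .&prime-factors };
```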
 
On my system, under the load I was running, I found a batch size of 3 to be optimal for this task. It may differ on other systems and under other loads.
 
As a relative comparison, perform the same factoring task on the same set of 100 numbers as found in the [[Parallel_calculations#SequenceL|SequenceL]] example, using varying numbers of threads. The absolute speed numbers are not very significant; they will vary greatly between systems. This is intended as a comparison of relative throughput. On a Core i7-4770 @ 3.40GHz with 4 cores and hyper-threading under Linux, there is a distinct pattern where more threads on physical cores give reliable increases in throughput. Adding hyperthreads may (and, in this case, does seem to) give some additional marginal benefit.
 
Using the <tt>prime-factors</tt> routine as defined in the [[Prime_decomposition#Perl_6 |prime decomposition]] task.
 
sub find-factor ( Int $n, $constant = 1 ) {
    return 2 unless $n +& 1;                     # even: 2 is a trivial factor
    # A single gcd against 6541380665835015 (the product of the odd
    # primes up to 43) catches any small factor in one operation.
    if (my $gcd = $n gcd 6541380665835015) > 1 {
        return $gcd if $gcd != $n
    }
    my $x   = 2;
    my $rho = 1;
    # ...
from: 64921987050997300559 71774104902986066597 83448083465633593921 87001033462961102237 89538854889623608177 98421229882942378967
 
Run time: 0.29030032968644
--------------------------------------------------------------------------------
 
Factoring 100 numbers, greatest minimum factor: 782142901
Using: 1 thread
Run time: 0.38817853438752 seconds.
 
Factoring 100 numbers, greatest minimum factor: 782142901
Using: 2 threads
Run time: 0.22852192035372 seconds.
 
Factoring 100 numbers, greatest minimum factor: 782142901
Using: 3 threads
Run time: 0.155527814177834 seconds.
 
Factoring 100 numbers, greatest minimum factor: 782142901
Using: 4 threads
Run time: 0.13021902110738 seconds.
 
Factoring 100 numbers, greatest minimum factor: 782142901
Using: 5 threads
Run time: 0.120657410142434 seconds.
 
Factoring 100 numbers, greatest minimum factor: 782142901
Using: 6 threads
Run time: 0.1328682110954304 seconds.
 
Factoring 100 numbers, greatest minimum factor: 782142901
Using: 7 threads
Run time: 0.1102702097886 seconds.
 
Factoring 100 numbers, greatest minimum factor: 782142901
Using: 8 threads
Run time: 0.10665780927695 seconds.</pre>
 
Beside <tt>HyperSeq</tt> and its (allowed to be) out-of-order equivalent <tt>RaceSeq</tt>, [[Rakudo]] supports primitive threads, locks and high-level promises. Using channels and supplies, values can be moved thread-safely from one thread to another. A react block can be used as a central hub for message passing.
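A minimal sketch of that channel-plus-react pattern (the variable names and messages are illustrative):

```raku
my $channel = Channel.new;
start {                            # producer runs in another thread
    $channel.send($_) for 1 .. 5;
    $channel.close;
}
react {                            # central hub: reacts to incoming messages
    whenever $channel -> $msg {
        say "received $msg";
    }
}
```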