User talk:Petelomax

From Rosetta Code

Hi, and welcome to RC :-)
(It's nice to get language authors contributions).

Hi, your permutations example seems to be written to answer more than one task but in doing so becomes a not very good RC code example for those tasks as it shows extraneous answers and code not needed in the task. Far better if you used a library or module import system and tailored each example to the task at hand. I have flagged these examples for attention as I don't think it would be good for RC tasks if this unrelated output from examples was to spread.

The Executable library task specifically asks for the use of a library. --Paddy3118 (talk) 06:12, 8 August 2015 (UTC)

Thanks, I assume you meant the deranged anagrams task, which I have fixed. --Pete Lomax (talk)

That's great. Thanks. --Paddy3118 (talk) 11:00, 8 August 2015 (UTC)

I've uploaded http://phix.is-great.org/geshi/phix.php (which just gives you a blank page) to http://phix.is-great.org/geshi/phix.zip in case that helps --Pete Lomax (talk) 08:35, 15 September 2015 (UTC)

Boids[edit]

I've noticed that you've reposted the off-site link to Phix code for Boids. Is there no possibility to post this code on-site? This is after all the whole point of the wiki. Not only because it allows easy code comparisons, but also because it makes it possible for people to change the code. Also all code on Rosetta is posted under the same license. And links can become broken. Rosetta Code really isn't intended as a link directory. Fwend (talk) 14:26, 6 June 2017 (UTC)

OK, done. I also added a screenshot, trust that is acceptable as an off-site link ;-) Pete Lomax (talk) 10:37, 7 June 2017 (UTC)
Thanks, it looks great. :-) Fwend (talk) 20:28, 7 June 2017 (UTC)

Use of Factorization Wheels in Sieves[edit]

As per the comment you made in your contribution to Extensible_prime_generator#Phix, you said "I investigated the use of so-called "wheels", beguiled by the claim that "a 2-3-5-7 wheel saves 77%", until I realised the breakdown was 2: 50%, 3: 16%, 5: 7%, 7: 4% - it is unthinkable not to exclude even numbers, the added complexity (and subscripting) of a 30- or 210- element wheel does not seem worthwhile."

With respect, you are missing something: While it is true that for extremely large ranges of prime approaching infinity, the majority of the gain is derived from using only the low number wheel factors, the benefit is much greater than expected for commonly used ranges. As per this link to the Wikipedia Sieve of Eratosthenes formula and tables, there is a reduction of a factor of about 2.67 using odds-only but a further factor of about 4.16 using the extra 3-5-7 wheel for a sieving range of a hundred thousand as is the requirement of the task, and a factor of about 2.5 for odds-only but a further factor of an extra about 3.26 for a range of about a billion as you calculated (actually you calculated to about two billion). In addition, it is fairly easy to implement pre-culling of the array(s) for the even larger wheel of 11-13-17-19 for a further smaller gain.

You further mention added complexity in subscripting, which isn't necessary; While it is true that there is added complexity in a page segmented version such as this in computation of page start addresses per prime, the actual subscripting for the culling operations doesn't need to be any more complex than for the odds-only case, which could be considered to be just one of the modulo planes out of the total of 48 used for 2-3-5-7 wheel factorization. Part of your perceived subscripting complexity may be because you didn't realize the optimization of reduced memory use using factorization: you can actually reduce the size of your "sieve" culling array by a factor of two as the even values aren't used; a further similar reduction of over a further factor of two is gained by extended wheels, with only the added complexity of starting addresses required at the beginning of each page.

BTW, about 27.5 seconds to calculate the 100 millionth prime for odds-only isn't particularly fast: even using odds-only Julia can calculate this range about ten times faster using an equivalent page segmented technique, as it looks like the techniques of the bit operations required to do bit packing by bytes to make maximal use of the CPU L1 cache are available in Phix, which reduces memory use by a further factor of eight; this is without the more complex techniques of implementing modulo bit planes for efficient wheel factorizations as discussed above for a further reduction of about four times, then multi-processing can be added... --GordonBGood (talk) 05:25, 13 November 2018 (UTC)

Sorry for not getting back sooner, just before you posted my internet went down, and stayed that way for nearly a pigging month.
I took another look, and concede that I am probably missing something, but still not quite convinced. Anyway, looking forward to seeing your first phix contribution :-)
FYI, my mention of subscripting is biased by my internal knowledge of the workings of phix. While it is heavily optimised, and rarely any cause for concern, phix is reference-counted and checking whether we need to do any reference counting can put it at a disadvantage when pitted against C-based subscripting, plus although I inline whenever possible, sometimes subscripting involves a call/return which C-based languages never do. Also, I did not mean subscripting the sieve, but was referring to the subscripting of the wheel itself.
BTW, running the latest julia code, the Euler problem 10, after correcting a typo, I found it was about the same time as phix, whereas the older julia code (plus one extra line and a few global directives so it would actually run) took over seven minutes to display the 100,000,000th prime - 15 times slower than phix, looks like the standard "using Primes" could be improved somewhat. However, the one billion bench from the Page Segmented Algorithm was indeed about 18 times faster. Maybe one day I'll find the time to give that method a spin, unless you beat me to it. --Pete Lomax (talk) 15:50, 16 December 2018 (UTC)
No problem about not getting back; I posted this to you to maybe help your understanding of the Sieve.
As to my use of your Phix language, it probably won't happen anytime soon as I have a toolbox full of languages and although it looks to be quite simple, I don't know about working around its limited data types. One thing I would really miss is the notion of first class function/procedure type as that would then allow one to defer execution, define lazy types, and not be so reliant on memory consuming large sequences. I see this, just as about anything else, can be worked around, in this case by using call id's, but it certainly isn't that beautiful as compared to languages that are designed to be more functional. I see that just about anything is possible in Phix, but I also wonder what kind of performance we might get if we do all of that.
From your documentation, it isn't clear to me about the internal memory structure of sequences; is it a linked list or a simple array in memory? I assume you lose the extra MSB of integer to be able to distinguish between atom/integer... At any rate, any limitations on the use of sequences could be worked around by direct memory manipulations, again at the cost of beautiful code.
a simple array, yes, and yes - you could even resort to inline assembly. --Pete Lomax (talk) 15:33, 17 December 2018 (UTC)
On the other hand, although Julia isn't at all my favorite language and I don't really like dynamic typing, it can be quite fast for doing a lot of things, but even at that it isn't usually my language of choice for any job at hand.
The "older Julia code" for the Extensible Generator isn't my code, and I just added a couple of alternate versions as I perceived its deficiencies. However, although the prime probability test that the Primes library uses is less complex and faster than testing for individual primes using trial division as some libraries do, it isn't really meant to be used as a generator of a sequence of primes. I think that you'll find that my alternate 1 is almost as slow although it better satisfies the requirements of being a true generator (in only one line of code).
As to speeding up your Phix code, you are running into some limitations of the language here in implementing page segmentation: it seems you only have an integer type ("almost" 64-bit on 64-bit machines else using 64-bit "atom" as an integer) and can only shift right/left by dividing/multiplying by powers of two, which is slow unless your compiler optimizes this into machine shift opcodes. I was curious about this so actually wrote a quite test of divide speed, and as expected it's quite terrible at a couple of hundred machine cycles per divide on my machine. I'm afraid that your language will never be able to even match JavaScript languages (which also use a float as an integer, but have proper shift operations) in running a Sieve, which can be done only about three times slower than Julia (say with Fable, a really nice F# language implementation).
If you didn't have this limitation, you could implement "bit packing" and thereby achieve better cache locality in page segmentation as one CPU L2 cache size of say 16 Kilobytes could then contain 131,072 prime representations instead of 2048 using 64-bit integers. The Julia code uses a further optimization that requires addressing memory efficiently by bit-packed bytes to get a further speed-up of a factor of two which wouldn't appear to be possible in Phix (or at least not efficiently). Good luck on this!--GordonBGood (talk) 05:15, 17 December 2018 (UTC)
There are in fact some attempts to use shifts for divide/multiply, if that can be determined at compile-time, will see if I can do more. I totally agree there are some performance issues in bit-fiddling (just like subscripts), and the language would no doubt benefit from a new set of builtins that do such things much faster on blocks of pre-allocated memory, then again my personal definition of "performance" is kinda much more biased towards a fast edit/run cycle. Translation to JavaScript is in fact pretty much number 1 on my wish list for the lanugage, btw. --Pete Lomax (talk) 15:33, 17 December 2018 (UTC)
I seems that builtins are function calls and are not inlined. Thus, although adding `set_bit`, `clear_bit`, `test_bit` `shl_bits`, `shr_bits`, and `shrzf_bits` would make "bit-twiddling" easier to do, it would still take two function calls per cull to implement bit-packed culling, which would be about 120 machine cycles and too slow. You need to be able to specify simple builtins as "inline" to be fast enough. Even having the compiler recognize when multiply/divide factors are powers of two (actually quite easy, as then `and_bits(x, x-1) == 0` after passing the not zero checks when literal constants), the bitwise operations using function calls are too slow unless inlined.
Agreed. Many/most builtins are indeed just lazily implemented as hll code, whereas some I try to inline, like and_bits(), remainder(), compare(), length(), etc. Oh yeah, there's another crafty trick I missed.
As an aside, `without type_check` doesn't seem to work or I don't understand what it is supposed to do, it still detects `integer i = 1/2 as an error without the check; also, how to you turn off integer overflow checking? I can't seem to enter `-#3FFFFFFF - 1` as an integer, or rather it is accepted as an integer but prints as a float and not exact at that as I understood to be possible from the documentation, and, due to integer overflow checking subtracting another one from it triggers the type check error. Flaky... And overflow checks (as well as type checks) take machine cycles.
Ah, `without type_check` only disables user-defined type routines, not internal checking - will clarify that in the manual. I wasn't able to reproduce any issue with -#3FFFFFFF - 1. I'm a fail-fast guy at heart, so I don't really understand/agree with the word Flaky, and there is no option, or plan for one, to make your program harder to maintain just so it can go a little faster (sorry).
I can understand that compilation speed is a concern and in fact is already slow on my low end machine - about 30 seconds for a `-c` pass of a basic test program, where my machine is over twice as slow as yours given the speed to run the extensible generator as a benchmark. But some improvements such as making designated builtin functions inline and (possibly) adding the slightly more sophisticated power of two multiply/divide shouldn't take much time.
It has only just occurred to me: 30s is indeed quite a long time & I'll bet what's happening is your a.v. is kicking in and giving the resultant exe a good once-over. After a few goes it should get bored with that. --Pete Lomax (talk) 16:47, 29 December 2018 (UTC)
Yes, I have experienced AV slowing expected performance before on occasion, and it seems you are right in this instance: Windows Defender (on Windows 10) is doing some kind of extensive scan triggered every time I start a compile cycle (no matter how trivial) and can't really get started until that scan is finished, which is probably the majority of the 30 seconds. It wasn't expected that AV would be triggered on every invocation.--GordonBGood (talk) 06:45, 31 December 2018 (UTC)
"performance issues on subscripting" I suppose means indexing of sequences? Not inlined? bounds checking is slow? Bounds checking can't be optionally turned off?
Inlined and disabled when known to be in bounds, but no option to forcibly turn it off.
As to generating JavaScript output, if it were me at the stage I understand you to be, I wouldn't bother with JavaScript and would skip directly to Web Assembly output as you don't much require anything that Web Assembly doesn't already deliver since you already use reference counting memory allocation, including multi-threading (or at least coming very soon, and already available on Chrome). Performance is not particularly fast (no faster than JavaScript for many cases) and somewhat inconsistent performance across browsers (although, so too is JavaScript), but that will surely get fixed by the time you have completed this.
I ran away screaming (sounds familiar, tee hee) from Web Assembly last time, maybe I'll give it another look.
In short, Phix seems to me to be too much of a "work in progress" and as it currently exists not worth my while in pursuing further adaptation of my algorithms to it. However, as before, good luck to you, and I may look at it again if some of these issues are fixed.--GordonBGood (talk) 05:18, 18 December 2018 (UTC)
No worries. There never was and probably never will be a programming language with no downsides, some we can live with, some we cannot. Many thanks for the input, will try to bear it all in mind as I move onward. --Pete Lomax (talk) 00:53, 20 December 2018 (UTC)
Sorry, I was mistaken: there is no bug on the `-0x40000000` or `-0x3FFFFFFF - 1`, rather I got caught by there being a difference between using `?` used as a shorthand to print to screen which adds a "newline" automatically and `print(1,...)` which does not automatically add the newline. I saw the next value printed immediately beside, which was a float, to be part of the integer printed. Your documentation on `?` does mention this difference but I hadn't read it.
As to language safety, blocking "subscripts" when they are out of range of a "sequence" is commonly done as part of language safety, but blocking integer overflows that can't be turned off is not or at least can commonly be turned off, as it doesn't cause a crash or a security hole but only potentially an error in programming logic. There are quite a few languages that do detect it, but most of the ones that I know have a way of turning it off. Anyway, not a big issue as there are ways of compensating when it's desired, but without inlined bit manipulations they may be expensive in performance.
As to language preferences, yes, lately I have come to prefer functional languages, which Phix in no way pretends to be functional as it is hard to imagine a language which is more imperative. I have been tempted to try Elixir (which might be considered to be a language of similar capabilities as Phix, but functional) for that reason, but which again I don't expect to be a highly performant language - just interesting...--GordonBGood (talk) 06:39, 20 December 2018 (UTC)

Modular_arithmetic[edit]

Hi Pete, I found f(10) to be 1E100, your program outputs 1e101, are you sure yours is correct? Dedalus (talk) 15:04, 15 February 2019 (UTC)

I just ran it again and the output was 1e100, I must have pasted it in wrong. Well spotted, thanks. --Pete Lomax (talk) 17:20, 15 February 2019 (UTC)