Talk:Sparkline in unicode: Difference between revisions


Revision as of 00:03, 25 February 2019

Most of these are buggy

The wrong way to compute the character index

Anything that uses the number 7 (bins-1 etc.) in the binning assignment has too-wide bin sizes, because the range gets divided into seven intervals instead of eight. The two most common manifestations of the bug are:

When the quotient is truncated (floor/ceil/int), the first or last bin will be only one value wide.
When the quotient is rounded, the first and last bins are each too narrow by half.

The right way to compute the character index

The Go code uses int( 8 * (v-min) / (max-min) ) which works in all cases except when v==max; it deals with that case by clamping values larger than 7 to 7 (for a zero-based array).

The Tcl code gets honorable mention for using int( 8 * (v-min) / ((max-min)*1.01) ), which mostly does the same thing as the Go code. It avoids the need for clamping but gives bins that are 1% too wide, which becomes visible when the range is large. This approach works if the multiplier is larger than 1, smaller than 1 + 1/(max-min), and large enough to not get overwhelmed by floating-point imprecision.
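For anyone who wants to poke at the two formulas, here is a minimal Python sketch of both. It is not the Go or Tcl entry itself. The names `bar_index`, `sparkline` and `widen` are made up for illustration, the flat-series guard is my own assumption, and the 1.01 is assumed to scale the divisor.

```python
BARS = "▁▂▃▄▅▆▇█"

def bar_index(v, lo, hi, widen=1.0):
    """Zero-based bar index for v in [lo, hi].

    widen=1.0 is the truncate-and-clamp form; widen=1.01 mimics the
    widened-divisor variant (assumption: the factor scales the divisor).
    """
    if hi == lo:
        return 0  # assumption: a flat series maps to the lowest bar
    i = int(8 * (v - lo) / ((hi - lo) * widen))
    return min(i, 7)  # with widen=1.0 only v == hi reaches 8

def sparkline(values, widen=1.0):
    lo, hi = min(values), max(values)
    return "".join(BARS[bar_index(v, lo, hi, widen)] for v in values)

print(sparkline([1, 2, 3, 4, 5, 6, 7, 8]))                      # ▁▂▃▄▅▆▇█
print(sparkline([0, 999, 4000, 4999, 7000, 7999], widen=1.01))  # ▁▁▄▅▇█
```

The second line shows the widened divisor pushing 4000 down one bar on a large range, which the clamped form does not do.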

Test cases that detect bugs

0 1 19 20 detects the one-wide bug. Output should be the same as 0 0 1 1 with exactly two heights. The bug looks like ▁▂██ or ▁▁▇█
0 999 4000 4999 7000 7999 detects the half-width bug and some smaller errors (see Tcl). Output should have three heights; the half-width bug looks like: ▁▂▅▅▇█
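The buggy outputs above can be reproduced with a short Python sketch that plugs the bins-1 quotient into ceil, floor and round-half-up. The helper names `render` and `rounded` are hypothetical, and the sketch assumes the input has at least two distinct values.

```python
import math

BARS = "▁▂▃▄▅▆▇█"

def render(values, index):
    """Render with the buggy bins-1 quotient; `index` turns it into an int."""
    lo, hi = min(values), max(values)  # assumes hi > lo
    return "".join(BARS[index(7 * (v - lo) / (hi - lo))] for v in values)

def rounded(q):
    return int(q + 0.5)  # round half up; q is never negative here

one_wide = [0, 1, 19, 20]
half_width = [0, 999, 4000, 4999, 7000, 7999]

print(render(one_wide, math.ceil))   # ▁▂██
print(render(one_wide, math.floor))  # ▁▁▇█
print(render(half_width, rounded))   # ▁▂▅▅▇█
```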

sparktest.pl

This is a Perl one-liner that reports the widths of the same-height runs in a sparkline read from standard input. Lines that contain no sparkline characters are ignored. A sparkline produced from a continuous integer sequence should yield eight equal widths (or almost equal if the sequence length is not a multiple of eight).

perl -CS -Mutf8 -nle '@x=grep $i^=1, map length, /(([▁-█])\2*)/g and print"@x"'

Sample usage (in bash, and assuming program accepts space-separated data on standard input):

alias sparktest=$'perl -CS -Mutf8 -nle \'@x=grep $i^=1, map length, /(([▁-█])\\2*)/g and print"@x"\''
echo {1..8000} | sparkline | sparktest
# expected output is 1000 1000 1000 1000 1000 1000 1000 1000
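For those without Perl handy, a rough Python equivalent of the run-width check might look like this. It is a sketch and not a tested drop-in replacement; `run_widths` is a hypothetical name, and the sketch works on a string instead of reading standard input.

```python
import re

# A run is one block character (U+2581..U+2588) followed by repeats of itself.
RUN = re.compile(r"([▁-█])\1*")

def run_widths(line):
    """Return the lengths of consecutive same-height runs in `line`."""
    return [len(m.group(0)) for m in RUN.finditer(line)]

print(run_widths("▁▁▁▂▂█"))  # [3, 2, 1]
```

A line with no block characters gives an empty list, which corresponds to the Perl version printing nothing for it.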

Not Buggy

Go. Tested up to echo {1..12345} | go run sl.go | sparktest

Buggy

C: ▁▂██
C++: ▁▁▇█
Clojure: ▁▂▅▅▇█
Common Lisp: ▁▂▅▅▇█
D: obvious one-wide bug; didn't run the code
Elixir: ▁▂▅▅▇█
Groovy: one-wide; didn't run
Haskell: looks like half-width bug; didn't run
Java: one-wide; didn't run
Javascript: ▁▂▅▅▇█
jq: one-wide and neglects to check bounds: ▁▃▷►
Nim: Python translation
fixed! ~~Perl~~: ▁▁▇█
fixed! ~~Perl 6~~: ▁▁▇█
PicoLisp: ▁▂▅▅▇█
(half fixed) Python: ▁▁▇█
Ruby: ▁▁▇█
Rust: thread 'main' panicked at 'attempt to subtract with overflow', sl.rust:8:40
Tcl: ▁▁▄▅▇█; not a half-width bug (the second character is correct); manifests only on large ranges; see comments above.

... that's 15 tested, 14 failures, plus 5 didn't-runs that almost certainly have the bug.

--Oopsiedaisy (talk) 08:24, 24 February 2019 (UTC)

Good point. Fixed in the Perl 6 example (and another bug I found while I was at it.) This highlights the perils of developing without a decent test suite. On the other hand, Rosettacode makes no claims about the quality of the code on the site. --Thundergnat (talk) 14:42, 24 February 2019 (UTC)

It's the nature of the beast that an early bug may be faithfully translated many times. I agree that a buggy solution has value despite its flaws. I fixed the Perl example. --Oopsiedaisy (talk) 15:19, 24 February 2019 (UTC)

Thanks Oopsiedaisy. I started the task off with an initial buggy Python solution. Now fixed and with examples extended to show your problem cases. Thanks again. --Paddy3118 (talk) 19:35, 24 February 2019 (UTC)

Bar choices

Hi Tim. There is a problem with your choices of bars in that they have a ragged bottom line:

▁▂▃▄▅▆▇█

There is a problem with my choice of bars in that the highest bar is not full width:

▁▂▃▅▆▇▉▇▆▅▃▂▁

I find the ragged baseline to be much more irritating. How to resolve? --Paddy3118 (talk) 03:18, 18 June 2013 (UTC)

Oh, my font is Courier. --Paddy3118 (talk) 03:42, 18 June 2013 (UTC)

I find that there are quite wide differences in the quality of fonts when it comes to blocks and box elements; in a lot of fonts, the glyphs that should extend to the limits of the glyph box they declare simply don't do so at all. In my limited experimenting, Courier New is considerably better than the others I've tried (Andale Mono, Consolas, Courier, Monaco) for this sort of thing. Not much we can do about that really (except “blame the font makers”, which isn't very helpful). --Donal Fellows (talk) 11:24, 20 June 2013 (UTC)

I now find that there is raggedness in the baseline of my bar choice if I swap to the Consolas font. I think I'll revert to using Tim's seven bars and search for a font, as the Unicode page has nothing to say on this, just:

@@	2580	Block Elements	259F
@		Block elements
2580	UPPER HALF BLOCK
2581	LOWER ONE EIGHTH BLOCK
2582	LOWER ONE QUARTER BLOCK
2583	LOWER THREE EIGHTHS BLOCK
2584	LOWER HALF BLOCK
2585	LOWER FIVE EIGHTHS BLOCK
2586	LOWER THREE QUARTERS BLOCK
2587	LOWER SEVEN EIGHTHS BLOCK

--Paddy3118 (talk) 03:56, 18 June 2013 (UTC)

The baseline is fine in my terminal font, and the baseline problem only manifests in the browser. In any case, if the font is problematic, that's the font's problem, not our problem. Notionally the blocks should have the same baseline, and I'd much rather have a solution that will be correct after they fix the fonts. (Or fix the font aliasing algorithm, which may be what's really going on here.) --TimToady (talk) 07:57, 18 June 2013 (UTC)

Yes, it's the font dealiasing that is doing it. Changing the page's font size up and down moves the fuzz from the bottom to the top, and to different characters. So trying to pick the "right" characters is an exercise in futility, because what's right for you will be wrong for someone else. So just use the eight characters that are supposed to be right, and ignore the baseline issue. --TimToady (talk) 08:02, 18 June 2013 (UTC)

Oh, you already did, never mind. :) --TimToady (talk) 08:10, 18 June 2013 (UTC)

Python query

In the (original) Python entry, obviously some kind of to be or not to be unicode thing, can someone explain the try/except on bar, ta? --Pete Lomax (talk)

It allows the code to work in both Python 2 and Python 3. --Paddy3118 (talk) 10:45, 11 January 2019 (UTC)

Sorry, I didn't mean the try/except on raw_input, but the one on bar (try: bar = u'▁▂▃▄▅▆▇█' except: bar = '▁▂▃▄▅▆▇█'). Following that link, I am certainly closer to understanding, but still slightly adrift. Is it something to do with u'xx' being invalid syntax in 3.0 .. 3.2 but accepted/ignored in 3.3+? --Pete Lomax (talk) 17:33, 11 January 2019 (UTC)

Petelomax: It's true that originally Python 3.x didn't accept the u'...' syntax because normal '...' strings are already Unicode. More recent versions accept the syntax, but the u has no effect. So that might explain the try: there, except that a try/except doesn't do any good for syntax errors. So I'm as puzzled as you are. --Markjreed (talk) 19:05, 11 January 2019 (UTC)
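The point about syntax errors can be checked with a small sketch. Since u'...' is valid again in current Python 3, the snippet uses a different, deliberately broken expression as a stand-in; `SOURCE` and `compiles` are hypothetical names. The whole unit fails to compile, so the try/except inside it never gets a chance to run.

```python
# Stand-in module source: the try body contains a deliberate syntax error.
SOURCE = """\
try:
    bar = 1 +
except Exception:
    bar = 0
"""

def compiles(source):
    """Return True if `source` compiles; the code is never executed."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return False
    return True

print(compiles(SOURCE))  # False
```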
