Unicode polynomial equation: Difference between revisions

Either one, or all of these approaches are accepted and appear as a subtitle.
[[Category:Mathematics]]
 
=={{header|Phix}}==
To simplify this task I first created a test file (save as utf8, Unicode_polynomial_equation.txt):
<pre>
-0.00x⁺¹⁰ + 1.0·x ** 5 + -2e0x^4 + +0,042.00 × x ⁺³ + +.0x² + 20.000 000 000x¹ - -1x⁺⁰ + .0x⁻¹ + 20.x¹
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
===> x^5 - 2x^4 + 42x^3 + 40x + 1
x⁵ - 2x⁴ + 42x³ + 0x² + 40x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
0e+0x⁰⁰⁷ + 00e-00x + 0x + .0x⁰⁵ - 0.x⁴ + 0×x³ + 0x⁻⁰ + 0/x + 0/x³ + 0x⁻⁵
==> 0
1x⁵ - 2x⁴ + 42x³ + 40x + 1x⁰
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
+x⁺⁵ + -2x⁻⁻⁴ + 42x⁺⁺³ + +40x - -1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
x^5 - 2x**4 + 42x^3 + 40x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
x↑5 - 2.00·x⁴ + 42.00·x³ + 40.00·x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
x⁻⁵ - 2⁄x⁴ + 42x⁻³ + 40/x + 1x⁻⁰
==> 1 + 40x⁻¹ + 42x⁻³ - 2x⁻⁴ + x⁻⁵
===> 1 + 40x^-1 + 42x^-3 - 2x^-4 + x^-5
x⁵ - 2x⁴ + 42.000 000x³ + 40x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
x⁵ - 2x⁴ + 0,042x³ + 40.000,000x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
0x⁷ + 10x + 10x + x⁵ - 2x⁴ + 42x³ + 20x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
1E0x⁵ - 2,000,000.e-6x⁴ + 4.2⏨1x³ + .40e+2x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
x⁵ - x⁴⁄2 + 405x³⁄4 + 403x⁄4 + 5⁄2
==> x⁵ - ½x⁴ + 101¼x³ + 100¾x + 2½
===> x^5 - 0.5x^4 + 101.25x^3 + 100.75x + 2.5
x⁵ - ½x⁴ + 101¼x³ + 100¾x + 2½
==> x⁵ - ½x⁴ + 101¼x³ + 100¾x + 2½
x<sup>5</sup> - 2x<sup>4</sup> + 42x<sup>3</sup> + 40x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
x^5 - 2x^4 + 42x^3 + 40x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
x⁵ - 2x⁴ + 42x³ + 40x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
x5 - 2x4 + 42x3 + 40x + 1
==> x⁵ - 2x⁴ + 42x³ + 40x + 1
</pre>
Explanation: <br>
On finding a line beginning with "==> ", the program parses the previous line and checks that the output matches the rest of that line.<br>
Lines beginning with "===> " hold an alternative (valid) ASCII representation of the line before them, for use when uni_frac is false. These equivalences are then reused further on: for example, the result on line 5 maps to line 3 via line 2. It does no harm if the same equivalent is defined twice. Note that "==> " should always introduce the Unicode version and "===> " the ASCII one.
 
Obviously this is not well tested and //will// fail on the next thing thrown at it, but
it shows the basic approach. Error handling omitted using that good ol' standby excuse
"for clarity".
 
Note that space is not skipped when getting exponents, otherwise it might wrongly treat
say "x + 2" as "x^2", since we /are/ allowing exponentiation symbols to be omitted. At
the start of parse() we strip spaces around '^' (etc) to help with that.
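
As a minimal sketch of that normalisation step (not part of the program below, and the sample string is made up for illustration), the two match_replace() calls that parse() applies to '^' can be tried in isolation:

<lang Phix>sequence s = "x ^ 2 + x + 2"    -- hypothetical input, for illustration only
s = match_replace(" ^",s,"^")   -- drop the space before '^'
s = match_replace("^ ",s,"^")   -- drop the space after '^'
printf(1,"%s\n",{s})            -- x^2 + x + 2</lang>

The space in "x + 2" is left alone, so the trailing "+ 2" is still read as a separate term rather than as an exponent.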
 
Processing the input in UTF-32 form is the obvious choice. Constructing the output directly as UTF-8 is a little more subtle, but it avoids ending up with a UTF-32 "string" merely because every element happens to fit in a byte, which would not necessarily be valid UTF-8/32.
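
The round trip between the two encodings can be illustrated with the same builtins the program relies on, utf8_to_utf32() and utf32_to_utf8(). This is only a sketch, and the variable names are made up for illustration:

<lang Phix>string u8 = utf32_to_utf8({'x',#2075}) -- 'x' followed by superscript five
sequence u32 = utf8_to_utf32(u8)
?length(u8)         -- 4: one byte for 'x', three for U+2075
?length(u32)        -- 2: one element per code point
?(u32={'x',#2075})  -- 1 (true): the code points survive the round trip</lang>

Working on u32 lets the parser index one character at a time, whereas indexing u8 would land in the middle of a multi-byte sequence.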
 
<lang Phix>-- demo\rosetta\Unicode_polynomial_equation.exw
constant uni_frac = false -- if true output unicode superscripts and vulgar fractions
 
constant UTF8BOM = {#EF,#BB,#BF},
         SPTWO = #00B2, -- superscript two
         STHRE = #00B3, -- superscript three
         MIDOT = #00B7, -- middle dot
         SPONE = #00B9, -- superscript one
         QUART = #00BC, -- one quarter
         AHALF = #00BD, -- one half
         THQTR = #00BE, -- three quarters
         MULTY = #00D7, -- multiplication sign
         DIVDE = #2044, -- fraction slash
         SZERO = #2070, -- superscript zero
         SFOUR = #2074, -- superscript four
         SFIVE = #2075, -- superscript five
         SPSIX = #2076, -- superscript six
         SSEVN = #2077, -- superscript seven
         SEGHT = #2078, -- superscript eight
         SNINE = #2079, -- superscript nine
         SPLUS = #207A, -- superscript plus
         SMNUS = #207B, -- superscript minus
         SVNTH = #2150, -- one seventh
         NINTH = #2151, -- one ninth
         TENTH = #2152, -- one tenth
         THIRD = #2153, -- one third
         TWTHD = #2154, -- two thirds
         FIFTH = #2155, -- one fifth
         TWFTH = #2156, -- two fifths
         THFTH = #2157, -- three fifths
         FRFTH = #2158, -- four fifths
         SIXTH = #2159, -- one sixth
         FVSIX = #215A, -- five sixths
         EIGTH = #215B, -- one eighth
         THEGH = #215C, -- three eighths
         FVEGH = #215D, -- five eighths
--       ZTHRD = #2189, -- zero thirds[??]
         UPARW = #2191, -- uparrow
         BASET = #23E8, -- decimal exponent symbol (base 10)
         SPACE = ' ', -- space
         T = 10, -- align nxt tbl
         $
 
constant {vulgar_fractions,unicode_vulgar_fractions} = columnize({{{1,4},QUART},
                                                                  {{1,2},AHALF},
                                                                  {{3,4},THQTR},
                                                                  {{1,7},SVNTH},
                                                                  {{1,9},NINTH},
                                                                  {{1,T},TENTH},
                                                                  {{1,3},THIRD},
                                                                  {{2,3},TWTHD},
                                                                  {{1,5},FIFTH},
                                                                  {{2,5},TWFTH},
                                                                  {{3,5},THFTH},
                                                                  {{4,5},FRFTH},
                                                                  {{1,6},SIXTH},
                                                                  {{5,6},FVSIX},
                                                                  {{1,8},EIGTH},
                                                                  {{3,8},THEGH},
                                                                  {{5,8},FVEGH}})
 
constant EXPONENTS = {SZERO,SPONE,SPTWO,STHRE,SFOUR,SFIVE,SPSIX,SSEVN,SEGHT,SNINE}
 
function skip(sequence s, integer sdx, sequence set)
    while sdx<=length(s)
      and find(s[sdx],set) do
        sdx += 1
    end while
    return sdx
end function
 
function get_sign(sequence s, integer sdx, bool allow_superscripts)
    integer sgn = +1, ch
    for sdx=sdx to length(s) do
        ch = s[sdx]
        if allow_superscripts then
            ch = iff(ch=SPLUS?'+':
                 iff(ch=SMNUS?'-':
                 iff(ch=SPACE?'?':ch))) -- (do not skip spaces, see note)
        end if
        if ch!='+'
        and ch!=' ' then
            if ch!='-' then exit end if
            sgn *= -1
        end if
    end for
    return {sgn,sdx}
end function
 
function get_num(sequence s, integer sdx, atom n=0, bool allow_superscripts=false, as_fraction=false)
    integer sgn = +1, ch, f, e10, d = 1
    atom p10 = 10
    bool none = not as_fraction -- (cope with "x" == "1x^1" and != "0x^0")
                                -- (but not when processing the "34" of "12.34", obvs)
    if not as_fraction then
        {sgn,sdx} = get_sign(s,sdx,allow_superscripts)
    end if
    if not allow_superscripts then
        sdx = skip(s,sdx,{' '})
    end if
    for sdx=sdx to length(s) do
        ch = s[sdx]
        if ch>='0' and ch<='9' then
            if as_fraction then
                n += (ch-'0')/p10
                p10 *= 10
            else
                n = n*10+ch-'0'
            end if
            none = false
        elsif allow_superscripts then
            f = find(ch,EXPONENTS)
            if f=0 then exit end if
            n = n*10+f-1
            none = false
        elsif not find(ch," ,") then
            exit
        end if
    end for
    if not allow_superscripts then
        if find(ch,{'e','E',BASET}) then
            {e10,f,sdx} = get_num(s,sdx+1)
            if f!=1 then ?9/0 end if
            n *= power(10,e10)
        elsif ch='.' then
            if as_fraction then ?9/0 end if
            {n,f,sdx} = get_num(s,sdx+1,n,as_fraction:=true)
            if f!=1 then ?9/0 end if
            none = false
        else
            f = find(ch,unicode_vulgar_fractions)
            if f!=0 then
                if as_fraction then ?9/0 end if
                integer {vfn,vfd} = vulgar_fractions[f]
                if uni_frac then
                    n = n*vfd + vfn
                    if d!=1 then ?9/0 end if
                    d = vfd
                else
                    n += vfn/vfd
                end if
                sdx += 1
                none = false
            end if
        end if
    end if
    if none then n = 1 end if
    n *= sgn
    return {n,d,sdx}
end function
 
function get_term(sequence s, integer sdx)
    integer last_sdx = sdx, -- (infinite loop check/prevent)
            e = 0
    atom c, d = 1, f
    bool sdiv = false
    {c,d,sdx} = get_num(s,sdx)
    sdx = skip(s,sdx,{' ',MIDOT,MULTY})
    if sdx<=length(s)
    and (s[sdx]='/' or
         s[sdx]=DIVDE) then
        sdx += 1
        if sdx<=length(s) and s[sdx]!='x' then
            {d,f,sdx} = get_num(s,sdx)
            if f!=1 then ?9/0 end if
            if not uni_frac then
                c /= d
                d = 1
            end if
        else
            sdiv = true
        end if
    end if
    if sdx<=length(s) and s[sdx]='x' then
        sdx = skip(s,sdx+1,{'^','*',SPLUS,UPARW})
        {e,f,sdx} = get_num(s,sdx,allow_superscripts:=true)
        if f!=1 then ?9/0 end if
        if sdiv then e = -e end if
    else
        if sdiv then ?9/0 end if
    end if
    if sdx<=length(s)
    and (s[sdx]='/' or
         s[sdx]=DIVDE) then
        if d!=1 then ?9/0 end if
        {d,f,sdx} = get_num(s,sdx+1)
        if f!=1 then ?9/0 end if
        if not uni_frac then
            c /= d
            d = 1
        end if
    end if
    if sdx=last_sdx then ?9/0 end if
    return {e,c,d,sdx}
end function
 
function unicode_superscripts(integer e)
-- display helper
    string res = ""
    if e>9 then
        res = unicode_superscripts(floor(e/10))
        e = remainder(e,10)
    end if
    res &= utf32_to_utf8({EXPONENTS[e+1]})
    return res
end function
 
enum EXP, COEF, FRAC -- contents of terms[i]
 
function poly(sequence terms)
-- display helper
    string r = ""
    for t=length(terms) to 1 by -1 do
        {integer e, atom c, integer f} = terms[t]
        if c!=0 then
            if c=1 and f=1 and e!=0 then
                r &= iff(r=""? "":" + ")
            elsif c=-1 and f=1 and e!=0 then
                r &= iff(r=""?"-":" - ")
            else
                if r!="" then
                    r &= iff(c<0?" - ":" + ")
                    c = abs(c)
                end if
                if f!=1 then -- (hence/only when uni_frac==true)
                    integer k = find({remainder(c,f),f},vulgar_fractions)
                    if k then
                        c = floor(c/f)
                        if c!=0 then r &= sprintf("%d",c) end if
                        r &= utf32_to_utf8({unicode_vulgar_fractions[k]})
                    else
                        r &= sprintf("%g",c/f)
                    end if
                else
                    r &= sprintf("%g",c)
                end if
            end if
            if e!=0 then
                r &= 'x'
                if e!=1 then
                    if uni_frac then
                        if e<0 then
                            r &= utf32_to_utf8({SMNUS})
                            e = -e
                        end if
                        r &= unicode_superscripts(e)
                    else
                        r &= sprintf("^%d",e)
                    end if
                end if
            end if
        end if
    end for
    if r="" then r="0" end if
    return r
end function
 
function parse(sequence s)
    sequence terms = {}
    integer sdx = 1, e, f
    atom c
    s = match_replace("<sup>",s,"^")
    s = match_replace("</sup>",s,"")
    s = match_replace("**",s,"^")
    s = match_replace(" ^",s,"^")
    s = match_replace("^ ",s,"^")
    s = match_replace({' ',SPLUS},s,{SPLUS})
    s = match_replace({' ',SMNUS},s,{SMNUS})
    for i=1 to length(EXPONENTS) do
        e = EXPONENTS[i]
        s = match_replace({' ',e},s,{e})
    end for
    while sdx<=length(s) do
        {e,c,f,sdx} = get_term(s,sdx)
        if c!=0 then -- (aside: +5 -5 may yet leave c==0)
            terms = append(terms,{e,c,f})
        end if
    end while
    terms = sort(terms)
    -- merge, eg "10x^2 + 10x^2" -> 20x^2
    for i=length(terms) to 2 by -1 do
        if terms[i][EXP] = terms[i-1][EXP] then
            if terms[i-1][FRAC]!=terms[i][FRAC] then
                ?9/0 -- placeholder for more code
            else
                terms[i-1][COEF] += terms[i][COEF]
            end if
            terms[i..i] = {} -- (delete 2nd)
        end if
    end for
    return poly(terms)
end function
 
sequence alts = {}, -- (unicode versions)
         altn = {}, -- (idx of ascii equivalents)
         lines = read_lines("Unicode_polynomial_equation.txt")
 
if lines[1][1..3] = UTF8BOM then
    -- remove/ignore any utf8 byte order mark
    lines[1] = lines[1][4..$]
end if
 
for i=2 to length(lines) do
    if length(lines[i])>5
    and lines[i][1..5] = "===> " then
        alts = append(alts,lines[i-1][5..$])
        altn = append(altn,i)
    end if
end for
 
for i=2 to length(lines) do
    if length(lines[i])>4
    and lines[i][1..4] = "==> " then
        sequence line = utf8_to_utf32(lines[i-1])
        sequence res = parse(line)
        sequence expected = lines[i][5..$]
        if res=expected then
            -- (res is the unicode version)
            if platform()!=WINDOWS or res="0" then
                printf(1,"%2d: %40s ok\n",{i-1,res})
            else
                -- (unicode output on windows consoles is fiddly...)
                printf(1,"%2d: ok\n",i-1)
            end if
        else
            integer k = find(expected,alts)
            if k and res=lines[altn[k]][6..$] then
                -- (res is the ascii equivalent)
                printf(1,"%2d: %40s ok\n",{i-1,res})
            else
                printf(1,"%d: error - %s\n",{i-1,res})
            end if
        end if
    end if
end for</lang>
{{out}}
uni_frac = false
<pre>
1: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
4: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
6: 0 ok
8: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
10: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
12: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
14: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
16: 1 + 40x^-1 + 42x^-3 - 2x^-4 + x^-5 ok
19: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
21: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
23: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
25: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
27: x^5 - 0.5x^4 + 101.25x^3 + 100.75x + 2.5 ok
30: x^5 - 0.5x^4 + 101.25x^3 + 100.75x + 2.5 ok
32: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
34: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
36: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
38: x^5 - 2x^4 + 42x^3 + 40x + 1 ok
</pre>
uni_frac = true (linux only)
<pre>
1: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
4: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
6: 0 ok
8: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
10: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
12: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
14: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
16: 1 + 40x⁻¹ + 42x⁻³ - 2x⁻⁴ + x⁻⁵ ok
19: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
21: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
23: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
25: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
27: x⁵ - ½x⁴ + 101¼x³ + 100¾x + 2½ ok
30: x⁵ - ½x⁴ + 101¼x³ + 100¾x + 2½ ok
32: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
34: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
36: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
38: x⁵ - 2x⁴ + 42x³ + 40x + 1 ok
</pre>
 
{{omit from|AWK}}