Numeric separator syntax

From Rosetta Code
Revision as of 18:16, 30 August 2019

Numeric separator syntax is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Several programming languages allow separators in numerals in order to group digits together.

Task

Show the numeric separator syntax and describe its specification: for example, which separators are eligible, whether multiple consecutive separators are allowed, and in which positions a separator may appear.



Factor

Factor allows the comma (,) as a separator character in number literals.
<lang factor>USE: prettyprint

12,345 .  ! 12345

! commas may be used at arbitrary intervals
1,23,456,78910 .  ! 12345678910

! a comma at the beginning or end will parse as a word, likely causing an error
! ,123 .  ! No word named “,123” found in current vocabulary search path
! 123, .  ! No word named “123,” found in current vocabulary search path

! likewise, two commas in a row will parse as a word
! 1,,23 .  ! No word named “1,,23” found in current vocabulary search path

! There are no exceptions to which numbers may have separators
! binary/octal/decimal/hexadecimal integers and floats are supported
0b1,000,001 .  ! 65
-1,234e-4,5 .  ! -1.234e-42
0x1.4,4p3 .  ! 10.125

! as are ratios
45,2+1,1/43,2 .  ! 452+11/432
1,1/1,7 .  ! 11/17

! and complex numbers
C{ 5.225,312 2.0 } .  ! C{ 5.225312 2.0 }</lang>

If one desires to define a syntax for different separator rules, that is possible:
<lang factor>USING: lexer math.parser prettyprint sequences sets ;

<< SYNTAX: PN: scan-token "_" without string>number suffix! ; >>

! permissive numbers
PN: _1_2_3_ .  ! 123
PN: 1__234___567 .  ! 1234567
PN: 0b0___10.100001p3 .  ! 20.125</lang>

Since Factor's parser is exposed, one could even make changes to the number parser, obviating the need for parsing words.
<lang factor>USING: eval prettyprint ;

<<
"IN: math.parser.private USE: combinators
: @pos-digit-or-punc ( i number-parse n char -- n/f )
    {
        { 95 [ [ @pos-digit ] require-next-digit ] }   ! normally 44
        { 43 [ ->numerator ] }
        { 47 [ ->denominator ] }
        { 46 [ ->mantissa ] }
        [ [ @pos-digit ] or-exponent ]
    } case ; inline" eval( -- )
>>

3_333_333 .  ! 3333333</lang>

Perl 6

Perl 6 allows underscore as a grouping / separator character in numeric inputs, though there are a few restrictions.

<lang perl6># Any numeric input value may use an underscore as a grouping/separator character.
# May occur in nearly any position, in any* number. * See restrictions below.

# Int
say 1_2_3; # 123

# Binary Int
say 0b1_0_1_0_1; # 21

# Hexadecimal Int
say 0xa_bc_d; # 43981

# Rat
say 1_2_3_4.2_5; # 1234.25

# Num
say 6.0_22e4; # 60220

# There are some restrictions on the placement.
# An underscore may not be on an edge boundary, or next to another underscore.
# The following are all syntax errors.

# say _1234.25;
# say 1234_.25;
# say 1234._25;
# say 1234.25_;
# say 12__34.25;</lang>
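
The placement rule stated in the comments above (an underscore needs a digit on each side) can be sketched as a small checker. The following Python is only an illustration of that rule, not Perl 6's actual grammar. The function name `separators_well_placed` is hypothetical, and treating every hexadecimal digit character as a "digit" is a simplifying assumption that ignores radix prefixes and exponent markers.

```python
import string

# Assumption: any hexadecimal digit character counts as a "digit".
DIGITS = set(string.hexdigits)


def separators_well_placed(literal: str) -> bool:
    """Return True if every '_' in literal has a digit on both sides."""
    for i, ch in enumerate(literal):
        if ch != "_":
            continue
        # An underscore at either end has no neighbour on that side.
        before = literal[i - 1] if i > 0 else ""
        after = literal[i + 1] if i + 1 < len(literal) else ""
        if before not in DIGITS or after not in DIGITS:
            return False
    return True


if __name__ == "__main__":
    for text in ["1_2_3", "0xa_bc_d", "_1234.25", "1234_.25", "12__34.25"]:
        print(text, separators_well_placed(text))
```

Under these assumptions the checker accepts the valid examples above and rejects all five of the listed syntax errors.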

Racket

Vanilla Racket does not have numeric separator syntax. However, it can be defined by users. For instance:

<lang racket>#lang racket

(require syntax/parse/define
         (only-in racket [#%top racket:#%top])
         (for-syntax racket/string))

(define-syntax-parser #%top
  [(_ . x)
   #:do [(define s (symbol->string (syntax-e #'x)))
         (define num (string->number (string-replace s "_" "")))]
   #:when num
   #`#,num]
  [(_ . x) #'(racket:#%top . x)])

1_234_567.89
1_234__567.89</lang>

Output:
1234567.89
1234567.89

In the above implementation of the syntax, _ is the separator. It allows multiple consecutive separators, and allows the separator anywhere in the numeral (front, middle, and back).

Implementation details: in vanilla Racket, any token containing _ is read as an identifier, and if that identifier is not already defined, it is unbound. We can therefore redefine #%top, which handles unbound identifiers: if the token is a number after removing every _, it expands to that number.
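
The strip-and-parse idea described above is not specific to Racket. As a rough sketch for readers less familiar with Racket macros, here is the same idea in Python. `parse_token` is a hypothetical helper, and using `float` as the number parser is an assumption made for brevity (Racket's string->number recognizes more numeric forms, and `float` also accepts spellings such as "nan").

```python
from typing import Optional


def parse_token(token: str) -> Optional[float]:
    """Remove every '_' and try to read what is left as a number.

    Returns None when the stripped token is not a number, which
    corresponds to falling through to the ordinary identifier case.
    """
    stripped = token.replace("_", "")
    try:
        return float(stripped)  # assumption: float stands in for the number parser
    except ValueError:
        return None


if __name__ == "__main__":
    print(parse_token("1_234_567.89"))
    print(parse_token("1_234__567.89"))
    print(parse_token("foo_bar"))
```

Like the Racket version above, this sketch accepts consecutive separators and separators at either end of the numeral.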

If we wish to, for example, disallow multiple consecutive separators like 1_234__567.89, we could do so easily:

<lang racket>#lang racket

(require syntax/parse/define
         (only-in racket [#%top racket:#%top])
         (for-syntax racket/string))

(define-syntax-parser #%top
  [(_ . x)
   #:do [(define s (symbol->string (syntax-e #'x)))
         (define num (string->number (string-replace s "_" "")))]
   #:when num
   (syntax-parse #'x
     [_ #:fail-when (string-contains? s "__") "invalid multiple consecutive separators"
        #`#,num])]
  [(_ . x) #'(racket:#%top . x)])

1_234_567.89
1_234__567.89</lang>

Output:
1_234__567.89: invalid multiple consecutive separators in: 1_234__567.89