String comparison: Difference between revisions

From Rosetta Code
(Clarify that bonus might demonstrate the distinction between generic comparison and coercive comparison if applicable.)

Revision as of 18:10, 23 February 2013

String comparison is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Basic Data Operation
This is a basic data operation. It represents a fundamental action on a basic data type.


The task is to demonstrate how to compare two strings from within the language and how to achieve a lexical comparison. The task should demonstrate:

  • Comparing two strings for exact equality
  • Comparing two strings for inequality (i.e., the inverse of exact equality)
  • Comparing two strings to see if one is lexically lower than the other
  • Comparing two strings to see if one is lexically higher than the other

Bonus:

  • Demonstrate the other kinds of string comparisons that the language provides. For example, demonstrate the difference between generic comparison and coercive comparison if your language supports such a distinction. If not, show how you would compare two numbers lexically rather than numerically.


AWK

<lang awk>BEGIN {
  a = "BALL"
  b = "BELL"
  # AWK keywords are case-sensitive, so "if" must be lowercase
  if (a == b) { print "The strings are equal" }
  if (a != b) { print "The strings are not equal" }
  if (a > b) { print "The first string is lexically higher than the second" }
  if (a < b) { print "The first string is lexically lower than the second" }
  if (a >= b) { print "The first string is not lexically lower than the second" }
  if (a <= b) { print "The first string is not lexically higher than the second" }
}</lang>

BASIC

<lang basic>10 LET A$="BELL"
20 LET B$="BELT"
30 IF A$ = B$ THEN PRINT "THE STRINGS ARE EQUAL": REM TEST FOR EQUALITY
40 IF A$ <> B$ THEN PRINT "THE STRINGS ARE NOT EQUAL": REM TEST FOR INEQUALITY
50 IF A$ > B$ THEN PRINT A$;" IS LEXICALLY HIGHER THAN ";B$: REM TEST FOR LEXICALLY HIGHER
60 IF A$ < B$ THEN PRINT A$;" IS LEXICALLY LOWER THAN ";B$: REM TEST FOR LEXICALLY LOWER
70 IF A$ <= B$ THEN PRINT A$;" IS NOT LEXICALLY HIGHER THAN ";B$
80 IF A$ >= B$ THEN PRINT A$;" IS NOT LEXICALLY LOWER THAN ";B$
90 END</lang>

Burlesque

<lang burlesque>blsq ) "abc""abc"==
1
blsq ) "abc""abc"!=
0
blsq ) "abc""Abc"cm
1
blsq ) "ABC""Abc"cm
-1</lang>

cm performs a three-way comparison and returns 1, 0 or -1, in the manner of C's strcmp. == is Equal and != is NotEqual.
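For readers unfamiliar with three-way comparison, the behaviour of cm can be sketched in Python. This is only an illustration: the helper name three_way is made up for this example, and it orders strings by code point, which is assumed (not confirmed here) to match what Burlesque does.

```python
def three_way(a: str, b: str) -> int:
    """Return -1, 0 or 1 when a sorts before, equal to, or after b."""
    # Booleans are integers in Python, so their difference is -1, 0 or 1.
    return (a > b) - (a < b)


print(three_way("abc", "abc"))  # 0
print(three_way("abc", "Abc"))  # 1: lowercase 'a' has a higher code point than 'A'
print(three_way("ABC", "Abc"))  # -1: 'B' has a lower code point than 'b'
```

Note that C's strcmp itself only promises a negative, zero or positive result, so code that relies on exactly -1 or 1 should normalise the value as above.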

J

Solution: The primitive -: can be used to determine whether two strings are equivalent, but J doesn't have other inbuilt lexical comparison operators. They can be defined as follows: <lang j>eq=: -:                         NB. equal
ne=: -.@-:                      NB. not equal
gt=: {.@/:@,&boxopen *. ne      NB. lexically greater than
lt=: -.@{.@/:@,&boxopen *. ne   NB. lexically less than
ge=: {.@/:@,&boxopen +. eq      NB. lexically greater than or equal to
le=: -.@{.@/:@,&boxopen         NB. lexically less than or equal to</lang>

Usage: <lang j>   'ball' (eq , ne , gt , lt , ge , le) 'bell'
0 1 0 1 0 1
   'ball' (eq , ne , gt , lt , ge , le) 'ball'
1 0 0 0 1 1
   'YUP' (eq , ne , gt , lt , ge , le) 'YEP'
0 1 1 0 1 0</lang>

Perl 6

This code demonstrates that Perl 6's string comparisons are coercive. (You may use generic comparison operators if you want polymorphic comparison, but usually you don't. :) <lang perl6>sub compare($a,$b) {
    my $A = "{$a.WHAT.^name} '$a'";
    my $B = "{$b.WHAT.^name} '$b'";

    # Coercive (string) comparisons
    if $a eq $b { say "$A and $B are lexically equal" }
    if $a ne $b { say "$A and $B are not lexically equal" }
    if $a gt $b { say "$A is lexically higher than $B" }
    if $a lt $b { say "$A is lexically lower than $B" }
    if $a ge $b { say "$A is not lexically lower than $B" }
    if $a le $b { say "$A is not lexically higher than $B" }

    # Generic (polymorphic) comparisons; "before" is true when $a sorts first
    if $a eqv $b { say "$A and $B are generically equal" }
    if $a !eqv $b { say "$A and $B are not generically equal" }
    if $a before $b { say "$A is generically lower than $B" }
    if $a after $b { say "$A is generically higher than $B" }
    if $a !after $b { say "$A is not generically higher than $B" }
    if $a !before $b { say "$A is not generically lower than $B" }

    say "The lexical relationship of $A and $B is { $a leg $b }" if $a ~~ Stringy;
    say "The generic relationship of $A and $B is { $a cmp $b }";
    say "The numeric relationship of $A and $B is { $a <=> $b }" if $a ~~ Numeric;
    say "";
}

compare 'YUP', 'YUP';
compare 'BALL', 'BELL';
compare 24, 123;
compare 5.1, 5;</lang>

Output:
Str 'YUP' and Str 'YUP' are lexically equal
Str 'YUP' is not lexically lower than Str 'YUP'
Str 'YUP' is not lexically higher than Str 'YUP'
Str 'YUP' and Str 'YUP' are generically equal
Str 'YUP' is not generically higher than Str 'YUP'
Str 'YUP' is not generically lower than Str 'YUP'
The lexical relationship of Str 'YUP' and Str 'YUP' is Same
The generic relationship of Str 'YUP' and Str 'YUP' is Same

Str 'BALL' and Str 'BELL' are not lexically equal
Str 'BALL' is lexically lower than Str 'BELL'
Str 'BALL' is not lexically higher than Str 'BELL'
Str 'BALL' and Str 'BELL' are not generically equal
Str 'BALL' is generically lower than Str 'BELL'
Str 'BALL' is not generically higher than Str 'BELL'
The lexical relationship of Str 'BALL' and Str 'BELL' is Increase
The generic relationship of Str 'BALL' and Str 'BELL' is Increase

Int '24' and Int '123' are not lexically equal
Int '24' is lexically higher than Int '123'
Int '24' is not lexically lower than Int '123'
Int '24' and Int '123' are not generically equal
Int '24' is generically lower than Int '123'
Int '24' is not generically higher than Int '123'
The generic relationship of Int '24' and Int '123' is Increase
The numeric relationship of Int '24' and Int '123' is Increase

Rat '5.1' and Int '5' are not lexically equal
Rat '5.1' is lexically higher than Int '5'
Rat '5.1' is not lexically lower than Int '5'
Rat '5.1' and Int '5' are not generically equal
Rat '5.1' is generically higher than Int '5'
Rat '5.1' is not generically lower than Int '5'
The generic relationship of Rat '5.1' and Int '5' is Decrease
The numeric relationship of Rat '5.1' and Int '5' is Decrease

Python

Note that Python is strongly typed. The string '24' is never coerced to a number (or vice versa). <lang python>def compare(a, b):
    print("\n%r is of type %r and %r is of type %r"
          % (a, type(a), b, type(b)))
    if a <  b:      print('%r is strictly less than  %r' % (a, b))
    if a <= b:      print('%r is less than or equal to %r' % (a, b))
    if a >  b:      print('%r is strictly greater than  %r' % (a, b))
    if a >= b:      print('%r is greater than or equal to %r' % (a, b))
    if a == b:      print('%r is equal to %r' % (a, b))
    if a != b:      print('%r is not equal to %r' % (a, b))
    if a is b:      print('%r has object identity with %r' % (a, b))
    if a is not b:  print('%r has negated object identity with %r' % (a, b))

compare('YUP', 'YUP')
compare('BALL', 'BELL')
compare('24', '123')
compare(24, 123)
compare(5.0, 5)</lang>

Output:
'YUP' is of type <class 'str'> and 'YUP' is of type <class 'str'>
'YUP' is less than or equal to 'YUP'
'YUP' is greater than or equal to 'YUP'
'YUP' is equal to 'YUP'
'YUP' has object identity with 'YUP'

'BALL' is of type <class 'str'> and 'BELL' is of type <class 'str'>
'BALL' is strictly less than  'BELL'
'BALL' is less than or equal to 'BELL'
'BALL' is not equal to 'BELL'
'BALL' has negated object identity with 'BELL'

'24' is of type <class 'str'> and '123' is of type <class 'str'>
'24' is strictly greater than  '123'
'24' is greater than or equal to '123'
'24' is not equal to '123'
'24' has negated object identity with '123'

24 is of type <class 'int'> and 123 is of type <class 'int'>
24 is strictly less than  123
24 is less than or equal to 123
24 is not equal to 123
24 has negated object identity with 123

5.0 is of type <class 'float'> and 5 is of type <class 'int'>
5.0 is less than or equal to 5
5.0 is greater than or equal to 5
5.0 is equal to 5
5.0 has negated object identity with 5
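
The bonus part of the task asks how two numbers can be compared lexically rather than numerically. Since Python never coerces implicitly, one possible approach is to convert explicitly with str() before comparing. The following is a minimal sketch; the helper name lexically_less is an assumption made for this illustration and is not part of the solution above.

```python
def lexically_less(x, y) -> bool:
    """Compare two values by their string forms, character by character."""
    return str(x) < str(y)


print(24 < 123)                        # True: numeric comparison
print(lexically_less(24, 123))         # False: '2' sorts after '1'
print(sorted([24, 123, 5]))            # [5, 24, 123]
print(sorted([24, 123, 5], key=str))   # [123, 24, 5]
```

Passing key=str to sorted() applies the same idea to a whole list without converting the stored values.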

REXX

<lang rexx>animal = 'dog'
if animal = 'cat' then
  say animal "is lexically equal to cat"
if animal \= 'cat' then
  say animal "is not lexically equal to cat"
if animal > 'cat' then
  say animal "is lexically higher than cat"
if animal < 'cat' then
  say animal "is lexically lower than cat"
if animal >= 'cat' then
  say animal "is not lexically lower than cat"
if animal <= 'cat' then
  say animal "is not lexically higher than cat"

/* The above comparative operators do not consider
   leading and trailing whitespace when making comparisons. */
if ' cat ' = 'cat' then
  say "this will print because whitespace is stripped"

/* To consider all whitespace in a comparison
   we need to use strict comparative operators */
if ' cat ' == 'cat' then
  say "this will not print because comparison is strict"</lang>

Here is a list of the strict comparative operators and their meaning:

  • == Strictly Equal To
  • \== Strictly Not Equal To
  • << Strictly Less Than
  • >> Strictly Greater Than
  • <<= Strictly Less Than or Equal To
  • >>= Strictly Greater Than or Equal To
  • \<< Strictly Not Less Than
  • \>> Strictly Not Greater Than
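
The difference between the normal and the strict operators can be approximated in Python as follows. This is a rough sketch only: loose_equal and strict_equal are names invented for this illustration, and the sketch models just the blank-stripping behaviour described above, not the numeric comparison that REXX performs when both operands are numbers.

```python
def loose_equal(a: str, b: str) -> bool:
    """Equality that ignores leading and trailing blanks, like REXX '='."""
    return a.strip(" ") == b.strip(" ")


def strict_equal(a: str, b: str) -> bool:
    """Character-for-character equality, like REXX '=='."""
    return a == b


print(loose_equal(" cat ", "cat"))   # True: surrounding blanks are ignored
print(strict_equal(" cat ", "cat"))  # False: the blanks are significant
```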

Run BASIC

<lang runbasic>a$ = "dog"
b$ = "cat"
if a$ = b$ then print "the strings are equal"                 ' test for equality
if a$ <> b$ then print "the strings are not equal"            ' test for inequality
if a$ > b$ then print a$;" is lexically higher than ";b$      ' test for lexically higher
if a$ < b$ then print a$;" is lexically lower than ";b$       ' test for lexically lower
if a$ <= b$ then print a$;" is not lexically higher than ";b$
if a$ >= b$ then print a$;" is not lexically lower than ";b$
end</lang>

Tcl

The best way to compare two strings in Tcl for equality is with the eq and ne expression operators: <lang tcl>if {$a eq $b} {
    puts "the strings are equal"
}
if {$a ne $b} {
    puts "the strings are not equal"
}</lang> The numeric == and != operators also mostly work, but can give somewhat unexpected results when both values look numeric (for example, 1.0 and 1 compare as equal). The string equal command is equally suited to equality testing (and generates the same bytecode).
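
To illustrate why a numeric equality operator can surprise when applied to strings, here is a small Python sketch of a coercive comparison. The helper coercive_equal is hypothetical and only approximates the idea; Python's float() parsing rules are not the same as Tcl's number parsing, so this should not be read as a description of Tcl's exact behaviour.

```python
def coercive_equal(a: str, b: str) -> bool:
    """Compare as numbers when both strings parse as numbers, else as text."""
    try:
        return float(a) == float(b)
    except ValueError:
        # At least one operand is not numeric: fall back to string equality.
        return a == b


print(coercive_equal("1.0", "1"))   # True: both parse as the number 1
print("1.0" == "1")                 # False: the strings differ
print(coercive_equal("abc", "abc")) # True: plain string comparison
```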

For ordering, the < and > operators may be used, but again they are principally numeric operators. For guaranteed string ordering, the result of the string compare command should be used instead (which uses the Unicode codepoints of the string): <lang tcl>if {[string compare $a $b] < 0} {
    puts "first string lower than second"
}
if {[string compare $a $b] > 0} {
    puts "first string higher than second"
}</lang> Greater-or-equal and less-or-equal operations can be done by changing which comparison is applied to the result of string compare (for example, >= 0 or <= 0).

Tcl can also do a prefix-equal (approximately the same as strncmp() in C) through the use of the -length option: <lang tcl>if {[string equal -length 3 $x "abc123"]} {
    puts "first three characters are equal"
}</lang> And case-insensitive equality is (orthogonally) enabled through the -nocase option. These options are supported by both string equal and string compare, but not by the expression operators.
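
For comparison, the prefix and case-insensitive tests described above could be sketched in Python like this. The helper names prefix_equal and nocase_equal are assumptions made for this example, and str.casefold() is used as a reasonable stand-in; its handling of some characters may differ from Tcl's -nocase.

```python
def prefix_equal(a: str, b: str, length: int) -> bool:
    """True when the first `length` characters of a and b are the same."""
    return a[:length] == b[:length]


def nocase_equal(a: str, b: str) -> bool:
    """Case-insensitive equality using Unicode case folding."""
    return a.casefold() == b.casefold()


print(prefix_equal("abcdef", "abc123", 3))  # True: both start with "abc"
print(prefix_equal("abXdef", "abc123", 3))  # False
print(nocase_equal("Bell", "BELL"))         # True
```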

UNIX Shell

<lang sh>#!/bin/sh

A=Bell
B=Ball

# Traditional test command implementations test for equality and inequality
# but do not have a lexical comparison facility
if [ "$A" = "$B" ] ; then
  echo 'The strings are equal'
fi
if [ "$A" != "$B" ] ; then
  echo 'The strings are not equal'
fi</lang>