String comparison: Difference between revisions

From Rosetta Code

Revision as of 21:25, 23 February 2013

String comparison is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Basic Data Operation
This is a basic data operation. It represents a fundamental action on a basic data type.

The task is to demonstrate how to compare two strings from within the language and how to achieve a lexical comparison. The task should demonstrate:

  • Comparing two strings for exact equality
  • Comparing two strings for inequality (i.e., the inverse of exact equality)
  • Comparing two strings to see if one is lexically lower than the other
  • Comparing two strings to see if one is lexically higher than the other
  • How to achieve both case sensitive comparisons and case insensitive comparisons within the language

Bonus:

  • Demonstrate the other kinds of string comparisons that the language provides. For example, demonstrate the difference between generic comparison and coercive comparison if your language supports such a distinction. If not, show how you would compare two numbers lexically rather than numerically.

AWK

<lang awk>BEGIN {
  a = "BALL"
  b = "BELL"
  # AWK keywords are lowercase; the relational operators compare strings lexically
  if (a == b) { print "The strings are equal" }
  if (a != b) { print "The strings are not equal" }
  if (a > b)  { print "The first string is lexically higher than the second" }
  if (a < b)  { print "The first string is lexically lower than the second" }
  if (a >= b) { print "The first string is not lexically lower than the second" }
  if (a <= b) { print "The first string is not lexically higher than the second" }
}</lang>

BASIC

<lang basic>10 LET A$="BELL"
20 LET B$="BELT"
30 IF A$ = B$ THEN PRINT "THE STRINGS ARE EQUAL": REM TEST FOR EQUALITY
40 IF A$ <> B$ THEN PRINT "THE STRINGS ARE NOT EQUAL": REM TEST FOR INEQUALITY
50 IF A$ > B$ THEN PRINT A$;" IS LEXICALLY HIGHER THAN ";B$: REM TEST FOR LEXICALLY HIGHER
60 IF A$ < B$ THEN PRINT A$;" IS LEXICALLY LOWER THAN ";B$: REM TEST FOR LEXICALLY LOWER
70 IF A$ <= B$ THEN PRINT A$;" IS NOT LEXICALLY HIGHER THAN ";B$
80 IF A$ >= B$ THEN PRINT A$;" IS NOT LEXICALLY LOWER THAN ";B$
90 END</lang>

Burlesque

<lang burlesque>blsq ) "abc""abc"==
1
blsq ) "abc""abc"!=
0
blsq ) "abc""Abc"cm
1
blsq ) "ABC""Abc"cm
-1</lang>

cm is used for comparison and returns 1, 0, or -1, in the manner of C's strcmp. == is Equal and != is NotEqual.

J

Solution: The primitive -: can be used to determine whether two strings are equivalent, but J doesn't have other inbuilt lexical comparison operators. They can be defined as follows: <lang j>eq=: -:                            NB. equal
ne=: -.@-:                         NB. not equal
gt=: {.@/:@,&boxopen *. ne         NB. lexically greater than
lt=: -.@{.@/:@,&boxopen *. ne      NB. lexically less than
ge=: {.@/:@,&boxopen +. eq         NB. lexically greater than or equal to
le=: -.@{.@/:@,&boxopen            NB. lexically less than or equal to</lang>

Usage: <lang j>   'ball' (eq , ne , gt , lt , ge , le) 'bell'
0 1 0 1 0 1
   'ball' (eq , ne , gt , lt , ge , le) 'ball'
1 0 0 0 1 1
   'YUP' (eq , ne , gt , lt , ge , le) 'YEP'
0 1 1 0 1 0</lang>

Perl 6

Perl 6 uses strong typing dynamically (and gradual typing statically), but normal string and numeric comparisons are coercive. (You may use generic comparison operators if you want polymorphic comparison, but usually you don't. :) <lang perl6>sub compare($a,$b) {
    my $A = "{$a.WHAT.^name} '$a'";
    my $B = "{$b.WHAT.^name} '$b'";

    # coercive string comparisons
    if $a eq $b { say "$A and $B are lexically equal" }
    if $a ne $b { say "$A and $B are not lexically equal" }
    if $a gt $b { say "$A is lexically after $B" }
    if $a lt $b { say "$A is lexically before $B" }
    if $a ge $b { say "$A is not lexically before $B" }
    if $a le $b { say "$A is not lexically after $B" }

    # object identity
    if $a === $b { say "$A and $B are identical objects" }
    if $a !=== $b { say "$A and $B are not identical objects" }

    # generic (polymorphic) comparisons
    if $a eqv $b { say "$A and $B are generically equal" }
    if $a !eqv $b { say "$A and $B are not generically equal" }
    if $a before $b { say "$A is generically before $B" }
    if $a after $b { say "$A is generically after $B" }
    if $a !after $b { say "$A is not generically after $B" }
    if $a !before $b { say "$A is not generically before $B" }

    # three-way comparisons
    say "The lexical relationship of $A and $B is { $a leg $b }" if $a ~~ Stringy;
    say "The generic relationship of $A and $B is { $a cmp $b }";
    say "The numeric relationship of $A and $B is { $a <=> $b }" if $a ~~ Numeric;
    say '';
}

compare 'YUP', 'YUP';
compare 'BALL', 'BELL';
compare 24, 123;
compare 5.1, 5;
compare 5.1e0, 5 + 1/10;</lang>

Output:
Str 'YUP' and Str 'YUP' are lexically equal
Str 'YUP' is not lexically before Str 'YUP'
Str 'YUP' is not lexically after Str 'YUP'
Str 'YUP' and Str 'YUP' are identical objects
Str 'YUP' and Str 'YUP' are generically equal
Str 'YUP' is not generically after Str 'YUP'
Str 'YUP' is not generically before Str 'YUP'
The lexical relationship of Str 'YUP' and Str 'YUP' is Same
The generic relationship of Str 'YUP' and Str 'YUP' is Same

Str 'BALL' and Str 'BELL' are not lexically equal
Str 'BALL' is lexically before Str 'BELL'
Str 'BALL' is not lexically after Str 'BELL'
Str 'BALL' and Str 'BELL' are not identical objects
Str 'BALL' and Str 'BELL' are not generically equal
Str 'BALL' is generically before Str 'BELL'
Str 'BALL' is not generically after Str 'BELL'
The lexical relationship of Str 'BALL' and Str 'BELL' is Increase
The generic relationship of Str 'BALL' and Str 'BELL' is Increase

Int '24' and Int '123' are not lexically equal
Int '24' is lexically after Int '123'
Int '24' is not lexically before Int '123'
Int '24' and Int '123' are not identical objects
Int '24' and Int '123' are not generically equal
Int '24' is generically before Int '123'
Int '24' is not generically after Int '123'
The generic relationship of Int '24' and Int '123' is Increase
The numeric relationship of Int '24' and Int '123' is Increase

Rat '5.1' and Int '5' are not lexically equal
Rat '5.1' is lexically after Int '5'
Rat '5.1' is not lexically before Int '5'
Rat '5.1' and Int '5' are not identical objects
Rat '5.1' and Int '5' are not generically equal
Rat '5.1' is generically after Int '5'
Rat '5.1' is not generically before Int '5'
The generic relationship of Rat '5.1' and Int '5' is Decrease
The numeric relationship of Rat '5.1' and Int '5' is Decrease

Num '5.1' and Rat '5.1' are lexically equal
Num '5.1' is not lexically before Rat '5.1'
Num '5.1' is not lexically after Rat '5.1'
Num '5.1' and Rat '5.1' are not identical objects
Num '5.1' and Rat '5.1' are not generically equal
Num '5.1' is not generically after Rat '5.1'
Num '5.1' is not generically before Rat '5.1'
The generic relationship of Num '5.1' and Rat '5.1' is Same
The numeric relationship of Num '5.1' and Rat '5.1' is Same

Python

Note that Python is strongly typed. The string '24' is never coerced to a number (or vice versa). <lang python>def compare(a, b):
    print("\n%r is of type %r and %r is of type %r"
          % (a, type(a), b, type(b)))
    if a <  b:      print('%r is strictly less than  %r' % (a, b))
    if a <= b:      print('%r is less than or equal to %r' % (a, b))
    if a >  b:      print('%r is strictly greater than  %r' % (a, b))
    if a >= b:      print('%r is greater than or equal to %r' % (a, b))
    if a == b:      print('%r is equal to %r' % (a, b))
    if a != b:      print('%r is not equal to %r' % (a, b))
    if a is b:      print('%r has object identity with %r' % (a, b))
    if a is not b:  print('%r has negated object identity with %r' % (a, b))

compare('YUP', 'YUP')
compare('BALL', 'BELL')
compare('24', '123')
compare(24, 123)
compare(5.0, 5)</lang>

Output:
'YUP' is of type <class 'str'> and 'YUP' is of type <class 'str'>
'YUP' is less than or equal to 'YUP'
'YUP' is greater than or equal to 'YUP'
'YUP' is equal to 'YUP'
'YUP' has object identity with 'YUP'

'BALL' is of type <class 'str'> and 'BELL' is of type <class 'str'>
'BALL' is strictly less than  'BELL'
'BALL' is less than or equal to 'BELL'
'BALL' is not equal to 'BELL'
'BALL' has negated object identity with 'BELL'

'24' is of type <class 'str'> and '123' is of type <class 'str'>
'24' is strictly greater than  '123'
'24' is greater than or equal to '123'
'24' is not equal to '123'
'24' has negated object identity with '123'

24 is of type <class 'int'> and 123 is of type <class 'int'>
24 is strictly less than  123
24 is less than or equal to 123
24 is not equal to 123
24 has negated object identity with 123

5.0 is of type <class 'float'> and 5 is of type <class 'int'>
5.0 is less than or equal to 5
5.0 is greater than or equal to 5
5.0 is equal to 5
5.0 has negated object identity with 5
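
The task also asks for case-insensitive comparison, which the example above does not cover. The following is a minimal sketch, assuming Python 3.3 or later for str.casefold(). The helper name compare_ignoring_case and its -1/0/1 return convention are illustrative choices, not something built into the language.

```python
def compare_ignoring_case(a, b):
    """Three-way comparison of two strings, ignoring letter case.

    Returns -1, 0 or 1 (an illustrative convention, not a built-in).
    """
    # casefold() is a more aggressive form of lower() intended for caseless matching
    x, y = a.casefold(), b.casefold()
    return (x > y) - (x < y)

# The ordinary operators are case sensitive
print('Bell' == 'BELL')                        # False

# Folding both operands first gives a case-insensitive test
print('Bell'.casefold() == 'BELL'.casefold())  # True
print(compare_ignoring_case('ball', 'BELL'))   # -1

# Numbers compare lexically once they are converted to strings
print(str(24) < str(123), 24 < 123)            # False True
```

Folding both operands keeps the ordinary operators usable for ordering as well as equality. For plain ASCII text, lower() would give the same results.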

REXX

<lang rexx>animal = 'dog'
if animal = 'cat' then
  say animal "is lexically equal to cat"
if animal \= 'cat' then
  say animal "is not lexically equal to cat"
if animal > 'cat' then
  say animal "is lexically higher than cat"
if animal < 'cat' then
  say animal "is lexically lower than cat"
if animal >= 'cat' then
  say animal "is not lexically lower than cat"
if animal <= 'cat' then
  say animal "is not lexically higher than cat"

/* The above comparative operators do not consider
   leading and trailing whitespace when making comparisons. */
if ' cat ' = 'cat' then
  say "this will print because whitespace is stripped"

/* To consider all whitespace in a comparison
   we need to use strict comparative operators */
if ' cat ' == 'cat' then
  say "this will not print because comparison is strict"</lang>

Here is a list of the strict comparative operators and their meaning:

  • == Strictly Equal To
  • \== Strictly Not Equal To
  • << Strictly Less Than
  • >> Strictly Greater Than
  • <<= Strictly Less Than or Equal To
  • >>= Strictly Greater Than or Equal To
  • \<< Strictly Not Less Than
  • \>> Strictly Not Greater Than

Run BASIC

<lang runbasic>a$ = "dog"
b$ = "cat"
if a$ = b$ then print "the strings are equal"                 ' test for equality
if a$ <> b$ then print "the strings are not equal"            ' test for inequality
if a$ > b$ then print a$;" is lexically higher than ";b$      ' test for lexically higher
if a$ < b$ then print a$;" is lexically lower than ";b$       ' test for lexically lower
if a$ <= b$ then print a$;" is not lexically higher than ";b$
if a$ >= b$ then print a$;" is not lexically lower than ";b$
end</lang>

Tcl

The best way to compare two strings in Tcl for equality is with the eq and ne expression operators: <lang tcl>if {$a eq $b} {
    puts "the strings are equal"
}
if {$a ne $b} {
    puts "the strings are not equal"
}</lang> The numeric == and != operators also mostly work, but can give somewhat unexpected results when both values look numeric. The string equal command is equally suited to equality testing (and generates the same bytecode).

For ordering, the < and > operators may be used, but again they are principally numeric operators. For guaranteed string ordering, the result of the string compare command should be used instead (which uses the Unicode codepoints of the string): <lang tcl>if {[string compare $a $b] < 0} {
    puts "first string lower than second"
}
if {[string compare $a $b] > 0} {
    puts "first string higher than second"
}</lang> Greater-or-equal and less-or-equal operations can be done by changing what exact comparison is used on the result of the string compare.

Tcl can also do a prefix-equal (approximately the same as strncmp() in C) through the use of the -length option: <lang tcl>if {[string equal -length 3 $x "abc123"]} {
    puts "first three characters are equal"
}</lang> Case-insensitive equality is (orthogonally) enabled through the -nocase option. These options are supported by both string equal and string compare, but not by the expression operators.

UNIX Shell

<lang sh>#!/bin/sh

A=Bell
B=Ball

# Traditional test command implementations test for equality and inequality
# but do not have a lexical comparison facility
if [ "$A" = "$B" ] ; then
  echo 'The strings are equal'
fi
if [ "$A" != "$B" ] ; then
  echo 'The strings are not equal'
fi</lang>