Regular expressions: Difference between revisions

From Rosetta Code
(Added Scala)
m (Fixed lang tags.)
Line 7: Line 7:
=={{header|AppleScript}}==
=={{header|AppleScript}}==
{{libheader|Satimage.osax}}
{{libheader|Satimage.osax}}
<lang applescript>try
try
find text ".*string$" in "I am a string" with regexp
find text ".*string$" in "I am a string" with regexp
on error message
on error message
return message
return message
end try
end try


try
try
change "original" into "modified" in "I am the original string" with regexp
change "original" into "modified" in "I am the original string" with regexp
on error message
on error message
return message
return message
end try
end try</lang>


=={{header|ALGOL 68}}==
=={{header|ALGOL 68}}==
Line 24: Line 24:
<!-- {{does not work with|ALGOL 68|Standard - grep/sub in string are not part of the standard's prelude. }} -->
<!-- {{does not work with|ALGOL 68|Standard - grep/sub in string are not part of the standard's prelude. }} -->
{{works with|ALGOL 68G|Any - tested with release mk15-0.8b.fc9.i386}}
{{works with|ALGOL 68G|Any - tested with release mk15-0.8b.fc9.i386}}
<lang algol>INT match=0, no match=1, out of memory error=2, other error=3;
<lang algol68>INT match=0, no match=1, out of memory error=2, other error=3;


STRING str := "i am a string";
STRING str := "i am a string";
Line 51: Line 51:
{{works with|ALGOL 68G|Any - tested with release mk15-0.8b.fc9.i386}}
{{works with|ALGOL 68G|Any - tested with release mk15-0.8b.fc9.i386}}


For example:<lang algol>
For example:<lang algol68>FORMAT pattern = $ddd" "c("cats","dogs")$;
FORMAT pattern = $ddd" "c("cats","dogs")$;
FILE file; STRING book; associate(file, book);
FILE file; STRING book; associate(file, book);
on value error(file, (REF FILE f)BOOL: stop);
on value error(file, (REF FILE f)BOOL: stop);
Line 71: Line 70:


=={{header|AutoHotkey}}==
=={{header|AutoHotkey}}==
<lang AutoHotkey>
<lang AutoHotkey>MsgBox % foundpos := RegExMatch("Hello World", "World$")
MsgBox % foundpos := RegExMatch("Hello World", "World$")
MsgBox % replaced := RegExReplace("Hello World", "World$", "yourself")</lang>
MsgBox % replaced := RegExReplace("Hello World", "World$", "yourself")
</lang>


=={{header|AWK}}==
=={{header|AWK}}==
AWK supports regular expressions, which are typically marked up with slashes in front and back, and the "~" operator:
AWK supports regular expressions, which are typically marked up with slashes in front and back, and the "~" operator:
$ awk '{if($0~/[A-Z]/)print "uppercase detected"}'
<lang awk>$ awk '{if($0~/[A-Z]/)print "uppercase detected"}'
abc
abc
ABC
ABC
uppercase detected
uppercase detected</lang>
As shorthand, a regular expression in the condition part fires if it matches an input line:
As shorthand, a regular expression in the condition part fires if it matches an input line:
awk '/[A-Z]/{print "uppercase detected"}'
<lang awk>awk '/[A-Z]/{print "uppercase detected"}'
def
def
DeF
DeF
uppercase detected
uppercase detected</lang>
For substitution, the first argument can be a regular expression, while the replacement string is constant (only that '&' in it receives the value of the match):
For substitution, the first argument can be a regular expression, while the replacement string is constant (only that '&' in it receives the value of the match):
$ awk '{gsub(/[A-Z]/,"*");print}'
<lang awk>$ awk '{gsub(/[A-Z]/,"*");print}'
abCDefG
abCDefG
ab**ef*
ab**ef*
$ awk '{gsub(/[A-Z]/,"(&)");print}'
$ awk '{gsub(/[A-Z]/,"(&)");print}'
abCDefGH
abCDefGH
ab(C)(D)ef(G)(H)
ab(C)(D)ef(G)(H)</lang>
This variant matches one or more uppercase letters in one round:
This variant matches one or more uppercase letters in one round:
$ awk '{gsub(/[A-Z]+/,"(&)");print}'
<lang awk>$ awk '{gsub(/[A-Z]+/,"(&)");print}'
abCDefGH
abCDefGH
ab(CD)ef(GH)
ab(CD)ef(GH)</lang>


=={{header|C}}==
=={{header|C}}==
Line 152: Line 149:


{{libheader|Boost}}
{{libheader|Boost}}
<lang cpp> #include <iostream>
<lang cpp>#include <iostream>
#include <string>
#include <string>
#include <iterator>
#include <iterator>
#include <boost/regex.hpp>
#include <boost/regex.hpp>

int main()
int main()
{
boost::regex re(".* string$");
std::string s = "Hi, I am a string";

// match the complete string
if (boost::regex_match(s, re))
std::cout << "The string matches.\n";
else
std::cout << "Oops - not found?\n";

// match a substring
boost::regex re2(" a.*a");
boost::smatch match;
if (boost::regex_search(s, match, re2))
{
{
std::cout << "Matched " << match.length()
boost::regex re(".* string$");
<< " characters starting at " << match.position() << ".\n";
std::string s = "Hi, I am a string";
std::cout << "Matched character sequence: \""
<< match.str() << "\"\n";
// match the complete string
}
if (boost::regex_match(s, re))
else
std::cout << "The string matches.\n";
else
{
std::cout << "Oops - not found?\n";
std::cout << "Oops - not found?\n";
}

// match a substring
// replace a substring
boost::regex re2(" a.*a");
std::string dest_string;
boost::smatch match;
boost::regex_replace(std::back_inserter(dest_string),
if (boost::regex_search(s, match, re2))
s.begin(), s.end(),
{
std::cout << "Matched " << match.length()
re2,
<< " characters starting at " << match.position() << ".\n";
"'m now a changed");
std::cout << "Matched character sequence: \""
std::cout << dest_string << std::endl;
}</lang>
<< match.str() << "\"\n";
}
else
{
std::cout << "Oops - not found?\n";
}
// replace a substring
std::string dest_string;
boost::regex_replace(std::back_inserter(dest_string),
s.begin(), s.end(),
re2,
"'m now a changed");
std::cout << dest_string << std::endl;
}</lang>


=={{header|C sharp|C#}}==
=={{header|C sharp|C#}}==
Line 235: Line 232:


=={{header|D}}==
=={{header|D}}==
<lang d> import std.stdio, std.regexp;
<lang d>import std.stdio, std.regexp;


void main() {
void main() {
string s = "I am a string";
string s = "I am a string";


// Test:
// Test:
if (search(s, r"string$"))
if (search(s, r"string$"))
writefln("Ends with 'string'");
writefln("Ends with 'string'");


// Test, storing the regular expression:
// Test, storing the regular expression:
auto re1 = RegExp(r"string$");
auto re1 = RegExp(r"string$");
if (re1.search(s).test)
if (re1.search(s).test)
writefln("Ends with 'string'");
writefln("Ends with 'string'");


// Substitute:
// Substitute:
writefln(sub(s, " a ", " another "));
writefln(sub(s, " a ", " another "));


// Substitute, storing the regular expression:
// Substitute, storing the regular expression:
auto re2 = RegExp(" a ");
auto re2 = RegExp(" a ");
writefln(re2.replace(s, " another "));
writefln(re2.replace(s, " another "));
}</lang>
}</lang>


Note that in std.string there are string functions to perform those string operations in a faster way.
Note that in std.string there are string functions to perform those string operations in a faster way.


=={{header|Erlang}}==
=={{header|Erlang}}==
<lang erlang>
<lang erlang>match() ->
match() ->
String = "This is a string",
String = "This is a string",
case re:run(String, "string$") of
case re:run(String, "string$") of
Line 271: Line 267:
String = "This is a string",
String = "This is a string",
NewString = re:replace(String, " a ", " another ", [{return, list}]),
NewString = re:replace(String, " a ", " another ", [{return, list}]),
io:format("~s~n",[NewString]).
io:format("~s~n",[NewString]).</lang>
</lang>


=={{header|Forth}}==
=={{header|Forth}}==
{{libheader|Forth Foundation Library}}
{{libheader|Forth Foundation Library}}
Test/Match
Test/Match
include ffl/rgx.fs
<lang forth>include ffl/rgx.fs

\ Create a regular expression variable 'exp' in the dictionary
\ Create a regular expression variable 'exp' in the dictionary

rgx-create exp
rgx-create exp

\ Compile an expression
\ Compile an expression

s" Hello (World)" exp rgx-compile [IF]
s" Hello (World)" exp rgx-compile [IF]
.( Regular expression successful compiled.) cr
.( Regular expression successful compiled.) cr
[THEN]
[THEN]

\ (Case sensitive) match a string with the expression
\ (Case sensitive) match a string with the expression

s" Hello World" exp rgx-cmatch? [IF]
s" Hello World" exp rgx-cmatch? [IF]
.( String matches with the expression.) cr
.( String matches with the expression.) cr
[ELSE]
[ELSE]
.( No match.) cr
.( No match.) cr
[THEN]
[THEN]</lang>


=={{header|Haskell}}==
=={{header|Haskell}}==
Test
Test
<lang haskell> import Text.Regex
<lang haskell>import Text.Regex

str = "I am a string"
str = "I am a string"

case matchRegex (mkRegex ".*string$") str of
case matchRegex (mkRegex ".*string$") str of
Just _ -> putStrLn $ "ends with 'string'"
Just _ -> putStrLn $ "ends with 'string'"
Nothing -> return ()</lang>
Nothing -> return ()</lang>


Substitute
Substitute
<lang haskell> import Text.Regex
<lang haskell>import Text.Regex

orig = "I am the original string"
orig = "I am the original string"
result = subRegex (mkRegex "original") orig "modified"
result = subRegex (mkRegex "original") orig "modified"
putStrLn $ result</lang>
putStrLn $ result</lang>


=={{header|J}}==
=={{header|J}}==
Line 318: Line 313:
J's regex support is built on top of PCRE.
J's regex support is built on top of PCRE.


load'regex' NB. Load regex library
<lang j>load'regex' NB. Load regex library
str =: 'I am a string' NB. String used in examples.
str =: 'I am a string' NB. String used in examples.</lang>


Matching:
Matching:
'.*string$' rxeq str NB. 1 is true, 0 is false
<lang j> '.*string$' rxeq str NB. 1 is true, 0 is false
1</lang>
1


Substitution:
Substitution:
('am';'am still') rxrplc str
<lang j> ('am';'am still') rxrplc str
I am still a string
I am still a string</lang>


=={{header|Java}}==
=={{header|Java}}==
Line 335: Line 330:
Test
Test


<lang java> String str = "I am a string";
<lang java>String str = "I am a string";
if (str.matches(".*string$")) {
if (str.matches(".*string$")) {
System.out.println("ends with 'string'");
System.out.println("ends with 'string'");
}</lang>
}</lang>


Substitute
Substitute


<lang java> String orig = "I am the original string";
<lang java>String orig = "I am the original string";
String result = orig.replaceAll("original", "modified");
String result = orig.replaceAll("original", "modified");
// result is now "I am the modified string"</lang>
// result is now "I am the modified string"</lang>


=={{header|JavaScript}}==
=={{header|JavaScript}}==
Test/Match
Test/Match
<lang javascript> var subject = "Hello world!";
<lang javascript>var subject = "Hello world!";

// Two different ways to create the RegExp object
// Two different ways to create the RegExp object
// Both examples use the exact same pattern... matching "hello"
// Both examples use the exact same pattern... matching "hello"
var re_PatternToMatch = /Hello (World)/i; // creates a RegExp literal with case-insensitivity
var re_PatternToMatch = /Hello (World)/i; // creates a RegExp literal with case-insensitivity
var re_PatternToMatch2 = new RegExp("Hello (World)", "i");
var re_PatternToMatch2 = new RegExp("Hello (World)", "i");

// Test for a match - return a bool
// Test for a match - return a bool
var isMatch = re_PatternToMatch.test(subject);
var isMatch = re_PatternToMatch.test(subject);

// Get the match details
// Get the match details
// Returns an array with the match's details
// Returns an array with the match's details
// matches[0] == "Hello world"
// matches[0] == "Hello world"
// matches[1] == "world"
// matches[1] == "world"
var matches = re_PatternToMatch2.exec(subject);</lang>
var matches = re_PatternToMatch2.exec(subject);</lang>


Substitute
Substitute
<lang javascript> var subject = "Hello world!";
<lang javascript>var subject = "Hello world!";

// Perform a string replacement
// Perform a string replacement
// newSubject == "Replaced!"
// newSubject == "Replaced!"
var newSubject = subject.replace(re_PatternToMatch, "Replaced");</lang>
var newSubject = subject.replace(re_PatternToMatch, "Replaced");</lang>


=={{header|M4}}==
=={{header|M4}}==
<lang M4>regexp(`GNUs not Unix', `\<[a-z]\w+')
<lang M4>
regexp(`GNUs not Unix', `\<[a-z]\w+')
regexp(`GNUs not Unix', `\<[a-z]\(\w+\)', `a \& b \1 c')</lang>
regexp(`GNUs not Unix', `\<[a-z]\(\w+\)', `a \& b \1 c')
</lang>


Output:
Output:
Line 386: Line 379:
Test
Test
{{works with|Mac OS X|10.4+}}
{{works with|Mac OS X|10.4+}}
<lang objc>
<lang objc>NSString *str = @"I am a string";
NSString *str = @"I am a string";
NSString *regex = @".*string$";
NSString *regex = @".*string$";


Line 394: Line 386:
if ([pred evaluateWithObject:str]) {
if ([pred evaluateWithObject:str]) {
NSLog(@"ends with 'string'");
NSLog(@"ends with 'string'");
}</lang>
}
</lang>
Unfortunately this method cannot find the location of the match or do substitution.
Unfortunately this method cannot find the location of the match or do substitution.


Line 401: Line 392:
=== With the standard library ===
=== With the standard library ===
Test
Test
<lang ocaml> #load "str.cma";;
<lang ocaml>#load "str.cma";;
let str = "I am a string";;
let str = "I am a string";;
try
try
ignore(Str.search_forward (Str.regexp ".*string$") str 0);
ignore(Str.search_forward (Str.regexp ".*string$") str 0);
print_endline "ends with 'string'"
print_endline "ends with 'string'"
with Not_found -> ()
with Not_found -> ()
;;</lang>
;;</lang>


Substitute
Substitute
<lang ocaml> #load "str.cma";;
<lang ocaml>#load "str.cma";;
let orig = "I am the original string";;
let orig = "I am the original string";;
let result = Str.global_replace (Str.regexp "original") "modified" orig;;
let result = Str.global_replace (Str.regexp "original") "modified" orig;;
(* result is now "I am the modified string" *)</lang>
(* result is now "I am the modified string" *)</lang>


=== Using Pcre ===
=== Using Pcre ===
Line 467: Line 458:
Test
Test


<lang php> if (preg_match('/string$/', $string))
<lang php>if (preg_match('/string$/', $string))
{
{
echo "Ends with 'string'\n";
echo "Ends with 'string'\n";
}</lang>
}</lang>


Replace
Replace


<lang php> $string = preg_replace('/\ba\b/', 'another', $string);
<lang php>$string = preg_replace('/\ba\b/', 'another', $string);
echo "Found 'a' and replace it with 'another', resulting in this string: $string\n";</lang>
echo "Found 'a' and replace it with 'another', resulting in this string: $string\n";</lang>


=={{header|PowerShell}}==
=={{header|PowerShell}}==
Line 495: Line 486:
=={{header|R}}==
=={{header|R}}==
First, define some strings.
First, define some strings.
<lang R>
<lang R>pattern <- "string"
pattern <- "string"
text1 <- "this is a matching string"
text1 <- "this is a matching string"
text2 <- "this does not match"
text2 <- "this does not match"</lang>
</lang>
Matching with grep. The indices of the texts containing matches are returned.
Matching with grep. The indices of the texts containing matches are returned.
<lang R>grep(pattern, c(text1, text2)) # 1</lang>
<lang R>
grep(pattern, c(text1, text2)) # 1
</lang>
Matching with regexpr. The positions of the starts of the matches are returned, along with the lengths of the matches.
Matching with regexpr. The positions of the starts of the matches are returned, along with the lengths of the matches.
<lang R>regexpr(pattern, c(text1, text2))</lang>
<lang R>
regexpr(pattern, c(text1, text2))
</lang>
[1] 20 -1
[1] 20 -1
attr(,"match.length")
attr(,"match.length")
[1] 6 -1
[1] 6 -1
Replacement
Replacement
<lang R>gsub(pattern, "pair of socks", c(text1, text2))</lang>
<lang R>
gsub(pattern, "pair of socks", c(text1, text2))
</lang>
[1] "this is a matching pair of socks" "this does not match"
[1] "this is a matching pair of socks" "this does not match"


=={{header|Raven}}==
=={{header|Raven}}==


'i am a string' as str
<lang raven>'i am a string' as str</lang>


Match:
Match:


str m/string$/
<lang raven>str m/string$/
if "Ends with 'string'\n" print
if "Ends with 'string'\n" print</lang>


Replace:
Replace:


str r/ a / another / print
<lang raven>str r/ a / another / print</lang>


=={{header|Ruby}}==
=={{header|Ruby}}==
Test
Test
<lang ruby> string="I am a string"
<lang ruby>string="I am a string"
puts "Ends with 'string'" if string[/string$/]
puts "Ends with 'string'" if string[/string$/]
puts "Does not start with 'You'" if !string[/^You/]</lang>
puts "Does not start with 'You'" if !string[/^You/]</lang>


Substitute
Substitute
<lang ruby> puts string.gsub(/ a /,' another ')
<lang ruby>puts string.gsub(/ a /,' another ')
#or
#or
string[/ a /]='another'
string[/ a /]='another'
puts string</lang>
puts string</lang>


Substitute using block
Substitute using block
<lang ruby> puts(string.gsub(/\bam\b/) do |match|
<lang ruby>puts(string.gsub(/\bam\b/) do |match|
puts "I found #{match}"
puts "I found #{match}"
#place "was" instead of the match
#place "was" instead of the match
"was"
"was"
end)</lang>
end)</lang>


=={{header|Scala}}==
=={{header|Scala}}==
Line 554: Line 537:
val Bottles2 = """(\d+) bottles of beer""".r // using triple-quotes to preserve backslashes
val Bottles2 = """(\d+) bottles of beer""".r // using triple-quotes to preserve backslashes
val Bottles3 = new scala.util.matching.Regex("(\\d+) bottles of beer") // standard
val Bottles3 = new scala.util.matching.Regex("(\\d+) bottles of beer") // standard
val Bottles4 = new scala.util.matching.Regex("""(\d+) bottles of beer""", "bottles") // with named groups
val Bottles4 = new scala.util.matching.Regex("""(\d+) bottles of beer""", "bottles") // with named groups</lang>
</lang>


Search and replace with string methods:
Search and replace with string methods:
<lang scala>"99 bottles of beer" matches "(\\d+) bottles of beer" // the full string must match
<lang scala>"99 bottles of beer" matches "(\\d+) bottles of beer" // the full string must match
"99 bottles of beer" replace ("99", "98") // Single replacement
"99 bottles of beer" replace ("99", "98") // Single replacement
"99 bottles of beer" replaceAll ("b", "B") // Multiple replacement
"99 bottles of beer" replaceAll ("b", "B") // Multiple replacement</lang>
</lang>


Search with regex methods:
Search with regex methods:
Line 569: Line 550:
Bottles4 findFirstMatchIn "99 bottles of beer" // returns a "Match" object, or None
Bottles4 findFirstMatchIn "99 bottles of beer" // returns a "Match" object, or None
Bottles4 findPrefixMatchOf "99 bottles of beer" // same thing, for prefixes
Bottles4 findPrefixMatchOf "99 bottles of beer" // same thing, for prefixes
val bottles = (Bottles4 findFirstMatchIn "99 bottles of beer").get.group("bottles") // Getting a group by name
val bottles = (Bottles4 findFirstMatchIn "99 bottles of beer").get.group("bottles") // Getting a group by name</lang>
</lang>


Using pattern matching with regex:
Using pattern matching with regex:
Line 585: Line 565:
for {
for {
matched <- "(\\w+)".r findAllIn "99 bottles of beer" matchData // matchData converts to an Iterator of Match
matched <- "(\\w+)".r findAllIn "99 bottles of beer" matchData // matchData converts to an Iterator of Match
} println("Matched from "+matched.start+" to "+matched.end)
} println("Matched from "+matched.start+" to "+matched.end)</lang>
</lang>


Replacing with regex:
Replacing with regex:
<lang scala>Bottles2 replaceFirstIn ("99 bottles of beer", "98 bottles of beer")
<lang scala>Bottles2 replaceFirstIn ("99 bottles of beer", "98 bottles of beer")
Bottles3 replaceAllIn ("99 bottles of beer", "98 bottles of beer")
Bottles3 replaceAllIn ("99 bottles of beer", "98 bottles of beer")</lang>
</lang>


=={{header|Slate}}==
=={{header|Slate}}==
Line 597: Line 575:
This library is still in its early stages. There isn't currently a feature to replace a substring.
This library is still in its early stages. There isn't currently a feature to replace a substring.


<lang slate>(Regex Matcher newOn: '^(([^:/?#]+)\\:)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?')
<lang slate>
(Regex Matcher newOn: '^(([^:/?#]+)\\:)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?')
`>> [match: 'http://slatelanguage.org/test/page?query'. subexpressionMatches]
`>> [match: 'http://slatelanguage.org/test/page?query'. subexpressionMatches]


"==> {"Dictionary traitsWindow" 0 -> 'http:'. 1 -> 'http'. 2 -> '//slatelanguage.org'.
"==> {"Dictionary traitsWindow" 0 -> 'http:'. 1 -> 'http'. 2 -> '//slatelanguage.org'.
3 -> 'slatelanguage.org'. 4 -> '/test/page'. 5 -> '?query'. 6 -> 'query'. 7 -> Nil}"
3 -> 'slatelanguage.org'. 4 -> '/test/page'. 5 -> '?query'. 6 -> 'query'. 7 -> Nil}"</lang>

</lang>


=={{header|Smalltalk}}==
=={{header|Smalltalk}}==
Line 652: Line 627:
Toka's regular expression library allows for matching, but does not yet provide for replacing elements within strings.
Toka's regular expression library allows for matching, but does not yet provide for replacing elements within strings.


#! Include the regex library
<lang toka>#! Include the regex library
needs regex
needs regex

#! The two test strings
#! The two test strings
" This is a string" is-data test.1
" This is a string" is-data test.1
" Another string" is-data test.2
" Another string" is-data test.2

#! Create a new regex named 'expression' which tries
#! Create a new regex named 'expression' which tries
#! to match strings beginning with 'This'.
#! to match strings beginning with 'This'.
" ^This" regex: expression
" ^This" regex: expression

#! An array to store the results of the match
#! An array to store the results of the match
#! (Element 0 = starting offset, Element 1 = ending offset of match)
#! (Element 0 = starting offset, Element 1 = ending offset of match)
2 cells is-array match
2 cells is-array match

#! Try both test strings against the expression.
#! Try both test strings against the expression.
#! try-regex will return a flag. -1 is TRUE, 0 is FALSE
#! try-regex will return a flag. -1 is TRUE, 0 is FALSE
expression test.1 2 match try-regex .
expression test.1 2 match try-regex .
expression test.2 2 match try-regex .
expression test.2 2 match try-regex .</lang>


=={{header|Vedit macro language}}==
=={{header|Vedit macro language}}==
Line 676: Line 651:


Match text at cursor location:
Match text at cursor location:
<lang vedit>
<lang vedit>if (Match(".* string$", REGEXP)==0) {
if (Match(".* string$", REGEXP)==0) {
Statline_Message("This line ends with 'string'")
Statline_Message("This line ends with 'string'")
}</lang>
}
</lang>


Search for a pattern:
Search for a pattern:
<lang vedit>
<lang vedit>if (Search("string$", REGEXP+NOERR)) {
if (Search("string$", REGEXP+NOERR)) {
Statline_Message("'string' at end of line found")
Statline_Message("'string' at end of line found")
}</lang>
}
</lang>


Replace:
Replace:
<lang vedit>Replace(" a ", " another ", REGEXP+NOERR)</lang>
<lang vedit>
Replace(" a ", " another ", REGEXP+NOERR)
</lang>

Revision as of 20:35, 21 November 2009

Task
Regular expressions
You are encouraged to solve this task according to the task description, using any language you may know.

The goal of this task is

  • to match a string against a regular expression
  • to substitute part of a string using a regular expression
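
Both goals can be sketched in a few lines. The following is a minimal illustration that assumes Python's standard `re` module (Python is not among the languages shown in this part of the page, and the variable names are only illustrative):

```python
import re

text = "I am a string"

# Match: re.search scans the string; "$" anchors the pattern at the end.
ends_with_string = re.search(r"string$", text) is not None

# Substitute: re.sub returns a new string with every match replaced.
replaced = re.sub(r" a ", " another ", text)

print(ends_with_string)
print(replaced)
```

The per-language entries below follow the same two steps, each with its own library or built-in syntax.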

AppleScript

Library: Satimage.osax

<lang applescript>try

   find text ".*string$" in "I am a string" with regexp

on error message

   return message

end try

try

   change "original" into "modified" in "I am the original string" with regexp

on error message

   return message

end try</lang>

ALGOL 68

The routines grep in strings and sub in string are not part of ALGOL 68's standard prelude.

Works with: ALGOL 68G version Any - tested with release mk15-0.8b.fc9.i386

<lang algol68>INT match=0, no match=1, out of memory error=2, other error=3;

STRING str := "i am a string";

# Match: #

STRING m := "string$"; INT start, end; IF grep in string(m, str, start, end) = match THEN printf(($"Ends with """g""""l$, str[start:end])) FI;

# Replace: #

IF sub in string(" a ", " another ",str) = match THEN printf(($gl$, str)) FI;</lang> Output:

Ends with "string"
i am another string

Standard ALGOL 68 does have a primordial form of pattern matching called a format. It is designed to extract values from input data, but it can also be used for outputting (and transputting) the original data.

Works with: ALGOL 68 version Standard - But declaring book as flex[]flex[]string
Works with: ALGOL 68G version Any - tested with release mk15-0.8b.fc9.i386

For example:<lang algol68>FORMAT pattern = $ddd" "c("cats","dogs")$; FILE file; STRING book; associate(file, book); on value error(file, (REF FILE f)BOOL: stop); on format error(file, (REF FILE f)BOOL: stop);

book := "100 dogs"; STRUCT(INT count, type) dalmatians;

getf(file, (pattern, dalmatians)); print(("Dalmatians: ", dalmatians, new line)); count OF dalmatians +:=1; printf(($"Gives: "$, pattern, dalmatians, $l$))</lang> Output:

Dalmatians:        +100         +2
Gives 101 dogs
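
For comparison, a rough regular-expression analogue of the format `$ddd" "c("cats","dogs")$` can be sketched as follows. This assumes Python's standard `re` module and is not equivalent to ALGOL 68 formatted transput. The names `CHOICES`, `kind` and `result` are hypothetical, and the 1-based index only mimics how the choice pattern is reported in the output above:

```python
import re

CHOICES = ("cats", "dogs")

# Three digits, one space, then one of the listed alternatives.
pattern = re.compile(r"(\d{3}) (cats|dogs)")

m = pattern.fullmatch("100 dogs")
count = int(m.group(1))
kind = CHOICES.index(m.group(2)) + 1  # 1-based position of the alternative

# Increment the count and write the record back in the same layout.
count += 1
result = f"{count:03d} {CHOICES[kind - 1]}"
print(result)
```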

AutoHotkey

<lang AutoHotkey>MsgBox % foundpos := RegExMatch("Hello World", "World$")
MsgBox % replaced := RegExReplace("Hello World", "World$", "yourself")</lang>

AWK

AWK supports regular expressions, which are typically marked up with slashes in front and back, and the "~" operator:
<lang awk>$ awk '{if($0~/[A-Z]/)print "uppercase detected"}'
abc
ABC
uppercase detected</lang>
As shorthand, a regular expression in the condition part fires if it matches an input line:
<lang awk>awk '/[A-Z]/{print "uppercase detected"}'
def
DeF
uppercase detected</lang>
For substitution, the first argument can be a regular expression, while the replacement string is constant (except that '&' in it receives the value of the match):
<lang awk>$ awk '{gsub(/[A-Z]/,"*");print}'
abCDefG
ab**ef*
$ awk '{gsub(/[A-Z]/,"(&)");print}'
abCDefGH
ab(C)(D)ef(G)(H)</lang>
This variant matches one or more uppercase letters in one round:
<lang awk>$ awk '{gsub(/[A-Z]+/,"(&)");print}'
abCDefGH
ab(CD)ef(GH)</lang>
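
The role of '&' in an AWK replacement string has a close counterpart in other regex libraries. As a sketch, assuming Python's standard `re` module, the whole-match reference is written `\g<0>` (the variable names are illustrative):

```python
import re

# "\g<0>" stands for the entire match, much like "&" in AWK's gsub.
each_letter = re.sub(r"[A-Z]", r"(\g<0>)", "abCDefGH")

# With "+", each run of uppercase letters is matched and wrapped once.
each_run = re.sub(r"[A-Z]+", r"(\g<0>)", "abCDefGH")

print(each_letter)
print(each_run)
```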

C

Works with: POSIX

As far as I can see, POSIX defines functions for regex matching but nothing for substitution, so the replacement has to be built by hand from the match offsets. The complex-looking code below could be wrapped in a function.

<lang c>#include <stdio.h>

#include <stdlib.h>
#include <sys/types.h>
#include <regex.h>
#include <string.h>

int main() {

  regex_t preg;
  regmatch_t substmatch[1];
  const char *tp = "string$";
  const char *t1 = "this is a matching string";
  const char *t2 = "this is not a matching string!";
  const char *ss = "istyfied";
  
  regcomp(&preg, "string$", REG_EXTENDED);
  printf("'%s' %smatched with '%s'\n", t1,
                                       (regexec(&preg, t1, 0, NULL, 0)==0) ? "" : "did not ", tp);
  printf("'%s' %smatched with '%s'\n", t2,
                                       (regexec(&preg, t2, 0, NULL, 0)==0) ? "" : "did not ", tp);
  regfree(&preg);
  /* replace the first match of "a[a-z]+" with "istyfied" */
  regcomp(&preg, "a[a-z]+", REG_EXTENDED);
  if ( regexec(&preg, t1, 1, substmatch, 0) == 0 )
  {
     //fprintf(stderr, "%d, %d\n", substmatch[0].rm_so, substmatch[0].rm_eo);
     char *ns = malloc(substmatch[0].rm_so + 1 + strlen(ss) +
                       (strlen(t1) - substmatch[0].rm_eo) + 2);
     memcpy(ns, t1, substmatch[0].rm_so+1);
     memcpy(&ns[substmatch[0].rm_so], ss, strlen(ss));
     memcpy(&ns[substmatch[0].rm_so+strlen(ss)], &t1[substmatch[0].rm_eo],
               strlen(&t1[substmatch[0].rm_eo]));
     ns[ substmatch[0].rm_so + strlen(ss) +
         strlen(&t1[substmatch[0].rm_eo]) ] = 0;
     printf("mod string: '%s'\n", ns);
     free(ns); 
  } else {
     printf("the string '%s' is the same: no matching!\n", t1);
  }
  regfree(&preg);
  
  return 0;

}</lang>
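
The splice that the C code performs with `rm_so` and `rm_eo` (text before the match, then the replacement, then the text after the match) can be wrapped in a function. The following is only a sketch of that idea in Python, assuming the standard `re` module; `substitute_first` is a hypothetical helper name, not part of any library:

```python
import re

def substitute_first(pattern, replacement, text):
    """Replace the first match of pattern in text by splicing on match offsets."""
    m = re.search(pattern, text)
    if m is None:
        # No match: the string is returned unchanged.
        return text
    # m.start() and m.end() play the role of rm_so and rm_eo.
    return text[:m.start()] + replacement + text[m.end():]

modified = substitute_first(r"a[a-z]+", "istyfied", "this is a matching string")
print(modified)
```

Note that the first match is "atching" inside "matching", because the standalone "a" is not followed by a lowercase letter.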

C++

Works with: g++ version 4.0.2
Library: Boost

<lang cpp>#include <iostream>

#include <string>
#include <iterator>
#include <boost/regex.hpp>

int main() {

 boost::regex re(".* string$");
 std::string s = "Hi, I am a string";
 // match the complete string
 if (boost::regex_match(s, re))
   std::cout << "The string matches.\n";
 else
   std::cout << "Oops - not found?\n";
 // match a substring
 boost::regex re2(" a.*a");
 boost::smatch match;
 if (boost::regex_search(s, match, re2))
 {
   std::cout << "Matched " << match.length()
             << " characters starting at " << match.position() << ".\n";
   std::cout << "Matched character sequence: \""
             << match.str() << "\"\n";
 }
 else
 {
   std::cout << "Oops - not found?\n";
 }
 // replace a substring
 std::string dest_string;
 boost::regex_replace(std::back_inserter(dest_string),
                      s.begin(), s.end(),
                      re2,
                      "'m now a changed");
 std::cout << dest_string << std::endl;

}</lang>

C#

<lang csharp>using System;
using System.Text.RegularExpressions;

class Program {

   static void Main(string[] args) {
       string str = "I am a string";
       if (new Regex("string$").IsMatch(str)) {
           Console.WriteLine("Ends with string.");
       }
       str = new Regex(" a ").Replace(str, " another ");
       Console.WriteLine(str);
   }

}</lang>

Common Lisp

Translation of: Perl

Uses CL-PPCRE - Portable Perl-compatible regular expressions for Common Lisp.

<lang lisp>(let ((string "I am a string"))

 (when (cl-ppcre:scan "string$" string)
   (write-line "Ends with string"))
 (unless (cl-ppcre:scan "^You" string )
   (write-line "Does not start with 'You'")))</lang>

Substitute

<lang lisp>(let* ((string "I am a string")

      (string (cl-ppcre:regex-replace " a " string " another ")))
 (write-line string))</lang>

Test and Substitute

<lang lisp>(let ((string "I am a string"))

 (multiple-value-bind (string matchp)
     (cl-ppcre:regex-replace "\\bam\\b" string "was")
   (when matchp
     (write-line "I was able to find and replace 'am' with 'was'."))))</lang>

D

<lang d>import std.stdio, std.regexp;

void main() {

   string s = "I am a string";
   // Test:
   if (search(s, r"string$"))
       writefln("Ends with 'string'");
   // Test, storing the regular expression:
   auto re1 = RegExp(r"string$");
   if (re1.search(s).test)
       writefln("Ends with 'string'");
   // Substitute:
   writefln(sub(s, " a ", " another "));
   // Substitute, storing the regular expression:
   auto re2 = RegExp(" a ");
   writefln(re2.replace(s, " another "));

}</lang>

Note that in std.string there are string functions to perform those string operations in a faster way.

Erlang

<lang erlang>match() ->
    String = "This is a string",
    case re:run(String, "string$") of
        {match,_} -> io:format("Ends with 'string'~n");
        _ -> ok
    end.

substitute() ->
    String = "This is a string",
    NewString = re:replace(String, " a ", " another ", [{return, list}]),
    io:format("~s~n",[NewString]).</lang>

Forth

Test/Match <lang forth>include ffl/rgx.fs

\ Create a regular expression variable 'exp' in the dictionary

rgx-create exp

\ Compile an expression

s" Hello (World)" exp rgx-compile [IF]

 .( Regular expression successful compiled.) cr

[THEN]

\ (Case sensitive) match a string with the expression

s" Hello World" exp rgx-cmatch? [IF]

 .( String matches with the expression.) cr

[ELSE]

 .( No match.) cr

[THEN]</lang>

Haskell

Test <lang haskell>import Text.Regex

str = "I am a string"

case matchRegex (mkRegex ".*string$") str of

 Just _  -> putStrLn $ "ends with 'string'"
 Nothing -> return ()</lang>

Substitute <lang haskell>import Text.Regex

orig = "I am the original string"
result = subRegex (mkRegex "original") orig "modified"
putStrLn $ result</lang>

J

J's regex support is built on top of PCRE.

<lang j>load'regex'               NB.  Load regex library
str =: 'I am a string'   NB.  String used in examples.</lang>

Matching:

<lang j>   '.*string$' rxeq str     NB.  1 is true, 0 is false
1</lang>

Substitution:

<lang j>   ('am';'am still') rxrplc str
I am still a string</lang>

Java

Works with: Java version 1.5+

Test

<lang java>String str = "I am a string";
if (str.matches(".*string$")) {

 System.out.println("ends with 'string'");

}</lang>

Substitute

<lang java>String orig = "I am the original string";
String result = orig.replaceAll("original", "modified");
// result is now "I am the modified string"</lang>

JavaScript

Test/Match
<lang javascript>var subject = "Hello world!";

// Two different ways to create the RegExp object
// Both examples use the exact same pattern... matching "hello"
var re_PatternToMatch = /Hello (World)/i; // creates a RegExp literal with case-insensitivity
var re_PatternToMatch2 = new RegExp("Hello (World)", "i");

// Test for a match - return a bool
var isMatch = re_PatternToMatch.test(subject);

// Get the match details
// Returns an array with the match's details
// matches[0] == "Hello world"
// matches[1] == "world"
var matches = re_PatternToMatch2.exec(subject);</lang>

Substitute
<lang javascript>var subject = "Hello world!";

// Perform a string replacement
// newSubject == "Replaced!"
var newSubject = subject.replace(re_PatternToMatch, "Replaced");</lang>

M4

<lang M4>regexp(`GNUs not Unix', `\<[a-z]\w+')
regexp(`GNUs not Unix', `\<[a-z]\(\w+\)', `a \& b \1 c')</lang>

Output:

5
a not b ot c

Objective-C

Test

Works with: Mac OS X version 10.4+

<lang objc>NSString *str = @"I am a string";
NSString *regex = @".*string$";

NSPredicate *pred = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex];

if ([pred evaluateWithObject:str]) {
    NSLog(@"ends with 'string'");
}</lang>
Unfortunately this method cannot find the location of the match or do substitution.

OCaml

With the standard library

Test
<lang ocaml>#load "str.cma";;
let str = "I am a string";;
try
  ignore(Str.search_forward (Str.regexp ".*string$") str 0);
  print_endline "ends with 'string'"
with Not_found -> ()</lang>

Substitute
<lang ocaml>#load "str.cma";;
let orig = "I am the original string";;
let result = Str.global_replace (Str.regexp "original") "modified" orig;;
(* result is now "I am the modified string" *)</lang>

Using Pcre

Library: ocaml-pcre

<lang ocaml>let matched pat str =
  try ignore(Pcre.exec ~pat str); (true)
  with Not_found -> (false)

let () =
  Printf.printf "matched = %b\n" (matched "string$" "I am a string");
  Printf.printf "Substitute: %s\n"
    (Pcre.replace ~pat:"original" ~templ:"modified" "I am the original string")</lang>

Perl

Works with: Perl version 5.8.8

Test
<lang perl>$string = "I am a string";
if ($string =~ /string$/) {
   print "Ends with 'string'\n";
}

if ($string !~ /^You/) {
   print "Does not start with 'You'\n";
}</lang>

Substitute
<lang perl>$string = "I am a string";
$string =~ s/ a / another /; # makes "I am a string" into "I am another string"
print $string;</lang>

Test and Substitute
<lang perl>$string = "I am a string";
if ($string =~ s/\bam\b/was/) { # \b is a word border
   print "I was able to find and replace 'am' with 'was'\n";
}</lang>

Options
<lang perl># add the following just after the last / for additional control
# g = globally (match as many as possible)
# i = case-insensitive
# s = treat all of $string as a single line (. also matches line breaks in the content)
# m = multi-line (^ and $ match at the start and end of each line)

$string =~ s/i/u/ig; # would change "I am a string" into "u am a strung"</lang>

PHP

Works with: PHP version 5.2.0

<lang php> $string = 'I am a string';</lang>

Test

<lang php>if (preg_match('/string$/', $string)) {
    echo "Ends with 'string'\n";
}</lang>

Replace

<lang php>$string = preg_replace('/\ba\b/', 'another', $string);
echo "Found 'a' and replaced it with 'another', resulting in this string: $string\n";</lang>

PowerShell

<lang powershell>"I am a string" -match '\bstr'       # true
"I am a string" -replace 'a\b','no'  # I am no string</lang>
By default both the -match and -replace operators are case-insensitive. They can be made case-sensitive by using the -cmatch and -creplace operators.

Python

<lang python>import re

string = "This is a string"

if re.search('string$', string):
    print("Ends with string.")

string = re.sub(" a ", " another ", string)
print(string)</lang>
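
The match object returned by re.search can also report where the match occurred and what a group captured, and re.sub accepts a count and flags. The following is only a minimal sketch of those calls; the subject string and the variable names (subject, last_word, span, replaced) are chosen here for illustration and are not part of the task.

<lang python>import re

# Illustrative subject string (an assumption, not taken from the task)
subject = "This is a string"

# re.search returns a match object, or None when nothing matches
m = re.search(r"(\w+)$", subject)
if m:
    last_word = m.group(1)        # text captured by the first group
    span = (m.start(), m.end())   # start and end offsets of the whole match
    print(last_word, span)

# count=1 replaces only the first occurrence; re.IGNORECASE ignores case
replaced = re.sub(r"this", "That", subject, count=1, flags=re.IGNORECASE)
print(replaced)</lang>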

R

First, define some strings.
<lang R>pattern <- "string"
text1 <- "this is a matching string"
text2 <- "this does not match"</lang>
Matching with grep. The indices of the texts containing matches are returned.
<lang R>grep(pattern, c(text1, text2))  # 1</lang>
Matching with regexpr. The positions of the starts of the matches are returned, along with the lengths of the matches.
<lang R>regexpr(pattern, c(text1, text2))</lang>

[1] 20 -1
attr(,"match.length")
[1]  6 -1

Replacement <lang R>gsub(pattern, "pair of socks", c(text1, text2))</lang>

[1] "this is a matching pair of socks" "this does not match"

Raven

<lang raven>'i am a string' as str</lang>

Match:

<lang raven>str m/string$/ if "Ends with 'string'\n" print</lang>

Replace:

<lang raven>str r/ a / another / print</lang>

Ruby

Test
<lang ruby>string="I am a string"
puts "Ends with 'string'" if string[/string$/]
puts "Does not start with 'You'" if !string[/^You/]</lang>

Substitute
<lang ruby>puts string.gsub(/ a /,' another ')
# or
string[/ a /]=' another '
puts string</lang>

Substitute using block
<lang ruby>puts(string.gsub(/\bam\b/) do |match|
       puts "I found #{match}"
       #place "was" instead of the match
       "was"
     end)</lang>

Scala

Define
<lang scala>val Bottles1 = "(\\d+) bottles of beer".r // syntactic sugar
val Bottles2 = """(\d+) bottles of beer""".r // using triple-quotes to preserve backslashes
val Bottles3 = new scala.util.matching.Regex("(\\d+) bottles of beer") // standard
val Bottles4 = new scala.util.matching.Regex("""(\d+) bottles of beer""", "bottles") // with named groups</lang>

Search and replace with string methods:
<lang scala>"99 bottles of beer" matches "(\\d+) bottles of beer" // the full string must match
"99 bottles of beer" replace ("99", "98") // Literal (non-regex) replacement of every occurrence
"99 bottles of beer" replaceAll ("b", "B") // Regex replacement of every occurrence</lang>

Search with regex methods:
<lang scala>"\\d+".r findFirstIn "99 bottles of beer" // returns first partial match, or None
"\\w+".r findAllIn "99 bottles of beer" // returns all partial matches as an iterator
"\\s+".r findPrefixOf "99 bottles of beer" // returns a matching prefix, or None
Bottles4 findFirstMatchIn "99 bottles of beer" // returns a "Match" object, or None
Bottles4 findPrefixMatchOf "99 bottles of beer" // same thing, for prefixes
val bottles = (Bottles4 findFirstMatchIn "99 bottles of beer").get.group("bottles") // Getting a group by name</lang>

Using pattern matching with regex:
<lang scala>val Some(bottles) = Bottles4 findPrefixOf "99 bottles of beer" // throws an exception if the matching fails; full string must match
for {
  line <- """|99 bottles of beer on the wall
             |99 bottles of beer
             |Take one down, pass it around
             |98 bottles of beer on the wall""".stripMargin.lines
} line match {
  case Bottles1(bottles) => println("There are still "+bottles+" bottles.") // full string must match, so this will match only once
  case _ =>
}
for {
  matched <- "(\\w+)".r findAllIn "99 bottles of beer" matchData // matchData converts to an Iterator of Match
} println("Matched from "+matched.start+" to "+matched.end)</lang>

Replacing with regex:
<lang scala>Bottles2 replaceFirstIn ("99 bottles of beer", "98 bottles of beer")
Bottles3 replaceAllIn ("99 bottles of beer", "98 bottles of beer")</lang>

Slate

This library is still in its early stages. There isn't currently a feature to replace a substring.

<lang slate>(Regex Matcher newOn: '^(([^:/?#]+)\\:)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?')
    `>> [match: 'http://slatelanguage.org/test/page?query'. subexpressionMatches]

"==> {"Dictionary traitsWindow" 0 -> 'http:'. 1 -> 'http'. 2 -> '//slatelanguage.org'.
       3 -> 'slatelanguage.org'. 4 -> '/test/page'. 5 -> '?query'. 6 -> 'query'. 7 -> Nil}"</lang>

Smalltalk

<lang smalltalk>|re s s1|
re := Regex fromString: '[a-z]+ing'.
s := 'this is a matching string'.
s1 := 'this does not match'.

(s =~ re) ifMatched: [ :b |
   b match displayNl
].
(s1 =~ re) ifMatched: [ :b |
   'Strangely matched!' displayNl
] ifNotMatched: [
   'no match!' displayNl
].

(s replacingRegex: re with: 'modified') displayNl.</lang>

Tcl

Test using regexp:
<lang tcl>set theString "I am a string"
if {[regexp -- {string$} $theString]} {
    puts "Ends with 'string'"
}

if {![regexp -- {^You} $theString]} {
    puts "Does not start with 'You'"
}</lang>

Extract substring using regexp:
<lang tcl>set theString "This string has >123< a number in it"
if {[regexp -- {>(\d+)<} $theString -> number]} {
    puts "Contains the number $number"
}</lang>

Substitute using regsub:
<lang tcl>set theString "I am a string"
puts [regsub -- { +a +} $theString { another }]</lang>

Toka

Toka's regular expression library allows for matching, but does not yet provide for replacing elements within strings.

<lang toka>#! Include the regex library
needs regex

#! The two test strings
" This is a string" is-data test.1
" Another string" is-data test.2

#! Create a new regex named 'expression' which tries
#! to match strings beginning with 'This'.
" ^This" regex: expression

#! An array to store the results of the match
#! (Element 0 = starting offset, Element 1 = ending offset of match)
2 cells is-array match

#! Try both test strings against the expression.
#! try-regex will return a flag. -1 is TRUE, 0 is FALSE
expression test.1 2 match try-regex .
expression test.2 2 match try-regex .</lang>

Vedit macro language

Vedit can perform searches and matching with either regular expressions, pattern matching codes or plain text. These examples use regular expressions.

Match text at cursor location:
<lang vedit>if (Match(".* string$", REGEXP)==0) {
    Statline_Message("This line ends with 'string'")
}</lang>

Search for a pattern:
<lang vedit>if (Search("string$", REGEXP+NOERR)) {
    Statline_Message("'string' at end of line found")
}</lang>

Replace: <lang vedit>Replace(" a ", " another ", REGEXP+NOERR)</lang>