Regular expressions: Difference between revisions
(Added Scala) |
m (Fixed lang tags.) |
Revision as of 20:35, 21 November 2009
You are encouraged to solve this task according to the task description, using any language you may know.
The goal of this task is
- to match a string against a regular expression
- to substitute part of a string using a regular expression
AppleScript
<lang applescript>try
find text ".*string$" in "I am a string" with regexp
on error message
return message
end try
try
change "original" into "modified" in "I am the original string" with regexp
on error message
return message
end try</lang>
ALGOL 68
The routines grep in string and sub in string are not part of ALGOL 68's standard prelude.
<lang algol68>INT match=0, no match=1, out of memory error=2, other error=3;

STRING str := "i am a string";

# Match: #

STRING m := "string$";
INT start, end;
IF grep in string(m, str, start, end) = match THEN printf(($"Ends with """g""""l$, str[start:end])) FI;

# Replace: #

IF sub in string(" a ", " another ",str) = match THEN printf(($gl$, str)) FI;</lang>
Output:
Ends with "string"
i am another string
Standard ALGOL 68 does have a primordial form of pattern matching called a format. It is designed to extract values from input data, but it can also be used for outputting (and transputting) the original data.
For example:<lang algol68>FORMAT pattern = $ddd" "c("cats","dogs")$;
FILE file; STRING book; associate(file, book);
on value error(file, (REF FILE f)BOOL: stop);
on format error(file, (REF FILE f)BOOL: stop);

book := "100 dogs";
STRUCT(INT count, type) dalmatians;

getf(file, (pattern, dalmatians));
print(("Dalmatians: ", dalmatians, new line));
count OF dalmatians +:=1;
printf(($"Gives: "$, pattern, dalmatians, $l$))</lang>
Output:
Dalmatians: +100 +2
Gives 101 dogs
AutoHotkey
<lang AutoHotkey>MsgBox % foundpos := RegExMatch("Hello World", "World$")
MsgBox % replaced := RegExReplace("Hello World", "World$", "yourself")</lang>
AWK
AWK supports regular expressions, which are typically delimited by slashes, together with the "~" match operator:
<lang awk>$ awk '{if($0~/[A-Z]/)print "uppercase detected"}'
abc
ABC
uppercase detected</lang>
As shorthand, a regular expression in the condition part fires if it matches an input line:
<lang awk>$ awk '/[A-Z]/{print "uppercase detected"}'
def
DeF
uppercase detected</lang>
For substitution, the first argument can be a regular expression, while the replacement string is constant (except that '&' in it is replaced by the matched text):
<lang awk>$ awk '{gsub(/[A-Z]/,"*");print}'
abCDefG
ab**ef*
$ awk '{gsub(/[A-Z]/,"(&)");print}'
abCDefGH
ab(C)(D)ef(G)(H)</lang>
This variant matches one or more uppercase letters in one round:
<lang awk>$ awk '{gsub(/[A-Z]+/,"(&)");print}'
abCDefGH
ab(CD)ef(GH)</lang>
C
As far as I can see, POSIX defines functions for regex matching but nothing for substitution, so we must do all the hard work by hand. The complex-looking code could be turned into a function.
<lang c>#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <regex.h>
#include <string.h>

int main()
{
   regex_t preg;
   regmatch_t substmatch[1];
   const char *tp = "string$";
   const char *t1 = "this is a matching string";
   const char *t2 = "this is not a matching string!";
   const char *ss = "istyfied";

   regcomp(&preg, "string$", REG_EXTENDED);
   printf("'%s' %smatched with '%s'\n", t1,
          (regexec(&preg, t1, 0, NULL, 0)==0) ? "" : "did not ", tp);
   printf("'%s' %smatched with '%s'\n", t2,
          (regexec(&preg, t2, 0, NULL, 0)==0) ? "" : "did not ", tp);
   regfree(&preg);
   /* change "a[a-z]+" into "istyfied"?*/
   regcomp(&preg, "a[a-z]+", REG_EXTENDED);
   if ( regexec(&preg, t1, 1, substmatch, 0) == 0 )
   {
      //fprintf(stderr, "%d, %d\n", substmatch[0].rm_so, substmatch[0].rm_eo);
      char *ns = malloc(substmatch[0].rm_so + 1 + strlen(ss) +
                        (strlen(t1) - substmatch[0].rm_eo) + 2);
      memcpy(ns, t1, substmatch[0].rm_so+1);
      memcpy(&ns[substmatch[0].rm_so], ss, strlen(ss));
      memcpy(&ns[substmatch[0].rm_so+strlen(ss)], &t1[substmatch[0].rm_eo],
             strlen(&t1[substmatch[0].rm_eo]));
      ns[ substmatch[0].rm_so + strlen(ss) +
          strlen(&t1[substmatch[0].rm_eo]) ] = 0;
      printf("mod string: '%s'\n", ns);
      free(ns);
   } else {
      printf("the string '%s' is the same: no matching!\n", t1);
   }
   regfree(&preg);

   return 0;
}</lang>
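The manual splice performed by the C code above (the text before the match, then the replacement, then the text after the match) may be easier to follow in a compact form. The following is only a sketch of that idea in Python, not a translation of the POSIX API; the helper name replace_first is hypothetical and was chosen for this illustration.

```python
import re

def replace_first(pattern, replacement, text):
    """Replace the first match of pattern in text by splicing strings by hand."""
    m = re.search(pattern, text)
    if m is None:
        return text  # no match: the string stays the same
    # prefix before the match + replacement + suffix after the match
    return text[:m.start()] + replacement + text[m.end():]

print(replace_first(r"a[a-z]+", "istyfied", "this is a matching string"))
# prints: this is a mistyfied string
```

The start and end offsets of the match play the same role here as rm_so and rm_eo do in the C version.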
C++
<lang cpp>#include <iostream>
#include <string>
#include <iterator>
#include <boost/regex.hpp>

int main()
{
  boost::regex re(".* string$");
  std::string s = "Hi, I am a string";

  // match the complete string
  if (boost::regex_match(s, re))
    std::cout << "The string matches.\n";
  else
    std::cout << "Oops - not found?\n";

  // match a substring
  boost::regex re2(" a.*a");
  boost::smatch match;
  if (boost::regex_search(s, match, re2))
  {
    std::cout << "Matched " << match.length()
              << " characters starting at " << match.position() << ".\n";
    std::cout << "Matched character sequence: \""
              << match.str() << "\"\n";
  }
  else
  {
    std::cout << "Oops - not found?\n";
  }

  // replace a substring
  std::string dest_string;
  boost::regex_replace(std::back_inserter(dest_string),
                       s.begin(), s.end(),
                       re2,
                       "'m now a changed");
  std::cout << dest_string << std::endl;
}</lang>
C#
<lang csharp>using System;
using System.Text.RegularExpressions;

class Program {
    static void Main(string[] args) {
        string str = "I am a string";

        if (new Regex("string$").IsMatch(str)) {
            Console.WriteLine("Ends with string.");
        }

        str = new Regex(" a ").Replace(str, " another ");
        Console.WriteLine(str);
    }
}</lang>
Common Lisp
Uses CL-PPCRE - Portable Perl-compatible regular expressions for Common Lisp.
<lang lisp>(let ((string "I am a string"))
  (when (cl-ppcre:scan "string$" string)
    (write-line "Ends with string"))
  (unless (cl-ppcre:scan "^You" string )
    (write-line "Does not start with 'You'")))</lang>
Substitute
<lang lisp>(let* ((string "I am a string")
       (string (cl-ppcre:regex-replace " a " string " another ")))
  (write-line string))</lang>
Test and Substitute
<lang lisp>(let ((string "I am a string"))
  (multiple-value-bind (string matchp)
      (cl-ppcre:regex-replace "\\bam\\b" string "was")
    (when matchp
      (write-line "I was able to find and replace 'am' with 'was'."))))</lang>
D
<lang d>import std.stdio, std.regexp;

void main() {
    string s = "I am a string";

    // Test:
    if (search(s, r"string$"))
        writefln("Ends with 'string'");

    // Test, storing the regular expression:
    auto re1 = RegExp(r"string$");
    if (re1.search(s).test)
        writefln("Ends with 'string'");

    // Substitute:
    writefln(sub(s, " a ", " another "));

    // Substitute, storing the regular expression:
    auto re2 = RegExp(" a ");
    writefln(re2.replace(s, " another "));
}</lang>
Note that in std.string there are string functions to perform those string operations in a faster way.
Erlang
<lang erlang>match() ->
    String = "This is a string",
    case re:run(String, "string$") of
        {match,_} -> io:format("Ends with 'string'~n");
        _ -> ok
    end.

substitute() ->
    String = "This is a string",
    NewString = re:replace(String, " a ", " another ", [{return, list}]),
    io:format("~s~n",[NewString]).</lang>
Forth
Test/Match
<lang forth>include ffl/rgx.fs
\ Create a regular expression variable 'exp' in the dictionary
rgx-create exp
\ Compile an expression
s" Hello (World)" exp rgx-compile [IF]
   .( Regular expression successfully compiled.) cr
[THEN]
\ (Case sensitive) match a string with the expression
s" Hello World" exp rgx-cmatch? [IF]
.( String matches with the expression.) cr
[ELSE]
.( No match.) cr
[THEN]</lang>
Haskell
Test
<lang haskell>import Text.Regex

str = "I am a string"

case matchRegex (mkRegex ".*string$") str of
  Just _  -> putStrLn $ "ends with 'string'"
  Nothing -> return ()</lang>
Substitute
<lang haskell>import Text.Regex

orig = "I am the original string"
result = subRegex (mkRegex "original") orig "modified"
putStrLn $ result</lang>
J
J's regex support is built on top of PCRE.
<lang j>load'regex'              NB.  Load regex library
str =: 'I am a string'   NB.  String used in examples.</lang>
Matching:
<lang j>   '.*string$' rxeq str     NB.  1 is true, 0 is false
1</lang>
Substitution:
<lang j>   ('am';'am still') rxrplc str
I am still a string</lang>
Java
Test
<lang java>String str = "I am a string";
if (str.matches(".*string$")) {
  System.out.println("ends with 'string'");
}</lang>
Substitute
<lang java>String orig = "I am the original string";
String result = orig.replaceAll("original", "modified");
// result is now "I am the modified string"</lang>
JavaScript
Test/Match
<lang javascript>var subject = "Hello world!";

// Two different ways to create the RegExp object
// Both examples use the exact same pattern... matching "hello"
var re_PatternToMatch = /Hello (World)/i; // creates a RegExp literal with case-insensitivity
var re_PatternToMatch2 = new RegExp("Hello (World)", "i");

// Test for a match - return a bool
var isMatch = re_PatternToMatch.test(subject);

// Get the match details
// Returns an array with the match's details
//    matches[0] == "Hello world"
//    matches[1] == "world"
var matches = re_PatternToMatch2.exec(subject);</lang>
Substitute
<lang javascript>var subject = "Hello world!";

// Perform a string replacement
//    newSubject == "Replaced!"
var newSubject = subject.replace(re_PatternToMatch, "Replaced");</lang>
M4
<lang M4>regexp(`GNUs not Unix', `\<[a-z]\w+')
regexp(`GNUs not Unix', `\<[a-z]\(\w+\)', `a \& b \1 c')</lang>
Output:
5
a not b ot c
Objective-C
Test
<lang objc>NSString *str = @"I am a string";
NSString *regex = @".*string$";

NSPredicate *pred = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex];

if ([pred evaluateWithObject:str]) {
    NSLog(@"ends with 'string'");
}</lang>
Unfortunately this method cannot find the location of the match or do substitution.
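To illustrate the limitation just mentioned, here is a hedged sketch in Python rather than Objective-C. It contrasts a whole-string test with a substring search that also reports the match location. It assumes, as the leading ".*" in the pattern above suggests, that the predicate tests the whole string; the variable names are chosen only for this illustration.

```python
import re

text = "I am a string"

# Whole-string test: succeeds only if the pattern covers the entire text,
# which is why the pattern needs the leading ".*".
whole = re.fullmatch(r".*string$", text) is not None

# Substring search: also reports where the match starts and ends.
m = re.search(r"string$", text)
location = (m.start(), m.end()) if m else None

print(whole)     # True
print(location)  # (7, 13)
```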
OCaml
With the standard library
Test
<lang ocaml>#load "str.cma";;
let str = "I am a string";;
try
  ignore(Str.search_forward (Str.regexp ".*string$") str 0);
  print_endline "ends with 'string'"
with Not_found -> ()
;;</lang>
Substitute
<lang ocaml>#load "str.cma";;
let orig = "I am the original string";;
let result = Str.global_replace (Str.regexp "original") "modified" orig;;
(* result is now "I am the modified string" *)</lang>
Using Pcre
Library: ocaml-pcre
<lang ocaml>let matched pat str =
  try ignore(Pcre.exec ~pat str); (true)
  with Not_found -> (false)

let () =
  Printf.printf "matched = %b\n" (matched "string$" "I am a string");
  Printf.printf "Substitute: %s\n"
    (Pcre.replace ~pat:"original" ~templ:"modified" "I am the original string")
;;</lang>
Perl
Test
<lang perl>$string = "I am a string";
if ($string =~ /string$/) {
   print "Ends with 'string'\n";
}

if ($string !~ /^You/) {
   print "Does not start with 'You'\n";
}</lang>
Substitute
<lang perl>$string = "I am a string";
$string =~ s/ a / another /; # makes "I am a string" into "I am another string"
print $string;</lang>
Test and Substitute
<lang perl>$string = "I am a string";
if ($string =~ s/\bam\b/was/) { # \b is a word border
   print "I was able to find and replace 'am' with 'was'\n";
}</lang>
Options
<lang perl># add the following just after the last / for additional control
# g = globally (match as many as possible)
# i = case-insensitive
# s = treat all of $string as a single line (. also matches line breaks)
# m = multi-line (^ and $ also match at the start and end of each line)
$string =~ s/i/u/ig; # would change "I am a string" into "u am a strung"</lang>
PHP
<lang php> $string = 'I am a string';</lang>
Test
<lang php>if (preg_match('/string$/', $string)) {
echo "Ends with 'string'\n";
}</lang>
Replace
<lang php>$string = preg_replace('/\ba\b/', 'another', $string);
echo "Found 'a' and replaced it with 'another', resulting in this string: $string\n";</lang>
PowerShell
<lang powershell>"I am a string" -match '\bstr' # true
"I am a string" -replace 'a\b','no' # I am no string</lang>
By default both the -match and -replace operators are case-insensitive. They can be made case-sensitive by using the -cmatch and -creplace operators.
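The difference between the two modes can be sketched in Python, whose <code>re</code> functions are case-sensitive unless <code>re.IGNORECASE</code> is given (the opposite default to PowerShell). The upper-case patterns are chosen only to make the difference visible.

```python
import re

text = "I am a string"

# Case-insensitive, comparable to PowerShell's default -match / -replace.
insensitive_match = re.search(r"\bSTR", text, flags=re.IGNORECASE) is not None
insensitive_replace = re.sub(r"A\b", "no", text, flags=re.IGNORECASE)

# Case-sensitive, comparable to -cmatch / -creplace.
sensitive_match = re.search(r"\bSTR", text) is not None
sensitive_replace = re.sub(r"A\b", "no", text)

print(insensitive_match, insensitive_replace)
print(sensitive_match, sensitive_replace)
```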
Python
<lang python>import re

string = "This is a string"

if re.search('string$', string):
    print("Ends with string.")

string = re.sub(" a ", " another ", string)
print(string)</lang>
R
First, define some strings. <lang R>pattern <- "string"
text1 <- "this is a matching string"
text2 <- "this does not match"</lang>
Matching with grep. The indices of the texts containing matches are returned. <lang R>grep(pattern, c(text1, text2))  # 1</lang>
Matching with regexpr. The positions of the starts of the matches are returned, along with the lengths of the matches. <lang R>regexpr(pattern, c(text1, text2))</lang>
[1] 20 -1
attr(,"match.length")
[1]  6 -1
Replacement <lang R>gsub(pattern, "pair of socks", c(text1, text2))</lang>
[1] "this is a matching pair of socks" "this does not match"
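To make the meaning of those numbers concrete, here is a rough Python sketch that reproduces them. The helper <code>regexpr_like</code> is hypothetical and written only for this illustration; it mimics the 1-based start position and match length reported by regexpr, with -1 standing for "no match".

```python
import re

pattern = "string"
texts = ["this is a matching string", "this does not match"]


def regexpr_like(pattern, text):
    """Return (1-based start, length) of the first match, or (-1, -1)."""
    m = re.search(pattern, text)
    if m is None:
        return (-1, -1)
    return (m.start() + 1, m.end() - m.start())


# Like grep: 1-based indices of the texts that contain a match.
matching_indices = [i + 1 for i, t in enumerate(texts) if re.search(pattern, t)]
print(matching_indices)

# Like regexpr: start position and length of the first match in each text.
positions = [regexpr_like(pattern, t) for t in texts]
print(positions)

# Like gsub: replace every match in each text.
replaced = [re.sub(pattern, "pair of socks", t) for t in texts]
print(replaced)
```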
Raven
<lang raven>'i am a string' as str</lang>
Match:
<lang raven>str m/string$/ if "Ends with 'string'\n" print</lang>
Replace:
<lang raven>str r/ a / another / print</lang>
Ruby
Test <lang ruby>string = "I am a string"
puts "Ends with 'string'" if string[/string$/]
puts "Does not start with 'You'" if !string[/^You/]</lang>
Substitute <lang ruby>puts string.gsub(/ a /, ' another ')
# or
string[/ a /] = ' another '
puts string</lang>
Substitute using block <lang ruby>puts(string.gsub(/\bam\b/) do |match|
  puts "I found #{match}"
  # place "was" instead of the match
  "was"
end)</lang>
Scala
Define <lang scala>val Bottles1 = "(\\d+) bottles of beer".r // syntactic sugar
val Bottles2 = """(\d+) bottles of beer""".r // using triple-quotes to preserve backslashes
val Bottles3 = new scala.util.matching.Regex("(\\d+) bottles of beer") // standard
val Bottles4 = new scala.util.matching.Regex("""(\d+) bottles of beer""", "bottles") // with named groups</lang>
Search and replace with string methods: <lang scala>"99 bottles of beer" matches "(\\d+) bottles of beer" // the full string must match
"99 bottles of beer" replace ("99", "98") // Single replacement
"99 bottles of beer" replaceAll ("b", "B") // Multiple replacement</lang>
Search with regex methods: <lang scala>"\\d+".r findFirstIn "99 bottles of beer" // returns first partial match, or None
"\\w+".r findAllIn "99 bottles of beer" // returns all partial matches as an iterator
"\\s+".r findPrefixOf "99 bottles of beer" // returns a matching prefix, or None
Bottles4 findFirstMatchIn "99 bottles of beer" // returns a "Match" object, or None
Bottles4 findPrefixMatchOf "99 bottles of beer" // same thing, for prefixes
val bottles = (Bottles4 findFirstMatchIn "99 bottles of beer").get.group("bottles") // Getting a group by name</lang>
Using pattern matching with regex: <lang scala>val Some(bottles) = Bottles4 findPrefixOf "99 bottles of beer" // throws a MatchError if the matching fails; only a prefix needs to match
for {
  line <- """|99 bottles of beer on the wall
             |99 bottles of beer
             |Take one down, pass it around
             |98 bottles of beer on the wall""".stripMargin.lines
} line match {
  case Bottles1(bottles) => println("There are still "+bottles+" bottles.") // full string must match, so this will match only once
  case _ =>
}
for {
  matched <- "(\\w+)".r findAllIn "99 bottles of beer" matchData // matchData converts to an Iterator of Match
} println("Matched from "+matched.start+" to "+matched.end)</lang>
Replacing with regex: <lang scala>Bottles2 replaceFirstIn ("99 bottles of beer", "98 bottles of beer")
Bottles3 replaceAllIn ("99 bottles of beer", "98 bottles of beer")</lang>
Slate
This library is still in its early stages. There isn't currently a feature to replace a substring.
<lang slate>(Regex Matcher newOn: '^(([^:/?#]+)\\:)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?')
`>> [match: 'http://slatelanguage.org/test/page?query'. subexpressionMatches]
"==> {"Dictionary traitsWindow" 0 -> 'http:'. 1 -> 'http'. 2 -> '//slatelanguage.org'.
3 -> 'slatelanguage.org'. 4 -> '/test/page'. 5 -> '?query'. 6 -> 'query'. 7 -> Nil}"</lang>
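The pattern used here is the URI-splitting expression from RFC 3986, Appendix B. As a cross-check of the subexpression matches listed above, the same match can be sketched in Python; the name <code>URI_PATTERN</code> is invented for this example, and the groups are numbered from 0 to line up with the Slate output (Python itself numbers them from 1).

```python
import re

# URI-splitting expression from RFC 3986, Appendix B.
URI_PATTERN = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

m = URI_PATTERN.match("http://slatelanguage.org/test/page?query")

# groups() returns all subexpression matches; unmatched ones are None.
parts = m.groups()
for index, part in enumerate(parts):
    print(index, "->", part)
```

The last two groups belong to the fragment part (<code>#...</code>), which this URL does not have, so they come back empty.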
Smalltalk
<lang smalltalk>|re s s1|
re := Regex fromString: '[a-z]+ing'.
s := 'this is a matching string'.
s1 := 'this does not match'.

(s =~ re)
ifMatched: [ :b |
   b match displayNl
].
(s1 =~ re)
ifMatched: [ :b |
   'Strangely matched!' displayNl
]
ifNotMatched: [
   'no match!' displayNl
].

(s replacingRegex: re with: 'modified') displayNl.</lang>
Tcl
Test using regexp:
<lang tcl>set theString "I am a string"
if {[regexp -- {string$} $theString]} {
puts "Ends with 'string'"
}
if {![regexp -- {^You} $theString]} {
puts "Does not start with 'You'"
}</lang>
Extract substring using regexp
<lang tcl>set theString "This string has >123< a number in it"
if {[regexp -- {>(\d+)<} $theString -> number]} {
puts "Contains the number $number"
}</lang>
Substitute using regsub
<lang tcl>set theString "I am a string"
puts [regsub -- { +a +} $theString { another }]</lang>
Toka
Toka's regular expression library allows for matching, but does not yet provide for replacing elements within strings.
<lang toka>#! Include the regex library
needs regex
#! The two test strings
" This is a string" is-data test.1
" Another string" is-data test.2
#! Create a new regex named 'expression' which tries
#! to match strings beginning with 'This'.
" ^This" regex: expression
#! An array to store the results of the match
#! (Element 0 = starting offset, Element 1 = ending offset of match)
2 cells is-array match
#! Try both test strings against the expression.
#! try-regex will return a flag. -1 is TRUE, 0 is FALSE
expression test.1 2 match try-regex .
expression test.2 2 match try-regex .</lang>
Vedit macro language
Vedit can perform searches and matching with either regular expressions, pattern matching codes or plain text. These examples use regular expressions.
Match text at cursor location: <lang vedit>if (Match(".* string$", REGEXP)==0) {
Statline_Message("This line ends with 'string'")
}</lang>
Search for a pattern: <lang vedit>if (Search("string$", REGEXP+NOERR)) {
Statline_Message("'string' at end of line found")
}</lang>
Replace: <lang vedit>Replace(" a ", " another ", REGEXP+NOERR)</lang>