XXXX redacted: Difference between revisions
→{{header|Go}}: Rewritten to eliminate many of the previous problems. Now does all the Raku test cases.
Thundergnat (talk | contribs) (→{{header|Phix}}: try to make clear for other readers that a stretch goal is indeed a stretch)
Line 63:
=={{header|Go}}==
{{libheader|Unicode Text Segmentation for Go}}
Go's strings are sequences of bytes that decode to Unicode code points ('runes'), and the language has no built-in notion of a grapheme. A zero width joiner (ZWJ) emoji, such as the final one in the test string, is therefore not recognized as a single 'character': it consists of five code points instead of one. The problem is aggravated (as here) when one of the constituents of the ZWJ emoji is itself a 'normal' emoji contained in the same test string!
Care is therefore needed to ensure that when a normal emoji is being redacted it doesn't also redact one of the constituents of a ZWJ emoji.
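The guard described above can be illustrated in isolation. The following is a minimal sketch using only the standard library; the helper name <code>partOfZWJ</code> is hypothetical, and the emoji are written as escapes so that the joiners are visible.

```go
package main

import (
	"fmt"
	"strings"
)

const zwj = "\u200d" // ZERO WIDTH JOINER

// partOfZWJ (hypothetical helper) reports whether match occurs in w
// directly next to a zero width joiner, i.e. as one constituent of a
// ZWJ sequence rather than as a free-standing emoji.
func partOfZWJ(w, match string) bool {
	return strings.Contains(w, match+zwj) || strings.Contains(w, zwj+match)
}

func main() {
	man := "\U0001F468"
	// man, woman and boy joined by ZWJs into one family emoji
	family := man + zwj + "\U0001F469" + zwj + "\U0001F466"

	fmt.Println(partOfZWJ(man, man))    // free-standing emoji: false
	fmt.Println(partOfZWJ(family, man)) // constituent of the family emoji: true
}
```

A word for which the check returns true is left alone, so redacting the 'man' emoji does not damage the family emoji.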
To get the number of 'X's right when a ZWJ emoji or other character combination is replaced, the third-party uniseg library is used: its GraphemeClusterCount function counts the graphemes in a string, as required by the task.
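To see why a grapheme count is needed, compare what the standard library reports for a ZWJ family emoji. This is only a sketch: the helper name <code>sizes</code> is hypothetical, and it deliberately does not call the third-party library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// sizes (hypothetical helper) returns the number of bytes and the
// number of code points ('runes') in s.
func sizes(s string) (bytes, runes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	// man, ZWJ, woman, ZWJ, boy: rendered as one family emoji
	family := "\U0001F468\u200d\U0001F469\u200d\U0001F466"

	b, r := sizes(family)
	fmt.Println(b) // 18 bytes
	fmt.Println(r) // 5 code points
}
```

Neither figure is the single 'X' that the task expects for this emoji, which is why the count has to be made in graphemes.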
<lang go>package main
import (
"fmt"
"github.com/rivo/uniseg"
"log"
"regexp"
"strings"
)
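// join interleaves the words with the separators found between them,
// rebuilding the text after redaction.
func join(words, seps []string) string {
lw := len(words)
ls := len(seps)
if lw != ls+1 {
log.Fatal("mismatch between number of words and separators")
}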
var sb strings.Builder
for i := 0; i < ls; i++ {
sb.WriteString(words[i])
sb.WriteString(seps[i])
}
sb.WriteString(words[lw-1])
return sb.String()
}
func redact(text, word, opts string) {
var partial, overkill bool
exp := word
if strings.IndexByte(opts, 'p') >= 0 {
partial = true
}
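if strings.IndexByte(opts, 'o') >= 0 {
overkill = true
}
if strings.IndexByte(opts, 'i') >= 0 {
exp = `(?i)` + exp // case-insensitive matching
}
rgx := regexp.MustCompile(`[\s!-&(-,./:-@[-^{-~]+`) // all punctuation except -'_
seps := rgx.FindAllString(text, -1) // the separators between words
words := rgx.Split(text, -1) // the words themselves
rgx2 := regexp.MustCompile(exp)
for i, w := range words {
match := rgx2.FindString(w)
// check there's a match and it's not part of a ZWJ emoji
if match == "" || strings.Index(w, match+"\u200d") >= 0 ||
strings.Index(w, "\u200d"+match) >= 0 {
continue
}
switch {
case overkill:
// redact the whole word, one X per grapheme
words[i] = strings.Repeat("X", uniseg.GraphemeClusterCount(w))
case !partial:
// whole-word mode: redact only if the entire word matches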
if words[i] == match {
words[i] = strings.Repeat("X", uniseg.GraphemeClusterCount(w))
}
case partial:
repl := strings.Repeat("X", uniseg.GraphemeClusterCount(word))
words[i] = rgx2.ReplaceAllLiteralString(w, repl)
}
}
fmt.Printf("%s %s\n\n", opts, join(words, seps))
}
func printResults(text string, allOpts, allWords []string) {
fmt.Printf("Text: %s\n\n", text)
for _, word := range allWords {
fmt.Printf("Redact '%s':\n", word)
for _, opts := range allOpts {
redact(text, word, opts)
}
}
fmt.Println()
}
func main() {
text := `Tom? Toms bottom tomato is in his stomach while playing the "Tom-tom" brand tom-toms. That's so tom.
'Tis very tomish, don't you think?`
allOpts := []string{"[w|s|n]", "[w|i|n]", "[p|s|n]", "[p|i|n]", "[p|s|o]", "[p|i|o]"}
allWords := []string{"Tom", "tom", "t"}
printResults(text, allOpts, allWords)
text = "🧑 👨 🧔 👨👩👦"
allWords = []string{"👨", "👨👩👦"}
allOpts = []string{"[w]"}
printResults(text, allOpts, allWords)
text = "Argentina🧑🇦🇹 France👨🇫🇷 Germany🧔🇩🇪 Netherlands👨👩👦🇳🇱"
allOpts = []string{"[p]", "[p|o]"}
printResults(text, allOpts, allWords)
}</lang>
{{out}}
<pre style="height:80ex;overflow:scroll;">
Text: Tom? Toms bottom tomato is in his stomach while playing the "Tom-tom" brand tom-toms. That's so tom.
'Tis very tomish, don't you think?
Redact 'Tom':
[w|s|n] XXX? Toms bottom tomato is in his stomach while playing the "Tom-tom" brand tom-toms. That's so tom.
Line 207 ⟶ 183:
[p|i|o] XXX? XXXX XXXXXX XXXXXX is in his XXXXXXX while playing the "XXXXXXX" brand XXXXXXXX. That's so XXX.
'Tis very XXXXXX, don't you think?
Redact 'tom':
Line 227 ⟶ 202:
[p|i|o] XXX? XXXX XXXXXX XXXXXX is in his XXXXXXX while playing the "XXXXXXX" brand XXXXXXXX. That's so XXX.
'Tis very XXXXXX, don't you think?
Redact 't':
Line 248 ⟶ 222:
XXXX very XXXXXX, XXXXX you XXXXX?
Text: 🧑 👨 🧔 👨👩👦
Redact '👨':
[w] 🧑 X 🧔 👨👩👦
Redact '👨👩👦':
[w] 🧑 👨 🧔 X
Text: Argentina🧑🇦🇹 France👨🇫🇷 Germany🧔🇩🇪 Netherlands👨👩👦🇳🇱
Redact '👨':
[p] Argentina🧑🇦🇹 FranceX🇫🇷 Germany🧔🇩🇪 Netherlands👨👩👦🇳🇱
[p|o] Argentina🧑🇦🇹 XXXXXXXX Germany🧔🇩🇪 Netherlands👨👩👦🇳🇱
Redact '👨👩👦':
[p] Argentina🧑🇦🇹 France👨🇫🇷 Germany🧔🇩🇪 NetherlandsX🇳🇱
[p|o] Argentina🧑🇦🇹 France👨🇫🇷 Germany🧔🇩🇪 XXXXXXXXXXXXX
</pre>
=={{header|Julia}}==