Determine sentence type

You are encouraged to solve this task according to the task description, using any language you may know.

Use these sentences: "hi there, how are you today? I'd like to present to you the washing machine 9001. You have been nominated to win one of these! Just make sure you don't break it"

Task
Find the last punctuation mark used in a sentence and determine the sentence's type from it.
Output one of these letters:
"E" (Exclamation!), "Q" (Question?), "S" (Serious.), "N" (Neutral).
Extra
Make your code able to determine multiple sentences.


Don't leave any errors!
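
As a rough illustration of the task (a minimal sketch, not a reference solution), the classification can be written in Python as below. The names `CODES`, `sentence_type` and `sentence_types` are made up for this example, and the sketch assumes that every "!", "?" or "." ends a sentence, which is naive for real English text.

```python
# Map each sentence-ending mark to the task's letter codes.
CODES = {"!": "E", "?": "Q", ".": "S"}


def sentence_type(sentence):
    """Classify one sentence by its final non-space character."""
    stripped = sentence.rstrip()
    if not stripped:
        return ""  # nothing to classify
    return CODES.get(stripped[-1], "N")  # no closing mark: neutral


def sentence_types(text):
    """Cut the text after every '!', '?' or '.' and classify each piece."""
    pieces, current = [], ""
    for ch in text:
        current += ch
        if ch in CODES:
            pieces.append(current)
            current = ""
    if current.strip():  # trailing text without closing punctuation
        pieces.append(current)
    return "|".join(sentence_type(p) for p in pieces)


if __name__ == "__main__":
    print(sentence_types(
        "hi there, how are you today? I'd like to present to you the "
        "washing machine 9001. You have been nominated to win one of these! "
        "Just make sure you don't break it"
    ))
```

For the task's sample sentences this prints Q|S|E|N.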





ALGOL 68

Classifies an empty string as "".

BEGIN # determine the type of a sentence by looking at the final punctuation  #
CHAR exclamation = "E"; # classification codes... #
CHAR question = "Q";
CHAR serious = "S";
CHAR neutral = "N";
# returns the type(s) of the sentence(s) in s - exclamation, question, #
# serious or neutral; if there are multiple sentences #
# the types are separated by | #
PROC classify = ( STRING s )STRING:
BEGIN
STRING result := "";
BOOL pending neutral := FALSE;
FOR s pos FROM LWB s TO UPB s DO
IF pending neutral := FALSE;
CHAR c = s[ s pos ];
c = "?"
THEN result +:= question + "|"
ELIF c = "!"
THEN result +:= exclamation + "|"
ELIF c = "."
THEN result +:= serious + "|"
ELSE pending neutral := TRUE
FI
OD;
IF pending neutral
THEN result +:= neutral + "|"
FI;
# if s was empty, then return an empty string, otherwise remove the final separator #
IF result = "" THEN "" ELSE result[ LWB result : UPB result - 1 ] FI
END # classify # ;
# task test case #
print( ( classify( "hi there, how are you today? I'd like to present to you the washing machine 9001. "
+ "You have been nominated to win one of these! Just make sure you don't break it"
)
, newline
)
)
END
Output:
Q|S|E|N

AutoHotkey

Sentence := "hi there, how are you today? I'd like to present to you the washing machine 9001. You have been nominated to win one of these! Just make sure you don't break it"
Msgbox, % SentenceType(Sentence)
 
SentenceType(Sentence) {
Sentence := Trim(Sentence)
Loop, Parse, Sentence, .?!
{
N := (!E && !Q && !S)
, S := (InStr(SubStr(Sentence, InStr(Sentence, A_LoopField)+StrLen(A_LoopField), 3), "."))
, Q := (InStr(SubStr(Sentence, InStr(Sentence, A_LoopField)+StrLen(A_LoopField), 3), "?"))
, E := (InStr(SubStr(Sentence, InStr(Sentence, A_LoopField)+StrLen(A_LoopField), 3), "!"))
, type .= (E) ? ("E|") : ((Q) ? ("Q|") : ((S) ? ("S|") : "N|"))
, D := SubStr(Sentence, InStr(Sentence, A_LoopField)+StrLen(A_LoopField), 3)
}
return (D = SubStr(Sentence, 1, 3)) ? RTrim(RTrim(type, "|"), "N|") : RTrim(type, "|")
}
Output:
Q|S|E|N

Factor

This program attempts to prevent common abbreviations from ending sentences early. It also tries to handle parenthesized sentences and implements an additional type for exclamatory questions (EQ).

Works with: Factor version 0.99 2021-06-02
USING: combinators io kernel regexp sequences sets splitting
wrap.strings ;
 
! courtesy of https://www.infoplease.com/common-abbreviations
 
CONSTANT: common-abbreviations {
"A.B." "abbr." "Acad." "A.D." "alt." "A.M." "Assn."
"at. no." "at. wt." "Aug." "Ave." "b." "B.A." "B.C." "b.p."
"B.S." "c." "Capt." "cent." "co." "Col." "Comdr." "Corp."
"Cpl." "d." "D.C." "Dec." "dept." "dist." "div." "Dr." "ed."
"est." "et al." "Feb." "fl." "gal." "Gen." "Gov." "grad."
"Hon." "i.e." "in." "inc." "Inst." "Jan." "Jr." "lat."
"Lib." "long." "Lt." "Ltd." "M.D." "Mr." "Mrs." "mt." "mts."
"Mus." "no." "Nov." "Oct." "Op." "pl." "pop." "pseud." "pt."
"pub." "Rev." "rev." "R.N." "Sept." "Ser." "Sgt." "Sr."
"St." "uninc." "Univ." "U.S." "vol." "vs." "wt."
}
 
: sentence-enders ( str -- newstr )
R/ \)/ "" re-replace
" " split harvest
unclip-last swap
[ common-abbreviations member? ] reject
[ last ".!?" member? ] filter
swap suffix ;
 
: serious? ( str -- ? ) last CHAR: . = ;
: neutral? ( str -- ? ) last ".!?" member? not ;
: mixed? ( str -- ? ) "?!" intersect length 2 = ;
: exclamation? ( str -- ? ) last CHAR: ! = ;
: question? ( str -- ? ) last CHAR: ? = ;
 
: type ( str -- newstr )
{
{ [ dup serious? ] [ drop "S" ] }
{ [ dup neutral? ] [ drop "N" ] }
{ [ dup mixed? ] [ drop "EQ" ] }
{ [ dup exclamation? ] [ drop "E" ] }
{ [ dup question? ] [ drop "Q" ] }
[ drop "UNKNOWN" ]
} cond ;
 
: sentences ( str -- newstr )
sentence-enders [ type ] map "|" join ;
 
: show ( str -- )
dup sentences " -> " glue 60 wrap-string print ;
 
"Hi there, how are you today? I'd like to present to you the washing machine 9001. You have been nominated to win one of these! Just make sure you don't break it" show
nl
"(There was nary a mouse stirring.) But the cats were going
bonkers!" show
nl
"\"Why is the car so slow?\" she said." show
nl
"Hello, Mr. Anderson!" show
nl
"Are you sure?!?! How can you know?" show
Output:
Hi there, how are you today? I'd like to present to you the
washing machine 9001. You have been nominated to win one of
these! Just make sure you don't break it -> Q|S|E|N

(There was nary a mouse stirring.) But the cats were going
bonkers! -> S|E

"Why is the car so slow?" she said. -> S

Hello, Mr. Anderson! -> E

Are you sure?!?! How can you know? -> EQ|Q
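
For readers who do not know Factor, the approach described for this entry (skip known abbreviations, ignore closing parentheses, add an EQ type) might look roughly like this in Python. This is only a sketch under assumptions: `ABBREVIATIONS` is a deliberately short, hypothetical list rather than the full one above, and the function names are invented for the example.

```python
# Hypothetical, much shorter abbreviation list than the one in the Factor entry.
ABBREVIATIONS = {"Mr.", "Mrs.", "Dr.", "St.", "Capt.", "vs.", "i.e."}


def sentence_enders(text):
    """Return the words that end a sentence, plus the final word of the text."""
    words = text.replace(")", "").split()  # drop closing parentheses
    if not words:
        return []
    *body, final = words
    enders = [w for w in body if w not in ABBREVIATIONS and w[-1] in ".!?"]
    return enders + [final]  # the last word always closes a sentence


def classify(word):
    """Classify a sentence-ending word; 'EQ' marks an exclamatory question."""
    if word[-1] == ".":
        return "S"
    if word[-1] not in ".!?":
        return "N"
    if "?" in word and "!" in word:
        return "EQ"
    return "E" if word[-1] == "!" else "Q"


def sentence_types(text):
    """Join the types of all detected sentences with '|'."""
    return "|".join(classify(w) for w in sentence_enders(text))


if __name__ == "__main__":
    print(sentence_types("Hello, Mr. Anderson!"))
    print(sentence_types("Are you sure?!?! How can you know?"))
```

The two calls print E and EQ|Q, in line with the Factor output shown above. A longer abbreviation list would reduce false sentence breaks but can never be complete.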

FreeBASIC

function sentype( byref s as string ) as string
'determines the sentence type of the first sentence in the string
'returns "E" for an exclamation, "Q" for a question, "S" for serious
'and "N" for neutral.
'modifies the string to remove the first sentence
for i as uinteger = 1 to len(s)
if mid(s, i, 1) = "!" then
s=right(s,len(s)-i)
return "E"
end if
if mid(s, i, 1) = "." then
s=right(s,len(s)-i)
return "S"
end if
if mid(s, i, 1) = "?" then
s=right(s,len(s)-i)
return "Q"
end if
next i
'if we get to the end without encountering punctuation, this
'must be a neutral sentence, which can only happen as the last one
s=""
return "N"
end function
 
dim as string spam = "hi there, how are you today? I'd like to present to you the washing machine 9001. You have been nominated to win one of these! Just make sure you don't break it"
 
while len(spam)>0
print sentype(spam)
wend
Output:
Q
S
E
N

Go

Translation of: Wren
package main
 
import (
"fmt"
"strings"
)
 
func sentenceType(s string) string {
if len(s) == 0 {
return ""
}
var types []string
for _, c := range s {
if c == '?' {
types = append(types, "Q")
} else if c == '!' {
types = append(types, "E")
} else if c == '.' {
types = append(types, "S")
}
}
if strings.IndexByte("?!.", s[len(s)-1]) == -1 {
types = append(types, "N")
}
return strings.Join(types, "|")
}
 
func main() {
s := "hi there, how are you today? I'd like to present to you the washing machine 9001. You have been nominated to win one of these! Just make sure you don't break it"
fmt.Println(sentenceType(s))
}
Output:
Q|S|E|N

Julia

text = """
Hi there, how are you today? I'd like to present to you the washing machine 9001.
You have been nominated to win one of these! Just make sure you don't break it"""
 
haspunctotype(s) = '.' in s ? "S" : '!' in s ? "E" : '?' in s ? "Q" : "N"
 
text = replace(text, "\n" => " ")
parsed = strip.(split(text, r"(?:(?:(?<=[\?\!\.])(?:))|(?:(?:)(?=[\?\!\.])))"))
isodd(length(parsed)) && push!(parsed, "") # if ends without punctuation
for i in 1:2:length(parsed)-1
println(rpad(parsed[i] * parsed[i + 1], 52), " ==> ", haspunctotype(parsed[i + 1]))
end
 
Output:
Hi there, how are you today?                         ==> Q
I'd like to present to you the washing machine 9001. ==> S
You have been nominated to win one of these!         ==> E
Just make sure you don't break it                    ==> N

Perl

use strict;
use warnings;
use feature 'say';
use Lingua::Sentence;
 
my $para1 = <<'EOP';
hi there, how are you today? I'd like to present to you the washing machine
9001. You have been nominated to win one of these! Just make sure you don't
break it
EOP

 
my $para2 = <<'EOP';
Just because there are punctuation characters like "?", "!" or especially "."
present, it doesn't necessarily mean you have reached the end of a sentence,
does it Mr. Magoo? The syntax highlighting here for Perl isn't bad at all.
EOP

 
my $splitter = Lingua::Sentence->new("en");
for my $text ($para1, $para2) {
for my $s (split /\n/, $splitter->split( $text =~ s/\n/ /gr ) ) {
print "$s| ";
if ($s =~ /!$/) { say 'E' }
elsif ($s =~ /\?$/) { say 'Q' }
elsif ($s =~ /\.$/) { say 'S' }
else { say 'N' }
}
}
Output:
hi there, how are you today?| Q
I'd like to present to you the washing machine 9001.| S
You have been nominated to win one of these!| E
Just make sure you don't break it| N
Just because there are punctuation characters like "?", "!" or especially "." present, it doesn't necessarily mean you have reached the end of a sentence, does it Mr. Magoo?| Q
The syntax highlighting here for Perl isn't bad at all.| S

Phix

with javascript_semantics
constant s = `hi there, how are you today? I'd like to present 
to you the washing machine 9001. You have been nominated to win 
one of these! Just make sure you don't break it`
sequence t = split_any(trim(s),"?!."),
         u = substitute_all(s,t,repeat("|",length(t))),
         v = substitute_all(u,{"|?","|!","|.","|"},"QESN"),
         w = join(v,'|')
?w
Output:
"Q|S|E|N"

Python

import re
 
txt = """
Hi there, how are you today? I'd like to present to you the washing machine 9001.
You have been nominated to win one of these! Just make sure you don't break it"""

 
def haspunctotype(s):
    return 'S' if '.' in s else 'E' if '!' in s else 'Q' if '?' in s else 'N'

txt = re.sub('\n', '', txt)
pars = [s.strip() for s in re.split(r"(?:(?:(?<=[\?\!\.])(?:))|(?:(?:)(?=[\?\!\.])))", txt)]
if len(pars) % 2:
    pars.append('')  # if ends without punctuation
for i in range(0, len(pars)-1, 2):
    print((pars[i] + pars[i + 1]).ljust(54), "==>", haspunctotype(pars[i + 1]))
 
Output:
Hi there, how are you today?                           ==> Q
I'd like to present to you the washing machine 9001.   ==> S
You have been nominated to win one of these!           ==> E
Just make sure you don't break it                      ==> N


Or for more generality, and an alternative to hand-crafted regular expressions:

'''Grouping and tagging by final character of string'''
 
from functools import reduce
from itertools import groupby
 
 
# tagGroups :: Dict -> [String] -> [(String, [String])]
def tagGroups(tagDict):
    '''A list of (Tag, SentenceList) tuples, derived
       from an input text and a supplied dictionary of
       tags for each of a set of final punctuation marks.
    '''

    def go(sentences):
        return [
            (tagDict.get(k, 'Not punctuated'), list(v))
            for (k, v) in groupby(
                sorted(sentences, key=last),
                key=last
            )
        ]
    return go
 
 
# sentenceSegments :: Chars -> String -> [String]
def sentenceSegments(punctuationChars):
    '''A list of sentences delimited by the supplied
       punctuation characters, where these are followed
       by spaces.
    '''

    def go(s):
        return [
            ''.join(cs).strip() for cs
            in splitBy(
                sentenceBreak(punctuationChars)
            )(s)
        ]
    return go
 
 
# sentenceBreak :: Chars -> (Char, Char) -> Bool
def sentenceBreak(finalPunctuation):
    '''True if the first of two characters is a final
       punctuation mark and the second is a space.
    '''

    def go(a, b):
        return a in finalPunctuation and " " == b
    return go
 
 
# ------------------------- TEST -------------------------
# main :: IO ()
def main():
    '''Join, segmentation, tags'''

    tags = {'!': 'E', '?': 'Q', '.': 'S'}

    # Joined by spaces,
    sample = ' '.join([
        "Hi there, how are you today?",
        "I'd like to present to you the washing machine 9001.",
        "You have been nominated to win one of these!",
        "Might it be possible to add some challenge to this task?",
        "Feels as light as polystyrene filler.",
        "But perhaps substance isn't the goal!",
        "Just make sure you don't break off before the"
    ])

    # segmented by punctuation,
    sentences = sentenceSegments(
        tags.keys()
    )(sample)

    # and grouped under tags.
    for kv in tagGroups(tags)(sentences):
        print(kv)
 
 
# ----------------------- GENERIC ------------------------
 
# last :: [a] -> a
def last(xs):
    '''The last element of a non-empty list.'''
    return xs[-1]
 
 
# splitBy :: (a -> a -> Bool) -> [a] -> [[a]]
def splitBy(p):
    '''A list split wherever two consecutive
       items match the binary predicate p.
    '''

    # step :: ([[a]], [a], a) -> a -> ([[a]], [a], a)
    def step(acp, x):
        acc, active, prev = acp

        return (acc + [active], [x], x) if p(prev, x) else (
            (acc, active + [x], x)
        )

    # go :: [a] -> [[a]]
    def go(xs):
        if 2 > len(xs):
            return xs
        else:
            h = xs[0]
            ys = reduce(step, xs[1:], ([], [h], h))
            # The accumulated sublists, and the final group.
            return ys[0] + [ys[1]]

    return go
 
 
# MAIN ---
if __name__ == '__main__':
    main()
Output:
('E', ['You have been nominated to win one of these!', "But perhaps substance isn't the goal!"])
('S', ["I'd like to present to you the washing machine 9001.", 'Feels as light as polystyrene filler.'])
('Q', ['Hi there, how are you today?', 'Might it be possible to add some challenge to this task?'])
('Not punctuated', ["Just make sure you don't break off before the"])

Raku

use Lingua::EN::Sentence;
 
my $paragraph = q:to/PARAGRAPH/;
hi there, how are you today? I'd like to present to you the washing machine
9001. You have been nominated to win one of these! Just make sure you don't
break it
 
 
Just because there are punctuation characters like "?", "!" or especially "."
present, it doesn't necessarily mean you have reached the end of a sentence,
does it Mr. Magoo? The syntax highlighting here for Raku isn't the best.
PARAGRAPH
 
say join "\n\n", $paragraph.&get_sentences.map: {
/(<:punct>)$/;
$_ ~ ' | ' ~ do
given $0 {
when '!' { 'E' };
when '?' { 'Q' };
when '.' { 'S' };
default { 'N' };
}
}
Output:
hi there, how are you today? | Q

I'd like to present to you the washing machine
9001. | S

You have been nominated to win one of these! | E

Just make sure you don't
break it | N

Just because there are punctuation characters like "?", "!" or especially "."
present, it doesn't necessarily mean you have reached the end of a sentence,
does it Mr. Magoo? | Q

The syntax highlighting here for Raku isn't the best. | S

Wren

var sentenceType = Fn.new { |s|
if (s.count == 0) return ""
var types = []
for (c in s) {
if (c == "?") {
types.add("Q")
} else if (c == "!") {
types.add("E")
} else if (c == ".") {
types.add("S")
}
}
if (!"?!.".contains(s[-1])) types.add("N")
return types.join("|")
}
 
var s = "hi there, how are you today? I'd like to present to you the washing machine 9001. You have been nominated to win one of these! Just make sure you don't break it"
System.print(sentenceType.call(s))
Output:
Q|S|E|N


Library: Wren-pattern
Library: Wren-trait

The following alternative version takes the simplistic view that (unless they end the final sentence of the paragraph) ?, ! or . will only end a sentence if they're immediately followed by a space. This of course is nonsense, given the way English is written nowadays, but it's probably an improvement on the first version without the need to search through an inevitably incomplete list of abbreviations.

import "./pattern" for Pattern
import "./trait" for Indexed
 
var map = { "?": "Q", "!": "E", ".": "S", "": "N" }
var p = Pattern.new("[? |! |. ]")
var paras = [
"hi there, how are you today? I'd like to present to you the washing machine 9001. You have been nominated to win one of these! Just make sure you don't break it",
"hi there, how are you on St.David's day (isn't it a holiday yet?), Mr.Smith? I'd like to present to you (well someone has to win one!) the washing machine 900.1. You have been nominated by Capt.Johnson('?') to win one of these! Just make sure you (or Mrs.Smith) don't break it. By the way, what the heck is an exclamatory question!?"
]
 
for (para in paras) {
para = para.trim()
var sentences = p.splitAll(para)
var endings = p.findAll(para).map { |m| m.text[0] }.toList
var lastChar = sentences[-1][-1]
if ("?!.".contains(lastChar)) {
endings.add(lastChar)
sentences[-1] = sentences[-1][0...-1]
} else {
endings.add("")
}
for (se in Indexed.new(sentences)) {
var ix = se.index
var sentence = se.value
System.print("%(map[endings[ix]]) <- %(sentence + endings[ix])")
}
System.print()
}
Output:
Q <- hi there, how are you today?
S <- I'd like to present to you the washing machine 9001.
E <- You have been nominated to win one of these!
N <- Just make sure you don't break it

Q <- hi there, how are you on St.David's day (isn't it a holiday yet?), Mr.Smith?
S <- I'd like to present to you (well someone has to win one!) the washing machine 900.1.
E <- You have been nominated by Capt.Johnson('?') to win one of these!
S <- Just make sure you (or Mrs.Smith) don't break it.
Q <- By the way, what the heck is an exclamatory question!?
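
The same "punctuation followed by a space" rule can be sketched in Python with a lookbehind split. This is an illustrative sketch only: `split_sentences` and `tag` are hypothetical names, and a single space is assumed to be the only separator between sentences.

```python
import re

CODES = {"?": "Q", "!": "E", ".": "S"}


def split_sentences(paragraph):
    """Split only at a space that directly follows '?', '!' or '.'."""
    return re.split(r"(?<=[?!.]) ", paragraph.strip())


def tag(sentence):
    """Tag a sentence by its last character; anything else is neutral."""
    return CODES.get(sentence[-1], "N") if sentence else ""


if __name__ == "__main__":
    para = "Is it St.David's day, Mr.Smith? It costs 900.1. Great! ok"
    for sentence in split_sentences(para):
        print(tag(sentence), "<-", sentence)
```

Because "St.David's", "Mr.Smith" and "900.1" contain no space after the full stop, they do not break the sentence; an abbreviation written with a following space, such as "Mr. Smith", still would.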