Sort an outline at every level
You are encouraged to solve this task according to the task description, using any language you may know.
- Task
Write and test a function over an indented plain text outline which either:
- Returns a copy of the outline in which the sub-lists at every level of indentation are sorted, or
- reports that the indentation characters or widths are not consistent enough to make the outline structure clear.
Your code should detect and warn of at least two types of inconsistent indentation:
- inconsistent use of whitespace characters (e.g. mixed use of tabs and spaces)
- inconsistent indent widths. For example, an indentation with an odd number of spaces in an outline in which the unit indent appears to be 2 spaces, or 4 spaces.
Your code should be able to detect and handle both tab-indented, and space-indented (e.g. 4 space, 2 space etc) outlines, without being given any advance warning of the indent characters used, or the size of the indent units.
You should also be able to specify different types of sort, for example, as a minimum, both ascending and descending lexical sorts.
Your sort should not alter the type or size of the indentation units used in the input outline.
(For an application of Indent Respectful Sort, see the Sublime Text package of that name. The Python source text [1] is available for inspection on Github).
Tests
- Sort every level of the (4 space indented) outline below lexically, once ascending and once descending.
zeta beta gamma lambda kappa mu delta alpha theta iota epsilon
- Do the same with a tab-indented equivalent of the same outline.
zeta gamma mu lambda kappa delta beta alpha theta iota epsilon
The output sequence of an ascending lexical sort of each level should be:
alpha epsilon iota theta zeta beta delta gamma kappa lambda mu
The output sequence of a descending lexical sort of each level should be:
zeta gamma mu lambda kappa delta beta alpha theta iota epsilon
- Attempt to separately sort each of the following two outlines, reporting any inconsistencies detected in their indentations by your validation code.
alpha epsilon iota theta zeta beta delta gamma kappa lambda mu
zeta beta gamma lambda kappa mu delta alpha theta iota epsilon
- Related tasks
AutoHotkey[edit]
Sort_an_outline(data, reverse:=""){Examples: "Example of Data_2"
;-----------------------
; get Delim, Error Check
for i, line in StrSplit(data, "`n", "`r")
if !Delim
RegExMatch(line, "^\h+", Delim)
else if RegExMatch(RegExReplace(line, "^(" Delim ")*"), "^\h+")
return "Error @ " line
;-----------------------
; ascending lexical sort
ancestor:=[], tree:= [], result:=""
for i, line in StrSplit(data, "`n", "`r"){
name := StrSplit(line, delim?delim:"`t")
n := name.count()
son := name[n]
if (n>rank) && father
ancestor.push(father)
loop % rank-n
ancestor.pop()
for i, father in ancestor
Lineage .= father . delim
output .= Lineage son "`n"
rank:=n, father:=son, Lineage:=""
}
Sort, output
for i, line in StrSplit(output, "`n", "`r")
name := StrSplit(line, delim)
, result .= indent(name.count()-1, delim) . name[name.count()] "`n"
if !reverse
return Trim(result, "`n")
;-----------------------
; descending lexical sort
ancestor:=[], Lineage:="", result:=""
Sort, output, R
for i, line in StrSplit(output, "`n", "`r"){
name := StrSplit(line, delim)
if !ancestor[Lineage]
loop % name.count()
result .= indent(A_Index-1, delim) . name[A_Index] "`n"
else if (StrSplit(Lineage, ",")[name.count()] <> name[name.count()])
result .= indent(name.count()-1, delim) . name[name.count()] "`n"
Lineage := ""
loop % name.count()-1
Lineage .= (Lineage ? "," : "") . name[A_Index]
, ancestor[Lineage] := true
}
return result
}
indent(n, delim){
Loop, % n
result.=delim
return result
}
Data_2 =Output: tabulated for ease of reading, actual output is text only, Error check returns first line with inconsistent delimiter!
(
zeta
gamma
mu
lambda
kappa
delta
beta
alpha
theta
iota
epsilon
)
MsgBox % Sort_an_outline(Data_2)
MsgBox % Sort_an_outline(Data_2, 1)
return
====================================================================================================== Data_1, 4-Space |Output 1 ||Data_2, tab |Output 2 |Output 2, Reverse ====================================================================================================== zeta |alpha ||zeta |alpha |zeta beta | epsilon || gamma | epsilon | gamma gamma | iota || mu | iota | mu lambda | theta || lambda | theta | lambda kappa |zeta || kappa |zeta | kappa mu | beta || delta | beta | delta delta | delta || beta | delta | beta alpha | gamma ||alpha | gamma |alpha theta | kappa || theta | kappa | theta iota | lambda || iota | lambda | iota epsilon | mu || epsilon | mu | epsilon ====================================================================================================== ====================================================================================================== inconsistent_1 ||inconsistent_2 ================================= alpha ||zeta epsilon || beta iota || gamma theta || lambda zeta || kappa beta || mu delta || delta gamma ||alpha kappa || theta lambda || iota mu || epsilon ================================= Error @ iota ||Error @ gamma =================================
Go[edit]
package main
import (
"fmt"
"math"
"sort"
"strings"
)
func sortedOutline(originalOutline []string, ascending bool) {
outline := make([]string, len(originalOutline))
copy(outline, originalOutline) // make copy in case we mutate it
indent := ""
del := "\x7f"
sep := "\x00"
var messages []string
if strings.TrimLeft(outline[0], " \t") != outline[0] {
fmt.Println(" outline structure is unclear")
return
}
for i := 1; i < len(outline); i++ {
line := outline[i]
lc := len(line)
if strings.HasPrefix(line, " ") || strings.HasPrefix(line, " \t") || line[0] == '\t' {
lc2 := len(strings.TrimLeft(line, " \t"))
currIndent := line[0 : lc-lc2]
if indent == "" {
indent = currIndent
} else {
correctionNeeded := false
if (strings.ContainsRune(currIndent, '\t') && !strings.ContainsRune(indent, '\t')) ||
(!strings.ContainsRune(currIndent, '\t') && strings.ContainsRune(indent, '\t')) {
m := fmt.Sprintf("corrected inconsistent whitespace use at line %q", line)
messages = append(messages, indent+m)
correctionNeeded = true
} else if len(currIndent)%len(indent) != 0 {
m := fmt.Sprintf("corrected inconsistent indent width at line %q", line)
messages = append(messages, indent+m)
correctionNeeded = true
}
if correctionNeeded {
mult := int(math.Round(float64(len(currIndent)) / float64(len(indent))))
outline[i] = strings.Repeat(indent, mult) + line[lc-lc2:]
}
}
}
}
levels := make([]int, len(outline))
levels[0] = 1
margin := ""
for level := 1; ; level++ {
allPos := true
for i := 1; i < len(levels); i++ {
if levels[i] == 0 {
allPos = false
break
}
}
if allPos {
break
}
mc := len(margin)
for i := 1; i < len(outline); i++ {
if levels[i] == 0 {
line := outline[i]
if strings.HasPrefix(line, margin) && line[mc] != ' ' && line[mc] != '\t' {
levels[i] = level
}
}
}
margin += indent
}
lines := make([]string, len(outline))
lines[0] = outline[0]
var nodes []string
for i := 1; i < len(outline); i++ {
if levels[i] > levels[i-1] {
if len(nodes) == 0 {
nodes = append(nodes, outline[i-1])
} else {
nodes = append(nodes, sep+outline[i-1])
}
} else if levels[i] < levels[i-1] {
j := levels[i-1] - levels[i]
nodes = nodes[0 : len(nodes)-j]
}
if len(nodes) > 0 {
lines[i] = strings.Join(nodes, "") + sep + outline[i]
} else {
lines[i] = outline[i]
}
}
if ascending {
sort.Strings(lines)
} else {
maxLen := len(lines[0])
for i := 1; i < len(lines); i++ {
if len(lines[i]) > maxLen {
maxLen = len(lines[i])
}
}
for i := 0; i < len(lines); i++ {
lines[i] = lines[i] + strings.Repeat(del, maxLen-len(lines[i]))
}
sort.Sort(sort.Reverse(sort.StringSlice(lines)))
}
for i := 0; i < len(lines); i++ {
s := strings.Split(lines[i], sep)
lines[i] = s[len(s)-1]
if !ascending {
lines[i] = strings.TrimRight(lines[i], del)
}
}
if len(messages) > 0 {
fmt.Println(strings.Join(messages, "\n"))
fmt.Println()
}
fmt.Println(strings.Join(lines, "\n"))
}
func main() {
outline := []string{
"zeta",
" beta",
" gamma",
" lambda",
" kappa",
" mu",
" delta",
"alpha",
" theta",
" iota",
" epsilon",
}
outline2 := make([]string, len(outline))
for i := 0; i < len(outline); i++ {
outline2[i] = strings.ReplaceAll(outline[i], " ", "\t")
}
outline3 := []string{
"alpha",
" epsilon",
" iota",
" theta",
"zeta",
" beta",
" delta",
" gamma",
" \t kappa", // same length but \t instead of space
" lambda",
" mu",
}
outline4 := []string{
"zeta",
" beta",
" gamma",
" lambda",
" kappa",
" mu",
" delta",
"alpha",
" theta",
" iota",
" epsilon",
}
fmt.Println("Four space indented outline, ascending sort:")
sortedOutline(outline, true)
fmt.Println("\nFour space indented outline, descending sort:")
sortedOutline(outline, false)
fmt.Println("\nTab indented outline, ascending sort:")
sortedOutline(outline2, true)
fmt.Println("\nTab indented outline, descending sort:")
sortedOutline(outline2, false)
fmt.Println("\nFirst unspecified outline, ascending sort:")
sortedOutline(outline3, true)
fmt.Println("\nFirst unspecified outline, descending sort:")
sortedOutline(outline3, false)
fmt.Println("\nSecond unspecified outline, ascending sort:")
sortedOutline(outline4, true)
fmt.Println("\nSecond unspecified outline, descending sort:")
sortedOutline(outline4, false)
}
- Output:
Four space indented outline, ascending sort: alpha epsilon iota theta zeta beta delta gamma kappa lambda mu Four space indented outline, descending sort: zeta gamma mu lambda kappa delta beta alpha theta iota epsilon Tab indented outline, ascending sort: alpha epsilon iota theta zeta beta delta gamma kappa lambda mu Tab indented outline, descending sort: zeta gamma mu lambda kappa delta beta alpha theta iota epsilon First unspecified outline, ascending sort: corrected inconsistent whitespace use at line " \t kappa" alpha epsilon iota theta zeta beta delta gamma kappa lambda mu First unspecified outline, descending sort: corrected inconsistent whitespace use at line " \t kappa" zeta gamma mu lambda kappa delta beta alpha theta epsilon iota Second unspecified outline, ascending sort: corrected inconsistent indent width at line " gamma" corrected inconsistent indent width at line " kappa" alpha epsilon iota theta zeta beta delta gamma kappa lambda mu Second unspecified outline, descending sort: corrected inconsistent indent width at line " gamma" corrected inconsistent indent width at line " kappa" zeta gamma mu lambda kappa delta beta alpha theta iota epsilon
Haskell[edit]
{-# LANGUAGE OverloadedStrings #-}
import Data.Tree (Tree(..), foldTree)
import qualified Data.Text.IO as T
import qualified Data.Text as T
import qualified Data.List as L
import Data.Bifunctor (first)
import Data.Ord (comparing)
import Data.Char (isSpace)
--------------- OUTLINE SORTED AT EVERY LEVEL --------------
sortedOutline :: (Tree T.Text -> Tree T.Text -> Ordering)
-> T.Text
-> Either T.Text T.Text
sortedOutline cmp outlineText =
let xs = T.lines outlineText
in consistentIndentUnit (nonZeroIndents xs) >>=
\indentUnit ->
let forest = forestFromLineIndents $ indentLevelsFromLines xs
sortedForest =
subForest $
foldTree (\x xs -> Node x (L.sortBy cmp xs)) (Node "" forest)
in Right $ outlineFromForest indentUnit sortedForest
--------------------------- TESTS --------------------------
main :: IO ()
main =
mapM_ T.putStrLn $
concat $
[ \(comparatorLabel, cmp) ->
(\kv ->
let section = headedSection (fst kv) comparatorLabel
in (either (section . (" -> " <>)) section . sortedOutline cmp . snd)
kv) <$>
[ ("Four-spaced", spacedOutline)
, ("Tabbed", tabbedOutline)
, ("First unknown type", confusedOutline)
, ("Second unknown type", raggedOutline)
]
] <*>
[("(A -> Z)", comparing rootLabel), ("(Z -> A)", flip (comparing rootLabel))]
headedSection :: T.Text -> T.Text -> T.Text -> T.Text
headedSection outlineType comparatorName x =
T.concat ["\n", outlineType, " ", comparatorName, ":\n\n", x]
spacedOutline, tabbedOutline, confusedOutline, raggedOutline :: T.Text
spacedOutline =
"zeta\n\
\ beta\n\
\ gamma\n\
\ lambda\n\
\ kappa\n\
\ mu\n\
\ delta\n\
\alpha\n\
\ theta\n\
\ iota\n\
\ epsilon"
tabbedOutline =
"zeta\n\
\\tbeta\n\
\\tgamma\n\
\\t\tlambda\n\
\\t\tkappa\n\
\\t\tmu\n\
\\tdelta\n\
\alpha\n\
\\ttheta\n\
\\tiota\n\
\\tepsilon"
confusedOutline =
"zeta\n\
\ beta\n\
\ gamma\n\
\ lambda\n\
\ \t kappa\n\
\ mu\n\
\ delta\n\
\alpha\n\
\ theta\n\
\ iota\n\
\ epsilon"
raggedOutline =
"zeta\n\
\ beta\n\
\ gamma\n\
\ lambda\n\
\ kappa\n\
\ mu\n\
\ delta\n\
\alpha\n\
\ theta\n\
\ iota\n\
\ epsilon"
-------- OUTLINE TREES :: SERIALIZED AND DESERIALIZED ------
forestFromLineIndents :: [(Int, T.Text)] -> [Tree T.Text]
forestFromLineIndents = go
where
go [] = []
go ((n, s):xs) = Node s (go subOutline) : go rest
where
(subOutline, rest) = span ((n <) . fst) xs
indentLevelsFromLines :: [T.Text] -> [(Int, T.Text)]
indentLevelsFromLines xs = first (`div` indentUnit) <$> pairs
where
pairs = first T.length . T.span isSpace <$> xs
indentUnit = maybe 1 fst (L.find ((0 <) . fst) pairs)
outlineFromForest :: T.Text -> [Tree T.Text] -> T.Text
outlineFromForest tabString forest = T.unlines $ forest >>= go ""
where
go indent node =
indent <> rootLabel node :
(subForest node >>= go (T.append tabString indent))
------ OUTLINE CHECKING - INDENT CHARACTERS AND WIDTHS -----
consistentIndentUnit :: [T.Text] -> Either T.Text T.Text
consistentIndentUnit prefixes = minimumIndent prefixes >>= checked prefixes
where
checked xs indentUnit
| all ((0 ==) . (`rem` unitLength) . T.length) xs = Right indentUnit
| otherwise =
Left
("Inconsistent indent depths: " <>
T.pack (show (T.length <$> prefixes)))
where
unitLength = T.length indentUnit
minimumIndent :: [T.Text] -> Either T.Text T.Text
minimumIndent prefixes = go $ T.foldr newChar "" $ T.concat prefixes
where
newChar c seen
| c `L.elem` seen = seen
| otherwise = c : seen
go cs
| 1 < length cs =
Left $ "Mixed indent characters used: " <> T.pack (show cs)
| otherwise = Right $ L.minimumBy (comparing T.length) prefixes
nonZeroIndents :: [T.Text] -> [T.Text]
nonZeroIndents textLines =
[ s
| x <- textLines
, s <- [T.takeWhile isSpace x]
, 0 /= T.length s ]
- Output:
Four-spaced (A -> Z): alpha epsilon iota theta zeta beta delta gamma kappa lambda mu Tabbed (A -> Z): alpha epsilon iota theta zeta beta delta gamma kappa lambda mu First unknown type (A -> Z): -> Mixed indent characters used: "\t " Second unknown type (A -> Z): -> Inconsistent indent depths: [4,3,8,9,8,4,4,4,4] Four-spaced (Z -> A): zeta gamma mu lambda kappa delta beta alpha theta iota epsilon Tabbed (Z -> A): zeta gamma mu lambda kappa delta beta alpha theta iota epsilon First unknown type (Z -> A): -> Mixed indent characters used: "\t " Second unknown type (Z -> A): -> Inconsistent indent depths: [4,3,8,9,8,4,4,4,4]
Julia[edit]
A for
loop was used in the constructor, and recursive functions for sorting and printing.
import Base.print
abstract type Entry end
mutable struct OutlineEntry <: Entry
level::Int
text::String
parent::Union{Entry, Nothing}
children::Vector{Entry}
end
mutable struct Outline
root::OutlineEntry
entries::Vector{OutlineEntry}
baseindent::String
end
rootentry() = OutlineEntry(0, "", nothing, [])
indentchar(ch) = ch == ' ' || ch == '\t'
firsttext(s) = something(findfirst(!indentchar, s), length(s) + 1)
splitline(s) = begin i = firsttext(s); i == 1 ? ("", s) : (s[1:i-1], s[i:end]) end
const _indents = [" "]
function Base.print(io::IO, oe::OutlineEntry)
println(io, _indents[end]^oe.level, oe.text)
for child in oe.children
print(io, child)
end
end
function Base.print(io::IO, o::Outline)
push!(_indents, o.baseindent)
print(io, o.root)
pop!(_indents)
end
function firstindent(lines, default = " ")
for lin in lines
s1, s2 = splitline(lin)
s1 != "" && return s1
end
return default
end
function Outline(str::String)
arr, lines = OutlineEntry[], filter(x -> x != "", split(str, r"\r\n|\n|\r"))
root, indent, parentindex, lastindents = rootentry(), firstindent(lines), 0, 0
if ' ' in indent && '\t' in indent
throw("Mixed tabs and spaces in indent are not allowed")
end
indentlen, indentregex = length(indent), Regex(indent)
for (i, lin) in enumerate(lines)
header, txt = splitline(lin)
indentcount = length(collect(eachmatch(indentregex, header)))
(indentcount * indentlen < length(header)) &&
throw("Error: bad indent " * string(UInt8.([c for c in header])) *
", expected " * string(UInt8.([c for c in indent])))
if indentcount > lastindents
parentindex = i - 1
elseif indentcount < lastindents
parentindex = something(findlast(x -> x.level == indentcount - 1, arr), 0)
end
lastindents = indentcount
ent = OutlineEntry(indentcount, txt, parentindex == 0 ? root : arr[parentindex], [])
push!(ent.parent.children, ent)
push!(arr, ent)
end
return Outline(root, arr, indent)
end
function sorttree!(ent::OutlineEntry, rev=false, level=0)
for child in ent.children
sorttree!(child, rev)
end
if level == 0 || level == ent.level
sort!(ent.children, lt=(x, y) -> x.text < y.text, rev=rev)
end
return ent
end
outlinesort!(ol::Outline, rev=false, lev=0) = begin sorttree!(ol.root, rev, lev); ol end
const outline4s = Outline("""
zeta
beta
gamma
lambda
kappa
mu
delta
alpha
theta
iota
epsilon""")
const outlinet1 = Outline("""
zeta
gamma
mu
lambda
kappa
delta
beta
alpha
theta
iota
epsilon""")
println("Given the text:\n", outline4s)
println("Sorted outline is:\n", outlinesort!(outline4s))
println("Reverse sorted is:\n", outlinesort!(outline4s, true))
println("Using the text:\n", outlinet1)
println("Sorted outline is:\n", outlinesort!(outlinet1))
println("Reverse sorted is:\n", outlinesort!(outlinet1, true))
println("Sorting only third level:\n", outlinesort!(outlinet1, false, 3))
try
println("Trying to parse a bad outline:")
outlinebad1 = Outline("""
alpha
epsilon
iota
theta
zeta
beta
delta
gamma
kappa
lambda
mu""")
catch y
println(y)
end
try
println("Trying to parse another bad outline:")
outlinebad2 = Outline("""
zeta
beta
gamma
lambda
kappa
mu
delta
alpha
theta
iota
epsilon""")
catch y
println(y)
end
- Output:
Given the text: zeta beta gamma lambda kappa mu delta alpha theta iota epsilon Sorted outline is: alpha epsilon iota theta zeta beta delta gamma kappa lambda mu Reverse sorted is: zeta gamma mu lambda kappa delta beta alpha theta iota epsilon Using the text: zeta gamma mu lambda kappa delta beta alpha theta iota epsilon Sorted outline is: alpha epsilon iota theta zeta beta delta gamma kappa lambda mu Reverse sorted is: zeta gamma mu lambda kappa delta beta alpha theta iota epsilon Sorting only third level: zeta beta delta gamma kappa lambda mu alpha epsilon iota theta Trying to parse a bad outline: Error: bad indent UInt8[0x09], expected UInt8[0x20, 0x20, 0x20, 0x20] Trying to parse another bad outline: Error: bad indent UInt8[0x20, 0x20, 0x20], expected UInt8[0x20, 0x20, 0x20, 0x20]
Perl[edit]
#!/usr/bin/perl
use strict; # https://rosettacode.org/wiki/Sort_an_outline_at_every_level
use warnings;
for my $test ( split /^(?=#)/m, join '', <DATA> )
{
my ( $id, $outline ) = $test =~ /(\V*?\n)(.*)/s;
my $sorted = validateandsort( $outline, $id =~ /descend/ );
print $test, '=' x 20, " answer:\n$sorted\n";
}
sub validateandsort
{
my ($outline, $descend) = @_;
$outline =~ /^\h*(?: \t|\t )/m and
return "ERROR: mixed tab and space indentaion\n";
my $adjust = 0;
$adjust++ while $outline =~ s/^(\h*)\H.*\n\1\K\h(?=\H)//m
or $outline =~ s/^(\h*)(\h)\H.*\n\1\K(?=\H)/$2/m;
$adjust and print "WARNING: adjusting indentation on some lines\n";
return levelsort($outline, $descend);
}
sub levelsort # outline_section, descend_flag
{
my ($section, $descend) = @_;
my @parts;
while( $section =~ / ((\h*) .*\n) ( (?:\2\h.*\n)* )/gx )
{
my ($head, $rest) = ($1, $3);
push @parts, $head . ( $rest and levelsort($rest, $descend) );
}
join '', $descend ? reverse sort @parts : sort @parts;
}
__DATA__
# 4 space ascending
zeta
beta
gamma
lambda
kappa
mu
delta
alpha
theta
iota
epsilon
# 4 space descending
zeta
beta
gamma
lambda
kappa
mu
delta
alpha
theta
iota
epsilon
# mixed tab and space
alpha
epsilon
iota
theta
zeta
beta
delta
gamma
kappa
lambda
mu
# off alignment
zeta
beta
gamma
lambda
kappa
mu
delta
alpha
theta
iota
epsilon
- Output:
# 4 space ascending zeta beta gamma lambda kappa mu delta alpha theta iota epsilon ==================== answer: alpha epsilon iota theta zeta beta delta gamma kappa lambda mu # 4 space descending zeta beta gamma lambda kappa mu delta alpha theta iota epsilon ==================== answer: zeta gamma mu lambda kappa delta beta alpha theta iota epsilon # mixed tab and space alpha epsilon iota theta zeta beta delta gamma kappa lambda mu ==================== answer: ERROR: mixed tab and space indentaion WARNING: adjusting indentation on some lines # off alignment zeta beta gamma lambda kappa mu delta alpha theta iota epsilon ==================== answer: alpha epsilon iota theta zeta beta delta gamma kappa lambda mu
Phix[edit]
procedure print_children(sequence lines, children, string indent, bool bRev)
sequence tags = custom_sort(lines,children)
if bRev then tags = reverse(tags) end if
for i=1 to length(tags) do
integer ti = tags[i]
printf(1,"%s%s\n",{indent,lines[ti][1]})
print_children(lines,lines[ti][$],lines[ti][2],bRev)
end for
end procedure
constant spaced = """
zeta
beta
gamma
lambda
kappa
mu
delta
alpha
theta
iota
epsilon
""",
tabbed = substitute(spaced," ","\t"),
confused = substitute_all(spaced,{" gamma"," kappa"},{"gamma","\t kappa"}),
ragged = substitute_all(spaced,{" gamma","kappa"},{"gamma"," kappa"}),
tests = {spaced,tabbed,confused,ragged},
names = "spaced,tabbed,confused,ragged"
procedure test(sequence lines)
sequence pi = {-1}, -- indents (to locate parents)
pdx = {0}, -- indexes for ""
children = {},
roots = {}
for i=1 to length(lines) do
string line = trim_tail(lines[i]),
text = trim_head(line)
integer indent = length(line)-length(text)
-- remove any completed parents
while length(pi) and indent<=pi[$] do
pi = pi[1..$-1]
pdx = pdx[1..$-1]
end while
integer parent = 0
if length(pi) then
parent = pdx[$]
if parent=0 then
if indent!=0 then
printf(1,"**invalid indent** (%s, line %d)\n\n",{text,i})
return
end if
roots &= i
else
if lines[parent][$]={} then
lines[parent][2] = line[1..indent]
elsif lines[parent][2]!=line[1..indent] then
printf(1,"**inconsistent indent** (%s, line %d)\n\n",{text,i})
return
end if
lines[parent][$] &= i -- (update children)
end if
end if
pi &= indent
pdx &= i
lines[i] = {text,"",children}
end for
printf(1,"ascending:\n")
print_children(lines,roots,"",false)
printf(1,"\ndescending:\n")
print_children(lines,roots,"",true)
printf(1,"\n")
end procedure
for t=1 to length(tests) do
string name = split(names,",")[t]
-- printf(1,"Test %d (%s):\n%s\n",{t,name,tests[t]})
printf(1,"Test %d (%s):\n",{t,name})
sequence lines = split(tests[t],"\n",no_empty:=true)
test(lines)
end for
- Output:
Test 1 (spaced): ascending: alpha epsilon iota theta zeta beta delta gamma kappa lambda mu descending: zeta gamma mu lambda kappa delta beta alpha theta iota epsilon Test 2 (tabbed): ascending: alpha epsilon iota theta zeta beta delta gamma kappa lambda mu descending: zeta gamma mu lambda kappa delta beta alpha theta iota epsilon Test 3 (confused): **inconsistent indent** (gamma, line 3) Test 4 (ragged): **inconsistent indent** (gamma, line 3)
Python[edit]
'''Sort an outline at every level'''
from itertools import chain, product, takewhile, tee
from functools import cmp_to_key, reduce
# ------------- OUTLINE SORTED AT EVERY LEVEL --------------
# sortedOutline :: (Tree String -> Tree String -> Ordering)
# -> String
# -> Either String String
def sortedOutline(cmp):
'''Either a message reporting inconsistent
indentation, or an outline sorted at every
level by the supplied comparator function.
'''
def go(outlineText):
indentTuples = indentTextPairs(
outlineText.splitlines()
)
return bindLR(
minimumIndent(enumerate(indentTuples))
)(lambda unitIndent: Right(
outlineFromForest(
unitIndent,
nest(foldTree(
lambda x: lambda xs: Node(x)(
sorted(xs, key=cmp_to_key(cmp))
)
)(Node('')(
forestFromIndentLevels(
indentLevelsFromLines(
unitIndent
)(indentTuples)
)
)))
)
))
return go
# -------------------------- TEST --------------------------
# main :: IO ()
def main():
'''Ascending and descending sorts attempted on
space-indented and tab-indented outlines, both
well-formed and ill-formed.
'''
ascending = comparing(root)
descending = flip(ascending)
spacedOutline = '''
zeta
beta
gamma
lambda
kappa
mu
delta
alpha
theta
iota
epsilon'''
tabbedOutline = '''
zeta
beta
gamma
lambda
kappa
mu
delta
alpha
theta
iota
epsilon'''
confusedOutline = '''
alpha
epsilon
iota
theta
zeta
beta
delta
gamma
kappa
lambda
mu'''
raggedOutline = '''
zeta
beta
gamma
lambda
kappa
mu
delta
alpha
theta
iota
epsilon'''
def displaySort(kcmp):
'''Sort function output with labelled comparator
for a set of four labelled outlines.
'''
k, cmp = kcmp
return [
tested(cmp, k, label)(
outline
) for (label, outline) in [
('4-space indented', spacedOutline),
('tab indented', tabbedOutline),
('Unknown 1', confusedOutline),
('Unknown 2', raggedOutline)
]
]
def tested(cmp, cmpName, outlineName):
'''Print either message or result.
'''
def go(outline):
print('\n' + outlineName, cmpName + ':')
either(print)(print)(
sortedOutline(cmp)(outline)
)
return go
# Tests applied to two comparators:
ap([
displaySort
])([
("(A -> Z)", ascending),
("(Z -> A)", descending)
])
# ------------- OUTLINE PARSING AND RENDERING --------------
# forestFromIndentLevels :: [(Int, a)] -> [Tree a]
def forestFromIndentLevels(tuples):
'''A list of trees derived from a list of values paired
with integers giving their levels of indentation.
'''
def go(xs):
if xs:
intIndent, v = xs[0]
firstTreeLines, rest = span(
lambda x: intIndent < x[0]
)(xs[1:])
return [Node(v)(go(firstTreeLines))] + go(rest)
else:
return []
return go(tuples)
# indentLevelsFromLines :: String -> [(String, String)]
# -> [(Int, String)]
def indentLevelsFromLines(indentUnit):
'''Each input line stripped of leading
white space, and tupled with a preceding integer
giving its level of indentation from 0 upwards.
'''
def go(xs):
w = len(indentUnit)
return [
(len(x[0]) // w, x[1])
for x in xs
]
return go
# indentTextPairs :: [String] -> (String, String)
def indentTextPairs(xs):
'''A list of (indent, bodyText) pairs.'''
def indentAndText(s):
pfx = list(takewhile(lambda c: c.isspace(), s))
return (pfx, s[len(pfx):])
return [indentAndText(x) for x in xs]
# outlineFromForest :: String -> [Tree String] -> String
def outlineFromForest(tabString, forest):
'''An indented outline serialisation of forest,
using tabString as the unit of indentation.
'''
def go(indent):
def serial(node):
return [indent + root(node)] + list(
concatMap(
go(tabString + indent)
)(nest(node))
)
return serial
return '\n'.join(
concatMap(go(''))(forest)
)
# --------------- MINIMUM INDENT, OR ANOMALY ---------------
# minimumIndent :: [(Int, [Char])]
# -> Either String String
def minimumIndent(indexedPrefixes):
'''Either a message, if indentation characters are
mixed, or indentation widths are inconsistent,
or the smallest consistent non-empty indentation.
'''
(xs, ts) = tee(indexedPrefixes)
(ys, zs) = tee(ts)
def mindentLR(charSet):
if list(charSet):
def w(x):
return len(x[1][0])
unit = min(filter(w, ys), key=w)[1][0]
unitWidth = len(unit)
def widthCheck(a, ix):
'''Is there a line number at which
an anomalous indent width is seen?
'''
wx = len(ix[1][0])
return a if (a or 0 == wx) else (
ix[0] if 0 != wx % unitWidth else a
)
oddLine = reduce(widthCheck, zs, None)
return Left(
'Inconsistent indentation width at line ' + (
str(1 + oddLine)
)
) if oddLine else Right(''.join(unit))
else:
return Right('')
def tabSpaceCheck(a, ics):
'''Is there a line number at which a
variant indent character is used?
'''
charSet = a[0].union(set(ics[1][0]))
return a if a[1] else (
charSet, ics[0] if 1 < len(charSet) else None
)
indentCharSet, mbAnomalyLine = reduce(
tabSpaceCheck, xs, (set([]), None)
)
return bindLR(
Left(
'Mixed indent characters found in line ' + str(
1 + mbAnomalyLine
)
) if mbAnomalyLine else Right(list(indentCharSet))
)(mindentLR)
# ------------------------ GENERIC -------------------------
# Left :: a -> Either a b
def Left(x):
'''Constructor for an empty Either (option type) value
with an associated string.
'''
return {'type': 'Either', 'Right': None, 'Left': x}
# Right :: b -> Either a b
def Right(x):
'''Constructor for a populated Either (option type) value'''
return {'type': 'Either', 'Left': None, 'Right': x}
# Node :: a -> [Tree a] -> Tree a
def Node(v):
'''Constructor for a Tree node which connects a
value of some kind to a list of zero or
more child trees.
'''
return lambda xs: {'type': 'Tree', 'root': v, 'nest': xs}
# ap (<*>) :: [(a -> b)] -> [a] -> [b]
def ap(fs):
'''The application of each of a list of functions,
to each of a list of values.
'''
def go(xs):
return [
f(x) for (f, x)
in product(fs, xs)
]
return go
# bindLR (>>=) :: Either a -> (a -> Either b) -> Either b
def bindLR(m):
'''Either monad injection operator.
Two computations sequentially composed,
with any value produced by the first
passed as an argument to the second.
'''
def go(mf):
return (
mf(m.get('Right')) if None is m.get('Left') else m
)
return go
# comparing :: (a -> b) -> (a -> a -> Ordering)
def comparing(f):
'''An ordering function based on
a property accessor f.
'''
def go(x, y):
fx = f(x)
fy = f(y)
return -1 if fx < fy else (1 if fx > fy else 0)
return go
# concatMap :: (a -> [b]) -> [a] -> [b]
def concatMap(f):
'''A concatenated list over which a function has been mapped.
The list monad can be derived by using a function f which
wraps its output in a list,
(using an empty list to represent computational failure).
'''
def go(xs):
return chain.from_iterable(map(f, xs))
return go
# either :: (a -> c) -> (b -> c) -> Either a b -> c
def either(fl):
'''The application of fl to e if e is a Left value,
or the application of fr to e if e is a Right value.
'''
return lambda fr: lambda e: fl(e['Left']) if (
None is e['Right']
) else fr(e['Right'])
# flip :: (a -> b -> c) -> b -> a -> c
def flip(f):
'''The binary function f with its
arguments reversed.
'''
return lambda a, b: f(b, a)
# foldTree :: (a -> [b] -> b) -> Tree a -> b
def foldTree(f):
'''The catamorphism on trees. A summary
value defined by a depth-first fold.
'''
def go(node):
return f(root(node))([
go(x) for x in nest(node)
])
return go
# nest :: Tree a -> [Tree a]
def nest(t):
'''Accessor function for children of tree node.'''
return t.get('nest')
# root :: Tree a -> a
def root(t):
'''Accessor function for data of tree node.'''
return t.get('root')
# span :: (a -> Bool) -> [a] -> ([a], [a])
def span(p):
'''The longest (possibly empty) prefix of xs
that contains only elements satisfying p,
tupled with the remainder of xs.
span p xs is equivalent to
(takeWhile p xs, dropWhile p xs).
'''
def match(ab):
b = ab[1]
return not b or not p(b[0])
def f(ab):
a, b = ab
return a + [b[0]], b[1:]
def go(xs):
return until(match)(f)(([], xs))
return go
# until :: (a -> Bool) -> (a -> a) -> a -> a
def until(p):
'''The result of repeatedly applying f until p holds.
The initial seed value is x.
'''
def go(f):
def g(x):
v = x
while not p(v):
v = f(v)
return v
return g
return go
# MAIN ---
if __name__ == '__main__':
main()
- Output:
4-space indented (A -> Z): alpha epsilon iota theta zeta beta delta gamma kappa lambda mu tab indented (A -> Z): alpha epsilon iota theta zeta beta delta gamma kappa lambda mu Unknown 1 (A -> Z): Mixed indent characters found in line 4 Unknown 2 (A -> Z): Inconsistent indentation width at line 3 4-space indented (Z -> A): zeta gamma mu lambda kappa delta beta alpha theta iota epsilon tab indented (Z -> A): zeta gamma mu lambda kappa delta beta alpha theta iota epsilon Unknown 1 (Z -> A): Mixed indent characters found in line 4 Unknown 2 (Z -> A): Inconsistent indentation width at line 3
Wren[edit]
import "/sort" for Sort
import "/fmt" for Fmt
var sortedOutline = Fn.new { |originalOutline, ascending|
var outline = originalOutline.toList // make copy in case we mutate it
var indent = ""
var del = "\x7f"
var sep = "\0"
var messages = []
if (outline[0].trimStart(" \t") != outline[0]) {
System.print(" outline structure is unclear")
return
}
for (i in 1...outline.count) {
var line = outline[i]
var lc = line.count
if (line.startsWith(" ") || line.startsWith(" \t") || line.startsWith("\t")) {
var lc2 = line.trimStart(" \t").count
var currIndent = line[0...lc-lc2]
if (indent == "") {
indent = currIndent
} else {
var correctionNeeded = false
if ((currIndent.contains("\t") && !indent.contains("\t")) ||
(!currIndent.contains("\t") && indent.contains("\t"))) {
messages.add(indent + "corrected inconsistent whitespace use at line '%(line)'")
correctionNeeded = true
} else if (currIndent.count % indent.count != 0) {
messages.add(indent + "corrected inconsistent indent width at line '%(line)'")
correctionNeeded = true
}
if (correctionNeeded) {
var mult = (currIndent.count / indent.count).round
outline[i] = (indent * mult) + line[lc-lc2..-1]
}
}
}
}
var levels = List.filled(outline.count, 0)
levels[0] = 1
var level = 1
var margin = ""
while (!levels.all { |l| l > 0 }) {
var mc = margin.count
for (i in 1...outline.count) {
if (levels[i] == 0) {
var line = outline[i]
if (line.startsWith(margin) && line[mc] != " " && line[mc] != "\t") levels[i] = level
}
}
margin = margin + indent
level = level + 1
}
var lines = List.filled(outline.count, "")
lines[0] = outline[0]
var nodes = []
for (i in 1...outline.count) {
if (levels[i] > levels[i-1]) {
nodes.add((nodes.count == 0) ? outline[i - 1] : sep + outline[i-1])
} else if (levels[i] < levels[i-1]) {
var j = levels[i-1] - levels[i]
for (k in 1..j) nodes.removeAt(-1)
}
if (nodes.count > 0) {
lines[i] = nodes.join() + sep + outline[i]
} else {
lines[i] = outline[i]
}
}
if (ascending) {
Sort.insertion(lines)
} else {
var maxLen = lines.reduce(0) { |max, l| (l.count > max) ? l.count : max }
for (i in 0...lines.count) lines[i] = Fmt.ljust(maxLen, lines[i], del)
Sort.insertion(lines, true)
}
for (i in 0...lines.count) {
var s = lines[i].split(sep)
lines[i] = s[-1]
if (!ascending) lines[i] = lines[i].trimEnd(del)
}
if (messages.count > 0) {
System.print(messages.join("\n"))
System.print()
}
System.print(lines.join("\n"))
}
var outline = [
"zeta",
" beta",
" gamma",
" lambda",
" kappa",
" mu",
" delta",
"alpha",
" theta",
" iota",
" epsilon"
]
var outline2 = outline.map { |s| s.replace(" ", "\t") }.toList
var outline3 = [
"alpha",
" epsilon",
" iota",
" theta",
"zeta",
" beta",
" delta",
" gamma",
" \t kappa", // same length but \t instead of space
" lambda",
" mu"
]
var outline4 = [
"zeta",
" beta",
" gamma",
" lambda",
" kappa",
" mu",
" delta",
"alpha",
" theta",
" iota",
" epsilon"
]
System.print("Four space indented outline, ascending sort:")
sortedOutline.call(outline, true)
System.print("\nFour space indented outline, descending sort:")
sortedOutline.call(outline, false)
System.print("\nTab indented outline, ascending sort:")
sortedOutline.call(outline2, true)
System.print("\nTab indented outline, descending sort:")
sortedOutline.call(outline2, false)
System.print("\nFirst unspecified outline, ascending sort:")
sortedOutline.call(outline3, true)
System.print("\nFirst unspecified outline, descending sort:")
sortedOutline.call(outline3, false)
System.print("\nSecond unspecified outline, ascending sort:")
sortedOutline.call(outline4, true)
System.print("\nSecond unspecified outline, descending sort:")
sortedOutline.call(outline4, false)
- Output:
Four space indented outline, ascending sort: alpha epsilon iota theta zeta beta delta gamma kappa lambda mu Four space indented outline, descending sort: zeta gamma mu lambda kappa delta beta alpha theta iota epsilon Tab indented outline, ascending sort: alpha epsilon iota theta zeta beta delta gamma kappa lambda mu Tab indented outline, descending sort: zeta gamma mu lambda kappa delta beta alpha theta iota epsilon First unspecified outline, ascending sort: corrected inconsistent whitespace use at line ' kappa' alpha epsilon iota theta zeta beta delta gamma kappa lambda mu First unspecified outline, descending sort: corrected inconsistent whitespace use at line ' kappa' zeta gamma mu lambda kappa delta beta alpha theta epsilon iota Second unspecified outline, ascending sort: corrected inconsistent indent width at line ' gamma' corrected inconsistent indent width at line ' kappa' alpha epsilon iota theta zeta beta delta gamma kappa lambda mu Second unspecified outline, descending sort: corrected inconsistent indent width at line ' gamma' corrected inconsistent indent width at line ' kappa' zeta gamma mu lambda kappa delta beta alpha theta iota epsilon