Tree datastructures

From Rosetta Code
Revision as of 01:41, 16 October 2019 by Hout (talk | contribs) (→‎{{header|Python}}: Added a first draft in JavaScript)
Tree datastructures is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

The following shows a tree of data with nesting denoted by visual levels of indentation:

RosettaCode
    rocks
        code
        comparison
        wiki
    mocks
        golfing

A common datastructure for trees is to define node structures having a name and a, (possibly empty), list of child nodes. The nesting of nodes captures the indentation of the tree. Lets call this the nest form.

Another datastructure for trees is to construct from the root an ordered list of the nodes level of indentation and the name of that node. The indentation for the root node is zero; node 'rocks is indented by one level from the left, and so on. Lets call this the indent form.

0 RosettaCode
1 rocks
2 code
...
Task
  1. Create/use a nest datastructure format and textual representation for arbitrary trees.
  2. Create/use an indent datastructure format and textual representation for arbitrary trees.
  3. Create methods/classes/proceedures/routines etc to:
    1. Change from a nest tree datastructure to an indent one.
    2. Change from an indent tree datastructure to a nest one
  4. Use the above to encode the example at the start into the nest format, and show it.
  5. transform the initial nest format to indent format and show it.
  6. transform the indent format to final nest format and show it.
  7. Compare initial and final nest formats which should be the same.

Show all output on this page.

JavaScript

Parses the initial tree from outline text, and writes out the flat and nested structures in a minimal JSON format: <lang javascript>(() => {

   'use strict';
   // main :: IO ()
   const main = () => {
       // (INDENT, STRING) PAIRS FROM OUTLINE ------------
       const
           indentLevelTuplesA = indentLevelsFromLines(
               lines(strOutlineB)
           );
       // LIST OF TREES FROM LIST OF (INDENT, STRING) PAIRS
       const
           forestA = forestFromIndentLevels(
               indentLevelTuplesA
           );
       // (INDENT, STRING) PAIRS FROM LIST OF TREES ------
       const
           indentLevelTuplesB = indentLevelsFromForest(forestA);
       // LIST OF TREES FROM SECONDARY (INDENT, STRING) PAIRS
       const forestB = forestFromIndentLevels(
           indentLevelTuplesB
       );
       // JSON OUTPUT OF FORESTS AND INDENT TUPLES -------
       console.log('Tree structure from outline:\n')
       console.log(jsonFromForest(forestA));
       console.log('\n\nIndent levels from tree structure:\n')
       console.log(jsonFromIndentLevels(indentLevelTuplesB));
       console.log('\nTree structure from indent levels:\n')
       console.log(jsonFromForest(forestB));
   };
   // CONVERSIONS BETWEEN OUTLINES, TREES, AND (LEVEL, VALUE) PAIRS
   // indentLevelsFromLines :: [String] -> [(Int, String)]
   const indentLevelsFromLines = xs => {
       const
           indentTextPairs = xs.map(compose(
               firstArrow(length), span(isSpace)
           )),
           indentUnit = minimum(indentTextPairs.flatMap(pair => {
               const w = fst(pair);
               return 0 < w ? [w] : [];
           }));
       return indentTextPairs.map(
           firstArrow(flip(div)(indentUnit))
       );
   };
   // forestFromIndentLevels :: [(Int, String)] -> [Tree String]
   const forestFromIndentLevels = tuples => {
       const go = xs =>
           0 < xs.length ? (() => {
               const [n, s] = Array.from(xs[0]);
               // Lines indented under this line,
               // tupled with all the rest.
               const [firstTreeLines, rest] = Array.from(
                   span(x => n < x[0])(xs.slice(1))
               );
               // This first tree, and then the rest.
               return [Node(s)(go(firstTreeLines))]
                   .concat(go(rest));
           })() : [];
       return go(tuples);
   };
   // indentLevelsFromForest
   const indentLevelsFromForest = trees => {
       const go = n => node => [
               [n, node.root]
           ]
           .concat(node.nest.flatMap(go(1 + n)))
       return trees.flatMap(go(0))
   };
   // JSON RENDERING OF NESTED LINES AND (LEVEL, VALUE) PAIRS
   // jsonFromForest :: [Tree a] -> JSON String
   const jsonFromForest = trees =>
       JSON.stringify(
           nestedListsFromForest(trees),
           null, 2
       );
   // nestedListsFromForest :: [Tree a] -> NestedList a
   const nestedListsFromForest = xs => {
       const go = node => [node.root, node.nest.map(go)];
       return xs.map(go);
   };
   // jsonFromIndentLevels :: [(Int, String)] -> JSON String
   const jsonFromIndentLevels = xs =>
       JSON.stringify(
           xs.map(x => Array.from(x)),
           null, 2
       );


   // GENERIC FUNCTIONS ----------------------------
   // Node :: a -> [Tree a] -> Tree a
   const Node = v => xs => ({
       type: 'Node',
       root: v, // any type of value (consistent across tree)
       nest: xs || []
   });
   // Tuple (,) :: a -> b -> (a, b)
   const Tuple = a => b => ({
       type: 'Tuple',
       '0': a,
       '1': b,
       length: 2
   });
   // compose (<<<) :: (b -> c) -> (a -> b) -> a -> c
   const compose = (...fs) =>
       x => fs.reduceRight((a, f) => f(a), x);
   // div :: Int -> Int -> Int
   const div = x => y => Math.floor(x / y);
   // firstArrow :: (a -> b) -> ((a, c) -> (b, c))
   const firstArrow = f => xy => Tuple(f(xy[0]))(
       xy[1]
   );
   // flip :: (a -> b -> c) -> b -> a -> c
   const flip = f =>
       1 < f.length ? (
           (a, b) => f(b, a)
       ) : (x => y => f(y)(x));
   // foldl1 :: (a -> a -> a) -> [a] -> a
   const foldl1 = f => xs =>
       1 < xs.length ? xs.slice(1)
       .reduce(uncurry(f), xs[0]) : xs[0];
   // fst :: (a, b) -> a
   const fst = tpl => tpl[0];
   // isSpace :: Char -> Bool
   const isSpace = c => /\s/.test(c);
   // Returns Infinity over objects without finite length.
   // This enables zip and zipWith to choose the shorter
   // argument when one is non-finite, like cycle, repeat etc
   // length :: [a] -> Int
   const length = xs =>
       (Array.isArray(xs) || 'string' === typeof xs) ? (
           xs.length
       ) : Infinity;
   // lines :: String -> [String]
   const lines = s => s.split(/[\r\n]/);
   // minimum :: Ord a => [a] -> a
   const minimum = xs =>
       0 < xs.length ? (
           foldl1(a => x => x < a ? x : a)(xs)
       ) : undefined;
   // span :: (a -> Bool) -> [a] -> ([a], [a])
   const span = p => xs => {
       const iLast = xs.length - 1;
       return splitAt(
           until(i => iLast < i || !p(xs[i]))(
               succ
           )(0)
       )(xs);
   };
   // splitAt :: Int -> [a] -> ([a], [a])
   const splitAt = n => xs =>
       Tuple(xs.slice(0, n))(
           xs.slice(n)
       );
   // succ :: Enum a => a -> a
   const succ = x => 1 + x;
   // uncurry :: (a -> b -> c) -> ((a, b) -> c)
   const uncurry = f =>
       (x, y) => f(x)(y);
   // until :: (a -> Bool) -> (a -> a) -> a -> a
   const until = p => f => x => {
       let v = x;
       while (!p(v)) v = f(v);
       return v;
   };
   // SAMPLE OUTLINES ------------------------------------
   const strOutlineA = `Heilmeier catechism
   Objectives and benefits
       What are you trying to do?
           Articulate your objectives using absolutely no jargon.
           What are the problems you address ?
               How is it done today,
               and what are the limits of current practice?
           What is your solution ?
               What is new in your approach
               and why do you think it will be successful?
       Who cares? If you are successful, what difference will it make?
   Costs
       What are the risks?
       How much will it cost?
       How long will it take?
   Indicators
       What are the mid-term and final “exams” to check for success?`;
   const strOutlineB = `Rosetta stone
   is a granodiorite stele
       engraved
           with Greek and Egyptian texts
       in different scripts.
   which shed new light
       on lexical equivalences`;
   // MAIN ---
   return main();

})();</lang>

Output:
Tree structure from outline:

[
  [
    "Rosetta stone",
    [
      [
        "is a granodiorite stele",
        [
          [
            "engraved",
            [
              [
                "with Greek and Egyptian texts",
                []
              ]
            ]
          ],
          [
            "in different scripts.",
            []
          ]
        ]
      ],
      [
        "which shed new light",
        [
          [
            "on lexical equivalences",
            []
          ]
        ]
      ]
    ]
  ]
]

Indent levels from tree structure:

[
  [
    0,
    "Rosetta stone"
  ],
  [
    1,
    "is a granodiorite stele"
  ],
  [
    2,
    "engraved"
  ],
  [
    3,
    "with Greek and Egyptian texts"
  ],
  [
    2,
    "in different scripts."
  ],
  [
    1,
    "which shed new light"
  ],
  [
    2,
    "on lexical equivalences"
  ]
]

Tree structure from indent levels:

[
  [
    "Rosetta stone",
    [
      [
        "is a granodiorite stele",
        [
          [
            "engraved",
            [
              [
                "with Greek and Egyptian texts",
                []
              ]
            ]
          ],
          [
            "in different scripts.",
            []
          ]
        ]
      ],
      [
        "which shed new light",
        [
          [
            "on lexical equivalences",
            []
          ]
        ]
      ]
    ]
  ]
]


Python

Just arranges the standard lists and tuples for the datastructures allowing pprint to show the different arrangement of storage.

<lang python>from pprint import pprint as pp from collections import namedtuple

def to_indent(node, depth=0, flat=None):

   if flat is None:
       flat = []
   if node:
       flat.append((depth, node[0]))
   for child in node[1]:
       to_indent(child, depth + 1, flat)
   return flat

def to_nest(lst, depth=0, level=None):

   if level is None:
       level = []
   while lst:
       d, name = lst[0]
       if d == depth:
           children = []
           level.append((name, children))
           lst.pop(0)
       elif d > depth:  # down
           to_nest(lst, d, children)
       elif d < depth:  # up
           return
   return level[0] if level else None
                   

if __name__ == '__main__':

   print('Start Nest format:')
   nest = ('RosettaCode', [('rocks', [('code', []), ('comparison', []), ('wiki', [])]), 
                           ('mocks', [('golfing', [])])])
   pp(nest, width=25)
   print('\n... To Indent format:')
   as_ind = to_indent(nest)
   pp(as_ind, width=25)
   print('\n... To Nest format:')
   as_nest = to_nest(as_ind)
   pp(as_nest, width=25)
   if nest != as_nest:
       print("Whoops round-trip issues")</lang>
Output:
Start Nest format:
('RosettaCode',
 [('rocks',
   [('code', []),
    ('comparison', []),
    ('wiki', [])]),
  ('mocks',
   [('golfing', [])])])

... To Indent format:
[(0, 'RosettaCode'),
 (1, 'rocks'),
 (2, 'code'),
 (2, 'comparison'),
 (2, 'wiki'),
 (1, 'mocks'),
 (2, 'golfing')]

... To Nest format:
('RosettaCode',
 [('rocks',
   [('code', []),
    ('comparison', []),
    ('wiki', [])]),
  ('mocks',
   [('golfing', [])])])