ASCII art diagram converter

From Rosetta Code
Revision as of 15:30, 1 November 2020 by rosettacode>Gumm (Align pattern right)

Given the RFC 1035 message diagram from Section 4.1.1 (Header section format) as a string: http://www.ietf.org/rfc/rfc1035.txt

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Where (every column of the table is 1 bit):

ID is 16 bits
QR = Query (0) or Response (1)
Opcode = Four bits defining kind of query:
  0:    a standard query (QUERY)
  1:    an inverse query (IQUERY)
  2:    a server status request (STATUS)
  3-15: reserved for future use
AA = Authoritative Answer bit
TC = Truncation bit
RD = Recursion Desired bit
RA = Recursion Available bit
Z = Reserved
RCODE = Response code
QDCOUNT = Question Count
ANCOUNT = Answer Count
NSCOUNT = Authority Count
ARCOUNT = Additional Count
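
As a rough illustration of the field list above, the second 16-bit word of the header can be split into its flags with shifts and masks. This is only a sketch in Python: `FLAG_FIELDS` and `decode_flags` are hypothetical names introduced here, and the sample value `0x7BBF` is arbitrary.

```python
# (name, width in bits) pairs, most significant bit first, following the
# diagram row QR|Opcode|AA|TC|RD|RA|Z|RCODE. Names are illustrative only.
FLAG_FIELDS = [("QR", 1), ("Opcode", 4), ("AA", 1), ("TC", 1),
               ("RD", 1), ("RA", 1), ("Z", 3), ("RCODE", 4)]

def decode_flags(word):
    """Split a 16-bit integer into the flag fields (hypothetical helper)."""
    fields = {}
    remaining = 16
    for name, width in FLAG_FIELDS:
        remaining -= width                       # bits still to the right of this field
        fields[name] = (word >> remaining) & ((1 << width) - 1)
    return fields

print(decode_flags(0x7BBF))
# {'QR': 0, 'Opcode': 15, 'AA': 0, 'TC': 1, 'RD': 1, 'RA': 1, 'Z': 3, 'RCODE': 15}
```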

Write a function, member function, class or template that accepts a similar multi-line string as input to define a data structure or something else able to decode or store a header with that specified bit structure.

If your language has macros, introspection, code generation, or powerful enough templates, then accept such string at compile-time to define the header data structure statically.

Such "Header" function or template should accept a table with 8, 16, 32 or 64 columns, and any number of rows. For simplicity the only allowed symbols to define the table are + - | (plus, minus, pipe), and whitespace. Lines of the input string composed just of whitespace should be ignored. Leading and trailing whitespace in the input string should be ignored, as well as before and after each table row. The box for each bit of the diagram takes four chars "+--+". The code should perform some validation of the input string, but for brevity a full validation is not required.
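
The parsing rules in the paragraph above can be sketched roughly as follows. This is a minimal Python illustration, not a reference solution: `parse_diagram` is a hypothetical name, only the column count is validated, and the 8-column example table is made up for the demonstration.

```python
def parse_diagram(diagram):
    """Return a list of (name, bits) pairs; a sketch with minimal validation."""
    # Ignore blank lines and whitespace around each row.
    lines = [ln.strip() for ln in diagram.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty diagram")
    columns = (len(lines[0]) - 1) // 3           # each bit is "+--", plus one closing "+"
    if columns not in (8, 16, 32, 64):
        raise ValueError("expected 8, 16, 32 or 64 columns")
    fields = []
    for line in lines:
        if line.startswith("+"):
            continue                              # separator row
        for cell in line[1:-1].split("|"):
            bits = (len(cell) + 1) // 3           # a cell spanning n bits is 3n-1 chars wide
            fields.append((cell.strip(), bits))
    return fields

SEP = "+--" * 8 + "+"
EXAMPLE = f"""
    {SEP}
    |QR|   Opcode  |AA|TC|RD|
    {SEP}
"""
print(parse_diagram(EXAMPLE))
# [('QR', 1), ('Opcode', 4), ('AA', 1), ('TC', 1), ('RD', 1)]
```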

Bonus: perform a thorough validation of the input string.
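
One possible reading of "thorough" validation is sketched below in Python. The checks chosen here are assumptions, not requirements stated by the task: rows must strictly alternate between separators and field rows, field names may contain only letters, digits and underscores, and every `|` must sit on a bit boundary. The name `validate` is hypothetical.

```python
import re

def validate(diagram):
    """Raise ValueError on the first problem found; return the cleaned rows otherwise."""
    lines = [ln.strip() for ln in diagram.splitlines() if ln.strip()]
    if len(lines) < 3 or len(lines) % 2 == 0:
        raise ValueError("expected an odd number of rows, at least 3")
    columns, rest = divmod(len(lines[0]) - 1, 3)
    if rest or columns not in (8, 16, 32, 64):
        raise ValueError("expected 8, 16, 32 or 64 columns")
    separator = "+--" * columns + "+"
    for i, line in enumerate(lines):
        if i % 2 == 0:                            # even rows must be separators
            if line != separator:
                raise ValueError(f"bad separator on row {i}")
            continue
        if len(line) != len(separator):
            raise ValueError(f"row {i} has the wrong width")
        if not re.fullmatch(r"\|(?: *[A-Za-z0-9_]* *\|)+", line):
            raise ValueError(f"row {i} contains unexpected characters")
        # Every '|' must sit on a bit boundary, i.e. at an index divisible by 3.
        if any(ch == "|" and j % 3 for j, ch in enumerate(line)):
            raise ValueError(f"row {i} has a misaligned '|'")
    return lines

sep = "+--" * 8 + "+"
row = "|QR|" + "Opcode".center(11) + "|AA|TC|RD|"
print(len(validate("\n".join([sep, row, sep]))))   # 3
```

A solution that accepts fields spanning several rows would need to relax the strict alternation check.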

D

This solution generates anonymous struct code at compile time that can be mixed into a struct or class.

<lang d>string makeStructFromDiagram(in string rawDiagram) pure @safe {

   import std.conv: text;
   import std.format: format;
   import std.string: strip, splitLines, indexOf;
   import std.array: empty, popFront;
   static void commitCurrent(ref uint anonCount,
                             ref uint totalBits,
                             ref size_t currentBits,
                             ref string code,
                             ref string currentName) pure @safe {
       if (currentBits) {
           code ~= "\t";
           currentName = currentName.strip;
           if (currentName.empty) {
               anonCount++;
               currentName = "anonymous_field_" ~ anonCount.text;
           }
           string type;
           if (currentBits == 1)
               type = "bool";
           else if (currentBits <= ubyte.sizeof * 8)
               type = "ubyte";
           else if (currentBits <= ushort.sizeof * 8)
               type = "ushort";
           else if (currentBits <= uint.sizeof * 8)
               type = "uint";
           else if (currentBits <= ulong.sizeof * 8)
               type = "ulong";
           //else if (currentBits <= ucent.sizeof * 8)
           //    type = "ucent";
           else assert(0, "Too many bits for the item " ~ currentName);
           immutable byteOffset = totalBits / 8;
           immutable bitOffset = totalBits % 8;


           // Getter:
           code ~= "@property " ~ type ~ " " ~ currentName ~
                   "() const pure nothrow @safe {\n";
           code ~= "\t\t";
           if (currentBits == 1) {
               code ~= format("return (_payload[%d] & (1 << (7-%d))) ? true : false;",
                              byteOffset, bitOffset);
           } else if (currentBits < 8) {
               auto mask = (1 << currentBits) - 1;
               mask <<= 7 - bitOffset - currentBits + 1;
               code ~= format("return (_payload[%d] & 0b%08b) >> %d;",
                              byteOffset, mask, 7 - bitOffset - currentBits + 1);
           } else {
               assert(currentBits % 8 == 0);
               assert(bitOffset == 0);
               code ~= type ~ " v = 0;\n\t\t";
               code ~= "version(LittleEndian) {\n\t\t";
               foreach (immutable i; 0 .. currentBits / 8)
                   code ~=  "\tv |= (cast(" ~ type ~ ") _payload[" ~
                            text(byteOffset + i) ~ "]) << (" ~
                            text((currentBits / 8) - i - 1) ~
                            " * 8);\n\t\t";
               code ~= "} else static assert(0);\n\t\t";
               code ~= "return v;";
           }
           code ~= "\n";
           code ~= "\t}\n\t";


           // Setter:
           code ~= "@property void " ~ currentName ~ "(in " ~ type ~
                   " value) pure nothrow @safe {\n";
           code ~= "\t\t";
           if (currentBits < 8) {
               auto mask = (1 << currentBits) - 1;
               mask <<= 7 - bitOffset - currentBits + 1;
               code ~= format("_payload[%d] &= ~0b%08b;\n\t\t",
                              byteOffset, mask);
               code ~= "assert(value < " ~ text(1 << currentBits) ~
                       ");\n\t\t";
               code~=format("_payload[%d] |= cast(ubyte) value << %d;",
                              byteOffset, 7 - bitOffset - currentBits + 1);
           } else {
               assert(currentBits % 8 == 0);
               assert(bitOffset == 0);
               code ~= "version(LittleEndian) {\n\t\t";
               foreach (immutable i; 0 .. currentBits / 8)
                   code ~= "\t_payload[" ~ text(byteOffset + i) ~
                           "] = (value >> (" ~
                           text((currentBits / 8) - i - 1) ~
                           " * 8) & 0xff);\n\t\t";
               code ~= "} else static assert(0);";
           }
           code ~= "\n";
           code ~= "\t}\n";
           totalBits += currentBits;
       }
       currentBits = 0;
       currentName = null;
   }
   enum C : char { pipe='|', cross='+' }
   enum cWidth = 3; // Width of a bit cell in the table.
   immutable diagram = rawDiagram.strip;
   size_t bitCountPerRow = 0, currentBits;
   uint anonCount = 0, totalBits;
   string currentName;
   string code = "struct {\n"; // Anonymous.
   foreach (line; diagram.splitLines) {
       assert(!line.empty);
       line = line.strip;
       if (line[0] == C.cross) {
           commitCurrent(anonCount, totalBits, currentBits, code, currentName);
           if (bitCountPerRow == 0)
               bitCountPerRow = (line.length - 1) / cWidth;
           else
               assert(bitCountPerRow == (line.length - 1) / cWidth);
       } else {
           // A field of some sort.
           while (line.length > 2) {
               assert(line[0] != '/',
                      "Variable length data not supported");
               assert(line[0] == C.pipe, "Malformed table");
               line.popFront;
               const idx = line[0 .. $ - 1].indexOf(C.pipe);
               if (idx != -1) {
                   const field = line[0 .. idx];
                   line = line[idx .. $];
                   commitCurrent(anonCount, totalBits, currentBits, code, currentName);
                   currentName = field;
                   currentBits = (field.length + 1) / cWidth;
                   commitCurrent(anonCount, totalBits, currentBits, code, currentName);
               } else {
                   // The full row or a continuation of the last.
                   currentName ~= line[0 .. $ - 1];
                   // At this point, line does not include the first
                   // C.pipe, but the length will include the last.
                   currentBits += line.length / cWidth;
                   line = line[$ .. $];
               }
           }
       }
   }
   // Using bytes to avoid endianness issues.
   // hopefully the compiler will optimize it, otherwise
   // maybe we could specialize the properties more.
   code ~= "\n\tprivate ubyte[" ~ text((totalBits + 7) / 8) ~ "] _payload;\n";
   return code ~ "}";

}


void main() { // Testing.

   import std.stdio;
   enum diagram = "
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                      ID                       |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    QDCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    ANCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    NSCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    ARCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+";
   // To debug the code generation:
   //pragma(msg, diagram.makeStructFromDiagram);
   // Usage.
   static struct Header {
       mixin(diagram.makeStructFromDiagram);
   }
   Header h;
   h.ID = 10;
   h.RA = true;
   h.ARCOUNT = 255;
   h.Opcode = 7;
   // See the byte representation to test the setter's details.
   h._payload.writeln;
   // Test the getters:
   assert(h.ID == 10);
   assert(h.RA == true);
   assert(h.ARCOUNT == 255);
   assert(h.Opcode == 7);

}</lang>

Output:
[0, 10, 56, 128, 0, 0, 0, 0, 0, 0, 0, 255]
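
The byte listing above can be cross-checked by hand: Opcode occupies bits 17 to 20, so the value 7 lands in the third byte as 0b00111000 = 56, and RA is bit 24, the top bit of the fourth byte, giving 128. The Python sketch below reproduces the same bytes. It is not a translation of the D code: it treats the whole header as one big-endian integer rather than working byte by byte, and `LAYOUT`, `set_field` and `get_field` are hypothetical names.

```python
# (start bit, width) of the fields set in the test; bit 0 is the most
# significant bit of byte 0, as in the RFC 1035 diagram.
LAYOUT = {"ID": (0, 16), "Opcode": (17, 4), "RA": (24, 1), "ARCOUNT": (80, 16)}

def set_field(payload, name, value):
    start, width = LAYOUT[name]
    shift = len(payload) * 8 - start - width     # distance of the field from the right edge
    mask = ((1 << width) - 1) << shift
    whole = int.from_bytes(payload, "big")       # header as one big-endian integer
    whole = (whole & ~mask) | ((value << shift) & mask)
    payload[:] = whole.to_bytes(len(payload), "big")

def get_field(payload, name):
    start, width = LAYOUT[name]
    shift = len(payload) * 8 - start - width
    return (int.from_bytes(payload, "big") >> shift) & ((1 << width) - 1)

payload = bytearray(12)
for name, value in [("ID", 10), ("RA", 1), ("ARCOUNT", 255), ("Opcode", 7)]:
    set_field(payload, name, value)
print(list(payload))   # [0, 10, 56, 128, 0, 0, 0, 0, 0, 0, 0, 255]
```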

Static support for BigEndian is easy to add.

It also supports larger fields, such as this one, which is 32 bits long:

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                   ThirtyTwo                   |
|                                               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
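
The idea of a field continuing over consecutive rows can be sketched in Python as follows. This is an illustration under assumptions, not the D implementation: a full-width row with no separator before it is treated as a continuation of the previous row, the diagram is assumed to end with a separator, and `field_bits` is a hypothetical name.

```python
def field_bits(diagram):
    """Return (name, bits) pairs, letting a field continue over consecutive rows."""
    fields, name, bits = [], "", 0
    for raw in diagram.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("+"):                 # a separator closes the pending field
            if bits:
                fields.append((name, bits))
            name, bits = "", 0
            continue
        cells = line[1:-1].split("|")
        if len(cells) == 1:                       # full-width row: start or continuation
            name += cells[0].strip()
            bits += (len(cells[0]) + 1) // 3
        else:                                     # several fields share this row
            if bits:
                fields.append((name, bits))
                name, bits = "", 0
            fields.extend((c.strip(), (len(c) + 1) // 3) for c in cells)
    return fields

SEP = "+--" * 16 + "+"
DIAGRAM = "\n".join([SEP, "|" + "ThirtyTwo".center(47) + "|", "|" + " " * 47 + "|", SEP])
print(field_bits(DIAGRAM))   # [('ThirtyTwo', 32)]
```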

Go

<lang go>package main

import (

   "fmt"
   "log"
   "math/big"
   "strings"

)

type result struct {

   name  string
   size  int
   start int
   end   int

}

func (r result) String() string {

   return fmt.Sprintf("%-7s   %2d    %3d   %3d", r.name, r.size, r.start, r.end)

}

func validate(diagram string) []string {

   var lines []string
   for _, line := range strings.Split(diagram, "\n") {
       line = strings.Trim(line, " \t")
       if line != "" {
           lines = append(lines, line)
       }
   }
   if len(lines) == 0 {
       log.Fatal("diagram has no non-empty lines!")
   }
   width := len(lines[0])
   cols := (width - 1) / 3
   if cols != 8 && cols != 16 && cols != 32 && cols != 64 {
       log.Fatal("number of columns should be 8, 16, 32 or 64")
   }
   if len(lines)%2 == 0 {
       log.Fatal("number of non-empty lines should be odd")
   }
   if lines[0] != strings.Repeat("+--", cols)+"+" {
       log.Fatal("incorrect header line")
   }
   for i, line := range lines {
       if i == 0 {
           continue
       } else if i%2 == 0 {
           if line != lines[0] {
               log.Fatal("incorrect separator line")
           }
       } else if len(line) != width {
           log.Fatal("inconsistent line widths")
       } else if line[0] != '|' || line[width-1] != '|' {
           log.Fatal("non-separator lines must begin and end with '|'")
       }
   }
   return lines

}

func decode(lines []string) []result {

   fmt.Println("Name     Bits  Start  End")
   fmt.Println("=======  ====  =====  ===")
   start := 0
   width := len(lines[0])
   var results []result
   for i, line := range lines {
       if i%2 == 0 {
           continue
       }
       line := line[1 : width-1]
       for _, name := range strings.Split(line, "|") {
           size := (len(name) + 1) / 3
           name = strings.TrimSpace(name)
           res := result{name, size, start, start + size - 1}
           results = append(results, res)
           fmt.Println(res)
           start += size
       }
   }
   return results

}

func unpack(results []result, hex string) {

   fmt.Println("\nTest string in hex:")
   fmt.Println(hex)
   fmt.Println("\nTest string in binary:")
   bin := hex2bin(hex)
   fmt.Println(bin)
   fmt.Println("\nUnpacked:\n")
   fmt.Println("Name     Size  Bit pattern")
   fmt.Println("=======  ====  ================")
   for _, res := range results {
       fmt.Printf("%-7s   %2d   %s\n", res.name, res.size, bin[res.start:res.end+1])
   }

}

func hex2bin(hex string) string {

   z := new(big.Int)
   z.SetString(hex, 16)
   return fmt.Sprintf("%0*b", 4*len(hex), z)

}

func main() {

   const diagram = `
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                      ID                       |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    QDCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    ANCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    NSCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    ARCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   `
   lines := validate(diagram)
   fmt.Println("Diagram after trimming whitespace and removal of blank lines:\n")
   for _, line := range lines {
       fmt.Println(line)
   }
   fmt.Println("\nDecoded:\n")
   results := decode(lines)
   hex := "78477bbf5496e12e1bf169a4" // test string
   unpack(results, hex)

}</lang>

Output:
Diagram after trimming whitespace and removal of blank lines:

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Decoded:

Name     Bits  Start  End
=======  ====  =====  ===
ID        16      0    15
QR         1     16    16
Opcode     4     17    20
AA         1     21    21
TC         1     22    22
RD         1     23    23
RA         1     24    24
Z          3     25    27
RCODE      4     28    31
QDCOUNT   16     32    47
ANCOUNT   16     48    63
NSCOUNT   16     64    79
ARCOUNT   16     80    95

Test string in hex:
78477bbf5496e12e1bf169a4

Test string in binary:
011110000100011101111011101111110101010010010110111000010010111000011011111100010110100110100100

Unpacked:

Name     Size  Bit pattern
=======  ====  ================
ID        16   0111100001000111
QR         1   0
Opcode     4   1111
AA         1   0
TC         1   1
RD         1   1
RA         1   1
Z          3   011
RCODE      4   1111
QDCOUNT   16   0101010010010110
ANCOUNT   16   1110000100101110
NSCOUNT   16   0001101111110001
ARCOUNT   16   0110100110100100

J

<lang J>require'strings'

soul=: -. {.
normalize=: [:soul' ',dltb;._2

mask=: 0: _1} '+' = {.
partition=: '|' = mask #"1 soul
labels=: ;@(([: <@}: <@dltb;._1)"1~ '|'&=)@soul
names=: ;:^:(0 = L.)

unpacker=:1 :0

 p=. , partition normalize m
 p #.;.1 (8#2) ,@:#: ]

)

packer=:1 :0

 w=. -#;.1 ,partition normalize m
 _8 (#.\ ;) w ({. #:)&.> ]

)

getter=:1 :0

 nm=. labels normalize m
 (nm i. names@[) { ]

)

setter=:1 :0

 q=. ''''
 n=. q,~q,;:inv labels normalize m
 1 :('(',n,' i.&names m)}')

)

starter=:1 :0

 0"0 labels normalize m

)</lang>

Sample definition (note the deliberate introduction of extraneous whitespace in locations where the task requires us to ignore it):

<lang j>sample=: 0 :0

 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

)

unpack=: sample unpacker
pack=: sample packer
get=: sample getter
set=: sample setter
start=: sample starter</lang>

Example data for sample definition:

<lang J>

  4095 13 5 6144 4096 'ID Opcode RCODE ARCOUNT QDCOUNT' set start

4095 0 13 0 0 0 0 0 5 4096 0 0 6144

  pack 4095 13 5 6144 4096 'ID Opcode RCODE ARCOUNT QDCOUNT' set start

15 255 104 5 16 0 0 0 0 0 24 0

  unpack 0 10 56 128 0 0 0 0 0 0 0 255

10 0 7 0 0 0 1 0 0 0 0 0 255

  'Opcode' get unpack 0 10 56 128 0 0 0 0 0 0 0 255

7</lang>

In other words:

unpack converts an octet sequence to the corresponding numeric sequence
pack converts a numeric sequence to the corresponding octet sequence
get extracts named elements from the numeric sequence
set updates named elements in the numeric sequence
start represents the default "all zeros" sequence which may be used to derive other sequences

Note that this implementation assumes that the ASCII diagram represents the native word width on a single line, and assumes well-formed data.
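
For readers unfamiliar with J, the pack and unpack operations described above can be approximated in Python. This is a sketch, not a translation of the J code: the field widths are written out by hand in `WIDTHS` (a hypothetical name) instead of being derived from the diagram, and the asserts reuse the numbers from the J session above.

```python
# Field widths in diagram order: ID, QR, Opcode, AA, TC, RD, RA, Z, RCODE,
# QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (hard-coded for this sketch).
WIDTHS = [16, 1, 4, 1, 1, 1, 1, 3, 4, 16, 16, 16, 16]

def pack(values):
    """Numeric sequence -> list of octets."""
    whole = 0
    for value, width in zip(values, WIDTHS):
        whole = (whole << width) | (value & ((1 << width) - 1))
    return list(whole.to_bytes(sum(WIDTHS) // 8, "big"))

def unpack(octets):
    """List of octets -> numeric sequence."""
    whole = int.from_bytes(bytes(octets), "big")
    values = []
    for width in reversed(WIDTHS):               # peel fields off the right-hand end
        values.append(whole & ((1 << width) - 1))
        whole >>= width
    return values[::-1]

assert pack([4095, 0, 13, 0, 0, 0, 0, 0, 5, 4096, 0, 0, 6144]) == \
    [15, 255, 104, 5, 16, 0, 0, 0, 0, 0, 24, 0]
assert unpack([0, 10, 56, 128, 0, 0, 0, 0, 0, 0, 0, 255]) == \
    [10, 0, 7, 0, 0, 0, 1, 0, 0, 0, 0, 0, 255]
```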

Java

Separate methods to validate, display, decode ASCII art, and decode hex value.

<lang Java>import java.math.BigInteger;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AsciiArtDiagramConverter {

   private static final String TEST = "+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+\r\n" +
           "|                      ID                       |\r\n" +
           "+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+\r\n" +
           "|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |\r\n" +
           "+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+\r\n" +
           "|                    QDCOUNT                    |\r\n" +
           "+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+\r\n" +
           "|                    ANCOUNT                    |\r\n" +
           "+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+\r\n" +
           "|                    NSCOUNT                    |\r\n" +
           "+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+\r\n" +
           "|                    ARCOUNT                    |\r\n" +
           "+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+";
   public static void main(String[] args) {
       validate(TEST);
       display(TEST);
       Map<String,List<Integer>> asciiMap = decode(TEST);
       displayMap(asciiMap);
       displayCode(asciiMap, "78477bbf5496e12e1bf169a4");
   }
   private static void displayCode(Map<String,List<Integer>> asciiMap, String hex) {
       System.out.printf("%nTest string in hex:%n%s%n%n", hex);
       String bin = new BigInteger(hex,16).toString(2);
       //  Zero pad in front as needed
       int length = 0;
       for ( String code : asciiMap.keySet() ) {
           List<Integer> pos = asciiMap.get(code);
           length += pos.get(1) - pos.get(0) + 1;
       }
       while ( length > bin.length() ) {
           bin = "0" + bin;
       }
       System.out.printf("Test string in binary:%n%s%n%n", bin);
       System.out.printf("Name      Size  Bit Pattern%n");
       System.out.printf("-------- -----  -----------%n");
       for ( String code : asciiMap.keySet() ) {
           List<Integer> pos = asciiMap.get(code);
           int start = pos.get(0);
           int end   = pos.get(1);
           System.out.printf("%-8s    %2d  %s%n", code, end-start+1, bin.substring(start, end+1));
       }
   }


   private static void display(String ascii) {
       System.out.printf("%nDiagram:%n%n");
       for ( String s : ascii.split("\\r\\n") ) {
           System.out.println(s);
       }
   }
   private static void displayMap(Map<String,List<Integer>> asciiMap) {
       System.out.printf("%nDecode:%n%n");


       System.out.printf("Name      Size  Start    End%n");
       System.out.printf("-------- -----  -----  -----%n");
       for ( String code : asciiMap.keySet() ) {
           List<Integer> pos = asciiMap.get(code);
           System.out.printf("%-8s    %2d     %2d     %2d%n", code, pos.get(1)-pos.get(0)+1, pos.get(0), pos.get(1));
       }
   }
   private static Map<String,List<Integer>> decode(String ascii) {
       Map<String,List<Integer>> map = new LinkedHashMap<>();
       String[] split = ascii.split("\\r\\n");
       int size = split[0].indexOf("+", 1) - split[0].indexOf("+");
       int length = split[0].length() - 1;
       for ( int i = 1 ; i < split.length ; i += 2 ) {
           int barIndex = 1;
           String test = split[i];
           int next;
           while ( barIndex < length && (next = test.indexOf("|", barIndex)) > 0 ) {
               //  List is start and end of code.
               List<Integer> startEnd = new ArrayList<>();
               startEnd.add((barIndex/size) + (i/2)*(length/size));
               startEnd.add(((next-1)/size) + (i/2)*(length/size));
               String code = test.substring(barIndex, next).replace(" ", "");
               map.put(code, startEnd);
               //  Next bar
               barIndex = next + 1;
           }
       }
       return map;
   }
   private static void validate(String ascii) {
       String[] split = ascii.split("\\r\\n");
       if ( split.length % 2 != 1 ) {
           throw new RuntimeException("ERROR 1:  Invalid number of input lines.  Line count = " + split.length);
       }
       int size = 0;
       for ( int i = 0 ; i < split.length ; i++ ) {
           String test = split[i];
           if ( i % 2 == 0 ) {
               //  Start with +, an equal number of -, end with +
               if ( ! test.matches("^\\+([-]+\\+)+$") ) {
                   throw new RuntimeException("ERROR 2:  Improper line format.  Line = " + test);
               }
               if ( size == 0 ) {
                   int firstPlus = test.indexOf("+");
                   int secondPlus = test.indexOf("+", 1);
                   size = secondPlus - firstPlus;
               }
               if ( ((test.length()-1) % size) != 0 ) {
                   throw new RuntimeException("ERROR 3:  Improper line format.  Line = " + test);
               }
               //  Equally spaced splits of +, -
               for ( int j = 0 ; j < test.length()-1 ; j += size ) {
                   if ( test.charAt(j) != '+' ) {
                       throw new RuntimeException("ERROR 4:  Improper line format.  Line = " + test);
                   }
                   for ( int k = j+1 ; k < j + size ; k++ ) {
                       if ( test.charAt(k) != '-' ) {
                           throw new RuntimeException("ERROR 5:  Improper line format.  Line = " + test);
                       }
                   }
               }
           }
           else {
               //  Vertical bar, followed by optional spaces, followed by name, followed by optional spaces, followed by vertical bar
               if ( ! test.matches("^\\|(\\s*[A-Za-z]+\\s*\\|)+$") ) {
                   throw new RuntimeException("ERROR 6:  Improper line format.  Line = " + test);
               }
               for ( int j = 0 ; j < test.length()-1 ; j += size ) {
                   for ( int k = j+1 ; k < j + size ; k++ ) {
                       //  Vertical bar only at boundaries
                       if ( test.charAt(k) == '|' ) {
                           throw new RuntimeException("ERROR 7:  Improper line format.  Line = " + test);
                       }
                   }
               }
           }
       }
   }

} </lang>

Output:

Diagram:

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Decode:

Name      Size  Start    End
-------- -----  -----  -----
ID          16      0     15
QR           1     16     16
Opcode       4     17     20
AA           1     21     21
TC           1     22     22
RD           1     23     23
RA           1     24     24
Z            3     25     27
RCODE        4     28     31
QDCOUNT     16     32     47
ANCOUNT     16     48     63
NSCOUNT     16     64     79
ARCOUNT     16     80     95

Test string in hex:
78477bbf5496e12e1bf169a4

Test string in binary:
011110000100011101111011101111110101010010010110111000010010111000011011111100010110100110100100

Name      Size  Bit Pattern
-------- -----  -----------
ID          16  0111100001000111
QR           1  0
Opcode       4  1111
AA           1  0
TC           1  1
RD           1  1
RA           1  1
Z            3  011
RCODE        4  1111
QDCOUNT     16  0101010010010110
ANCOUNT     16  1110000100101110
NSCOUNT     16  0001101111110001
ARCOUNT     16  0110100110100100

JavaScript

<lang javascript>// ------------------------------------------------------------[ Boilerplate ]--
const trimWhitespace = s => s.trim();
const isNotEmpty = s => s !== '';
const stringLength = s => s.length;
const hexToBin4 = s => parseInt(s, 16).toString(2).padStart(4, '0');
const concatHexToBin = (binStr, hexStr) => binStr.concat('', hexToBin4(hexStr));
const alignRight = n => s => `${s}`.padStart(n, ' ');
const alignLeft = n => s => `${s}`.padEnd(n, ' ');
const repeatChar = c => n => c.padStart(n, c);
const printDiagramInfo = map => {

 const pName = alignLeft(8);
 const p5 = alignRight(5);
 const line = repeatChar('-');
 const res = [];
 res.push([pName('Name'), p5('Size'), p5('Start'), p5('End')].join(' '));
 res.push([line(8), line(5), line(5), line(5)].join(' '));
 [...map.values()].forEach(({label, bitLength, start, end}) => {
   res.push([pName(label), p5(bitLength), p5(start), p5(end)].join(' '));
 })
 return res;

}

// -------------------------------------------------------------------[ Main ]--
const parseDiagram = dia => {

 const arr = dia.split('\n').map(trimWhitespace).filter(isNotEmpty);
 const hLine = arr[0];
 const bitTokens = hLine.split('+').map(trimWhitespace).filter(isNotEmpty);
 const bitWidth = bitTokens.length;
 const bitTokenWidth = bitTokens[0].length;
 const fields = arr.filter(e => e !== hLine);
 const allFields = fields.reduce((p, c) => [...p, ...c.split('|')], [])
     .filter(isNotEmpty);
 const lookupMap = Array(bitWidth).fill().reduce((p, c, i) => {
   const v = i + 1;
   const stringWidth = (v * bitTokenWidth) + (v - 1);
   p.set(stringWidth, v);
   return p;
 }, new Map())
 const fieldMetaMap = allFields.reduce((p, e, i) => {
   const bitLength = lookupMap.get(e.length);
   const label = trimWhitespace(e);
   const start = i ? p.get(i - 1).end + 1 : 0;
   const end = start - 1 + bitLength;
   p.set(i, {label, bitLength, start, end})
   return p;
 }, new Map());
 return hexStr => {
   const pName = alignLeft(8);
   const pBit = alignRight(5);
   const pPat = alignRight(16);
   const line = repeatChar('-')
   const binString = [...hexStr].reduce(concatHexToBin, '');
   const res = printDiagramInfo(fieldMetaMap);
   res.unshift(['Diagram:', ...arr, '\n'].join('\n'));
   res.push(['\n', 'Test string in hex:', hexStr].join('\n'));
   res.push(['Test string in binary:', binString, '\n'].join('\n'));
   res.push([pName('Name'), pBit('Size'), pPat('Pattern')].join(' '));
   res.push([line(8), line(5), line(16)].join(' '));
   [...fieldMetaMap.values()].forEach(({label, bitLength, start, end}) => {
     res.push([pName(label), pBit(bitLength), pPat(binString.substr(start, bitLength))].join(' '))
   })
   return res.join('\n');
 }

}

// --------------------------------------------------------------[ Run tests ]--

const dia = `
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
`;

const parser = parseDiagram(dia);

parser('78477bbf5496e12e1bf169a4');</lang>

Output:
Diagram:
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+


Name      Size Start   End
-------- ----- ----- -----
ID          16     0    15
QR           1    16    16
Opcode       4    17    20
AA           1    21    21
TC           1    22    22
RD           1    23    23
RA           1    24    24
Z            3    25    27
RCODE        4    28    31
QDCOUNT     16    32    47
ANCOUNT     16    48    63
NSCOUNT     16    64    79
ARCOUNT     16    80    95


Test string in hex:
78477bbf5496e12e1bf169a4
Test string in binary:
011110000100011101111011101111110101010010010110111000010010111000011011111100010110100110100100


Name      Size          Pattern
-------- ----- ----------------
ID          16 0111100001000111
QR           1                0
Opcode       4             1111
AA           1                0
TC           1                1
RD           1                1
RA           1                1
Z            3              011
RCODE        4             1111
QDCOUNT     16 0101010010010110
ANCOUNT     16 1110000100101110
NSCOUNT     16 0001101111110001
ARCOUNT     16 0110100110100100

Julia

The validator() function can be customized. The one used here only checks length.

<lang julia>diagram = """
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                      ID                       |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    QDCOUNT                    |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    ANCOUNT                    |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    NSCOUNT                    |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    ARCOUNT                    |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+"""

testhexdata = "78477bbf5496e12e1bf169a4"

struct BitField

   name::String
   bits::Int
   fieldstart::Int
   fieldend::Int

end

function diagramtostruct(txt)

   bitfields = Vector{BitField}()
   lines = map(strip, split(txt, "\n"))
   for row in 1:2:length(lines)-1
       nbits = sum(x -> x == '+', lines[row]) - 1
       fieldpos = findall(x -> x == '|', lines[row + 1])
       bitaccum = div(row, 2) * nbits
       for (i, field) in enumerate(fieldpos[1:end-1])
           endfield = fieldpos[i + 1]
           bitsize = div(endfield - field, 3)
           bitlabel = strip(lines[row + 1][field+1:endfield-1])
           bitstart = div(field - 1, 3) + bitaccum
           bitend = bitstart + bitsize - 1
           push!(bitfields, BitField(bitlabel, bitsize, bitstart, bitend))
       end
   end
   bitfields

end

binbyte(c) = string(parse(UInt8, c, base=16), base=2, pad=8)
hextobinary(s) = reduce(*, map(binbyte, map(x -> s[x:x+1], 1:2:length(s)-1)))
validator(binstring, fields) = length(binstring) == sum(x -> x.bits, fields)

function bitreader(bitfields, hexdata)

   println("\nEvaluation of hex data $hexdata as bitfields:")
   println("Name     Size          Bits\n-------  ----  ----------------")
   b = hextobinary(hexdata)
   @assert(validator(b, bitfields))
   for bf in bitfields
       pat = b[bf.fieldstart+1:bf.fieldend+1]
       println(rpad(bf.name, 9), rpad(bf.bits, 6), lpad(pat, 16))
   end

end

const decoded = diagramtostruct(diagram)

println("Diagram as bit fields:\nName    Bits  Start  End\n------  ----  -----  ---")
for bf in decoded
    println(rpad(bf.name, 8), rpad(bf.bits, 6), rpad(bf.fieldstart, 6), lpad(bf.fieldend, 4))
end

bitreader(decoded, testhexdata)

</lang>

Output:
Diagram as bit fields:
Name    Bits  Start  End
------  ----  -----  ---
ID      16    0       15
QR      1     16      16
Opcode  4     17      20
AA      1     21      21
TC      1     22      22
RD      1     23      23
RA      1     24      24
Z       3     25      27
RCODE   4     28      31
QDCOUNT 16    32      47
ANCOUNT 16    48      63
NSCOUNT 16    64      79
ARCOUNT 16    80      95

Evaluation of hex data 78477bbf5496e12e1bf169a4 as bitfields:
Name     Size          Bits
-------  ----  ----------------
ID       16    0111100001000111
QR       1                    0
Opcode   4                 1111
AA       1                    0
TC       1                    1
RD       1                    1
RA       1                    1
Z        3                  011
RCODE    4                 1111
QDCOUNT  16    0101010010010110
ANCOUNT  16    1110000100101110
NSCOUNT  16    0001101111110001
ARCOUNT  16    0110100110100100
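
The validator above only compares the length of the bit string with the total field size. As an illustration of how it could be customized, here is a minimal Python sketch of a stricter check. The name `strict_validator` and the `(name, size, start, end)` tuple layout are assumptions made for this example; they mirror the Name/Bits/Start/End table and are not part of the Julia solution.

```python
def strict_validator(bits, fields):
    """Return True if bits is a 0/1 string exactly covered by fields.

    fields: list of (name, size, start, end) tuples, with end inclusive
    (a hypothetical layout mirroring the Name/Bits/Start/End table).
    """
    if any(c not in "01" for c in bits):
        return False
    expected_start = 0
    for _name, size, start, end in fields:
        # Each field must be non-empty, start where the previous one
        # ended, and span exactly `size` bits.
        if size <= 0 or start != expected_start or end - start + 1 != size:
            return False
        expected_start = end + 1
    return expected_start == len(bits)


fields = [("ID", 16, 0, 15), ("QR", 1, 16, 16), ("Opcode", 4, 17, 20),
          ("AA", 1, 21, 21), ("TC", 1, 22, 22), ("RD", 1, 23, 23),
          ("RA", 1, 24, 24), ("Z", 3, 25, 27), ("RCODE", 4, 28, 31),
          ("QDCOUNT", 16, 32, 47), ("ANCOUNT", 16, 48, 63),
          ("NSCOUNT", 16, 64, 79), ("ARCOUNT", 16, 80, 95)]
bits = format(int("78477bbf5496e12e1bf169a4", 16), "096b")
print(strict_validator(bits, fields))       # True
print(strict_validator(bits[:-1], fields))  # False
```

In addition to the length, this version rejects characters other than 0 and 1 and any gap or overlap between consecutive fields.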

Nim

The program is composed of three parts.

In the first part, a parser takes a diagram (a string composed of lines) and produces a raw structure which describes the fields. The parser may be executed at compile time (if the input string is a constant) or at runtime. It performs a full validation of the structure, detecting syntax errors and structure errors.

The second part of the module produces a type declaration from the raw structure. This operation is done at compile time by calling a macro which constructs an abstract syntax tree to represent the structure. Nim describes bit fields in a way not very different from the C way, by specifying the field length in a pragma, for instance “{.bitsize: 5.}”.

After the type declaration has been produced, it is possible to declare variables and work with them as if the structure had been explicitly described using Nim syntax rather than using ASCII-art diagram.

The third part of the module produces a description at runtime. As Nim is statically typed, it is not possible to produce a new type at runtime and declare variables of this type. So, starting from the raw structure built by the parser, we produce a structure containing a storage area and a table of field positions, and we provide two procedures to get and set a field value. These procedures use masks and logical operations to read and write the values.

So, the behavior differs between structures built at compile time and structures built at runtime. For a variable “s”, reading the content of field “f” is done using the normal syntax “s.f” in the first case and using “s.get("f")” in the second case.

Of course, the second mechanism may be used in all cases. The first mechanism is mainly interesting to show how it is possible, using Nim's powerful macro system, to create a type ex nihilo.
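
The runtime mechanism described above (a storage area, a table of field positions, and a getter and setter built on masks and shifts) can be sketched in a few lines of Python. This is only an illustration of the idea, not a translation of the Nim module: the class name `RuntimeStruct` and its methods are hypothetical, and the first field of each row is assumed to occupy the most significant bits.

```python
class RuntimeStruct:
    """Rows of fixed-width words plus a name -> (row, start, length) table."""

    def __init__(self, width, rows):
        # rows: list of rows, each a list of (name, length) pairs,
        # listed from the most significant bit to the least significant.
        self.width = width
        self.words = [0] * len(rows)
        self.positions = {}
        for r, fields in enumerate(rows):
            offset = 0
            for name, length in fields:
                self.positions[name.lower()] = (r, offset, length)
                offset += length
            if offset != width:
                raise ValueError(f"row {r} holds {offset} bits, expected {width}")

    def _locate(self, name):
        row, start, length = self.positions[name.lower()]
        mask = (1 << length) - 1
        shift = self.width - (start + length)  # distance of the field from the LSB
        return row, mask, shift

    def get(self, name):
        row, mask, shift = self._locate(name)
        return (self.words[row] >> shift) & mask

    def set(self, name, value):
        row, mask, shift = self._locate(name)
        cleared = self.words[row] & ~(mask << shift)   # zero the field
        self.words[row] = cleared | ((value & mask) << shift)

    def to_hex(self):
        digits = self.width // 4
        return "".join(f"{w:0{digits}X}" for w in self.words)


header = RuntimeStruct(16, [
    [("ID", 16)],
    [("QR", 1), ("Opcode", 4), ("AA", 1), ("TC", 1),
     ("RD", 1), ("RA", 1), ("Z", 3), ("RCODE", 4)],
])
for name, value in [("ID", 30791), ("QR", 0), ("Opcode", 15), ("AA", 0),
                    ("TC", 1), ("RD", 1), ("RA", 1), ("Z", 3), ("RCODE", 15)]:
    header.set(name, value)
print(header.to_hex())       # 78477BBF
print(header.get("opcode"))  # 15
```

As in the Nim version, field names are looked up in lower case, and the setter masks the value so that it cannot spill into neighbouring fields.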

<lang Nim>import macros
import strutils
import tables

const Diagram = """

        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                      ID                       |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    QDCOUNT                    |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    ANCOUNT                    |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    NSCOUNT                    |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    ARCOUNT                    |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+"""
# Exceptions.

type

 # Exceptions.
 SyntaxError = object of CatchableError
 StructureError = object of CatchableError
 FieldError = object of CatchableError
#---------------------------------------------------------------------------------------------------

proc raiseException(exc: typedesc; linenum: int; message: string) =

 ## Raise an exception with a message including the line number.
 raise newException(exc, "Line $1: $2".format(linenum, message))


# Parser.

type

 # Allowed tokens.
 Token = enum tkSpace, tkPlus, tkMinus2, tkVLine, tkIdent, tkEnd, tkError
 # Lexer description.
 Lexer = object
   line: string      # Line to parse.
   linenum: int      # Line number.
   pos: int          # Current position.
   token: Token      # Current token.
   value: string     # Associated value (for tkIdent).
 # Description of a field.
 Field = tuple[name: string, length: int]
 # Description of fields in a row.
 RowFields = seq[Field]
 # Structure to describe fields.
 RawStructure = object
   size: int                # Size of a row in bits.
   rows: seq[RowFields]     # List of rows.
#---------------------------------------------------------------------------------------------------

proc getNextToken(lexer: var Lexer) =

 ## Search next token and update lexer state accordingly.
 doAssert(lexer.pos < lexer.line.high)
 inc lexer.pos
 let ch = lexer.line[lexer.pos]
 case ch
 of ' ':
   lexer.token = tkSpace
 of '+':
   lexer.token = tkPlus
 of '-':
   inc lexer.pos
   lexer.token = if lexer.line[lexer.pos] == '-': tkMinus2 else: tkError
 of '|':
   lexer.token = tkVLine
 of Letters:
   # Beginning of an identifier.
   lexer.value = $ch
   inc lexer.pos
   while lexer.pos < lexer.line.high and lexer.line[lexer.pos] in IdentChars:
     lexer.value.add(lexer.line[lexer.pos])
     inc lexer.pos
   dec lexer.pos
   lexer.token = tkIdent
 of '\n':
   # End of the line.
   lexer.token = tkEnd
 else:
   lexer.token = tkError
#---------------------------------------------------------------------------------------------------

proc initLexer(line: string; linenum: int): Lexer =

 ## Initialize a lexer.
 result.line = line & '\n'   # Add a sentinel.
 result.linenum = linenum
 result.pos = -1
#---------------------------------------------------------------------------------------------------

proc parseSepLine(lexer: var Lexer): int =

 ## Parse a separation line. Return the corresponding size in bits.
 lexer.getNextToken()
 while lexer.token != tkEnd:
   if lexer.token != tkMinus2:
     raiseException(SyntaxError, lexer.linenum, "“--” expected")
   lexer.getNextToken()
   if lexer.token != tkPlus:
     raiseException(SyntaxError, lexer.linenum, "“+” expected")
   inc result
   lexer.getNextToken()
#---------------------------------------------------------------------------------------------------

proc parseFieldLine(lexer: var Lexer; structure: var RawStructure) =

 ## Parse a field description line and update the structure accordingly.
 var rowFields: RowFields    # List of fields.
 var prevpos = 0             # Previous position.
 var size = 0                # Total size.
 lexer.getNextToken()
 while lexer.token != tkEnd:
   # Parse a field.
   while lexer.token == tkSpace:
     lexer.getNextToken()
   if lexer.token != tkIdent:
     raiseException(SyntaxError, lexer.linenum, "Identifier expected")
   let id = lexer.value
   lexer.getNextToken()
   while lexer.token == tkSpace:
     lexer.getNextToken()
   if lexer.token != tkVLine:
     raiseException(SyntaxError, lexer.linenum, "“|” expected")
   if lexer.pos mod 3 != 0:
     raiseException(SyntaxError, lexer.linenum, "wrong position for “|”")
   # Build a field description.
   let fieldLength = (lexer.pos - prevpos) div 3
   rowFields.add((id, fieldLength))
   inc size, fieldLength
   prevpos = lexer.pos
   lexer.getNextToken()
 # Add the row fields description to the structure.
 if size != structure.size:
   raiseException(StructureError, lexer.linenum, "total size of fields doesn’t fit")
 structure.rows.add(rowFields)
#---------------------------------------------------------------------------------------------------

proc parseLine(line: string; linenum: Positive; structure: var RawStructure) =

 ## Parse a line.
 # Eliminate spaces at beginning and at end, and ignore empty lines.
 let line = line.strip()
 if line.len == 0: return
 var lexer = initLexer(line, linenum)
 lexer.getNextToken()
 if lexer.token == tkPlus:
   # Separator line.
   let size = parseSepLine(lexer)
   if size notin {8, 16, 32, 64}:
     raiseException(StructureError, linenum,
                    "wrong structure size; got $1, expected 8, 16, 32 or 64".format(size))
   if structure.size > 0:
     # Not the first separation line.
     if size != structure.size:
       raiseException(StructureError, linenum,
                      "inconsistent size; got $1, expected $2".format(size, structure.size))
   else:
     structure.size = size
 elif lexer.token == tkVLine:
   # Fields description line.
   parseFieldLine(lexer, structure)
 else:
   raiseException(SyntaxError, linenum, "“+” or “|” expected")
#---------------------------------------------------------------------------------------------------

proc parse(diagram: string): RawStructure =

 ## Parse a diagram describing a structure.
 var linenum = 0
 for line in diagram.splitLines():
   inc linenum
   parseLine(line, linenum, result)


# Generation of a structure type at compile time.
# Access to fields is done directly without getter or setter.

macro createStructType*(diagram, typeName: static string): untyped =

 ## Create a type from "diagram" whose name is given by "typeName".
 # C types to use as units.
 const Ctypes = {8: "cuchar", 16: "cushort", 32: "cuint", 64: "culong"}.toTable
 # Check that the type name is a valid identifier.
 if not typeName.validIdentifier():
   error("Invalid type name: " & typeName)
   return
 # Parse the diagram.
 var struct: RawStructure
 try:
   struct = parse(diagram)
 except SyntaxError, StructureError:
   error(getCurrentExceptionMsg())
   return
 # Build the beginning: "type <typeName> = object".
 # For now, the list of fields is empty.
 let ctype = Ctypes[struct.size]
 let recList = newNimNode(nnkRecList)
 result = nnkStmtList.newTree(
            nnkTypeSection.newTree(
              nnkTypeDef.newTree(
                ident(typeName),
                newEmptyNode(),
                nnkObjectTy.newTree(
                  newEmptyNode(),
                  newEmptyNode(),
                  recList))))
 # Add the fields.
 for row in struct.rows:
   if row.len == 1:
     # Single field in a unit. No need to specify the length.
     recList.add(newIdentDefs(
                   nnkPostfix.newTree(
                     ident("*"),
                     ident(row[0].name)),
                   ident(ctype)))
   else:
     # Several fields. Use pragma "bitsize".
     for field in row:
       let fieldNode = nnkPragmaExpr.newTree(
                         nnkPostfix.newTree(
                           ident("*"),
                           ident(field.name)),
                         nnkPragma.newTree(
                           nnkExprColonExpr.newTree(
                             ident("bitsize"),
                             newIntLitNode(field.length))))
       recList.add(newIdentDefs(fieldNode, ident(ctype)))


# Generation of a structure at runtime.
# Access to fields must be done via a specific getter or setter.

type

 # Unit to use.
 Unit = enum unit8, unit16, unit32, unit64
 # Position of a field in a unit.
 FieldPosition = tuple[row, start, length: int]
 # Description of the structure.
 Structure* = object
   names: seq[string]                        # Original names.
   positions: Table[string, FieldPosition]   # Mapping name (in lower case) => Position.
   # Storage.
   case unit: Unit:
   of unit8:
     s8: seq[uint8]
   of unit16:
     s16: seq[uint16]
   of unit32:
     s32: seq[uint32]
   of unit64:
     s64: seq[uint64]
#---------------------------------------------------------------------------------------------------

proc createStructVar*(diagram: string): Structure =

 ## Create a variable for the structure described by "diagram".
 var rawStruct = parse(diagram)
 # Allocate the storage for the structure.
 case rawStruct.size
 of 8:
   result = Structure(unit: unit8)
   result.s8.setLen(rawStruct.rows.len)
 of 16:
   result = Structure(unit: unit16)
   result.s16.setLen(rawStruct.rows.len)
 of 32:
   result = Structure(unit: unit32)
   result.s32.setLen(rawStruct.rows.len)
 of 64:
   result = Structure(unit: unit64)
   result.s64.setLen(rawStruct.rows.len)
 else:
   raise newException(ValueError, "Internal error")
 # Build the table mapping field names to positions.
 for i, row in rawStruct.rows:
   var offset = 0
   for field in row:
     result.names.add(field.name)
     result.positions[field.name.toLower] = (row: i, start: offset, length: field.length)
     inc offset, field.length
#---------------------------------------------------------------------------------------------------

proc get*(struct: Structure; fieldName: string): uint64 =

 ## Return the value of field "fieldName" in a structure.
 ## The value type is "uint64" and should be converted to another type if needed.
 # Get the position of the field.
 var row, start, length: int
 try:
   (row, start, length) = struct.positions[fieldName.toLower]
 except KeyError:
   raise newException(FieldError, "Invalid field: " & fieldName)
 let mask = 1 shl length - 1
 let endpos = start + length - 1
 case struct.unit
 of unit8:
   result = (struct.s8[row] and mask.uint8 shl (7 - endpos)) shr (7 - endpos)
 of unit16:
   result = (struct.s16[row] and mask.uint16 shl (15 - endpos)) shr (15 - endpos)
 of unit32:
   result = (struct.s32[row] and mask.uint32 shl (31 - endpos)) shr (31 - endpos)
 of unit64:
   result = (struct.s64[row] and mask.uint64 shl (63 - endpos)) shr (63 - endpos)
#---------------------------------------------------------------------------------------------------

proc set*(struct: var Structure; fieldName: string; value: SomeInteger) =

 ## Set the value of the field "fieldName" in a structure.
 # Get the position of the field.
 var row, start, length: int
 try:
   (row, start, length) = struct.positions[fieldName.toLower]
 except KeyError:
   raise newException(FieldError, "Invalid field: " & fieldName)
 let mask = 1 shl length - 1
 let endpos = start + length - 1
 let value = value and mask   # Make sure that the value fits in the field.
 case struct.unit
 of unit8:
   struct.s8[row] =
     struct.s8[row] and not (mask.uint8 shl (7 - endpos)) or (value.uint8 shl (7 - endpos))
 of unit16:
   struct.s16[row] =
     struct.s16[row] and not (mask.uint16 shl (15 - endpos)) or (value.uint16 shl (15 - endpos))
 of unit32:
   struct.s32[row] =
     struct.s32[row] and not (mask.uint32 shl (31 - endpos)) or (value.uint32 shl (31 - endpos))
 of unit64:
   struct.s64[row] =
     struct.s64[row] and not (mask.uint64 shl (63 - endpos)) or (value.uint64 shl (63 - endpos))
#---------------------------------------------------------------------------------------------------

iterator fields*(struct: Structure): uint64 =

 ## Yield the values of the successive fields of a structure
 for name in struct.names:
   yield struct.get(name)
#---------------------------------------------------------------------------------------------------

iterator fieldPairs*(struct: Structure): tuple[key: string, val: uint64] =

 ## Yield names and values of the successive fields of a structure
 for name in struct.names:
   yield (name, struct.get(name))
#---------------------------------------------------------------------------------------------------

proc `$`*(struct: Structure): string =

 ## Produce a representation of a structure.
 result = "("
 for name in struct.names:
   result.addSep(", ", 1)
   result.add(name & ": " & $struct.get(name))
 result.add(')')
#---------------------------------------------------------------------------------------------------

proc toHex(struct: Structure): string =

 ## Return the hexadecimal representation of a structure.
 case struct.unit
 of unit8:
   for row in struct.s8:
     result.add(row.toHex(2))
 of unit16:
   for row in struct.s16:
     result.add(row.toHex(4))
 of unit32:
   for row in struct.s32:
     result.add(row.toHex(8))
 of unit64:
   for row in struct.s64:
     result.add(row.toHex(16))
#---------------------------------------------------------------------------------------------------

when isMainModule:

 # Creation of a structure at compile time.
 # ----------------------------------------
 # Create the type "Header" to represent the structure described by "Diagram".
 createStructType(Diagram, "Header")
 # Declare a variable of type Header and initialize its fields.
 var header1 = Header(ID: 30791, QR: 0, Opcode: 15, AA: 0, TC: 1, RD: 1, RA: 1, Z: 3, RCODE:15,
                      QDCOUNT: 21654, ANCOUNT: 57646, NSCOUNT: 7153, ARCOUNT: 27044)
 echo "Header from a structure defined at compile time:"
 echo header1
 echo ""
 # Of course, it is possible to loop on the fields.
 echo "Same fields/values retrieved with an iterator:"
 for name, value in header1.fieldPairs:
   echo name, ": ", value
 echo ""
 # Hexadecimal representation.
 var h = ""
 var p = cast[ptr UncheckedArray[typeof(header1.ID)]](addr(header1))
 for i in 0..<(sizeof(header1) div sizeof(typeof(header1.ID))):
   h.add(p[i].toHex(4))
 echo "Hexadecimal representation: ", h
 echo ""


 # Creation of a structure at runtime.
 # -----------------------------------
 # Declare a variable initialized with the structure created at runtime.
 var header2 = createStructVar(Diagram)
 header2.set("ID", 30791)
 header2.set("QR", 0)
 header2.set("Opcode", 15)
 header2.set("AA", 0)
 header2.set("TC", 1)
 header2.set("RD", 1)
 header2.set("RA", 1)
 header2.set("Z", 3)
 header2.set("RCODE", 15)
 header2.set("QDCOUNT", 21654)
 header2.set("ANCOUNT", 57646)
 header2.set("NSCOUNT", 7153)
 header2.set("ARCOUNT", 27044)
 echo "Header from a structure defined at runtime: "
 echo header2
 echo ""
 # List fields using the "fieldPairs" iterator.
 echo "Same fields/values retrieved with an iterator:"
 for name, val in header2.fieldPairs():
   echo name, ": ", val
 echo ""
 # Hexadecimal representation.
 echo "Hexadecimal representation: ", header2.toHex()</lang>
Output:
Header from a structure defined at compile time:
(ID: 30791, QR: 0, Opcode: 15, AA: 0, TC: 1, RD: 1, RA: 1, Z: 3, RCODE: 15, QDCOUNT: 21654, ANCOUNT: 57646, NSCOUNT: 7153, ARCOUNT: 27044)

Same fields/values retrieved with an iterator:
ID: 30791
QR: 0
Opcode: 15
AA: 0
TC: 1
RD: 1
RA: 1
Z: 3
RCODE: 15
QDCOUNT: 21654
ANCOUNT: 57646
NSCOUNT: 7153
ARCOUNT: 27044

Hexadecimal representation: 7847F7DE5496E12E1BF169A4

Header from a structure defined at runtime: 
(ID: 30791, QR: 0, Opcode: 15, AA: 0, TC: 1, RD: 1, RA: 1, Z: 3, RCODE: 15, QDCOUNT: 21654, ANCOUNT: 57646, NSCOUNT: 7153, ARCOUNT: 27044)

Same fields/values retrieved with an iterator:
ID: 30791
QR: 0
Opcode: 15
AA: 0
TC: 1
RD: 1
RA: 1
Z: 3
RCODE: 15
QDCOUNT: 21654
ANCOUNT: 57646
NSCOUNT: 7153
ARCOUNT: 27044

Hexadecimal representation: 78477BBF5496E12E1BF169A4
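
The two hexadecimal representations differ in the second word: F7DE for the compile-time structure and 7BBF for the runtime one. The source does not explain this, but the values are consistent with the assumption that the C compiler behind the “bitsize” pragma allocates bit fields starting from the least significant bit, while the runtime getter and setter place the first field in the most significant bits. The following Python sketch packs the same flag values both ways; the function names are hypothetical.

```python
# (name, length in bits, value) for the second row of the diagram.
FIELDS = [("QR", 1, 0), ("Opcode", 4, 15), ("AA", 1, 0), ("TC", 1, 1),
          ("RD", 1, 1), ("RA", 1, 1), ("Z", 3, 3), ("RCODE", 4, 15)]


def pack_msb_first(fields, width=16):
    """First field occupies the most significant bits (diagram order)."""
    word, used = 0, 0
    for _name, length, value in fields:
        used += length
        word |= (value & ((1 << length) - 1)) << (width - used)
    return word


def pack_lsb_first(fields):
    """First field occupies the least significant bits."""
    word, used = 0, 0
    for _name, length, value in fields:
        word |= (value & ((1 << length) - 1)) << used
        used += length
    return word


print(f"{pack_msb_first(FIELDS):04X}")  # 7BBF
print(f"{pack_lsb_first(FIELDS):04X}")  # F7DE
```

Rows that hold a single 16-bit field (ID and the four counts) are unaffected, which matches the identical remaining digits in both outputs.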

Perl

<lang perl>#!/usr/bin/perl

use strict;
use warnings;

$_ = <<END;

   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                      ID                       |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    QDCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    ANCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    NSCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    ARCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

END

my $template;
my @names;
while( /\| *(\w+) */g )
 {
 printf "%10s is %2d bits\n", $1, my $length = length($&) / 3;
 push @names, $1;
 $template .= "A$length ";
 }

my $input = '78477bbf5496e12e1bf169a4'; # as hex

my %datastructure;
@datastructure{ @names } = unpack $template, unpack 'B*', pack 'H*', $input;

print "\ntemplate = $template\n\n";
use Data::Dump 'dd';
dd 'datastructure', \%datastructure;</lang>

Output:
        ID is 16 bits
        QR is  1 bits
    Opcode is  4 bits
        AA is  1 bits
        TC is  1 bits
        RD is  1 bits
        RA is  1 bits
         Z is  3 bits
     RCODE is  4 bits
   QDCOUNT is 16 bits
   ANCOUNT is 16 bits
   NSCOUNT is 16 bits
   ARCOUNT is 16 bits

template = A16 A1 A4 A1 A1 A1 A1 A3 A4 A16 A16 A16 A16

(
  "datastructure",
  {
    AA => 0,
    ANCOUNT => 1110000100101110,
    ARCOUNT => "0110100110100100",
    ID => "0111100001000111",
    NSCOUNT => "0001101111110001",
    Opcode => 1111,
    QDCOUNT => "0101010010010110",
    QR => 0,
    RA => 1,
    RCODE => 1111,
    RD => 1,
    TC => 1,
    Z => "011",
  },
)

Phix

Should work for any width, but this has not been tested, and the code does not verify that the width is 8, 16, 32 or 64.
<lang Phix>function interpret(sequence lines)

   if remainder(length(lines),2)!=1 then
       crash("missing header/footer?")
   end if
   string l1 = lines[1]
   integer w = length(l1)
   integer bits = (w-1)/3  -- sug: check this is 8/16/32/64
   if l1!=join(repeat("+",bits+1),"--") then
       crash("malformed header?")
   end if
   sequence res = {}
   integer offset = 0
   for i=1 to length(lines) do
       string li = lines[i]
       if remainder(i,2) then
           if li!=l1 then
               crash("missing separator (line %d)?",{i})
           end if
       else
           if li[1]!='|' or li[w]!='|' then
               crash("missing separator on line %d",{i})
           end if
           integer k = 1
           while true do
               integer l = find('|',li,k+1)
               string desc = trim(li[k+1..l-1])
               {k,l} = {l,(l-k)/3}
               res = append(res,{desc,l,offset})
               offset += l
               if k=w then exit end if
           end while
       end if
   end for
   res = append(res,{"total",0,offset})
   return res

end function

procedure unpack(string data, sequence res)

   if length(data)*8!=res[$][3] then
       crash("wrong length")
   end if
   string bin = ""
   for i=1 to length(data) do
       bin &= sprintf("%08b",data[i])
   end for
   printf(1,"\n\nTest bit string:\n%s\n\nUnpacked:\n",{bin})
   for i=1 to length(res)-1 do
       {string name, integer bits, integer offset} = res[i]
       printf(1,"%7s, %02d bits: %s\n",{name,bits,bin[offset+1..offset+bits]})
   end for

end procedure

function trimskip(string diagram)
--
-- split's ",no_empty:=true)" is not quite enough here.
-- Note that if copy/paste slips in any tab characters,
-- it will most likely trigger a length mismatch error.
--

   sequence lines = split(diagram,'\n')
   integer prevlli = 0
   for i=length(lines) to 1 by -1 do
       string li = trim(lines[i])
       integer lli = length(li)
       if lli then
           if prevlli=0 then
               prevlli = lli
           elsif lli!=prevlli then
               crash("mismatching lengths")
           end if
           lines[i] = li
       else
           lines[i..i] = {}
       end if
   end for
   return lines

end function

constant diagram = """

   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                      ID                       |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    QDCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    ANCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    NSCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    ARCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

"""

sequence lines = trimskip(diagram)
sequence res = interpret(lines)
printf(1,"--Name--  Size  Offset\n")
for i=1 to length(res) do
    printf(1," %-7s   %2d  %5d\n",res[i])
end for

unpack(x"78477bbf5496e12e1bf169a4",res)</lang>

Output:
--Name--  Size  Offset
 ID        16      0
 QR         1     16
 Opcode     4     17
 AA         1     21
 TC         1     22
 RD         1     23
 RA         1     24
 Z          3     25
 RCODE      4     28
 QDCOUNT   16     32
 ANCOUNT   16     48
 NSCOUNT   16     64
 ARCOUNT   16     80
 total      0     96


Test bit string:
011110000100011101111011101111110101010010010110111000010010111000011011111100010110100110100100

Unpacked:
     ID, 16 bits: 0111100001000111
     QR, 01 bits: 0
 Opcode, 04 bits: 1111
     AA, 01 bits: 0
     TC, 01 bits: 1
     RD, 01 bits: 1
     RA, 01 bits: 1
      Z, 03 bits: 011
  RCODE, 04 bits: 1111
QDCOUNT, 16 bits: 0101010010010110
ANCOUNT, 16 bits: 1110000100101110
NSCOUNT, 16 bits: 0001101111110001
ARCOUNT, 16 bits: 0110100110100100

Python

<lang python>"""
http://rosettacode.org/wiki/ASCII_art_diagram_converter

Python example based off Go example:

http://rosettacode.org/wiki/ASCII_art_diagram_converter#Go

"""

def validate(diagram):

   # trim empty lines
   
   rawlines = diagram.splitlines()
   lines = []
   for line in rawlines:
       if line != '':
           lines.append(line)
           
   # validate non-empty lines
           
   if len(lines) == 0:
       print('diagram has no non-empty lines!')
       return None
       
   width = len(lines[0])
   cols = (width - 1) // 3
   
   if cols not in [8, 16, 32, 64]: 
       print('number of columns should be 8, 16, 32 or 64')
       return None
       
   if len(lines)%2 == 0:
       print('number of non-empty lines should be odd')
       return None
   
   if lines[0] != (('+--' * cols)+'+'):
           print('incorrect header line')
           return None
   for i in range(len(lines)):
       line=lines[i]
       if i == 0:
           continue
       elif i%2 == 0:
           if line != lines[0]:
               print('incorrect separator line')
               return None
       elif len(line) != width:
           print('inconsistent line widths')
           return None
       elif line[0] != '|' or line[width-1] != '|':
           print("non-separator lines must begin and end with '|'")    
           return None
   
   return lines

"""

results is list of lists like:

[[name, bits, start, end],...

"""

def decode(lines):

   print("Name     Bits  Start  End")
   print("=======  ====  =====  ===")
   
   startbit = 0
   
   results = []
   
   for line in lines:
       infield=False
       for c in line:
           if not infield and c == '|':
               infield = True
               spaces = 0
               name = ''
           elif infield:
               if c == ' ':
                   spaces += 1
               elif c != '|':
                   name += c
               else:
                   bits = (spaces + len(name) + 1) // 3
                   endbit = startbit + bits - 1
                   print('{0:7}    {1:2d}     {2:2d}   {3:2d}'.format(name, bits, startbit, endbit))
                   reslist = [name, bits, startbit, endbit]
                   results.append(reslist)
                   spaces = 0
                   name = ''
                   startbit += bits
                   
   return results
                       

def unpack(results, hex):

   print("\nTest string in hex:")
   print(hex)
   print("\nTest string in binary:")
   bin = f'{int(hex, 16):0>{4*len(hex)}b}'
   print(bin)
   print("\nUnpacked:\n")
   print("Name     Size  Bit pattern")
   print("=======  ====  ================")
   for r in results:
       name = r[0]
       size = r[1]
       startbit = r[2]
       endbit = r[3]
       bitpattern = bin[startbit:endbit+1]
       print('{0:7}    {1:2d}  {2:16}'.format(name, size, bitpattern))


diagram = """
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

"""

lines = validate(diagram)

if lines is None:

   print("No lines returned")

else:

   print(" ")
   print("Diagram after trimming whitespace and removal of blank lines:")
   print(" ")
   for line in lines:
       print(line)
       
   print(" ")
   print("Decoded:")
   print(" ")
   results = decode(lines)    
   
   # test string
   
   hex = "78477bbf5496e12e1bf169a4" 
   
   unpack(results, hex)

</lang>

Output:
 
Diagram after trimming whitespace and removal of blank lines:
 
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 
Decoded:
 
Name     Bits  Start  End
=======  ====  =====  ===
ID         16      0   15
QR          1     16   16
Opcode      4     17   20
AA          1     21   21
TC          1     22   22
RD          1     23   23
RA          1     24   24
Z           3     25   27
RCODE       4     28   31
QDCOUNT    16     32   47
ANCOUNT    16     48   63
NSCOUNT    16     64   79
ARCOUNT    16     80   95

Test string in hex:
78477bbf5496e12e1bf169a4

Test string in binary:
011110000100011101111011101111110101010010010110111000010010111000011011111100010110100110100100

Unpacked:

Name     Size  Bit pattern
=======  ====  ================
ID         16  0111100001000111
QR          1  0               
Opcode      4  1111            
AA          1  0               
TC          1  1               
RD          1  1               
RA          1  1               
Z           3  011             
RCODE       4  1111            
QDCOUNT    16  0101010010010110
ANCOUNT    16  1110000100101110
NSCOUNT    16  0001101111110001
ARCOUNT    16  0110100110100100
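
The bit patterns above are printed as strings. As a minimal sketch of the next step, the snippet below turns the same test string into integer field values. The `FIELDS` list is copied by hand from the "Decoded" table, and `field_values` is a hypothetical helper, not part of the solution above.

```python
# Field layout copied from the "Decoded" table: (name, start bit, end bit), inclusive.
FIELDS = [
    ("ID", 0, 15), ("QR", 16, 16), ("Opcode", 17, 20), ("AA", 21, 21),
    ("TC", 22, 22), ("RD", 23, 23), ("RA", 24, 24), ("Z", 25, 27),
    ("RCODE", 28, 31), ("QDCOUNT", 32, 47), ("ANCOUNT", 48, 63),
    ("NSCOUNT", 64, 79), ("ARCOUNT", 80, 95),
]


def field_values(hex_string, fields=FIELDS):
    """Return {name: integer value} for a hex-encoded header (hypothetical helper)."""
    total_bits = len(hex_string) * 4
    # zfill restores leading zero bits that int() would otherwise drop
    bits = bin(int(hex_string, 16))[2:].zfill(total_bits)
    return {name: int(bits[start:end + 1], 2) for name, start, end in fields}


values = field_values("78477bbf5496e12e1bf169a4")
for name, value in values.items():
    print(f"{name:<8} {value:>6}  0x{value:x}")
```

Slicing a bit string is slow for large inputs, but it keeps the mapping between the table and the extracted values easy to check by eye.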

Racket

Three files:

  • ascii-art-parser.rkt: provides the function ascii-art->struct, which converts ASCII art from a string (or input port) to a list of word number, bit range and id
  • ascii-art-reader.rkt: uses this to provide a syntax, define-ascii-art-structure, which defines a structure from the artwork
  • test-ascii-art-reader.rkt: gives it all a fairly rigorous going-over

Note that if you want to extend the word width to 32 bits (or more), add multiples of eight-bit blocks horizontally (i.e. --+--+--+--+--+--+--+--+). IMO, having the diagrams 16 bits wide reflects the choice of a 16-bit word as the natural word size of the interface. If it were 32 or 64, the blocks would have to be wider.

ascii-art-parser.rkt

Note that this is in the racket/base language so it doesn't overburden the modules that import it, especially since they're at the syntax phase.

<lang racket>#lang racket/base
(require (only-in racket/list drop-right)
         (only-in racket/string string-trim))

(provide ascii-art->struct)

;; reads ascii art from a string or input-port
;; returns
;;   list of (word-number highest-bit lowest-bit name-symbol)
;;   bits per word

(define (ascii-art->struct art)

 (define art-inport
   (cond
     [(string? art) (open-input-string art)]
     [(input-port? art) art]
     [else (raise-argument-error 'ascii-art->struct
                                 "(or/c string? input-port?)"
                                 art)]))
 (define lines
   (for/list ((l (in-port (lambda (p)
                            (define pk (peek-char p))
                            (case pk ((#\+ #\|) (read-line p))
                              (else eof)))
                          art-inport)))
     l))
 (when (null? lines)
   (error 'ascii-art->struct "no lines"))
 (define bit-re #px"[|+]([^|+]*)")
 (define cell-re #px"[|]([^|]*)")
 (define bit-boundaries (regexp-match-positions* bit-re (car lines)))
 (define bits/word (sub1 (length bit-boundaries)))
 (unless (zero? (modulo bits/word 8))
   (error 'ascii-art->struct "diagram is not a multiple of 8 bits wide"))
 (define-values (pos->bit-start# pos->bit-end#)
   (for/fold ((s# (hash)) (e# (hash)))
             ((box (in-range bits/word))
              (boundary (in-list bit-boundaries)))
     (define bit (- bits/word box 1))
     (values (hash-set s# (car boundary) bit)
             (hash-set e# (cdr boundary) bit))))
 (define fields
   (apply append
          (for/list ((line-number (in-naturals))
                     (line (in-list lines))
                     #:when (odd? line-number))
            (define word (quotient line-number 2))
            (define cell-positions (regexp-match-positions* cell-re line))
            (define cell-contents (regexp-match* cell-re line))
            (for/list ((cp (in-list (drop-right cell-positions 1)))
                       (cnt (in-list cell-contents)))
              (define cell-start-bit (hash-ref pos->bit-start# (car cp)))
              (define cell-end-bit (hash-ref pos->bit-end# (cdr cp)))
              (list word cell-start-bit cell-end-bit (string->symbol (string-trim (substring cnt 1))))))))
 (values fields bits/word))</lang>

ascii-art-reader.rkt

<lang racket>#lang racket
(require (for-syntax "ascii-art-parser.rkt"))
(require (for-syntax racket/syntax))

(provide (all-defined-out))

(define-syntax (define-ascii-art-structure stx)

 (syntax-case stx ()
   [(_ id art)
    (let*-values (((all-fields bits/word) (ascii-art->struct (syntax-e #'art))))
      (with-syntax
          ((bytes->id (format-id stx "bytes->~a" #'id))
           (id->bytes (format-id stx "~a->bytes" #'id))
           (word-size (add1 (car (for/last ((f all-fields)) f))))
           (fld-ids (map cadddr all-fields))
           (fld-setters
            (cons
             #'id
             (for/list ((fld (in-list all-fields)))
               (let* ((bytes/word (quotient bits/word 8))
                      (start-byte (let ((word-no (car fld))) (* word-no bytes/word))))
                 `(bitwise-bit-field (integer-bytes->integer bs
                                                             #f
                                                             (system-big-endian?)
                                                             ,start-byte
                                                             ,(+ start-byte bytes/word))
                                     ,(caddr fld)
                                     ,(add1 (cadr fld)))))))
           (set-fields-bits
            (list*
             'begin
             (for/list ((fld (in-list all-fields)))
               (define val (cadddr fld))
               (define start-bit (cadr fld))
               (define end-bit (caddr fld))
               (define start-byte (let ((word-no (car fld))) (* word-no (quotient bits/word 8))))
               (define fld-bit-width (- start-bit end-bit -1))
               (define aligned?/width (and (= end-bit 0)
                                           (= (modulo start-bit 8) 7)
                                           (quotient fld-bit-width 8)))
               (case aligned?/width
                 [(2 4)
                  `(integer->integer-bytes ,val
                                           ,aligned?/width
                                           #f
                                           (system-big-endian?)
                                           rv
                                           ,start-byte)]
                 [else
                  (define the-byte (+ start-byte (quotient end-bit 8)))
                  `(bytes-set! rv
                               ,the-byte
                               (bitwise-ior (arithmetic-shift (bitwise-bit-field ,val 0 ,fld-bit-width)
                                                              ,(modulo end-bit 8))
                                            (bytes-ref rv ,the-byte)))])))))
        #`(begin
            (struct id fld-ids #:mutable)
            (define (bytes->id bs)
              fld-setters)
            (define (id->bytes art-in)
              (match-define (id #,@#'fld-ids) art-in)
              (define rv (make-bytes (* word-size #,(quotient bits/word 8))))
              set-fields-bits
              rv))))]))</lang>

test-ascii-art-reader.rkt

<lang racket>#lang racket
(require "ascii-art-reader.rkt")
(require "ascii-art-parser.rkt")
(require tests/eli-tester)

(define rfc-1035-header-art
  #<<EOS
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
EOS
  )

(define-values (rslt rslt-b/w) (ascii-art->struct rfc-1035-header-art))

(test

rslt-b/w => 16
rslt =>
'((0 15  0 ID)
  (1 15 15 QR)
  (1 14 11 Opcode)
  (1 10 10 AA)
  (1  9  9 TC)
  (1  8  8 RD)
  (1  7  7 RA)
  (1  6  4 Z)
  (1  3  0 RCODE)
  (2 15  0 QDCOUNT)
  (3 15  0 ANCOUNT)
  (4 15  0 NSCOUNT)
  (5 15  0 ARCOUNT)))

(define-ascii-art-structure rfc-1035-header #<<EOS
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
EOS
                    )

(define h-bytes

 (bytes-append
  (integer->integer-bytes #x1234 2 #f)
  (integer->integer-bytes #x5678 2 #f)
  (integer->integer-bytes #x9abc 2 #f)
  (integer->integer-bytes #xdef0 2 #f)
  (integer->integer-bytes #xfedc 2 #f)
  (integer->integer-bytes #xba98 2 #f)))

(define h-bytes~

 (bytes-append
  (integer->integer-bytes #x1234 2 #f (not (system-big-endian?)))
  (integer->integer-bytes #x5678 2 #f (not (system-big-endian?)))
  (integer->integer-bytes #x9abc 2 #f (not (system-big-endian?)))
  (integer->integer-bytes #xdef0 2 #f (not (system-big-endian?)))
  (integer->integer-bytes #xfedc 2 #f (not (system-big-endian?)))
  (integer->integer-bytes #xba98 2 #f (not (system-big-endian?)))))

(define h (bytes->rfc-1035-header h-bytes))
(define bytes-h (rfc-1035-header->bytes h))

(define h~ (bytes->rfc-1035-header h-bytes~))
(define bytes-h~ (rfc-1035-header->bytes h~))

(test

(rfc-1035-header-ID h) => #x1234
(rfc-1035-header-ARCOUNT h) => #xBA98
(rfc-1035-header-RCODE h) => 8
(rfc-1035-header-ID h~) => #x3412
(rfc-1035-header-ARCOUNT h~) => #x98BA
(rfc-1035-header-RCODE h~) => 6
h-bytes => bytes-h
h-bytes~ => bytes-h~)

(set-rfc-1035-header-RA! h 0)
(set-rfc-1035-header-Z! h 7)
(test

(rfc-1035-header-Z (bytes->rfc-1035-header (rfc-1035-header->bytes h))) => 7
(rfc-1035-header-RA (bytes->rfc-1035-header (rfc-1035-header->bytes h))) => 0)

(set-rfc-1035-header-Z! h 15) ;; naughty -- might splat RA
(test

(rfc-1035-header-Z (bytes->rfc-1035-header (rfc-1035-header->bytes h))) => 7
(rfc-1035-header-RA (bytes->rfc-1035-header (rfc-1035-header->bytes h))) => 0)</lang>
Output:

Nothing much to see... all tests pass
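
The last two tests check that writing an oversized value into Z (15 into a 3-bit field) reads back as 7 and leaves the neighbouring RA bit untouched. A minimal Python sketch of that masking idea follows. `pack_field` and `unpack_field` are hypothetical names, and unlike the Racket code, which ORs each field into a freshly zeroed buffer, this sketch clears the target bits first.

```python
def pack_field(word, value, high_bit, low_bit):
    """Store value in bits high_bit..low_bit of word (bit 0 is least significant)."""
    width = high_bit - low_bit + 1
    mask = (1 << width) - 1
    cleared = word & ~(mask << low_bit)        # wipe the old field contents
    return cleared | ((value & mask) << low_bit)  # excess high bits are dropped


def unpack_field(word, high_bit, low_bit):
    """Read bits high_bit..low_bit of word as an unsigned integer."""
    width = high_bit - low_bit + 1
    return (word >> low_bit) & ((1 << width) - 1)


# Bit ranges taken from the parser test above: RA is bit 7, Z is bits 6..4.
word = 0
word = pack_field(word, 0, 7, 7)     # RA = 0
word = pack_field(word, 15, 6, 4)    # Z = 15 does not fit in 3 bits
z_value = unpack_field(word, 6, 4)
ra_value = unpack_field(word, 7, 7)
print(z_value, ra_value)
```

Here Z reads back as 7 and RA as 0, which matches the expectations in the Racket test file.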

Raku

(formerly Perl 6)

Works with: Rakudo version 2018.05

<lang perl6>grammar RFC1025 {

   rule  TOP {  <.line-separator> [<line> <.line-separator>]+ }
   rule  line-separator { <.ws> '+--'+ '+' }
   token line  { <.ws> '|' +%% <field>  }
   token field  { \s* <label> \s* }
   token label { \w+[\s+\w+]* }

}

sub bits ($item) { ($item.chars + 1) div 3 }

sub deconstruct ($bits, %struct) {

   map { $bits.substr(.<from>, .<bits>) }, @(%struct<fields>);

}

sub interpret ($header) {

   my $datagram = RFC1025.parse($header);
   my %struct;
   for $datagram.<line> -> $line {
       FIRST %struct<line-width> = $line.&bits;
       state $from = 0;
       %struct<fields>.push: %(:bits(.&bits), :ID(.<label>.Str), :from($from.clone), :to(($from+=.&bits)-1))
         for $line<field>;
   }
   %struct

}

use experimental :pack;

my $diagram = q:to/END/;

   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                      ID                       |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    QDCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    ANCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    NSCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |                    ARCOUNT                    |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

END

my %structure = interpret($diagram);

say 'Line width: ', %structure<line-width>, ' bits';
printf("Name: %7s, bit count: %2d, bit %2d to bit %2d\n", .<ID>, .<bits>, .<from>, .<to>) for @(%structure<fields>);
say "\nGenerate a random 12 byte \"header\"";
say my $buf = Buf.new((^0xFF .roll) xx 12);
say "\nShow it converted to a bit string";
say my $bitstr = $buf.unpack('C*')».fmt("%08b").join;
say "\nAnd unpack it";
printf("%7s, %02d bits: %s\n", %structure<fields>[$_]<ID>, %structure<fields>[$_]<bits>,
  deconstruct($bitstr, %structure)[$_]) for ^@(%structure<fields>);</lang>
Output:
Line width: 16 bits
Name:      ID, bit count: 16, bit  0 to bit 15
Name:      QR, bit count:  1, bit 16 to bit 16
Name:  Opcode, bit count:  4, bit 17 to bit 20
Name:      AA, bit count:  1, bit 21 to bit 21
Name:      TC, bit count:  1, bit 22 to bit 22
Name:      RD, bit count:  1, bit 23 to bit 23
Name:      RA, bit count:  1, bit 24 to bit 24
Name:       Z, bit count:  3, bit 25 to bit 27
Name:   RCODE, bit count:  4, bit 28 to bit 31
Name: QDCOUNT, bit count: 16, bit 32 to bit 47
Name: ANCOUNT, bit count: 16, bit 48 to bit 63
Name: NSCOUNT, bit count: 16, bit 64 to bit 79
Name: ARCOUNT, bit count: 16, bit 80 to bit 95

Generate a random 12 byte "header"
Buf:0x<78 47 7b bf 54 96 e1 2e 1b f1 69 a4>

Show it converted to a bit string
011110000100011101111011101111110101010010010110111000010010111000011011111100010110100110100100

And unpack it
     ID, 16 bits: 0111100001000111
     QR, 01 bits: 0
 Opcode, 04 bits: 1111
     AA, 01 bits: 0
     TC, 01 bits: 1
     RD, 01 bits: 1
     RA, 01 bits: 1
      Z, 03 bits: 011
  RCODE, 04 bits: 1111
QDCOUNT, 16 bits: 0101010010010110
ANCOUNT, 16 bits: 1110000100101110
NSCOUNT, 16 bits: 0001101111110001
ARCOUNT, 16 bits: 0110100110100100
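
The `bits` helper in the Raku code derives a field's width from the text of its cell: each bit occupies three characters (two for the label area plus one separator), so a cell of `n` characters between pipes spans `(n + 1) div 3` bits. A rough Python sketch of the same arithmetic is shown below. `row_fields` is a hypothetical helper, and its alignment check is an added assumption rather than something the Raku grammar enforces.

```python
def row_fields(row):
    """Split a label row such as '|QR|   Opcode  |' into (name, bit_count) pairs."""
    row = row.strip()
    if not (row.startswith("|") and row.endswith("|")):
        raise ValueError("row must start and end with '|'")
    fields = []
    for cell in row[1:-1].split("|"):
        # cell text plus one separator must fill a whole number of 3-char columns
        if (len(cell) + 1) % 3 != 0:
            raise ValueError(f"cell {cell!r} is not aligned to the 3-column grid")
        fields.append((cell.strip(), (len(cell) + 1) // 3))
    return fields


flags_row = "|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |"
parsed = row_fields(flags_row)
print(parsed)
print("total bits:", sum(width for _, width in parsed))
```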

REXX

Some code was added to the REXX program to validate the input file.

<lang rexx>/*REXX program interprets an ASCII art diagram for names and their bit length(s).*/
numeric digits 100                               /*be able to handle large numbers.     */
er= '***error*** illegal input txt'              /*a literal used for error messages.   */
parse arg iFID test .                            /*obtain optional input─FID & test─data*/
if iFID=='' | iFID==","  then iFID= 'ASCIIART.TXT'              /*use the default iFID.*/
if test=='' | test==","  then test= 'cafe8050800000808080000a'  /* "   "     "    data.*/
w= 0;     wb= 0;     !.= 0;     $=               /*W   (max width name),  bits,  names. */
@.= 0;    @.0= 1                                 /*!.α   is structure bit position.     */

                                                /* [↓]  read the input text file (iFID)*/
   do j=1  while lines(iFID)\==0;     q= linein(iFID);             say  '■■■■■text►'q
   q= strip(q);          if q==''  then iterate /*strip leading and trailing blanks.  */
   _L= left(q, 1);       _R= right(q, 1)        /*get extreme left and right characters*/
                                                /* [↓]  is this record an "in-between"?*/
   if _L=='+'  then do;  if verify(q, '+-')\==0  then say er    "(invalid grid):"     q
                         iterate                /*skip this record, it's a single "+". */
                    end
   if _L\=='|'  |  _R\=="|"   then do;   say er  '(boundary): '   q;   iterate
                                   end
      do  until q=='|';  parse var  q    '|'  x  "|"  -1  q   /*parse record for names.*/
      n= strip(x);       w= max(w, length(n) );   if n==''  then leave   /*is N null?*/
      if words(n)\==1         then do;  say er '(invalid name): '  n;     iterate j
                                   end          /* [↑]  add more name validations.     */
      $$= $;     nn= n;  upper $$ n             /*$$ and N  could be a mixed─case name.*/
      if wordpos(nn, $$)\==0  then do;  say er '(dup name):'       n;     iterate j
                                   end
      $= $ n                                    /*add the   N   (name)  to the $ list. */
      #= words($);     !.#= (length(x) + 1) % 3 /*assign the number of bits for  N.    */
      wb= max(wb, !.#)                          /*max # of bits; # names prev. to this.*/
      prev= # - 1;     @.#= @.prev + !.prev     /*number of names previous to this name*/
      end   /*until*/
   end      /*j*/

say
if j==1  then do;  say er  ' (file not found): '  iFID;    exit 12
              end
    do k=1  for words($)
    say right( word($, k), w)right(!.k, 4)        "bits,  bit position:"right(@.k, 5)
    end   /*k*/

say                                              /* [↓]  Any (hex) data to test?        */
L= length(test);    if L==0  then exit           /*stick a fork in it, we're all done.  */
bits= x2b(test)                                  /*convert test data to a bit string.   */
wm= length( x2d( b2x( copies(1, wb) ) ) )  +  1  /*used for displaying max width numbers*/
say 'test (hex)='    test    "    length="    L    'hexadecimal digits.'
say

      do r=1  by 8+8  to L*4;   _1= substr(bits, r, 8, 0);    _2= substr(bits, r+8, 8, 0)
      say 'test (bit)='    _1   _2   "   hex="    lower( b2x(_1) )     lower( b2x(_2) )
      end   /*r*/

say

      do m=1  for words($)                      /*show some hexadecimal strings──►term.*/
      _= lower( b2x( substr( bits, @.m, !.m) )) /*show the hex string in lowercase.    */
      say right( word($, m), w+2)     '  decimal='right( x2d(_), wm)      "      hex="  _
      end   /*m*/

exit 0                                           /*stick a fork in it, we're all done.  */
/*──────────────────────────────────────────────────────────────────────────────────────*/
lower:  l= 'abcdefghijklmnopqrstuvwxyz';   u=l;   upper u;   return translate( arg(1), l, u)</lang>

output   when using the default input:
■■■■■text►
■■■■■text►
■■■■■text►     +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
■■■■■text►     |                      ID                       |
■■■■■text►     +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
■■■■■text►     |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
■■■■■text►     +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
■■■■■text►     |                    QDCOUNT                    |
■■■■■text►     +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
■■■■■text►     |                    ANCOUNT                    |
■■■■■text►     +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
■■■■■text►     |                    NSCOUNT                    |
■■■■■text►     +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
■■■■■text►     |                    ARCOUNT                    |
■■■■■text►     +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
■■■■■text►
■■■■■text►

     ID  16 bits,  bit position:    1
     QR   1 bits,  bit position:   17
 OPCODE   4 bits,  bit position:   18
     AA   1 bits,  bit position:   22
     TC   1 bits,  bit position:   23
     RD   1 bits,  bit position:   24
     RA   1 bits,  bit position:   25
      Z   3 bits,  bit position:   26
  RCODE   4 bits,  bit position:   29
QDCOUNT  16 bits,  bit position:   33
ANCOUNT  16 bits,  bit position:   49
NSCOUNT  16 bits,  bit position:   65
ARCOUNT  16 bits,  bit position:   81

test (hex)= cafe8050800000808080000a     length= 24 hexadecimal digits.

test (bit)= 11001010 11111110    hex= CA FE
test (bit)= 10000000 01010000    hex= 80 50
test (bit)= 10000000 00000000    hex= 80 00
test (bit)= 00000000 10000000    hex= 00 80
test (bit)= 10000000 10000000    hex= 80 80
test (bit)= 00000000 00001010    hex= 00 0A

       ID   decimal= 51966       hex= cafe
       QR   decimal=     1       hex= 1
   OPCODE   decimal=     0       hex= 0
       AA   decimal=     0       hex= 0
       TC   decimal=     0       hex= 0
       RD   decimal=     0       hex= 0
       RA   decimal=     0       hex= 0
        Z   decimal=     5       hex= 5
    RCODE   decimal=     0       hex= 0
  QDCOUNT   decimal= 32768       hex= 8000
  ANCOUNT   decimal=   128       hex= 0080
  NSCOUNT   decimal= 32896       hex= 8080
  ARCOUNT   decimal=    10       hex= 000a

Rust

The solution implements a few additional features:

  • The input is thoroughly validated.
  • The width is intentionally not restricted, therefore diagrams of any width can be processed.
  • The parser allows omitting the border for fields that occupy multiple full lines.
  • Fields can wrap across the end of a line.
  • When extracting field values, truncated input is recognized and unfilled fields are reported.


See the output below the source code.
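
As a rough illustration of the last point (truncated input), the sketch below extracts a half-open bit range from a byte string and reports a field as unavailable when the input ends before the field does. It is written in Python rather than Rust, `extract_bits` here is a hypothetical stand-in for the idea, and the bit ranges are taken from the schema table in the output.

```python
def extract_bits(data, start, end):
    """Return bits start..end-1 as a '0'/'1' string, or None if data is too short."""
    if end > len(data) * 8:
        return None  # the field is not completely covered by the input
    return "".join(
        # bit 0 of the stream is the most significant bit of the first byte
        str((data[i // 8] >> (7 - i % 8)) & 1)
        for i in range(start, end)
    )


data = bytes.fromhex("78477bbf5496e12e1bf169a4abcdeffedc")  # 17 bytes = 136 bits
oversized = extract_bits(data, 96, 132)   # OVERSIZED: bits 96..131
unused = extract_bits(data, 132, 144)     # unused: bits 132..143, past the input end
print(oversized)
print(unused)
```

Returning `None` instead of raising lets the caller print a placeholder such as "N/A" for each unfilled field and carry on with the rest.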

<lang Rust>use std::{borrow::Cow, io::Write};

pub type Bit = bool;

#[derive(Clone, Debug)]

pub struct Field {

   name: String,
   from: usize,
   to: usize,

}

impl Field {

   pub fn new(name: String, from: usize, to: usize) -> Self {
       assert!(from < to);
       Self { name, from, to }
   }
   pub fn name(&self) -> &str {
       &self.name
   }
   pub fn from(&self) -> usize {
       self.from
   }
   pub fn to(&self) -> usize {
       self.to
   }
   pub fn size(&self) -> usize {
       self.to - self.from
   }
   pub fn extract_bits<'a>(
       &self,
       bytes: &'a [u8],
   ) -> Option<impl Iterator<Item = (usize, Bit)> + 'a> {
       if self.to <= bytes.len() * 8 {
           Some((self.from..self.to).map(move |index| {
               let byte = bytes[index / 8];
               let bit_index = 7 - (index % 8);
               let bit_value = (byte >> bit_index) & 1 == 1;
               (index, bit_value)
           }))
       } else {
           None
       }
   }
   fn extend(&mut self, new_to: usize) {
       assert!(self.to <= new_to);
       self.to = new_to;
   }

}

trait Consume: Iterator {

   fn consume(&mut self, value: Self::Item) -> Result<Self::Item, Option<Self::Item>>
   where
       Self::Item: PartialEq,
   {
       match self.next() {
           Some(v) if v == value => Ok(v),
           Some(v) => Err(Some(v)),
           None => Err(None),
       }
   }

}

impl<T: Iterator> Consume for T {}

#[derive(Clone, Copy, Debug)]

enum ParserState {

   Uninitialized,
   ExpectBorder,
   ExpectField,
   AllowEmpty,

}

#[derive(Clone, Copy, Debug)]

pub enum ParserError {

   ParsingFailed,
   UnexpectedEnd,
   InvalidBorder,
   WrongLineWidth,
   FieldExpected,
   BadField,

}

#[derive(Debug)]

pub(crate) struct Parser {

   state: Option<ParserState>,
   width: usize,
   from: usize,
   fields: Vec<Field>,

}

impl Parser {

   #[allow(clippy::new_without_default)]
   pub fn new() -> Self {
       Self {
           state: Some(ParserState::Uninitialized),
           width: 0,
           from: 0,
           fields: Vec::new(),
       }
   }
   pub fn accept(&mut self, line: &str) -> Result<(), ParserError> {
       if let Some(state) = self.state.take() {
           let line = line.trim();
           if !line.is_empty() {
               self.state = Some(match state {
                   ParserState::Uninitialized => self.parse_border(line)?,
                   ParserState::ExpectBorder => self.accept_border(line)?,
                   ParserState::ExpectField => self.parse_fields(line)?,
                   ParserState::AllowEmpty => self.extend_field(line)?,
               });
           }
           Ok(())
       } else {
           Err(ParserError::ParsingFailed)
       }
   }
   pub fn finish(self) -> Result<Vec<Field>, ParserError> {
       match self.state {
           Some(ParserState::ExpectField) => Ok(self.fields),
           _ => Err(ParserError::UnexpectedEnd),
       }
   }
   fn parse_border(&mut self, line: &str) -> Result<ParserState, ParserError> {
       self.width = Parser::border_columns(line).map_err(|_| ParserError::InvalidBorder)?;
       Ok(ParserState::ExpectField)
   }
   fn accept_border(&mut self, line: &str) -> Result<ParserState, ParserError> {
       match Parser::border_columns(line) {
           Ok(width) if width == self.width => Ok(ParserState::ExpectField),
           Ok(_) => Err(ParserError::WrongLineWidth),
           Err(_) => Err(ParserError::InvalidBorder),
       }
   }
   fn parse_fields(&mut self, line: &str) -> Result<ParserState, ParserError> {
       let mut slots = line.split('|');
       // The first split result is the space outside of the schema
       slots.consume("").map_err(|_| ParserError::FieldExpected)?;
       let mut remaining_width = self.width * Parser::COLUMN_WIDTH;
       let mut fields_found = 0;
       loop {
           match slots.next() {
               Some(slot) if slot.is_empty() => {
                   // The only empty slot is the last one
                   if slots.next().is_some() || remaining_width != 0 {
                       return Err(ParserError::BadField);
                   }
                   break;
               }
               Some(slot) => {
                   let slot_width = slot.chars().count() + 1; // Include the slot separator
                   if remaining_width < slot_width || slot_width % Parser::COLUMN_WIDTH != 0 {
                       return Err(ParserError::BadField);
                   }
                   let name = slot.trim();
                   if name.is_empty() {
                       return Err(ParserError::BadField);
                   }
                   // An actual field slot confirmed
                   remaining_width -= slot_width;
                   fields_found += 1;
                   let from = self.from;
                   let to = from + slot_width / Parser::COLUMN_WIDTH;
                   // If the slot belongs to the same field as the last one, just extend it
                   if let Some(f) = self.fields.last_mut().filter(|f| f.name() == name) {
                       f.extend(to);
                   } else {
                       self.fields.push(Field::new(name.to_string(), from, to));
                   }
                   self.from = to;
               }
               _ => return Err(ParserError::BadField),
           }
       }
       Ok(if fields_found == 1 {
           ParserState::AllowEmpty
       } else {
           ParserState::ExpectBorder
       })
   }
   fn extend_field(&mut self, line: &str) -> Result<ParserState, ParserError> {
       let mut slots = line.split('|');
       // The first split result is the space outside of the schema
       if slots.consume("").is_ok() {
           if let Some(slot) = slots.next() {
               if slots.consume("").is_ok() {
                   let slot_width = slot.chars().count() + 1;
                   let remaining_width = self.width * Parser::COLUMN_WIDTH;
                   if slot_width == remaining_width && slot.chars().all(|c| c == ' ') {
                       self.from += self.width;
                       self.fields.last_mut().unwrap().extend(self.from);
                       return Ok(ParserState::AllowEmpty);
                   }
               }
           }
       }
       self.accept_border(line)
   }
   const COLUMN_WIDTH: usize = 3;
   fn border_columns(line: &str) -> Result<usize, Option<char>> {
       let mut chars = line.chars();
       // Read the first cell, which is mandatory
       chars.consume('+')?;
       chars.consume('-')?;
       chars.consume('-')?;
       chars.consume('+')?;
       let mut width = 1;
       loop {
           match chars.consume('-') {
               Err(Some(c)) => return Err(Some(c)),
               Err(None) => return Ok(width),
               Ok(_) => {}
           }
           chars.consume('-')?;
           chars.consume('+')?;
           width += 1;
       }
   }

}

pub struct Fields(pub Vec<Field>);

#[derive(Clone, Debug)]

pub struct ParseFieldsError {

   pub line: Option<String>,
   pub kind: ParserError,

}

impl ParseFieldsError {

   fn new(line: Option<String>, kind: ParserError) -> Self {
       Self { line, kind }
   }

}

impl std::str::FromStr for Fields {

   type Err = ParseFieldsError;
   fn from_str(s: &str) -> Result<Self, Self::Err> {
       let mut parser = Parser::new();
       for line in s.lines() {
           parser
               .accept(line)
               .map_err(|e| ParseFieldsError::new(Some(line.to_string()), e))?;
       }
       parser
           .finish()
           .map(Fields)
           .map_err(|e| ParseFieldsError::new(None, e))
   }

}

impl Fields {

   pub fn print_schema(&self, f: &mut dyn Write) -> std::io::Result<()> {
       writeln!(f, "Name          Bits    Start   End")?;
       writeln!(f, "=================================")?;
       for field in self.0.iter() {
           writeln!(
               f,
               "{:<12} {:>5}      {:>3}   {:>3}",
               field.name(),
               field.size(),
               field.from(),
               field.to() - 1 // Range is exclusive, but display it as inclusive
           )?;
       }
       writeln!(f)
   }
   pub fn print_decode(&self, f: &mut dyn Write, bytes: &[u8]) -> std::io::Result<()> {
       writeln!(f, "Input (hexadecimal octets): {:x?}", bytes)?;
       writeln!(f)?;
       writeln!(f, "Name          Size    Bit pattern")?;
       writeln!(f, "=================================")?;
       for field in self.0.iter() {
           writeln!(
               f,
               "{:<12} {:>5}    {}",
               field.name(),
               field.size(),
               field
                   .extract_bits(&bytes)
                   .map(|it| it.fold(String::new(), |mut acc, (index, bit)| {
                       // Instead of simple collect, let's print it rather with
                       // byte boundaries visible as spaces
                       if index % 8 == 0 && !acc.is_empty() {
                           acc.push(' ');
                       }
                       acc.push(if bit { '1' } else { '0' });
                       acc
                   }))
                   .map(Cow::Owned)
                   .unwrap_or_else(|| Cow::Borrowed("N/A"))
           )?;
       }
       writeln!(f)
   }

}

fn normalize(diagram: &str) -> String {

   diagram
       .lines()
       .map(|line| line.trim())
       .filter(|line| !line.is_empty())
       .fold(String::new(), |mut acc, x| {
           if !acc.is_empty() {
               acc.push('\n');
           }
           acc.push_str(x);
           acc
       })

}

fn main() {

   let diagram = r"
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                      ID                       |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    QDCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    ANCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    NSCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    ARCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                   OVERSIZED                   |
       |                                               |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       | OVERSIZED |           unused                  |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       ";
   let data = b"\x78\x47\x7b\xbf\x54\x96\xe1\x2e\x1b\xf1\x69\xa4\xab\xcd\xef\xfe\xdc";
   // Normalize and print the input, there is no need and no requirement to
   // generate it from the parsed representation
   let diagram = normalize(diagram);
   println!("{}", diagram);
   println!();
   match diagram.parse::<Fields>() {
       Ok(fields) => {
           let mut stdout = std::io::stdout();
           fields.print_schema(&mut stdout).ok();
           fields.print_decode(&mut stdout, data).ok();
       }
       Err(ParseFieldsError {
           line: Some(line),
           kind: e,
       }) => eprintln!("Invalid input: {:?}\n{}", e, line),
       Err(ParseFieldsError {
           line: _,
           kind: e,
       }) => eprintln!("Could not parse the input: {:?}", e),
   }

}</lang>

Output:
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                   OVERSIZED                   |
|                                               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| OVERSIZED |           unused                  |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Name          Bits    Start   End
=================================
ID              16        0    15
QR               1       16    16
Opcode           4       17    20
AA               1       21    21
TC               1       22    22
RD               1       23    23
RA               1       24    24
Z                3       25    27
RCODE            4       28    31
QDCOUNT         16       32    47
ANCOUNT         16       48    63
NSCOUNT         16       64    79
ARCOUNT         16       80    95
OVERSIZED       36       96   131
unused          12      132   143
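
The Start and End columns above follow from the field widths alone: each field begins where the previous one ended. A minimal sketch of that running sum, in Python rather than Rust; the names `widths` and `layout` are illustrative and not part of the Rust solution:

```python
# Field widths in diagram order, copied from the schema table above.
widths = [("ID", 16), ("QR", 1), ("Opcode", 4), ("AA", 1), ("TC", 1),
          ("RD", 1), ("RA", 1), ("Z", 3), ("RCODE", 4), ("QDCOUNT", 16),
          ("ANCOUNT", 16), ("NSCOUNT", 16), ("ARCOUNT", 16),
          ("OVERSIZED", 36), ("unused", 12)]

def layout(widths):
    """Return (name, bits, start, end) rows; start and end are inclusive bit offsets."""
    rows, start = [], 0
    for name, bits in widths:
        rows.append((name, bits, start, start + bits - 1))
        start += bits          # the next field begins right after this one
    return rows

for name, bits, start, end in layout(widths):
    print(f"{name:<12}{bits:>6}{start:>9}{end:>6}")
```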

Input (hexadecimal octets): [78, 47, 7b, bf, 54, 96, e1, 2e, 1b, f1, 69, a4, ab, cd, ef, fe, dc]

Name          Size    Bit pattern
=================================
ID              16    01111000 01000111
QR               1    0
Opcode           4    1111
AA               1    0
TC               1    1
RD               1    1
RA               1    1
Z                3    011
RCODE            4    1111
QDCOUNT         16    01010100 10010110
ANCOUNT         16    11100001 00101110
NSCOUNT         16    00011011 11110001
ARCOUNT         16    01101001 10100100
OVERSIZED       36    10101011 11001101 11101111 11111110 1101
unused          12    N/A
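
The bit patterns above can be checked by hand by slicing the input octets as one long bit string. The following Python sketch is only a cross-check under that assumption, not a translation of the Rust code; `bit_pattern` is a hypothetical helper, and it returns `None` where the table prints N/A because the 17 input octets (136 bits) end before the field does:

```python
# The 17 input octets listed above.
data = bytes.fromhex("78477bbf5496e12e1bf169a4abcdeffedc")

def bit_pattern(data, start, bits):
    """Return a field's bits grouped in eights, or None if the data is too short."""
    end = start + bits
    if end > len(data) * 8:
        return None
    all_bits = "".join(f"{byte:08b}" for byte in data)   # MSB first
    field = all_bits[start:end]
    return " ".join(field[i:i + 8] for i in range(0, len(field), 8))

print(bit_pattern(data, 0, 16))    # ID
print(bit_pattern(data, 96, 36))   # OVERSIZED
print(bit_pattern(data, 132, 12))  # unused: needs bits up to 143, only 136 available
```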

Tcl

This example is in need of improvement:

This example is *incorrect*. It assumes that sequential bitfields sharing a byte can be packed and unpacked by the [binary] command, which is not the case: every field specifier starts on a fresh byte, so the 96-bit header is encoded as 18 bytes instead of 12. The test only "appears" correct because encode and decode share the same bug, so the round trip still works. A wrapper which doesn't disturb the code below too much is in progress.
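
To illustrate what a correct encoding looks like, here is a minimal sketch in Python (not Tcl, and not the wrapper mentioned above) that packs the fields most significant bit first into one contiguous bit string. The names `pack_bits` and `unpack_bits` are hypothetical; the schema and values are the ones used in the test further down.

```python
def pack_bits(schema, values):
    """Pack fields MSB-first into one contiguous bit string, padded to whole bytes."""
    total, nbits = 0, 0
    for name, width in schema:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        total = (total << width) | value
        nbits += width
    pad = -nbits % 8                      # zero bits appended on the right
    return (total << pad).to_bytes((nbits + pad) // 8, "big")

def unpack_bits(schema, data):
    """Inverse of pack_bits; assumes data holds at least as many bits as the schema."""
    total, remaining = int.from_bytes(data, "big"), len(data) * 8
    result = {}
    for name, width in schema:
        remaining -= width
        result[name] = (total >> remaining) & ((1 << width) - 1)
    return result

schema = [("ID", 16), ("QR", 1), ("Opcode", 4), ("AA", 1), ("TC", 1), ("RD", 1),
          ("RA", 1), ("Z", 3), ("RCODE", 4), ("QDCOUNT", 16), ("ANCOUNT", 16),
          ("NSCOUNT", 16), ("ARCOUNT", 16)]
values = {"ID": 0xcafe, "QR": 1, "Opcode": 5, "AA": 1, "TC": 0, "RD": 0, "RA": 1,
          "Z": 4, "RCODE": 8, "QDCOUNT": 0x00a5, "ANCOUNT": 0x0a50,
          "NSCOUNT": 0xa500, "ARCOUNT": 0x500a}

pkt = pack_bits(schema, values)
print(pkt.hex())                          # cafeacc800a50a50a500500a
```

Here the nine flag and code fields between ID and QDCOUNT share two bytes (ac c8), so the packet is 12 bytes long.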

This is a nice task to illustrate a couple of important concepts in Tcl:

 * using dictionaries, taking advantage of their ordering properties
 * the binary command
 * using (semi-)structured text as part of your source code

In this implementation, parse produces a dictionary from names to bit-lengths. encode and decode use these to produce appropriate binary format strings, and then do what they say on the tin. As implemented, this is limited to unsigned numeric values in fields. Supporting signed values, strings and enums would require parsing a more complex annotation than only the ASCII art packet structure, but ought not to take much more code.

<lang Tcl> namespace eval asciipacket {

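   proc assert {expr} {    ;# for "static" assertions that throw nice errors
       if {![uplevel 1 [list expr $expr]]} {
           # [throw] (Tcl 8.6) is used because plain Tcl has no [raise] command;
           # variables are substituted in the caller's scope, where they exist
           throw {ASSERT ERROR} "{$expr} {[uplevel 1 [list subst -nocommands $expr]]}"
       }
   }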
   proc b2h {data} {       ;# format a binary string in hex digits
       binary scan $data H* hex; set hex
   }
   proc parse {s} {
       set result {}                       ;# we will return a dictionary
       set s [string trim $s]              ;# remove whitespace
       set s [split $s \n]                 ;# split into lines
       set s [lmap x $s {string trim $x}]  ;# trim whitespace from each line
       set s [lassign $s border0]          ;# pop off top border row
                                           ;# calculate chars per row, chars per bit
       set rowlen [llength [string map {+ \ } $border0]]
       set bitlen [expr {([string length $border0] - 1) / $rowlen}]
       assert {$bitlen * $rowlen + 1 == [string length $border0]}
       foreach {row border} $s {
           assert {$border eq $border0}
           set row [string trim $row |]
           foreach field [split $row |] {
               set len [string length |$field]
               assert {$len % $bitlen == 0}
               set name [string trim $field]
               set nbits [expr {$len / $bitlen}]
               assert {![dict exists $result $name]}
               dict set result $name $nbits
           }
       }
       return $result
   }
   proc encode {schema values} {
       set bincodes {1 B 8 c 16 S 32 I 64 W}    ;# see binary(n): I is 32-bit, W is 64-bit big-endian
       set binfmt ""                       ;# format string
       set binargs ""                      ;# positional args
       dict for {name bitlen} $schema {
           set val [dict get $values $name]
           if {[dict exists $bincodes $bitlen]} {
               set fmt "[dict get $bincodes $bitlen]"
           } else {
               set val [format %0${bitlen}b $val]
               set fmt "B${bitlen}"
           }
           append binfmt $fmt
           lappend binargs $val
       }
       binary format $binfmt {*}$binargs
   }


   proc decode {schema data} {
       set result   {}                     ;# we will return a dict
       set bincodes {1 B 8 c 16 S 32 I 64 W}    ;# see binary(n): I is 32-bit, W is 64-bit big-endian
       set binfmt   ""                     ;# format string
       set binargs  ""                     ;# positional args
       dict for {name bitlen} $schema {
           if {[dict exists $bincodes $bitlen]} {
               set fmt "[dict get $bincodes $bitlen]u" ;# note unsigned
           } else {
               set fmt "B${bitlen}"
           }
           append binfmt $fmt
           lappend binargs $name
       }
       binary scan $data $binfmt {*}$binargs
       foreach _ $binargs {
           dict set result $_ [set $_]
       }
       return $result
   }

} </lang>

And here is how to use it with the original test data:

<lang Tcl> proc test {} {

   set header {
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                      ID                       |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    QDCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    ANCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    NSCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |                    ARCOUNT                    |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   }
   set schema [asciipacket::parse $header]
   set values {
       ID 0xcafe
       QR 1
       Opcode 5
       AA 1
       TC 0
       RD 0
       RA 1
       Z  4
       RCODE 8
       QDCOUNT 0x00a5
       ANCOUNT 0x0a50
       NSCOUNT 0xa500
       ARCOUNT 0x500a
   }
   set pkt [asciipacket::encode $schema $values]
   puts "encoded packet (hex): [asciipacket::b2h $pkt]"
   array set decoded [asciipacket::decode $schema $pkt]
   parray decoded

}
test </lang>

Output:
encoded packet (hex): cafe805080000080808000a50a50a500500a
decoded(AA)      = 1
decoded(ANCOUNT) = 2640
decoded(ARCOUNT) = 20490
decoded(ID)      = 51966
decoded(NSCOUNT) = 42240
decoded(Opcode)  = 0101
decoded(QDCOUNT) = 165
decoded(QR)      = 1
decoded(RA)      = 1
decoded(RCODE)   = 1000
decoded(RD)      = 0
decoded(TC)      = 0
decoded(Z)       = 100
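
For readers who do not know Tcl, the approach taken by parse above (the border row fixes how many characters one bit occupies, and each field's width is its character span divided by that) can be sketched in Python. This is an illustrative re-statement under the same assumptions as the Tcl code, not a replacement for it; the function name `parse` and the variable names are chosen here for the sketch.

```python
def parse(diagram):
    """Map field names to bit widths, in diagram order."""
    lines = [ln.strip() for ln in diagram.strip().splitlines() if ln.strip()]
    border, rest = lines[0], lines[1:]
    columns = border.count("+") - 1           # bits per row
    cell = (len(border) - 1) // columns       # characters per bit, 3 for "+--"
    assert cell * columns + 1 == len(border)
    schema = {}
    for row, sep in zip(rest[0::2], rest[1::2]):
        assert sep == border                  # every row must be followed by a border
        for field in row.strip("|").split("|"):
            span = len(field) + 1             # count one "|" per field
            assert span % cell == 0
            name = field.strip()
            assert name not in schema         # field names must be unique
            schema[name] = span // cell
    return schema

header = """
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                      ID                       |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    QDCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    ANCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    NSCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    ARCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
"""
print(parse(header))
```

Like the Tcl version, this rejects duplicate field names, so a field that continues on a following row is not supported by the sketch.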