Function frequency: Difference between revisions

From Rosetta Code

Revision as of 10:59, 11 December 2011

Function frequency is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Display, for a program or runtime environment (whatever suits the style of your language), the top ten most frequently occurring functions (or identifiers or tokens, if preferred).

This is a static analysis: The question is not how often each function is actually executed at runtime, but how often it is used by the programmer.

Besides its practical usefulness, the intent of this task is to show how to do self-inspection within the language.
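
The difference between a static count and a runtime count can be illustrated with a minimal Go sketch. The names `sample`, `work`, `run` and `staticCalls` are hypothetical and exist only for this illustration. The call `work()` is written once, inside a loop that would execute 100 times, so a static count reports it once.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is a hypothetical program: work() appears once in the text
// but would be executed 100 times at runtime.
const sample = `package p

func work() {}

func run() {
	for i := 0; i < 100; i++ {
		work()
	}
}
`

// staticCalls counts how often each call with a plain identifier as
// callee appears in the source text. It never executes the code.
func staticCalls(src string) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, 0)
	if err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	ast.Inspect(file, func(n ast.Node) bool {
		if ce, ok := n.(*ast.CallExpr); ok {
			if id, ok := ce.Fun.(*ast.Ident); ok {
				counts[id.Name]++
			}
		}
		return true
	})
	return counts, nil
}

func main() {
	counts, err := staticCalls(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("work:", counts["work"]) // prints "work: 1"
}
```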

Go

Only a crude approximation is currently easy in Go. The following parses source code, looks for function call syntax (an expression followed by an argument list), and prints the ten most frequent callee expressions with their counts.
<lang go>package main

import (
    "fmt"
    "go/ast"
    "go/parser"
    "go/token"
    "io/ioutil"
    "os"
    "sort"
)

func main() {
    if len(os.Args) != 2 {
        fmt.Println("usage ff <go source filename>")
        return
    }
    src, err := ioutil.ReadFile(os.Args[1])
    if err != nil {
        fmt.Println(err)
        return
    }
    fs := token.NewFileSet()
    a, err := parser.ParseFile(fs, os.Args[1], src, 0)
    if err != nil {
        fmt.Println(err)
        return
    }
    f := fs.File(a.Pos())
    m := make(map[string]int)
    // Count each call by the source text between the start of the
    // call expression and its opening parenthesis.
    ast.Inspect(a, func(n ast.Node) bool {
        if ce, ok := n.(*ast.CallExpr); ok {
            start := f.Offset(ce.Pos())
            end := f.Offset(ce.Lparen)
            m[string(src[start:end])]++
        }
        return true
    })
    cs := make(calls, 0, len(m))
    for k, v := range m {
        cs = append(cs, &call{k, v})
    }
    sort.Sort(cs)
    // Print the ten most frequent entries.
    for i, c := range cs {
        fmt.Printf("%-20s %4d\n", c.expr, c.count)
        if i == 9 {
            break
        }
    }
}

type call struct {
    expr  string
    count int
}
type calls []*call

// calls sorts by descending count.
func (c calls) Len() int           { return len(c) }
func (c calls) Swap(i, j int)      { c[i], c[j] = c[j], c[i] }
func (c calls) Less(i, j int) bool { return c[i].count > c[j].count }</lang>
Output, when run on the source code above:

len                     3
fmt.Println             3
f.Offset                2
make                    2
fmt.Printf              1
ioutil.ReadFile         1
a.Pos                   1
string                  1
token.NewFileSet        1
append                  1
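
Several entries in the output above share the same count. Because Go does not specify map iteration order and `sort.Sort` is not guaranteed to be stable, the relative order of tied entries may differ between runs. A minimal sketch of one way to make the listing reproducible, assuming ties are broken alphabetically (the names `entry` and `topN` are hypothetical and not part of the solution above):

```go
package main

import (
	"fmt"
	"sort"
)

// entry pairs a call expression with its number of occurrences.
type entry struct {
	expr  string
	count int
}

// topN returns at most n entries ordered by descending count;
// entries with equal counts are ordered by expression, ascending.
func topN(m map[string]int, n int) []entry {
	es := make([]entry, 0, len(m))
	for k, v := range m {
		es = append(es, entry{k, v})
	}
	sort.Slice(es, func(i, j int) bool {
		if es[i].count != es[j].count {
			return es[i].count > es[j].count
		}
		return es[i].expr < es[j].expr
	})
	if len(es) > n {
		es = es[:n]
	}
	return es
}

func main() {
	// A few counts copied from the sample output above.
	m := map[string]int{
		"len":         3,
		"fmt.Println": 3,
		"f.Offset":    2,
		"make":        2,
		"append":      1,
	}
	for _, e := range topN(m, 3) {
		fmt.Printf("%-20s %4d\n", e.expr, e.count)
	}
}
```

With this comparison function the three lines printed are always `fmt.Println`, `len` and `f.Offset`, in that order.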

PicoLisp

<lang PicoLisp>(let Freq NIL
   (for "L" (filter pair (extract getd (all)))
      (for "F"
         (filter atom
            (fish '((X) (or (circ? X) (getd X)))
               "L" ) )
         (accu 'Freq "F" 1) ) )
   (for X (head 10 (flip (by cdr sort Freq)))
      (tab (-7 4) (car X) (cdr X)) ) )</lang>

Output, for the system in debug mode plus the above code:

quote   310
car     236
cdr     181
setq    148
let     136
if      127
and     124
cons    110
cadr     80
or       76

If the condition in the 5th line (getd X) is replaced with (sym? X), then all symbols are counted, and the output is

X       566
quote   310
car     236
cdr     181
C       160
N       157
L       155
Lst     152
setq    148
T       144

And if it is replaced with (num? X), it is

1        71
0        38
2        27
3        17
7         9
-1        9
100       8
48        6
43        6
12        6
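
For comparison, the same idea of swapping the selection predicate can be sketched for a parser-based approach in Go: selecting `*ast.Ident` nodes counts all identifiers, and selecting `*ast.BasicLit` nodes of kind `token.INT` counts integer literals. This is only an illustrative sketch; `sample`, `count`, `identKey` and `intKey` are hypothetical names, and the input is a made-up function rather than a whole system.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is a hypothetical input used only for this illustration.
const sample = `package p

func sum(xs []int) int {
	total := 0
	for i := 0; i < len(xs); i++ {
		total += xs[i] + 1
	}
	return total
}
`

// count tallies the keys returned by key for every AST node in src.
// Nodes for which key reports false are skipped.
func count(src string, key func(ast.Node) (string, bool)) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, 0)
	if err != nil {
		return nil, err
	}
	m := make(map[string]int)
	ast.Inspect(file, func(n ast.Node) bool {
		if k, ok := key(n); ok {
			m[k]++
		}
		return true
	})
	return m, nil
}

// identKey selects every identifier (roughly the "all symbols" variant).
func identKey(n ast.Node) (string, bool) {
	if id, ok := n.(*ast.Ident); ok {
		return id.Name, true
	}
	return "", false
}

// intKey selects integer literals (roughly the "numbers" variant).
func intKey(n ast.Node) (string, bool) {
	if lit, ok := n.(*ast.BasicLit); ok && lit.Kind == token.INT {
		return lit.Value, true
	}
	return "", false
}

func main() {
	ids, err := count(sample, identKey)
	if err != nil {
		fmt.Println(err)
		return
	}
	nums, err := count(sample, intKey)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("identifier i:", ids["i"])
	fmt.Println("literal 0:", nums["0"])
}
```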

Tcl

<lang tcl>package require Tcl 8.6

proc examine {filename} {
    global cmds
    set RE "(?:^|\[\[\{\])\[\\w:.\]+"
    set f [open $filename]
    while {[gets $f line] >= 0} {
        set line [string trim $line]
        if {$line eq "" || [string match "#*" $line]} {
            continue
        }
        foreach cmd [regexp -all -inline $RE $line] {
            incr cmds([string trim $cmd "\{\["])
        }
    }
    close $f
}

# Parse each file on the command line
foreach filename $argv {
    examine $filename
}
# Get the command list in order of frequency
set cmdinfo [lsort -stride 2 -index 1 -integer -decreasing [array get cmds]]
# Print the top 10 (two list items per entry, so 0-19, not 0-9)
foreach {cmd count} [lrange $cmdinfo 0 19] {
    puts [format "%-20s%d" $cmd $count]
}</lang>
Sample run (note that the commands found are all standard Tcl commands; they're just commands so it is natural to expect them to be found):

bash$ tclsh8.6 RosettaCode/cmdfreq.tcl RosettaCode/*.tcl 
set                 2374
expr                846
if                  775
puts                558
return              553
proc                549
incr                485
foreach             432
lindex              406
lappend             351