User talk:Paddy3118: Difference between revisions

From Rosetta Code

Revision as of 05:57, 24 July 2009

RC_vanity_search.py

Sad, but I like to see how my tasks are faring over time:

<lang python>
'''
Rosetta Code Vanity search:
    How many new pages has someone created?
'''

import urllib, re

user = 'Paddy3118'

site = 'http://www.rosettacode.org'
nextpage = site + '/wiki/Special:Contributions/' + user
nextpage_re = re.compile(
    r'<a href="([^"]+)" title="[^"]+" rel="next">older ')

newpages = []
pagecount = 0
while nextpage:
    page = urllib.urlopen(nextpage)
    pagecount += 1
    nextpage = ''
    for line in page:
        if not nextpage:
            # Search for URL to next page of results for download
            nextpage_match = re.search(nextpage_re, line)
            if nextpage_match:
                # hrefs are HTML-escaped, so turn &amp; back into &
                nextpage = (site + nextpage_match.groups()[0]).replace('&amp;', '&')
                #print nextpage
                npline = line
        # NOTE: the marker string tested here was HTML and was lost when this
        # page was rendered; it presumably flagged the "N" (new page) entries.
        if '' in line:
            # extract N page name from title
            newpages.append(line.partition(' title="')[2].partition('"')[0])
    page.close()

nontalk = [p for p in newpages if not ':' in p]

print "User: %s has created %i new pages of which %i were not Talk: pages, from approx %i edits" % (
    user, len(newpages), len(nontalk), pagecount*50 )

print "New pages created, in order, are:\n ",
print "\n ".join(nontalk[::-1])


nextpage = site + '/w/index.php?title=Special:PopularPages'
nextpage_re = re.compile(
    r'<a href="([^"]+)" class="mw-nextlink">next ')

data_re = re.compile(
    r'^<li><a href="[^"]+" title="([^"]+)".*</a>.*\(([0-9,]+) views\)' )

title2rankviews = {}
rank = 1
pagecount = 0
while nextpage:
    page = urllib.urlopen(nextpage)
    pagecount += 1
    nextpage = ''
    for line in page:
        if not nextpage:
            # Search for URL to next page of results for download
            nextpage_match = re.search(nextpage_re, line)
            if nextpage_match:
                nextpage = (site + nextpage_match.groups()[0]).replace('&amp;', '&')
                # print nextpage
                npline = line
        datamatch = re.search(data_re, line)
        if datamatch:
            title, views = datamatch.groups()
            views = int(views.replace(',', ''))
            title2rankviews[title] = [rank, views]
            rank += 1
    page.close()

print "\n\n Highest page Ranks for user pages:"
fmt = "  %-4s %-6s %s" # rank, views, title
print fmt % ('RANK', 'VIEWS', 'TITLE')
# Pages missing from the popularity list sort last with rank 99999 and 0 views
highrank = [title2rankviews.get(t,[99999, 0]) + [t] for t in nontalk]
highrank.sort()
for x in highrank:
    print fmt % tuple(x)
</lang>
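The "follow the older/next link until there isn't one" loop used twice in the script can be tried without touching the network. The following is only a sketch in Python 3 (the script above is Python 2): `FAKE_SITE`, its URLs, its simplified markup and the helper `walk_pages` are all invented for illustration and are not what MediaWiki actually serves.

```python
import re

# Hypothetical stand-ins for downloaded pages: URL -> list of HTML lines.
# The markup is invented and far simpler than real MediaWiki output.
FAKE_SITE = {
    '/contribs?page=1': [
        '<li><a href="/wiki/Spiral" title="Spiral">Spiral</a></li>',
        '<a href="/contribs?page=2&amp;limit=50" title="Older" rel="next">older 50</a>',
    ],
    '/contribs?page=2&limit=50': [
        '<li><a href="/wiki/Anagrams" title="Anagrams">Anagrams</a></li>',
    ],
}

# Same pattern as the first nextpage_re in the script above.
NEXT_RE = re.compile(r'<a href="([^"]+)" title="[^"]+" rel="next">older ')


def walk_pages(start, fetch):
    """Yield (url, lines) for each page, following 'older' links until none is found."""
    url = start
    while url:
        lines = fetch(url)
        next_url = ''
        for line in lines:
            match = NEXT_RE.search(line)
            if match:
                # href values are HTML-escaped, so undo &amp; before reusing them
                next_url = match.group(1).replace('&amp;', '&')
                break
        yield url, lines
        url = next_url


visited = [url for url, _ in walk_pages('/contribs?page=1', FAKE_SITE.__getitem__)]
print(visited)
```

Passing the fetch function in as a parameter is a design choice of this sketch, not of the original script; it lets the same loop run against canned pages or a real downloader.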

    Sample output on 21:28, 4 June 2009

    User: Paddy3118 has created 52 new pages of which 27 were not Talk: pages, from approx 500 edits
    New pages created, in order, are:
      Spiral
      Monty Hall simulation
      Web Scraping
      Sequence of Non-squares
      Anagrams
      Max Licenses In Use
      One dimensional cellular automata
      Conway's Game of Life
      Data Munging
      Data Munging 2
      Column Aligner
      Probabilistic Choice
      Knapsack Problem
      Yuletide Holiday
      Common number base conversions
      Octal
      Integer literals
      Command Line Interpreter
      First-class functions
      Y combinator
      Functional Composition
      Exceptions Through Nested Calls
      Look-and-say sequence
      Mutual Recursion
      Bulls and Cows
      Testing a Function
      Select
    
    
     Highest page Ranks for user pages:
      RANK VIEWS  TITLE
      102  2442   Monty Hall simulation
      106  2294   Knapsack Problem
      109  2234   Conway's Game of Life
      141  1798   Anagrams
      214  1131   Web Scraping
      218  1087   Max Licenses In Use
      230  1022   Spiral
      231  997    One dimensional cellular automata
      257  825    Sequence of Non-squares
      258  823    Yuletide Holiday
      274  762    Column Aligner
      314  645    Data Munging 2
      318  627    Data Munging
      320  623    Probabilistic Choice
      322  620    Y combinator
      323  614    First-class functions
      374  494    Command Line Interpreter
      385  446    Functional Composition
      403  412    Integer literals
      412  404    Mutual Recursion
      417  388    Bulls and Cows
      438  336    Look-and-say sequence
      439  336    Common number base conversions
      450  293    Octal
      468  250    Exceptions Through Nested Calls
      661  75     Select
      677  56     Testing a Function
    >>> 

    From which I deduce there must be a lot of students out there being set a knapsack-like problem ;-)

    Sample output on 21:24, 17 June 2009 (UTC)

    User: Paddy3118 has created 57 new pages of which 29 were not Talk: pages, from approx 550 edits
    New pages created, in order, are:
      Spiral
      Monty Hall simulation
      Web Scraping
      Sequence of Non-squares
      Anagrams
      Max Licenses In Use
      One dimensional cellular automata
      Conway's Game of Life
      Data Munging
      Data Munging 2
      Column Aligner
      Probabilistic Choice
      Knapsack Problem
      Yuletide Holiday
      Common number base formatting
      Octal
      Integer literals
      Command Line Interpreter
      First-class functions
      Y combinator
      Functional Composition
      Exceptions Through Nested Calls
      Look-and-say sequence
      Mutual Recursion
      Bulls and Cows
      Testing a Function
      Select
      Sort stability
      Moving Average
    
    
     Highest page Ranks for user pages:
      RANK VIEWS  TITLE
      101  2501   Knapsack Problem
      102  2499   Monty Hall simulation
      108  2309   Conway's Game of Life
      137  1876   Anagrams
      201  1283   One dimensional cellular automata
      214  1162   Web Scraping
      220  1114   Max Licenses In Use
      226  1073   Spiral
      258  844    Yuletide Holiday
      259  842    Sequence of Non-squares
      271  804    Column Aligner
      297  707    Y combinator
      310  685    Data Munging 2
      311  683    First-class functions
      315  666    Data Munging
      320  649    Probabilistic Choice
      364  525    Command Line Interpreter
      377  487    Mutual Recursion
      378  480    Functional Composition
      391  456    Bulls and Cows
      398  447    Integer literals
      425  386    Common number base formatting
      439  361    Look-and-say sequence
      449  320    Octal
      458  288    Exceptions Through Nested Calls
      518  184    Sort stability
      556  146    Testing a Function
      577  122    Select
      724  4      Moving Average

    So in two weeks I've created two new tasks, and the Knapsack Problem has finally moved to become my top viewed task.

    Sample output on 07:06, 25 June 2009 (UTC)

    I have added no new pages, but I have broken into the top 100 list of page views!!:

     Highest page Ranks for user pages:
      RANK VIEWS  TITLE
      98   2595   Knapsack Problem
      102  2542   Monty Hall simulation
      108  2361   Conway's Game of Life
      133  1937   Anagrams
      199  1312   One dimensional cellular automata
      215  1179   Web Scraping
      220  1133   Max Licenses In Use
      222  1113   Spiral
      257  872    Yuletide Holiday
      260  870    Sequence of Non-squares
      267  829    Column Aligner
      296  739    Y combinator
      300  720    First-class functions
      311  695    Data Munging 2
      316  684    Data Munging
      320  664    Probabilistic Choice
      355  563    Command Line Interpreter
      373  519    Mutual Recursion
      376  517    Functional Composition
      380  496    Bulls and Cows
      394  468    Integer literals
      418  421    Common number base formatting
      435  378    Look-and-say sequence
      450  336    Octal
      460  305    Exceptions Through Nested Calls
      504  214    Sort stability
      543  169    Testing a Function
      563  146    Select
      582  130    Moving Average

    I did like writing the story around that Knapsack task.


    Vanity Search Updated

    After the 4th of July RC updates, my RC Vanity Searcher needed to be updated because of HTML changes. I also decided to add a new column, called +/-, that shows how far each page has moved between the order it was created in and its order by views.

    <lang python>
    '''
    Rosetta Code Vanity search:
        How many new pages has someone created?
    '''

    import urllib, re, pdb

    user = 'Paddy3118'

    site = 'http://www.rosettacode.org'
    nextpage = site + '/wiki/Special:Contributions/' + user
    nextpage_re = re.compile(
        #r'<a href="([^"]+)" title="[^"]+" rel="next">older '
        r'<a href="([^"]+)" title="[^"]+" rel="next"[^>]*>older '
        )

    newpages = []
    pagecount = 0
    while nextpage:
        page = urllib.urlopen(nextpage)
        pagecount += 1
        nextpage = ''
        for line in page:
            if not nextpage:
                # Search for URL to next page of results for download
                nextpage_match = re.search(nextpage_re, line)
                if nextpage_match:
                    # hrefs are HTML-escaped, so turn &amp; back into &
                    nextpage = (site + nextpage_match.groups()[0]).replace('&amp;', '&')
                    #print nextpage
                    npline = line
            # NOTE: the marker string tested here was HTML and was lost when this
            # page was rendered; it presumably flagged the "N" (new page) entries.
            if '' in line:
                # extract N page name from title
                newpages.append(line.partition(' title="')[2].partition('"')[0])
        page.close()

    nontalk = [p for p in newpages if not ':' in p]
    nontalk.reverse()   # contributions are listed newest first

    print "User: %s has created %i new pages of which %i were not Talk: pages, from approx %i edits" % (
        user, len(newpages), len(nontalk), pagecount*50 )

    print "New pages created, in order, are:\n ",
    print "\n ".join(nontalk)


    nextpage = site + '/w/index.php?title=Special:PopularPages'
    nextpage_re = re.compile(
        #r'<a href="([^"]+)" class="mw-nextlink">next '
        r'<a href="([^"]+)"[^>]* class="mw-nextlink"[^>]*>next'
        )

    data_re = re.compile(
        r'^<li><a href="[^"]+" title="([^"]+)".*</a>.*\(([0-9,]+) views\)' )

    title2rankviews = {}
    rank = 1
    pagecount = 0
    while nextpage:
        page = urllib.urlopen(nextpage)
        pagecount += 1
        nextpage = ''
        for line in page:
            if not nextpage:
                # Search for URL to next page of results for download
                nextpage_match = re.search(nextpage_re, line)
                if nextpage_match:
                    nextpage = (site + nextpage_match.groups()[0]).replace('&amp;', '&')
                    # print nextpage
                    npline = line
            datamatch = re.search(data_re, line)
            if datamatch:
                title, views = datamatch.groups()
                views = int(views.replace(',', ''))
                title2rankviews[title] = [rank, views]
                rank += 1
        page.close()

    print "\n\n Highest page Ranks for user pages:\n"
    fmt = "  %-4s %-6s %-3s %s" # rank, views, +/- title
    print fmt % ('RANK', 'VIEWS', '+/-', 'TITLE')
    fmt = "  %4s %6s %+3i %s" # rank, views, +/- title
    # Pages missing from the popularity list sort last with rank 99999 and 0 views
    highrank = [title2rankviews.get(t,[99999, 0]) + [t] for t in nontalk]
    highrank.sort()
    for i,x in enumerate(highrank):
        rank, view, title = x
        # positive: the page sits higher by views than by creation order
        movement = nontalk.index(title) - i
        print fmt % (rank, view, movement, title)
    </lang>
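The two `nextpage_re` changes in the updated script both add `[^>]*` so that extra attributes inside the `<a ...>` tag no longer break the match. A small Python 3 sketch of the effect follows; the two sample links are invented for illustration, since the actual markup served before and after the site update is not recorded on this page.

```python
import re

# The commented-out pattern and its replacement, as in the updated script.
OLD_NEXT_RE = re.compile(r'<a href="([^"]+)" class="mw-nextlink">next ')
NEW_NEXT_RE = re.compile(r'<a href="([^"]+)"[^>]* class="mw-nextlink"[^>]*>next')

# Hypothetical link markup: the second sample has extra attributes
# around class="mw-nextlink", which the old pattern does not allow for.
samples = {
    'plain': '<a href="/w/index.php?offset=50" class="mw-nextlink">next 50</a>',
    'extra attributes': ('<a href="/w/index.php?offset=50" title="Special:PopularPages"'
                         ' class="mw-nextlink" rel="next">next 50</a>'),
}

results = {}
for label, html in samples.items():
    results[label] = (bool(OLD_NEXT_RE.search(html)), bool(NEW_NEXT_RE.search(html)))
    print('%-17s old=%-5s new=%s' % ((label,) + results[label]))
```

The relaxed pattern still relies on attribute order (`href` first, `class` later), so a further markup change could break it again; an HTML parser would be more robust, at the cost of a longer script.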

    Sample output on 12 July 2009

    Since the 25th of June I've added two tasks: 'Send an email' and 'Topological (dependency) sort'.

    With the new +/- column showing view order relative to creation order, you can see that the Knapsack Problem has captured the readers' imagination. The Y combinator has a strong showing too.

    Is it possible to get views in the last three months rather than total views? That would allow me to discount some of the bias due to date created.
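Without support from the site, one rough way to approximate views over a period is to subtract one saved snapshot of this script's output from a later one. The Python 3 sketch below does that for a handful of tasks, using view counts copied from the 25 June and 12 July outputs on this page; it covers about 17 days rather than three months, and the variable names are made up for the example.

```python
# View counts copied from the 25 June 2009 and 12 July 2009 outputs on this page.
views_25_june = {
    'Knapsack Problem': 2595,
    'Monty Hall simulation': 2542,
    "Conway's Game of Life": 2361,
    'Spiral': 1113,
    'Y combinator': 739,
    'Moving Average': 130,
}
views_12_july = {
    'Knapsack Problem': 2792,
    'Monty Hall simulation': 2629,
    "Conway's Game of Life": 2431,
    'Spiral': 1158,
    'Y combinator': 809,
    'Moving Average': 174,
}

# Views gained between the two snapshots, for titles present in both.
gained = {title: views_12_july[title] - views_25_june[title]
          for title in views_25_june if title in views_12_july}

for title, gain in sorted(gained.items(), key=lambda item: -item[1]):
    print('%5i  %s' % (gain, title))
```

On these figures the Y combinator gained as many views as Conway's Game of Life over the period, even though its total is far lower, which is the kind of creation-date bias that total views hide.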

    User: Paddy3118 has created 63 new pages of which 31 were not Talk: pages, from approx 600 edits
    New pages created, in order, are:
      Spiral
      Monty Hall simulation
      Web Scraping
      Sequence of Non-squares
      Anagrams
      Max Licenses In Use
      One dimensional cellular automata
      Conway's Game of Life
      Data Munging
      Data Munging 2
      Column Aligner
      Probabilistic Choice
      Knapsack Problem
      Yuletide Holiday
      Common number base formatting
      Octal
      Integer literals
      Command Line Interpreter
      First-class functions
      Y combinator
      Functional Composition
      Exceptions Through Nested Calls
      Look-and-say sequence
      Mutual Recursion
      Bulls and Cows
      Testing a Function
      Select
      Sort stability
      Moving Average
      Send an email
      Topological (dependency) sort
    
    
     Highest page Ranks for user pages:
    
      RANK VIEWS  +/- TITLE
        84   2792 +12 Knapsack Problem
       100   2629  +0 Monty Hall simulation
       107   2431  +5 Conway's Game of Life
       132   2013  +1 Anagrams
       199   1368  +2 One dimensional cellular automata
       211   1233  -3 Web Scraping
       218   1189  -1 Max Licenses In Use
       221   1158  -7 Spiral
       256    919  -5 Sequence of Non-squares
       261    900  +4 Yuletide Holiday
       267    866  +0 Column Aligner
       281    809  +8 Y combinator
       290    780  +6 First-class functions
       303    732  -4 Data Munging 2
       309    718  -6 Data Munging
       315    706  -4 Probabilistic Choice
       332    642  +1 Command Line Interpreter
       351    598  +6 Mutual Recursion
       365    568  +6 Bulls and Cows
       366    566  +1 Functional Composition
       388    501  -4 Integer literals
       407    463  -7 Common number base formatting
       418    436  +0 Look-and-say sequence
       446    375  -8 Octal
       455    347  -3 Exceptions Through Nested Calls
       493    248  +2 Sort stability
       525    207  -1 Testing a Function
       548    186  -1 Select
       559    174  +0 Moving Average
       703     69  +0 Send an email
       725     43  +0 Topological (dependency) sort
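The +/- figures in the table above can be checked by hand: each is the page's position in creation order minus its position in view order. A short Python 3 sketch for the top four rows, using titles copied from the lists above (only the first 13 created pages are needed to cover these rows):

```python
# First 13 pages in creation order, copied from the list above.
created = [
    'Spiral', 'Monty Hall simulation', 'Web Scraping', 'Sequence of Non-squares',
    'Anagrams', 'Max Licenses In Use', 'One dimensional cellular automata',
    "Conway's Game of Life", 'Data Munging', 'Data Munging 2', 'Column Aligner',
    'Probabilistic Choice', 'Knapsack Problem',
]

# Top four pages by views, copied from the 12 July table.
by_views = ['Knapsack Problem', 'Monty Hall simulation',
            "Conway's Game of Life", 'Anagrams']

# Same calculation as the script: creation index minus view-order index.
movements = [created.index(title) - i for i, title in enumerate(by_views)]

for movement, title in zip(movements, by_views):
    print('%+3i %s' % (movement, title))
```

The Knapsack Problem was the 13th page created (index 12) but is first by views (index 0), which gives the +12 shown in the table.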

    Sample output on 24 July 2009

    Poly thanks

    Thanks, but if you needed to see the example to get it clearer... it means the pseudocode is not very clear :D Moreover, while implementing the Octave solution, I've discovered it's more convenient to keep the higher power first (in the "vectors"), so I am likely going to rewrite the pseudocode ;) ... --ShinTakezou 16:31, 18 June 2009 (UTC)