User talk:Paddy3118

Revision as of 21:30, 4 June 2009 by rosettacode>Paddy3118 (RC_vanity_search.py)

RC_vanity_search.py

Sad, but I like to see how my tasks are faring over time:

<lang python>'''
Rosetta Code Vanity search:
    How many new pages has someone created?
'''

import urllib, re

user = 'Paddy3118'

site = 'http://www.rosettacode.org'
nextpage = site + '/wiki/Special:Contributions/' + user
nextpage_re = re.compile(
    r'<a href="([^"]+)" title="[^"]+" rel="next">older ')

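newpages = []
pagecount = 0
while nextpage:
    page = urllib.urlopen(nextpage)
    pagecount += 1
    nextpage = ''
    for line in page:
        if not nextpage:
            # Search for URL to next page of results for download
            nextpage_match = re.search(nextpage_re, line)
            if nextpage_match:
                # The href is HTML-escaped, so turn '&amp;' back into '&'
                nextpage = (site + nextpage_match.groups()[0]).replace('&amp;', '&')
                #print nextpage
                npline = line
        # NOTE: the marker string tested here is missing from this copy of the
        # script; 'class="newpage"' is an assumption about how the
        # contributions list flags an edit that created a page (the "N" flag).
        if 'class="newpage"' in line:
            # extract N page name from title
            newpages.append(line.partition(' title="')[2].partition('"')[0])
    page.close()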

nontalk = [p for p in newpages if not ':' in p]

print "User: %s has created %i new pages of which %i were not Talk: pages, from approx %i edits" % (
    user, len(newpages), len(nontalk), pagecount*50 )

print "New pages created, in order, are:\n ",
print "\n ".join(nontalk[::-1])



nextpage = site + '/w/index.php?title=Special:PopularPages'
nextpage_re = re.compile(
    r'<a href="([^"]+)" class="mw-nextlink">next ')
data_re = re.compile(
    r'^<li><a href="[^"]+" title="([^"]+)".*</a>.*\(([0-9,]+) views\)' )

title2rankviews = {}
rank = 1
pagecount = 0
while nextpage:
    page = urllib.urlopen(nextpage)
    pagecount += 1
    nextpage = ''
    for line in page:
        if not nextpage:
            # Search for URL to next page of results for download
            nextpage_match = re.search(nextpage_re, line)
            if nextpage_match:
                nextpage = (site + nextpage_match.groups()[0]).replace('&amp;', '&')
                # print nextpage
                npline = line
        datamatch = re.search(data_re, line)
        if datamatch:
            title, views = datamatch.groups()
            views = int(views.replace(',', ''))
            title2rankviews[title] = [rank, views]
            rank += 1
    page.close()

print "\n\n Highest page Ranks for user pages:"
fmt = "  %-4s %-6s %s"  # rank, views, title
print fmt % ('RANK', 'VIEWS', 'TITLE')
# Pages missing from the popularity listing get a placeholder rank of 99999
highrank = [title2rankviews.get(t, [99999, 0]) + [t] for t in nontalk]
highrank.sort()
for x in highrank:
    print fmt % tuple(x)
</lang>
Sample output on 21:28, 4 June 2009
    User: Paddy3118 has created 52 new pages of which 27 were not Talk: pages, from approx 500 edits
    New pages created, in order, are:
      Spiral
      Monty Hall simulation
      Web Scraping
      Sequence of Non-squares
      Anagrams
      Max Licenses In Use
      One dimensional cellular automata
      Conway's Game of Life
      Data Munging
      Data Munging 2
      Column Aligner
      Probabilistic Choice
      Knapsack Problem
      Yuletide Holiday
      Common number base conversions
      Octal
      Integer literals
      Command Line Interpreter
      First-class functions
      Y combinator
      Functional Composition
      Exceptions Through Nested Calls
      Look-and-say sequence
      Mutual Recursion
      Bulls and Cows
      Testing a Function
      Select
    
    
     Highest page Ranks for user pages:
      RANK VIEWS  TITLE
      102  2442   Monty Hall simulation
      106  2294   Knapsack Problem
      109  2234   Conway's Game of Life
      141  1798   Anagrams
      214  1131   Web Scraping
      218  1087   Max Licenses In Use
      230  1022   Spiral
      231  997    One dimensional cellular automata
      257  825    Sequence of Non-squares
      258  823    Yuletide Holiday
      274  762    Column Aligner
      314  645    Data Munging 2
      318  627    Data Munging
      320  623    Probabilistic Choice
      322  620    Y combinator
      323  614    First-class functions
      374  494    Command Line Interpreter
      385  446    Functional Composition
      403  412    Integer literals
      412  404    Mutual Recursion
      417  388    Bulls and Cows
      438  336    Look-and-say sequence
      439  336    Common number base conversions
      450  293    Octal
      468  250    Exceptions Through Nested Calls
      661  75     Select
      677  56     Testing a Function
    >>> 
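
The ranking step of the script can be tried without touching the site. The following is only a sketch, written for Python 3 rather than the Python 2 of the script above. The function names `parse_popular_lines` and `rank_table` are made up for illustration, and the `sample` lines are hypothetical stand-ins for Special:PopularPages output, not real data from the site.

```python
import re

# Same pattern as the script's data_re: a list item that links to a page,
# followed by a "(N views)" count.
DATA_RE = re.compile(
    r'^<li><a href="[^"]+" title="([^"]+)".*</a>.*\(([0-9,]+) views\)')

# Placeholder [rank, views] for titles that never appear in the listing.
UNRANKED = [99999, 0]


def parse_popular_lines(lines, start_rank=1):
    """Map page title -> [rank, views] for every line matching DATA_RE."""
    title2rankviews = {}
    rank = start_rank
    for line in lines:
        match = DATA_RE.search(line)
        if match:
            title, views = match.groups()
            # Strip thousands separators before converting: '1,798' -> 1798
            title2rankviews[title] = [rank, int(views.replace(',', ''))]
            rank += 1
    return title2rankviews


def rank_table(titles, title2rankviews):
    """Return [rank, views, title] rows sorted by rank, unranked titles last."""
    return sorted(title2rankviews.get(t, UNRANKED) + [t] for t in titles)


# Hypothetical sample lines (an assumption, not copied from the site).
sample = [
    '<li><a href="/wiki/Anagrams" title="Anagrams">Anagrams</a> (1,798 views)</li>',
    '<p>not a list item</p>',
    '<li><a href="/wiki/Octal" title="Octal">Octal</a> (293 views)</li>',
]
ranks = parse_popular_lines(sample)
table = rank_table(['Octal', 'Select', 'Anagrams'], ranks)
for row in table:
    print("  %-5s %-6s %s" % tuple(row))
```

Rank here is simply the position of a matching line in the listing, so it is only meaningful if the lines are fed in the order the site returns them.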