User:Bukzor

From Rosetta Code

Revision as of 03:31, 26 April 2010

My Favorite Languages
Language      Proficiency
Python        Expert
UNIX Shell    Very Active
SQL           Active
JavaScript    Semi-Active
C             Rusty
Perl          Hacker/Hater
C++           Rusty

Right now I'm just having fun improving other people's Python. That's probably not what this is all about, but I like it.

Automatic pylint

The current state of this project can be found here: rosetta_pylint.py


  1. use the MediaWiki API to grab a list of the pages in Category:Python
    • The MediaWiki API is pretty straightforward. I feel done with that part.
  2. grab the HTML for those pages and load each one into a DOM
    • I'm having trouble getting any of the builtin HTML or XML parsers to give me a DOM. HTMLParser is just a crude little state machine, and the XML parsers are too strict (&nbsp; is an 'unknown entity').
    • I've posted a Stack Overflow question on this subject here. --Bukzor 16:31, 20 April 2010 (UTC)
    • Despite everyone agreeing that Python doesn't have a builtin HTML->DOM parser, I've parsed the site A-Z with ElementTree with minimal effort. I had to fix a bunch of invalid HTML though. Look at my edits from the previous couple of days for details.
  3. select for "python" as a CSS class, and get lumps of Python code.
    • Now I have ~700 Python snippets that I'm working on pylint'ing and analyzing. --Bukzor 01:29, 24 April 2010 (UTC)
  4. automate feeding that code through pylint
  5. save a report of pages->scores
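
The steps above could be sketched roughly as follows. This is not the contents of rosetta_pylint.py, just a minimal illustration under some assumptions: API_URL is a placeholder endpoint, the function names are made up for this example, swapping &nbsp; for a numeric entity is one possible workaround for the strict XML parser, and the score is assumed to appear in pylint's text report as "rated at X/10".

```python
import re
import subprocess
import sys
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the wiki's real api.php location.
API_URL = "https://example.org/w/api.php"


def category_query_url(category="Category:Python", limit=500):
    """Step 1: build a MediaWiki API query listing the pages in a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def extract_snippets(xhtml, css_class="python"):
    """Steps 2-3: parse a page, return the text of elements with css_class."""
    # Assumed workaround: XML parsers reject &nbsp;, so use a numeric entity.
    root = ET.fromstring(xhtml.replace("&nbsp;", "&#160;"))
    return [
        "".join(elem.itertext())  # flatten nested highlighting spans
        for elem in root.iter()
        if css_class in elem.get("class", "").split()
    ]


def run_pylint(path):
    """Step 4: run pylint on one file and return its text report."""
    result = subprocess.run(
        [sys.executable, "-m", "pylint", path],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def parse_score(report):
    """Step 5: pull the 'rated at X/10' score out of a pylint report."""
    match = re.search(r"rated at (-?\d+(?:\.\d+)?)/10", report)
    return float(match.group(1)) if match else None
```

Writing each snippet to a temporary file, passing it to run_pylint, and storing parse_score's result in a dict keyed by page title would give the pages->scores report. Pages that are not well-formed XML will still raise ET.ParseError, which matches the need to fix invalid HTML by hand.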

--Bukzor 03:31, 26 April 2010 (UTC)