Rosetta Code/Fix code tags

From Rosetta Code
Revision as of 23:54, 29 January 2009 by rosettacode>Guga360
You are encouraged to solve this task according to the task description, using any language you may know.

Fix Rosetta Code's deprecated code tags according to these rules, where %s stands for a language name:

Change <%s> to <lang %s>
Change </%s> to </lang>
Change <code %s> to <lang %s>
Change </code> to </lang>
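
As a worked example with python in place of %s, the four rules can be applied one by one to a short string. This is only an illustration: the sample text and the names `sample` and `fixed` are made up for the example and are not part of the task.

```python
# Hypothetical wiki snippet that uses both deprecated tag styles.
sample = "<python>print 1</python> and <code python>print 2</code>"

fixed = (sample
         .replace("<python>", "<lang python>")       # rule 1: <%s> -> <lang %s>
         .replace("</python>", "</lang>")            # rule 2: </%s> -> </lang>
         .replace("<code python>", "<lang python>")  # rule 3: <code %s> -> <lang %s>
         .replace("</code>", "</lang>"))             # rule 4: </code> -> </lang>

print(fixed)
# <lang python>print 1</lang> and <lang python>print 2</lang>
```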

Usage:

cat wikisource.txt | ./convert.py > converted.txt

Python

Syntax highlighting is not applied to this example, because the source itself contains a literal </code> tag.

#!/usr/bin/env python
# coding: utf-8

import sys
import re

langs = ["<"+i+">" for i in ['ada', 'cpp-qt', 'pascal', 'lscript', 'z80', 'visualprolog', 'html4strict', 'cil', 'objc', 'asm', 'progress', 'teraterm', 'hq9plus', 'genero', 'tsql', 'email', 'pic16', 'tcl', 'apt_sources', 'io', 'apache', 'vhdl', 'avisynth', 'winbatch', 'vbnet', 'ini', 'scilab', 'ocaml-brief', 'sas', 'actionscript3', 'qbasic', 'perl', 'bnf', 'cobol', 'powershell', 'php', 'kixtart', 'visualfoxpro', 'mirc', 'make', 'javascript', 'cpp', 'sdlbasic', 'cadlisp', 'php-brief', 'rails', 'verilog', 'xml', 'csharp', 'actionscript', 'nsis', 'bash', 'typoscript', 'freebasic', 'dot', 'applescript', 'haskell', 'dos', 'oracle8', 'cfdg', 'glsl', 'lotusscript', 'mpasm', 'latex', 'sql', 'klonec', 'ruby', 'ocaml', 'smarty', 'python', 'oracle11', 'caddcl', 'robots', 'groovy', 'smalltalk', 'diff', 'fortran', 'cfm', 'lua', 'modula3', 'vb', 'autoit', 'java', 'text', 'scala', 'lotusformulas', 'pixelbender', 'reg', '_div', 'whitespace', 'providex', 'asp', 'css', 'lolcode', 'lisp', 'inno', 'mysql', 'plsql', 'matlab', 'oobas', 'vim', 'delphi', 'xorg_conf', 'gml', 'prolog', 'bf', 'per', 'scheme', 'mxml', 'd', 'basic4gl', 'm68k', 'gnuplot', 'idl', 'abap', 'intercal', 'c_mac', 'thinbasic', 'java5', 'xpp', 'boo', 'klonecpp', 'blitzbasic', 'eiffel', 'povray', 'c', 'gettext']]

text = sys.stdin.read()

# Rule 1: <%s> becomes <lang %s> for every known language.
for tag in langs:
    text = text.replace(tag, tag.replace("<", "<lang "))

# Rule 2: </%s> becomes </lang>.
for tag in langs:
    text = text.replace(tag.replace("<", "</"), "</lang>")

# Rule 3: <code %s> becomes <lang %s>. [^>]* stops at the first ">", so only
# the tag itself is rewritten, never the code that follows it.
text = re.sub(r"<code\b([^>]*)>", r"<lang\1>", text)

text = text.replace("</code>","</lang>")

# Write the result without an extra trailing newline (works on Python 2 and 3).
sys.stdout.write(text)
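
The same rules can also be written as two regular-expression substitutions wrapped in a function, which is easier to test than a script that reads stdin. This is only a sketch: the function name `fix_tags` and the shortened `LANGS` list are assumptions made for the example, and unlike the script above it rewrites <code %s> only when %s is a listed language.

```python
import re

# Shortened, hypothetical language list; a real run would use the full list.
LANGS = ["python", "c", "cpp", "java"]


def fix_tags(text, langs=LANGS):
    """Return text with deprecated code tags rewritten as <lang> tags."""
    names = "|".join(re.escape(name) for name in langs)
    # Rules 1 and 3: <%s> or <code %s> becomes <lang %s>.
    text = re.sub(r"<(?:code )?(%s)>" % names, r"<lang \1>", text)
    # Rules 2 and 4: </%s> or </code> becomes </lang>.
    text = re.sub(r"</(?:%s|code)>" % names, "</lang>", text)
    return text


if __name__ == "__main__":
    demo = "<cpp>int x;</cpp> <code java>int y;</code> <b>bold</b>"
    print(fix_tags(demo))
    # <lang cpp>int x;</lang> <lang java>int y;</lang> <b>bold</b>
```

Because each pattern is anchored on the full tag, unrelated tags such as <b> and occurrences of the word "code" inside the program text are left alone.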