Talk:Rosetta Code/Rank languages by popularity: Difference between revisions

Revision as of 18:54, 1 April 2013

task clarification

Is it to be assumed that part of the task requirements is to list a certain number of the top-ranked Rosetta Code languages? -- Gerard Schildberger 00:02, 23 July 2012 (UTC)

incorrect sample

The sample output (as part of the task description) is incorrect (as of March 31st, 2013).
It shows the 6th and 7th entries as having the same number of entries (i.e., they are tied), but one is ranked 6th and the other is ranked 7th. There shouldn't be a 7th-place entry; instead there should be two 6th-place entries, and both should be marked as tied or with some such indicator.

Similarly, all but one of the examples from the various languages are also incorrect in this regard.

This is exactly like a race, where there're two (tied) 1st-place winners (gold), and no 2nd-place winner (silver).   First place is shared. Next to cross the finish line is 3rd place (bronze).
-- Gerard Schildberger (talk) 18:53, 1 April 2013 (UTC)

Incorrect examples

ALL examples, except Python "Working Solution" are not working. Please fix. --Guga360 03:49, 29 July 2009 (UTC)

How, exactly, are all other solutions "not working"? Please elaborate. --glennj 10:12, 2 August 2009 (UTC)
Attempting the Python "Working Solution", I'm getting this error using python 2.5.2 --glennj 10:14, 2 August 2009 (UTC)
 $ python rosetta_popular.py
Traceback (most recent call last):
  File "rosetta_popular.py", line 18, in <module>
    for n, i in enumerate(sorted(result,key=lambda x: x[1],reverse=True),start=1):
TypeError: 'start' is an invalid keyword argument for this function
You need Python 2.6. "Not working" examples are only grabbing 500 categories, so programming languages like Tcl or Visual Basic don't get into the top 10. --Guga360 16:48, 2 August 2009 (UTC)
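For anyone stuck on Python 2.5, here is a minimal sketch of a workaround. The start keyword of enumerate was added in Python 2.6, so the offset can simply be added by hand. The function name ranked_lines and the sample list are illustrative only and are not taken from the "Working Solution".

```python
# Sample (language, count) pairs; a hypothetical stand-in for the scraped data.
result = [("Ada", 222), ("Python", 233), ("C", 204)]

def ranked_lines(pairs):
    """Return 'rank. count - name' lines without using enumerate's start keyword."""
    ordered = sorted(pairs, key=lambda x: x[1], reverse=True)
    # enumerate() counts from 0 here; add 1 by hand instead of passing start=1.
    return ["%d. %d - %s" % (n + 1, count, name)
            for n, (name, count) in enumerate(ordered)]

for line in ranked_lines(result):
    print(line)
```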
Actually, the Python implementation isn't correct, either. It omits, for example, AutoHotkey and LotusScript. Compare it with the Perl output. —Underscore 16:16, 30 October 2009 (UTC)
I tried the following:
AutoHotkey is working.
Python is not working
Ruby is not working
TCL webscraping is working
The Perl wikipedia api solution has a lot of dependencies; I couldn't get URI to build on my MacBook --Tinku99 21:01, 16 May 2010 (UTC)

ALGOL 68

I tested ALGOL 68 (I'm using release algol68g-mk16.win32), but it did not work. Do I need a library? --Guga360 22:19, 18 April 2009 (UTC)

PS: I'm currently on Windows; I'll try to restart in Ubuntu later.

Error:

39    http content (reply, "www.rosettacode.org", "http://www.rosettacode.org/w/index.php?title=Special:Categories&limit=500", 0);
      1
a68g16.exe: error: 1: tag "httpcontent" has not been declared properly (detected in conditional-clause starting at "IF" in line 37).

I think it really is a library problem. But I have never programmed in ALGOL; what should I do?

Guga360 08:24, 19 April 2009

Hi Guga360,

This ALGOL 68 implementation is an interpreter, and it would require the library to be linked into the a68g.exe binary. Hence the .exe you have definitely does not have "http content" linked in.

I ran it on Fedora9 OK. I just checked a68g-manual.pdf and it says:

Mark 8, July 2005

  1. Adds procedure http content for fetching web page contents (UNIX)
  2. Adds procedure tcp request for sending requests via TCP (UNIX)
  3. Adds procedure grep in string for matching regular expressions in a string (UNIX)

The key point is the (UNIX) at the end of each line. I am guessing that the tcp library for Windows was different enough from the tcp library for Linux/Unix that the "http content" routine remains both un-ported and broken.

Now I have not tried Algol68g on Ubuntu. I hack around on Fedora. So you MAY be able to use the pre-compiled algol68g RPM for Ubuntu (Does Ubuntu support RPMs?)

BUT the algol68g source tar ball will definitely work on Ubuntu. (Maybe not on a 64bit Ubuntu as the libraries have moved to /lib64 - if so let me know and I'll sort/hack you up a 64bit update)

The compile should be as easy as installing gcc and configure (and if you need them then install postgres or curses or ... ) then:

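tar -xzvf /tmp/download/algol68g-mk16.tgz   # unpack the gzipped source tar ball
cd algol68g-mk16                            # assumed name of the unpacked directory; adjust if it differs
./configure --threads
make
# as root
make install
# as user
a68g Sort_most_popular_programming_languages.a68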

BTW: you are the first person to feed back on running the ALGOL 68 rosettacode code snippets... I am rather impressed. ThanX - (blush)

Is this snippet the first ALGOL 68 that you have tried?

NevilleDNZ 08:40, 19 April 2009 (UTC)

Yes, it's my first try.
I just found that example interesting because it does not use [[1]].
I compiled algol68g right now. (./configure && make && sudo make install)
It compiled perfectly.
But I ran that example, and nothing happened. It's just "loading".
Any suggestions? --Guga360 16:08, 19 April 2009 (UTC)

I am guessing that your program is running, just really slowly; give it 2 minutes to run. Basically, the routine re split used to parse the HTML is really slow: its performance is on the order of O(n²).

I just recoded the ALGOL 68 version to use a linked list; it is a huge improvement:

[nevilled@november rosettacode]$ time  a68g Sort_most_popular_programming_languages_slow.a68 
1. 233 - Python
2. 222 - Ada
3. 204 - C
4. 203 - OCaml
5. 201 - Perl
6. 193 - Haskell
7. 182 - Java
8. 179 - D
9. 178 - ALGOL 68
10. 160 - Ruby

real	0m47.950s
user	0m44.363s
sys	0m0.080s

[nevilled@november rosettacode]$ time  a68g Sort_most_popular_programming_languages.a68 
1. 233 - Python
2. 222 - Ada
3. 204 - C
4. 203 - OCaml
5. 201 - Perl
6. 193 - Haskell
7. 182 - Java
8. 179 - D
9. 178 - ALGOL 68
10. 160 - Ruby

real	0m11.504s
user	0m3.228s
sys	0m0.068s

Sort_most_popular_programming_languages_slow - the original - would have been issuing thousands of calls to malloc.

re: [[2]] I should/could use this link. I hacked out a solution, c.f. the actual code for the re ignore values.

# hack: needs to be manually maintained #
  STRING re ignore ="Programming Tasks|WikiStubs|Maintenance/OmitCategoriesCreated|"+

Yes... it is a hack. I'll try to stitch this in shortly.

NevilleDNZ 20:04, 19 April 2009 (UTC)


Note: ALGOL 68 for Ubuntu now available

NevilleDNZ 09:19, 17 October 2009 (UTC)

Problem description insufficient?

I updated the Ruby solution today when I noticed there were more than 500 categories. Popular (top 10%) languages like Visual Basic .NET and Vedit macro language are left off most lists. --glennj 20:02, 15 June 2009 (UTC)

I found a better way to do this task: query
"http://www.rosettacode.org/w/api.php?action=query&prop=categoryinfo&titles=Category:Python%7CCategory:Tcl%7C...", where the property "pages" should return the number of category members. --Guga360 21:31, 15 June 2009 (UTC)
I completely missed the Perl/Python solutions: first filter on the Programming_Languages category, not all categories. --glennj
Actually, this does not solve the problem. We're still reading Special:Categories, so if there is a language starting with Z, it will probably not get into the list. --Guga360 01:30, 16 June 2009 (UTC)
As of January 2013, there are five Z entries:
  • ZX Spectrum Basic
  • Z80 Assembly
  • ZPL
  • Zonnon
  • ZED

-- Gerard Schildberger 20:19, 7 January 2013 (UTC)
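The categoryinfo query suggested earlier in this thread could be sketched roughly as follows. This is only an illustration: the helper names are made up, the endpoint is the one quoted above, and the JSON layout (query, then pages, then categoryinfo with a pages count) is assumed from the usual MediaWiki format=json output rather than checked against the live site. The sketch builds the URL and parses a response string, and leaves the actual HTTP fetch out.

```python
import json
from urllib.parse import urlencode

API = "http://www.rosettacode.org/w/api.php"  # endpoint quoted in the discussion

def categoryinfo_url(languages):
    """Build one categoryinfo query for a batch of language names."""
    titles = "|".join("Category:" + name for name in languages)
    params = {"action": "query", "prop": "categoryinfo",
              "titles": titles, "format": "json"}
    return API + "?" + urlencode(params)

def member_counts(response_text):
    """Map category title to its 'pages' count (0 if the category has no info)."""
    pages = json.loads(response_text)["query"]["pages"]
    return {page["title"]: page.get("categoryinfo", {}).get("pages", 0)
            for page in pages.values()}

# Hand-written sample response (assumed shape), not real site data.
sample = ('{"query": {"pages": {'
          '"1": {"title": "Category:Python", "categoryinfo": {"pages": 233}},'
          '"-1": {"title": "Category:Nope", "missing": ""}}}}')
print(categoryinfo_url(["Python", "Tcl"]))
print(member_counts(sample))
```

Because the language names are passed in explicitly, a language starting with Z is not lost to a page limit on Special:Categories, provided the list of names itself is complete.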

Ruby example question

I'm trying to run the Ruby example on Ruby 1.8.6 and it says that the "each_slice" method isn't defined. Is that part of 1.8.7? --Mwn3d 19:29, 26 June 2009 (UTC)

I don't know, probably it's for Ruby 1.9, but you can try replacing <lang ruby>langs.each_slice(50) do</lang> with <lang ruby>langs[0..50].each do</lang> --Guga360 22:54, 26 June 2009 (UTC)
It's 1.8.7. I'll mark the example. The indexing solution didn't work, but I used a computer that has 1.8.7 and it works. Thanks for the suggestion, though. --Mwn3d 23:18, 26 June 2009 (UTC)
glennj 18:43, 27 June 2009 (UTC) -- Here's what the 1.8.7 docs say about Enumerable#each_slice
Iterates the given block for each slice of <n> elements
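It could be implemented like<lang ruby>def each_slice(ary, n)
  # Ceiling division, so no empty trailing slice is yielded
  # when ary.length is an exact multiple of n.
  ((ary.length + n - 1) / n).times {|i| yield ary[i*n, n]}
end</lang>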

Redundant task?

I actually asked that on this page on Jan 26, however it was replaced without comment by Guga360. Therefore I'll ask again:

Isn't this task basically a combination of HTTP Request, Regular expression matching and Sorting Using a Custom Comparator? --Ce 11:55, 1 November 2009 (UTC)

Not Redundant — Don't be ridiculous. You might as well argue (by reduction to fundamentals) that it's a combination of basic operations and TCP socket handling. Which it is, but that's missing the whole point. This is a composite task that is focussed on the end goal rather than the specific technique used to achieve it, which is absolutely allowed in the RC rules. –Donal Fellows 13:32, 1 November 2009 (UTC)
BTW, I don't know why Guga360 replaced what you wrote. It seems rather rude to me to do so. –Donal Fellows 13:37, 1 November 2009 (UTC)
There are similarities in the question and answer given here. --Paddy3118 14:47, 1 November 2009 (UTC)
I'll copy the relevant portion of my reply from over there:
As for "redundant" tasks, it doesn't bother me. In fact, I can think of ways to take advantage of it, such as category tagging individual examples with techniques, features or principles they may illustrate, to offer another way to browse and search the site, and to be more illustrative of alternate approaches. (end quote) --Michael Mol 17:16, 1 November 2009 (UTC)


J

The neat thing about the J solution is that it's completely functional/declarative. That is, it contains no imperative code; no instructions on how to process data. It merely describes the result it wants, using four separate domain-specific functional/declarative syntaxes: J, XPath, Regular Expressions, and x2j.

Note: the J solution specifically uses XPath to precisely address the data it wants, rather than relying on the textual format of the XML (e.g. by assuming the <li> will not break across lines), which is not guaranteed, and defeats the purpose of XML to an extent. The regular expressions are not strictly required; they are included simply to increase legibility (self-documentation) and to identify the data of interest (e.g. by excluding uninteresting <li> in the language list, based on their content).

But this doesn't mean the code is specific to this task (e.g. it doesn't implement a MediaWiki-specific interface). The great value of separating the concerns of addressing (XPath), identifying (regex), and transforming (J) the data is precisely that it is general and adaptable.

This declarative nature also explains why the solution is so concise (relatively). But it also omits the optional join on MostLinkedCategories and washes the list through pattern exclusion.

--DanBron 05:52, 26 November 2009 (UTC)

wanted: a complete list

I would like to see at least one (and probably only one) list to be complete. I also would like it updated, say, every month or quarter so we all can see the current state of all languages, not just the top ten or twenty languages. -- Gerard Schildberger 20:17, 21 March 2012 (UTC)

I tried to execute the Ruby example (with what little I knew of that language), but it didn't execute (got an error) with my version of the Ruby language. I know so little that I don't know if the example depended on a certain version, or maybe the libraries that I have don't contain the necessary routines. In any case, I was hoping that someone could run any of the examples and provide a complete list. -- Gerard Schildberger 21:29, 29 June 2012 (UTC)

One reason to have a complete list is to ensure that the example programs are processing all of the languages (or for the languages). That isn't the main reason, but it would/could point out deficiencies in an example's process/algorithm. -- Gerard Schildberger 21:35, 29 June 2012 (UTC)


Since nobody re-ran their examples, I decided to write my own (using REXX) and included an almost full list of the rank of languages on Rosetta Code.

I'll try to update it once a month or so. -- Gerard Schildberger 23:53, 22 July 2012 (UTC)

If somebody else creates a more complete ranking, better filtering program, or an automated version (or more timely), I'll reduce the number of Rosetta Code languages ranked in the REXX output section. -- Gerard Schildberger 00:11, 23 July 2012 (UTC)

Currently, the REXX example reports on all 471 programming languages, but there is code to support the skipping of languages that have fewer than a certain (specified) number of examples. However, listing them all enabled me to find some languages that are "misspelled" as far as case goes (inconsistent upper/lower/mixed spellings). -- Gerard Schildberger 04:51, 5 September 2012 (UTC)

Now, there're 473 programming languages. -- Gerard Schildberger 17:23, 5 September 2012 (UTC)
Now, there're 475 programming languages. -- Gerard Schildberger 20:45, 26 January 2013 (UTC)

Now, there're 481 programming languages. -- Gerard Schildberger 19:41, 30 March 2013 (UTC)

The decrease in the number of programming languages is due to combining the

  • UC++
  • µC++
  • ╬£C++     (unicode version)

languages into one: µC++. -- Gerard Schildberger 20:45, 26 January 2013 (UTC)


In the complete list A+ is ranked 431 although (unfairly) it has no worked tasks. Xanadu, a language with which I am unfamiliar, has one worked task but is ranked 463. --Nigel Galloway 12:29, 21 December 2012 (UTC)

Identically ranked languages (identical in the sense that they have the same number of entries) are sorted in the order in which they first appear in the Rosetta Code list. Thus, some languages aren't ranked fairly because of a (weak) sorting artifact of having the same number (of entries). Strictly speaking, if the following were true:
  • hog   97
  • dog   72
  • auk   72
  • ape     4
  • cow   72
  • gnu   72

The ranking should be:

  • 1           hog
  • 2 (tied)   dog
  • 2 (tied)   auk
  • 2 (tied)   cow
  • 2 (tied)   gnu
  • 6           ape

with all 2nd place names marked as tied for 2nd, and nobody marked as 3rd, 4th or 5th.
These duplicates (tied for placement) would make a good addition to this task (to rank languages correctly) -- or lacking that, a good Rosetta Code task that can stand by itself.
Note that the chicken was disqualified as it wouldn't cross the road. -- Gerard Schildberger 20:48, 7 January 2013 (UTC)
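The tie-aware ranking described above could be sketched like this in Python. It is only an illustration of the idea (sometimes called standard competition ranking), not the REXX implementation; the function name rank_with_ties is made up, and the data is the animal example from this comment.

```python
def rank_with_ties(counts):
    """Ties share a rank; the ranks that follow a tie are skipped."""
    # sorted() is stable, so tied names keep their original relative order.
    ordered = sorted(counts, key=lambda pair: pair[1], reverse=True)
    ranked = []
    for position, (name, count) in enumerate(ordered, start=1):
        if ranked and ranked[-1][2] == count:
            rank = ranked[-1][0]   # same count as the previous entry: share its rank
        else:
            rank = position        # otherwise the rank is the 1-based position
        ranked.append((rank, name, count))
    # A rank is "tied" when more than one entry holds it.
    holders = {}
    for rank, _, _ in ranked:
        holders[rank] = holders.get(rank, 0) + 1
    return [(rank, holders[rank] > 1, name, count) for rank, name, count in ranked]

animals = [("hog", 97), ("dog", 72), ("auk", 72), ("ape", 4), ("cow", 72), ("gnu", 72)]
for rank, tied, name, count in rank_with_ties(animals):
    print("%d%s %s" % (rank, " (tied)" if tied else "", name))
```

With the sample data this puts hog at 1, the four 72-entry names at 2 (tied), and ape at 6, with nobody at 3, 4 or 5.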

Update notice:   the above (the ranking of tied languages) has been implemented into the REXX program example. -- Gerard Schildberger 19:41, 30 March 2013 (UTC)

According to its task page A+ has no tasks implemented. It seems as if a language with no tasks implemented is treated as if it has three:
rank: 441 (3 entries) A+
--Nigel Galloway 14:51, 26 January 2013 (UTC)

I took the task's requirements quite literally:

Sort most popular programming languages based in number of members in Rosetta Code categories
(from http://www.rosettacode.org/mw/index.php?title=Special:Categories&limit=5000)

(The bold font was added by me.)   Note that it didn't say implementations, but members.
I think implementations is what most people (most likely) thought was wanted, but there ya have it. -- Gerard Schildberger 19:45, 26 January 2013 (UTC)

case of names of programming languages

In producing a complete list (for the REXX example), I found several examples (errors?) in the case (upper/lower/mixed) of the names of some programming languages that somebody may want to correct (at least, make them consistent as far as their case).

  • ANT, ant
  • AutoIt, AutoIT
  • BASIC, Basic
  • Bc, BC
  • C sharp, C Sharp
  • F Sharp, F sharp
  • Gdl, GDL
  • HaXe, Haxe
  • Maple, MAPLE
  • MATLAB, Matlab
  • NewLISP, Newlisp
  • OoRexx, OOREXX
  • OpenEdge/Progress, Openedge/Progress
  • Run BASIC, Run Basic
  • UC++, µC++
  • (unicode µC++)     [previously displayed as   ╬£].

All the above names (both cases) appear as different entries in the categories section and have their own count.

Below are the specific counts for each case:

  • rank: 449 (3 entries) ANT
  • rank: 476 (1 entries) Ant
  • rank: 126 (39 entries) AutoIt
  • rank: 455 (2 entries) AutoIT
  • rank: 53 (186 entries) BASIC
  • rank: 424 (3 entries) Basic
  • rank: 117 (42 entries) Bc
  • rank: 463 (1 entries) BC
  • rank: 22 (412 entries) C sharp
  • rank: 482 (1 entries) C Sharp
  • rank: 50 (206 entries) F Sharp
  • rank: 470 (1 entries) F sharp
  • rank: 377 (3 entries) Gdl
  • rank: 475 (1 entries) GDL
  • rank: 178 (19 entries) Haxe
  • rank: 416 (3 entries) HaXe
  • rank: 266 (6 entries) Maple
  • rank: 480 (1 entries) MAPLE
  • rank: 43 (253 entries) MATLAB
  • rank: 468 (1 entries) Matlab
  • rank: 94 (67 entries) NewLISP
  • rank: 316 (4 entries) NewLisp
  • rank: 71 (128 entries) OoRexx
  • rank: 457 (2 entries) OOREXX
  • rank: 111 (46 entries) OpenEdge/Progress
  • rank: 454 (2 entries) Openedge/Progress
  • rank: 59 (178 entries) Run BASIC
  • rank: 471 (1 entries) Run Basic


(updated) -- Gerard Schildberger 01:25, 17 November 2012 (UTC)
(updated)--- Gerard Schildberger 17:30, 5 September 2012 (UTC)
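For scraping tools that want to fold such pairs together, a rough Python sketch follows. It is an assumption-laden illustration, not the BBC BASIC or REXX approach: the function name is made up, the merged entry keeps whichever spelling has more entries (an arbitrary choice), and simply adding the two counts assumes no page is a member of both spellings.

```python
def merge_case_variants(counts):
    """Combine entries whose names differ only in upper/lower case."""
    groups = {}
    for name, count in counts:
        groups.setdefault(name.lower(), []).append((name, count))
    merged = []
    for variants in groups.values():
        # Keep the spelling with the most entries; sum the counts of all spellings.
        preferred = max(variants, key=lambda pair: pair[1])[0]
        merged.append((preferred, sum(count for _, count in variants)))
    return merged

# A few of the pairs listed above, with their entry counts.
sample = [("BASIC", 186), ("Basic", 3), ("MATLAB", 253),
          ("Matlab", 1), ("Haxe", 19), ("HaXe", 3)]
for name, count in sorted(merge_case_variants(sample), key=lambda p: -p[1]):
    print(name, count)
```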

Thanks Gerard. I think this kind of info is useful for people writing other site scraping tools. --Paddy3118 07:09, 5 September 2012 (UTC)
I was hoping for people to step up to the plate and fix the various cases as I lack the tools to locate the inconsistencies (or hell's bells, even the knowledge of which case it should be). I looked at each of the four ANT (or Ant?) examples, but within the comments of the programs, various cases were used, so I couldn't tell which one is correct (or even preferred), and I certainly don't want to be the one correcting something like that. I thought if I pointed out the problems, they would magically get fixed (by the case fairy or something). Anybody with a magic wand to do a "presto, chango!" ? -- Gerard Schildberger 16:26, 5 September 2012 (UTC)
:-) but no magic wand, nor even pixie dust. --Paddy3118 16:47, 5 September 2012 (UTC)
You missed one: C Sharp/C sharp. There are 14 such pairs, which is why the BBC BASIC solution (which merges them) lists a total of 470 languages compared with REXX's 484. RichardRussell 00:59, 17 November 2012 (UTC)
That particular entry was most likely added after I ran the above run in August of 2012. I haven't redone it to find more mixed-case ... er, cases. -- Gerard Schildberger 01:13, 17 November 2012 (UTC)
I have updated the (above) list. -- Gerard Schildberger 01:54, 17 November 2012 (UTC)

By the way, as the above get fixed/corrected (even partially), I'll whittle down the list (the list of incorrect cases of languages) so eventually (hopefully), there won't be a list anymore. I hope that changing the "incorrect case" list won't goof up the history of this section, but I hate to leave errors hanging around long after they get fixed/corrected. -- Gerard Schildberger 16:35, 5 September 2012 (UTC)

unicode characters in languages

The REXX program translates the unicode   µC++   into the ASCII-8   µC++.

Along with that, the REXX program also translates   UC++   into ASCII-8   µC++   to be consistent.

This reduces the language count by one.

Previously, the REXX program was displaying the unicode   µC++   as   ╬£. -- Gerard Schildberger 19:42, 24 January 2013 (UTC)

Also added the unicode translation of   ╨£╨Ü-61/52   (Cyrillic   МК-61/52)   into   MK-61/52. -- Gerard Schildberger 20:24, 15 February 2013 (UTC)

REXX code for other unicode versions of programming languages has been added since then. -- Gerard Schildberger 19:45, 30 March 2013 (UTC)
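As an aside for tools written in other languages, the garbled names shown above look like UTF-8 bytes displayed in code page 437. Under that assumption (it is a guess about the terminal, not something stated by the REXX example), a small Python sketch can undo the garbling; the function name is made up.

```python
def repair_cp437_mojibake(text):
    """Undo UTF-8 bytes that were displayed as code page 437 characters."""
    try:
        return text.encode("cp437").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text  # not this kind of garbling; leave the name unchanged

for garbled in ["╬£C++", "╨£╨Ü-61/52", "C++"]:
    print(repr(garbled), "->", repr(repair_cp437_mojibake(garbled)))
```

Note that the first name comes back with a Greek capital letter Mu (U+039C) rather than the micro sign, so a further mapping step would still be needed to arrive at µC++; the second comes back as the Cyrillic МК-61/52.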