Talk:Commatizing numbers

several numbers in the input

shouldn't 2000 become 2,000 ? --Walterpachl (talk) 05:07, 11 April 2014 (UTC)

If you're referring to the 4^th string to be commatized ===US$0017440 millions'..., then no. The 1^st numberic string is 0017440, not 2000 (which is the 2^nd numeric string). -- Gerard Schildberger (talk) 05:26, 11 April 2014 (UTC)

Ambiguous task spec. Oddly specific for some things while it glosses over others completely.

It seems to me that this task is really 3 subtasks that have been mashed together. That isn't necesarily a problem, but there is no separation of subtasks in the task description. It is all or nothing. As I interpret it, the first, and ostensibly the primary task, is to add appropriate separators to to numbers to display them as numeric strings. That one I have no problem with. It is a fine, useful and commonly performed operation.

The second subtask, as I read it, is to algorithmically pick out numeric strings from mixed alphanumeric strings. Again, a fine, useful operation, though a lttle more touchy-feely and open to interpretation.

The third subtask, and the one I have the real problem with, is to algoritmically determine WHAT a particular numeric string represents so the "correct" arbitrary, inconsistent, seemingly pulled-out-of-the-air set of rules can be applied. I would be happier if the spec followed wikipedias guidelines for digit grouping (which incidentally doesn't mention scientific notation at all) but that's neither here nor there. Task writers are free to impose arbitrary restrictions.

My issue is that the task implys and assumes that there is some general algorithm that can be determined, but language is too messy and inconsistent to do that. Numeric strings VERY commonly are identifiers, not numbers: zip codes, phone numbers, credit card numbers, dates (it could be argued that 2014 is numeric, but really it just identifies a certain period of time to make it easier to refer to it). How about numbers in binary? hexidecimal? base 36?

Perhaps if somone could demonstrate how the current implementations "correctly" commatize the following strings:

   "The number 8675309 had to be taken out of service in every area code after Tommy Tutones song 'Jenny' was released"
   "Including in Beverly Hills 90210"
   "Use the credit card number 6011123456789876"
   "1946 was a good year for some; 6/9/1946 was a good day at most." 
   "hex 1234e56 = decimal 1091030"
   "bogus₍₃₆₎ = 1001010110101011001010100₍₂₎." 
   "123456789 is often used as an 'unused' demonstration social security number."

If the answer is "Well you need to know what the numeric string represents ahead of time and only use when appropriate.", then don't pretend that there is some general algorithm that can be applied. Knowing ahead of time is not an algorithm.--Thundergnat (talk) 15:54, 19 April 2014 (UTC)