Talk:Longest string challenge: Difference between revisions

← Older edit

Talk:Longest string challenge (view source)

Revision as of 14:18, 29 September 2011

9,099 bytes added , 12 years ago

m

→‎Conforms to the specs?: ya

Anonymous user

rosettacode>Dgamey

Revision as of 05:20, 17 August 2011 (view source) rosettacode>Ledrug (→‎bounds checking in C) ← Older edit		Latest revision as of 14:18, 29 September 2011 (view source) rosettacode>Dgamey m (→‎Conforms to the specs?: ya)
(20 intermediate revisions by 6 users not shown)
Line 73: Implement a solution to the basic problem that adheres to the spirit of the restrictions. Describe how you circumvented or got around these 'restrictions' and met the 'spirit' of the challenge. Your supporting description may need to describe any challenges to interpreting the restrictions and how you interpretation. IfYou ~~you~~should ~~make~~state any assumptions, ~~you~~warnings, ~~should~~or ~~state~~other ~~them~~relevant points. This task is likely to encourage multiple different types of solutions. They should be substantially different approaches. Line 104: :I quite like the rewrite. I wouldn't want to go so far as to add something like <nowiki>{{puzzle}}</nowiki> until we've all had a chance to see how things go with this task; and I think the name puzzle might be so easily miss-used on game tasks we have had in the past, that did not need the distinction. --[[User:Paddy3118\|Paddy3118]] 05:57, 16 August 2011 (UTC) :: Thanks Paddy. I think it needs a wee bit more work. I'm not sure if I should add the change to the main page so people don't miss it or give it a couple of days for feedback here. I also think Puzzle is the wrong name for that reason. It could be a stretch, a challenge, or we could add the word programming to it. --[[User:Dgamey\|Dgamey]] 11:51, 16 August 2011 (UTC) ===Ready to leave Draft?=== The task and talk page seems to have been stable for a while. I'm thinking it's about time to remove draft status. If I don't see more activity, I'm thinking that this should be done at the end of September. --[[User:Dgamey\|Dgamey]] 03:22, 9 September 2011 (UTC) ==Similar to [[Averages/Mode]]== Line 143 ⟶ 147: :::::::: I'm not there yet as I want to hear more. I am beginning to think that the intent isn't being stated in a positive way. Restrictions are by definition negative. What was the original intent here? Really, it was to get people to think outside of their particular box. This problem can be solved very conventionally and that's boring and pedestrian. The point perhaps is in your language of choice how would you solve this creatively showing off some of your language capabilities? This also lets people have a little bit of fun with it. The restrictions could be more guidance/example in this case. --[[User:Dgamey\|Dgamey]] 19:56, 14 August 2011 (UTC) ::::::::: If that is is the point, then rather than placing restrictions, just make that point in the task description, and let the implementer make the decision as to how this is best achieved. Maybe say "Demonstrate this creatively showing off some of your language capabilities." [[User:Markhobley\|Markhobley]] 07:46, 9 September 2011 (UTC) :::::::::: Borrowed from this idea in the last revision. I think there is a good balance between rigid restrictions and anarchy :) --[[User:Dgamey\|Dgamey]] 14:17, 29 September 2011 (UTC) ::::::::: (I find I am enjoying following the discussion and ultimately hope we ''can'' get the 'nuance' right to make this a good task). --[[User:Paddy3118\|Paddy3118]] 09:23, 15 August 2011 (UTC) Line 325 ⟶ 332: :::::: It could be replaced by a function which returns the next available size (implemented as a large switch statement). --[[User:Rdm\|Rdm]] 04:03, 17 August 2011 (UTC) ::::::: Heh. I said somewhere on this page before, this task is gimmicky, and I wrote some gimmicky code as a proof of concept -- for fun. Then people came around and started to pile common sense stuff on it: <code>const</code> pointers, <code>fgets</code>, <code>realloc</code>, etc. The problem is, this is not a common sense task. We know that the buffer can be overrun, but what are the consequences? That someone will put the code on a webserver with root priviledge and accept unsanitized data? Or will it be run as a system daemon? I highly doubt it. I'd be fine with a giant, red, flashing warning label about buffer overrun, but making it correct by burying the (somewhat) interesting part with two hundred lines of safety code is going out of hand. I would suggest just slap on a warning label and leave it at that. The task requires a handicapped program, and we have a handicapped program, it's fitting. --[[User:Ledrug\|Ledrug]] 05:20, 17 August 2011 (UTC) :::::::: Now you 've pointed it out, it does verge on the absurd :-)<br> There is also much more talk than task entries. --[[User:Paddy3118\|Paddy3118]] 05:59, 17 August 2011 (UTC) ::::::::: I think I'll post the revised task descriptions with a small tweak about warnings. The intent was for demonstration code with some documentation. It would be fine to point out that the code isn't secure in the description. Now we've got all kinds of secondary functions. I'd rather see comments on the line p = &p[1] and the while loop showing the reader with little C what's happening. ::::::::: A separate task or family of tasks showing various safe vs. unsafe practices might be worth considering. --[[User:Dgamey\|Dgamey]] 11:39, 17 August 2011 (UTC) ::In other words, this program will not block for input: <lang c>#include <stdio.h> main() { Line 333 ⟶ 345: ::Use functions with bounds-checking built-in, such as strncpy. (Or one of Microsoft's _s extensions). --[[User:Short Circuit\|Michael Mol]] 23:24, 15 August 2011 (UTC) ::Alternately, XOR your return value, and use C's "0 is false, nonzero is true" behavior in conditional expression evaluation. --[[User:Short Circuit\|Michael Mol]] 23:24, 15 August 2011 (UTC) === bounds checking in C, take 2 === :I do not think that C code with buffer overflows should ever be considered to be simpler than C code that guards against them. It might be faster, but any apparent simplicity is deceptive since the possibility of buffer overflows pushes complexity out onto the user. That said, gets() here requires strcat(), because gets() drops the end of line character. Meanwhile fgets() does not trim off that character. And bounds checking could be implemented using memset() and then using the implementation's cmp() between a reference and an appropriately choosen spot near the end of buffer. So if we are will to pay the minor increase in complexity to use gets instead of fgets, I am not sure why we are not willing to pay a few extra lines to get an implementation without buffer overflow. --[[User:Rdm\|Rdm]] 16:14, 17 August 2011 (UTC) :: 1) I added a second version without buffer issues. 2) With fgets, you still need to check if the line is too long and was truncated; if it is, we are pretty much screwed and need to be able to read in the rest of the line and append it to the line buffer, which is a lot of work because of the "no arithmetics" clause. The second method does handle arbitrarily long input, but see how much more complicated it is. :: It's not that I particularly love gets or buffer overruns, and "faster" is totally not a concern for this task, the key issue is how easy it is to read the code. The original C code was really just demonstrating how to compare the lengths of two strings without directly measuring them, and I deliberately did not pretend it was safe code (I've ''never'' used gets before); declaring const and fgets, to me, feel like false promise of safety. :: That said, if you can check bounds and truncation correctly without making it overly complicated, feel free: I can't think of a way to do it without integer math, like in the second solution. --[[User:Ledrug\|Ledrug]] 18:27, 17 August 2011 (UTC) ::: Given that the current version does not always work, I do not see anything wrong with a replacement version which does not always work. That said, I think I prefer "exit with an error code" over "crash" for behavior in adverse conditions. But, ok, I'll try for a bounds checking version. --[[User:Rdm\|Rdm]] 18:32, 17 August 2011 (UTC) :::: The code I posted above on this talk page does that: it aborts if buffer would be overrun or line is too long. The second code I posted on task page only aborts on realloc failure and should function correctly if that doesn't happen. ''None'' of the changes made by various people to the original code could garantee a correct result or a timely abort, which is why I don't feel it was worth the trouble. --[[User:Ledrug\|Ledrug]] 18:45, 17 August 2011 (UTC) ::::: What do you think of the version I added? --[[User:Rdm\|Rdm]] 21:02, 17 August 2011 (UTC) :::::: <code>if (!longer(bufend, line))</code> isn't really checking for accumulator overflow, is it? Also, if you use <code>memset</code> to mark unused buffer region, they should to be called every time the accumulator is reset (i.e. found a new longest line), or it would be possible that there's a null byte in the tail region of <code>buf</code> before the reset, but accumulated strings haven't reached there yet, and next <code>if (!longer(bufend, line))</code> can give a false positive on overflow. (I think) --[[User:Ledrug\|Ledrug]] 21:24, 17 August 2011 (UTC) :::::EDIT yes that does happen. Modified the code to this: (LINE_MAX is your 1000000, your buffer would have been <code>#define ACCU_MAX (11LINE_MAX + 1)</code>, but the point is the same:<lang c>#define LINE_MAX 6 #define ACCU_MAX (4LINE_MAX+1) int main() { char line[LINE_MAX]; char buf[ACCU_MAX]; char linend= &line[LINE_MAX - 1]; char bufend= &buf[ACCU_MAX - LINE_MAX - 1]; char last = buf; char next = buf; memset(line, 1, LINE_MAX); memset(buf, 1, ACCU_MAX); buf[0]= buf[ACCU_MAX - 1]= 0; while (fgets(line, LINE_MAX, stdin)) { if (!linend) exit(1); if (longer(last, line)) continue; if (!longer(bufend, line)) exit(1); if (longer(line, last)) next = buf; last = next; strcpy(next, line); while (next) next = &next[1]; }</lang> and it aborts on the input <code>echo "aa\nbb\ncc\ndd\nee\nff\naaa" \| ./a.out</code> but works if you remove any of the doubles. --[[User:Ledrug\|Ledrug]] 21:28, 17 August 2011 (UTC) :::::: I do not consider "false positives" to be a bug. That's a (very minor) efficiency issue, but the point was to avoid buffer overflows, not to squeeze every last ounce out of the buffer. So I have one extra byte in the line buffer, and an extra line buffer sized region in the big buffer. If you prefer, imagine that those particular bytes were part of other variables. --[[User:Rdm\|Rdm]] 21:35, 17 August 2011 (UTC) :::::: Eh. If it's garanteed to fail under some situations where it has the resource to succeed, it's not an efficiency issue, it's a correctness issue. It's not even hard to fix:<lang c> while (fgets(line, LINE_MAX, stdin)) { if (!linend) exit(1); if (longer(last, line)) continue; if (longer(line, last)) { memset(buf, 1, ACCU_MAX); buf[0]= buf[ACCU_MAX - 1] = 0; next = buf; } if (!longer(bufend, line)) exit(1); last = next; strcpy(next, line); while (next) next = &next[1]; }</lang> --[[User:Ledrug\|Ledrug]] 21:45, 17 August 2011 (UTC) ::::::: Bytes from <code>bufend</code> on were not "resources" they were "protection". So no, I do not agree that it's a correctness issue. If that's a correctness issue, you might as well argue that a program is incorrect because some memory on the heap has not been allocated. Your ACCU_MAX is the moral equivalent of my <code>bufend</code>. I could agree to calling this an efficiency issue, but not a correctness issue. --[[User:Rdm\|Rdm]] 22:23, 17 August 2011 (UTC) :::: (deindent) Ok I guess that's what's confusing about the code. You could check available buffer at a more natural place: <code>next+1</code>. That's where you'd copy the line string to to begin with:<lang c> char line[LINE_MAX] = {0}; char buf[ACCU_MAX] = {0}; char last = buf; char next = buf; while (fgets(line, LINE_MAX, stdin)) { if (line[LINE_MAX - 2]) exit(1); if (longer(last, line)) continue; if (longer(line, last)) { memset(buf, 1, ACCU_MAX); buf[0] = buf[ACCU_MAX - 1] = 0; next = buf; } if (line && longer(&line[1], &next[1])) exit(1); strcpy(next, line); for (last = next; next; next = &next[1]); }</lang> (I didn't define the LINE_MAX and ACCU_MAX just to be different, it's easier to change the values for testing. They ''are'' related to your <code>bufend</code>). There's still one issue: <code>echo -n "a\naa\naaa"\|./a.out</code> fails, which has nothing to do with the modifications I made. --[[User:Ledrug\|Ledrug]] 23:10, 17 August 2011 (UTC) ::::: The treatment of the <code>echo -n "a\naa\naaa"</code> case depends on the definition of a "line". Since "line" was not defined in the task, I am comfortable saying that that represents an ill-formed file and thus cannot represent a correctness issue. (If I cared about that case, I would exit(1) for any line that did not end with a newline character.) --[[User:Rdm\|Rdm]] 23:29, 17 August 2011 (UTC) == Boring solution v. restrictions == Line 344 ⟶ 423: every write(!L) end</lang> == Conforms to the specs? == Hoping to confirm that the AHK solution didn't break any rules : Looks like it meets the spirit just fine. --[[User:Dgamey\|Dgamey]] 14:18, 29 September 2011 (UTC)