Talk:Set of real numbers

Not making this a draft task yet: anyone see glaring problems? --Ledrug 04:32, 30 September 2011 (UTC)

Looks fine to me; I've marked as a draft. The only thing I'd note is that I wouldn't require the creation of a datatype; that's just one way of doing it — a particular abstraction, if you will — that not all languages would choose to do (for example, you could write a turing machine description which would do the operations, but which would not have datatypes, as it doesn't have those abstractions). That's really a minor quibble though. –Donal Fellows 09:57, 30 September 2011 (UTC)

The optional task seems to require a particular implementation. To do the compulsory part, a set could be represented by a boolean expression, and the set arithmatic by and, or or'ing the boolean expressions of input sets to create a boolean expression for the output set. Testing a number for inclusion would involve evaluation the boolean expression. I am unsure of the details of what the optional part requires, but if it needed the calculation of the sum of the range of all numbers included in the set then that ... ...may take extra thought. (Nice subject for a task though). --Paddy3118 11:00, 30 September 2011 (UTC)

A set over real domain is uncountable, so you definitely can't test insideness of individual numbers and sum the count. I specifically mentioned convex sets (simple ranges) because that's the most likely building block of more complex sets, and if you go that route, total length of a set is just the sum of all its contained non-overlapping range lengths.

@Dkf: I'll reword the datatype thing. --Ledrug 11:16, 30 September 2011 (UTC)

This depends on an implementation for "real numbers" and it's not clear whether floating point is considered adequate in this role. --Rdm 17:57, 30 September 2011 (UTC)

The name of the task is somewhat unfortunate, since only a (rare IMO) subset of the sets of real numbers are included. I'd think that "intervals of real numbers" would be more accurate. Any other ideas for names?

Otherwise the task seems fine to me.

CRGreathouse 08:17, 21 November 2011 (UTC)

Indeed, the number of sets of real numbers describable by this task is countable; whereas the number of sets of real numbers is uncountable. (In fact, the real numbers are already uncountable; its power set has an even bigger cardinality than the real numbers.) So, in fact, this task covers a measure-zero subset of the sets of real numbers. --Spoon! 10:07, 21 November 2011 (UTC)

Wolfram Mathworld and Perl

I copied one of the equations, |sin(pi x2)| > 1/2, 0 < x < 10 into mathworld and got the result:

sqrt(12 * n + 1) / sqrt(6) < x < sqrt(12 * n + 5) / sqrt(6), 0 <= n <= 49

For n = 49 I get the range component: (9.9079092984679, 9.941495528004495) which differs from the Perl: (9.96, 9.99). --Paddy3118 07:04, 2 October 2011 (UTC)

Aren't there two sets of solutions? --Ledrug 07:32, 2 October 2011 (UTC)

I took the two solutions as meaning that there were two ways of expressing the same thing rather than being parts of a whole as they use the word solutions plural. I've just checked and the other does give (9.958246164193104, 9.991663191547909), so all seems well. --Paddy3118 07:47, 2 October 2011 (UTC)

Er I'll pretend this is an American/British English thing, but does it mean "x^2 = 1 has two solutions: x = 1, x = -1" is counterintuitive? Also if you are familiar with Mathematica, you would have seen the composite solutions all the time:<lang>In[1]:= Solve[x x == 1, {x}]

Out[1]= {{x -> -1}, {x -> 1}}</lang> It's not supposed to repeat the same solution in a equivalent but different form. --Ledrug 07:59, 2 October 2011 (UTC)

Unfortunately I can't hide under that excuse. I don't use Mathematica, and don't often use their web site. It just looked like in their list of information on what I input that they had decided to give two solutions rather than one solution that has two components. Looking again at their web page, they don't link the two - they actually separate them with a horizontal bar the first level of visual grouping that includes them both has the heading solutions.
Your output from the Mathematica program that you give above is quit explicit in showing one answer with two parts by its use of brackets. I live and learn. Thanks. --Paddy3118 11:57, 2 October 2011 (UTC)

Extra Credit?

What is the point of the extra credit? Finding length seems to have little to do with set membership. --Rdm 20:24, 2 October 2011 (UTC)

It's mostly for the usefulness of the set implementation in practice:

The length sum is the set's Lebesgue measure, which gives you a sense of the set's size, analogous to number of elements in a countable set. It can be useful for numeric methods.
If your set implementation allows a straightforward calculation of the length, the same method might be easily extended to do other things such as integrating a function over a set. The length of set A itself can be thought as integrating constant function f(x) = 1 for x in A.
If for neither of the above, the optional goal can be used as a measure of efficiency of your implementation. It easily specifies a relative large number of continuous regions (100ish) that you need to deal with for the set arithmetic. --Ledrug 23:15, 2 October 2011 (UTC)

Nevertheless, finding the zeros of an arbitrary computation has little to do with this task. The easiest way of solving the extra credit in a language that does not already implement the required zero finding involves manipulation of the underlying expressions by the programmer -- something that can be easier to do outside of the context of set notation. Though it's true that the set implementation might be used to determine which of the regions bounded by the zeros are in the set and which of those reason are outside of the set.

Put differently, it's a modularity violation. Simple zero finding algorithms, like hill climbing, are going to be baffled by the interface provided by set membership -- there is no slope. So that pushes the implementation of this algorithm inside the set implementation. But zero finding becomes arbitrarily complex when presented with arbitrary computations. --Rdm 10:48, 3 October 2011 (UTC)

Actually, as written, the zero finding is part of the task, which I agree does seem like a large task in itself. Maybe the optional part of the task should be made easier?

How about the optional task being changed to a description of what the length is, followed by "find the length of the set formaed by <combination of ~5 simple sets>" ? --Paddy3118 16:10, 3 October 2011 (UTC)

The optional work requires a way to define the sets, it didn't ask you to systematically solve any equations to derive the set because that's irrelevant. The perl program I wrote didn't solve |sin(x)| = 1/2 in the program either. And, |sin(x)| = 1/2 means x = (n +/- 1/6) Pi, it's not rocket science. How much easier does it need to be? --Ledrug 00:34, 4 October 2011 (UTC)

The issue is not that the requested numbers are not easy to find -- it's easy to solve without using the set implementation at all. But an approach that meaningfully uses the set implementation to find the lengths is hard to imagine. --Rdm 01:06, 4 October 2011 (UTC)

It depends on your implementation. If it's not useful for finding the length of, or "iterating" the set, don't do the optional. This is expected and the optional work is optional for a reason. --Ledrug 01:13, 4 October 2011 (UTC)

In answer to: "How much easier does it need to be?"

See: How about the optional task being changed to a description of what the length is, followed by "find the length of the set formaed by <combination of ~5 simple sets>" ?

The maths in the optional part is largely nothing to do with the subject of the main task, forcing the little that is, to be obscured. Some competancy in programming in a language can be taken for granted in the audience, but that does not necessarily translate to a ready grasp of this branch of mathematics. You state that the answer to your equation is simple, it is not germane however, and not everyone thinks it is simple.

--Paddy3118 05:14, 4 October 2011 (UTC)

"Computer representation of real numbers as a set" is so much more complicated than "solving sin(x) = +/- 1/2 over real domain" that, if you don't have "a ready grasp of this branch of mathematics" (what branch, really?) you probably don't want to deal with this task. For the optional goal, I want an easy and unambiguous specification of a somewhat large number of disjoint regions of real numbers, that 1) is easy to program for but not easy, at least not pleasant, to write down completely by hand; 2) is not periodic, so you need a general approach instead of just calculate the first few numbers and multiply the result by 10; 3) whose intermediate results are easy to check, say, against a plotted curve; 4) does not require more than high school math. Also the required part of the task already asked for combinations of 2 simple sets, it's highly redundant to ask for 5. I don't think the above requirements are so very extraordinarily steep for a program that's supposed to deal with real numbers, and frankly I'm surprised that complaints so far have been about sin(x) = 1/2 instead of more crucial stuff like floating point precision, representibility of real sets on a computer, ambiguity of infinities, etc. --Ledrug 06:16, 4 October 2011 (UTC)

So, ok.. I have posted an implementation of the "Optional Work". Note that most of the computation was about finding the boundaries for the intervale. Note also that none of the "Optional Work" had anything to do with the implementation for the base task. Note also that (once I had the locations of the interval boundaries) the final computation was much simpler than the implementation of the base task. (I believe that this is because intervals are easier to represent as sequences than as sets, but also the initial implementation cared about inclusive vs. exclusive bounds and that's not relevant for the optional part. Another issue is that the base task gave us numeric bounds where the optional part required us to compute them.)

Typically, "extra credit" tasks incorporate the base task. Since that apparently that is neither desired nor expected here, perhaps the "Optional Work" section should have a note that we do not need to find the set implementation useful for the "Optional Work"? --Rdm 17:57, 4 October 2011 (UTC)

I see what's lacking in the mandetory part of the task now. It should have required an "is empty" check. Some normal set operations are not doable without it, such as subset (⊂). Empty check would have required some way to enumerate set content, and most of the complaints about the optional part would simply go away. As to the computing of boundary, I gave the relations in the optional task, you didn't really need to compute much other than n + 1/6. One curiosity: if intervals are easier to deal with than "sets", why not just impelement sets as intervals? --Ledrug 19:09, 4 October 2011 (UTC)

I am not sure I understand your question, but I was drawing a distinction between sequences of intervals and sets of intervals. My point was that intervals seem to me to be more tractable when treated as members of sequences than when treated as members of sets. But, yes, if you ask for the ability to determine if a set is equal to the empty set then the set implementation I was using would not be adequate. --Rdm 19:45, 4 October 2011 (UTC)

Added a requirement to check if set is empty. It really should have gone into the mandetory part, but that would invalidate the python example, which is probably not fair for now. --Ledrug 20:53, 4 October 2011 (UTC)

Ok, that gets it closer to the sort of data I would need, but it's still not really relevant. The issue is: to test for empty set I need a list of all the potential inflection points, but I do not care if they are currently valid interval bounds, so it was still simpler to just ignore the set data structures and use special case logic for computing lengths.

(Also, this sub-task could use some required examples, but I made some up instead.) --Rdm 00:56, 5 October 2011 (UTC)

I don't really want to specify a test case for empty test. To make available operations more or less complete, you need one of the three (besides ones already in the mandetory part): is empty, set equality, or is subset. Provide any one, and the other two can be easily constructed through other binary operators. It just seemed to me that empty test is likely the least amount of work. And since length calculation would require one of them (or prove me wrong), a test case isn't necessary. As to how you do any of the required or optional task, I'm not really concerned: if all the basic methods are available, the implementation should be reasonably usable, that's all that matters. --Ledrug 01:37, 5 October 2011 (UTC)

As I said before, I am doing my length calculation without any of those three. The data structure I use does not concern itself with the distinction between open and closed interval and thus is not capable of supporting is empty nor set equality nor is subset. That said, the version with empty set support does have enough information to compute length -- I'll try posting an implementation of that for comparison. --Rdm 10:39, 5 October 2011 (UTC)

No, but you are using a way to iterate through the innards of the set implementation, which is what you need to implement any of the three. The reason the empty test can't be done with only the "has element" test is that a real set is uncountable, so you can't even conceptually exhaustively test a list of candadites for insideness via a (potentially infinite) loop. --Ledrug 23:17, 5 October 2011 (UTC)

If partial innards, which are not sufficient to satisfy the requirements of the membership test part of the task, count, then yes. --Rdm 23:34, 5 October 2011 (UTC)

Ok, that's done. It's more than twice as much code, from the same starting point. The number in the answer is also slightly, which surprises me. I did not think that there were enough floating point subtractions involved to accrue that big of an error. --Rdm 20:14, 5 October 2011 (UTC)