Talk:Merge and aggregate datasets: Difference between revisions
- My apologies - The task description does allow input from other than .csv files. --Paddy3118 (talk) 13:07, 5 January 2021 (UTC)
- Thanks. The description is more clear now. Garbanzo (talk) 06:07, 6 January 2021 (UTC)
Latest revision as of 06:07, 6 January 2021
Duplication of task goals if not task name
So... this task is pretty much an exact duplicate of CSV data manipulation, which has been around for 7+ years and has some 85 entries. Admittedly this task has slightly better defined goals and is less trivial, but a large percentage of the code from there could be lifted and used unchanged here.
Some overlap of tasks is inevitable, and honestly I think this one is probably more useful to demonstrate working with real-world data than the other. I hesitate to make any unilateral decisions (unlike with the recent deluge of "Find words containing whatever" tasks that we've been hit with), but I also don't want to needlessly proliferate trivial variations. Thoughts? --Thundergnat (talk) 19:16, 7 December 2020 (UTC)
- Missing fields in the CSV files. There might be a lot of overlap, but no "exact duplication", and handling of missing fields, although not highlighted, is a significant difference I think. --Paddy3118 (talk) 19:40, 7 December 2020 (UTC)
- My motivation to submit this task was that I recently was working with R-script for the first time. I'm reasonably experienced with programming but had quite a hard time getting it to work.
- The examples and tutorials on stackoverflow and other places are generally either too trivial, or too specific for one exact use-case. Merging, grouping and aggregating different datasets is something I encounter a lot in my work.
..."two datasets as provided in .csv files"...
Many examples don't read the csv from files. --Paddy3118 (talk) 19:42, 7 December 2020 (UTC)
- Loading from a .csv file is the shortest code and more practical, but for quickly copying and testing the code examples, hard-coded data is easier. So, when possible, I try to include both, and then comment out the .csv load lines; see the Python example code. --BdR (talk) 21:58, 7 December 2020 (UTC)
- Agreed, I find it very useful to have demos that "just run", and like you I usually add comments that show how to read the exact same stuff from a file. I have also added some links to related tasks. --Pete Lomax (talk) 07:25, 8 December 2020 (UTC)
- Of course that "just run" is more than just a little bit handy for repl-it, tio, and the like. --Pete Lomax (talk) 02:35, 10 December 2020 (UTC)
- The task says "Either load the data from the .csv files or create the required data structures hard-coded." so I took that to mean it wasn't required. The current implementations cover the full spectrum. Go, SQL, Wren, and now C++ took the hard-coded approach. Perl and Raku parse a text block. Julia, Phix, and R work as-if they are reading a file. Python, REXX, and SPSS actually do read .csv files. To me the interesting part of this task is combining the tables - I think this is the only task to do that. Reading from a .csv is covered by the CSV data manipulation task. What should be required?
Garbanzo (talk) 03:49, 5 January 2021 (UTC)
Cleaned Note
I removed the note about generalized programming languages. The solutions may not be as clean as in a specialized language, but it should still be possible. I also hard-coded the data for the C++ entry. To me, the interesting part of this task is joining two tables and dealing with nulls. Garbanzo (talk) 03:05, 4 January 2021 (UTC)
- If you don't complete the task by being able to read the files, then the C++ solution is not as comparable to the solutions that implement the task. Yes, it is setup, but reading from csv files is a pretty common way of getting data for your "interesting bits".
- If a very well known and easy to use source of C++ libraries (Boost?) has a csv reader then you could employ that, but I'm not a great C++ programmer. --Paddy3118 (talk) 14:29, 4 January 2021 (UTC)
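For reference, the pattern discussed in this thread (hard-coded data that "just runs", with the file-reading lines kept as comments, followed by a join that tolerates missing fields) could be sketched roughly as below. This is only an illustration: the column names, the sample values, the file name `left.csv`, and the helper names `read_rows` and `merge_and_aggregate` are all made up here and are not taken from the task or from any existing entry.

```python
import csv
import io

# Hypothetical sample data, invented for illustration (not the task's data).
LEFT_CSV = """ID,NAME
1,alpha
2,beta
3,
"""
RIGHT_CSV = """ID,SCORE
1,5.0
1,
2,7.5
"""


def read_rows(source):
    """Parse CSV text from a file-like object; empty fields become None."""
    reader = csv.DictReader(source)
    return [{k: (v if v != "" else None) for k, v in row.items()} for row in reader]


# Hard-coded so the demo "just runs". To read a real file instead (assumed name):
# with open("left.csv", newline="") as f:
#     left = read_rows(f)
left = read_rows(io.StringIO(LEFT_CSV))
right = read_rows(io.StringIO(RIGHT_CSV))


def merge_and_aggregate(left, right):
    """Left-join on ID, summing the non-missing SCORE values for each ID."""
    totals = {}
    for row in right:
        if row["SCORE"] is not None:
            totals[row["ID"]] = totals.get(row["ID"], 0.0) + float(row["SCORE"])
    # IDs with no usable score keep SCORE_SUM as None rather than 0.
    return [
        {"ID": r["ID"], "NAME": r["NAME"], "SCORE_SUM": totals.get(r["ID"])}
        for r in left
    ]


result = merge_and_aggregate(left, right)
for row in result:
    print(row)
```

Keeping the parsing in one helper means the hard-coded and file-based variants share the same missing-field handling, so switching between them only changes where `source` comes from.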