Bin given limits
You are encouraged to solve this task according to the task description, using any language you may know.
You are given a list of n ascending, unique numbers which are to form limits for n+1 bins which count how many of a large set of input numbers fall in the range of each bin.
(Assuming zero-based indexing)
bin[0] counts how many inputs are < limit[0]
bin[1] counts how many inputs are >= limit[0] and < limit[1]
..
bin[n-1] counts how many inputs are >= limit[n-2] and < limit[n-1]
bin[n] counts how many inputs are >= limit[n-1]
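The rule above amounts to counting how many limits are less than or equal to the input. A minimal Python sketch of that mapping follows; the helper name `bin_index` is an assumption made for illustration and is not part of the task.

```python
from bisect import bisect_right

def bin_index(limits, x):
    """Return the zero-based bin that x falls into, given ascending limits."""
    # bisect_right returns the number of limits that are <= x, which is
    # the bin number under the rule above: a value equal to a limit
    # lands in the bin to the right of that limit.
    return bisect_right(limits, x)

limits = [23, 37, 43]
print([bin_index(limits, x) for x in (5, 23, 36, 37, 42, 43, 100)])
# prints [0, 1, 1, 2, 2, 3, 3]
```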
- Task
The task is to create a function that, given the ascending limits and a stream/list of numbers, returns the bins, together with another function that, given the same list of limits and the binning, prints the limit of each bin together with the count of items that fell in the range.
Assume the numbers to bin are too large to practically sort.
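Because the input cannot be sorted, each item has to be placed as it arrives. The sketch below, under the assumption that the data comes from any Python iterable, keeps only the n+1 counters in memory; `bin_stream` and `format_bins` are hypothetical names chosen for this illustration.

```python
from bisect import bisect_right
from typing import Iterable, List

def bin_stream(limits: List[int], data: Iterable[int]) -> List[int]:
    """Count items per bin in a single pass over data."""
    bins = [0] * (len(limits) + 1)
    for x in data:  # data may be a generator; it is never stored or sorted
        bins[bisect_right(limits, x)] += 1
    return bins

def format_bins(limits: List[int], bins: List[int]) -> List[str]:
    """Return one text line per bin: its range and its count."""
    if not limits:
        return [f"all values := {bins[0]}"]
    lines = [f"< {limits[0]} := {bins[0]}"]
    for lo, hi, count in zip(limits, limits[1:], bins[1:]):
        lines.append(f">= {lo} and < {hi} := {count}")
    lines.append(f">= {limits[-1]} := {bins[-1]}")
    return lines

counts = bin_stream([10, 20], (x for x in [5, 10, 15, 20, 25, 3]))
print("\n".join(format_bins([10, 20], counts)))
```

Each lookup is a binary search over the limits, so the cost per item grows with the logarithm of the number of limits, not with the size of the input.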
- Task examples
Part 1: Bin the given input data using the following limits
limits = [23, 37, 43, 53, 67, 83]
data = [95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47,
        16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55]
Part 2: Bin the given input data using the following limits
limits = [14, 18, 249, 312, 389, 392, 513, 591, 634, 720]
data = [445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,
        416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306,
        655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247,
        346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123,
        345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97,
        854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395,
        787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692,
        698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237,
        605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791,
        466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749]
Show output here, on this page.
11l
<lang 11l>F bisect_right(a, x)
   V lo = 0
   V hi = a.len
   L lo < hi
      V mid = (lo + hi) I/ 2
      I x < a[mid]
         hi = mid
      E
         lo = mid + 1
   R lo

F bin_it(limits, data)
   ‘Bin data according to (ascending) limits.’
   V bins = [0] * (limits.len + 1)
   L(d) data
      bins[bisect_right(limits, d)]++
   R bins

F bin_print(limits, bins)
   print(‘ < #3 := #3’.format(limits[0], bins[0]))
   L(lo, hi, count) zip(limits, limits[1..], bins[1..])
      print(‘>= #3 .. < #3 := #3’.format(lo, hi, count))
   print(‘>= #3 := #3’.format(limits.last, bins.last))

print("RC FIRST EXAMPLE\n")
V limits = [23, 37, 43, 53, 67, 83]
V data = [95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47,
          16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55]
V bins = bin_it(limits, data)
bin_print(limits, bins)

print("\nRC SECOND EXAMPLE\n")
limits = [14, 18, 249, 312, 389, 392, 513, 591, 634, 720]
data = [445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,
416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749]
bins = bin_it(limits, data)
bin_print(limits, bins)</lang>
- Output:
RC FIRST EXAMPLE

< 23 := 11
>= 23 .. < 37 := 4
>= 37 .. < 43 := 2
>= 43 .. < 53 := 6
>= 53 .. < 67 := 9
>= 67 .. < 83 := 5
>= 83 := 13

RC SECOND EXAMPLE

< 14 := 3
>= 14 .. < 18 := 0
>= 18 .. < 249 := 44
>= 249 .. < 312 := 10
>= 312 .. < 389 := 16
>= 389 .. < 392 := 2
>= 392 .. < 513 := 28
>= 513 .. < 591 := 16
>= 591 .. < 634 := 6
>= 634 .. < 720 := 16
>= 720 := 59
Ada
This example works with Ada 2012. The definition of the subtype Limits_Array employs a dynamic predicate to ensure that the limits array is sorted. The solution defines the binning types and operations within an Ada package, providing modularity and simplifying the code in the main procedure.
package specification:
<lang Ada>package binning is
   type Nums_Array is array (Natural range <>) of Integer;
   function Is_Sorted (Item : Nums_Array) return Boolean;
   subtype Limits_Array is Nums_Array with
        Dynamic_Predicate => Is_Sorted (Limits_Array);
   function Bins (Limits : Limits_Array; Data : Nums_Array) return Nums_Array;
   procedure Print (Limits : Limits_Array; Bin_Result : Nums_Array);
end binning;</lang>
package body:
<lang Ada>pragma Ada_2012;
with Ada.Text_IO;         use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;

package body binning is

   ---------------
   -- Is_Sorted --
   ---------------

   function Is_Sorted (Item : Nums_Array) return Boolean is
   begin
      return
        (for all i in Item'First .. Item'Last - 1 => Item (i) < Item (i + 1));
   end Is_Sorted;

   ----------
   -- Bins --
   ----------

   function Bins (Limits : Limits_Array; Data : Nums_Array) return Nums_Array
   is
      Result    : Nums_Array (Limits'First .. Limits'Last + 1) := (others => 0);
      Bin_Index : Natural;
   begin
      for value of Data loop
         Bin_Index := Result'First;
         for I in reverse Limits'Range loop
            if value >= Limits (I) then
               Bin_Index := I + 1;
               exit;
            end if;
         end loop;
         Result (Bin_Index) := Result (Bin_Index) + 1;
      end loop;
      return Result;
   end Bins;

   -----------
   -- Print --
   -----------

   procedure Print (Limits : Limits_Array; Bin_Result : Nums_Array) is
   begin
      if Limits'Length = 0 then
         return;
      end if;
      Put (" < ");
      Put (Item => Limits (Limits'First), Width => 3);
      Put (": ");
      Put (Item => Bin_Result (Bin_Result'First), Width => 2);
      New_Line;
      for i in Limits'First + 1 .. Limits'Last loop
         Put (">= ");
         Put (Item => Limits (i - 1), Width => 3);
         Put (" and < ");
         Put (Item => Limits (i), Width => 3);
         Put (": ");
         Put (Item => Bin_Result (i), Width => 2);
         New_Line;
      end loop;
      Put (">= ");
      Put (Item => Limits (Limits'Last), Width => 3);
      Put (" : ");
      Put (Item => Bin_Result (Bin_Result'Last), Width => 2);
      New_Line;
   end Print;

end binning;</lang>
main procedure:
<lang Ada>with Ada.Text_IO; use Ada.Text_IO;
with binning;     use binning;

procedure Main is
Limits_1 : Limits_Array := (23, 37, 43, 53, 67, 83); Data_1 : Nums_Array := (95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55); Limits_2 : Limits_Array := (14, 18, 249, 312, 389, 392, 513, 591, 634, 720); Data_2 : Nums_Array := (445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749); Bin_1 : Nums_Array := Bins (Limits => Limits_1, Data => Data_1); Bin_2 : Nums_Array := Bins (Limits => Limits_2, Data => Data_2);
begin
   Put_Line ("Example 1:");
   Print (Limits => Limits_1, Bin_Result => Bin_1);
   New_Line;
   Put_Line ("Example 2:");
   Print (Limits => Limits_2, Bin_Result => Bin_2);
end Main;</lang>
- Output:
Example 1:
< 23: 11
>= 23 and < 37: 4
>= 37 and < 43: 2
>= 43 and < 53: 6
>= 53 and < 67: 9
>= 67 and < 83: 5
>= 83 : 13

Example 2:
< 14: 3
>= 14 and < 18: 0
>= 18 and < 249: 44
>= 249 and < 312: 10
>= 312 and < 389: 16
>= 389 and < 392: 2
>= 392 and < 513: 28
>= 513 and < 591: 16
>= 591 and < 634: 6
>= 634 and < 720: 16
>= 720 : 59
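The Ada solution rejects limit arrays that are not strictly ascending by attaching a predicate to the subtype. As a rough illustration of the same check, here is a Python sketch; `is_strictly_ascending` and `check_limits` are hypothetical helpers, and raising `ValueError` is an assumption about how a caller might want the failure reported.

```python
def is_strictly_ascending(limits):
    """True when every limit is smaller than the one after it."""
    # Mirrors the quantified expression in Is_Sorted: Item (i) < Item (i + 1).
    return all(a < b for a, b in zip(limits, limits[1:]))

def check_limits(limits):
    """Return limits unchanged, or raise ValueError if they are unusable."""
    if not is_strictly_ascending(limits):
        raise ValueError("limits must be ascending and unique")
    return limits

print(is_strictly_ascending([23, 37, 43, 53, 67, 83]))  # prints True
print(is_strictly_ascending([23, 23, 43]))              # prints False
```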
AutoHotkey
<lang AutoHotkey>Bin_given_limits(limits, data){
    bin := [], counter := 0
    for i, val in data {
        if (limits[limits.count()] <= val)
            bin["∞", ++counter] := val
        else for j, limit in limits
            if (limits[j-1] <= val && val < limits[j])
                bin[limit, ++counter] := val
    }
    for j, limit in limits {
        output .= (prevlimit ? prevlimit : "-∞") ", " limit " : " ((x:=bin[limit].Count())?x:0) "`n"
        prevlimit := limit
    }
    return output .= (prevlimit ? prevlimit : "-∞") ", ∞ : " ((x:=bin["∞"].Count())?x:0) "`n"
}</lang>
Examples:
<lang AutoHotkey>limits := [23, 37, 43, 53, 67, 83]
data := [95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47,16
    , 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55]
MsgBox, 262144, , % Bin_given_limits(limits, data)

limits := [14, 18, 249, 312, 389, 392, 513, 591, 634, 720]
data := [445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933
    ,416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306
    ,655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247
    ,346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123
    ,345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97
    ,854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395
    ,787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692
    ,698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237
    ,605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791
    ,466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749]
MsgBox, 262144, , % Bin_given_limits(limits, data)
return</lang>
- Output:
-∞, 23 : 11
23, 37 : 4
37, 43 : 2
43, 53 : 6
53, 67 : 9
67, 83 : 5
83, ∞ : 13
---------------------------
-∞, 14 : 3
14, 18 : 0
18, 249 : 44
249, 312 : 10
312, 389 : 16
389, 392 : 2
392, 513 : 28
513, 591 : 16
591, 634 : 6
634, 720 : 16
720, ∞ : 59
C
<lang c>#include <stdio.h>
#include <stdlib.h>

size_t upper_bound(const int* array, size_t n, int value) {
    size_t start = 0;
    while (n > 0) {
        size_t step = n / 2;
        size_t index = start + step;
        if (value >= array[index]) {
            start = index + 1;
            n -= step + 1;
        } else {
            n = step;
        }
    }
    return start;
}

int* bins(const int* limits, size_t nlimits, const int* data, size_t ndata) {
    int* result = calloc(nlimits + 1, sizeof(int));
    if (result == NULL)
        return NULL;
    for (size_t i = 0; i < ndata; ++i)
        ++result[upper_bound(limits, nlimits, data[i])];
    return result;
}

void print_bins(const int* limits, size_t n, const int* bins) {
    if (n == 0)
        return;
    printf(" < %3d: %2d\n", limits[0], bins[0]);
    for (size_t i = 1; i < n; ++i)
        printf(">= %3d and < %3d: %2d\n", limits[i - 1], limits[i], bins[i]);
    printf(">= %3d : %2d\n", limits[n - 1], bins[n]);
}

int main() {
const int limits1[] = {23, 37, 43, 53, 67, 83}; const int data1[] = {95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55};
printf("Example 1:\n"); size_t n = sizeof(limits1) / sizeof(int); int* b = bins(limits1, n, data1, sizeof(data1) / sizeof(int)); if (b == NULL) { fprintf(stderr, "Out of memory\n"); return EXIT_FAILURE; } print_bins(limits1, n, b); free(b);
const int limits2[] = {14, 18, 249, 312, 389, 392, 513, 591, 634, 720}; const int data2[] = { 445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749};
printf("\nExample 2:\n"); n = sizeof(limits2) / sizeof(int); b = bins(limits2, n, data2, sizeof(data2) / sizeof(int)); if (b == NULL) { fprintf(stderr, "Out of memory\n"); return EXIT_FAILURE; } print_bins(limits2, n, b); free(b);
return EXIT_SUCCESS;
}</lang>
- Output:
Example 1:
< 23: 11
>= 23 and < 37: 4
>= 37 and < 43: 2
>= 43 and < 53: 6
>= 53 and < 67: 9
>= 67 and < 83: 5
>= 83 : 13

Example 2:
< 14: 3
>= 14 and < 18: 0
>= 18 and < 249: 44
>= 249 and < 312: 10
>= 312 and < 389: 16
>= 389 and < 392: 2
>= 392 and < 513: 28
>= 513 and < 591: 16
>= 591 and < 634: 6
>= 634 and < 720: 16
>= 720 : 59
C++
<lang cpp>#include <algorithm>
#include <cassert>
#include <iomanip>
#include <iostream>
#include <vector>

std::vector<int> bins(const std::vector<int>& limits,
                      const std::vector<int>& data) {
    std::vector<int> result(limits.size() + 1, 0);
    for (int n : data) {
        auto i = std::upper_bound(limits.begin(), limits.end(), n);
        ++result[i - limits.begin()];
    }
    return result;
}

void print_bins(const std::vector<int>& limits, const std::vector<int>& bins) {
    size_t n = limits.size();
    if (n == 0)
        return;
    assert(n + 1 == bins.size());
    std::cout << " < " << std::setw(3) << limits[0] << ": "
              << std::setw(2) << bins[0] << '\n';
    for (size_t i = 1; i < n; ++i)
        std::cout << ">= " << std::setw(3) << limits[i - 1] << " and < "
                  << std::setw(3) << limits[i] << ": " << std::setw(2)
                  << bins[i] << '\n';
    std::cout << ">= " << std::setw(3) << limits[n - 1] << " : "
              << std::setw(2) << bins[n] << '\n';
}

int main() {
const std::vector<int> limits1{23, 37, 43, 53, 67, 83}; const std::vector<int> data1{ 95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55};
std::cout << "Example 1:\n"; print_bins(limits1, bins(limits1, data1));
const std::vector<int> limits2{14, 18, 249, 312, 389, 392, 513, 591, 634, 720}; const std::vector<int> data2{ 445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749};
std::cout << "\nExample 2:\n"; print_bins(limits2, bins(limits2, data2));
}</lang>
- Output:
Example 1:
< 23: 11
>= 23 and < 37: 4
>= 37 and < 43: 2
>= 43 and < 53: 6
>= 53 and < 67: 9
>= 67 and < 83: 5
>= 83 : 13

Example 2:
< 14: 3
>= 14 and < 18: 0
>= 18 and < 249: 44
>= 249 and < 312: 10
>= 312 and < 389: 16
>= 389 and < 392: 2
>= 392 and < 513: 28
>= 513 and < 591: 16
>= 591 and < 634: 6
>= 634 and < 720: 16
>= 720 : 59
C#
<lang csharp>using System;
public class Program {
static void Main() { PrintBins(new [] { 23, 37, 43, 53, 67, 83 }, 95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47, 16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55 ); Console.WriteLine();
PrintBins(new [] { 14, 18, 249, 312, 389, 392, 513, 591, 634, 720 }, 445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,416,589,930,373,202, 253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306,655,267,248,477,549,238, 62,678, 98,534, 622,907,406,714,184,391,913, 42,560,247,346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458, 945,733,507,916,123,345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395,787,942,456,242,759, 898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692,698,765,331,487,251,600,879,342,982,527, 736,795,585, 40, 54,901,408,359,577,237,605,847,353,968,832,205,838,427,876,959,686,646,835,127,621, 892,443,198,988,791,466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749); }
static void PrintBins(int[] limits, params int[] data) { int[] bins = Bins(limits, data); Console.WriteLine($"-∞ .. {limits[0]} => {bins[0]}"); for (int i = 0; i < limits.Length-1; i++) { Console.WriteLine($"{limits[i]} .. {limits[i+1]} => {bins[i+1]}"); } Console.WriteLine($"{limits[^1]} .. ∞ => {bins[^1]}"); }
static int[] Bins(int[] limits, params int[] data) { Array.Sort(limits); int[] bins = new int[limits.Length + 1]; foreach (int n in data) { int i = Array.BinarySearch(limits, n); i = i < 0 ? ~i : i+1; bins[i]++; } return bins; }
}</lang>
- Output:
-∞ .. 23 => 11
23 .. 37 => 4
37 .. 43 => 2
43 .. 53 => 6
53 .. 67 => 9
67 .. 83 => 5
83 .. ∞ => 13

-∞ .. 14 => 3
14 .. 18 => 0
18 .. 249 => 44
249 .. 312 => 10
312 .. 389 => 16
389 .. 392 => 2
392 .. 513 => 28
513 .. 591 => 16
591 .. 634 => 6
634 .. 720 => 16
720 .. ∞ => 59
Factor
Factor provides the bisect-right word in the sorting.extras vocabulary. See the implementation here.
<lang factor>USING: assocs formatting grouping io kernel math math.parser
math.statistics sequences sequences.extras sorting.extras ;

: bin ( data limits -- seq )
    dup length 1 + [ 0 ] replicate -rot
    [ bisect-right over [ 1 + ] change-nth ] curry each ;

: .bin ( {lo,hi} n i -- )
    swap "%3d members in " printf zero? "(" "[" ? write
    "%s, %s)\n" vprintf ;

: .bins ( data limits -- )
    dup [ number>string ] map "-∞" prefix "∞" suffix 2 clump
    -rot bin [ .bin ] 2each-index ;

"First example:" print
{
95 21 94 12 99 4 70 75 83 93 52 80 57 5 53 86 65 17 92 83 71 61 54 58 47 16 8 9 32 84 7 87 46 19 30 37 96 6 98 40 79 97 45 64 60 29 49 36 43 55
} { 23 37 43 53 67 83 } .bins nl
"Second example:" print {
445 814 519 697 700 130 255 889 481 122 932 77 323 525 570 219 367 523 442 933 416 589 930 373 202 253 775 47 731 685 293 126 133 450 545 100 741 583 763 306 655 267 248 477 549 238 62 678 98 534 622 907 406 714 184 391 913 42 560 247 346 860 56 138 546 38 985 948 58 213 799 319 390 634 458 945 733 507 916 123 345 110 720 917 313 845 426 9 457 628 410 723 354 895 881 953 677 137 397 97 854 740 83 216 421 94 517 479 292 963 376 981 480 39 257 272 157 5 316 395 787 942 456 242 759 898 576 67 298 425 894 435 831 241 989 614 987 770 384 692 698 765 331 487 251 600 879 342 982 527 736 795 585 40 54 901 408 359 577 237 605 847 353 968 832 205 838 427 876 959 686 646 835 127 621 892 443 198 988 791 466 23 707 467 33 670 921 180 991 396 160 436 717 918 8 374 101 684 727 749
} { 14 18 249 312 389 392 513 591 634 720 } .bins</lang>
- Output:
First example:
11 members in (-∞, 23)
4 members in [23, 37)
2 members in [37, 43)
6 members in [43, 53)
9 members in [53, 67)
5 members in [67, 83)
13 members in [83, ∞)

Second example:
3 members in (-∞, 14)
0 members in [14, 18)
44 members in [18, 249)
10 members in [249, 312)
16 members in [312, 389)
2 members in [389, 392)
28 members in [392, 513)
16 members in [513, 591)
6 members in [591, 634)
16 members in [634, 720)
59 members in [720, ∞)
FreeBASIC
<lang freebasic>sub binlims( dat() as integer, limits() as integer, bins() as uinteger )
    dim as uinteger n = ubound(limits), j, i
    for i = 0 to ubound(dat)
        if dat(i)<limits(0) then
            bins(0) += 1
        elseif dat(i) >= limits(n) then
            bins(n+1) += 1
        else
            for j = 1 to n
                if dat(i)<limits(j) then
                    bins(j) += 1
                    exit for
                end if
            next j
        end if
    next i
end sub

'example 1
dim as integer limits1(0 to ...) = {23, 37, 43, 53, 67, 83}
dim as integer dat1(0 to ...) = {95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47,_
    16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55}
dim as uinteger bins1(0 to ubound(limits1)+1)
binlims( dat1(), limits1(), bins1() )
print "=====EXAMPLE ONE====="
print "< ";limits1(0);": ";bins1(0)
for i as uinteger = 1 to ubound(limits1)
    print ">= ";limits1(i-1);" and < ";limits1(i);": ";bins1(i)
next i
print ">= ";limits1(ubound(limits1));": ";bins1(ubound(bins1))
print

'example 2
dim as integer limits2(0 to ...) = {14, 18, 249, 312, 389, 392, 513, 591, 634, 720}
dim as integer dat2(0 to ...) = {445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,_
    416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306,_
    655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247,_
    346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123,_
    345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97,_
    854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395,_
    787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692,_
    698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237,_
    605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791,_
    466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749}
redim as uinteger bins2(0 to ubound(limits2)+1)
binlims( dat2(), limits2(), bins2() )
print "=====EXAMPLE TWO====="
print "< ";limits2(0);": ";bins2(0)
for i as uinteger = 1 to ubound(limits2)
    print ">= ";limits2(i-1);" and < ";limits2(i);": ";bins2(i)
next i
print ">= ";limits2(ubound(limits2));": ";bins2(ubound(bins2))</lang>
- Output:
=====EXAMPLE ONE=====
< 23: 11
>= 23 and < 37: 4
>= 37 and < 43: 2
>= 43 and < 53: 6
>= 53 and < 67: 9
>= 67 and < 83: 5
>= 83: 13

=====EXAMPLE TWO=====
< 14: 3
>= 14 and < 18: 0
>= 18 and < 249: 44
>= 249 and < 312: 10
>= 312 and < 389: 16
>= 389 and < 392: 2
>= 392 and < 513: 28
>= 513 and < 591: 16
>= 591 and < 634: 6
>= 634 and < 720: 16
>= 720: 59
Go
<lang go>package main

import (
    "fmt"
    "sort"
)

func getBins(limits, data []int) []int {
    n := len(limits)
    bins := make([]int, n+1)
    for _, d := range data {
        index := sort.SearchInts(limits, d) // uses binary search
        if index < len(limits) && d == limits[index] {
            index++
        }
        bins[index]++
    }
    return bins
}

func printBins(limits, bins []int) {
    n := len(limits)
    fmt.Printf(" < %3d = %2d\n", limits[0], bins[0])
    for i := 1; i < n; i++ {
        fmt.Printf(">= %3d and < %3d = %2d\n", limits[i-1], limits[i], bins[i])
    }
    fmt.Printf(">= %3d = %2d\n", limits[n-1], bins[n])
    fmt.Println()
}

func main() {
limitsList := [][]int{ {23, 37, 43, 53, 67, 83}, {14, 18, 249, 312, 389, 392, 513, 591, 634, 720}, }
dataList := [][]int{ { 95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55, }, { 445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749, }, }
    for i := 0; i < len(limitsList); i++ {
        fmt.Println("Example", i+1, "\b\n")
        bins := getBins(limitsList[i], dataList[i])
        printBins(limitsList[i], bins)
    }
}</lang>
- Output:
Example 1

< 23 = 11
>= 23 and < 37 = 4
>= 37 and < 43 = 2
>= 43 and < 53 = 6
>= 53 and < 67 = 9
>= 67 and < 83 = 5
>= 83 = 13

Example 2

< 14 = 3
>= 14 and < 18 = 0
>= 18 and < 249 = 44
>= 249 and < 312 = 10
>= 312 and < 389 = 16
>= 389 and < 392 = 2
>= 392 and < 513 = 28
>= 513 and < 591 = 16
>= 591 and < 634 = 6
>= 634 and < 720 = 16
>= 720 = 59
Haskell
Splitting the data into bins may be done using the monadic nature of a tuple. Here the tuple plays the role of the Writer monad, so that sequential partitioning by each bin boundary appends the contents of one new bin.
<lang haskell>import Control.Monad (foldM)
import Data.List (partition)

binSplit :: Ord a => [a] -> [a] -> [[a]]
binSplit lims ns = counts ++ [rest]
  where
    (counts, rest) = foldM split ns lims
    split l i = let (a, b) = partition (< i) l in ([a], b)

binCounts :: Ord a => [a] -> [a] -> [Int]
binCounts b = fmap length . binSplit b</lang>
λ> binSplit [2,4,7] [1,4,2,6,3,8,9,4,1,2,7,4,1,5,1]
[[1,1,1,1],[2,3,2],[4,6,4,4,5],[8,9,7]]
λ> binCounts [2,4,7] [1,4,2,6,3,8,9,4,1,2,7,4,1,5,1]
[4,3,5,3]
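For readers less familiar with the Writer monad, the same sequential partitioning can be sketched in Python, where an explicit list takes the place of the tuple's accumulated log. The name `bin_split` is hypothetical; the input is the one used in the session above.

```python
def bin_split(limits, data):
    """Split data into len(limits) + 1 lists by peeling off one bin per limit."""
    bins, rest = [], list(data)
    for limit in limits:
        # Everything below the current limit forms the next bin;
        # the remainder is carried on to the next limit.
        below = [x for x in rest if x < limit]
        rest = [x for x in rest if x >= limit]
        bins.append(below)
    return bins + [rest]

print(bin_split([2, 4, 7], [1, 4, 2, 6, 3, 8, 9, 4, 1, 2, 7, 4, 1, 5, 1]))
# prints [[1, 1, 1, 1], [2, 3, 2], [4, 6, 4, 4, 5], [8, 9, 7]]
```

This form makes one pass over the remaining data per limit and keeps the items in memory, so it illustrates the idea without being suited to very large inputs.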
A more efficient binning procedure exploits a binary search tree.
<lang haskell>{-# language DeriveFoldable #-}

import Data.Foldable (toList)

data BTree a b = Node a (BTree a b) (BTree a b)
               | Val b
  deriving Foldable

-- assuming list is sorted.
mkTree :: [a] -> BTree a [a]
mkTree [] = Val []
mkTree [x] = Node x (Val []) (Val [])
mkTree lst = Node x (mkTree l) (mkTree r)
  where (l, x:r) = splitAt (length lst `div` 2) lst

binSplit :: Ord a => [a] -> [a] -> [[a]]
binSplit lims = toList . foldr add (mkTree lims)
  where
    add x (Val v) = Val (x:v)
    add x (Node y l r) = if x < y
                         then Node y (add x l) r
                         else Node y l (add x r)</lang>
Task examples
<lang haskell>import Text.Printf

task bs ns = mapM_ putStrLn
  $ zipWith mkLine (binCounts bs ns) bins
  where
    bins :: [String]
    bins = [printf "(-∞, %v)" $ head bs]
      <> zipWith mkInterval bs (tail bs)
      <> [printf "[%v, ∞)" $ last bs]

    mkLine = printf "%v\t in %s"
    mkInterval = printf "[%v, %v)"

bins1 = [23, 37, 43, 53, 67, 83]
data1 = [ 95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57
        , 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16
        , 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98
        , 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55]

bins2 = [14, 18, 249, 312, 389, 392, 513, 591, 634, 720]
data2 = [ 445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525
        , 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47
        , 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267
        , 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391
        , 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213
        , 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917
        , 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137
        , 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981
        , 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898
        , 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692
        , 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40
        , 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427
        , 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23
        , 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374
        , 101, 684, 727, 749]</lang>
λ> task bins1 data1
11 in (-∞, 23)
4 in [23, 37)
2 in [37, 43)
6 in [43, 53)
9 in [53, 67)
5 in [67, 83)
13 in [83, ∞)
λ> task bins2 data2
3 in (-∞, 14)
0 in [14, 18)
44 in [18, 249)
10 in [249, 312)
16 in [312, 389)
2 in [389, 392)
28 in [392, 513)
16 in [513, 591)
6 in [591, 634)
16 in [634, 720)
59 in [720, ∞)
J
Solution:
Using Idotr from this JWiki page:
<lang j>Idotr=: |.@[ (#@[ - I.) ]  NB. reverses order of limits to obtain intervals closed on left, open on right (>= y <)

binnedData=: adverb define
  bidx=. i.@>:@# x                  NB. indices of bins
  x (Idotr (u@}./.)&(bidx&,) ]) y   NB. apply u to data in each bin after dropping first value
)

require 'format/printf'
printBinCounts=: verb define
  counts =. y
  '%2d in [ -∞, %3d)' printf ({. counts) , {. x
  '%2d in [%3d, %3d)' printf (}.}: counts) ,. 2 ]\ x
  '%2d in [%3d, ∞]' printf ({: counts) , {: x
)</lang>
Required Examples:
<lang j>limits1=: 23 37 43 53 67 83
data1=: , 0&".;._2 noun define
95 21 94 12 99 4 70 75 83 93 52 80 57 5 53 86 65 17 92 83 71 61 54 58 47
16 8 9 32 84 7 87 46 19 30 37 96 6 98 40 79 97 45 64 60 29 49 36 43 55
)

limits2=: 14 18 249 312 389 392 513 591 634 720
data2=: , 0&".;._2 noun define
445 814 519 697 700 130 255 889 481 122 932 77 323 525 570 219 367 523 442 933
416 589 930 373 202 253 775 47 731 685 293 126 133 450 545 100 741 583 763 306
655 267 248 477 549 238 62 678 98 534 622 907 406 714 184 391 913 42 560 247
346 860 56 138 546 38 985 948 58 213 799 319 390 634 458 945 733 507 916 123
345 110 720 917 313 845 426 9 457 628 410 723 354 895 881 953 677 137 397 97
854 740 83 216 421 94 517 479 292 963 376 981 480 39 257 272 157 5 316 395
787 942 456 242 759 898 576 67 298 425 894 435 831 241 989 614 987 770 384 692
698 765 331 487 251 600 879 342 982 527 736 795 585 40 54 901 408 359 577 237
605 847 353 968 832 205 838 427 876 959 686 646 835 127 621 892 443 198 988 791
466 23 707 467 33 670 921 180 991 396 160 436 717 918 8 374 101 684 727 749
)

   limits1 < binnedData data1    NB. box/group binned data
┌──────────────────────────┬───────────┬─────┬─────────────────┬──────────────────────────┬──────────────┬──────────────────────────────────────┐
│21 12 4 5 17 16 8 9 7 19 6│32 30 29 36│37 40│52 47 46 45 49 43│57 53 65 61 54 58 64 60 55│70 75 80 71 79│95 94 99 83 93 86 92 83 84 87 96 98 97│
└──────────────────────────┴───────────┴─────┴─────────────────┴──────────────────────────┴──────────────┴──────────────────────────────────────┘
   limits1 # binnedData data1    NB. tally binned data
11 4 2 6 9 5 13
   limits2 printBinCounts limits2 # binnedData data2
 3 in [ -∞, 14)
 0 in [ 14, 18)
44 in [ 18, 249)
10 in [249, 312)
16 in [312, 389)
 2 in [389, 392)
28 in [392, 513)
16 in [513, 591)
 6 in [591, 634)
16 in [634, 720)
59 in [720, ∞]</lang>
Java
<lang java>import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class Bins {
    public static <T extends Comparable<? super T>> int[] bins(
            List<? extends T> limits, Iterable<? extends T> data) {
        int[] result = new int[limits.size() + 1];
        for (T n : data) {
            int i = Collections.binarySearch(limits, n);
            if (i >= 0) {
                // n == limits[i]; we put it in right-side bin (i+1)
                i = i+1;
            } else {
                // n is not in limits and i is ~(insertion point)
                i = ~i;
            }
            result[i]++;
        }
        return result;
    }

    public static void printBins(List<?> limits, int[] bins) {
        int n = limits.size();
        if (n == 0) {
            return;
        }
        assert n+1 == bins.length;
        System.out.printf(" < %3s: %2d\n", limits.get(0), bins[0]);
        for (int i = 1; i < n; i++) {
            System.out.printf(">= %3s and < %3s: %2d\n", limits.get(i-1), limits.get(i), bins[i]);
        }
        System.out.printf(">= %3s : %2d\n", limits.get(n-1), bins[n]);
    }

public static void main(String[] args) { List<Integer> limits = Arrays.asList(23, 37, 43, 53, 67, 83); List<Integer> data = Arrays.asList( 95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55);
System.out.println("Example 1:"); printBins(limits, bins(limits, data));
limits = Arrays.asList(14, 18, 249, 312, 389, 392, 513, 591, 634, 720); data = Arrays.asList( 445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749);
System.out.println(); System.out.println("Example 2:"); printBins(limits, bins(limits, data)); }
}</lang>
- Output:
Example 1:
< 23: 11
>= 23 and < 37: 4
>= 37 and < 43: 2
>= 43 and < 53: 6
>= 53 and < 67: 9
>= 67 and < 83: 5
>= 83 : 13

Example 2:
< 14: 3
>= 14 and < 18: 0
>= 18 and < 249: 44
>= 249 and < 312: 10
>= 312 and < 389: 16
>= 389 and < 392: 2
>= 392 and < 513: 28
>= 513 and < 591: 16
>= 591 and < 634: 6
>= 634 and < 720: 16
>= 720 : 59
jq
The following takes advantage of jq's built-in filter for conducting a binary search, `bsearch/1`, which returns the index of the item if it is present in the sorted input array, and otherwise returns `-1 - ix`, where `ix` is the insertion point.
The "data" is assumed to be a stream of values (rather than an array), thus allowing an indefinitely large number of items to be processed. These items could, but need not, be presented one line at a time. <lang jq># input and output: {limits, count} where
#  .limits holds an array defining the limits, and
#  .count[$i] holds the count of bin $i, where bin[0] is the left-most bin
def bin($x):
  (.limits | bsearch($x)) as $ix
  | (if $ix > -1 then $ix + 1 else -1 - $ix end) as $i
  | .count[$i] += 1;

# pretty-print for the structure defined at bin/1
def pp:
  (.limits|length) as $length
  | (range(0;$length) as $i
     | "< \(.limits[$i]) => \(.count[$i] // 0)" ),
    ">= \(.limits[$length-1] ) => \(.count[$length] // 0)" ;

# Main program
reduce inputs as $x ({$limits, count: []}; bin($x))
| pp</lang>
- Output:
Invocation: <lang sh> < data.json jq -rn --argfile limits limits.json -f program.jq </lang>
Example 1:
<lang>< 23 => 11
< 37 => 4
< 43 => 2
< 53 => 6
< 67 => 9
< 83 => 5
>= 83 => 13</lang>
Example 2:
<lang>< 14 => 3
< 18 => 0
< 249 => 44
< 312 => 10
< 389 => 16
< 392 => 2
< 513 => 28
< 591 => 16
< 634 => 6
< 720 => 16
>= 720 => 59</lang>
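The index arithmetic in `bin/1` can be illustrated outside jq. The following Python sketch is not part of the jq solution: `jq_style_bsearch` is a hypothetical helper that only imitates the return convention of `bsearch/1` (the index when the item is found, otherwise `-1 - ix` for insertion point `ix`), and `bin_index` applies the same conversion as the `if ... then ... else ... end` expression.

```python
from bisect import bisect_left

def jq_style_bsearch(sorted_items, x):
    """Hypothetical stand-in for jq's bsearch: index if found, else -1 - insertion point."""
    ix = bisect_left(sorted_items, x)
    if ix < len(sorted_items) and sorted_items[ix] == x:
        return ix
    return -1 - ix

def bin_index(limits, x):
    """Convert the search result to a bin number, mirroring the jq conditional."""
    ix = jq_style_bsearch(limits, x)
    # Found: the value equals limits[ix], so it belongs to the bin on the right.
    # Not found: undo the -1 - ix encoding to recover the insertion point.
    return ix + 1 if ix > -1 else -1 - ix

limits = [23, 37, 43, 53, 67, 83]
print([bin_index(limits, x) for x in (4, 23, 36, 83, 99)])  # [0, 1, 1, 6, 6]
```

A value equal to a limit lands in the bin to the right of that limit, which is what the task's `>=` lower bounds require.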
Julia
<lang julia>"""Add the function Python has in its bisect library"""
function bisect_right(array, x, low = 1, high = length(array) + 1)
    while low < high
        middle = (low + high) ÷ 2
        x < array[middle] ? (high = middle) : (low = middle + 1)
    end
    return low
end

""" Bin data according to (ascending) limits """
function bin_it(limits, data)
    bins = zeros(Int, length(limits) + 1) # adds under/over range bins too
    for d in data
        bins[bisect_right(limits, d)] += 1
    end
    return bins
end

""" Pretty print the resulting bins and counts """
function bin_print(limits, bins)
    println(" < $(lpad(limits[1], 3)) := $(lpad(bins[1], 3))")
    for (lo, hi, count) in zip(limits, limits[2:end], bins[2:end])
        println(">= $(lpad(lo, 3)) .. < $(lpad(hi, 3)) := $(lpad(count, 3))")
    end
    println(">= $(lpad(limits[end], 3)) := $(lpad(bins[end], 3))")
end

""" Test on data provided """
function testbins()
println("RC FIRST EXAMPLE:") limits = [23, 37, 43, 53, 67, 83] data = [95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47, 16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55] bins = bin_it(limits, data) bin_print(limits, bins)
println("\nRC SECOND EXAMPLE:") limits = [14, 18, 249, 312, 389, 392, 513, 591, 634, 720] data = [445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933, 416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749] bins = bin_it(limits, data) bin_print(limits, bins)
end
testbins()
</lang>
- Output:
RC FIRST EXAMPLE:
< 23 := 11
>= 23 .. < 37 := 4
>= 37 .. < 43 := 2
>= 43 .. < 53 := 6
>= 53 .. < 67 := 9
>= 67 .. < 83 := 5
>= 83 := 13

RC SECOND EXAMPLE:
< 14 := 3
>= 14 .. < 18 := 0
>= 18 .. < 249 := 44
>= 249 .. < 312 := 10
>= 312 .. < 389 := 16
>= 389 .. < 392 := 2
>= 392 .. < 513 := 28
>= 513 .. < 591 := 16
>= 591 .. < 634 := 6
>= 634 .. < 720 := 16
>= 720 := 59
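Since the Julia entry describes its `bisect_right` as the function Python has in its bisect library, the loop can be cross-checked against that library. The sketch below is a 0-based Python transcription written for illustration only (the name `bisect_right_by_hand` is made up here); it assumes the same ascending limits as the first example.

```python
from bisect import bisect_right as stdlib_bisect_right

def bisect_right_by_hand(array, x, low=0, high=None):
    """0-based transcription of the while loop used in the Julia entry."""
    if high is None:
        high = len(array)
    while low < high:
        middle = (low + high) // 2
        if x < array[middle]:
            high = middle        # x belongs strictly left of array[middle]
        else:
            low = middle + 1     # x >= array[middle], so search to the right
    return low

limits = [23, 37, 43, 53, 67, 83]
agree = all(
    bisect_right_by_hand(limits, x) == stdlib_bisect_right(limits, x)
    for x in range(0, 100)
)
print(agree)  # True
```

The only change from the Julia version is the index base: Julia starts at `low = 1` with `high = length(array) + 1`, while the transcription starts at 0 with `high = len(array)`.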
Mathematica/Wolfram Language
<lang Mathematica>limits = {23, 37, 43, 53, 67, 83}; data = {95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86,
65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55};
limits = {{-\[Infinity]}~Join~limits~Join~{\[Infinity]}};
BinCounts[data, limits]
MapThread[{#2, #1} &, {%, Partition[First[limits], 2, 1]}] // Grid
limits = {14, 18, 249, 312, 389, 392, 513, 591, 634, 720}; data = {445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77,
323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749};
limits = {{-\[Infinity]}~Join~limits~Join~{\[Infinity]}};
BinCounts[data, limits]
MapThread[{#2, #1} &, {%, Partition[First[limits], 2, 1]}] // Grid</lang>
- Output:
{11, 4, 2, 6, 9, 5, 13}
{-\[Infinity],23} 11
{23,37} 4
{37,43} 2
{43,53} 6
{53,67} 9
{67,83} 5
{83,\[Infinity]} 13

{3, 0, 44, 10, 16, 2, 28, 16, 6, 16, 59}
{-\[Infinity],14} 3
{14,18} 0
{18,249} 44
{249,312} 10
{312,389} 16
{389,392} 2
{392,513} 28
{513,591} 16
{591,634} 6
{634,720} 16
{720,\[Infinity]} 59
Nim
<lang Nim>import algorithm, strformat
func binIt(limits, data: openArray[int]): seq[Natural] =
  result.setLen(limits.len + 1)
  for d in data:
    inc result[limits.upperBound(d)]

proc binPrint(limits: openArray[int]; bins: seq[Natural]) =
  echo &" < {limits[0]:3} := {bins[0]:3}"
  for i in 1..limits.high:
    echo &">= {limits[i-1]:3} .. < {limits[i]:3} := {bins[i]:3}"
  echo &">= {limits[^1]:3} := {bins[^1]:3}"

when isMainModule:
echo "Example 1:" const Limits1 = [23, 37, 43, 53, 67, 83] Data1 = [95, 21, 94, 12, 99, 4, 70, 75, 83, 93, 52, 80, 57, 5, 53, 86, 65, 17, 92, 83, 71, 61, 54, 58, 47, 16, 8, 9, 32, 84, 7, 87, 46, 19, 30, 37, 96, 6, 98, 40, 79, 97, 45, 64, 60, 29, 49, 36, 43, 55] let bins1 = binIt(Limits1, Data1) binPrint(Limits1, bins1)
echo "" echo "Example 2:" const Limits2 = [14, 18, 249, 312, 389, 392, 513, 591, 634, 720] Data2 = [445, 814, 519, 697, 700, 130, 255, 889, 481, 122, 932, 77, 323, 525, 570, 219, 367, 523, 442, 933, 416, 589, 930, 373, 202, 253, 775, 47, 731, 685, 293, 126, 133, 450, 545, 100, 741, 583, 763, 306, 655, 267, 248, 477, 549, 238, 62, 678, 98, 534, 622, 907, 406, 714, 184, 391, 913, 42, 560, 247, 346, 860, 56, 138, 546, 38, 985, 948, 58, 213, 799, 319, 390, 634, 458, 945, 733, 507, 916, 123, 345, 110, 720, 917, 313, 845, 426, 9, 457, 628, 410, 723, 354, 895, 881, 953, 677, 137, 397, 97, 854, 740, 83, 216, 421, 94, 517, 479, 292, 963, 376, 981, 480, 39, 257, 272, 157, 5, 316, 395, 787, 942, 456, 242, 759, 898, 576, 67, 298, 425, 894, 435, 831, 241, 989, 614, 987, 770, 384, 692, 698, 765, 331, 487, 251, 600, 879, 342, 982, 527, 736, 795, 585, 40, 54, 901, 408, 359, 577, 237, 605, 847, 353, 968, 832, 205, 838, 427, 876, 959, 686, 646, 835, 127, 621, 892, 443, 198, 988, 791, 466, 23, 707, 467, 33, 670, 921, 180, 991, 396, 160, 436, 717, 918, 8, 374, 101, 684, 727, 749] let bins2 = binIt(Limits2, Data2) binPrint(Limits2, bins2)</lang>
- Output:
Example 1:
< 23 := 11
>= 23 .. < 37 := 4
>= 37 .. < 43 := 2
>= 43 .. < 53 := 6
>= 53 .. < 67 := 9
>= 67 .. < 83 := 5
>= 83 := 13

Example 2:
< 14 := 3
>= 14 .. < 18 := 0
>= 18 .. < 249 := 44
>= 249 .. < 312 := 10
>= 312 .. < 389 := 16
>= 389 .. < 392 := 2
>= 392 .. < 513 := 28
>= 513 .. < 591 := 16
>= 591 .. < 634 := 6
>= 634 .. < 720 := 16
>= 720 := 59
Objective-C
<lang objc>#import <Foundation/Foundation.h>
NSArray<NSNumber *> *bins(NSArray<NSNumber *> *limits, NSArray<NSNumber *> *data) {
  NSMutableArray<NSNumber *> *result = [[NSMutableArray alloc] initWithCapacity:[limits count] + 1];
  for (NSInteger i = 0; i <= [limits count]; i++) {
    [result addObject:@0];
  }
  for (NSNumber *n in data) {
    NSUInteger i = [limits indexOfObject:n
                           inSortedRange:NSMakeRange(0, [limits count])
                                 options:NSBinarySearchingInsertionIndex|NSBinarySearchingLastEqual
                         usingComparator:^(NSNumber *x, NSNumber *y){ return [x compare: y]; }];
    result[i] = @(result[i].integerValue + 1);
  }
  return result;
}

void printBins(NSArray<NSNumber *> *limits, NSArray<NSNumber *> *bins) {
  NSUInteger n = [limits count];
  if (n == 0) return;
  NSCAssert(n + 1 == [bins count], @"Wrong size of bins");
  NSLog(@" < %3@: %2@", limits[0], bins[0]);
  for (NSInteger i = 1; i < n; i++) {
    NSLog(@">= %3@ and < %3@: %2@", limits[i-1], limits[i], bins[i]);
  }
  NSLog(@">= %3@ : %2@", limits[n-1], bins[n]);
}

int main(void) {
@autoreleasepool { NSArray<NSNumber *> *limits = @[@23, @37, @43, @53, @67, @83]; NSArray<NSNumber *> *data = @[ @95, @21, @94, @12, @99, @4, @70, @75, @83, @93, @52, @80, @57, @5, @53, @86, @65, @17, @92, @83, @71, @61, @54, @58, @47, @16, @8, @9, @32, @84, @7, @87, @46, @19, @30, @37, @96, @6, @98, @40, @79, @97, @45, @64, @60, @29, @49, @36, @43, @55];
NSLog(@"Example 1:"); printBins(limits, bins(limits, data));
limits = @[@14, @18, @249, @312, @389, @392, @513, @591, @634, @720]; data = @[ @445, @814, @519, @697, @700, @130, @255, @889, @481, @122, @932, @77, @323, @525, @570, @219, @367, @523, @442, @933, @416, @589, @930, @373, @202, @253, @775, @47, @731, @685, @293, @126, @133, @450, @545, @100, @741, @583, @763, @306, @655, @267, @248, @477, @549, @238, @62, @678, @98, @534, @622, @907, @406, @714, @184, @391, @913, @42, @560, @247, @346, @860, @56, @138, @546, @38, @985, @948, @58, @213, @799, @319, @390, @634, @458, @945, @733, @507, @916, @123, @345, @110, @720, @917, @313, @845, @426, @9, @457, @628, @410, @723, @354, @895, @881, @953, @677, @137, @397, @97, @854, @740, @83, @216, @421, @94, @517, @479, @292, @963, @376, @981, @480, @39, @257, @272, @157, @5, @316, @395, @787, @942, @456, @242, @759, @898, @576, @67, @298, @425, @894, @435, @831, @241, @989, @614, @987, @770, @384, @692, @698, @765, @331, @487, @251, @600, @879, @342, @982, @527, @736, @795, @585, @40, @54, @901, @408, @359, @577, @237, @605, @847, @353, @968, @832, @205, @838, @427, @876, @959, @686, @646, @835, @127, @621, @892, @443, @198, @988, @791, @466, @23, @707, @467, @33, @670, @921, @180, @991, @396, @160, @436, @717, @918, @8, @374, @101, @684, @727, @749];
NSLog(@""); NSLog(@"Example 2:"); printBins(limits, bins(limits, data)); } return 0;
}</lang>
- Output:
Example 1:
< 23: 11
>= 23 and < 37: 4
>= 37 and < 43: 2
>= 43 and < 53: 6
>= 53 and < 67: 9
>= 67 and < 83: 5
>= 83 : 13

Example 2:
< 14: 3
>= 14 and < 18: 0
>= 18 and < 249: 44
>= 249 and < 312: 10
>= 312 and < 389: 16
>= 389 and < 392: 2
>= 392 and < 513: 28
>= 513 and < 591: 16
>= 591 and < 634: 6
>= 634 and < 720: 16
>= 720 : 59
Perl
Borrowed bisect_right from Julia entry. <lang perl>use strict; use warnings; no warnings 'uninitialized'; use feature 'say'; use experimental 'signatures'; use constant Inf => 1e10;
my @tests = (
{ limits => [23, 37, 43, 53, 67, 83], data => [ 95,21,94,12,99,4,70,75,83,93,52,80,57, 5,53,86,65,17,92,83,71,61,54,58,47, 16, 8, 9,32,84,7,87,46,19,30,37,96, 6,98,40,79,97,45,64,60,29,49,36,43,55 ] }, { limits => [14, 18, 249, 312, 389, 392, 513, 591, 634, 720], data => [ 445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933, 416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749 ] }
);
sub bisect_right ($x, $low, $high, @array) {
    my ($middle);
    while ($low < $high) {
        $middle = int(($low + $high) / 2);   # integer midpoint, so indices stay whole numbers
        $x < $array[$middle] ? $high = $middle : ($low = $middle + 1)
    }
    $low-1
}

sub bin_it ($limits, $data) {
    my @bins;
    ++$bins[ bisect_right($_, 0, @$limits-1, @$limits) ] for @$data;
    @bins
}

sub bin_format ($limits, @bins) {
    my @lim = @$limits;
    my(@formatted);
    push @formatted, sprintf "[%3d, %3d) => %3d\n", $lim[$_], ($lim[$_+1] == Inf ? 'Inf' : $lim[$_+1]), $bins[$_] for 0..@lim-2;
    @formatted
}

for (0..$#tests) {
    my @limits = (0, @{$tests[$_]{limits}}, Inf);
    say bin_format \@limits, bin_it(\@limits,\@{$tests[$_]{data}});
}</lang>
- Output:
[ 0, 23) => 11
[ 23, 37) => 4
[ 37, 43) => 2
[ 43, 53) => 6
[ 53, 67) => 9
[ 67, 83) => 5
[ 83, Inf) => 13

[ 0, 14) => 3
[ 14, 18) => 0
[ 18, 249) => 44
[249, 312) => 10
[312, 389) => 16
[389, 392) => 2
[392, 513) => 28
[513, 591) => 16
[591, 634) => 6
[634, 720) => 16
[720, Inf) => 59
But if we were to take to heart the warning that the input data was scary-big, then perhaps using a more efficient routine to classify the data into bins would be prudent (boilerplate/input/output same as above). <lang perl>use Math::SimpleHisto::XS;
for (@tests) {
my @lim = (0, @{$$_{limits}}, Inf); my $hist = Math::SimpleHisto::XS->new( bins => \@lim ); $hist->fill( \$$_{data}->@* ); my $data_bins = $hist->all_bin_contents; printf "[%3d, %3d) => %3d\n", $lim[$_], ($lim[$_+1] == Inf ? 'Inf' : $lim[$_+1]), $$data_bins[$_] for 0..@lim-2; print "\n";
}</lang>
Phix
<lang Phix>function bin_it(sequence limits, data)
    -- Bin data according to (ascending) limits.
    sequence bins = repeat(0,length(limits)+1) -- adds under/over range bins too
    for i=1 to length(data) do
        integer bdx = binary_search(data[i],limits)
        bins[abs(bdx)+(bdx>0)] += 1
    end for
    return bins
end function

procedure bin_print(sequence limits, bins)
    printf(1," < %3d := %3d\n",{limits[1],bins[1]})
    for i=2 to length(limits) do
        printf(1,">= %3d and < %3d := %3d\n",{limits[i-1],limits[i],bins[i]})
    end for
    printf(1,">= %3d := %3d\n\n",{limits[$],bins[$]})
end procedure

sequence limits, data printf(1,"Example 1:\n") limits = {23, 37, 43, 53, 67, 83} data = {95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47,
16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55}
bin_print(limits, bin_it(limits, data))
printf(1,"Example 2:\n") limits = {14, 18, 249, 312, 389, 392, 513, 591, 634, 720} data = {445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,
416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749}
bin_print(limits, bin_it(limits, data))</lang>
- Output:
Example 1:
< 23 := 11
>= 23 and < 37 := 4
>= 37 and < 43 := 2
>= 43 and < 53 := 6
>= 53 and < 67 := 9
>= 67 and < 83 := 5
>= 83 := 13

Example 2:
< 14 := 3
>= 14 and < 18 := 0
>= 18 and < 249 := 44
>= 249 and < 312 := 10
>= 312 and < 389 := 16
>= 389 and < 392 := 2
>= 392 and < 513 := 28
>= 513 and < 591 := 16
>= 591 and < 634 := 6
>= 634 and < 720 := 16
>= 720 := 59
Python
This example uses binary search through the limits to assign each number to its bin, via standard module bisect.bisect_right.
The Counter module is not used as the number of bins is known allowing faster array access for incrementing bin counts versus dict lookup.
<lang python>from bisect import bisect_right
def bin_it(limits: list, data: list) -> list:
    "Bin data according to (ascending) limits."
    bins = [0] * (len(limits) + 1)      # adds under/over range bins too
    for d in data:
        bins[bisect_right(limits, d)] += 1
    return bins

def bin_print(limits: list, bins: list) -> None:
    print(f" < {limits[0]:3} := {bins[0]:3}")
    for lo, hi, count in zip(limits, limits[1:], bins[1:]):
        print(f">= {lo:3} .. < {hi:3} := {count:3}")
    print(f">= {limits[-1]:3} := {bins[-1]:3}")

if __name__ == "__main__":
print("RC FIRST EXAMPLE\n") limits = [23, 37, 43, 53, 67, 83] data = [95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47, 16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55] bins = bin_it(limits, data) bin_print(limits, bins)
print("\nRC SECOND EXAMPLE\n") limits = [14, 18, 249, 312, 389, 392, 513, 591, 634, 720] data = [445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933, 416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749] bins = bin_it(limits, data) bin_print(limits, bins)</lang>
- Output:
RC FIRST EXAMPLE

< 23 := 11
>= 23 .. < 37 := 4
>= 37 .. < 43 := 2
>= 43 .. < 53 := 6
>= 53 .. < 67 := 9
>= 67 .. < 83 := 5
>= 83 := 13

RC SECOND EXAMPLE

< 14 := 3
>= 14 .. < 18 := 0
>= 18 .. < 249 := 44
>= 249 .. < 312 := 10
>= 312 .. < 389 := 16
>= 389 .. < 392 := 2
>= 392 .. < 513 := 28
>= 513 .. < 591 := 16
>= 591 .. < 634 := 6
>= 634 .. < 720 := 16
>= 720 := 59
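Two properties of the approach described in this entry can be checked with a short sketch: `bisect_right` (rather than `bisect_left`) places a value equal to a limit in the bin to its right, and because the data is traversed only once, a generator works as well as a list. The function is repeated here so the example is self-contained; the six-value stream is made-up illustration data, not part of the task.

```python
from bisect import bisect_left, bisect_right

def bin_it(limits, data):
    """Same counting loop as in the entry; data may be any iterable."""
    bins = [0] * (len(limits) + 1)
    for d in data:
        bins[bisect_right(limits, d)] += 1
    return bins

limits = [23, 37, 43, 53, 67, 83]

# A value equal to a limit: bisect_right gives the bin whose lower bound it is,
# bisect_left would put it one bin too low for the task's ">=" rule.
right = bisect_right(limits, 37)   # 2
left = bisect_left(limits, 37)     # 1

# A generator stands in for a stream that is never held in memory as a whole.
stream = (x for x in [22, 23, 37, 82, 83, 84])
counts = bin_it(limits, stream)
print(left, right, counts)  # 1 2 [1, 1, 1, 0, 0, 1, 2]
```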
R
This is R's bread and butter. Even with only the base library, the only thing stopping us from giving a one-line solution is the task's requirement of using two functions.
Code such as 0:length(limits) is generally considered bad practice, but it didn't cause any problems here. To my amazement, this code works even if limits is of size 0 or 1. Even the <= printing doesn't break! <lang r>limits1<-c(23, 37, 43, 53, 67, 83) data1<-c(95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47,
16,8,9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55)
limits2<-c(14, 18, 249, 312, 389, 392, 513, 591, 634, 720) data2<-c(445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,
416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749)
createBin<-function(limits,data) {
  sapply(0:length(limits),function(x) sum(findInterval(data,limits)==x))
}

#Contains some unicode magic so that we can get the infinity symbol and <= to print nicely.
#Half of the battle here is making sure that we're not being thrown by R being 1-indexed.
#The other half is avoiding the mathematical sin of saying that anything can be >=infinity.
printBin<-function(limits,bin) {
  invisible(sapply(0:length(limits),function(x) cat("Bin",x,"covers the range:",
                                                    if(x==0){"-\U221E < x <"}else(paste(limits[x],"\U2264 x <")),
                                                    if(x==length(limits)){"\U221E"}else(limits[x+1]),
                                                    "and contains",bin[x+1],"elements.\n")))
}

#Showing off a one-line solution. Admittedly, calling the massive anonymous function "one-line" is generous.
oneLine<-function(limits,data) {
  invisible(sapply(0:length(limits),function(x) cat("Bin",x,"covers the range:",
                                                    if(x==0){"-\U221E < x <"}else(paste(limits[x],"\U2264 x <")),
                                                    if(x==length(limits)){"\U221E"}else(limits[x+1]),
                                                    "and contains",sum(findInterval(data,limits)==x),
                                                    "elements.\n")))
}

createBin(limits1,data1)
printBin(limits1,createBin(limits1,data1))
createBin(limits2,data2)
printBin(limits2,createBin(limits2,data2))
oneLine(limits2,c(data1,data2))</lang>
- Output:
> createBin(limits1,data1)
[1] 11 4 2 6 9 5 13
> printBin(limits1,createBin(limits1,data1))
Bin 0 covers the range: -∞ < x < 23 and contains 11 elements.
Bin 1 covers the range: 23 ≤ x < 37 and contains 4 elements.
Bin 2 covers the range: 37 ≤ x < 43 and contains 2 elements.
Bin 3 covers the range: 43 ≤ x < 53 and contains 6 elements.
Bin 4 covers the range: 53 ≤ x < 67 and contains 9 elements.
Bin 5 covers the range: 67 ≤ x < 83 and contains 5 elements.
Bin 6 covers the range: 83 ≤ x < ∞ and contains 13 elements.
> createBin(limits2,data2)
[1] 3 0 44 10 16 2 28 16 6 16 59
> printBin(limits2,createBin(limits2,data2))
Bin 0 covers the range: -∞ < x < 14 and contains 3 elements.
Bin 1 covers the range: 14 ≤ x < 18 and contains 0 elements.
Bin 2 covers the range: 18 ≤ x < 249 and contains 44 elements.
Bin 3 covers the range: 249 ≤ x < 312 and contains 10 elements.
Bin 4 covers the range: 312 ≤ x < 389 and contains 16 elements.
Bin 5 covers the range: 389 ≤ x < 392 and contains 2 elements.
Bin 6 covers the range: 392 ≤ x < 513 and contains 28 elements.
Bin 7 covers the range: 513 ≤ x < 591 and contains 16 elements.
Bin 8 covers the range: 591 ≤ x < 634 and contains 6 elements.
Bin 9 covers the range: 634 ≤ x < 720 and contains 16 elements.
Bin 10 covers the range: 720 ≤ x < ∞ and contains 59 elements.
> oneLine(limits2,c(data1,data2))#Not needed.
Bin 0 covers the range: -∞ < x < 14 and contains 10 elements.
Bin 1 covers the range: 14 ≤ x < 18 and contains 2 elements.
Bin 2 covers the range: 18 ≤ x < 249 and contains 85 elements.
Bin 3 covers the range: 249 ≤ x < 312 and contains 10 elements.
Bin 4 covers the range: 312 ≤ x < 389 and contains 16 elements.
Bin 5 covers the range: 389 ≤ x < 392 and contains 2 elements.
Bin 6 covers the range: 392 ≤ x < 513 and contains 28 elements.
Bin 7 covers the range: 513 ≤ x < 591 and contains 16 elements.
Bin 8 covers the range: 591 ≤ x < 634 and contains 6 elements.
Bin 9 covers the range: 634 ≤ x < 720 and contains 16 elements.
Bin 10 covers the range: 720 ≤ x < ∞ and contains 59 elements.
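The remark that the code survives limits of size 0 or 1 can be reasoned about independently of R. The sketch below is a Python imitation of the shape of `createBin` (compute an interval index per datum, then count how many indices equal each bin number); it uses `bisect_right` as an assumed stand-in for `findInterval` with default arguments, and the three data values are made up for illustration.

```python
from bisect import bisect_right

def create_bin(limits, data):
    """Count, for each bin number 0..len(limits), the data whose interval index matches."""
    intervals = [bisect_right(limits, d) for d in data]
    return [sum(1 for i in intervals if i == x) for x in range(len(limits) + 1)]

data = [5, 23, 40]
print(create_bin([], data))     # [3]     no limits: a single bin holds everything
print(create_bin([23], data))   # [1, 2]  one limit: "< 23" and ">= 23"
```

With no limits every interval index is 0, so the single bin counts all inputs; with one limit the indices are 0 or 1, which gives the two open-ended bins.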
Raku
<lang perl6>
sub bin_it ( @limits, @data ) {
    my @ranges = ( -Inf, |@limits, Inf ).rotor( 2 => -1 ).map: { .[0] ..^ .[1] };
    my @binned = @data.classify(-> $d { @ranges.grep(-> $r { $d ~~ $r }) });
    my %counts = @binned.map: { .key => .value.elems };
    return @ranges.map: { $_ => ( %counts{$_} // 0 ) };
}
sub bin_format ( @bins ) {
    return @bins.map: { .key.gist.fmt('%9s => ') ~ .value.fmt('%2d') };
}

my @tests =
{ limits => (23, 37, 43, 53, 67, 83), data => (95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47,16,8,9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55), }, { limits => (14, 18, 249, 312, 389, 392, 513, 591, 634, 720), data => ( 445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933,416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247,346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97,854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692,698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791,466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749 ), },
;
for @tests -> ( :@limits, :@data ) {
    my @bins = bin_it( @limits, @data );
    .say for bin_format(@bins);
    say '';
}
</lang>
- Output:
-Inf..^23 => 11
23..^37 => 4
37..^43 => 2
43..^53 => 6
53..^67 => 9
67..^83 => 5
83..^Inf => 13

-Inf..^14 => 3
14..^18 => 0
18..^249 => 44
249..^312 => 10
312..^389 => 16
389..^392 => 2
392..^513 => 28
513..^591 => 16
591..^634 => 6
634..^720 => 16
720..^Inf => 59
REXX
<lang rexx>/*REXX program counts how many numbers of a set that fall in the range of each bin. */ lims= 23 37 43 53 67 83 /* ◄■■■■■■1st set of bin limits & data.*/ data= 95 21 94 12 99 4 70 75 83 93 52 80 57 5 53 86 65 17 92 83 71 61 54 58 47 ,
16 8 9 32 84 7 87 46 19 30 37 96 6 98 40 79 97 45 64 60 29 49 36 43 55
call lims lims; call bins data
call show 'the 1st set of bin counts for the specified data:'
say; say; say
lims= 14 18 249 312 389 392 513 591 634 720 /* ◄■■■■■■2nd set of bin limits & data.*/ data= 445 814 519 697 700 130 255 889 481 122 932 77 323 525 570 219 367 523 442 933 ,
416 589 930 373 202 253 775 47 731 685 293 126 133 450 545 100 741 583 763 306 , 655 267 248 477 549 238 62 678 98 534 622 907 406 714 184 391 913 42 560 247 , 346 860 56 138 546 38 985 948 58 213 799 319 390 634 458 945 733 507 916 123 , 345 110 720 917 313 845 426 9 457 628 410 723 354 895 881 953 677 137 397 97 , 854 740 83 216 421 94 517 479 292 963 376 981 480 39 257 272 157 5 316 395 , 787 942 456 242 759 898 576 67 298 425 894 435 831 241 989 614 987 770 384 692 , 698 765 331 487 251 600 879 342 982 527 736 795 585 40 54 901 408 359 577 237 , 605 847 353 968 832 205 838 427 876 959 686 646 835 127 621 892 443 198 988 791 , 466 23 707 467 33 670 921 180 991 396 160 436 717 918 8 374 101 684 727 749
call lims lims; call bins data
call show 'the 2nd set of bin counts for the specified data:'
exit 0                                           /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
bins: parse arg nums; !.= 0; datum= words(nums); wc= length(datum) /*max width count.*/
        do j=1 for datum; x= word(nums, j)
            do k=0 for #                         /*find the bin that this number is in. */
            if x < @.k then do; !.k= !.k + 1; iterate j; end /*bump a bin count*/
            end /*k*/
        !.k= !.k + 1                             /*number is ≥ the highest bin specified*/
        end /*j*/; return
/*──────────────────────────────────────────────────────────────────────────────────────*/
lims: parse arg limList; #= words(limList); wb= 0 /*max width binLim*/
        do j=1 for #; _= j - 1; @._= word(limList, j); wb= max(wb, length(@._) )
        end /*j*/; return
/*──────────────────────────────────────────────────────────────────────────────────────*/
show: parse arg t; say center(t, 51 ); $= left('', 9) /*$: for indentation*/
      say center('', 51, "═")                    /*show title separator.*/
      jp= # - 1; ge= '≥'; le='<'; eq= ' count='
        do j=0 for #; jm= j - 1; bin= right(@.j, wb)
        if j==0 then say $ left('', length(ge) +3+wb+length(..) )le bin eq right(!.j, wc)
                else say $ ge right(@.jm, wb) .. le bin eq right(!.j, wc)
        if j==jp then say $ ge right(@.jp,wb) left('', 3+length(..)+wb) eq right(!.#, wc)
        end /*j*/; return</lang>
- output when using the internal default input:
the 1st set of bin counts for the specified data:
═══════════════════════════════════════════════════
< 23 count= 11
≥ 23 .. < 37 count= 4
≥ 37 .. < 43 count= 2
≥ 43 .. < 53 count= 6
≥ 53 .. < 67 count= 9
≥ 67 .. < 83 count= 5
≥ 83 count= 13

the 2nd set of bin counts for the specified data:
═══════════════════════════════════════════════════
< 14 count= 3
≥ 14 .. < 18 count= 0
≥ 18 .. < 249 count= 44
≥ 249 .. < 312 count= 10
≥ 312 .. < 389 count= 16
≥ 389 .. < 392 count= 2
≥ 392 .. < 513 count= 28
≥ 513 .. < 591 count= 16
≥ 591 .. < 634 count= 6
≥ 634 .. < 720 count= 16
≥ 720 count= 59
Ruby
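For each datum, perform a binary search on the limits to find the first limit greater than it, and keep a tally keyed by that limit (`nil` when no limit is greater). Uses endless and beginless Ranges, which are available from Ruby 2.7 onward. <lang ruby>Test = Struct.new(:limits, :data)
tests = Test.new( [23, 37, 43, 53, 67, 83],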
[95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47, 16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55]), Test.new( [14, 18, 249, 312, 389, 392, 513, 591, 634, 720], [445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933, 416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749])
def bin(limits, data)
data.map{|d| limits.bsearch{|limit| limit > d} }.tally
end
def present_bins(limits, bins)
ranges = ([nil]+limits+[nil]).each_cons(2).map{|low, high| Range.new(low, high, true) } ranges.each{|range| puts "#{range.to_s.ljust(12)} #{bins[range.end].to_i}"}
end
tests.each do |test|
  present_bins(test.limits, bin(test.limits, test.data))
  puts
end </lang>
- Output:
...23        11
23...37      4
37...43      2
43...53      6
53...67      9
67...83      5
83...        13

...14        3
14...18      0
18...249     44
249...312    10
312...389    16
389...392    2
392...513    28
513...591    16
591...634    6
634...720    16
720...       59
Rust
A very simple and naive algorithm that uses nested dynamic arrays.
<lang rust>fn make_bins(limits: &Vec<usize>, data: &Vec<usize>) -> Vec<Vec<usize>> {
    let mut bins: Vec<Vec<usize>> = Vec::with_capacity(limits.len() + 1);
    for _ in 0..=limits.len() {bins.push(Vec::new());}

    limits.iter().enumerate().for_each(|(idx, limit)| {
        data.iter().for_each(|elem| {
            if idx == 0 && elem < limit { bins[0].push(*elem); } // smaller than the smallest limit
            else if idx == limits.len()-1 && elem >= limit { bins[limits.len()].push(*elem); } // greater than or equal to the largest limit
            else if elem < limit && elem >= &limits[idx-1] { bins[idx].push(*elem); } // otherwise
        });
    });

bins
}
fn print_bins(limits: &Vec<usize>, bins: &Vec<Vec<usize>>) {
    for (idx, bin) in bins.iter().enumerate() {
        if idx == 0 {
            println!(" < {:3} := {:3}", limits[idx], bin.len());
        } else if idx == limits.len() {
            println!(">= {:3} := {:3}", limits[idx-1], bin.len());
        } else {
            println!(">= {:3} .. < {:3} := {:3}", limits[idx-1], limits[idx], bin.len());
        }
    }
}
fn main() {
let limits1 = vec![23, 37, 43, 53, 67, 83]; let data1 = vec![95,21,94,12,99,4,70,75,83,93,52,80,57,5,53,86,65,17,92,83,71,61,54,58,47, 16, 8, 9,32,84,7,87,46,19,30,37,96,6,98,40,79,97,45,64,60,29,49,36,43,55];
let limits2 = vec![14, 18, 249, 312, 389, 392, 513, 591, 634, 720]; let data2 = vec![ 445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933, 416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749 ];
    // Why are we calling it RC anyways???
    println!("RC FIRST EXAMPLE");
    let bins1 = make_bins(&limits1, &data1);
    print_bins(&limits1, &bins1);

    println!("\nRC SECOND EXAMPLE");
    let bins2 = make_bins(&limits2, &data2);
    print_bins(&limits2, &bins2);
} </lang>
- Output:
RC FIRST EXAMPLE
< 23 := 11
>= 23 .. < 37 := 4
>= 37 .. < 43 := 2
>= 43 .. < 53 := 6
>= 53 .. < 67 := 9
>= 67 .. < 83 := 5
>= 83 := 13

RC SECOND EXAMPLE
< 14 := 3
>= 14 .. < 18 := 0
>= 18 .. < 249 := 44
>= 249 .. < 312 := 10
>= 312 .. < 389 := 16
>= 389 .. < 392 := 2
>= 392 .. < 513 := 28
>= 513 .. < 591 := 16
>= 591 .. < 634 := 6
>= 634 .. < 720 := 16
>= 720 := 59
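The naive version above stores every element and rescans the whole data set once per limit. As a hedged alternative, the sketch below keeps only counts and finds each bin with a binary search over the limits, using the standard slice method `partition_point` (stable since Rust 1.52). The name `count_bins` and the choice to accept any iterator of `usize` are assumptions made for illustration; they are not part of the solution above.

```rust
/// Count how many values fall into each of the `limits.len() + 1` bins.
/// Assumes `limits` is sorted ascending and unique, as the task states.
/// Only counts are kept, so the input can be a stream that is never stored.
fn count_bins(limits: &[usize], data: impl IntoIterator<Item = usize>) -> Vec<usize> {
    let mut bins = vec![0; limits.len() + 1];
    for value in data {
        // Binary search: the number of limits that are <= value.
        // A value equal to limits[i] therefore lands in bin i + 1.
        let idx = limits.partition_point(|&limit| limit <= value);
        bins[idx] += 1;
    }
    bins
}
```

Called with the first example's limits and data, this should give the same counts as the output above (11, 4, 2, 6, 9, 5, 13). To display them, `print_bins` would need to accept a slice of counts rather than nested vectors.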
Tcl
For Tcl 8.6 (due to lsearch -bisect):
<lang tcl>namespace path {::tcl::mathop ::tcl::mathfunc}

# Not necessary but useful helper
proc lincr {_list index} {
    upvar $_list list
    lset list $index [+ [lindex $list $index] 1]
}

proc distribute_bins {binlims data} {
    set bins [lrepeat [+ [llength $binlims] 1] 0]
    foreach val $data {
        lincr bins [+ [lsearch -exact -integer -sorted -bisect $binlims $val] 1]
    }
    return $bins
}

proc print_bins {binlims bins} {
    set binlims [list -∞ {*}$binlims ∞]
    for {set i 0} {$i < [llength $bins]} {incr i} {
        puts "[lindex $binlims $i]..[lindex $binlims [+ $i 1]]: [lindex $bins $i]"
    }
}

set binlims {23 37 43 53 67 83}
set data {95 21 94 12 99 4 70 75 83 93 52 80 57 5 53 86 65 17 92 83 71 61 54 58 47
16 8 9 32 84 7 87 46 19 30 37 96 6 98 40 79 97 45 64 60 29 49 36 43 55}
print_bins $binlims [distribute_bins $binlims $data]
puts ""
set binlims {14 18 249 312 389 392 513 591 634 720}
set data {445 814 519 697 700 130 255 889 481 122 932 77 323 525 570 219 367 523 442 933
416 589 930 373 202 253 775 47 731 685 293 126 133 450 545 100 741 583 763 306 655 267 248 477 549 238 62 678 98 534 622 907 406 714 184 391 913 42 560 247 346 860 56 138 546 38 985 948 58 213 799 319 390 634 458 945 733 507 916 123 345 110 720 917 313 845 426 9 457 628 410 723 354 895 881 953 677 137 397 97 854 740 83 216 421 94 517 479 292 963 376 981 480 39 257 272 157 5 316 395 787 942 456 242 759 898 576 67 298 425 894 435 831 241 989 614 987 770 384 692 698 765 331 487 251 600 879 342 982 527 736 795 585 40 54 901 408 359 577 237 605 847 353 968 832 205 838 427 876 959 686 646 835 127 621 892 443 198 988 791 466 23 707 467 33 670 921 180 991 396 160 436 717 918 8 374 101 684 727 749}
print_bins $binlims [distribute_bins $binlims $data]</lang>
- Output:
-∞..23: 11
23..37: 4
37..43: 2
43..53: 6
53..67: 9
67..83: 5
83..∞: 13

-∞..14: 3
14..18: 0
18..249: 44
249..312: 10
312..389: 16
389..392: 2
392..513: 28
513..591: 16
591..634: 6
634..720: 16
720..∞: 59
Wren
<lang ecmascript>import "/sort" for Find
import "/fmt" for Fmt

var getBins = Fn.new { |limits, data|
    var n = limits.count
    var bins = List.filled(n+1, 0)
    for (d in data) {
        var res = Find.all(limits, d) // uses binary search
        var found = res[0]
        var index = res[2].from
        if (found) index = index + 1
        bins[index] = bins[index] + 1
    }
    return bins
}
var printBins = Fn.new { |limits, bins|
    for (i in 0..limits.count) {
        if (i == 0) {
            Fmt.print(" < $3d = $2d", limits[0], bins[0])
        } else if (i == limits.count) {
            Fmt.print(">= $3d = $2d", limits[i-1], bins[i])
        } else {
            Fmt.print(">= $3d and < $3d = $2d", limits[i-1], limits[i], bins[i])
        }
    }
    System.print()
}
var limitsList = [
[23, 37, 43, 53, 67, 83], [14, 18, 249, 312, 389, 392, 513, 591, 634, 720]
]
var dataList = [
[ 95,21,94,12,99,4,70,75,83,93,52,80,57, 5,53,86,65,17,92,83,71,61,54,58,47, 16, 8, 9,32,84,7,87,46,19,30,37,96, 6,98,40,79,97,45,64,60,29,49,36,43,55 ], [ 445,814,519,697,700,130,255,889,481,122,932, 77,323,525,570,219,367,523,442,933, 416,589,930,373,202,253,775, 47,731,685,293,126,133,450,545,100,741,583,763,306, 655,267,248,477,549,238, 62,678, 98,534,622,907,406,714,184,391,913, 42,560,247, 346,860, 56,138,546, 38,985,948, 58,213,799,319,390,634,458,945,733,507,916,123, 345,110,720,917,313,845,426, 9,457,628,410,723,354,895,881,953,677,137,397, 97, 854,740, 83,216,421, 94,517,479,292,963,376,981,480, 39,257,272,157, 5,316,395, 787,942,456,242,759,898,576, 67,298,425,894,435,831,241,989,614,987,770,384,692, 698,765,331,487,251,600,879,342,982,527,736,795,585, 40, 54,901,408,359,577,237, 605,847,353,968,832,205,838,427,876,959,686,646,835,127,621,892,443,198,988,791, 466, 23,707,467, 33,670,921,180,991,396,160,436,717,918, 8,374,101,684,727,749 ]
]
for (i in 0...limitsList.count) {
    System.print("Example %(i+1):\n")
    var bins = getBins.call(limitsList[i], dataList[i])
    printBins.call(limitsList[i], bins)
}</lang>
- Output:
Example 1:

< 23 = 11
>= 23 and < 37 = 4
>= 37 and < 43 = 2
>= 43 and < 53 = 6
>= 53 and < 67 = 9
>= 67 and < 83 = 5
>= 83 = 13

Example 2:

< 14 = 3
>= 14 and < 18 = 0
>= 18 and < 249 = 44
>= 249 and < 312 = 10
>= 312 and < 389 = 16
>= 389 and < 392 = 2
>= 392 and < 513 = 28
>= 513 and < 591 = 16
>= 591 and < 634 = 6
>= 634 and < 720 = 16
>= 720 = 59