Search a list of records: Difference between revisions

From Rosetta Code
Content added Content deleted
m (Oops, was already a J implementation for that...)
Line 89: Line 89:


But this doesn't seem any clearer or more concise than our previous expression which finds the index of the first example of the most populous city with a population less than five million. Not only that, but if there were multiple cities which had the same population number which satisfied this constraint, this version would return all of those indices where the task explicitly required we return the first example.
But this doesn't seem any clearer or more concise than our previous expression which finds the index of the first example of the most populous city with a population less than five million. Not only that, but if there were multiple cities which had the same population number which satisfied this constraint, this version would return all of those indices where the task explicitly required we return the first example.

=== J: Another approach ====


<lang j> city=: <;._1 ';Lagos;Cairo;Kinshasa-Brazzaville;Greater Johannesburg;Mogadishu;Khartoum-Omdurman;Dar Es Salaam;Alexandria;Abidjan;Casablanca'
<lang j> city=: <;._1 ';Lagos;Cairo;Kinshasa-Brazzaville;Greater Johannesburg;Mogadishu;Khartoum-Omdurman;Dar Es Salaam;Alexandria;Abidjan;Casablanca'

Revision as of 06:01, 11 October 2015

Search a list of records is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Given a vector type (such as list or array), which may be homogeneous or heterogeneous, search for something. Demonstrate by searching for at least two types of things (which may require multiple examples).

For one of your examples, you should:

  1. Show an ordered list or array in which each element has two properties: one string and one numeric. For example, use the names and populations of the 10 most populous municipalities in Africa, as reported in https://en.wikipedia.org/wiki/List_of_metropolitan_areas_in_Africa
  2. Return the index (in this list/array) of a particular city as returned by a search on its name
  3. Return the name of the first city with a population exceeding a particular threshold.

For example, from an array of the African municipality data (population in millions), shown here in a JSON format (but use whatever representation is most idiomatic in your language):

<lang JavaScript>[{"name":"Lagos", "population":21}, {"name":"Cairo", "population":15.2}, {"name":"Kinshasa-Brazzaville", "population":11.3}, {"name":"Greater Johannesburg", "population":7.55}, {"name":"Mogadishu", "population":5.85}, {"name":"Khartoum-Omdurman", "population":4.98}, {"name":"Dar Es Salaam", "population":4.7}, {"name":"Alexandria", "population":4.58}, {"name":"Abidjan", "population":4.4}, {"name":"Casablanca", "population":3.98}]</lang>

Show code which uses general search functions to return:

  1. . The (zero-based) index of Dar Es Salaam in the list,
  2. . The name of the first city in this list (sorted by descending size of municipality) which has a population of less than 5 million.

The general search function(s) which you demonstrate should normally take at least two arguments:

  1. The vector (array/list) itself
  2. A predicate function which, applied to an element in the list/array, returns true if that element matches the search requirement.

( If this is not the approach which would be most natural or idiomatic in your language, explain why, and show an alternative, but equally general approach to obtaining the same results )


J

J supports several "searching" primitives.

i. finds the indices of the things being looked for (with 0 being the first index). Nonmatches get a result of 1+largest valid index.

<lang j> 1 2 3 4 5 6 7 8 9 i. 2 3 5 60 1 2 4 9

  (;:'one two three four five six seven eight nine') i. ;:'two three five sixty'

1 2 4 9</lang>

e. finds whether items are members of a set, returning a bitmask to select the members:

<lang j> 1 2 3 4 5 6 7 8 9 e. 2 3 5 60 0 1 1 0 1 0 0 0 0

  (;:'one two three four five six seven eight nine') e. ;:'two three five sixty'

0 1 1 0 1 0 0 0 0</lang>

I. finds indices, but performs a binary search (which requires that the list being searched is sorted). This can be useful for finding non-exact matches (the index of the next value is returned for non-exact matches).

<lang j> 1 2 3 4 5 6 7 8 9 I. 2 3 5 60 6.66 1 2 4 9 6

  (;:'eight five four nine one seven six three two') I. ;:'two three five sixty'

8 7 1 7</lang>

And, for the tabular example in the current task description, here is the data we will be using:

<lang J>colnumeric=: 0&".&.>@{`[`]}

data=: 1 colnumeric |: fixcsv 0 :0 Lagos, 21 Cairo, 15.2 Kinshasa-Brazzaville, 11.3 Greater Johannesburg, 7.55 Mogadishu, 5.85 Khartoum-Omdurman, 4.98 Dar Es Salaam, 4.7 Alexandria, 4.58 Abidjan, 4.4 Casablanca, 3.98 )</lang>

And here are the required computations:

<lang J> (0 { data) i. <'Dar Es Salaam' 6

  (i. >./)@(* 5&>)@:>@{: data

5

  5 {:: 0 {data

Khartoum-Omdurman</lang>

The "general search function" mentioned in the task does not seem a natural fit for this set of data, because of the multi-column nature of this data. Nevertheless, we could for example define:

<lang j>gsf=: 1 :0

  I. u x { y

)</lang>

This uses the single argument aspect of the definition of I. to convert a bit mask to the corresponding sequence of indices. And the column(s) we are searching on are exposed as a parameter for the interface, which allows us to ignore (for this problem) the irrelevant columns...

Thus, we could say:

<lang J> 1 (= >./)@(* 5&>)@:> gsf data 5</lang>

But this doesn't seem any clearer or more concise than our previous expression which finds the index of the first example of the most populous city with a population less than five million. Not only that, but if there were multiple cities which had the same population number which satisfied this constraint, this version would return all of those indices where the task explicitly required we return the first example.

J: Another approach =

<lang j> city=: <;._1 ';Lagos;Cairo;Kinshasa-Brazzaville;Greater Johannesburg;Mogadishu;Khartoum-Omdurman;Dar Es Salaam;Alexandria;Abidjan;Casablanca'

  popln=: 21 15.2 11.3 7.55 5.85 4.98 4.7 4.58 4.4 3.98
  city i. <'Dar Es Salaam'            NB. index of Dar Es Salaam

6

  city {~ (popln < 5) {.@# \: popln   NB. name of first city with population less than 5 million

┌─────────────────┐ │Khartoum-Omdurman│ └─────────────────┘</lang>

JavaScript

ES5

For arrays containing simple string and numeric datatypes, JavaScript provides Array.toIndex().

Note that JS uses two different equality operators for simple types:

  1. Under Abstract equality, 3.0 == '3' -> true (http://www.ecma-international.org/ecma-262/5.1/#sec-9.12)
  2. Under Strict equality, 3.0 === '3' -> false http://www.ecma-international.org/ecma-262/5.1/#sec-11.9.6

and Array.toIndex matches only on Strict equality. Therefore:

<lang JavaScript>(function () {

 var blnAbstractEquality = (3.0 == '3'), // true  http://www.ecma-international.org/ecma-262/5.1/#sec-9.12
     blnStrictEquality = (3.0 === '3'); // false;  http://www.ecma-international.org/ecma-262/5.1/#sec-11.9.6
 var lstNumerics = [1, 1.0, '1', 2, 2.0, '2', 3, 3.0, '3'];
 return [
   blnAbstractEquality,
   blnStrictEquality,
   lstNumerics.indexOf(3.0),
   lstNumerics.indexOf('3')
 ]

})();</lang>

Returns:

<lang JavaScript>[

 true,
 false,
 6,
 8

]</lang>

Strict Equality does not, however, return true for two instances of objects which match identically in terms of their keys and values. This means that Array.indexOf() will alway return a -1 value (not found) in searches for objects other than simple strings and numbers.

To find more complex objects in a JS ES5 array, we can define find(), and findIndex(), which take predicate functions (defining the match required) as arguments:

The following code:

<lang JavaScript>(function (fnNameMatch, fnPopulationMatch) {

   function find(fnPredicate, list) {
     for (var i = 0, lng = list.length; i < lng; i++) {
       if (fnPredicate(list[i])) {
         return list[i];
       }
     }
     return undefined;
   };
   function findIndex(fnPredicate, list) {
     for (var i = 0, lng = list.length; i < lng; i++) {
       if (fnPredicate(list[i])) {
         return i;
       }
     }
     return undefined;
   };
   var lstCities = [{
     "name": "Lagos",
     "population": 21
   }, {
     "name": "Cairo",
     "population": 15.2
   }, {
     "name": "Kinshasa-Brazzaville",
     "population": 11.3
   }, {
     "name": "Greater Johannesburg",
     "population": 7.55
   }, {
     "name": "Mogadishu",
     "population": 5.85
   }, {
     "name": "Khartoum-Omdurman",
     "population": 4.98
   }, {
     "name": "Dar Es Salaam",
     "population": 4.7
   }, {
     "name": "Alexandria",
     "population": 4.58
   }, {
     "name": "Abidjan",
     "population": 4.4
   }, {
     "name": "Casablanca",
     "population": 3.98
 }];
   return [
     lstCities.indexOf({
       "name": "Alexandria",
       "population": 4.58
     }),
     find(fnNameMatch, lstCities),
     findIndex(fnNameMatch, lstCities),
     find(fnPopulationMatch, lstCities),
     findIndex(fnPopulationMatch, lstCities)
   ]
 })(
   function (e) {
     return e.name && e.name.toLowerCase() === 'dar es salaam';
   },
   function (e) {
     return e.population && e.population < 5.0;
   }
 );

})();</lang>

returns: <lang JavaScript>[

 -1, // the Alexandria object can not be found with Array.indexOf()
 {
   "name": "Dar Es Salaam",
   "population": 4.7
 },
 6,
 {
   "name": "Khartoum-Omdurman",
   "population": 4.98
 },
 5

]</lang>

ES6

In ES6, Array.find() and Array.findIndex() are provided as built-in methods.

Perl

<lang perl>if(grep $_ eq $needle, @haystack) { print 'Found'; }</lang>

PHP

<lang PHP>echo in_array('needle', ['hay', 'stack']) ? 'Found' : 'Not found';</lang>

zkl

<lang zkl>list:=T(SD("name","Lagos", "population",21), SD("name","Cairo", "population",15.2), SD("name","Kinshasa-Brazzaville", "population",11.3), SD("name","Greater Johannesburg", "population",7.55), SD("name","Mogadishu", "population",5.85), SD("name","Khartoum-Omdurman", "population",4.98), SD("name","Dar Es Salaam", "population",4.7), SD("name","Alexandria", "population",4.58), SD("name","Abidjan", "population",4.4), SD("name","Casablanca", "population",3.98));

n:=list.filter1n(fcn(city){ city["name"]=="Dar Es Salaam" });

  // or city["name"].matches("dar es salaam") for case insensitive wild card match

n.println(" == index of ",list[n].values);

city:=list.filter1(fcn(city){ city["population"]<5 }); city.values.println(" is the first city with population under 5 million.");</lang> where a SD is a small read only dictionary and filter1 is a filter that stops at the first match (returning the matched item), filter1N is the same but returns the index. Both filter methods return False on failure.

Note that the dictionarys contain either int or float values and the filter could use either int or float (in this case, the 5 is converted to the type in the dictionary).

The YAJL library could be used to parse the JSON data directly.

Output:
6 == index of L("Dar Es Salaam",4.7)
L("Khartoum-Omdurman",4.98) is the first city with population under 5 million.