Search a list of records: Difference between revisions

From Rosetta Code
Content added Content deleted
m (→‎{{header|zkl}}: yet more verbage)
Line 206: Line 206:
where a SD is a small read only dictionary and filter1 is a filter that stops at the first match (returning the matched item), filter1N is the same but returns the index. Both filter methods return False on failure.
where a SD is a small read only dictionary and filter1 is a filter that stops at the first match (returning the matched item), filter1N is the same but returns the index. Both filter methods return False on failure.


Note that the dictionarys contain either int or float values and the filter could use either int or float.
Note that the dictionarys contain either int or float values and the filter could use either int or float (in this case, the 5 is converted to the type in the dictionary).


The YAJL library could be used to parse the JSON data directly.
The YAJL library could be used to parse the JSON data directly.

Revision as of 03:39, 11 October 2015

Search a list of records is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Given a vector type (such as list or array), which may be homogeneous or heterogeneous, search for something. Demonstrate by searching for at least two types of things (which may require multiple examples).

For one of your examples, you should:

  1. Show an ordered list or array in which each element has two properties: one string and one numeric. For example, use the names and populations of the 10 most populous municipalities in Africa, as reported in https://en.wikipedia.org/wiki/List_of_metropolitan_areas_in_Africa
  2. Return the index (in this list/array) of a particular city as returned by a search on its name
  3. Return the name of the first city with a population exceeding a particular threshold.

For example, from an array of the African municipality data (population in millions), shown here in a JSON format (but use whatever representation is most idiomatic in your language):

<lang JavaScript>[{"name":"Lagos", "population":21}, {"name":"Cairo", "population":15.2}, {"name":"Kinshasa-Brazzaville", "population":11.3}, {"name":"Greater Johannesburg", "population":7.55}, {"name":"Mogadishu", "population":5.85}, {"name":"Khartoum-Omdurman", "population":4.98}, {"name":"Dar Es Salaam", "population":4.7}, {"name":"Alexandria", "population":4.58}, {"name":"Abidjan", "population":4.4}, {"name":"Casablanca", "population":3.98}]</lang>

Show code which uses general search functions to return:

  1. . The (zero-based) index of Dar Es Salaam in the list,
  2. . The name of the first city in this list (sorted by descending size of municipality) which has a population of less than 5 million.

The general search function(s) which you demonstrate should normally take at least two arguments:

  1. The vector (array/list) itself
  2. A predicate function which, applied to an element in the list/array, returns true if that element matches the search requirement.

( If this is not the approach which would be most natural or idiomatic in your language, explain why, and show an alternative, but equally general approach to obtaining the same results )


J

J supports several "searching" primitives.

i. finds the indices of the things being looked for (with 0 being the first index). Nonmatches get a result of 1+largest valid index.

<lang j> 1 2 3 4 5 6 7 8 9 i. 2 3 5 60 1 2 4 9

  (;:'one two three four five six seven eight nine') i. ;:'two three five sixty'

1 2 4 9</lang>

e. finds whether items are members of a set, returning a bitmask to select the members:

<lang j> 1 2 3 4 5 6 7 8 9 e. 2 3 5 60 0 1 1 0 1 0 0 0 0

  (;:'one two three four five six seven eight nine') e. ;:'two three five sixty'

0 1 1 0 1 0 0 0 0</lang>

I. finds indices, but performs a binary search (which requires that the list being searched is sorted). This can be useful for finding non-exact matches (the index of the next value is returned for non-exact matches).

<lang j> 1 2 3 4 5 6 7 8 9 I. 2 3 5 60 6.66 1 2 4 9 6

  (;:'eight five four nine one seven six three two') I. ;:'two three five sixty'

8 7 1 7</lang>


JavaScript

ES5

For arrays containing simple string and numeric datatypes, JavaScript provides Array.toIndex().

Note that JS uses two different equality operators for simple types:

  1. Under Abstract equality, 3.0 == '3' -> true (http://www.ecma-international.org/ecma-262/5.1/#sec-9.12)
  2. Under Strict equality, 3.0 === '3' -> false http://www.ecma-international.org/ecma-262/5.1/#sec-11.9.6

and Array.toIndex matches only on Strict equality. Therefore:

<lang JavaScript>(function () {

 var blnAbstractEquality = (3.0 == '3'), // true  http://www.ecma-international.org/ecma-262/5.1/#sec-9.12
     blnStrictEquality = (3.0 === '3'); // false;  http://www.ecma-international.org/ecma-262/5.1/#sec-11.9.6
 var lstNumerics = [1, 1.0, '1', 2, 2.0, '2', 3, 3.0, '3'];
 return [
   blnAbstractEquality,
   blnStrictEquality,
   lstNumerics.indexOf(3.0),
   lstNumerics.indexOf('3')
 ]

})();</lang>

Returns:

<lang JavaScript>[

 true,
 false,
 6,
 8

]</lang>

Strict Equality does not, however, return true for two instances of objects which match identically in terms of their keys and values. This means that Array.indexOf() will alway return a -1 value (not found) in searches for objects other than simple strings and numbers.

To find more complex objects in a JS ES5 array, we can define find(), and findIndex(), which take predicate functions (defining the match required) as arguments:

The following code:

<lang JavaScript>(function (fnNameMatch, fnPopulationMatch) {

   function find(fnPredicate, list) {
     for (var i = 0, lng = list.length; i < lng; i++) {
       if (fnPredicate(list[i])) {
         return list[i];
       }
     }
     return undefined;
   };
   function findIndex(fnPredicate, list) {
     for (var i = 0, lng = list.length; i < lng; i++) {
       if (fnPredicate(list[i])) {
         return i;
       }
     }
     return undefined;
   };
   var lstCities = [{
     "name": "Lagos",
     "population": 21
   }, {
     "name": "Cairo",
     "population": 15.2
   }, {
     "name": "Kinshasa-Brazzaville",
     "population": 11.3
   }, {
     "name": "Greater Johannesburg",
     "population": 7.55
   }, {
     "name": "Mogadishu",
     "population": 5.85
   }, {
     "name": "Khartoum-Omdurman",
     "population": 4.98
   }, {
     "name": "Dar Es Salaam",
     "population": 4.7
   }, {
     "name": "Alexandria",
     "population": 4.58
   }, {
     "name": "Abidjan",
     "population": 4.4
   }, {
     "name": "Casablanca",
     "population": 3.98
 }];
   return [
     lstCities.indexOf({
       "name": "Alexandria",
       "population": 4.58
     }),
     find(fnNameMatch, lstCities),
     findIndex(fnNameMatch, lstCities),
     find(fnPopulationMatch, lstCities),
     findIndex(fnPopulationMatch, lstCities)
   ]
 })(
   function (e) {
     return e.name && e.name.toLowerCase() === 'dar es salaam';
   },
   function (e) {
     return e.population && e.population < 5.0;
   }
 );

})();</lang>

returns: <lang JavaScript>[

 -1, // the Alexandria object can not be found with Array.indexOf()
 {
   "name": "Dar Es Salaam",
   "population": 4.7
 },
 6,
 {
   "name": "Khartoum-Omdurman",
   "population": 4.98
 },
 5

]</lang>

ES6

In ES6, Array.find() and Array.findIndex() are provided as built-in methods.

Perl

<lang perl>if(grep $_ eq $needle, @haystack) { print 'Found'; }</lang>

PHP

<lang PHP>echo in_array('needle', ['hay', 'stack']) ? 'Found' : 'Not found';</lang>

zkl

<lang zkl>list:=T(SD("name","Lagos", "population",21), SD("name","Cairo", "population",15.2), SD("name","Kinshasa-Brazzaville", "population",11.3), SD("name","Greater Johannesburg", "population",7.55), SD("name","Mogadishu", "population",5.85), SD("name","Khartoum-Omdurman", "population",4.98), SD("name","Dar Es Salaam", "population",4.7), SD("name","Alexandria", "population",4.58), SD("name","Abidjan", "population",4.4), SD("name","Casablanca", "population",3.98));

n:=list.filter1n(fcn(city){ city["name"]=="Dar Es Salaam" }); n.println(" == index of ",list[n].values);

city:=list.filter1(fcn(city){ city["population"]<5 }); city.values.println(" is the first city with population under 5 million.");</lang> where a SD is a small read only dictionary and filter1 is a filter that stops at the first match (returning the matched item), filter1N is the same but returns the index. Both filter methods return False on failure.

Note that the dictionarys contain either int or float values and the filter could use either int or float (in this case, the 5 is converted to the type in the dictionary).

The YAJL library could be used to parse the JSON data directly.

Output:
6 == index of L("Dar Es Salaam",4.7)
L("Khartoum-Omdurman",4.98) is the first city with population under 5 million.