Search a list of records

Many programming languages provide convenient ways to look for a known value in a simple list of strings or list of numbers.
But what if the elements of the list are themselves compound records/objects/data-structures, and the search condition is more complex than a simple equality test?

Task

Write a function/method/etc. that can find the first element in a given list of records/objects matching a given condition.
It should be as generic and reusable as possible.
(Of course if your programming language already provides such a feature, you can use that instead of recreating it.)

Then to demonstrate its functionality, create the data structure specified below, and perform the following searches on it:

Find the (zero-based) index of the first city in the list whose name is "Dar Es Salaam".
Find the name of the first city in this list whose population is less than 5 million.
Find the population of the first city in this list whose name starts with the letter "A".

Data set

The data structure to be used contains the names and populations of the 10 largest metropolitan areas in Africa, and looks as follows when represented in JSON:

<lang JavaScript>[

 { "name": "Lagos",                "population": 21.0  },
 { "name": "Cairo",                "population": 15.2  },
 { "name": "Kinshasa-Brazzaville", "population": 11.3  },
 { "name": "Greater Johannesburg", "population":  7.55 },
 { "name": "Mogadishu",            "population":  5.85 },
 { "name": "Khartoum-Omdurman",    "population":  4.98 },
 { "name": "Dar Es Salaam",        "population":  4.7  },
 { "name": "Alexandria",           "population":  4.58 },
 { "name": "Abidjan",              "population":  4.4  },
 { "name": "Casablanca",           "population":  3.98 }

]</lang>

However, you shouldn't parse it from JSON, but rather represent it natively in your programming language.

The top-level data structure should be an ordered collection (i.e. a list, array, vector, or similar).
Each element in this list should be an associative collection that maps from keys to values (i.e. a struct, object, hash map, dictionary, or similar).
Each of them has two entries: One string value with key "name", and one numeric value with key "population".

If any of that is impossible or unreasonable in your programming language, then feel free to deviate, as long as you explain it in a comment above your solution.

Guidance

If your programming language supports higher-order programming, then the most elegant way to implement the requested functionality in a generic and reusable way might be to write a function (perhaps called "find_index" or similar) that takes two arguments:

The list to search through.
A function/lambda/closure (the so-called "predicate"), which will be applied in turn to each element in the list, and whose boolean return value defines whether that element matches the search requirement.

If this is not the approach which would be most natural or idiomatic in your language, explain why, and show what is.
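
As a rough illustration of the approach described above, here is a minimal JavaScript sketch. The helper name `findIndex`, its argument order, and the convention of returning -1 when nothing matches are assumptions made for this example; the task does not prescribe them.

```javascript
// Hypothetical helper (name and -1 convention are assumptions):
// returns the zero-based index of the first element for which
// predicate(element) is truthy, or -1 if no element matches.
function findIndex(list, predicate) {
  for (var i = 0; i < list.length; i++) {
    if (predicate(list[i])) {
      return i;
    }
  }
  return -1;
}

// The task's data set, represented natively.
var cities = [
  { name: "Lagos",                population: 21.0 },
  { name: "Cairo",                population: 15.2 },
  { name: "Kinshasa-Brazzaville", population: 11.3 },
  { name: "Greater Johannesburg", population: 7.55 },
  { name: "Mogadishu",            population: 5.85 },
  { name: "Khartoum-Omdurman",    population: 4.98 },
  { name: "Dar Es Salaam",        population: 4.7 },
  { name: "Alexandria",           population: 4.58 },
  { name: "Abidjan",              population: 4.4 },
  { name: "Casablanca",           population: 3.98 }
];

// The three searches from the task description.
var darIndex = findIndex(cities, function (c) {
  return c.name === "Dar Es Salaam";
});
var under5 = cities[findIndex(cities, function (c) {
  return c.population < 5;
})].name;
var startsWithA = cities[findIndex(cities, function (c) {
  return c.name.charAt(0) === "A";
})].population;

console.log(darIndex);    // 6
console.log(under5);      // Khartoum-Omdurman
console.log(startsWithA); // 4.58
```

The second and third searches index straight into the list, so they assume a match exists; a more careful caller would check for -1 first.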

ALGOL 68

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
The second test-case has been clarified to no longer require sort.

A third test-case has been added.

Note that the data for the task is already sorted in descending order of population, so the sort shown here isn't strictly necessary; in general, however, we couldn't rely on that.
<lang algol68># Algol 68 doesn't have generic array searches but we can easily provide #
# type specific ones #

# mode to hold the city/population info #
MODE CITYINFO = STRUCT( STRING name, REAL population in millions );

# array of cities and populations #

[ 1 : 10 ]CITYINFO cities := ( ( "Lagos", 21.0 )

                            , ( "Cairo",                15.2 )
                            , ( "Kinshasa-Brazzaville", 11.3 )
                            , ( "Greater Johannesburg", 7.55 )
                            , ( "Mogadishu",            5.85 )
                            , ( "Khartoum-Omdurman",    4.98 )
                            , ( "Dar Es Salaam",        4.7  )
                            , ( "Alexandria",           4.58 )
                            , ( "Abidjan",              4.4  )
                            , ( "Casablanca",           3.98 )
                            );

# operator to find the first city with the specified criteria, expressed as a procedure #
# returns the index of the CITYINFO. We can also overload FIND so it can be applied to #
# arrays of other types #
# If there is no city matching the criteria, a value greater than the upper bound of #
# the cities array is returned #

PRIO FIND = 1; OP FIND = ( REF[]CITYINFO cities, PROC( REF CITYINFO )BOOL criteria )INT:

    BEGIN
        INT  result := UPB cities + 1;
        BOOL found  := FALSE;
        FOR pos FROM LWB cities TO UPB cities WHILE NOT found DO
            IF criteria( cities[ pos ] )
            THEN
                found  := TRUE;
                result := pos
            FI
        OD;
        result
    END # FIND # ;

# find the 0-based index of Dar Es Salaam #
# ( if we remove the "[ @ 0 ]", it would find the 1-based index ) #
# NB - this assumes there is one - would get a subscript bound error if there isn't #

print( ( "index of Dar Es Salaam (from 0): "

      , whole( cities[ @ 0 ] FIND ( ( REF CITYINFO city )BOOL: name OF city = "Dar Es Salaam" ), 0 )
      , newline
      )
    );

# operator to sort the cities with a specified comparator #

PRIO SORT = 1; OP SORT = ( REF[]CITYINFO array, PROC( REF CITYINFO, REF CITYINFO )BOOL less than )REF[]CITYINFO:

    BEGIN
        HEAP[ LWB array : UPB array ]CITYINFO sorted := array;
        # bubble sort - replace with more efficient algorithm for "real" use... #
        FOR end pos FROM UPB sorted - 1 BY -1 TO LWB sorted
        WHILE
            BOOL swapped := FALSE;
            FOR pos FROM LWB sorted TO end pos DO
                IF NOT less than( sorted[ pos ], sorted[ pos + 1 ] )
                THEN              
                    CITYINFO t        := sorted[ pos     ];
                    sorted[ pos     ] := sorted[ pos + 1 ];
                    sorted[ pos + 1 ] := t;
                    swapped           := TRUE
                FI
            OD;
            swapped
        DO SKIP OD;
        sorted
    END # SORT # ;

# find the city with the highest population under 5M #
# NB - this assumes there is one - would get a subscript range error if there isn't #

print( ( name OF cities[ ( cities SORT ( ( REF CITYINFO a, b )BOOL: population in millions OF a > population in millions OF b )

                        ) FIND ( ( REF CITYINFO city )BOOL: population in millions OF city < 5.0 )
                      ]
       , " has the maximum population under 5M"
       , newline
       )
     )</lang>

Output:

index of Dar Es Salaam (from 0): 6
Khartoum-Omdurman has the maximum population under 5M

C

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
The second test-case has been clarified to no longer require sort.

A third test-case has been added.

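This solution makes use of the 'bsearch' and 'lfind' library functions. Note: 'lfind' is available only on POSIX systems, and is declared in the 'search.h' header. Also note that 'bsearch' requires the array to be ordered consistently with the comparison function, which the task data is not when comparing by name, so the name lookup below is not guaranteed to succeed in general.
<lang c>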

#include <stdint.h> /* intptr_t */
#include <stdio.h>
#include <stdlib.h> /* bsearch */
#include <string.h>
#include <search.h> /* lfind */

#define LEN(x) (sizeof(x) / sizeof(x[0]))

struct cd {

   char *name;
   double population;

};

/* Return -1 if name could not be found */ int search_get_index_by_name(const char *name, const struct cd *data, const size_t data_length,

       int (*cmp_func)(const void *, const void *))

{

   struct cd key = { (char *) name, 0 };
   struct cd *match = bsearch(&key, data, data_length,
           sizeof(struct cd), cmp_func);

   if (match == NULL)
       return -1;
   else
       return ((intptr_t) match - (intptr_t) data) / sizeof(struct cd);

}

/* Return NULL if no value satisfies threshold */ char* search_get_pop_threshold(double pop_threshold, const struct cd *data, size_t data_length,

       int (*cmp_func)(const void *, const void *))

{

   struct cd key = { NULL, pop_threshold };
   struct cd *match = lfind(&key, data, &data_length,
           sizeof(struct cd), cmp_func);

   if (match == NULL)
       return NULL;
   else
       return match->name;

}

int cd_name_cmp(const void *a, const void *b) {

   struct cd *aa = (struct cd *) a;
   struct cd *bb = (struct cd *) b;
   return strcmp(bb->name, aa->name);

}

int cd_pop_cmp(const void *a, const void *b) {

   struct cd *aa = (struct cd *) a;
   struct cd *bb = (struct cd *) b;
   return bb->population >= aa->population;

}

int main(void) {

   const struct cd citydata[] = {
       { "Lagos", 21 },
       { "Cairo", 15.2 },
       { "Kinshasa-Brazzaville", 11.3 },
       { "Greater Johannesburg", 7.55 },
       { "Mogadishu", 5.85 },
       { "Khartoum-Omdurman", 4.98 },
       { "Dar Es Salaam", 4.7 },
       { "Alexandria", 4.58 },
       { "Abidjan", 4.4 },
       { "Casablanca", 3.98 }
   };

   const size_t citydata_length = LEN(citydata);

   printf("%d\n", search_get_index_by_name("Dar Es Salaam", citydata, citydata_length, cd_name_cmp));
   printf("%s\n", search_get_pop_threshold(5, citydata, citydata_length, cd_pop_cmp));
   printf("%d\n", search_get_index_by_name("Dar Salaam", citydata, citydata_length, cd_name_cmp));
   printf("%s\n", search_get_pop_threshold(2, citydata, citydata_length, cd_pop_cmp) ?: "(null)");
   return 0;

} </lang>

Output:

6
Khartoum-Omdurman
-1
(null)

EchoLisp

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
You shouldn't parse the input from JSON - instead, show to readers what the data structure looks like natively.

A third test-case has been added.

We demonstrate the vector-search primitive, which takes as input a vector and a predicate.
<lang scheme>
(require 'struct)
(require 'json)

;; importing data
(define cities
#<<

[{"name":"Lagos", "population":21}, {"name":"Cairo", "population":15.2}, {"name":"Kinshasa-Brazzaville", "population":11.3}, {"name":"Greater Johannesburg", "population":7.55}, {"name":"Mogadishu", "population":5.85}, {"name":"Khartoum-Omdurman", "population":4.98}, {"name":"Dar Es Salaam", "population":4.7}, {"name":"Alexandria", "population":4.58}, {"name":"Abidjan", "population":4.4}, {"name":"Casablanca", "population":3.98}] >>#)

;; define a structure matching the data (heterogeneous slot values)
(struct city (name population))

;; convert JSON to EchoLisp instances of structures
(set! cities (vector-map (lambda(x) (json->struct x struct:city)) (json-import cities)))

;; search by name, case-insensitive
(define (city-index name)
    (vector-search (lambda(x) (string-ci=? (city-name x) name)) cities))

;; returns the first city name such that population < seuil
(define (city-pop seuil)
    (define idx (vector-search (lambda(x) (< (city-population x) seuil)) cities))
    (if idx
        (city-name (vector-ref cities idx))
        (cons seuil 'not-found)))

(city-index "Dar Es Salaam") → 6
(city-pop 5) → "Khartoum-Omdurman"
(city-pop -666) → (-666 . not-found)
(city-index "alexandra") → #f
</lang>

J

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
The stuff about searching simple lists of strings or numbers, doesn't belong here. Maybe move it to Search a list.

A third search condition test-case has been added.

J supports several "searching" primitives.

i. finds the indices of the things being looked for (with 0 being the first index). Nonmatches get a result of 1+largest valid index.

<lang j>   1 2 3 4 5 6 7 8 9 i. 2 3 5 60
1 2 4 9
   (;:'one two three four five six seven eight nine') i. ;:'two three five sixty'
1 2 4 9</lang>

e. finds whether items are members of a set, returning a bitmask to select the members:

<lang j>   1 2 3 4 5 6 7 8 9 e. 2 3 5 60
0 1 1 0 1 0 0 0 0
   (;:'one two three four five six seven eight nine') e. ;:'two three five sixty'
0 1 1 0 1 0 0 0 0</lang>

I. finds indices, but performs a binary search (which requires that the list being searched is sorted). This can be useful for finding non-exact matches (the index of the next value is returned for non-exact matches).

<lang j>   1 2 3 4 5 6 7 8 9 I. 2 3 5 60 6.66
1 2 4 9 6
   (;:'eight five four nine one seven six three two') I. ;:'two three five sixty'
8 7 1 7</lang>

And, for the tabular example in the current task description, here is the data we will be using:

<lang J>colnumeric=: 0&".&.>@{`[`]}

data=: 1 colnumeric |: fixcsv 0 :0
Lagos, 21
Cairo, 15.2
Kinshasa-Brazzaville, 11.3
Greater Johannesburg, 7.55
Mogadishu, 5.85
Khartoum-Omdurman, 4.98
Dar Es Salaam, 4.7
Alexandria, 4.58
Abidjan, 4.4
Casablanca, 3.98
)</lang>

And here are the required computations:

<lang J>   (0 { data) i. <'Dar Es Salaam'
6
   (i. >./)@(* 5&>)@:>@{: data
5
   5 {:: 0 {data
Khartoum-Omdurman</lang>

The "general search function" mentioned in the task does not seem a natural fit for this set of data, because of the multi-column nature of this data. Nevertheless, we could for example define:

<lang j>gsf=: 1 :0

  I. u x { y

)</lang>

This uses the single argument aspect of the definition of I. to convert a bit mask to the corresponding sequence of indices. And the column(s) we are searching on are exposed as a parameter for the interface, which allows us to ignore (for this problem) the irrelevant columns...

Thus, we could say:

But this doesn't seem any clearer or more concise than our previous expression which finds the index of the first example of the most populous city with a population less than five million. Not only that, but if there were multiple cities which had the same population number which satisfied this constraint, this version would return all of those indices where the task explicitly required we return the first example.

J: Another approach

<lang j> city=: <;._1 ';Lagos;Cairo;Kinshasa-Brazzaville;Greater Johannesburg;Mogadishu;Khartoum-Omdurman;Dar Es Salaam;Alexandria;Abidjan;Casablanca'

  popln=: 21 15.2 11.3 7.55 5.85 4.98 4.7 4.58 4.4 3.98
  city i. <'Dar Es Salaam'            NB. index of Dar Es Salaam

6

  city {~ (popln < 5) {.@# \: popln   NB. name of first city with population less than 5 million

┌─────────────────┐
│Khartoum-Omdurman│
└─────────────────┘</lang>

JavaScript

ES5

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
The stuff about searching simple lists of strings or numbers, doesn't belong here. Maybe move it to Search a list.

A third test-case (search condition) has been added to the task description.

For arrays containing simple string and numeric datatypes, JavaScript provides Array.indexOf().

Note that JS uses two different equality operators for simple types:

Under Abstract equality, 3.0 == '3' -> true (http://www.ecma-international.org/ecma-262/5.1/#sec-9.12)
Under Strict equality, 3.0 === '3' -> false (http://www.ecma-international.org/ecma-262/5.1/#sec-11.9.6)

and Array.indexOf() matches only on Strict equality. Therefore:

<lang JavaScript>(function () {

 var blnAbstractEquality = (3.0 == '3'), // true  http://www.ecma-international.org/ecma-262/5.1/#sec-9.12
     blnStrictEquality = (3.0 === '3'); // false;  http://www.ecma-international.org/ecma-262/5.1/#sec-11.9.6

 var lstNumerics = [1, 1.0, '1', 2, 2.0, '2', 3, 3.0, '3'];

 return [
   blnAbstractEquality,
   blnStrictEquality,
   lstNumerics.indexOf(3.0),
   lstNumerics.indexOf('3')
 ]

})();</lang>

Returns:

<lang JavaScript>[

 true,
 false,
 6,
 8

]</lang>

Strict equality does not, however, return true for two distinct object instances, even if they match exactly in their keys and values. This means that Array.indexOf() will always return -1 (not found) when searching for an object, unless the very same object reference is passed. For strings, strict equality is case-sensitive.
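
The difference between matching by value and matching by reference can be seen in a small sketch; the variable names here are illustrative only.

```javascript
var alexandria = { name: "Alexandria", population: 4.58 };
var list = [{ name: "Abidjan", population: 4.4 }, alexandria];

// A structurally identical but distinct object is not found ...
var byValue = list.indexOf({ name: "Alexandria", population: 4.58 });

// ... whereas the very same object reference is.
var byReference = list.indexOf(alexandria);

console.log(byValue);     // -1
console.log(byReference); // 1
```

This is why searching a list of records generally needs a predicate rather than a sample record to compare against.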

To find more complex objects in a JS ES5 array, or search more flexibly, we can define find(), and findIndex(), which take predicate functions (defining the match required) as arguments:

The following code:

<lang JavaScript>(function (fnNameMatch, fnPopulationMatch) {

   function find(fnPredicate, list) {
     for (var i = 0, lng = list.length; i < lng; i++) {
       if (fnPredicate(list[i])) {
         return list[i];
       }
     }
     return undefined;
   };

   function findIndex(fnPredicate, list) {
     for (var i = 0, lng = list.length; i < lng; i++) {
       if (fnPredicate(list[i])) {
         return i;
       }
     }
     return undefined;
   };

   var lstCities = [{
     "name": "Lagos",
     "population": 21
   }, {
     "name": "Cairo",
     "population": 15.2
   }, {
     "name": "Kinshasa-Brazzaville",
     "population": 11.3
   }, {
     "name": "Greater Johannesburg",
     "population": 7.55
   }, {
     "name": "Mogadishu",
     "population": 5.85
   }, {
     "name": "Khartoum-Omdurman",
     "population": 4.98
   }, {
     "name": "Dar Es Salaam",
     "population": 4.7
   }, {
     "name": "Alexandria",
     "population": 4.58
   }, {
     "name": "Abidjan",
     "population": 4.4
   }, {
     "name": "Casablanca",
     "population": 3.98
 }];

   return [
     lstCities.indexOf({
       "name": "Alexandria",
       "population": 4.58
     }),

     find(fnNameMatch, lstCities),
     findIndex(fnNameMatch, lstCities),

     find(fnPopulationMatch, lstCities),
     findIndex(fnPopulationMatch, lstCities)
   ]

 })(
   function (e) {
     return e.name && e.name.toLowerCase() === 'dar es salaam';

   },
   function (e) {
     return e.population && e.population < 5.0;
   }
 );

</lang>

returns: <lang JavaScript>[

 -1, // the Alexandria object can not be found with Array.indexOf()
 {
   "name": "Dar Es Salaam",
   "population": 4.7
 },
 6,
 {
   "name": "Khartoum-Omdurman",
   "population": 4.98
 },
 5

]</lang>

ES6

This example is in need of improvement:

please show actual code that performs the searches specified in the task description

In ES6, Array.find() and Array.findIndex() are provided as built-in methods, but beware that these are not supported in IE.
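
A minimal sketch of the three searches from the task description using these built-in methods might look as follows. It assumes an ES6-capable runtime; `String.prototype.startsWith` is also an ES6 addition.

```javascript
const cities = [
  { name: "Lagos",                population: 21.0 },
  { name: "Cairo",                population: 15.2 },
  { name: "Kinshasa-Brazzaville", population: 11.3 },
  { name: "Greater Johannesburg", population: 7.55 },
  { name: "Mogadishu",            population: 5.85 },
  { name: "Khartoum-Omdurman",    population: 4.98 },
  { name: "Dar Es Salaam",        population: 4.7 },
  { name: "Alexandria",           population: 4.58 },
  { name: "Abidjan",              population: 4.4 },
  { name: "Casablanca",           population: 3.98 }
];

// findIndex returns the index of the first match, or -1 if there is none.
const darIndex = cities.findIndex(c => c.name === "Dar Es Salaam");

// find returns the first matching element, or undefined if there is none.
const under5 = cities.find(c => c.population < 5);
const startsWithA = cities.find(c => c.name.startsWith("A"));

console.log(darIndex);               // 6
console.log(under5.name);            // Khartoum-Omdurman
console.log(startsWithA.population); // 4.58
```

Because find() yields undefined when nothing matches, code that cannot assume a match should test the result before reading its properties.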

Perl

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
The stuff about searching simple lists of strings, doesn't belong here. Maybe move it to Search a list.

You shouldn't parse the input from JSON - instead, show to readers what the data structure looks like natively.

The code uses a hard-coded loop for each search. Consider a more generic and re-usable approach as asked for by the task description, or explain to readers why that wasn't possible or sensible in this language.

<lang perl>if(grep $_ eq $needle, @haystack) { print 'Found'; }

print grep($_ eq 'needle', ('apple', 'orange')) ? 'Found' : 'Not found';

# specifically solving the problem above

#!/usr/bin/perl

use strict ; use warnings ; use JSON ;

my $african_cities =

   ' [
     {"name":"Lagos","population":21},
     {"name":"Cairo","population":15.2},
     {"name":"Kinshasa-Brazzaville","population":11.3},
     {"name":"Greater Johannesburg","population":7.55},
     {"name":"Mogadishu","population":5.85},
     {"name":"Khartoum-Omdurman","population":4.98},
     {"name":"Dar Es Salaam","population":4.7},
     {"name":"Alexandria","population":4.58},
     {"name":"Abidjan","population":4.4},
     {"name":"Casablanca","population":3.98}
  ]' ;

my $cityhash = from_json ( $african_cities ) ;
my $current = 0 ;
while ( $cityhash->[$current]->{"name"} ne "Dar Es Salaam" ) {
   $current++ ;
}
print "Dar Es Salaam has the index $current!\n" ;
$current = 0 ;
while ( $cityhash->[$current]->{"population"} >= 5.0 ) {
   $current++ ;
}
print "The first city with a population of less than 5 million is "
   . $cityhash->[$current]->{"name"} . " !\n" ;</lang>

Output:

Dar Es Salaam has the index 6!
The first city with a population of less than 5 million is Khartoum-Omdurman !

Perl 6

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
You shouldn't parse the input from JSON - instead, show to readers what the data structure looks like natively.

A third test-case (search condition) has been added to the task description.

The second test-case has been clarified to no longer require sort.

Works with: rakudo version 2015-11-29

There are several search operations that may be used. It mostly depends on whether you want to find actual values or pointers, and/or all possible values or a single value matching your criteria. The most appropriate for the given test data/operations are shown here.

<lang perl6>use JSON::Tiny;

my $cities = from-json(' [{"name":"Lagos", "population":21}, {"name":"Cairo", "population":15.2}, {"name":"Kinshasa-Brazzaville", "population":11.3}, {"name":"Greater Johannesburg", "population":7.55}, {"name":"Mogadishu", "population":5.85}, {"name":"Khartoum-Omdurman", "population":4.98}, {"name":"Dar Es Salaam", "population":4.7}, {"name":"Alexandria", "population":4.58}, {"name":"Abidjan", "population":4.4}, {"name":"Casablanca", "population":3.98}] ');

# Find the indices of the cities named 'Dar Es Salaam'.

say grep { $_<name> eq 'Dar Es Salaam'}, :k, @$cities; # (6)

# Find the name of the first city with a population less
# than 5 when sorted by population, largest to smallest.

say ($cities.sort( -*.<population> ).first: *.<population> < 5)<name>; # Khartoum-Omdurman

# Find all of the city names that contain an 'm'

say join ', ', sort grep( {$_<name>.lc ~~ /'m'/}, @$cities )»<name>; # Dar Es Salaam, Khartoum-Omdurman, Mogadishu</lang>

Phix

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
A third test-case (search condition) has been added to the task description.

<lang Phix>constant CITY_NAME = 1, POPULATION = 2
constant municipalities = {{"Lagos",21},

                          {"Cairo",15.2},
                          {"Kinshasa-Brazzaville",11.3},
                          {"Greater Johannesburg",7.55},
                          {"Mogadishu",5.85},
                          {"Khartoum-Omdurman",4.98},
                          {"Dar Es Salaam",4.7},
                          {"Alexandria",4.58},
                          {"Abidjan",4.4},
                          {"Casablanca",3.98}}

function searchfor(sequence s, integer rid, object user_data, integer return_index=0)

   for i=1 to length(s) do
       if call_func(rid,{s[i],user_data}) then
           return iff(return_index?i:s[i])
       end if
   end for
   return 0 -- not found

end function

function city_named(sequence si, string city_name)

   return si[CITY_NAME]=city_name

end function

?searchfor(municipalities,routine_id("city_named"),"Dar Es Salaam",1)

function smaller_than(sequence si, atom population)

   return si[POPULATION]<population

end function

?searchfor(municipalities,routine_id("smaller_than"),5)[CITY_NAME]</lang>
The columnize function reorganises heterogeneous data into corresponding homogeneous arrays, which can make this sort of thing much simpler, at least for exact matches.
<lang Phix>constant {cities,populations} = columnize(municipalities)

?find("Dar Es Salaam",cities)</lang>

Output:

7
"Khartoum-Omdurman"
7

Note that Phix subscripts are 1-based, hence the output of 7 not 6.

PHP

This example is incorrect. Please fix the code and remove this message.

Details:
The stuff about searching simple lists of strings or numbers, doesn't belong here. Maybe move it to Search a list.

Solve the problem specified in the task description instead.

<lang PHP>echo in_array('needle', ['hay', 'stack']) ? 'Found' : 'Not found';</lang>

Racket

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
A third test-case (search condition) has been added to the task description.

The most idiomatic function for the task is findf, but it doesn't provide the position of the element in the list, so we write a variant. If the item is not found we return #f, as most of the Racket primitives do in these cases.
<lang Racket>#lang racket

(define (findf/pos proc lst)

 (let loop ([lst lst] [pos 0])
   (cond
     [(null? lst) #f]
     [(proc (car lst)) pos]
     [else (loop (cdr lst) (add1 pos))])))</lang>

Now we define the list that has the data for the task. <lang Racket>(define data '(("Lagos" 21)

              ("Cairo" 15.2)
              ("Kinshasa-Brazzaville" 11.3)
              ("Greater Johannesburg" 7.55)
              ("Mogadishu" 5.85)
              ("Khartoum-Omdurman" 4.98)
              ("Dar Es Salaam" 4.7)
              ("Alexandria" 4.58)
              ("Abidjan" 4.4)
              ("Casablanca" 3.98)))</lang>

We write tiny wrappers to solve the specific task. <lang Racket>(define (city-pos name)

 (findf/pos (lambda (x) (string=? (car x) name)) data))

(city-pos "Dar Es Salaam") (city-pos "Buenos Aires")

(define (city-smaller pop)

 (let ([city (findf (lambda (x) (< (cadr x) pop)) data)])
   (and city (car city))))

(city-smaller 5) (city-smaller -1)</lang>

Output:

6
#f
"Khartoum-Omdurman"
#f

REXX

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
A third test-case (search condition) has been added to the task description.

The code uses a hard-coded loop for each search. Consider a more generic and re-usable approach as asked for by the task description, or explain to readers why that wasn't possible or sensible in this language.

It is more idiomatic in REXX to use sparse arrays to express a list of CSV values which have embedded blanks in them.

Most REXX interpreters use hashing to index sparse arrays, which is much faster than performing a sequential search.
<lang rexx>/*REXX program (using criteria) finds values or indices in a list (of 2 datatypes).*/
$="Lagos=21, Cairo=15.2, Kinshasa-Brazzaville=11.3, Greater Johannesburg=7.55, Mogadishu=5.85,",
  "Khartoum-Omdurman=4.98, Dar Es Salaam=4.7,   Alexandria=4.58,   Abidjan=4.4,    Casablanca=3.98"

@.= '(not found)'                                /*default value for "not found" cities.*/
w=0                                              /*W:  used for formatting city's name. */

     do #=0  while $\=''                        /* [↓]  extract all cities&populations.*/
     parse  var  $   c  '='  p  ","  $          /*destructive parse of the  $  string. */
     c=space(c);     w=max(w, length(c) )       /*remove superfluous spaces in table.  */
     city.#=c;   pop.#=p;    pop.c=#;     @.c=# /*assign city&pop──arrays; & city──►idx*/
     end   /*#*/                                /* [↑]   TOWN. array starts at 0 index.*/
                       /*▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ task 1:  show the ordered city list.*/

say center('index',9,"═") center("city",w,"═") center('population (millions)',27,"═")

     do j=0  for #                                                /*get pertinent info.*/
     say center(j,9)     center(city.j, w)     center(pop.j, 27)  /*index, city, pop.  */
     end   /*j*/

say

                       /*▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ task 2:  show the  INDEX  of a city.*/

town= 'Dar Es Salaam'                            /*show (zero─based) index of this city.*/
say 'The city of '   town   " has an index of: "   findIndex(town);              say

                       /*▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ task 3:  show 1st city whose pop<5 M*/

many=5 /*show 1st city whose population is <5M*/

     do k=0  for #;        p=getPop(k)          /*get a city's population from an index*/
     if \lessPop(p, many)  then iterate         /*does the city fail predicate test?   */
     say 'The city of '    city.k    " has a population less than "    many   ' million.'
     leave                                      /*only show the first city of pop < 5 M*/
     end   /*k*/

if k>=#  then say 'no city found with a population of less than '   many   " million."
exit                                             /*stick a fork in it,  we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
findIndex: parse arg _; return @._               /*returns the INDEX of a city or null. */
getPop:    parse arg _; return pop._             /*returns the pop of an index of city. */
lessPop:   return arg(1) < arg(2)                /*predicate function tests if pop < 5 M*/</lang>
output when using the default inputs:

══index══ ════════city════════ ═══population (millions)═══
    0            Lagos                     21
    1            Cairo                    15.2
    2     Kinshasa-Brazzaville            11.3
    3     Greater Johannesburg            7.55
    4          Mogadishu                  5.85
    5      Khartoum-Omdurman              4.98
    6        Dar Es Salaam                 4.7
    7          Alexandria                 4.58
    8           Abidjan                    4.4
    9          Casablanca                 3.98

The city of  Dar Es Salaam  has an index of:  6

The city of  Khartoum-Omdurman  has a population less than  5  million.

Ring

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
A third test-case (search condition) has been added to the task description.

The code uses a hard-coded loop for the population<5 search. Consider a more generic approach as asked for by the task description, or explain to readers why that wasn't possible or sensible in this language.

<lang Ring>
name = 1
population = 2
cities = [
    ["Lagos", 21],
    ["Cairo", 15.2],
    ["Kinshasa-Brazzaville", 11.3],
    ["Greater Johannesburg", 7.55],
    ["Mogadishu", 5.85],
    ["Khartoum-Omdurman", 4.98],
    ["Dar Es Salaam", 4.7],
    ["Alexandria", 4.58],
    ["Abidjan", 4.4],
    ["Casablanca", 3.98]
]
See find(cities,"Dar Es Salaam",name) + nl             # output = 7
See cities[find(cities,4.58,population)][name] + nl    # output = Alexandria
for x in cities
    if x[population] < 5
        see x[name] + nl                               # output = Khartoum-Omdurman
        exit
    ok
next
</lang>

Ruby

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
A third test-case (search condition) has been added to the task description.

<lang Ruby>cities = [
    {name: "Lagos", population: 21},
    {name: "Cairo", population: 15.2},
    {name: "Kinshasa-Brazzaville", population: 11.3},
    {name: "Greater Johannesburg", population: 7.55},
    {name: "Mogadishu", population: 5.85},
    {name: "Khartoum-Omdurman", population: 4.98},
    {name: "Dar Es Salaam", population: 4.7},
    {name: "Alexandria", population: 4.58},
    {name: "Abidjan", population: 4.4},
    {name: "Casablanca", population: 3.98}
]

puts cities.index{|city| city[:name] == "Dar Es Salaam"}      # => 6
puts cities.find {|city| city[:population] < 5.0}[:name]      # => Khartoum-Omdurman
</lang>

Sidef

<lang ruby>struct City {

   String name,
   Number population,

}

var cities = [

   City("Lagos", 21),
   City("Cairo", 15.2),
   City("Kinshasa-Brazzaville", 11.3),
   City("Greater Johannesburg", 7.55),
   City("Mogadishu", 5.85),
   City("Khartoum-Omdurman", 4.98),
   City("Dar Es Salaam", 4.7),
   City("Alexandria", 4.58),
   City("Abidjan", 4.4),
   City("Casablanca", 3.98),

]

say cities.index{|city| city.name == "Dar Es Salaam"}   # => 6
say cities.first{|city| city.population < 5.0}.name     # => Khartoum-Omdurman</lang>

Tcl

Tcl's lsearch command takes many useful options. This task illustrates the -index and -bisect options.

<lang Tcl>set cities {

   {"Lagos"    21}
   {"Cairo"    15.2}
   {"Kinshasa Brazzaville"     11.3}
   {"Greater Johannesburg"     7.55}
   {"Mogadishu"        5.85}
   {"Khartoum Omdurman"        4.98}
   {"Dar Es Salaam"    4.7}
   {"Alexandria"       4.58}
   {"Abidjan"  4.4}
   {"Casablanca"       3.98}

}
puts "Dar Es Salaam is at position: [

   lsearch -index 0 $cities "Dar Es Salaam"

]."

set cities_by_size [

   lsort -index 1 -real $cities

]

puts "The largest city of < 5m is: [

   lsearch -inline -index 1 -bisect -real $cities_by_size 5.0

]"</lang>

Output:

Dar Es Salaam is at position: 6.
The largest city of < 5m is: "Khartoum Omdurman"        4.98

zkl

<lang zkl>list:=T(SD("name","Lagos", "population",21), SD("name","Cairo", "population",15.2), SD("name","Kinshasa-Brazzaville", "population",11.3), SD("name","Greater Johannesburg", "population",7.55), SD("name","Mogadishu", "population",5.85), SD("name","Khartoum-Omdurman", "population",4.98), SD("name","Dar Es Salaam", "population",4.7), SD("name","Alexandria", "population",4.58), SD("name","Abidjan", "population",4.4), SD("name","Casablanca", "population",3.98));

n:=list.filter1n(fcn(city){ city["name"]=="Dar Es Salaam" });

  // or city["name"].matches("dar es salaam") for case insensitive wild card match

n.println(" == index of ",list[n].values);

city:=list.filter1(fcn(city){ city["population"]<5 }); city.values.println(" is the first city with population under 5 million."); list.filter(fcn(city){ city["population"]<5.0 }).apply("get","name").println(); list.filterNs(fcn(city){ city["population"]<5.0 }).println();

list.filter1n(fcn(city){ city["name"]=="alexandra" }).println();</lang>
where an SD is a small read-only dictionary, filter1 is a filter that stops at the first match (returning the matched item), and filter1n is the same but returns the index. Both filter methods return False on failure.

Note that the dictionaries contain either int or float values and the filter could use either int or float (in this case, the 5 is converted to the type in the dictionary).

The YAJL library could be used to parse the JSON data directly.

Output:

6 == index of L("Dar Es Salaam",4.7)
L("Khartoum-Omdurman",4.98) is the first city with population under 5 million.
L("Khartoum-Omdurman","Dar Es Salaam","Alexandria","Abidjan","Casablanca")
L(5,6,7,8,9)
False