Search a list of records

Revision as of 12:12, 27 August 2016

Many programming languages provide convenient ways to look for a known value in a simple list of strings or numbers.
But what if the elements of the list are themselves compound records/objects/data-structures, and the search condition is more complex than a simple equality test?

Task

Write a function/method/etc. that can find the first element in a given list matching a given condition.
It should be as generic and reusable as possible.
(Of course if your programming language already provides such a feature, you can use that instead of recreating it.)

Then to demonstrate its functionality, create the data structure specified under #Data set, and perform on it the searches specified under #Test cases.

Data set

The data structure to be used contains the names and populations (in millions) of the 10 largest metropolitan areas in Africa, and looks as follows when represented in JSON:

<lang JavaScript>[

 { "name": "Lagos",                "population": 21.0  },
 { "name": "Cairo",                "population": 15.2  },
 { "name": "Kinshasa-Brazzaville", "population": 11.3  },
 { "name": "Greater Johannesburg", "population":  7.55 },
 { "name": "Mogadishu",            "population":  5.85 },
 { "name": "Khartoum-Omdurman",    "population":  4.98 },
 { "name": "Dar Es Salaam",        "population":  4.7  },
 { "name": "Alexandria",           "population":  4.58 },
 { "name": "Abidjan",              "population":  4.4  },
 { "name": "Casablanca",           "population":  3.98 }

]</lang>

However, you shouldn't parse it from JSON, but rather represent it natively in your programming language.

The top-level data structure should be an ordered collection (i.e. a list, array, vector, or similar).
Each element in this list should be an associative collection that maps from keys to values (i.e. a struct, object, hash map, dictionary, or similar).
Each of them has two entries: One string value with key "name", and one numeric value with key "population".
You may rely on the list being sorted by population count, as long as you explain this to readers.

If any of that is impossible or unreasonable in your programming language, then feel free to deviate, as long as you explain your reasons in a comment above your solution.
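As an illustration of the requirements above, here is a minimal Python sketch of the data set as a list of dictionaries. Python is not one of the solutions on this page; the helper name `first_below` is hypothetical and introduced only for this example. Because the task allows relying on the descending population order, the sketch also shows how that order permits a binary search for the "population below a limit" case, assuming the order has been checked first.

```python
from bisect import bisect_right

# Ordered collection (list) of associative records (dicts), as the task asks.
cities = [
    {"name": "Lagos",                "population": 21.0},
    {"name": "Cairo",                "population": 15.2},
    {"name": "Kinshasa-Brazzaville", "population": 11.3},
    {"name": "Greater Johannesburg", "population": 7.55},
    {"name": "Mogadishu",            "population": 5.85},
    {"name": "Khartoum-Omdurman",    "population": 4.98},
    {"name": "Dar Es Salaam",        "population": 4.7},
    {"name": "Alexandria",           "population": 4.58},
    {"name": "Abidjan",              "population": 4.4},
    {"name": "Casablanca",           "population": 3.98},
]

# Check the assumption that the list is sorted by descending population.
populations = [c["population"] for c in cities]
is_sorted_desc = all(a >= b for a, b in zip(populations, populations[1:]))


def first_below(records, limit):
    """Hypothetical helper: first record with population < limit.

    Only valid when records are sorted by descending population.
    """
    # Negating the populations yields an ascending key list for bisect.
    keys = [-r["population"] for r in records]
    i = bisect_right(keys, -limit)
    return records[i] if i < len(records) else None


print(is_sorted_desc)                    # True
print(first_below(cities, 5.0)["name"])  # Khartoum-Omdurman
```

A plain linear scan with a predicate remains the more general approach, since it works for unsorted data and for arbitrary conditions.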

Test cases

Search	Expected result
Find the (zero-based) index of the first city in the list whose name is "`Dar Es Salaam`"	`6`
Find the name of the first city in this list whose population is less than 5 million	`Khartoum-Omdurman`
Find the population of the first city in this list whose name starts with the letter "`A`"	`4.58`

Guidance

If your programming language supports higher-order programming, then the most elegant way to implement the requested functionality in a generic and reusable way, might be to write a function (maybe called "find_index" or similar), that takes two arguments:

The list to search through.
A function/lambda/closure (the so-called "predicate"), which will be applied in turn to each element in the list, and whose boolean return value defines whether that element matches the search requirement.

If this is not the approach which would be most natural or idiomatic in your language, explain why, and show what is.
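The two-argument design described in the guidance can be sketched in Python as follows. This is only one possible shape, not a reference implementation: the names `find_index` and `find` are chosen for illustration, and returning `None` for "no match" is an assumption of this sketch.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def find_index(items: Sequence[T], predicate: Callable[[T], bool]) -> Optional[int]:
    """Return the zero-based index of the first matching element, or None."""
    for i, item in enumerate(items):
        if predicate(item):
            return i
    return None


def find(items: Sequence[T], predicate: Callable[[T], bool]) -> Optional[T]:
    """Return the first matching element itself, or None."""
    i = find_index(items, predicate)
    return None if i is None else items[i]


cities = [
    {"name": "Lagos",                "population": 21.0},
    {"name": "Cairo",                "population": 15.2},
    {"name": "Kinshasa-Brazzaville", "population": 11.3},
    {"name": "Greater Johannesburg", "population": 7.55},
    {"name": "Mogadishu",            "population": 5.85},
    {"name": "Khartoum-Omdurman",    "population": 4.98},
    {"name": "Dar Es Salaam",        "population": 4.7},
    {"name": "Alexandria",           "population": 4.58},
    {"name": "Abidjan",              "population": 4.4},
    {"name": "Casablanca",           "population": 3.98},
]

# The three test cases from the task description.
print(find_index(cities, lambda c: c["name"] == "Dar Es Salaam"))       # 6
print(find(cities, lambda c: c["population"] < 5.0)["name"])            # Khartoum-Omdurman
print(find(cities, lambda c: c["name"].startswith("A"))["population"])  # 4.58
```

Keeping the predicate separate from the traversal means the same two functions serve all three test cases, and callers decide whether they need the index or the record.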

ALGOL 68

<lang algol68># Algol 68 doesn't have generic array searches but we can easily provide #

# type specific ones #

# mode to hold the city/population info #

MODE CITYINFO = STRUCT( STRING name, REAL population in millions );

# array of cities and populations #

[ 1 : 10 ]CITYINFO cities := ( ( "Lagos", 21.0 )

                            , ( "Cairo",                15.2 )
                            , ( "Kinshasa-Brazzaville", 11.3 )
                            , ( "Greater Johannesburg", 7.55 )
                            , ( "Mogadishu",            5.85 )
                            , ( "Khartoum-Omdurman",    4.98 )
                            , ( "Dar Es Salaam",        4.7  )
                            , ( "Alexandria",           4.58 )
                            , ( "Abidjan",              4.4  )
                            , ( "Casablanca",           3.98 )
                            );

# operator to find the first city with the specified criteria, expressed as a procedure #
# returns the index of the CITYINFO. We can also overload FIND so it can be applied to #
# arrays of other types #
# If there is no city matching the criteria, a value greater than the upper bound of #
# the cities array is returned #

PRIO FIND = 1; OP FIND = ( REF[]CITYINFO cities, PROC( REF CITYINFO )BOOL criteria )INT:

    BEGIN
        INT  result := UPB cities + 1;
        BOOL found  := FALSE;
        FOR pos FROM LWB cities TO UPB cities WHILE NOT found DO
            IF criteria( cities[ pos ] )
            THEN
                found  := TRUE;
                result := pos
            FI
        OD;
        result
    END # FIND # ;

# convenience operator to determine whether a STRING starts with a particular character #
# returns TRUE if s starts with c, FALSE otherwise #

PRIO STARTSWITH = 9; OP STARTSWITH = ( STRING s, CHAR c )BOOL:

    IF LWB s > UPB s THEN FALSE # empty string                                         #
    ELSE s[ LWB s ] = c
    FI # STARTSWITH # ;

# find the 0-based index of Dar Es Salaam #
# ( if we remove the "[ @ 0 ]", it would find the 1-based index ) #
# NB - this assumes there is one - would get a subscript bound error if there isn't #

print( ( "index of Dar Es Salaam (from 0): "

      , whole( cities[ @ 0 ] FIND ( ( REF CITYINFO city )BOOL: name OF city = "Dar Es Salaam" ), 0 )
      , newline
      )
    );

# find the first city with population under 5M #
# NB - this assumes there is one - would get a subscript bound error if there isn't #

print( ( name OF cities[ cities FIND ( ( REF CITYINFO city )BOOL: population in millions OF city < 5.0 ) ]

      , " has a population under 5M"
      , newline
      )
    );

# find the population of the first city whose name starts with "A" #
# NB - this assumes there is one - would get a subscript bound error if there isn't #

print( ( "The population of a city named ""A..."" is: "

      , fixed( population in millions OF cities[ cities FIND ( ( REF CITYINFO city )BOOL: name OF city STARTSWITH "A" ) ], 0, 2 )
      , newline
      )
    )

</lang>

Output:

index of Dar Es Salaam (from 0): 6
Khartoum-Omdurman has a population under 5M
The population of a city named "A..." is: 4.58

AppleScript

Translation of: JavaScript

<lang AppleScript>property lst : [¬
    {|name|:"Lagos", population:21.0}, ¬
    {|name|:"Cairo", population:15.2}, ¬
    {|name|:"Kinshasa-Brazzaville", population:11.3}, ¬
    {|name|:"Greater Johannesburg", population:7.55}, ¬
    {|name|:"Mogadishu", population:5.85}, ¬
    {|name|:"Khartoum-Omdurman", population:4.98}, ¬
    {|name|:"Dar Es Salaam", population:4.7}, ¬
    {|name|:"Alexandria", population:4.58}, ¬
    {|name|:"Abidjan", population:4.4}, ¬
    {|name|:"Casablanca", population:3.98}]

on run
    {findIndex(mClosure(my nameMatch, {|name|:"Dar Es Salaam"}), lst), ¬
        |name| of find(mClosure(my popBelow, {population:5}), lst), ¬
        population of find(mClosure(my nameBeginsWith, {firstLetter:"A"}), lst)}
end run

-- nameMatch :: Record -> Bool
on nameMatch(rec)
    |name| of rec = |name| of my closure
end nameMatch

-- popBelow :: Record -> Bool
on popBelow(rec)
    population of rec < population of my closure
end popBelow

-- nameBeginsWith :: Record -> Bool
on nameBeginsWith(rec)
    text 1 of |name| of rec = firstLetter of my closure
end nameBeginsWith

-- GENERIC FUNCTIONS

-- findIndex :: (a -> Bool) -> [a] -> Maybe Int
on findIndex(f, xs)
    set mf to mReturn(f)
    set lng to length of xs
    repeat with i from 1 to lng
        if mf's lambda(item i of xs) then return (i - 1)
    end repeat
    return missing value
end findIndex

-- find :: (a -> Bool) -> [a] -> Maybe a
on find(f, xs)
    set mf to mReturn(f)
    set lng to length of xs
    repeat with i from 1 to lng
        if mf's lambda(item i of xs) then return item i of xs
    end repeat
    return missing value
end find

-- mReturn :: Handler -> Script
on mReturn(f)
    if class of f is script then return f
    script
        property lambda : f
    end script
end mReturn

-- mClosure :: Handler -> Record -> Script
on mClosure(f, recBindings)
    script
        property closure : recBindings
        property lambda : f
    end script
end mClosure</lang>

Output:

<lang AppleScript>{6, "Khartoum-Omdurman", 4.58}</lang>

C

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
A third test-case has been added.

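This solution makes use of the 'bsearch' and 'lfind' library functions. Note: 'lfind' is available only on POSIX systems, and is declared in the 'search.h' header. Also note that 'bsearch' requires the array to be ordered consistently with the comparison function; the data here is sorted by population rather than by name, so the name lookup is not guaranteed to work for arbitrary inputs.
<lang c>
#include <stdint.h> /* intptr_t */
#include <stdio.h>
#include <stdlib.h> /* bsearch */
#include <string.h>
#include <search.h> /* lfind */

#define LEN(x) (sizeof(x) / sizeof(x[0]))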

struct cd {

   char *name;
   double population;

};

/* Return -1 if name could not be found */
int search_get_index_by_name(const char *name, const struct cd *data, const size_t data_length,

       int (*cmp_func)(const void *, const void *))

{

   struct cd key = { (char *) name, 0 };
   struct cd *match = bsearch(&key, data, data_length,
           sizeof(struct cd), cmp_func);

   if (match == NULL)
       return -1;
   else
       return ((intptr_t) match - (intptr_t) data) / sizeof(struct cd);

}

/* Return NULL if no value satisfies threshold */
char* search_get_pop_threshold(double pop_threshold, const struct cd *data, size_t data_length,

       int (*cmp_func)(const void *, const void *))

{

   struct cd key = { NULL, pop_threshold };
   struct cd *match = lfind(&key, data, &data_length,
           sizeof(struct cd), cmp_func);

   if (match == NULL)
       return NULL;
   else
       return match->name;

}

int cd_name_cmp(const void *a, const void *b) {

   struct cd *aa = (struct cd *) a;
   struct cd *bb = (struct cd *) b;
   return strcmp(bb->name, aa->name);

}

int cd_pop_cmp(const void *a, const void *b) {

   struct cd *aa = (struct cd *) a;
   struct cd *bb = (struct cd *) b;
   return bb->population >= aa->population;

}

int main(void) {

   const struct cd citydata[] = {
       { "Lagos", 21 },
       { "Cairo", 15.2 },
       { "Kinshasa-Brazzaville", 11.3 },
       { "Greater Johannesburg", 7.55 },
       { "Mogadishu", 5.85 },
       { "Khartoum-Omdurman", 4.98 },
       { "Dar Es Salaam", 4.7 },
       { "Alexandria", 4.58 },
       { "Abidjan", 4.4 },
       { "Casablanca", 3.98 }
   };

   const size_t citydata_length = LEN(citydata);

   printf("%d\n", search_get_index_by_name("Dar Es Salaam", citydata, citydata_length, cd_name_cmp));
   printf("%s\n", search_get_pop_threshold(5, citydata, citydata_length, cd_pop_cmp));
   printf("%d\n", search_get_index_by_name("Dar Salaam", citydata, citydata_length, cd_name_cmp));
   printf("%s\n", search_get_pop_threshold(2, citydata, citydata_length, cd_pop_cmp) ?: "(null)");
   return 0;

} </lang>

Output:

6
Khartoum-Omdurman
-1
(null)

C++

std::find_if accepts a lambda as predicate.

<lang cpp>#include <iostream>

#include <string>
#include <vector>
#include <algorithm>

struct city {

   std::string name;
   float population;

};

int main() {

   std::vector<city> cities = {
       { "Lagos", 21 },
       { "Cairo", 15.2 },
       { "Kinshasa-Brazzaville", 11.3 },
       { "Greater Johannesburg", 7.55 },
       { "Mogadishu", 5.85 },
       { "Khartoum-Omdurman", 4.98 },
       { "Dar Es Salaam", 4.7 },
       { "Alexandria", 4.58 },
       { "Abidjan", 4.4 },
       { "Casablanca", 3.98 },
   };
   
   auto i1 = std::find_if( cities.begin(), cities.end(),
       [](city c){ return c.name == "Dar Es Salaam"; } );
   if (i1 != cities.end()) {
       std::cout << i1 - cities.begin() << "\n";
   }
   
   auto i2 = std::find_if( cities.begin(), cities.end(),
       [](city c){ return c.population < 5.0; } );
   if (i2 != cities.end()) {
       std::cout << i2->name << "\n";
   }
   
   auto i3 = std::find_if( cities.begin(), cities.end(),
       [](city c){ return c.name.length() > 0 && c.name[0] == 'A'; } );
   if (i3 != cities.end()) {
       std::cout << i3->population << "\n";
   }

}</lang>

Output:

6
Khartoum-Omdurman
4.58

EchoLisp

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
You shouldn't parse the input from JSON - instead, show to readers what the data structure looks like natively.

A third test-case has been added.

We demonstrate the vector-search primitive, which takes as input a vector and a predicate.
<lang scheme>
(require 'struct)
(require 'json)

;; importing data
(define cities
#<<
[{"name":"Lagos", "population":21}, {"name":"Cairo", "population":15.2}, {"name":"Kinshasa-Brazzaville", "population":11.3}, {"name":"Greater Johannesburg", "population":7.55}, {"name":"Mogadishu", "population":5.85}, {"name":"Khartoum-Omdurman", "population":4.98}, {"name":"Dar Es Salaam", "population":4.7}, {"name":"Alexandria", "population":4.58}, {"name":"Abidjan", "population":4.4}, {"name":"Casablanca", "population":3.98}]
>>#)

;; define a structure matching data
;; heterogeneous slots values
(struct city (name population))

;; convert JSON to EchoLisp instances of structures
(set! cities
    (vector-map (lambda(x) (json->struct x struct:city)) (json-import cities)))

;; search by name, case indifferent
(define (city-index name)
    (vector-search (lambda(x) (string-ci=? (city-name x) name)) cities))

;; returns first city name such that population < seuil (the threshold)
(define (city-pop seuil)
    (define idx (vector-search (lambda(x) (< (city-population x) seuil)) cities))
    (if idx
        (city-name (vector-ref cities idx))
        (cons seuil 'not-found)))

(city-index "Dar Es Salaam") → 6
(city-pop 5) → "Khartoum-Omdurman"
(city-pop -666) → (-666 . not-found)
(city-index "alexandra") → #f
</lang>

Elixir

<lang elixir>cities = [

 [name: "Lagos",                 population: 21.0 ],
 [name: "Cairo",                 population: 15.2 ],
 [name: "Kinshasa-Brazzaville",  population: 11.3 ],
 [name: "Greater Johannesburg",  population:  7.55],
 [name: "Mogadishu",             population:  5.85],
 [name: "Khartoum-Omdurman",     population:  4.98],
 [name: "Dar Es Salaam",         population:  4.7 ],
 [name: "Alexandria",            population:  4.58],
 [name: "Abidjan",               population:  4.4 ],
 [name: "Casablanca",            population:  3.98]

]

IO.puts Enum.find_index(cities, fn city -> city[:name] == "Dar Es Salaam" end)
IO.puts Enum.find(cities, fn city -> city[:population] < 5.0 end)[:name]
IO.puts Enum.find(cities, fn city -> String.first(city[:name])=="A" end)[:population]</lang>

Output:

6
Khartoum-Omdurman
4.58

J

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
The stuff about searching simple lists of strings or numbers, doesn't belong here. Maybe move it to Search a list.

A third search condition test-case has been added.

J supports several "searching" primitives.

i. finds the indices of the things being looked for (with 0 being the first index). Nonmatches get a result of 1+largest valid index.

<lang j>   1 2 3 4 5 6 7 8 9 i. 2 3 5 60
1 2 4 9
   (;:'one two three four five six seven eight nine') i. ;:'two three five sixty'
1 2 4 9</lang>

e. finds whether items are members of a set, returning a bitmask to select the members:

<lang j>   1 2 3 4 5 6 7 8 9 e. 2 3 5 60
0 1 1 0 1 0 0 0 0
   (;:'one two three four five six seven eight nine') e. ;:'two three five sixty'
0 1 1 0 1 0 0 0 0</lang>

I. finds indices, but performs a binary search (which requires that the list being searched is sorted). This can be useful for finding non-exact matches (the index of the next value is returned for non-exact matches).

<lang j>   1 2 3 4 5 6 7 8 9 I. 2 3 5 60 6.66
1 2 4 9 6
   (;:'eight five four nine one seven six three two') I. ;:'two three five sixty'
8 7 1 7</lang>

And, for the tabular example in the current task description, here is the data we will be using:

<lang J>colnumeric=: 0&".&.>@{`[`]}

data=: 1 colnumeric |: fixcsv 0 :0
Lagos, 21
Cairo, 15.2
Kinshasa-Brazzaville, 11.3
Greater Johannesburg, 7.55
Mogadishu, 5.85
Khartoum-Omdurman, 4.98
Dar Es Salaam, 4.7
Alexandria, 4.58
Abidjan, 4.4
Casablanca, 3.98
)</lang>

And here are the required computations:

<lang J>   (0 { data) i. <'Dar Es Salaam'
6
   (i. >./)@(* 5&>)@:>@{: data
5
   5 {:: 0 {data
Khartoum-Omdurman</lang>

The "general search function" mentioned in the task does not seem a natural fit for this set of data, because of the multi-column nature of this data. Nevertheless, we could for example define:

<lang j>gsf=: 1 :0

  I. u x { y

)</lang>

This uses the single argument aspect of the definition of I. to convert a bit mask to the corresponding sequence of indices. And the column(s) we are searching on are exposed as a parameter for the interface, which allows us to ignore (for this problem) the irrelevant columns...

Thus, we could say:

But this doesn't seem any clearer or more concise than our previous expression which finds the index of the first example of the most populous city with a population less than five million. Not only that, but if there were multiple cities which had the same population number which satisfied this constraint, this version would return all of those indices where the task explicitly required we return the first example.

J: Another approach

<lang j>   city=: <;._1 ';Lagos;Cairo;Kinshasa-Brazzaville;Greater Johannesburg;Mogadishu;Khartoum-Omdurman;Dar Es Salaam;Alexandria;Abidjan;Casablanca'
   popln=: 21 15.2 11.3 7.55 5.85 4.98 4.7 4.58 4.4 3.98
   city i. <'Dar Es Salaam'            NB. index of Dar Es Salaam
6
   city {~ (popln < 5) {.@# \: popln   NB. name of first city with population less than 5 million
┌─────────────────┐
│Khartoum-Omdurman│
└─────────────────┘</lang>

JavaScript

ES5

<lang JavaScript>(function () {

   'use strict';

   // find :: (a -> Bool) -> [a] -> Maybe a
   function find(f, xs) {
       for (var i = 0, lng = xs.length; i < lng; i++) {
           if (f(xs[i])) return xs[i];
       }
       return undefined;
   }

   // findIndex :: (a -> Bool) -> [a] -> Maybe Int
   function findIndex(f, xs) {
       for (var i = 0, lng = xs.length; i < lng; i++) {
           if (f(xs[i])) return i;
       }   
       return undefined;
   }


   var lst = [
     { "name": "Lagos",                "population": 21.0  },
     { "name": "Cairo",                "population": 15.2  },
     { "name": "Kinshasa-Brazzaville", "population": 11.3  },
     { "name": "Greater Johannesburg", "population":  7.55 },
     { "name": "Mogadishu",            "population":  5.85 },
     { "name": "Khartoum-Omdurman",    "population":  4.98 },
     { "name": "Dar Es Salaam",        "population":  4.7  },
     { "name": "Alexandria",           "population":  4.58 },
     { "name": "Abidjan",              "population":  4.4  },
     { "name": "Casablanca",           "population":  3.98 }
   ];

   return {
       darEsSalaamIndex: findIndex(function (x) {
           return x.name === 'Dar Es Salaam';
       }, lst),

       firstBelow5M: find(function (x) {
               return x.population < 5;
           }, lst)
           .name,

       firstApop: find(function (x) {
               return x.name.charAt(0) === 'A';
           }, lst)
           .population
   };

})();</lang>

Output:

{"darEsSalaamIndex":6, "firstBelow5M":"Khartoum-Omdurman", "firstApop":4.58}

ES6

<lang JavaScript>(function () {

   'use strict';

   var lst = [
         { "name": "Lagos",                "population": 21.0  },
         { "name": "Cairo",                "population": 15.2  },
         { "name": "Kinshasa-Brazzaville", "population": 11.3  },
         { "name": "Greater Johannesburg", "population":  7.55 },
         { "name": "Mogadishu",            "population":  5.85 },
         { "name": "Khartoum-Omdurman",    "population":  4.98 },
         { "name": "Dar Es Salaam",        "population":  4.7  },
         { "name": "Alexandria",           "population":  4.58 },
         { "name": "Abidjan",              "population":  4.4  },
         { "name": "Casablanca",           "population":  3.98 }
       ];

   return {
       darEsSalaamIndex: lst.findIndex(x => x.name === 'Dar Es Salaam'),
       firstBelow5M: lst.find(x => x.population < 5)
           .name,
       firstApop: lst.find(x => x.name[0] === 'A')
           .population
   };

})();</lang>

Output:

{"darEsSalaamIndex":6, "firstBelow5M":"Khartoum-Omdurman", "firstApop":4.58}

Lua

Lua tables are well suited as the element type for this task. The master data structure is a table of tables.
<lang Lua>-- Dataset declaration
local cityPops = {

   {name = "Lagos", population = 21.0},
   {name = "Cairo", population = 15.2},
   {name = "Kinshasa-Brazzaville", population = 11.3},
   {name = "Greater Johannesburg", population = 7.55},
   {name = "Mogadishu", population = 5.85},
   {name = "Khartoum-Omdurman", population = 4.98},
   {name = "Dar Es Salaam", population = 4.7},
   {name = "Alexandria", population = 4.58},
   {name = "Abidjan", population = 4.4},
   {name = "Casablanca", population = 3.98}

}

-- Function to search a dataset using a custom match function
function recordSearch (dataset, matchFunction)
   local returnValue
   -- ipairs visits the array part in index order, so the first match is returned
   for index, element in ipairs(dataset) do
       returnValue = matchFunction(index, element)
       if returnValue then return returnValue end
   end
   return nil

end

-- Main procedure
local testCases = {

   function (i, e) if e.name == "Dar Es Salaam" then return i - 1 end end,
   function (i, e) if e.population < 5 then return e.name end end,
   function (i, e) if e.name:sub(1, 1) == "A" then return e.population end end

}
for _, func in ipairs(testCases) do print(recordSearch(cityPops, func)) end</lang>

Output:

6
Khartoum-Omdurman
4.58

Perl

The first function from the core module List::Util provides short-circuiting search using a block as predicate. However, it can only return the value of the found element, not its index – so for the first test-case we need to operate on the list of indices.

<lang perl>use feature 'say';
use List::Util qw(first);

my @cities = (

 { name => 'Lagos',                population => 21.0  },
 { name => 'Cairo',                population => 15.2  },
 { name => 'Kinshasa-Brazzaville', population => 11.3  },
 { name => 'Greater Johannesburg', population =>  7.55 },
 { name => 'Mogadishu',            population =>  5.85 },
 { name => 'Khartoum-Omdurman',    population =>  4.98 },
 { name => 'Dar Es Salaam',        population =>  4.7  },
 { name => 'Alexandria',           population =>  4.58 },
 { name => 'Abidjan',              population =>  4.4  },
 { name => 'Casablanca',           population =>  3.98 },

);

my $index1 = first { $cities[$_]{name} eq 'Dar Es Salaam' } 0..$#cities;
say $index1;

my $record2 = first { $_->{population} < 5 } @cities;
say $record2->{name};

my $record3 = first { $_->{name} =~ /^A/ } @cities;
say $record3->{population};</lang>

Output:

6
Khartoum-Omdurman
4.58

The CPAN module List::MoreUtils provides the first_index function which could be used to write that first case more elegantly:

<lang perl>use List::MoreUtils qw(first_index);

$index1 = first_index { $_->{name} eq 'Dar Es Salaam' } @cities;</lang>

Perl 6

The built-in method .first fulfills the requirements of this task.
It takes any smart-matcher as a predicate. The :k adverb makes it return the key (i.e. numerical index) instead of the value of the element.

Works with: Rakudo version 2016.08

<lang perl6>my @cities =

 { name => 'Lagos',                population => 21.0  },
 { name => 'Cairo',                population => 15.2  },
 { name => 'Kinshasa-Brazzaville', population => 11.3  },
 { name => 'Greater Johannesburg', population =>  7.55 },
 { name => 'Mogadishu',            population =>  5.85 },
 { name => 'Khartoum-Omdurman',    population =>  4.98 },
 { name => 'Dar Es Salaam',        population =>  4.7  },
 { name => 'Alexandria',           population =>  4.58 },
 { name => 'Abidjan',              population =>  4.4  },
 { name => 'Casablanca',           population =>  3.98 },

;

say @cities.first(*<name> eq 'Dar Es Salaam', :k);
say @cities.first(*<population> < 5).<name>;
say @cities.first(*<name>.match: /^A/).<population>;</lang>

Output:

6
Khartoum-Omdurman
4.58

Phix

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
A third test-case (search condition) has been added to the task description.

<lang Phix>constant CITY_NAME = 1, POPULATION = 2
constant municipalities = {{"Lagos",21},

                          {"Cairo",15.2},
                          {"Kinshasa-Brazzaville",11.3},
                          {"Greater Johannesburg",7.55},
                          {"Mogadishu",5.85},
                          {"Khartoum-Omdurman",4.98},
                          {"Dar Es Salaam",4.7},
                          {"Alexandria",4.58},
                          {"Abidjan",4.4},
                          {"Casablanca",3.98}}

function searchfor(sequence s, integer rid, object user_data, integer return_index=0)

   for i=1 to length(s) do
       if call_func(rid,{s[i],user_data}) then
           return iff(return_index?i:s[i])
       end if
   end for
   return 0 -- not found

end function

function city_named(sequence si, string city_name)

   return si[CITY_NAME]=city_name

end function

?searchfor(municipalities,routine_id("city_named"),"Dar Es Salaam",1)

function smaller_than(sequence si, atom population)

   return si[POPULATION]<population

end function

?searchfor(municipalities,routine_id("smaller_than"),5)[CITY_NAME]</lang>
The columnize function reorganises heterogeneous data into corresponding homogeneous arrays, which can make this sort of thing much simpler, at least for exact matches.
<lang Phix>constant {cities,populations} = columnize(municipalities)

?find("Dar Es Salaam",cities)</lang>

Output:

7
"Khartoum-Omdurman"
7

Note that Phix subscripts are 1-based, hence the output of 7 not 6.

PHP

This example is incorrect. Please fix the code and remove this message.

Details:
The stuff about searching simple lists of strings or numbers, doesn't belong here. Maybe move it to Search a list.

Solve the problem specified in the task description instead.

<lang PHP>echo in_array('needle', ['hay', 'stack']) ? 'Found' : 'Not found';</lang>

Racket

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
A third test-case (search condition) has been added to the task description.

The most idiomatic function for the task is findf, but it doesn't provide the position of the element in the list, so we write a variant. If the item is not found we return #f, as most Racket primitives do in these cases.
<lang Racket>#lang racket

(define (findf/pos proc lst)

 (let loop ([lst lst] [pos 0])
   (cond
     [(null? lst) #f]
     [(proc (car lst)) pos]
     [else (loop (cdr lst) (add1 pos))])))</lang>

Now we define the list that has the data for the task.
<lang Racket>(define data '(("Lagos" 21)

              ("Cairo" 15.2)
              ("Kinshasa-Brazzaville" 11.3)
              ("Greater Johannesburg" 7.55)
              ("Mogadishu" 5.85)
              ("Khartoum-Omdurman" 4.98)
              ("Dar Es Salaam" 4.7)
              ("Alexandria" 4.58)
              ("Abidjan" 4.4)
              ("Casablanca" 3.98)))</lang>

We write tiny wrappers to solve the specific task.
<lang Racket>(define (city-pos name)

 (findf/pos (lambda (x) (string=? (car x) name)) data))

(city-pos "Dar Es Salaam")
(city-pos "Buenos Aires")

(define (city-smaller pop)

 (let ([city (findf (lambda (x) (< (cadr x) pop)) data)])
   (and city (car city))))

(city-smaller 5)
(city-smaller -1)</lang>

Output:

6
#f
"Khartoum-Omdurman"
#f

REXX

It is more idiomatic in REXX to use sparse arrays to express a list of CSV values, especially those which have
embedded blanks in them (or other special characters).

Most REXX interpreters use (very efficient) hashing to index sparse arrays, which is much faster than performing an
incremental (sequential) search through an indexed array.

Only one loop is needed to find the result for the 2nd task requirement (although the loop could be eliminated).
The other two task requirements are found without using traditional IF statements.

The approach taken in this REXX program makes use of a DO WHILE and DO UNTIL structure which
makes it much simpler (and idiomatic) and easier to code (instead of adding multiple IF statements to a
generic search routine/function).

This REXX version does not rely on the list being sorted by population count.
<lang rexx>/*REXX program (when using criteria) locates values (indices) from an associate array. */
$="Lagos=21, Cairo=15.2, Kinshasa-Brazzaville=11.3, Greater Johannesburg=7.55, Mogadishu=5.85,",
  "Khartoum-Omdurman=4.98, Dar Es Salaam=4.7,  Alexandria=4.58,   Abidjan=4.4,  Casablanca=3.98"
@.= '(city not found)';     city.= "(no city)"     /*city search results for not found.*/
                                                   /* [↓]  construct associate arrays. */
    do #=0  while $\='';  parse var $ c '=' p "," $;  c=space(c);  parse var c a 2;  @.c=#
    city.#=c;  pop.#=p;  pop.c=#;  if @.a==@.  then @.a=c;  /*assign city, pop, indices.*/
    end   /*#*/                                    /* [↑]  city array starts at 0 index*/
                       /*▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ task 1:  show the  INDEX  of a city.*/
town= 'Dar Es Salaam'                              /*the name of a city for the search.*/
say 'The city of ' town " has an index of: " @.town /*show (zero─based) index of a city.*/
say                    /*▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ task 2:  show 1st city whose pop<5 M*/
many=5                                             /*size of a city's pop in millions. */
      do k=0  for #  until pop.k<many; end         /*find a city's pop from an index.  */
say '1st city that has a population less than ' many " million is: " city.k
say                    /*▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ task 3:  show 1st city with A* name.*/
c1= 'A'                                            /*1st character of a city for search*/
say '1st city that starts with the letter' c1 "is: " @.c1
                                                   /*stick a fork in it,  all done*/</lang>
output when using the default inputs:

The city of  Dar Es Salaam  has an index of:  6

1st city that has a population less than  5  million is:  Khartoum-Omdurman

1st city that starts with the letter A is:  Alexandria

Ring

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
A third test-case (search condition) has been added to the task description.

The code uses a hard-coded loop for the population<5 search. Consider a more generic approach as asked for by the task description, or explain to readers why that wasn't possible or sensible in this language.

<lang Ring>
name = 1
population = 2
cities = [
    ["Lagos", 21],
    ["Cairo", 15.2],
    ["Kinshasa-Brazzaville", 11.3],
    ["Greater Johannesburg", 7.55],
    ["Mogadishu", 5.85],
    ["Khartoum-Omdurman", 4.98],
    ["Dar Es Salaam", 4.7],
    ["Alexandria", 4.58],
    ["Abidjan", 4.4],
    ["Casablanca", 3.98]
]
See find(cities,"Dar Es Salaam",name) + nl               # output = 7
See cities[find(cities,4.58,population)][name] + nl      # output = Alexandria
for x in cities
    if x[population] < 5
        see x[name] + nl
        exit
    ok
next                                                      # output = Khartoum-Omdurman
</lang>

Ruby

<lang Ruby>cities = [

   {name: "Lagos", population: 21}, 
   {name: "Cairo", population: 15.2}, 
   {name: "Kinshasa-Brazzaville", population: 11.3}, 
   {name: "Greater Johannesburg", population: 7.55}, 
   {name: "Mogadishu", population: 5.85}, 
   {name: "Khartoum-Omdurman", population: 4.98}, 
   {name: "Dar Es Salaam", population: 4.7}, 
   {name: "Alexandria", population: 4.58}, 
   {name: "Abidjan", population: 4.4}, 
   {name: "Casablanca", population: 3.98},

]

puts cities.index{|city| city[:name] == "Dar Es Salaam"}      # => 6
puts cities.find {|city| city[:population] < 5.0}[:name]      # => Khartoum-Omdurman
puts cities.find {|city| city[:name][0] == "A"}[:population]  # => 4.58
</lang>

Sidef

<lang ruby>struct City {

   String name,
   Number population,

}

var cities = [

   City("Lagos", 21),
   City("Cairo", 15.2),
   City("Kinshasa-Brazzaville", 11.3),
   City("Greater Johannesburg", 7.55),
   City("Mogadishu", 5.85),
   City("Khartoum-Omdurman", 4.98),
   City("Dar Es Salaam", 4.7),
   City("Alexandria", 4.58),
   City("Abidjan", 4.4),
   City("Casablanca", 3.98),

]

say cities.index{|city| city.name == "Dar Es Salaam"}
say cities.first{|city| city.population < 5.0}.name
say cities.first{|city| city.name.begins_with("A")}.population</lang>

Output:

6
Khartoum-Omdurman
4.58

Tcl

This example needs updating due to a modification in the task. Please examine and fix the code if needed, then remove this message.

Details:
A third test-case (search condition) has been added to the task description.

You should mention to readers that the bisect example relies on the list being sorted.

You should explain to readers why you deviated from the data structure in the task description by making the records positional (index-based) instead of associative (key-based).

Tcl's lsearch command takes many useful options. This task illustrates the -index and -bisect options.

<lang Tcl>set cities {

   {"Lagos"    21}
   {"Cairo"    15.2}
   {"Kinshasa Brazzaville"     11.3}
   {"Greater Johannesburg"     7.55}
   {"Mogadishu"        5.85}
   {"Khartoum Omdurman"        4.98}
   {"Dar Es Salaam"    4.7}
   {"Alexandria"       4.58}
   {"Abidjan"  4.4}
   {"Casablanca"       3.98}

}

puts "Dar Es Salaam is at position: [

   lsearch -index 0 $cities "Dar Es Salaam"

]."

set cities_by_size [

   lsort -index 1 -real $cities

]

puts "The largest city of < 5m is: [

   lsearch -inline -index 1 -bisect -real $cities_by_size 5.0

]"</lang>

Output:

Dar Es Salaam is at position: 6.
The largest city of < 5m is: "Khartoum Omdurman"        4.98

zkl

<lang zkl>list:=T(SD("name","Lagos",                "population",21.0),  // SD is a fixed dictionary
        SD("name","Cairo",                "population",15.2),
        SD("name","Kinshasa-Brazzaville", "population",11.3),
        SD("name","Greater Johannesburg", "population", 7.55),
        SD("name","Mogadishu",            "population", 5.85),
        SD("name","Khartoum-Omdurman",    "population", 4.98),
        SD("name","Dar Es Salaam",        "population", 4.7),
        SD("name","Alexandria",           "population", 4.58),
        SD("name","Abidjan",              "population", 4.4),
        SD("name","Casablanca",           "population", 3.98));

// Test case 1:
n:=list.filter1n(fcn(city){ city["name"]=="Dar Es Salaam" });            // one way
n:=list.filter1n(fcn(city){ city["name"].matches("dar es salaam") });    // or this way
n.println("==index of ",list[n].values);

// Test case 2:
city:=list.filter1(fcn(city){ city["population"]<5.0 });   // stop after first match
city["name"].println(" is the first city with population under 5 million.");

// Test case 3:
city:=list.filter1(fcn(city){ city["name"][0]=="A" });
println("The first \"A*\" city (%s) with population under 5 million: %f".fmt(city.values.xplode()));</lang>
where SD is a small read-only dictionary and filter1 is a filter that stops at the first match (returning the matched item). The filter method returns False on failure. The YAJL library could be used to parse the JSON data directly (e.g. if the data comes from the web).

Output:

6==index of L("Dar Es Salaam",4.7)
Khartoum-Omdurman is the first city with population under 5 million.
The first "A*" city (Alexandria) with population under 5 million: 4.580000
