Subset sum problem

From Rosetta Code
Revision as of 19:03, 1 January 2012 by rosettacode>Sluggo (more comments)
Subset sum problem is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Implement a function/procedure/method/subroutine that takes a set/array/list/stream/table/collection of words with integer weights, and identifies a non-empty subset of them whose weights sum to zero (cf. the Dropbox Diet candidate screening exercise and the Subset sum problem Wikipedia article).

For example, for this set of weighted words, one solution would be the set of words {elysee, efferent, deploy, departure, centipede, bonnet, balm, archbishop}, because their respective weights of -326, 54, 44, 952, -658, 452, 397, and -915 sum to zero.

Table of weighted words
word weight
alliance -624
archbishop -915
balm 397
bonnet 452
brute 870
centipede -658
cobol 362
covariate 590
departure 952
deploy 44
diophantine 645
efferent 54
elysee -326
eradicate 376
escritoire 856
exorcism -983
fiat 170
filmy -874
flatworm 503
gestapo 915
infra -847
isis -982
lindholm 999
markham 475
mincemeat -880
moresby 756
mycenae 183
plugging -266
smokescreen 423
speakeasy -745
vein 813

Another solution would be the set of words {flatworm, gestapo, infra, isis, lindholm, plugging, smokescreen, speakeasy}, because their respective weights of 503, 915, -847, -982, 999, -266, 423, and -745 also sum to zero.

You may assume the weights range from -1000 to 1000. If there are multiple solutions, only one needs to be found. Use any algorithm you want and demonstrate it on a set of at least 30 weighted words with the results shown in a human readable form. Note that an implementation that depends on enumerating all possible subsets is likely to be infeasible.

Ursala

This solution scans the set sequentially while maintaining a record of all distinct sums obtainable by words encountered thus far, and stops when a zero sum is found. <lang Ursala>#import std

  1. import int

weights =

{

  'alliance': -624,
  'archbishop': -915,
  'balm': 397,
  'bonnet': 452,
  'brute': 870,
  'centipede': -658,
  'cobol': 362,
  'covariate': 590,
  'departure': 952,
  'deploy': 44,
  'diophantine': 645,
  'efferent': 54,
  'elysee': -326,
  'eradicate': 376,
  'escritoire': 856,
  'exorcism': -983,
  'fiat': 170,
  'filmy': -874,
  'flatworm': 503,
  'gestapo': 915,
  'infra': -847,
  'isis': -982,
  'lindholm': 999,
  'markham': 475,
  'mincemeat': -880,
  'moresby': 756,
  'mycenae': 183,
  'plugging': -266,
  'smokescreen': 423,
  'speakeasy': -745,
  'vein': 813}

nullset = ~&nZFihmPB+ =><> ~&ng?r\~&r ^TnK2hS\~&r ^C/~&lmPlNCX *D ^A/sum@lmPrnPX ~&lrmPC

  1. cast %zm

main = nullset weights</lang> The name of the function that takes the weighted set is nullset. It manipulates a partial result represented as a list of pairs, each containing a subset of weighted words and the sum of their weights. Here is a rough translation:

  • =><> fold right combinator with the empty list as the vacuuous case
  • ~&ng?r\~&r If the partial result contains a zero sum, return it.
  • ^TnK2hS\~&r Concatenate the partial result with the new list of subsets (computed as follows) and delete duplicate sums.
  • ^C/~&lmPlNCX Cons a singleton subset containing the next word to the partial results.
  • *D Distribute the next word in the set to the partial results and do the following to each.
  • sum@lmPrnPX Add the weight of the new word to the existing sum.
  • ~&lrmPC Cons the new word to the list of existing ones.
  • ~&nZFihmPB+ To conclude, search for a result with a zero sum, if any, and return its associated subset of weighted words.

output:

<
   'flatworm': 503,
   'gestapo': 915,
   'infra': -847,
   'isis': -982,
   'lindholm': 999,
   'plugging': -266,
   'smokescreen': 423,
   'speakeasy': -745>