Stream merge: Difference between revisions
Content added Content deleted
(→Python) |
|||
Line 11: | Line 11: | ||
== Python == |
== Python == |
||
Built-in function <code>open</code> opens a file for reading and returns a line-by-line iterator (stream) over the file. |
|||
There exists a standard library function <code>heapq.merge</code> that takes any number of sorted stream iterators and merges them into one sorted iterator, using a [[heap]]. |
|||
<lang python>import heapq |
<lang python>import heapq |
Revision as of 09:22, 15 June 2016
Stream merge
You are encouraged to solve this task according to the task description, using any language you may know.
You are encouraged to solve this task according to the task description, using any language you may know.
- 2-stream merge
- Read two sorted streams of items from external source (e.g. disk, or network), and write one stream of sorted items to external sink.
- Common algorithm: keep 1 buffered item from each source, select minimal of them, write it, fetch another item from that stream from which the written item was.
- N-stream merge
- The same as above, but reading from N sources.
- Common algorithm: same as above, but keep buffered items and their source descriptors in a heap.
Assuming streams are very big. You must not suck them whole in the memory, but read them as streams.
Python
Built-in function open
opens a file for reading and returns a line-by-line iterator (stream) over the file.
There exists a standard library function heapq.merge
that takes any number of sorted stream iterators and merges them into one sorted iterator, using a heap.
<lang python>import heapq import sys
sources = sys.argv[1:] for item in heapq.merge(open(source) for source in sources):
print(item)</lang>
Shell
sort --merge source1 source2 sourceN > sink