Interesting paper about changes to MapReduce (M-R), in part to enable online processing. Ideas like pipelining and better inter-job data flow have been on the radar for a while.

This kind of thing will likely be useful on the Amazon cloud. Rather than uploading all of the data and only then running M-R, it might be possible to begin the job while the data is still uploading, and so get results back sooner.
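To make the idea concrete, here is a toy sketch in Python of a word count that folds each chunk into a running total as it arrives, instead of waiting for the whole upload. This is not the paper's implementation or any real M-R API: `map_chunk`, `online_word_count`, and `simulated_upload` are hypothetical names made up for illustration, and the "upload" is just a generator yielding strings.

```python
from collections import Counter
from typing import Iterable, Iterator


def map_chunk(chunk: str) -> Counter:
    """Map step: count the words in one chunk of uploaded text."""
    return Counter(chunk.split())


def online_word_count(chunks: Iterable[str]) -> Iterator[Counter]:
    """Fold each chunk into a running total as soon as it arrives.

    Yields a snapshot of the totals after every chunk, so a caller can
    look at partial results before the upload has finished.
    """
    totals: Counter = Counter()
    for chunk in chunks:
        totals.update(map_chunk(chunk))  # reduce step, applied incrementally
        yield Counter(totals)  # copy, so earlier snapshots are not mutated


def simulated_upload() -> Iterator[str]:
    """Hypothetical stand-in for data arriving piece by piece."""
    yield "map reduce map"
    yield "reduce online"
    yield "online online map"


if __name__ == "__main__":
    for i, snapshot in enumerate(online_word_count(simulated_upload()), start=1):
        print(f"after chunk {i}: {dict(snapshot)}")
```

The last snapshot matches what a batch job over the complete data would produce; the earlier ones are the partial answers you get sooner. The sketch assumes chunks break on word boundaries, which a real upload would not guarantee.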
