Map-Reduce
RDBMS/SQL empowers us by restricting how we store and query data.
Map-Reduce empowers us by restricting how we implement algorithms.
Why “map” and “reduce”?
Suppose we have a collection of documents or files and want to find the files that contain certain words. The obvious approach is to traverse the documents one by one and accumulate the final output. However, whether a word appears in one document is independent of every other document, so it is better to process the documents independently and then combine the results. That is why we use map-reduce: map handles each document on its own, and reduce combines the per-document results.
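The document search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the documents are a small in-memory dictionary, and the names documents, find_word, and combine are hypothetical, not part of any map-reduce framework. It uses Python's built-in map and functools.reduce on a single machine.

```python
from functools import reduce

# Hypothetical in-memory "documents": file name -> contents.
documents = {
    "a.txt": "map reduce splits work into independent pieces",
    "b.txt": "a relational database restricts how we store data",
    "c.txt": "reduce combines the partial results",
}


def find_word(item, word="reduce"):
    """Map step: inspect one document, independently of all the others."""
    name, text = item
    return [name] if word in text.split() else []


def combine(left, right):
    """Reduce step: merge two partial result lists into one."""
    return left + right


# Each document is processed on its own, then the partial results are combined.
partial_results = map(find_word, documents.items())
matching_files = reduce(combine, partial_results, [])
print(matching_files)  # ['a.txt', 'c.txt']
```

Because find_word never looks at more than one document, the map step could in principle run on many machines at once; only the combine step needs to see results from more than one document.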
map(function f, values[x1, x2, ..., xn]) -> [f(x1), f(x2), ..., f(xn)]
reduce(function g, values[x1, x2, ..., xn]) -> g(x1, reduce(g, [x2, ..., xn])), where reduce(g, [x]) -> x
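The two definitions above translate almost directly into Python. The sketch below is a minimal version, using a single-element list as the base case of the recursion; the names my_map and my_reduce are hypothetical and chosen to avoid clashing with Python's built-ins.

```python
def my_map(f, values):
    """Apply f to every element independently and collect the results."""
    return [f(x) for x in values]


def my_reduce(g, values):
    """Combine the elements with g, recursing on the tail of the list."""
    if not values:
        raise ValueError("my_reduce needs at least one value")
    if len(values) == 1:
        # Base case (an assumption of this sketch): one element reduces to itself.
        return values[0]
    return g(values[0], my_reduce(g, values[1:]))


squares = my_map(lambda x: x * x, [1, 2, 3])
total = my_reduce(lambda a, b: a + b, squares)
print(squares)  # [1, 4, 9]
print(total)    # 14
```

Note that this definition folds from the right: my_reduce(g, [1, 2, 3]) computes g(1, g(2, 3)). Python's functools.reduce folds from the left instead, computing g(g(1, 2), 3). The two agree whenever g is associative, as addition is, but they differ for an operation such as subtraction.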