big data 8
PRACTICAL NO – 8

Aim: Implementing a Clustering Algorithm Using Map-Reduce

Algorithm for Mapper

Input: A set of objects X = {x1, x2, …, xn} and a set of initial centroids C = {c1, c2, …, ck}
Output: An output list containing pairs (ci, xj), where 1 ≤ i ≤ k and 1 ≤ j ≤ n

Procedure
M1 ← {x1, x2, …, xm}
current_centroids ← C
$\mathrm{Distance}(p, q) = \sqrt{\sum_{i=1}^{d} (p_i - q_i)^2}$ (where pi (or qi) is the coordinate of p (or q) in dimension i, and d is the number of dimensions)
for all xi ∈ M1 such that 1 ≤ i ≤ m do
    bestCentroid ← null
    minDist ← ∞
    for all c ∈ current_centroids do
        dist ← Distance(xi, c)
        if (bestCentroid = null || dist < minDist) then
            minDist ← dist
            bestCentroid ← c
        end if
    end for
    emit (bestCentroid, xi)
    i += 1
end for
return Outputlist

Algorithm for Reducer

Input: (Key, Value), where Key = bestCentroid and Value = objects assigned to the centroid by the mapper
Output:
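
Illustration of the mapper step: the mapper procedure given earlier in this practical can be sketched in plain Python as follows. This is only a minimal sketch, not a Hadoop job. The function names `distance` and `kmeans_mapper`, the representation of points as tuples, and the sample data are assumptions made for illustration and do not come from the practical itself.

```python
import math


def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def kmeans_mapper(split, centroids):
    """Yield one (best_centroid, point) pair for each point in the split.

    `split` plays the role of M1 and `centroids` the role of
    current_centroids in the pseudocode (hypothetical names).
    """
    for x in split:
        best_centroid = None
        min_dist = math.inf
        for c in centroids:
            dist = distance(x, c)
            if best_centroid is None or dist < min_dist:
                min_dist = dist
                best_centroid = c
        # Emit only after every centroid has been compared with x.
        yield best_centroid, x


if __name__ == "__main__":
    # Hypothetical sample data: three 2-D points and two initial centroids.
    sample_centroids = [(0.0, 0.0), (10.0, 10.0)]
    sample_points = [(1.0, 1.0), (9.0, 8.0), (0.0, 2.0)]
    for centroid, point in kmeans_mapper(sample_points, sample_centroids):
        print(centroid, point)
```

Each emitted pair has the nearest centroid as the key, so a later grouping step can collect all objects assigned to the same centroid.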