Lab 3: A Simple MapReduce-Style Wordcount Application
CMPSC 473, SUMMER 2022
Released on July 21, 2022, due on August 04, 2022, 11:59:59pm
Raj Pandey and Bhuvan Urgaonkar
1 Purpose and Background
This project is designed to give you experience in writing multi-threaded programs by
implementing a simplified MapReduce-style wordcount application. By working on this project:
- You will learn to write multi-threaded code that correctly deals with race conditions.
- You will carry out a simple performance evaluation to examine the impact of (i) the degree of parallelism in the mapper stage and (ii) the size of the shared buffer that the two stages of your application will use to communicate.
[Figure: Input File -> (read) -> Mappers -> (produce) -> Buffer -> (consume) -> Reducer -> (write) -> Output File]
Figure 1: Overview of our MapReduce-style multi-threaded wordcount application.
The wordcount application takes a text file as input and produces as output the count of each unique word in the input file, arranged in alphabetically increasing order. We will assume that the words in our input files contain only letters of the English alphabet and the digits 0-9 (i.e., no punctuation marks or other special characters). Our wordcount will consist of two stages. The first stage, called "mapper,"