COMP Lab 1B: Introduction to Programming on Cluster

Using Text UI

(last revamped by Mandel Chan)

Connecting the Big-Data Cluster

  1. Download an SSH client to establish a connection to the big-data cluster. You may use PuTTY. Download URL: https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html
  2. First, connect to the gateway machine faith.comp.hkbu.edu.hk (port 22), and log in using your COMP department user name and password.
  3. At the command prompt, use the following command to connect to a node of the big-data cluster: $ ssh csrXX ## where XX is a number in the range from 50 to 54
  4. Use your research machine password to sign in to the node. A consolidated sketch of the whole sequence follows this list.
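Putting the steps together, a typical session looks like the following (your_comp_username is a placeholder for your own COMP account, and csr52 is just one example node in the csr50 to csr54 range):

$ ssh your_comp_username@faith.comp.hkbu.edu.hk ## gateway; COMP department password
$ ssh csr52 ## cluster node; research machine password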

Running Example

Next, let's follow the steps below to run the Hadoop bundled example wordcount under the Tez framework.

  1. Change the current directory to word_count. $ cd /home/bigdata/word_count
  2. Run it. The run_wordcount file is a script that contains the commands for running the program. $ ./run_wordcount
  3. You can list all the output files by using the following command: $ hadoop fs -ls ~/output_txt
  4. Run the following command to show the contents of the output files (a sample of the output format follows this list): $ hadoop fs -cat ~/output_txt/part-*
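Each output line is a word and its count, separated by a tab and sorted by word. For example, the output may contain lines like these (illustrative values only; the actual words and counts depend on the input files):

Hello   3
World   2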

Getting Source Code

We can obtain the source code of the MapReduce example from its installation path. The following commands extract the source code:

$ mkdir ~/src
$ cd /opt/hadoop/share/hadoop/mapreduce/sources/
$ cp hadoop-mapreduce-examples-2.7.1-sources.jar ~/src
$ cd ~/src
$ jar xf hadoop-mapreduce-examples-2.7.1-sources.jar
$ cd org/apache/hadoop/examples
$ ls

Then you can edit the sample code using Unix tools (e.g., emacs or vi), or edit it in IntelliJ and upload the code afterwards.
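For orientation before editing, the bundled WordCount example has roughly the following shape (a condensed sketch based on the well-known Hadoop 2.7.1 example; consult the extracted WordCount.java for the authoritative version):

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in each input line
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sums the counts for each word
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory (must not exist yet)
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Any change you make (for example, lower-casing tokens in map) goes into the extracted file; the Compile section below then rebuilds the JAR from it.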

Compile

Once you have finished coding, you can compile the source into Java class files. You can compile with commands on Linux, or build in IntelliJ and upload the result to the server.

The following is the procedure of the JAR generation:

  1. Compile the files in ~/src (see the note on $CLASSPATH after this list):
$ javac -classpath $CLASSPATH:. org/apache/hadoop/examples/WordCount.java
  2. Then, pack the class files into a single JAR package:
$ jar -cvf wordcount.jar org/apache/hadoop/examples/*.class
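The javac step assumes that $CLASSPATH already includes the Hadoop jars on the node. If it does not (this depends on the cluster's shell setup, so treat it as an assumption to verify), the hadoop classpath command prints a suitable value:

$ export CLASSPATH=$(hadoop classpath)
$ javac -classpath $CLASSPATH:. org/apache/hadoop/examples/WordCount.java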

Running Program on Cluster

The final step is to run your program on the cluster. If you have not yet logged in to the research machine in the big-data cluster, you need to repeat the steps shown in the Connecting the Big-Data Cluster section.

First, we need to copy the data from local storage (NFS) to Hadoop HDFS:

$ hadoop fs -copyFromLocal /home/bigdata/word_count/input_txt /home/comp/$USER/
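To confirm that the copy succeeded, you can list the uploaded directory on HDFS (an optional check, not part of the original steps):

$ hadoop fs -ls ~/input_txt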

Then, the following commands are used to run your program:

$ hadoop fs -rm -r ~/output_txt
$ hadoop jar ./wordcount.jar org.apache.hadoop.examples.WordCount ~/input_txt ~/output_txt

Finally, the following command is used to print the output on the screen:

$ hadoop fs -cat ~/output_txt/*
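If you also want a copy of the results back in your NFS home directory, hadoop fs -get downloads them (the destination path below is just an example):

$ hadoop fs -get ~/output_txt ~/output_txt_local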