# Social Network Lab (Spark): People You Might Know

### 1 People You Might Know


Write a Spark program that implements a simple People You Might Know social network friendship recommendation algorithm. The key idea is that if two people have a lot of mutual friends, then the system should recommend that they connect with each other. For example, A and E in Figure 1 have 3 mutual friends (i.e., B, C, D), which is a high number in this toy example. But if the two users are already friends, the system should not recommend them to each other, even if they share mutual friends. For example, E and F have one mutual friend G, but they are already friends with each other.

Figure 1: A social network toy example.
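To make the rule concrete, here is a minimal sketch in plain Python. The figure itself is not reproduced in this text, so `TOY_EDGES` is a hypothetical edge list chosen only to match the two facts stated above (A and E share B, C, D; E and F are friends and share G). The helper names are illustrative as well.

```python
from collections import defaultdict

# Hypothetical toy graph: consistent with the text, not a copy of Figure 1.
TOY_EDGES = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("E", "B"), ("E", "C"), ("E", "D"),
    ("E", "F"), ("E", "G"), ("F", "G"),
]


def build_adjacency(edges):
    """Store each undirected edge in both directions."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def mutual_friends(adj, u, v):
    """Friends that u and v have in common."""
    return adj[u] & adj[v]


def should_recommend(adj, u, v):
    """Recommend v to u only if they are not friends yet and share a friend."""
    return u != v and v not in adj[u] and len(mutual_friends(adj, u, v)) > 0


adj = build_adjacency(TOY_EDGES)
print(sorted(mutual_friends(adj, "A", "E")))  # ['B', 'C', 'D']
print(should_recommend(adj, "A", "E"))        # True
print(sorted(mutual_friends(adj, "E", "F")))  # ['G']
print(should_recommend(adj, "E", "F"))        # False: already friends
```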

Data:

• The associated data file is ego-facebook.txt in A1/.
• The file contains the edge list and has multiple lines, each listing one pair `<User1>` `<User2>`. Here, `<User1>` and `<User2>` are unique integer IDs corresponding to two unique users, respectively. The pair denotes that `<User1>` and `<User2>` are friends. Note that the friendships are mutual (i.e., edges are undirected): if A is friends with B, then B is also friends with A. For the friendship between A and B, we only have one edge in the data.
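Because each friendship appears only once in the file, a program usually needs to expand every edge into both directions before grouping by user. The sketch below shows one way to do that. It assumes the two IDs on a line are separated by whitespace (the exact separator is not stated here), and `parse_edge`, `both_directions`, and the sample lines are hypothetical.

```python
def parse_edge(line):
    """Parse one edge-list line into a (user1, user2) tuple of ints.

    Assumption: the two IDs are separated by whitespace (space or tab).
    Returns None for blank or malformed lines so they can be filtered out.
    """
    parts = line.split()
    if len(parts) != 2:
        return None
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        return None


def both_directions(edge):
    """Expand one undirected edge into its two directed copies."""
    u, v = edge
    return [(u, v), (v, u)]


# Hypothetical sample lines, not taken from the real data file.
sample_lines = ["0 1", "0\t2", "", "1 2"]
edges = [e for e in map(parse_edge, sample_lines) if e is not None]
directed = [d for e in edges for d in both_directions(e)]
print(edges)     # [(0, 1), (0, 2), (1, 2)]
print(directed)  # [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1)]
```

On an RDD of text lines (for example one returned by `sc.textFile`), the same helpers could be chained as `lines.map(parse_edge).filter(lambda e: e is not None).flatMap(both_directions).groupByKey()` to obtain one friend list per user.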

Algorithm: Let us use a simple algorithm such that, for each user u, the algorithm recommends N = 10 users who are not already friends with u but have the largest number of mutual friends in common with u.
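One possible way to shape this for Spark, offered as a sketch rather than the expected solution: every user u "introduces" each pair of their own friends to one another, since those two people share u as a mutual friend. Existing friendships are flagged so they can be dropped after counting. The function names and the `(count, already_friends)` value layout are assumptions made for illustration. `mutual_counts` expects a pair RDD of `(user, iterable of friend IDs)` and uses only the standard RDD methods `flatMap`, `reduceByKey`, `filter`, and `map`.

```python
from itertools import permutations


def emit_pairs(record):
    """Turn one (user, friends) record into ((a, b), (count, already_friends)).

    (user, friend) pairs carry no credit and are flagged as existing friends.
    Every ordered pair of two distinct friends of `user` gets one credit,
    because `user` is a mutual friend of both.
    """
    user, friends = record
    friends = sorted(set(friends))
    out = [((user, f), (0, True)) for f in friends]
    out += [((a, b), (1, False)) for a, b in permutations(friends, 2)]
    return out


def merge(x, y):
    """Add up the credits and remember whether the pair is already connected."""
    return (x[0] + y[0], x[1] or y[1])


def mutual_counts(adjacency_rdd):
    """Return an RDD of (user, (candidate, mutual_friend_count)), non-friends only."""
    return (
        adjacency_rdd
        .flatMap(emit_pairs)
        .reduceByKey(merge)
        .filter(lambda kv: not kv[1][1])                   # drop existing friends
        .map(lambda kv: (kv[0][0], (kv[0][1], kv[1][0])))  # re-key by user
    )


# Local check of the pair generation, no Spark needed:
print(emit_pairs((1, [2, 3])))
# [((1, 2), (0, True)), ((1, 3), (0, True)), ((2, 3), (1, False)), ((3, 2), (1, False))]
```

A user whose candidates are all filtered out does not appear in the result of `mutual_counts`. Joining back against the full set of users (for example with `leftOuterJoin`) is one way to still report an empty list for them.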

Output:

• The output should contain one line per user, consisting of `<User>` followed by `<Recommendations>`, where `<User>` is a unique ID corresponding to a user and `<Recommendations>` is a comma-separated list of unique IDs corresponding to the algorithm's recommendation of people that `<User>` might know, ordered in decreasing number of mutual friends.
• Note: The exact number of recommendations per user could be fewer than 10. If a user has fewer than 10 second-degree friends, output all of them in decreasing order of the number of mutual friends. If a user has no friends, you can provide an empty list of recommendations. If there are recommended users with the same number of mutual friends, then output those user IDs in numerically ascending order.
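The ordering rules above (more mutual friends first, ties broken by ascending ID, at most N entries) can be expressed as a single sort key. The sketch below is illustrative: the function names are hypothetical, and the tab used between the user ID and the list is an assumption, since the exact separator is not given here.

```python
def top_recommendations(candidate_counts, n=10):
    """Rank (candidate_id, mutual_friend_count) pairs and keep the best n.

    Sort key: count descending, then candidate ID ascending for ties.
    """
    ranked = sorted(candidate_counts, key=lambda pair: (-pair[1], pair[0]))
    return [candidate for candidate, _ in ranked[:n]]


def format_line(user, recommendations, sep="\t"):
    """Render one output line; `sep` is an assumed separator."""
    return f"{user}{sep}{','.join(str(r) for r in recommendations)}"


# Hypothetical counts for one user: IDs 3 and 4 tie at 5, IDs 7 and 9 tie at 2.
counts = [(7, 2), (3, 5), (9, 2), (4, 5), (1, 1)]
print(top_recommendations(counts))              # [3, 4, 7, 9, 1]
print(top_recommendations(counts, n=3))         # [3, 4, 7]
print(repr(format_line(10, [3, 4, 7])))         # '10\t3,4,7'
print(repr(format_line(10, [])))                # '10\t'
```

With a pair RDD of `(user, (candidate, count))` records, one option is `rdd.groupByKey().mapValues(lambda cs: top_recommendations(cs, 10))`, since `sorted` accepts any iterable of pairs.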

Pipeline sketch: Please provide a description of how you used Spark to solve this problem. Don't write more than 3 to 4 sentences for this: we only want a very high-level description of your strategy to tackle this problem.

Tips:

• Use Google Colab to use Spark seamlessly, e.g., copy and adapt the setup cells from Colab
• Before submitting a complete application to Spark, you may go line by line, checking the outputs of each step. The command .take(X) should be helpful if you want to check the first X elements in the RDD.
• For a sanity check, your top 10 recommendations for user ID 1571 should be: 35, 247, 716, 719, 1526, 1527, 1528, 1529, 1530, 1531.
• The execution may take a while.
• You can also create a toy test dataset (e.g., using Figure 1) to help you debug the program.
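For debugging on a toy dataset, a slow but obvious reference implementation can serve as an oracle to compare against the Spark output. The sketch below is a brute-force version in plain Python that compares every pair of users directly, so it is only suitable for tiny graphs. The function name and the five-node edge list are hypothetical.

```python
from collections import defaultdict


def brute_force_recommendations(edges, n=10):
    """Reference answer for small graphs: check every pair of users directly."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    adj = dict(adj)  # freeze: no accidental key creation below

    result = {}
    for u in adj:
        scores = {}
        for v in adj:
            if v == u or v in adj[u]:
                continue  # skip self and existing friends
            shared = len(adj[u] & adj[v])
            if shared:
                scores[v] = shared
        # More mutual friends first, then ascending ID for ties.
        result[u] = sorted(scores, key=lambda v: (-scores[v], v))[:n]
    return result


# Hypothetical toy graph: a square 1-2-4-3-1 with an extra node 5 attached to 4.
toy_edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]
expected = brute_force_recommendations(toy_edges)
for user in sorted(expected):
    print(user, expected[user])
# 1 [4]
# 2 [3, 5]
# 3 [2, 5]
# 4 [1]
# 5 [2, 3]
```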

What to submit: You need to submit the following two files to BlackBoard.

1. A short writeup containing:
• A short paragraph sketching your Spark pipeline. (16 pts)
• The recommendations for the users with the following user IDs: 10, 152, 288, 603, 714, 1525, 2434, 2681. (8 pts for each, 64 pts in total)
2. Your code. (20 pts)