COMP7630 web Intelligence and its Applications
web | Algorithm代做 | 代做assignment | database作业 – 这是一个web的practice, 考察web的理解, 涉及了web | Algorithm代做 | 代做assignment | database等代写方面, 这个项目是assignment代写的代写题目
Department of Computer Science
COMP7630 web Intelligence and its Applications
Semester 2, 2022/
assignment 2 (Due March 2 6 , 23:59)
Evolutionary Algorithms and Association Rules Mining
Submit via the link provided on the Moodle course page by the 2 6 March. It is possible to submit either a Word or PDF document. For handwritten answers, please make sure they are legible, then scan them and submit a softcopy.
EVOLUTIONARY ALGORITHMS ( 35 MARKS)
- Enumerate all the possible offspring solutions which can be generated by applying the one-point crossover to the following two parent solutions: Parent1 = 01010 Parent2 = 10101
- What are the probabilities that the single bit-flip mutation and the uniform mutation operators, when applied to the solution x=1 10011 , will generate the new solution y=1 01101 as a mutant? Also explain why.
- Let NoMutationGA be a genetic Algorithm that does not use any mutation operator, i.e., its only components are the selection, crossover and replacement operators (as those seen in the lessons). Suppose also that, at iteration t of an execution, the population of NoMutationGA is formed by the following solutions: x1 = 01010 x2 = 01111 x3 = 11111 x4 = 11101 x5 = 01101 Discuss which solutions can and cannot be generated by NoMutationGA in the iterations t + k , for k 1.
- Let f and g be two simple Objective functions, taking in input a bit-string of length 4 and returning a real number to be maximized, defined as follows: def f(x): return x[0] + x[1] + x[2] + x[3] def g(x): return 2.8 * f(x) + 88. Also consider a genetic algorithm, called GA1 , which uses the following operators: tournament selection, one-point crossover, uniform mutation, and plus replacement. Suppose that GA1 is run once for each objective function by initializing the random number generator with the same seed. Let indicate the two executions with GA1 ( f , seed ) and GA1 ( g , seed ). Is there a relationship between the solutions evaluated in the executions GA1 ( f , seed ) and GA1 ( g , seed )? If so, explain what it is and try to discuss a more general property of GA.
- Let h be the objective function, taking in input a vector in ^3 and returning a real number to be minimized, defined as follows: def h(x): if x[0]==x[1]==x[2]==8.8: return 0. return 1. Also consider: (i) a very trivial algorithm, called Random Search , which randomly generates k solutions, evaluates them, and returns the best one; (ii) the Differential Evolution algorithm with a budget of k evaluations; (iii) the Particle Sw arm Optimization algorithm with a budget of k evaluations. Do you expect to observe any difference among the performances of the three algorithms in minimizing h? Also explain why.
ASSOCIATION RULES mining ( 35 MARKS)
- Table 1 shows the database of transactions of an electronic store. Transaction ID Items T1 Laptop, Mouse, Smartphone T2 Laptop, Mouse T3 Laptop, UsbPen, Headset T4 Headset, UsbPen T5 Headset, Smartphone T6 Laptop, UsbPen, Headset Table 1 a. Trace the results of using the Apriori algorithm, considering minimum support minsup =33% and minimum confidence minconf =60%, by showing both the candidate and frequent itemsets for each database scan. b. Enumerate all the final frequent itemsets. c. Indicate the association rules that are generated, along with their support and confidence, then highlight the valid ones. d. List the valid association rules, sorted by lift and reporting their lift values.
- Suppose you are responsible for managing the stores e-commerce website of question 6. Moreover, you notice that: – a customer has the UsbPen item in her/his virtual shopping basket, – for graphical reasons, the basket webpage only allows one item to be suggested. Given the database of transactions of Table 1 , which item would you suggest? Also explain why.
- Suppose that each transaction has a numeric weight (e.g., the amount of money paid), design and discuss a strategy that allows the Apriori algorithm to take the weights into account, but without changing the internal mechanisms of Apriori.
EVOLUTIONARY ALGORITHMS FOR ASSOCIATION RULES MINING ( 30 MARKS)
- Suppose to have a large database of transactions, each representing the set of hashtags in a social media post. The number of different hashtags is also very large. This scenario is particularly challenging for the Apriori algorithm, so we decide to design a genetic algorithm, called GA-ARules , for mining some^1 frequent itemsets of co-occurring hashtags. GA-ARules is designed as shown in the pseudocode below. initialization while termination condition not satisfied do: selection crossover mutation evaluation of population fitnesses update the external archive replacement return the external archive In practice, the individual solutions in the population of GA-ARules are itemsets that, after their fitness is evaluated, are added to an archive (external to the evolved population) if and only if their support is greater than a user specified minimum support threshold minsup. This external archive is finally returned by GA-ARules. All the other GA-ARules operators (selection, crossover, mutation, replacement) can be chosen among those described in the lessons. In the light of that, answer to the following questions. a. Design a representation for the solutions evolved by GA-ARules. b. Is there any preprocessing calculation which may reduce the size of the search space navigated by GA-ARules? If so, describe how it should work. c. Based on the chosen representation, which crossover can be adopted? Either choose one of the crossover operators shown in the lessons or quickly design a suitable one. Also explain why the chosen crossover has to be preferred for the chosen representation. d. Design a fitness function to evaluate the goodness of an itemset that also encourages the production of small itemsets. e. How many scans of the database are required to evaluate a population of N solutions at each iteration of GA-ARules? Also explain why. f. When a frequent itemset F is added to the external archive, is it possible to automatically infer other frequent itemsets related to F , even if they were never visited by GA-ARules? If so, explain how.
(^1) Due to the nature of evolutionary algorithms, it is not possible to guarantee that a GA will learn all the frequent itemsets.