Fall 2018 CS24 Project 2
Due: December 07, 2018
Objectives : This project is the last project you will be working on this quarter. The main objective of this project is to gain experience developing larger programs in C++.

Project specification : In this project, you will be building upon the functionality you built in Project 1, with a few modifications. You will be creating 2 executables with different functionalities for this project.
- In the first executable, you will take as input a word to search for and a threshold value for the count, and you must output the list of files in which the word appears at least (threshold) times. For example, given the word cat and threshold 2, you must output the list of files in which the word cat appears 2 or more times.
- In the second executable, you will take 2 words as input, and you must output the union of the 2 lists of files in which the 2 words appear.

Each of the above 2 executables takes exactly one input during execution, so the inputs need not be read in a while loop. It suffices for the program to exit after printing the output for the single input given.

Input specification : The program takes as a command line argument a directory containing a set of files, each containing multiple words. Words can appear multiple times in a single file or in different files. The input files are assumed to be stored in a dedicated directory (e.g. /cs/class/cs24/project1/input) and loaded into memory upon program execution. Use the given driver program to get a list of (word count, file) from the list of input files. After the words have been loaded into memory, each of the above 2 executables takes a single input from the user, prints the corresponding output, and exits.

Program functionality : Below you can find the format specification.

First executable:

    $ ./wordsearchcount <path_to_input_files>

Example of the execution format:

    $ ./wordsearchcount input
    Enter word: cat
    Enter threshold: 2
    File: file1.txt; Count: 3
    File: file4.txt; Count: 3
    File: file2.txt; Count: 2
    (Program exits)

If executed as specified above, the program should print the names of the files that contain the word cat 2 or more times, and exit.

Second executable:

    $ ./wordsearchunion <path_to_input_files>

Example of the execution format:

    $ ./wordsearchunion input
    Enter word1:cat
    Enter word2:dog
    file1.txt
    file4.txt
    file6.txt
    file8.txt
    (Program exits)

If executed as specified above, the program should print the union of the lists of files that contain the words cat or dog.

Implementation requirements : You will be retaining the same functionality as Project 1, but with some key changes :
- Instead of using an array to store the list of words, you will create a doubly linked list of words, sorted in alphabetical order.
- Instead of using a bag to store the list of files, you will again create a doubly linked list of filenames with their respective counts. This list must be sorted by the number of occurrences of the word in the given file, in decreasing order. For example, if file A contains the word 3 times and file B contains the word 1 time, the doubly linked list will have file A first, followed by file B.

Key implementation detail: In Project 1, the file name and the respective count of the word were displayed from within the bag.cpp file (i.e., the print method was inside bag.cpp). In this project, for each of the 2 executables, a reference to the linked list must be returned to the calling function in the wordsearchcount / wordsearchunion cpp file, and the output must be printed from within that cpp file.
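The sorted file list and the caller-side printing described above might be sketched as follows. This is only an illustration, not the implementation discussed in class: the names FileNode, FileList, insertSorted, and printAtLeast are hypothetical, and the code assumes a compiler that defaults to C++11 or later. The alphabetically sorted word list could follow the same insertion pattern with a string comparison in place of the count comparison.

```cpp
#include <iostream>
#include <string>

// Hypothetical node type: one file name plus how often the word occurs in it.
struct FileNode {
    std::string file;
    int count;
    FileNode* prev;
    FileNode* next;
};

// Hypothetical doubly linked list of files, kept sorted by count, largest first.
class FileList {
public:
    FileList() : head_(nullptr), tail_(nullptr) {}
    ~FileList() {
        while (head_ != nullptr) {
            FileNode* next = head_->next;
            delete head_;
            head_ = next;
        }
    }
    // Copying is disabled so two lists never own the same nodes.
    FileList(const FileList&) = delete;
    FileList& operator=(const FileList&) = delete;

    // Insert before the first node with a smaller count, so files with
    // equal counts keep their insertion order.
    void insertSorted(const std::string& file, int count) {
        FileNode* node = new FileNode{file, count, nullptr, nullptr};
        FileNode* cur = head_;
        while (cur != nullptr && cur->count >= count) cur = cur->next;
        if (cur == nullptr) {                 // empty list or new smallest count
            node->prev = tail_;
            if (tail_ != nullptr) tail_->next = node; else head_ = node;
            tail_ = node;
        } else {                              // splice in just before cur
            node->prev = cur->prev;
            node->next = cur;
            if (cur->prev != nullptr) cur->prev->next = node; else head_ = node;
            cur->prev = node;
        }
    }

    const FileNode* head() const { return head_; }

private:
    FileNode* head_;
    FileNode* tail_;
};

// Printing happens in the caller, which only receives a reference to the list.
// Because the list is sorted in decreasing order, the walk can stop at the
// first node below the threshold. Returns the number of files printed.
int printAtLeast(const FileList& files, int threshold, std::ostream& out) {
    int printed = 0;
    for (const FileNode* n = files.head();
         n != nullptr && n->count >= threshold; n = n->next) {
        out << "File: " << n->file << "; Count: " << n->count << '\n';
        ++printed;
    }
    return printed;
}
```

Keeping the list sorted at insertion time means the threshold query needs no separate sorting step: the files to print are always a prefix of the list.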
A basic layout of the data structure you would need to implement is illustrated in the figure below:

Pointers for code changes in the implementation:
- word.cpp must now contain a reference to a doubly linked list of files.
- bag.cpp must now be replaced by list.cpp, which must take care of the functionality for the doubly linked list of files.
- itemtype.cpp may be reused as before.
- For finding the union of the 2 linked lists of files, use the pseudocode below as a reference:
    1. Create a new list and add the files of the first list (the list of files of word 1) to it.
    2. Take each element of the second list (the list of files of word 2) and check whether it is already present in the new list. Add it only if it is not.
    3. At the end, the new list contains the union of the 2 lists. Print out the list of files in this new list.
- You are advised to use the doubly linked list implementation discussed in class, and modify it to suit the needs of the project.

As before, you may assume the maximum number of files to be 100 and the maximum number of words to be 1000.

You will need 10 files: wordsearchcount.h, wordsearchcount.cpp, wordsearchunion.h, wordsearchunion.cpp, itemtype.h, itemtype.cpp, list.h, list.cpp, word.h, word.cpp.
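The union pseudocode above can be sketched in C++ roughly as follows. This is a minimal illustration under stated assumptions, not the class implementation: NameNode, NameList, pushBack, contains, and unionInto are hypothetical names, the nodes store only file names because the union output shows no counts, and the code assumes a compiler that defaults to C++11 or later.

```cpp
#include <string>

// Hypothetical node type holding just a file name.
struct NameNode {
    std::string file;
    NameNode* prev;
    NameNode* next;
};

// Hypothetical doubly linked list of file names in insertion order.
class NameList {
public:
    NameList() : head_(nullptr), tail_(nullptr) {}
    ~NameList() {
        while (head_ != nullptr) {
            NameNode* next = head_->next;
            delete head_;
            head_ = next;
        }
    }
    // Copying is disabled so two lists never own the same nodes.
    NameList(const NameList&) = delete;
    NameList& operator=(const NameList&) = delete;

    // Linear search for a file name.
    bool contains(const std::string& file) const {
        for (const NameNode* n = head_; n != nullptr; n = n->next) {
            if (n->file == file) return true;
        }
        return false;
    }

    // Append a new node at the tail.
    void pushBack(const std::string& file) {
        NameNode* node = new NameNode{file, tail_, nullptr};
        if (tail_ != nullptr) tail_->next = node; else head_ = node;
        tail_ = node;
    }

    const NameNode* head() const { return head_; }

private:
    NameNode* head_;
    NameNode* tail_;
};

// Fills result (assumed to start empty) with the union of a and b.
void unionInto(const NameList& a, const NameList& b, NameList& result) {
    // Step 1: copy every file of the first list into the new list.
    for (const NameNode* n = a.head(); n != nullptr; n = n->next) {
        result.pushBack(n->file);
    }
    // Step 2: add a file of the second list only if it is not there yet.
    for (const NameNode* n = b.head(); n != nullptr; n = n->next) {
        if (!result.contains(n->file)) result.pushBack(n->file);
    }
}
```

Each membership check walks the whole result list, so the union costs on the order of the product of the two list lengths. With at most 100 files, that is a reasonable trade for simplicity. The caller would then walk the result from head() and print each file name.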
Instructions for compilation: Your program should compile on a CSIL machine with the following command without any errors or warnings.

    $ g++ -o wordsearchcount wordsearchcount.cpp itemtype.cpp list.cpp word.cpp

Turn-in procedure : Upload your files to GauchoSpace under the corresponding project submission folder.