COMP 1010 Fall 2017 Assignment 4

Material Covered
• Arrays and Functions
• Text Plotting
Notes
• Name your sketches using your name, the assignment number, and question number, as in this example: SmithJohnA4Q1.
• Your program must run upon download to receive any marks.
• To submit the assignment, upload a .zip archive of the .pde files via UM Learn (from the course web page go to “Assessments → Assignments”).
• Your assignment must follow the programming standards document published on UM Learn.
• Place your functions in a standard location so the marker can find them quickly. Apply these 2 rules:
1. Put setup and draw right under the global variables.
2. Arrange the other functions in alphabetical order.
Word Cloud [18 marks]
This year marks the 152nd anniversary of the publication of Alice’s Adventures in Wonderland by Lewis Carroll. In this assignment, your program will read the text of this book, extract a list of words, and plot them in a non-overlapping random word cloud, as shown in the diagram below.
Positioning the words in a word cloud so that they do not overlap is a hard problem. In this assignment, we will apply a method to work toward a solution incrementally using random movements. Since we are using Processing, we will see this as an animation: the words all start out at the middle of the screen, and you will see them move around to find their final resting place.
Have a look at the provided template program and use it as a starting point for the assignment, as it contains important parts of the code. You should look at it now. Also, import the text file alice_just_text.txt (see footnote) into your project as was done in assignment 3.
The text of the book comes from Project Gutenberg, which has made thousands of old books that are no longer under copyright available for everyone to read for free. This particular book can be found at http://www.gutenberg.org/ebooks/11 . Project Gutenberg adds a lot of source information to each text file, information that would not be suitable for our analysis purposes, so the provided file has been modified to contain just the text of the original book.

Overview
Your program has several problems to solve before it will work. You need to pick a set of random words from the book to plot (this is a simplification of the problem – in reality it should actually pick the most common words!), you need to plot them on the canvas at appropriate sizes, you need to determine if words are overlapping, and if they are, move them away from each other.
Your program will follow this basic process:
• In setup:
o load the book and process it to become a very long array of words
corresponding to the text. For example, “This is a cat” would be a
string array: {“This”, “is”, “a”, “cat”}. This code is provided.
o Select a random sub-sample of words to use, by calling sample,
which you will create.
o Pre-calculate the word sizes and positions.
• In draw, it clears the screen, and calls drawWordCloud(), which does the heavy lifting:
o For each word:
– Draws it at its current location
– Checks if it overlaps with the other words, and if so, finds the average x and y distance to all overlapping words
– Moves the word away from the overlapping ones to try and resolve the situation.
– The first word in the array does not ever move, and stays in the center of the screen
Text size and boundaries: knowing if words overlap
To determine whether or not words overlap, we will use a shortcut – we will construct a reasonable box around each word to represent its boundaries, and check if that box overlaps with other words’ boxes. This becomes a little easier if we think of our text as having its draw location be the very center of the box (instead of the bottom left). In Processing, this can be achieved with the textAlign(CENTER, CENTER); command. Now, calling text will draw the text centered on the coordinate you specify. NOTE: throughout the whole assignment, use the centers for the x and y coordinates of each word.
Note how the size of the bounding box around a word depends on the text size as well as the string itself. As in the inset, each word has a different size. We make words earlier in the list larger, and ones later smaller, and use the following formula to calculate the size of the word: textSize = MAX_TEXT_SIZE / sqrt(position in list), where the position is counted from 1. Using a MAX_TEXT_SIZE of 105, the first word in our list will have size 105, the next will be about 74, and so forth. Save these sizes in a global array for use later.
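As a rough illustration of the size formula, here is a minimal sketch in plain Java (Processing sketches are Java underneath, where `sqrt` would replace `Math.sqrt`). The class name `TextSizes` and the helper `computeTextSizes` are hypothetical names chosen for this example, and the sketch assumes that "position in list" means the 0-based array index plus one, which matches the 105 and 74 values quoted above.

```java
public class TextSizes {
    static final float MAX_TEXT_SIZE = 105;

    // Returns one text size per word. Array index i corresponds to
    // list position i + 1 (assumed 1-based so the first word gets 105).
    static float[] computeTextSizes(int numWords) {
        float[] sizes = new float[numWords];
        for (int i = 0; i < numWords; i++) {
            sizes[i] = MAX_TEXT_SIZE / (float) Math.sqrt(i + 1);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Print the sizes of the first four words.
        for (float size : computeTextSizes(4)) {
            System.out.println(size);
        }
    }
}
```

The sizes shrink quickly at first and then level off, so words late in the list stay readable.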
We calculate how big to make the box using some Processing built-in functionality. In the setup, add code to pre-calculate the WIDTH and HEIGHT of each word and save the data in global arrays. For each word that you get back from sample, calculate the size as described above, set the font size using textSize(pointSize), and ask Processing how wide that text is using a new command textWidth(word), which returns how wide a String is if drawn. We estimate the height of the word as the text size.
Finally, in the setup you will also pre-calculate each word’s starting point. The first word will be placed in the center of the canvas, and every other word will be offset by a random amount from –JITTER to JITTER in both the x and y dimensions: use a JITTER value of 1, but make sure to use a named constant. Store these initial positions in global arrays.
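The starting positions could be computed along these lines. This is only a sketch in plain Java under stated assumptions: the class `StartPositions`, the helper names, and the explicit `canvasW`, `canvasH`, and `Random` parameters are hypothetical (in a Processing sketch you would use `width`, `height`, `random(-JITTER, JITTER)`, and global arrays instead).

```java
import java.util.Random;

public class StartPositions {
    static final float JITTER = 1;

    // Random offset in the range [-JITTER, JITTER).
    static float jitter(Random rng) {
        return (rng.nextFloat() * 2 - 1) * JITTER;
    }

    // Returns {xs, ys}: the center coordinates of every word.
    // Word 0 sits exactly at the canvas center; the others are nudged.
    static float[][] initialPositions(int numWords, float canvasW,
                                      float canvasH, Random rng) {
        float[] xs = new float[numWords];
        float[] ys = new float[numWords];
        for (int i = 0; i < numWords; i++) {
            xs[i] = canvasW / 2;
            ys[i] = canvasH / 2;
            if (i > 0) {
                xs[i] += jitter(rng);
                ys[i] += jitter(rng);
            }
        }
        return new float[][] { xs, ys };
    }
}
```

The small initial offsets matter: if every word started at exactly the same point, there would be no direction in which to push them apart.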
Function: sample
Create a function called sample (footnote 2): it selects a number of random words from a String array and returns a new array with those words in it. Repeatedly call random to select a random word in the incoming array. Add each picked word to your new array of sampled words, unless it is too short (fewer than 3 characters) or already exists in your new data set. A call to find (below) can tell you if the word exists in the array. Stop this process when you have enough words and return the new array.
Function: find
This function (footnote 3) checks a given array of Strings for a given word, within the given range of the array. It searches from start (included) up to but excluding stop for the word, and returns the index where the word is first found. It returns -1 if the word is not found.
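One possible shape for the find and sample helpers is sketched below in plain Java. Treat it as an illustration rather than the expected solution: the class name `WordSampling`, the constant `MIN_WORD_LENGTH`, and the extra `Random` parameter on sample are assumptions made so the example can run outside Processing (the assignment's sample takes only the word array and the count, and would call Processing's `random`).

```java
import java.util.Random;

public class WordSampling {
    static final int MIN_WORD_LENGTH = 3;

    // Searches words[start] .. words[stop - 1] for word.
    // Returns the first matching index, or -1 if there is none.
    static int find(String word, String[] words, int start, int stop) {
        for (int i = start; i < stop; i++) {
            if (words[i].equals(word)) {
                return i;
            }
        }
        return -1;
    }

    // Picks numToPlot distinct words of at least MIN_WORD_LENGTH characters.
    // Note: this loops forever if the input has too few qualifying words.
    static String[] sample(String[] words, int numToPlot, Random rng) {
        String[] picked = new String[numToPlot];
        int count = 0;
        while (count < numToPlot) {
            String candidate = words[rng.nextInt(words.length)];
            boolean longEnough = candidate.length() >= MIN_WORD_LENGTH;
            // Only the filled part of picked (0 .. count) is searched.
            if (longEnough && find(candidate, picked, 0, count) == -1) {
                picked[count] = candidate;
                count++;
            }
        }
        return picked;
    }
}
```

Limiting the search to the range 0 to count is what the start and stop parameters are for: the unfilled tail of the new array still holds null entries and must not be compared.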
Function: drawWordCloud
This function (footnote 4) first plots each word at its current position, and then moves on to determine how the words should move. HINT: start by drawing rectangles around each word to make sure that your widths and heights were calculated correctly above.
The move code looks complex but is just a bit of math. Here is how it works. For each word in our list, except the first word, which doesn’t move:
• Find all the words that overlap this word, and average the positions of all those words (this is called the centroid). This is done in avgOverlap, described below.
• Move away from that average position as follows:
o Calculate the distance from the center of the current word to the averaged center.
o Calculate the x and y difference, where diffX = wordCenterX - avgX, and diffY = wordCenterY - avgY. Also calculate the totalDistance as the hypotenuse of diffX and diffY.
o Move the text x-coordinate by diffX / totalDistance, and the y-coordinate by diffY / totalDistance.
o Add a random jitter to the x and y position of the moving word, the same jitter used in the setup.
o Make sure the word does not go off the screen.
2 Signature: String[] sample(String[] words, int numToPlot). The String array returned from this call will have numToPlot randomly chosen, distinct words in it.
3 Signature: int find(String word, String[] words, int start, int stop).
4 Signature: void drawWordCloud().
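The movement step for a single word might look like the following plain-Java sketch. The class `MoveStep`, the function `moveAway`, and its parameter list are hypothetical; a Processing version would read and update the global position arrays and call `random` for the jitter. Two choices here are assumptions rather than requirements from the text: the move is skipped when totalDistance is zero (to avoid dividing by zero when a word overlaps only itself), and "not off the screen" is interpreted as clamping the word's center to the canvas.

```java
public class MoveStep {
    // Returns the new {x, y} center of a word after one move.
    static float[] moveAway(float x, float y, float avgX, float avgY,
                            float jitterX, float jitterY,
                            float canvasW, float canvasH) {
        float diffX = x - avgX;
        float diffY = y - avgY;
        // Hypotenuse of the x and y differences.
        float totalDistance = (float) Math.sqrt(diffX * diffX + diffY * diffY);

        // Step one unit directly away from the centroid.
        // Assumption: skip the step when the distance is zero.
        if (totalDistance > 0) {
            x += diffX / totalDistance;
            y += diffY / totalDistance;
        }

        // Random nudge supplied by the caller.
        x += jitterX;
        y += jitterY;

        // Assumption: keep the word's center inside the canvas.
        x = Math.max(0, Math.min(canvasW, x));
        y = Math.max(0, Math.min(canvasH, y));
        return new float[] { x, y };
    }
}
```

Dividing by totalDistance normalizes the direction, so each word moves about one pixel per frame regardless of how far away the centroid is.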
Function: avgOverlap
This function (footnote 5) takes a word position (just the index into the WORDS array), and does two things: it finds all the words that overlap with that word on the canvas, and calculates the average center of all those words.
Two words overlap if their bounding boxes overlap. You can test this by checking the x and y separately and using an AND operation. Given a word w and a candidate overlap c,
• Calculate the absolute x distance between the centers of w and c. Also calculate the absolute y distance.
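• Calculate the minimum distance the two words need between their centers, in the x and y directions, to avoid overlapping: half of w’s width and half of c’s width, added together, gives you the minimum x distance. The minimum y distance is calculated the same way from the heights.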
• Two words overlap if their x distance is less than the minimum x distance, AND, the y distance is less than the minimum y distance.
You need to average the x and y values of all the words that overlap. You return the average as a new two element array.
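The overlap test and the averaging can be sketched as follows in plain Java. The class `Overlap`, the helper `overlaps`, and the array parameters are hypothetical: the assignment's avgOverlap takes only the index and would read the global position, width, and height arrays. As the text describes, a word is compared with every word, including itself.

```java
public class Overlap {
    // True if two center-anchored boxes overlap in both x and y.
    static boolean overlaps(float x1, float y1, float w1, float h1,
                            float x2, float y2, float w2, float h2) {
        float xDistance = Math.abs(x1 - x2);
        float yDistance = Math.abs(y1 - y2);
        float minXDistance = w1 / 2 + w2 / 2;
        float minYDistance = h1 / 2 + h2 / 2;
        return xDistance < minXDistance && yDistance < minYDistance;
    }

    // Returns {avgX, avgY}: the centroid of all words overlapping word pos.
    // The word itself always counts, so the count is never zero.
    static float[] avgOverlap(int pos, float[] xs, float[] ys,
                              float[] widths, float[] heights) {
        float sumX = 0;
        float sumY = 0;
        int count = 0;
        for (int i = 0; i < xs.length; i++) {
            if (overlaps(xs[pos], ys[pos], widths[pos], heights[pos],
                         xs[i], ys[i], widths[i], heights[i])) {
                sumX += xs[i];
                sumY += ys[i];
                count++;
            }
        }
        return new float[] { sumX / count, sumY / count };
    }
}
```

For example, with two 40 by 20 boxes centered at (100, 100) and (110, 100), the centroid for the first word is (105, 100), so it would be pushed to the left.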
The example of a word cloud on the first page uses only 50 words. See how many you can consistently position. Up to 100 should be possible. When you first start programming, it is best just to use a few.
Suggested Strategy: type in all the function stubs and return any data, just to get it to run. Implement the find function first and test it with fake data. Following this, implement the sample function to get a small set of words, and try testing this function independently as well. Finish the setup code with the required calculations, and to test these results, implement a part of drawWordCloud that simply draws the words on screen. Your sizes should look reasonable; draw boxes around the words to test your width and height calculations. Finally, start working on avgOverlap, which is the hardest part. Try first detecting only one overlap, then use a loop to test all, and implement the average. Finally, make the words move. If you follow the above strategy and things do not work as expected, sit down and think about how to debug the problem that you are seeing. Make small changes, and then test again. Your changes should never introduce more than one or two new errors.
5 Signature: float[] avgOverlap(int pos). Compares the word in position pos in WORDS with every word in the array. It finds the centroid (average position) of all the words that overlap with WORDS[pos]. There will always be at least one such word, namely itself.
