Machine learning Project

This assignment is the culmination of your text mining, machine learning, data mining, and AI skills. Shakespeare, the great bard from Stratford-upon-Avon who changed the direction of theater for centuries across multiple languages and cultures, penned 37 plays over the course of 23 years.

Even though his theater was a few blocks from John Harvard's birthplace, is it really possible that a single genius could possess the creative strength and focus to craft all that work by himself? Perhaps Shakespeare was actually a team of playwrights, or the plays were the joint effort of several collaborators. Your task is to formulate a more specific hypothesis that tests this idea, and to perform an in-depth test of your hypothesis using your machine learning skills. A convenient source of the data is but you are welcome to use any source available. The text mining pipeline from section 3 and HW2 can likely be reused to convert the plays into a data matrix. Note that a scene or act could plausibly have been written by a different author, so you should likely have more than 37 rows in your matrix.
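As a rough illustration of the "more than 37 rows" point, the sketch below builds a TF-IDF data matrix with one row per scene rather than one row per play. It uses only the Python standard library; the function names (`tokenize`, `tfidf_matrix`), the smoothed IDF formula, and the toy scene texts are assumptions made for this example, not part of the course pipeline.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (inner apostrophes allowed)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def tfidf_matrix(docs):
    """Return (vocab, rows): one TF-IDF row per document, columns ordered as vocab."""
    counts = [Counter(tokenize(d)) for d in docs]
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for c in counts for term in c)
    vocab = sorted(df)
    n = len(docs)
    # Smoothed IDF (an assumed variant) so terms in every document keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for c in counts:
        total = sum(c.values()) or 1
        rows.append([c[t] / total * idf[t] for t in vocab])
    return vocab, rows

# Hypothetical placeholder scenes keyed by (play, act.scene); real input would be
# the full text of each scene or act.
scenes = {
    ("Play A", "1.1"): "the king speaks to the court",
    ("Play A", "1.2"): "the fool mocks the king",
    ("Play B", "1.1"): "a storm at sea",
}
vocab, X = tfidf_matrix(list(scenes.values()))
print(len(X), "rows x", len(vocab), "columns")
```

Splitting by scene gives far more rows than plays, which is what lets a clustering method flag individual scenes that look unlike the rest of their play.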

There are multiple types of analyses that you could use. We would like you to perform an in-depth analysis using just one general type of approach: clustering. That is, we would like you to do a thorough job focused on clustering (for example, with multiple clustering approaches) rather than a surface-level analysis using many types of analysis. You will likely need some visualization to tell your analysis story regardless.
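One possible baseline among the clustering approaches is k-means. The sketch below is a minimal standard-library version of Lloyd's algorithm on length-normalized rows; the function names, the seeded random initialization, and the toy points are assumptions for illustration. In practice a library implementation, compared against other methods such as hierarchical clustering, would be the more thorough choice.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length so clustering reflects direction, not scene length."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def sqdist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(rows, k, iters=50, seed=0):
    """Lloyd's k-means on unit-length rows; returns one cluster label per row."""
    rng = random.Random(seed)
    pts = [normalize(r) for r in rows]
    centers = rng.sample(pts, k)  # k distinct rows as the initial centers
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sqdist(p, centers[j])) for p in pts]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(pts, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical 2-D points standing in for rows of a scene-by-feature matrix.
toy_rows = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = kmeans(toy_rows, k=2)
print(labels)
```

The cluster numbers themselves are arbitrary; what matters is which scenes end up grouped together and whether that grouping is stable across methods and values of k.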

So, how can you determine whether Shakespeare was the author? Think along the lines of patterns:

  • Writing style or choice of words used
  • Temporal patterns in sentiment, etc. throughout the plays (comedies, tragedies, etc.)
  • Chronological analysis: did his writing traceably develop or evolve over time?
  • Did the scenes or characters in a scene have a particular formula?
  • Are there particular topics that emerge consistently throughout his plays?
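For the first pattern, writing style, one common stylometric feature is the relative frequency of function words, which tend to depend less on topic than content words do. The sketch below is a minimal example under stated assumptions: the short `FUNCTION_WORDS` list is illustrative rather than a standard list, and the per-1,000-token scaling and the Manhattan distance are choices made for this example.

```python
import re
from collections import Counter

# Illustrative, not exhaustive, list of function words (an assumption for this sketch).
FUNCTION_WORDS = ["the", "and", "of", "to", "in", "that", "with", "but", "not", "for"]

def style_profile(text):
    """Frequency of each function word per 1,000 tokens of the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [1000.0 * counts[w] / total for w in FUNCTION_WORDS]

def manhattan(p, q):
    """Sum of absolute differences between two style profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical snippets; real input would be the text of whole scenes or acts.
profile_a = style_profile("the cat and the dog")
profile_b = style_profile("not for the king but for the crown")
print(manhattan(profile_a, profile_b))
```

Profiles like these can be used directly as rows of a data matrix, so the same clustering code applies to style features as to word counts.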

What to submit: Submit your IPython notebook or equivalent and the corresponding PDF file. If you have used external code sources, be sure to cite them properly. You can either write a short document that describes your goals, methods, and results, or include them in the IPython notebook in enough detail that we can clearly follow the hypothesis, approaches, and interpretation of the results.

Note: Please follow the text mining pipeline (see the sample Jupyter notebook), and be sure to clean the data well before applying the clustering methods; the quality of the results is very important. Be sure to include Word2Vec and GloVe (create a vector representation for each word), LDA, SVD, and some other clustering methods that will generate good results.
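Once each word has a vector, one simple way to get a vector per scene for clustering is to average the vectors of the words in that scene. The sketch below assumes the word vectors are already available as a plain dictionary; the three-dimensional `toy_embeddings` are made up for illustration, whereas real vectors would come from a trained Word2Vec model or a pretrained GloVe file.

```python
import math

def scene_vector(tokens, embeddings, dim):
    """Mean of the word vectors for the tokens that appear in `embeddings`."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim  # no known words: fall back to the zero vector
    return [sum(col) / len(known) for col in zip(*known)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical 3-dimensional word vectors, invented for this example only.
toy_embeddings = {
    "king": [1.0, 0.0, 0.0],
    "queen": [0.9, 0.1, 0.0],
    "storm": [0.0, 0.0, 1.0],
}
court = scene_vector(["king", "queen", "unknownword"], toy_embeddings, 3)
sea = scene_vector(["storm"], toy_embeddings, 3)
print(cosine(court, sea))
```

Averaging discards word order, so it is best treated as one representation to compare against TF-IDF, LDA topic proportions, or SVD components rather than as a replacement for them.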



This is a dataset comprising all of Shakespeare's plays. It includes the following:
  • The first column is the Data-Line; it simply keeps track of the row number.
  • The second column is the play that the lines are from.
  • The third column is the number of the line being spoken at any given time.
  • The fourth column is the Act-Scene-Line that any given line comes from.
  • The fifth column is the player who is saying any given line.
  • The sixth column is the line being spoken.
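
The column layout above can be turned into one document per scene along the following lines. This is a sketch under assumptions: the header names (`Play`, `ActSceneLine`, `PlayerLine`, and so on) and the `act.scene.line` format of the Act-Scene-Line value are hypothetical and should be checked against the actual file, and the sample rows are placeholders rather than real lines from the plays.

```python
import csv
import io
from collections import defaultdict

def scenes_from_csv(handle):
    """Group spoken lines into one text per (play, act, scene)."""
    reader = csv.DictReader(handle)
    grouped = defaultdict(list)
    for row in reader:
        parts = (row["ActSceneLine"] or "").strip().split(".")
        if len(parts) != 3:
            continue  # skip rows with no act.scene.line value, e.g. headings
        act, scene, _line = parts
        grouped[(row["Play"], act, scene)].append(row["PlayerLine"])
    return {key: " ".join(lines) for key, lines in grouped.items()}

# Placeholder rows with hypothetical header names; the third column is left
# empty because this sketch does not use it.
SAMPLE = """DataLine,Play,Line,ActSceneLine,Player,PlayerLine
1,Play A,,,,ACT I
2,Play A,,1.1.1,SPEAKER ONE,first line of text
3,Play A,,1.1.2,SPEAKER ONE,second line of text
4,Play A,,1.2.1,SPEAKER TWO,a line from another scene
"""
docs = scenes_from_csv(io.StringIO(SAMPLE))
for key, text in docs.items():
    print(key, "->", text)
```

With a real file, `open(path, newline="")` would replace the `io.StringIO` wrapper, and grouping on the player column instead of the scene would give one document per character.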