Python (Data Visualization)
- Data Import
We can import the Iris dataset from the Python package scikit-learn.
Detailed information about scikit-learn can be found at scikit-learn.org.
    from sklearn import datasets
    iris = datasets.load_iris()
What does the Iris dataset look like?

    iris.feature_names

Result:

    ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

To display the names of the target classes:

    iris.target_names

Result:

    array(['setosa', 'versicolor', 'virginica'], dtype='<U10')
To display the attribute values of the records (one row per record, one column per attribute):

    iris.data

Result:

    array([[5.1, 3.5, 1.4, 0.2],
           [4.9, 3. , 1.4, 0.2],
           [4.7, 3.2, 1.3, 0.2],
           [4.6, 3.1, 1.5, 0.2],
           [5. , 3.6, 1.4, 0.2],
           ...])

To display the target outputs of the records:

    iris.target

Result:

    array([0, 0, 0, ..., 1, 1, 1, ..., 2, 2, 2, ...])
The classes setosa, versicolor and virginica are denoted by 0, 1, and 2, respectively.
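To make the encoding above concrete, the following minimal sketch checks the size of the dataset and translates the integer targets back into class names. The variable names `species` and `counts` are illustrative choices, not part of scikit-learn.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()

# Number of records and number of attributes per record
print(iris.data.shape)

# Translate each integer target (0, 1, 2) into its class name
# by indexing the array of names with the array of targets
species = iris.target_names[iris.target]
print(species[:3])

# Count how many records belong to each class
counts = np.bincount(iris.target)
for name, n in zip(iris.target_names, counts):
    print(name, n)
```

Indexing `iris.target_names` with `iris.target` avoids writing the 0/1/2 mapping by hand, so the names always stay consistent with the dataset.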
- Data Visualization
In this section, we use the package matplotlib to visualize data.
Detailed information about matplotlib can be found at matplotlib.org.
Package setup for visualization:

    import matplotlib.pyplot as plt
We use a subset of the attributes in the Iris dataset for visualization. First, we select the attributes petal length and petal width, which are columns 2 and 3 of iris.data, and store the target outputs in t:

    X = iris.data[:, 2:4]
    t = iris.target
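As a quick sanity check on the slice, the same index range can be applied to `iris.feature_names` to confirm which attributes were selected. This is only a sketch; the name `selected` is an illustrative choice.

```python
from sklearn import datasets

iris = datasets.load_iris()

# The slice 2:4 picks columns 2 and 3 (the end index is exclusive)
selected = iris.feature_names[2:4]
print(selected)

X = iris.data[:, 2:4]
t = iris.target

# X keeps every record but only two attributes; t has one label per record
print(X.shape, t.shape)
```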
We can now generate a scatter plot from the attribute values in X. Passing the target outputs as the color argument (c=t) gives each class its own color, so the three classes can be told apart.

    plt.scatter(X[:, 0], X[:, 1], c=t)
    plt.xlabel('Petal length')
    plt.ylabel('Petal width')
    plt.show()
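The plot above colors the points by class but does not say which color is which. One possible variation, sketched here, draws each class with its own scatter call so that matplotlib can build a legend from the class names. The loop structure and the units in the axis labels are additions for illustration, not part of the original example.

```python
import matplotlib.pyplot as plt
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data[:, 2:4]
t = iris.target

fig, ax = plt.subplots()

# Plot one class at a time so each gets its own color and legend entry
for label, name in enumerate(iris.target_names):
    mask = t == label  # boolean mask selecting the records of this class
    ax.scatter(X[mask, 0], X[mask, 1], label=name)

ax.set_xlabel('Petal length (cm)')
ax.set_ylabel('Petal width (cm)')
ax.legend()
```

Call plt.show() afterwards to display the figure.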
You can generate the scatter plot for other pairs of attributes. For example, the attribute pair (Sepal length, Sepal width) can be specified as follows:
    X = iris.data[:, :2]

Accordingly, labels for the two axes should also be changed:

    plt.scatter(X[:, 0], X[:, 1], c=t)
    plt.xlabel('Sepal length')
    plt.ylabel('Sepal width')
    plt.show()
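Rather than editing the slice and the labels by hand for each pair, all six attribute pairs can be drawn in one figure. The sketch below is one way to do this, assuming a 2-by-3 grid of subplots and taking the axis labels directly from iris.feature_names. The grid layout and figure size are arbitrary choices.

```python
from itertools import combinations

import matplotlib.pyplot as plt
from sklearn import datasets

iris = datasets.load_iris()
t = iris.target

# All unordered pairs of the four attribute columns: (0, 1), (0, 2), ...
pairs = list(combinations(range(4), 2))

fig, axes = plt.subplots(2, 3, figsize=(12, 7))
for ax, (i, j) in zip(axes.ravel(), pairs):
    # Column i on the x-axis, column j on the y-axis, colored by class
    ax.scatter(iris.data[:, i], iris.data[:, j], c=t)
    ax.set_xlabel(iris.feature_names[i])
    ax.set_ylabel(iris.feature_names[j])

fig.tight_layout()
```

Call plt.show() afterwards to display the figure. Reading the labels from iris.feature_names keeps each axis label in step with the column that is actually plotted.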
