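# matplotlib module {#matplotlib}

Matplotlib is a Python 2D plotting module that can produce a wide variety of high-quality figures.

## Elements of a matplotlib chart {#matplotlib-elements}

In a plot output window, the bottom layer is a Figure instance, usually called the canvas, which contains both visible and invisible elements.

The plots drawn on the canvas are Axes instances. An Axes instance holds nearly all of the matplotlib elements we need, such as axes, ticks, labels, lines, and markers^[matplotlib 实践].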
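The containment described above can be checked directly on the objects. The following is a minimal sketch, not part of the original notes; the variable names (`axes_on_canvas`, `spine_names`, `lines_on_axes`) and the toy data are assumptions made for illustration.

```python
import matplotlib.pyplot as plt

# One Figure (the canvas) holding one Axes
fig, ax = plt.subplots(figsize=(4, 3))
line, = ax.plot([0, 1, 2], [0, 1, 4], label="demo")

# The Figure keeps a list of the Axes drawn on it
axes_on_canvas = fig.axes

# The Axes owns the spines and the lines plotted on it
spine_names = sorted(ax.spines.keys())
lines_on_axes = ax.get_lines()

print(axes_on_canvas == [ax])   # the canvas knows its Axes
print(ax.figure is fig)         # and the Axes knows its canvas
print(spine_names)
print(line in lines_on_axes)
```

Calling `plt.show()` afterwards displays the figure, as in the other chunks of these notes.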

```{python matplotlib-elements, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt
import numpy as np

from matplotlib import cm as cm

# Define the data
x = np.linspace(0.5, 3.5, 100)
y = np.sin(x)
y1 = np.random.randn(100)

# Set the canvas (figure) size
plt.figure(figsize=(12, 9))

# Scatter plot
plt.scatter(x, y1, c="0.25", label="scatter figure")

# Line plot
plt.plot(x, y, ls="--", lw=2, label="plot figure")

# Some cleanup (remove chart junk)
# Hide the top and right spines; gca stands for "get current axes"
for spine in plt.gca().spines.keys():
    if spine == "top" or spine == "right":
        plt.gca().spines[spine].set_color("none")

# Put the x-axis ticks on the bottom
plt.gca().xaxis.set_ticks_position("bottom")

# Set the tick line positions along the bottom
xticks = list(np.linspace(0.0, 3.5, 8))
xticks.append(3.8)
plt.xticks(xticks, xticks)  # tick positions and labels

# Put the y-axis ticks on the left
plt.gca().yaxis.set_ticks_position("left")

# The tick line positions on the left are not set here and keep their defaults

# Set the x and y axis limits
plt.xlim(0.0, 4.0)
plt.ylim(-3.0, 3.0)

# Set the axis labels
plt.xlabel("x_axis")
plt.ylabel("y_axis")

# Draw the x and y grid lines
plt.grid(True, ls=":", color="r")

# Add a horizontal line
plt.axhline(y=0.0, c="r", ls="--", lw=2)

# Add a vertical span along the x axis
plt.axvspan(xmin=1.0, xmax=2.0, facecolor="y", alpha=.3)

# Add annotations
plt.annotate("maximum", xy=(np.pi/2, 1.0), xytext=((np.pi/2)+0.15, 1.5),
             weight="bold", color="r",
             arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="r"))

plt.annotate("spines", xy=(0.75, -3), xytext=(0.35, -2.25),
             weight="bold", color="b",
             arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="b"))

plt.annotate("", xy=(0, -2.78), xytext=(0.4, -2.32),
             arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="b"))

plt.annotate("", xy=(3.8, -2.98), xytext=(3.9, -2.70),
             arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="b"))

# Add text
plt.text(3.9, -2.70, "'|' is tickline", weight="bold", color="b")
plt.text(3.9, -2.95, "3.8 is ticklabel", weight="bold", color="b")

# Set the title
plt.title("structure of matplotlib", fontsize=16)

# Add the legend
plt.legend(loc="upper left")

plt.show()
```

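## Time-series comparison chart {#series-contrast}

At work I needed to present a comparison of hourly statistics for today and yesterday. A line chart would meet this requirement, but an area chart that also draws every point and adds data labels reads more intuitively.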

```{python matplotlib-contrast, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt

# Data
hour = range(0, 24)
dataYesterday = [24, 22, 14, 5, 12, 9, 10, 25, 21, 28, 51, 42, 40, 44, 51, 43, 41, 39, 25, 25, 30, 23, 43, 29]
dataToday = [26, 16, 13, 13, 7, 9, 10, 13, 29, 38, 37, 41, 39, 43, 58, 54, 37, 47, 46, 24, 36, 28, 38, 34]

# Plot
# Apply the global settings first, then call plt.figure()
plt.rcParams['figure.figsize'] = (16, 8)
plt.figure()

handle1 = plt.scatter(hour, dataToday, color='r')
handle2 = plt.scatter(hour, dataYesterday, color='b')
# Add ticks
plt.xticks(ticks=range(0, 24), labels=range(0, 24))

# Hide the top and right spines
for spine in plt.gca().spines.keys():
    if spine == "top" or spine == "right":
        plt.gca().spines[spine].set_color("none")

# Data labels
for x, y in zip(hour, dataToday):
    plt.text(x, y, str(y), color='r', fontsize=12)
for x, y in zip(hour, dataYesterday):
    plt.text(x, y, str(y), color='b', fontsize=12)

# Area fill
plt.fill_between(hour, 0, dataToday, color='r', alpha=0.3)
plt.fill_between(hour, 0, dataYesterday, color='b', alpha=0.1)
plt.title("data contrast", loc='center', fontsize=16)
plt.xlabel("time(hour)")
plt.ylabel("count")

# Legend
legendText1 = 'count of today: ' + str(sum(dataToday))
legendText2 = 'count of yesterday: ' + str(sum(dataYesterday))
plt.legend(handles=[handle1, handle2], labels=[legendText1, legendText2], loc='upper left', shadow=True, fontsize='medium')

plt.show()
```

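## Bar charts {#bar}

### Simple bar chart {#bar-ordinary}

Bar charts are used very often in data visualization, mainly to show the distribution of discrete data. They come in two styles, vertical and horizontal.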
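The vertical style uses `bar`. For the horizontal style, a minimal sketch with `barh` might look like the following. The data and the names `names`, `weights`, and `bars` are assumptions chosen for illustration, not taken from the original notes.

```python
import matplotlib.pyplot as plt

# Illustrative data (assumed for this sketch)
names = ["q", "a", "c", "e", "r", "j"]
weights = [3, 1, 4, 5, 8, 9]

fig, ax = plt.subplots()

# barh takes the bar positions on the y axis and the bar lengths as `width`
bars = ax.barh(y=list(range(len(names))), width=weights, height=0.8,
               align="center", color="c", tick_label=names)

# The axis labels swap places compared with a vertical bar chart
ax.set_xlabel("box weight(kg)")
ax.set_ylabel("box number")
```

Note that in `barh` the roles of `width` and `height` are swapped relative to `bar`: `width` carries the data values and `height` is the bar thickness.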

```{python matplotlib-bar-1, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt

# Data
x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 4, 5, 8, 9]

# Plot
plt.figure()

plt.bar(x, y, align="center", color="c", tick_label=["q", "a", "c", "e", "r", "j"], hatch="/", edgecolor='r')

# Set the x and y axis labels
plt.xlabel("box number")
plt.ylabel("box weight(kg)")

plt.show()
```

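**Common parameters**

|Parameter|Description|
|:--------|:----------------------------------------------------------|
|`x`|Sequence of scalars; the x coordinates of the bars|
|`height`|Scalar or sequence of scalars; the heights of the bars|
|`width`|Scalar or array-like, optional; the widths of the bars, default 0.8|
|`bottom`|Scalar or array-like, optional; the y coordinates of the bar bases, default 0|
|`align`|{'center', 'edge'}, optional, default 'center'. With 'edge', the left edges of the bars are aligned with the `x` positions; to align the right edges instead, pass a negative `width` together with align='edge'|
|`color`|Color or list of colors, optional; the colors of the bar faces|
|`edgecolor`|Color or list of colors, optional; the colors of the bar edges|
|`linewidth`, `lw`|Scalar or array-like, optional; the width of the bar edges; 0 means no edges are drawn|
|`linestyle`, `ls`|Optional; the line style of the bar edges|
|`alpha`|Number between 0 and 1; the transparency|
|`tick_label`|String or array-like; the tick labels of the bars|
|`hatch`|Hatch pattern drawn inside the bars; a string, like `linestyle`|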
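The effect of `align` and of a negative `width` is easiest to see by comparing where the bars end up. The sketch below is illustrative only: the helper `span` and the toy data are assumptions, not part of the original notes.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3]
height = [3, 1, 4]

fig, axs = plt.subplots(1, 3, figsize=(9, 3), sharey=True)

# Bars centered on x, starting at x, and ending at x
centered = axs[0].bar(x, height, width=0.8, align="center")
left_edge = axs[1].bar(x, height, width=0.8, align="edge")
right_edge = axs[2].bar(x, height, width=-0.8, align="edge")


def span(rect):
    """Return the (low, high) horizontal extent of a bar (hypothetical helper)."""
    a = rect.get_x()
    b = rect.get_x() + rect.get_width()
    return (min(a, b), max(a, b))


# Horizontal extent of the first bar (x = 1) in each panel
for name, bars in [("center", centered), ("edge", left_edge), ("edge, width<0", right_edge)]:
    print(name, span(bars[0]))
```

With `align="center"` the first bar straddles x = 1, with `align="edge"` it starts at x = 1, and with a negative `width` it ends at x = 1.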

### Stacked bar chart {#bar-stacked}

```{python matplotlib-bar-2, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt

# Data
x = [1, 2, 3, 4, 5, 6]
y1 = [4, 3, 2, 1, 4, 1]
y2 = [3, 1, 4, 5, 8, 9]

# Plot
plt.figure()

# The second series is stacked on top of the first through bottom=y1
plt.bar(x, y1, align="center", color="r", lw=2, ls=":", alpha=0.5)
plt.bar(x, y2, align="center", bottom=y1, color="c", tick_label=["q", "a", "c", "e", "r", "j"], hatch="/", edgecolor="white", lw=2, ls=":", alpha=0.5)

# Set the x and y axis labels
plt.xlabel("box number")
plt.ylabel("box weight(kg)")

plt.show()
```

### Grouped bar chart {#bar-grouped}

The original figure comes from the official matplotlib website.

```{python matplotlib-bar-3, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

labels = ['G1', 'G2', 'G3', 'G4', 'G5']
men_means = [20, 34, 30, 35, 27]
women_means = [25, 32, 34, 20, 25]

x = np.arange(len(labels))  # the label locations
width = 0.35  # the width of the bars

fig, ax = plt.subplots()
rects1 = ax.bar(x - width/2, men_means, width, label='Men')
rects2 = ax.bar(x + width/2, women_means, width, label='Women')

# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('Scores')
ax.set_title('Scores by group and gender')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()


# Add a numeric label to each bar
def autolabel(rects):
    """Attach a text label above each bar in *rects*, displaying its height."""
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(height),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')


autolabel(rects1)
autolabel(rects2)

fig.tight_layout()

plt.show()
```

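## Histograms {#histogram}

A histogram is a statistical chart that shows the distribution of continuous data.

### Simple histogram {#histogram-ordinary}

Plot the distribution of students' test scores.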

```{python matplotlib-hist-1, engine.path="~/anaconda3/bin/python3.7"}
import numpy as np
import matplotlib.pyplot as plt

# Test scores
np.random.seed(1234)
scores = np.random.randint(0, 100, 100)

# Plot
plt.figure()

bins = range(0, 101, 10)

n, bins, patches = plt.hist(x=scores,
        bins=bins,
        color="#3773b8",
        histtype="bar",
        rwidth=1.0,
        edgecolor="white")

# Set the x and y axis labels
plt.xlabel("score")
plt.ylabel("number of students")

plt.show()

print("n=", n, "\nbins=", bins, "\npatches=", patches)
```

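`plt.hist()` returns the tuple `(n, bins, patches)`: the counts in each bin, the bin edges, and the drawn bar objects.

**Common parameters**

|Parameter|Description|
|:------|:----------------------------------------------------------------|
|`x`|Array or sequence of arrays; the arrays need not have the same length|
|`bins`|int, sequence, or str, optional. If an int n, the n+1 bin edges are computed automatically. Every bin covers a half-open interval, closed on the left and open on the right, except the last bin, which is closed on both sides|
|`range`|tuple or None, optional; the lower and upper range of the bins. Has no effect if `bins` is a sequence|
|`bottom`|array_like, scalar, or None; default None, which means 0. The baseline of each bin|
|`histtype`|Optional; one of 'bar', 'barstacked', 'step', 'stepfilled'; default 'bar'|
|`align`|Optional; default 'mid'|
|`orientation`|'horizontal' or 'vertical', optional|
|`rwidth`|scalar or None, optional; the relative width of the bars as a fraction of the bin width, in [0.0, 1.0]|
|`stacked`|bool, optional|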
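The return values and the interval rule for `bins` can be seen on a tiny data set whose values sit exactly on the bin edges. This is a minimal sketch; the toy data and the names `values`, `edges`, and `counts` are assumptions made for illustration.

```python
import matplotlib.pyplot as plt

# Toy data (assumed): every value lies exactly on a bin edge
values = [0, 10, 10, 20]
edges = [0, 10, 20]

fig, ax = plt.subplots()
n, bins, patches = ax.hist(values, bins=edges)

# n: counts per bin, bins: the edges, patches: one rectangle per bin
counts = n.tolist()
print(counts)         # 0 falls in [0, 10); both 10s and the 20 fall in [10, 20]
print(bins.tolist())
print(len(patches))
```

The value 10 is counted in the second bin because the first bin is open on the right, while 20 is still counted because the last bin is closed.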

### Stacked histogram {#histogram-stacked}

```{python matplotlib-hist-2, engine.path="~/anaconda3/bin/python3.7"}
import numpy as np
import matplotlib.pyplot as plt

# Test scores
np.random.seed(1234)
scoresT1 = np.random.randint(0, 100, 100)
scoresT2 = np.random.randint(0, 100, 100)

x = [scoresT1, scoresT2]
colors = ["#8dd3c7", "#bebada"]
labels = ["Class A", "Class B"]

# Plot
plt.figure()

bins = range(0, 101, 10)

plt.hist(x=x,
         bins=bins,
         color=colors,
         histtype="bar",
         rwidth=1.0,
         edgecolor="white",
         stacked=True,  # if False, the result is a side-by-side histogram
         label=labels)

# Set the x and y axis labels
plt.xlabel("score")
plt.ylabel("number of students")

plt.title("Histogram of Different Classes")
plt.legend(loc="upper left")

plt.show()
```

> Note: when the `stacked` parameter of `plt.hist()` is `False`, the result is a side-by-side histogram.
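As a sketch of the side-by-side variant mentioned in the note, the following passes `stacked=False` and compares the returned counts with `np.histogram`. The data generation mirrors the example above but uses a local `RandomState`; the names `class_a`, `class_b`, `counts_a`, and `counts_b` are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Two sets of test scores (assumed toy data)
rng = np.random.RandomState(1234)
class_a = rng.randint(0, 100, 100)
class_b = rng.randint(0, 100, 100)
edges = list(range(0, 101, 10))

fig, ax = plt.subplots()
n, bins, patches = ax.hist([class_a, class_b], bins=edges,
                           color=["#8dd3c7", "#bebada"],
                           stacked=False, label=["Class A", "Class B"])
ax.legend(loc="upper left")

# With several datasets and stacked=False, n holds one row of counts per dataset
counts_a = np.histogram(class_a, bins=edges)[0]
counts_b = np.histogram(class_b, bins=edges)[0]
print(np.array_equal(n[0], counts_a), np.array_equal(n[1], counts_b))
```

Each class keeps its own bars within a bin, so the bar heights are the plain per-class counts rather than running totals.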


## Pie charts {#pie}

### Simple pie chart {#pie-ordinary}

```{python matplotlib-pie-1, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt

fruits = ["apple", "orange", "banana", "pear"]
colors = ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3"]
soldNums = [0.1, 0.4, 0.15, 0.35]
explode = [0, 0, 0.1, 0]

# Pie chart
fig1, ax1 = plt.subplots()

ax1.pie(x=soldNums,
        radius=1.0,
        explode=explode,
        labels=fruits,
        autopct="%3.1f%%",
        startangle=60,
        colors=colors,
        shadow=True)

plt.title("Ratio of Different Fruits")
plt.legend(title="fruits",
          loc="lower left",
          bbox_to_anchor=(1, 0, 0.5, 1))

plt.show()
```

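`pie` draws a pie chart of the array `x`; the fraction of each wedge is given by `x/sum(x)`. In the matplotlib release these notes were written against, if `sum(x) < 1` the values of `x` are used directly as the wedge fractions, and the pie is left with a missing portion of size `1 - sum(x)`. Newer matplotlib releases normalize `x` by default, and a partial pie requires passing `normalize=False`.

**Common parameters**

|Parameter|Description|
|:------|:----------------------------------------------------------------|
|`explode`|array-like, optional, default: None. If not None, an array of length `len(x)` giving the fraction of the radius by which each wedge is offset|
|`labels`|list, optional, default: None. A sequence of strings providing a label for each wedge|
|`colors`|array-like, optional, default: None. A sequence of matplotlib color arguments|
|`autopct`|string or function, optional, default: None. If a format string, the label is `fmt % pct`, as in the figure above|
|`pctdistance`|float, optional, default: 0.6. The ratio between the center of each pie slice and the start of the text generated by `autopct`|
|`shadow`|bool, optional, default: False|
|`labeldistance`|float or None, optional, default: 1.1. The radial distance at which the labels are drawn|
|`startangle`|float, optional, default: None. The angle by which the start of the pie chart is rotated counterclockwise|
|`counterclock`|bool, optional, default: True|
|`wedgeprops`|dict, optional, default: None. A dict of arguments passed to the wedge objects|
|`textprops`|dict, optional, default: None. A dict of arguments passed to the text objects|
|`rotatelabels`|bool, optional, default: False|

Note that the parameter names `labels` and `colors` are both plural.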
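To make the `x/sum(x)` rule concrete, the sketch below draws a pie from values that do not sum to 1 and reads the resulting angles back from the wedges. The toy data and the names `sold`, `fractions`, `spans`, and `percent_labels` are assumptions for illustration.

```python
import matplotlib.pyplot as plt

# Toy data (assumed); the values sum to 4, not 1
sold = [1, 2, 1]

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(x=sold,
                                  labels=["a", "b", "c"],
                                  colors=["#e41a1c", "#377eb8", "#4daf4a"],
                                  autopct="%3.1f%%")

# Each wedge covers the fraction x / sum(x) of the full circle
fractions = [v / sum(sold) for v in sold]
spans = [w.theta2 - w.theta1 for w in wedges]           # angular size in degrees
percent_labels = [t.get_text() for t in autotexts]      # text produced by autopct

print(fractions)
print(spans)
print(percent_labels)
```

Because `autopct` is given, `pie` returns three lists (wedges, label texts, percentage texts); without it only the first two are returned.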

### Nested donut chart {#pie-nested}

```{python matplotlib-pie-2, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()

size = 0.3
vals = np.array([[60., 32.], [37., 40.], [29., 10.]])

cmap = plt.get_cmap("tab20c")
outer_colors = cmap(np.arange(3)*4)
inner_colors = cmap(np.array([1, 2, 5, 6, 9, 10]))

# Outer ring: one wedge per row total
ax.pie(vals.sum(axis=1), radius=1, colors=outer_colors,
       wedgeprops=dict(width=size, edgecolor='w'))

# radius = 1 - size nests the second pie inside the first
ax.pie(vals.flatten(), radius=1-size, colors=inner_colors,
       wedgeprops=dict(width=size, edgecolor='w'))

ax.set(aspect="equal", title='Pie plot with ax.pie')

plt.show()
```
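The nesting works because `wedgeprops=dict(width=size)` turns each pie into a ring of thickness `size`, so a second ring with `radius=1-size` starts exactly where the first one ends. A minimal sketch of that geometry follows; the toy values and the names `outer_band` and `inner_band` are assumptions for illustration.

```python
import matplotlib.pyplot as plt

size = 0.3
fig, ax = plt.subplots()

# Two rings of equal thickness; the inner one has its radius reduced by `size`
outer, _ = ax.pie([3, 2, 1], radius=1, wedgeprops=dict(width=size, edgecolor="w"))
inner, _ = ax.pie([1, 1], radius=1 - size, wedgeprops=dict(width=size, edgecolor="w"))
ax.set(aspect="equal")

# Radial extent (inner edge, outer edge) covered by each ring
outer_band = (outer[0].r - outer[0].width, outer[0].r)
inner_band = (inner[0].r - inner[0].width, inner[0].r)
print(outer_band, inner_band)
```

The outer edge of the inner ring coincides with the inner edge of the outer ring, which leaves a hole of radius `1 - 2 * size` in the middle.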



shaocf/notes documentation built on Nov. 5, 2019, 8:51 a.m.