docs/python-matplotlib.md

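The matplotlib module{#matplotlib}

Matplotlib is a Python 2D plotting library that can produce a wide variety of high-quality figures.

Elements of a matplotlib figure{#matplotlib-elements}

In a plot output window, the bottom layer is a Figure instance, usually called the canvas, which holds both visible and invisible elements.

Plots are drawn on the canvas as Axes instances. An Axes instance contains almost every matplotlib element we need, such as the axes, ticks, labels, lines, and markers^[matplotlib 实践].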

import matplotlib.pyplot as plt
import numpy as np
# Define the data
x = np.linspace(0.5, 3.5, 100)
y = np.sin(x)
y1 = np.random.randn(100)
# Set the canvas (figure) size
plt.figure(figsize=(12, 9))
# Scatter plot
plt.scatter(x, y1, c="0.25", label="scatter figure")
# Line plot
plt.plot(x, y, ls="--", lw=2, label="plot figure")
# Some cleanup (remove chart junk)
# Hide the top and right spines; gca stands for "get current axes"
for spine in plt.gca().spines.keys():
    if spine == "top" or spine == "right":
        plt.gca().spines[spine].set_color("none")

# Show the x-axis ticks at the bottom
plt.gca().xaxis.set_ticks_position("bottom")
# Set the tick positions along the bottom
xticks = list(np.linspace(0.0, 3.5, 8))
xticks.append(3.8)
plt.xticks(xticks, xticks)  # tick positions and tick labels
# Show the y-axis ticks on the left
plt.gca().yaxis.set_ticks_position("left")
# Set the x and y axis limits
plt.xlim(0.0, 4.0)
plt.ylim(-3.0, 3.0)
# Set the axis labels
plt.xlabel("x_axis")
plt.ylabel("y_axis")
# Draw the x and y grid lines
plt.grid(True, ls=":", color="r")
# Add a horizontal line
plt.axhline(y=0.0, c="r", ls="--", lw=2)
# Add a vertical span along the x axis
plt.axvspan(xmin=1.0, xmax=2.0, facecolor="y", alpha=.3)
# Add annotations
plt.annotate("maximum", xy=(np.pi/2, 1.0), xytext=((np.pi/2)+0.15, 1.5), weight="bold", color="r",
            arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="r"))
plt.annotate("spines", xy=(0.75, -3), xytext=(0.35, -2.25), weight="bold", color="b",
            arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="b"))
plt.annotate("", xy=(0, -2.78), xytext=(0.4, -2.32),
            arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="b"))
plt.annotate("", xy=(3.8, -2.98), xytext=(3.9, -2.70),
            arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="b"))
# Add text
plt.text(3.9, -2.70, "'|' is tickline", weight="bold", color="b")
plt.text(3.9, -2.95, "3.8 is ticklabel", weight="bold", color="b")
# Set the title
plt.title("structure of matplotlib", fontsize=16)
# Add the legend
plt.legend(loc="upper left")
plt.show()
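
The Figure/Axes containment described in this section can also be inspected directly. The following is a minimal sketch using the object-oriented interface (`plt.subplots`) rather than the `plt.*` calls used in these notes. Selecting the `Agg` backend is an assumption made only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for an interactive window
import matplotlib.pyplot as plt

# One Figure (the canvas) holding one Axes (the plotting area)
fig, ax = plt.subplots(figsize=(6, 4))
line, = ax.plot([0, 1, 2], [0, 1, 4], ls="--", lw=2, label="plot figure")

# The Figure keeps a list of its Axes, and each Axes knows its Figure
print(fig.axes == [ax])
print(ax.figure is fig)

# The Axes owns the elements named in the text: spines, axis objects, lines
spine_names = sorted(ax.spines.keys())
print(spine_names)
print(line in ax.get_lines())

# Hiding the top and right spines through the Axes object
for name in ("top", "right"):
    ax.spines[name].set_visible(False)
```

Working on `ax` directly avoids the repeated `plt.gca()` lookups and makes it explicit which Axes each call modifies.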

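Time series comparison chart{#series-contrast}

At work I needed to compare hourly statistics for today and yesterday. A line chart would meet that requirement, but an area chart that also draws every point and adds data labels is easier to read at a glance.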

import matplotlib.pyplot as plt
# Data
hour = range(0, 24)
dataYesterday = [24, 22, 14, 5, 12, 9, 10, 25, 21, 28, 51, 42, 40, 44, 51, 43, 41, 39, 25, 25, 30, 23, 43, 29]
dataToday = [26, 16, 13, 13, 7, 9, 10, 13, 29, 38, 37, 41, 39, 43, 58, 54, 37, 47, 46, 24, 36, 28, 38, 34]
# Plot
# Set the global defaults first, then call plt.figure()
plt.rcParams['figure.figsize'] = (16, 8)
plt.figure()
handle1 = plt.scatter(hour, dataToday, color='r')
handle2 = plt.scatter(hour, dataYesterday, color='b')
# Add ticks
plt.xticks(ticks=range(0, 24), labels=range(0, 24))
# Hide the top and right spines
for spine in plt.gca().spines.keys():
    if spine == "top" or spine == "right":
        plt.gca().spines[spine].set_color("none")

# Data labels
for x, y in zip(hour, dataToday):
    plt.text(x, y, str(y), color='r', fontsize=12)
for x, y in zip(hour, dataYesterday):
    plt.text(x, y, str(y), color='b', fontsize=12)
# Fill the area under each series
plt.fill_between(hour, 0, dataToday, color='r', alpha=0.3)
plt.fill_between(hour, 0, dataYesterday, color='b', alpha=0.1)
plt.title("data contrast", loc='center', fontsize=16)
plt.xlabel("time(hour)")
plt.ylabel("count")
# Legend
legendText1 = 'count of today: ' + str(sum(dataToday))
legendText2 = 'count of yesterday: ' + str(sum(dataYesterday))
plt.legend(handles=[handle1, handle2], labels=[legendText1, legendText2], loc='upper left', shadow=True, fontsize='medium')
plt.show()
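
One practical wrinkle in the chart above: where the two series have equal or nearly equal values (hours 5 and 6, for instance), the data labels are drawn on top of each other. A possible workaround, sketched here under the assumption that it is acceptable to put the larger value's label above its point and the smaller one's below, is to choose the vertical alignment per hour. `label_alignments` is a hypothetical helper, not part of matplotlib.

```python
def label_alignments(today, yesterday):
    """Return (va_today, va_yesterday), one vertical alignment per hour.

    "bottom" anchors the bottom edge of the text at the point, so the text
    sits above the point; "top" puts it below. The larger value gets
    "bottom"; ties go to today's series.
    """
    va_today, va_yesterday = [], []
    for t, y in zip(today, yesterday):
        if t >= y:
            va_today.append("bottom")
            va_yesterday.append("top")
        else:
            va_today.append("top")
            va_yesterday.append("bottom")
    return va_today, va_yesterday


# First six hours of the data used above
today = [26, 16, 13, 13, 7, 9]
yesterday = [24, 22, 14, 5, 12, 9]
va_today, va_yesterday = label_alignments(today, yesterday)
print(va_today)
print(va_yesterday)
```

Each alignment would then be passed to the label call, for example `plt.text(x, y, str(y), color='r', va=va, ha='center')`.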

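Bar charts{#bar}

Simple bar chart{#bar-ordinary}

Bar charts are used very frequently in data visualization, mainly to show the distribution of discrete data. They come in two styles, vertical and horizontal.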

import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 4, 5, 8, 9]
# Plot
plt.figure()
plt.bar(x, y, align="center", color="c", tick_label=["q", "a", "c", "e", "r", "j"], hatch="/", edgecolor='r')
# Set the x and y axis labels
plt.xlabel("box number")
plt.ylabel("box weight(kg)")
plt.show()

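Common parameters

|Parameter|Description|
|:--------|:----------------------------------------------------------|
|x|sequence of scalars; the x coordinates of the bars|
|height|scalar or sequence of scalars; the heights of the bars|
|width|scalar or array-like, optional; the widths of the bars, default 0.8|
|bottom|scalar or array-like, optional; the y coordinates of the bar bases, default 0|
|align|{'center', 'edge'}, optional, default 'center'. With 'edge', the left edge of each bar is aligned with its x position; to align the right edge instead, pass a negative width together with align='edge'|
|color|scalar or array-like, optional; the colors of the bar faces|
|edgecolor|scalar or array-like, optional; the colors of the bar edges|
|linewidth / lw|scalar or array-like, optional; the width of the bar edges; 0 draws no edges|
|linestyle / ls|scalar or array-like, optional; the line style|
|alpha|a number between 0 and 1; the transparency|
|tick_label|string or array-like; the tick labels of the bars|
|hatch|the pattern drawn inside the bars; same type as linestyle|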
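
How `align` and the sign of `width` decide where a bar sits can be worked out by hand. The sketch below mirrors the rule stated for `align`; `bar_extent` is a hypothetical helper written for illustration, not a matplotlib function.

```python
def bar_extent(x, width=0.8, align="center"):
    """Return the (left, right) x-extent of a bar placed at x."""
    if align == "center":
        start = x - width / 2   # the bar is centered on x
    elif align == "edge":
        start = x               # the bar starts at x and extends by width
    else:
        raise ValueError("align must be 'center' or 'edge'")
    end = start + width
    # A negative width extends the bar to the left of its starting point
    return (min(start, end), max(start, end))


centered = bar_extent(1, 0.8, "center")
left_aligned = bar_extent(1, 0.8, "edge")
right_aligned = bar_extent(1, -0.8, "edge")
print(centered, left_aligned, right_aligned)
```

With `align="edge"` and a positive width the bar occupies the range to the right of `x`; with a negative width it occupies the range to the left, which is the right-aligned case.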

Stacked bar chart{#bar-stacked}

import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5, 6]
y1 = [4, 3, 2, 1, 4, 1]
y2 = [3, 1, 4, 5, 8, 9]
# Plot
plt.figure()
plt.bar(x, y1, align="center", color="r", lw=2, ls=":", alpha=0.5)
# bottom=y1 stacks the second series on top of the first
plt.bar(x, y2, align="center", bottom=y1, color="c", tick_label=["q", "a", "c", "e", "r", "j"], hatch="/", edgecolor="white", lw=2, ls=":", alpha=0.5)
# Set the x and y axis labels
plt.xlabel("box number")
plt.ylabel("box weight(kg)")
plt.show()
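
The `bottom=y1` argument is what places the second series on top of the first. With more than two series, the bottom of each layer is the running total of the layers beneath it. A minimal sketch of that bookkeeping in plain Python follows; `stack_bottoms` and the third series `y3` are hypothetical and introduced only for illustration.

```python
def stack_bottoms(series):
    """For each series, return its per-bar bottoms: the sum of all earlier series."""
    bottoms = []
    running = [0] * len(series[0])
    for values in series:
        bottoms.append(list(running))
        running = [r + v for r, v in zip(running, values)]
    return bottoms


y1 = [4, 3, 2, 1, 4, 1]
y2 = [3, 1, 4, 5, 8, 9]
y3 = [1, 1, 1, 1, 1, 1]  # hypothetical third layer
bottoms = stack_bottoms([y1, y2, y3])
print(bottoms[0])  # first layer starts at zero
print(bottoms[1])  # second layer starts where y1 ends
print(bottoms[2])  # third layer starts at y1 + y2
```

Each layer would then be drawn with `plt.bar(x, values, bottom=b)`, pairing every series with its entry in `bottoms`.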

Grouped bar chart{#bar-grouped}

The original example can be found on the official matplotlib website.

import matplotlib.pyplot as plt
import numpy as np
labels = ['G1', 'G2', 'G3', 'G4', 'G5']
men_means = [20, 34, 30, 35, 27]
women_means = [25, 32, 34, 20, 25]
x = np.arange(len(labels))  # the label locations
width = 0.35  # the width of the bars
fig, ax = plt.subplots()
rects1 = ax.bar(x - width/2, men_means, width, label='Men')
rects2 = ax.bar(x + width/2, women_means, width, label='Women')
# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('Scores')
ax.set_title('Scores by group and gender')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()
# Add a numeric label to each bar
def autolabel(rects):
    """Attach a text label above each bar in *rects*, displaying its height."""
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(height),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')
autolabel(rects1)
autolabel(rects2)
fig.tight_layout()
plt.show()
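
The offsets `x - width/2` and `x + width/2` are the two-series case of a more general rule: with k series, the group stays centered on the tick and each series is shifted by a multiple of the bar width. A sketch of that calculation follows; `group_offsets` is a hypothetical helper.

```python
def group_offsets(n_series, width):
    """Return the x-offset of each series so that the group is centered on the tick."""
    return [(i - (n_series - 1) / 2) * width for i in range(n_series)]


two = group_offsets(2, 0.35)    # the case used in the example above
three = group_offsets(3, 0.25)  # a third series would sit on the tick itself
print(two)
print(three)
```

Series `i` would then be drawn with `ax.bar(x + offsets[i], values, width)`. Keeping `n_series * width` below 1 leaves a gap between neighboring groups when the ticks are one unit apart.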

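Histograms{#histogram}

A histogram is a statistical chart that shows how continuous data is distributed.

Simple histogram{#histogram-ordinary}

Plot the distribution of students' test scores.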

import numpy as np
import matplotlib.pyplot as plt
# Test scores
np.random.seed(1234)
scores = np.random.randint(0, 100, 100)
# Plot
plt.figure()
bins = range(0, 101, 10)
n, bins, patches = plt.hist(x=scores,
        bins=bins,
        color="#3773b8",
        histtype="bar",
        rwidth=1.0,
        edgecolor="white")
# Set the x and y axis labels
plt.xlabel("score")
plt.ylabel("number of students")
plt.show()

print("n=", n, "\nbins=", bins, "\npatches=", patches)
## n= [ 7. 11.  8. 11. 13.  9.  9. 10. 15.  7.] 
## bins= [  0  10  20  30  40  50  60  70  80  90 100] 
## patches= <a list of 10 Patch objects>

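plt.hist() returns the tuple (n, bins, patches): the count in each bin, the bin edges, and the patch objects that draw the bars.

Common parameters

|Parameter|Description|
|:------|:----------------------------------------------------------------|
|x|array or sequence of arrays; the arrays need not have the same length|
|bins|int, sequence, or str, optional. If an int, bins + 1 bin edges are computed automatically. Every bin covers a half-open interval [left, right) except the last, which is closed|
|range|tuple or None, optional; the lower and upper range of the bins. Ignored if bins is a sequence|
|bottom|array-like, scalar, or None, default None (that is, 0); the baseline of each bin|
|histtype|optional; one of 'bar', 'barstacked', 'step', 'stepfilled'; default 'bar'|
|align|optional, default 'mid'|
|orientation|'horizontal' or 'vertical', optional|
|rwidth|scalar or None, optional; the relative width of the bars as a fraction of the bin width, in [0.0, 1.0]|
|stacked|bool, optional|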
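
The interval rule for `bins` matters for values that fall exactly on an edge. The sketch below reproduces the counting rule described in the table (half-open bins, last bin closed) in plain Python. `count_in_bins` is a hypothetical helper that mimics the rule; it is not how matplotlib implements it.

```python
def count_in_bins(values, edges):
    """Count values per bin: [e0, e1), [e1, e2), ..., [e_{k-1}, e_k]."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            lo, hi = edges[i], edges[i + 1]
            # Half-open everywhere, except that the last bin also includes its right edge
            if lo <= v < hi or (i == last and v == hi):
                counts[i] += 1
                break
    return counts


edges = list(range(0, 101, 10))
scores = [0, 9, 10, 55, 90, 99, 100]
counts = count_in_bins(scores, edges)
print(counts)
```

A score of 10 lands in the second bin, while a score of 100 is still counted in the last one. In the example above, `np.random.randint(0, 100, 100)` draws from 0 to 99, so the closed right edge never comes into play there.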

Stacked histogram{#histogram-stacked}

import numpy as np
import matplotlib.pyplot as plt
# Test scores
np.random.seed(1234)
scoresT1 = np.random.randint(0, 100, 100)
scoresT2 = np.random.randint(0, 100, 100)
x = [scoresT1, scoresT2]
colors = ["#8dd3c7", "#bebada"]
labels = ["Class A", "Class B"]
# Plot
plt.figure()
bins = range(0, 101, 10)
plt.hist(x=x,
        bins=bins,
        color=colors,
        histtype="bar",
        rwidth=1.0,
        edgecolor="white",
        stacked=True,  # with False, the bars are drawn side by side instead
        label=labels)
# Set the x and y axis labels
plt.xlabel("score")
plt.ylabel("number of students")
plt.title("Histogram of Different Classes")
plt.legend(loc="upper left")
plt.show()

Note: with stacked=False in plt.hist(), the result is a side-by-side (grouped) histogram.

Pie charts{#pie}

Simple pie chart{#pie-ordinary}

import matplotlib.pyplot as plt
fruits = ["apple", "orange", "banana", "pear"]
colors = ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3"]
soldNums = [0.1, 0.4, 0.15, 0.35]
explode = [0, 0, 0.1, 0]
# Pie chart
fig1, ax1 = plt.subplots()
ax1.pie(x=soldNums,
        radius=1.0,
        explode=explode,
        labels=fruits,
        autopct="%3.1f%%",
        startangle=60,
        colors=colors,
        shadow=True)
plt.title("Ratio of Different Fruits")
plt.legend(title="fruits", 
          loc="lower left", 
          bbox_to_anchor=(1, 0, 0.5, 1))
plt.show()

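pie() draws a pie chart of the array x; the share of each wedge is given by x/sum(x). If sum(x) < 1, the values of x are used directly as the wedge fractions and the pie is left with a missing slice of 1 - sum(x). That partial-pie behavior belongs to the matplotlib release these notes were written against; later releases add a normalize parameter and draw a full pie by default, so a partial pie has to be requested with normalize=False.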
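
The wedge fractions and angles of the example can be checked by hand. This is a small sketch, assuming the wedges are laid out counterclockwise starting at `startangle` (the default direction) and that x is normalized by its sum. `wedge_angles` is a hypothetical helper.

```python
def wedge_angles(x, startangle=0.0):
    """Return one (fraction, theta1, theta2) tuple per wedge, angles in degrees."""
    total = sum(x)
    wedges = []
    theta = startangle
    for value in x:
        frac = value / total
        wedges.append((frac, theta, theta + 360.0 * frac))
        theta += 360.0 * frac
    return wedges


fruits = ["apple", "orange", "banana", "pear"]
soldNums = [0.1, 0.4, 0.15, 0.35]
wedges = wedge_angles(soldNums, startangle=60)
for fruit, (frac, t1, t2) in zip(fruits, wedges):
    print(f"{fruit}: {frac:.1%}, from {t1:.0f} to {t2:.0f} degrees")
```

The last wedge ends one full turn after the first one starts, which is what makes the pie complete.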

Common parameters

|Parameter|Description|
|:------|:----------------------------------------------------------------|
|explode|array-like, optional, default: None. If not None, an array of length len(x) giving the fraction of the radius by which each wedge is offset|
|labels|list, optional, default: None. A sequence of strings providing a label for each wedge|
|colors|array-like, optional, default: None. A sequence of matplotlib color arguments|
|autopct|string or function, optional, default: None. If a format string, the label is fmt % pct, as in the figure above|
|pctdistance|float, optional, default: 0.6. The relative distance from the center, as a fraction of the radius, at which the text generated by autopct is drawn|
|shadow|bool, optional, default: False|
|labeldistance|float or None, optional, default: 1.1|
|startangle|float, optional, default: None. The angle by which the start of the pie is rotated counterclockwise|
|counterclock|bool, optional, default: True|
|wedgeprops|dict, optional, default: None. A dict of arguments passed to the wedge objects|
|textprops|dict, optional, default: None. A dict of arguments passed to the text objects|
|rotatelabels|bool, optional, default: False|

Note that the parameters labels and colors are both plural here.
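
As the table notes, `autopct` also accepts a function. matplotlib calls it with the percentage of each wedge and uses the returned string as the label. The sketch below shows the absolute amount next to the percentage; `make_autopct` and the total of 200 units sold are hypothetical.

```python
def make_autopct(total):
    """Build an autopct callback that prints the percentage and the absolute amount."""
    def autopct(pct):
        amount = round(pct / 100.0 * total)
        return f"{pct:.1f}%\n({amount})"
    return autopct


fmt = make_autopct(total=200)  # assumed total, for illustration only
label = fmt(40.0)
print(label)
```

The callback would be passed in place of the format string, for example `ax1.pie(x=soldNums, labels=fruits, autopct=make_autopct(200))`.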

Nested ring pie chart{#pie-nested}

import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
size = 0.3
vals = np.array([[60., 32.], [37., 40.], [29., 10.]])
cmap = plt.get_cmap("tab20c")
outer_colors = cmap(np.arange(3)*4)
inner_colors = cmap(np.array([1, 2, 5, 6, 9, 10]))
ax.pie(vals.sum(axis=1), radius=1, colors=outer_colors,
       wedgeprops=dict(width=size, edgecolor='w'))
# radius = 1 - size, so that the second pie nests inside the first
ax.pie(vals.flatten(), radius=1-size, colors=inner_colors,
       wedgeprops=dict(width=size, edgecolor='w'))
ax.set(aspect="equal", title='Pie plot with `ax.pie`')
plt.show()
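
The nesting lines up because both calls draw the same total: the outer ring uses the row sums of `vals` and the inner ring uses the flattened values, so each outer wedge spans exactly its two inner wedges. A quick check in plain Python, with nested lists standing in for the NumPy array:

```python
from itertools import accumulate

vals = [[60.0, 32.0], [37.0, 40.0], [29.0, 10.0]]

outer = [sum(row) for row in vals]         # plays the role of vals.sum(axis=1)
inner = [v for row in vals for v in row]   # plays the role of vals.flatten()

# Cumulative wedge boundaries, as fractions of the full circle
total = sum(inner)
outer_edges = [c / total for c in accumulate(outer)]
inner_edges = [c / total for c in accumulate(inner)]

# Every outer boundary coincides with every second inner boundary
aligned = outer_edges == inner_edges[1::2]
print(outer)
print(aligned)
```

If the inner ring were drawn from a different set of values, the boundaries would drift apart and the inner wedges would no longer sit under their parent wedge.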



shaocf/notes documentation built on Nov. 5, 2019, 8:51 a.m.