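Matplotlib is a Python 2D plotting library that produces a wide range of high-quality figures.
In a plot window, the bottom layer is a Figure instance, usually called the canvas, which holds both visible and invisible elements.
The plots drawn on the canvas are Axes instances. An Axes instance contains almost every matplotlib element we need, such as axes, ticks, labels, lines, and markers^[matplotlib 实践].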
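
The Figure/Axes relationship described above can be checked directly with the object-oriented interface. The following is a minimal sketch, not part of the original example: the data values are made up for illustration, and the `Agg` backend is an assumption chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# One Figure (the canvas) holding one Axes (the plotting area).
fig, ax = plt.subplots(figsize=(6, 4))
line, = ax.plot([0, 1, 2], [0, 1, 4], label="demo line")  # illustrative data
ax.set_xlabel("x_axis")
ax.set_ylabel("y_axis")
ax.legend()

# The Axes belongs to the Figure, and the drawn artists belong to the Axes.
print(ax.figure is fig)           # the Axes knows its parent Figure
print(fig.axes == [ax])           # the Figure lists its Axes
print(line in ax.lines)           # the line artist is stored on the Axes
print(sorted(ax.spines.keys()))   # the four spines of the Axes
```

Calls such as `plt.plot()` or `plt.xlabel()` act on the current Axes, which is why the `plt.gca()` calls in this chapter reach the same objects.
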
```{python matplotlib-elements, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cm as cm

x = np.linspace(0.5, 3.5, 100)
y = np.sin(x)
y1 = np.random.randn(100)

plt.figure(figsize=(12, 9))
plt.scatter(x, y1, c="0.25", label="scatter figure")
plt.plot(x, y, ls="--", lw=2, label="plot figure")

# hide the top and right spines
for spine in plt.gca().spines.keys():
    if spine == "top" or spine == "right":
        plt.gca().spines[spine].set_color("none")

plt.gca().xaxis.set_ticks_position("bottom")
xticks = list(np.linspace(0.0, 3.5, 8))
xticks.append(3.8)
plt.xticks(xticks, xticks)  # tick positions and labels
plt.gca().yaxis.set_ticks_position("left")

plt.xlim(0.0, 4.0)
plt.ylim(-3.0, 3.0)
plt.xlabel("x_axis")
plt.ylabel("y_axis")

plt.grid(True, ls=":", color="r")
plt.axhline(y=0.0, c="r", ls="--", lw=2)
plt.axvspan(xmin=1.0, xmax=2.0, facecolor="y", alpha=.3)

plt.annotate("maximum", xy=(np.pi/2, 1.0), xytext=((np.pi/2)+0.15, 1.5),
             weight="bold", color="r",
             arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="r"))
plt.annotate("spines", xy=(0.75, -3), xytext=(0.35, -2.25),
             weight="bold", color="b",
             arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="b"))
plt.annotate("", xy=(0, -2.78), xytext=(0.4, -2.32),
             arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="b"))
plt.annotate("", xy=(3.8, -2.98), xytext=(3.9, -2.70),
             arrowprops=dict(arrowstyle="->", connectionstyle="arc3", color="b"))

plt.text(3.9, -2.70, "'|' is tickline", weight="bold", color="b")
plt.text(3.9, -2.95, "3.8 is ticklabel", weight="bold", color="b")

plt.title("structure of matplotlib", fontsize=16)
plt.legend(loc="upper left")
plt.show()
```
## Time series comparison chart{#series-contrast}

At work I needed to compare hourly statistics for today and yesterday. A line chart would meet this requirement, but an area chart that also draws every point and adds data labels is easier to read at a glance.

```{python matplotlib-contrast, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt

# data
hour = range(0, 24)
dataYesterday = [24, 22, 14, 5, 12, 9, 10, 25, 21, 28, 51, 42,
                 40, 44, 51, 43, 41, 39, 25, 25, 30, 23, 43, 29]
dataToday = [26, 16, 13, 13, 7, 9, 10, 13, 29, 38, 37, 41,
             39, 43, 58, 54, 37, 47, 46, 24, 36, 28, 38, 34]

# plotting
# set the global parameters first, then call plt.figure()
plt.rcParams['figure.figsize'] = (16, 8)
plt.figure()
handle1 = plt.scatter(hour, dataToday, color='r')
handle2 = plt.scatter(hour, dataYesterday, color='b')

# add ticks
plt.xticks(ticks=range(0, 24), labels=range(0, 24))

# hide the top and right spines
for spine in plt.gca().spines.keys():
    if spine == "top" or spine == "right":
        plt.gca().spines[spine].set_color("none")

# data labels
for x, y in zip(hour, dataToday):
    plt.text(x, y, y, color='r', fontsize=12)
for x, y in zip(hour, dataYesterday):
    plt.text(x, y, y, color='b', fontsize=12)

# area fill
plt.fill_between(hour, 0, dataToday, color='r', alpha=0.3)
plt.fill_between(hour, 0, dataYesterday, color='b', alpha=0.1)

plt.title("data contrast", loc='center', fontsize=16)
plt.xlabel("time(hour)")
plt.ylabel("count")

# legend
legendText1 = 'count of today: ' + str(sum(dataToday))
legendText2 = 'count of yesterday: ' + str(sum(dataYesterday))
plt.legend(handles=[handle1, handle2], labels=[legendText1, legendText2],
           loc='upper left', shadow=True, fontsize='medium')
plt.show()
```
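Bar charts are used very often in data visualization, mainly to show the distribution of discrete data. They come in two styles: vertical and horizontal.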
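
To make the two styles concrete, here is a minimal sketch that draws the same data once with `bar` (vertical) and once with `barh` (horizontal). The data and variable names are illustrative, and the `Agg` backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

labels = ["q", "a", "c", "e", "r", "j"]   # illustrative categories
weights = [3, 1, 4, 5, 8, 9]              # illustrative values
positions = list(range(len(labels)))

fig, (ax_v, ax_h) = plt.subplots(1, 2, figsize=(10, 4))

# Vertical style: the values map to bar heights.
vbars = ax_v.bar(positions, weights, color="c", tick_label=labels)
ax_v.set_xlabel("box number")
ax_v.set_ylabel("box weight(kg)")

# Horizontal style: the same values map to bar widths.
hbars = ax_h.barh(positions, weights, color="c", tick_label=labels)
ax_h.set_xlabel("box weight(kg)")
ax_h.set_ylabel("box number")

fig.tight_layout()
```

Note that in `barh` the roles of the size arguments are swapped: the second argument is the bar length along the x axis, and `height` controls the bar thickness.
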
```{python matplotlib-bar-1, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 4, 5, 8, 9]

plt.figure()
plt.bar(x, y, align="center", color="c",
        tick_label=["q", "a", "c", "e", "r", "j"],
        hatch="/", edgecolor='r')

plt.xlabel("box number")
plt.ylabel("box weight(kg)")
plt.show()
```
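**Common parameters**

|Parameter|Description|
|:--------|:----------------------------------------------------------|
|`x`|sequence of scalars; the `x` coordinates of the bars|
|`height`|scalar or sequence of scalars; the heights of the bars|
|`width`|scalar or array-like, optional; the widths of the bars, default 0.8|
|`bottom`|scalar or array-like, optional; the `y` coordinates of the bar bases, default 0|
|`align`|{'center', 'edge'}, optional, default 'center'; 'edge' aligns the left edges of the bars with the `x` positions; to align the right edges, pass a negative `width` together with align='edge'|
|`color`|scalar or array-like, optional; the colors of the bar faces|
|`edgecolor`|scalar or array-like, optional; the colors of the bar edges|
|`linewidth` or `lw`|scalar or array-like, optional; the width of the bar edges; 0 draws no edges|
|`linestyle` or `ls`|scalar or array-like, optional; the line style of the bar edges|
|`alpha`|a number between 0 and 1; the transparency|
|`tick_label`|string or array-like; the tick labels of the bars|
|`hatch`|the hatch pattern drawn inside the bars, given as a string like `linestyle`|

### Stacked bar chart{#bar-stacked}

```{python matplotlib-bar-2, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt

# data
x = [1, 2, 3, 4, 5, 6]
y1 = [4, 3, 2, 1, 4, 1]
y2 = [3, 1, 4, 5, 8, 9]

# plotting: the second series starts where the first one ends (bottom=y1)
plt.figure()
plt.bar(x, y1, align="center", color="r", lw=2, ls=":", alpha=0.5)
plt.bar(x, y2, align="center", bottom=y1, color="c",
        tick_label=["q", "a", "c", "e", "r", "j"],
        hatch="/", edgecolor="white", lw=2, ls=":", alpha=0.5)

# set the x and y axis labels
plt.xlabel("box number")
plt.ylabel("box weight(kg)")
plt.show()
```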
The original figure for the following example comes from the official matplotlib website.
```{python matplotlib-bar-3, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

labels = ['G1', 'G2', 'G3', 'G4', 'G5']
men_means = [20, 34, 30, 35, 27]
women_means = [25, 32, 34, 20, 25]

x = np.arange(len(labels))  # the label locations
width = 0.35  # the width of the bars

fig, ax = plt.subplots()
rects1 = ax.bar(x - width/2, men_means, width, label='Men')
rects2 = ax.bar(x + width/2, women_means, width, label='Women')

ax.set_ylabel('Scores')
ax.set_title('Scores by group and gender')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()


def autolabel(rects):
    """Attach a text label above each bar in rects, displaying its height."""
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(height),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')


autolabel(rects1)
autolabel(rects2)

fig.tight_layout()
plt.show()
```
## Histogram{#histogram}

A histogram is a statistical chart that shows the distribution of continuous data.

### Simple histogram{#histogram-ordinary}

Plot the distribution of students' test scores.

```{python matplotlib-hist-1, engine.path="~/anaconda3/bin/python3.7"}
import numpy as np
import matplotlib.pyplot as plt

# test scores
np.random.seed(1234)
scores = np.random.randint(0, 100, 100)

# plotting
plt.figure()
bins = range(0, 101, 10)
n, bins, patches = plt.hist(x=scores, bins=bins, color="#3773b8",
                            histtype="bar", rwidth=1.0, edgecolor="white")

# set the x and y axis labels
plt.xlabel("score")
plt.ylabel("number of students")
plt.show()

print("n=", n, "\nbins=", bins, "\npatches=", patches)
```
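plt.hist() returns a tuple (n, bins, patches): `n` holds the count in each bin, `bins` the bin edges, and `patches` the bar artists that were drawn.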
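
The shapes of the three return values can be inspected with a short sketch. It reuses the seed and bin edges from the example above; the comparison against `numpy.histogram` and the `Agg` backend are additions made here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(1234)
scores = np.random.randint(0, 100, 100)

fig, ax = plt.subplots()
n, bins, patches = ax.hist(scores, bins=range(0, 101, 10))

# 10 bins need 11 edges, and one rectangle is drawn per bin.
print(len(n), len(bins), len(patches))

# Every score falls into exactly one bin, so the counts sum to the sample size.
print(n.sum())

# The counts agree with what numpy.histogram computes for the same edges.
counts, edges = np.histogram(scores, bins=range(0, 101, 10))
print(np.array_equal(n, counts))
```
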
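Common parameters

|Parameter|Description|
|:------|:----------------------------------------------------------------|
|`x`|array or sequence of arrays; the arrays need not have the same length|
|`bins`|int, sequence, or str, optional; if an int n, n+1 bin edges are computed automatically; every bin covers a half-open interval [left, right) except the last one, which is closed|
|`range`|tuple or None, optional; the lower and upper range of the bins; ignored if `bins` is a sequence|
|`bottom`|array_like, scalar, or None, default None (i.e. 0); the baseline of each bin|
|`histtype`|optional; one of 'bar', 'barstacked', 'step', 'stepfilled', default 'bar'|
|`align`|optional; one of 'left', 'mid', 'right', default 'mid'|
|`orientation`|'horizontal' or 'vertical', optional|
|`rwidth`|scalar or None, optional; the width of the bars relative to the bin width, in [0.0, 1.0]|
|`stacked`|bool, optional; if True, multiple datasets are stacked on top of each other|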
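
The interval rule for `bins` is easiest to see on boundary values. The sketch below uses `numpy.histogram`, which Matplotlib's `hist` relies on for binning; the sample values are made up so that some of them sit exactly on bin edges.

```python
import numpy as np

# Illustrative values placed on the bin edges 0, 10, 90 and 100.
values = np.array([0, 9, 10, 90, 100])

# Explicit edges: 0, 10, ..., 100 give ten bins.
edges = np.arange(0, 101, 10)
counts, _ = np.histogram(values, bins=edges)
# 10 lands in [10, 20), not in [0, 10); 100 lands in the closed last bin [90, 100].
print(counts)

# An integer bin count: 5 bins over range (0, 100) produce 5 + 1 = 6 edges.
counts5, edges5 = np.histogram(values, bins=5, range=(0, 100))
print(edges5)
print(counts5)
```

Because the last bin is closed, a perfect score of 100 is counted rather than dropped when the edges stop at 100.
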
```{python matplotlib-hist-2, engine.path="~/anaconda3/bin/python3.7"}
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1234)
scoresT1 = np.random.randint(0, 100, 100)
scoresT2 = np.random.randint(0, 100, 100)

x = [scoresT1, scoresT2]
colors = ["#8dd3c7", "#bebada"]
labels = ["Class A", "Class B"]

plt.figure()
bins = range(0, 101, 10)
plt.hist(x=x, bins=bins, color=colors, histtype="bar", rwidth=1.0,
         edgecolor="white",
         stacked=True,  # if False, the bars are drawn side by side
         label=labels)

plt.xlabel("score")
plt.ylabel("number of students")
plt.title("Histogram of Different Classes")
plt.legend(loc="upper left")
plt.show()
```
> Note: when the parameter `stacked=False` is passed to plt.hist(), the result is a side-by-side histogram.

## Pie chart{#pie}

### Simple pie chart{#pie-ordinary}

```{python matplotlib-pie-1, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt

fruits = ["apple", "orange", "banana", "pear"]
colors = ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3"]
soldNums = [0.1, 0.4, 0.15, 0.35]
explode = [0, 0, 0.1, 0]

# pie chart
fig1, ax1 = plt.subplots()
ax1.pie(x=soldNums, radius=1.0, explode=explode, labels=fruits,
        autopct="%3.1f%%", startangle=60, colors=colors, shadow=True)
plt.title("Ratio of Different Fruits")
plt.legend(title="fruits", loc="lower left", bbox_to_anchor=(1, 0, 0.5, 1))
plt.show()
```
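`pie` draws a pie chart of the array x, where the fraction of each wedge is given by x/sum(x). If sum(x) < 1, the values of x are used directly as the wedge fractions, and the pie has a missing part of 1 - sum(x). This describes the Matplotlib version used here; newer releases normalize by default and need `normalize=False` to leave the gap.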
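
The x/sum(x) rule can be verified from the drawn wedges themselves. This is a minimal sketch with made-up raw counts (their sum is greater than 1, so they are always normalized); it reads the wedges back from `ax.patches`, and the `Agg` backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

sold = np.array([10, 40, 15, 35])   # illustrative raw counts, not fractions
expected = sold / sold.sum()        # the fractions x / sum(x)

fig, ax = plt.subplots()
ax.pie(sold, labels=["apple", "orange", "banana", "pear"])

# Each wedge is a patch on the Axes; its angular span over 360 degrees
# is the fraction of the pie it occupies.
drawn = np.array([(w.theta2 - w.theta1) / 360.0 for w in ax.patches])

print(expected)
print(np.allclose(drawn, expected, atol=1e-5))
```
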
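Common parameters

|Parameter|Description|
|:------|:----------------------------------------------------------------|
|`explode`|array-like, optional, default: None. If not None, an array of length len(x) giving the fraction of the radius by which each wedge is offset|
|`labels`|list, optional, default: None. A sequence of strings providing the label for each wedge|
|`colors`|array-like, optional, default: None. A sequence of matplotlib color arguments|
|`autopct`|string or function, optional, default: None. If a format string, the wedge is labeled with `fmt % pct`, as in the figure above|
|`pctdistance`|float, optional, default: 0.6. The distance from the pie center to the text generated by `autopct`, as a fraction of the radius|
|`shadow`|bool, optional, default: False|
|`labeldistance`|float or None, optional, default: 1.1. The radial distance at which the labels are drawn|
|`startangle`|float, optional, default: None. The angle by which the start of the pie is rotated counterclockwise|
|`counterclock`|bool, optional, default: True|
|`wedgeprops`|dict, optional, default: None. A dict of arguments passed to the wedge objects|
|`textprops`|dict, optional, default: None. A dict of arguments passed to the text objects|
|`rotatelabels`|bool, optional, default: False|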
Note that the parameter names `labels` and `colors` are both plural.
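
Besides a format string, `autopct` also accepts a function that receives the percentage of a wedge and returns its label. The sketch below shows one possible use: printing the absolute count next to the percentage. The helper `make_autopct` and the data are hypothetical names introduced here for illustration, and the `Agg` backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def make_autopct(values):
    """Hypothetical helper: build an autopct callback that shows percent and count."""
    total = sum(values)

    def fmt(pct):
        # pct is the percentage of the wedge (0 to 100); recover the raw count from it.
        count = int(round(pct * total / 100.0))
        return "{:.1f}%\n({:d})".format(pct, count)

    return fmt


sold = [10, 40, 15, 35]                        # illustrative raw counts
fruits = ["apple", "orange", "banana", "pear"]

fig, ax = plt.subplots()
ax.pie(sold, labels=fruits, autopct=make_autopct(sold), startangle=60)
ax.set_title("Ratio of Different Fruits")
```

A format string such as `"%3.1f%%"` is enough when only the percentage is needed; the callable form is useful when the label depends on the underlying data.
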
```{python matplotlib-pie-2, engine.path="~/anaconda3/bin/python3.7"}
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()

size = 0.3
vals = np.array([[60., 32.], [37., 40.], [29., 10.]])

cmap = plt.get_cmap("tab20c")
outer_colors = cmap(np.arange(3)*4)
inner_colors = cmap(np.array([1, 2, 5, 6, 9, 10]))

# outer ring: one wedge per row of vals
ax.pie(vals.sum(axis=1), radius=1, colors=outer_colors,
       wedgeprops=dict(width=size, edgecolor='w'))

# inner ring: one wedge per individual value
ax.pie(vals.flatten(), radius=1-size, colors=inner_colors,
       wedgeprops=dict(width=size, edgecolor='w'))

ax.set(aspect="equal", title='Pie plot with ax.pie')
plt.show()
```