Plotting a density histogram of Bernoulli samples together with the Bernoulli pmf

The reason is that plt.hist is mainly intended for continuous distributions. If you don't supply explicit bin boundaries, plt.hist simply creates 10 equally spaced bins between the minimum and maximum value. Most of these bins will be empty. With only two possible data values there should be just two bins, and therefore 3 boundaries:

    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.stats as stats

    trials = 10**3
    p = 0.5
    sample_bernoulli = stats.bernoulli.rvs(p, size=trials)  # Generate Bernoulli RV

    # Theoretical pmf at the two possible values
    plt.plot((0, 1), stats.bernoulli.pmf((0, 1), p), 'bo', ms=8, label='bernoulli pmf')
    # Density histogram of generated values, with one bin centered on 0 and one on 1
    plt.hist(sample_bernoulli, density=True, alpha=0.5, color='steelblue', edgecolor='none',
             bins=np.linspace(-0.5, 1.5, 3))
    plt.show()

Below is a visualization of the default bin boundaries and of how the samples fall into the bins. Note that with density=True the histogram is normalized so that the areas of all bars sum to 1. In this case two bars have width 0.1 and a height of about 5.0, while the other 8 bars have height zero. So the total area is 2*0.1*5 + 8*0.0 = 1.

    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.stats as stats

    trials = 10**3
    p = 0.5
    sample_bernoulli = stats.bernoulli.rvs(p, size=trials)  # Generate Bernoulli RV

    # Density histogram of generated values with default bins
    values, binbounds, bars = plt.hist(sample_bernoulli, density=True, alpha=0.2,
                                       color='steelblue', edgecolor='none')
    # Show the bin boundaries
    plt.vlines(binbounds, 0, max(values) * 1.05, color='crimson', ls=':')
    # Show the sample values with a random displacement:
    # zeros land in [0, 0.1] (first bin), ones in [0.9, 1.0] (last bin)
    plt.scatter(sample_bernoulli * 0.9 + np.random.uniform(0, 0.1, trials),
                np.random.uniform(0, max(values), trials), color='lime')
    # Show the index of each bin
    for i in range(len(binbounds) - 1):
        plt.text((binbounds[i] + binbounds[i + 1]) / 2, max(values) / 2, str(i),
                 ha='center', va='center', fontsize=20, color='crimson')
    plt.show()
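The normalization argument can also be checked numerically without plotting. The following is a minimal sketch, assuming np.histogram with density=True applies the same binning and normalization as plt.hist. The fixed seed and the names rng, heights10 and heights2 are only illustrative choices, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen here only for reproducibility
trials = 10**3
p = 0.5
sample = (rng.random(trials) < p).astype(int)  # Bernoulli(p) draws as 0/1

# Default behaviour: 10 equal-width bins between the minimum and the maximum
heights10, edges10 = np.histogram(sample, bins=10, density=True)
# Two bins of width 1, centered on 0 and on 1
heights2, edges2 = np.histogram(sample, bins=np.linspace(-0.5, 1.5, 3), density=True)

# Total area = sum of (bar height * bar width)
area10 = np.sum(heights10 * np.diff(edges10))
area2 = np.sum(heights2 * np.diff(edges2))

print("default bins:", heights10)  # two non-zero bars, each roughly 10x a frequency
print("two bins:    ", heights2)   # the empirical frequencies of 0 and 1
print("areas:", area10, area2)     # both equal 1 up to rounding
```

Because the two custom bins have width 1, their heights equal the observed relative frequencies of 0 and 1, which is why that histogram lines up with the pmf markers. With the default bins of width 0.1, the same frequencies are scaled up by a factor of 10.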
