I have a list, list = [['0-50',4],['50-100',11],['100-150',73],['150-200',46]], and I want to show it as a histogram using mpld3 in Python (PySpark). The first part of each element is a range, which goes on the x-axis of the histogram, and the second part is the number of people in that range, which goes on the y-axis. How can I make a bar chart from it using either matplotlib or mpld3 in PySpark?
UPDATE: I tried the code below, based on this example [1]. It displays the bar chart, but the output looks very bad, with a lot of grey area around the plot boundary. How can I make it look cleaner?
import numpy as np
import matplotlib.pyplot as plt
list = [['0-50',4],['50-100',11],['100-150',73],['150-200',46]]
n_groups = len(list)
fig, ax = plt.subplots()
index = np.arange(n_groups)
bar_width = 0.35
opacity = 0.4
error_config = {'ecolor': '0.3'}
number = []
ranges = []
for item in list:
    number.append(item[1])
    ranges.append(item[0])
rects1 = plt.bar(index, number, bar_width,
                 alpha=opacity,
                 color='b',
                 error_kw=error_config)
plt.xlabel('Number')
plt.ylabel('range')
plt.xticks(index + bar_width, (ranges[0],ranges[1],ranges[2],ranges[3]))
plt.legend()
plt.tight_layout()
plt.show()
|
A secret weapon for making matplotlib plots look good is import seaborn, which overrides the mpl defaults with something nicer. (Since seaborn 0.8 the import alone no longer restyles plots, so call sns.set() as well.)
I would also make the bars wider and move the xticks to the middle of the bars. Here is a slight tweak of your code to do so:
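import numpy as np, matplotlib.pyplot as plt, mpld3, seaborn as sns
sns.set()  # seaborn 0.8+ does not restyle on import; apply its defaults explicitly
list = [['0-50',4],['50-100',11],['100-150',73],['150-200',46]]
n_groups = len(list)
index = np.arange(n_groups)
bar_width = 0.9
opacity = 0.4
number = []
ranges = []
for item in list:
    number.append(item[1])
    ranges.append(item[0])
# align='edge' puts each bar's left edge at index (the default before
# matplotlib 2.0), so index + bar_width/2 below is the middle of the bar
rects1 = plt.bar(index, number, bar_width,
                 alpha=opacity,
                 color='b',
                 align='edge')
# the ranges are on the x-axis and the counts on the y-axis
plt.xlabel('range')
plt.ylabel('Number')
plt.xticks(index + bar_width/2, (ranges[0],ranges[1],ranges[2],ranges[3]))
mpld3.display()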
Here is how it looks:
And here is a notebook where you can see the interactivity that mpld3 adds (which is basically useless, but a little bit fun).
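If you would rather not depend on seaborn, the same idea can be sketched with matplotlib alone. This is only a rough sketch under a few assumptions: that the grey area comes from the figure's background colour (so a white facecolor is enough to remove it), that centre-aligned bars are acceptable (so the ticks need no offset), and that the helper name plot_counts and the variable name data are mine, not part of your code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so the sketch also runs without a display
import matplotlib.pyplot as plt

data = [['0-50', 4], ['50-100', 11], ['100-150', 73], ['150-200', 46]]


def plot_counts(data, bar_width=0.9):
    """Draw one bar per [range_label, count] pair and return (fig, ax).

    plot_counts is a hypothetical helper name used only for this sketch.
    """
    # Split the pairs into two parallel tuples: labels and counts.
    ranges, counts = zip(*data)
    positions = list(range(len(data)))

    # Assumption: the grey margin is the figure background, so make it white.
    fig, ax = plt.subplots(facecolor='white')

    # align='center' puts the middle of each bar on its position, so the
    # tick labels can sit at the same positions with no offset.
    ax.bar(positions, counts, bar_width, align='center', alpha=0.4, color='b')
    ax.set_xticks(positions)
    ax.set_xticklabels(ranges)
    ax.set_xlabel('range')
    ax.set_ylabel('Number')
    fig.tight_layout()
    return fig, ax


fig, ax = plot_counts(data)
```

In a notebook you could then pass the figure to mpld3.display(fig), or write it to disk with fig.savefig(...) if there is no notebook front end available.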