matplotlib - Pandas bar plot -- specify bar color by column

Is there a simple way to specify bar colors by column name using the Pandas DataFrame.plot(kind='bar') method?

I have a script that generates multiple DataFrames from several different data files in a directory. For example, it does something like this:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

df1.plot(kind='bar', ax=plt.subplot(121))
df2.plot(kind='bar', ax=plt.subplot(122))

plt.show()

With the following output:

Unfortunately, the column colors aren't consistent for each label across the different plots. Is it possible to pass in a dictionary of (filenames:colors), so that any particular column always has the same color? For example, I could imagine creating this by zipping up the filenames with the Matplotlib color cycle:

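data_files = ['a', 'b', 'c', 'd']
# 'axes.color_cycle' was removed in newer Matplotlib; 'axes.prop_cycle' replaces it
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
print(list(zip(data_files, colors)))

[('a', '#1f77b4'), ('b', '#ff7f0e'), ('c', '#2ca02c'), ('d', '#d62728')]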

I could figure out how to do this directly with Matplotlib; I just thought there might be a simpler, built-in solution.

Edit:

Below is a partial solution that works in pure Matplotlib. However, I'm using this in an IPython notebook that will be distributed to non-programmer colleagues, and I'd like to keep the amount of plotting code to a minimum.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

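data_files = ['a', 'b', 'c', 'd']
# 'axes.color_cycle' was removed in newer Matplotlib; 'axes.prop_cycle' replaces it
mpl_colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
colors = dict(zip(data_files, mpl_colors))

def bar_plotter(df, colors, sub):
    # Draw one grouped bar chart in subplot position `sub` (1 or 2)
    ncols = df.shape[1]
    width = 1./(ncols+2.)
    # Left edge of the first bar in each group, so the group is centered on its tick
    starts = df.index.values - width*ncols/2.
    plt.subplot(120+sub)
    for n, col in enumerate(df):
        # align='edge' matches the left-edge positions computed above
        plt.bar(starts + width*n, df[col].values, color=colors[col],
                width=width, label=col, align='edge')
    plt.xticks(df.index.values)
    plt.grid()
    plt.legend()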

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

bar_plotter(df1, colors, 1)
bar_plotter(df2, colors, 2)

plt.show()

Tags: matplotlib, pandas

asked Sep 5 '14 at 15:45, edited Sep 5 '14 at 16:06, by Ryan

Comments:

stackoverflow.com/questions/11927715/… I think this is maybe a good starting point. Maybe slice the color list [1:] for the second graph before passing it as the color? – DataSwede Sep 5 '14 at 22:06

1 Answer (accepted)

You can pass a list of colors. This takes a little manual work to line the colors up with the columns, which a dictionary would avoid, but it may be a less cluttered way to accomplish your goal.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

color_list = ['b', 'g', 'r', 'c']


df1.plot(kind='bar', ax=plt.subplot(121), color=color_list)
df2.plot(kind='bar', ax=plt.subplot(122), color=color_list[1:])

plt.show()
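
As an aside, newer pandas releases (1.1 and later, as far as I know) also accept a dict for `color`, keyed by column name, which is close to what the question asks for. A minimal sketch under that version assumption; `d2c` is just an illustrative name for the name-to-color dict:

```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']
# Assumption: one fixed color per data file, chosen by hand
d2c = dict(zip(data_files, ['b', 'g', 'r', 'c']))

df1 = pds.DataFrame(np.random.rand(4, 3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4, 3), columns=data_files[1:])

fig, (ax1, ax2) = plt.subplots(1, 2)
# Assumption: pandas >= 1.1, where `color` may be a {column name: color} dict
df1.plot(kind='bar', ax=ax1, color=d2c)
df2.plot(kind='bar', ax=ax2, color=d2c)
```

Call plt.show() afterwards to display the figure. On older pandas versions this dict form is not available, and the list form above is the way to go.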

EDIT: Ajean came up with a simple way to build the list of correct colors from a dictionary:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']
color_list = ['b', 'g', 'r', 'c']
d2c = dict(zip(data_files, color_list))

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

# list(...) is needed on Python 3, where map returns an iterator
df1.plot(kind='bar', ax=plt.subplot(121), color=list(map(d2c.get, df1.columns)))
df2.plot(kind='bar', ax=plt.subplot(122), color=list(map(d2c.get, df2.columns)))

plt.show()
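
Since the goal is to keep the notebook's plotting code short, the lookup could also be folded into a small helper. The sketch below is one way to do that, not a pandas feature: `plot_bars` and `bar_colors` are hypothetical names, and `bar_colors` only reads the colors back from the drawn bars to check that shared columns match.

```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pds
from matplotlib.colors import to_hex

def plot_bars(df, ax, color_map):
    """Hypothetical helper: bar-plot df on ax, one fixed color per column name."""
    # Raises KeyError if a column has no entry in color_map
    colors = [color_map[col] for col in df.columns]
    return df.plot(kind='bar', ax=ax, color=colors)

def bar_colors(ax):
    """Hypothetical check: map each column label to the hex color of its bars."""
    return {c.get_label(): to_hex(c.patches[0].get_facecolor())
            for c in ax.containers}

data_files = ['a', 'b', 'c', 'd']
d2c = dict(zip(data_files, ['b', 'g', 'r', 'c']))

df1 = pds.DataFrame(np.random.rand(4, 3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4, 3), columns=data_files[1:])

fig, (ax1, ax2) = plt.subplots(1, 2)
plot_bars(df1, ax1, d2c)
plot_bars(df2, ax2, d2c)

# Columns 'b' and 'c' appear in both frames and should share colors
shared = set(df1.columns) & set(df2.columns)
same = all(bar_colors(ax1)[col] == bar_colors(ax2)[col] for col in shared)
print(same)
```

Each plot then needs a single plot_bars call, and a column with no entry in the dict fails with a KeyError instead of silently picking up a default color.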

answered Sep 5 '14 at 22:14, edited Sep 5 '14 at 22:29, by DataSwede

Comments:

Very nice. For a bit of extra robustness, you could create a data2color dict ( d2c=dict(zip(data_files, color_list))) and then in the plot command put color=map(d2c.get,df1.columns) and likewise for df2. Looks like that works :). – Ajean Sep 5 '14 at 22:22

I actually like that more. Feels like this should be a simple-to-implement feature request. – DataSwede Sep 5 '14 at 22:26

I suppose the list input is enough customization for the pandas devs, not sure what else they could do. Also I totally found this little trick elsewhere on SO so I can't take all the credit, hehe! – Ajean Sep 5 '14 at 22:37

This is a great solution. I like the map with dictionary approach. Thanks DataSwede and Ajean! – Ryan Sep 6 '14 at 13:17
