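I'm working on a quick project looking at movie runtimes over the years. The data comes from a Netflix dataset, which I filtered down to the information I'm interested in. I also used groupby() and mean() to calculate the average movie length (in minutes) for each year, but when I try to create a bar chart, I get an error.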
import pandas as pd
import matplotlib.pyplot as plt
# read the csv files - turn into dataframes
netflix = pd.read_csv('titles.csv')
print(netflix)
# we just want to consider movies with over 60 minutes of runtime
movie_filter = netflix[(netflix["type"] == "MOVIE") &
                       (netflix["runtime"] > 60)]
# now let's factor in averages
averages_over_time = movie_filter.groupby("release_year")["runtime"].mean()
average_film_runtime = pd.DataFrame(averages_over_time)
plt.plot(average_film_runtime["release_year"], average_film_runtime["runtime"])
plt.show()
Here is the error I get.
Traceback (most recent call last):
File "c:\Users\matth\Dropbox\Python Code\Netflix Analysis\netflix.py", line 16, in <module>
plt.plot(average_film_runtime["release_year"], average_film_runtime["runtime"])
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
File "C:\Users\matth\AppData\Roaming\Python\Python312\site-packages\pandas\core\frame.py", line 3893, in __getitem__
indexer = self.columns.get_loc(key)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\matth\AppData\Roaming\Python\Python312\site-packages\pandas\core\indexes\base.py", line 3797, in get_loc
raise KeyError(key) from err
KeyError: 'release_year'
I'm still new to Pandas and I've been stuck on this for almost an hour, so apologies if the answer is obvious.
Thanks for your help.
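
A likely cause, sketched under assumptions: groupby("release_year")["runtime"].mean() returns a Series whose index is release_year, so wrapping it in pd.DataFrame() gives a frame with a single "runtime" column and no "release_year" column, which would explain the KeyError. The small DataFrame below is a made-up stand-in for titles.csv (only the three column names come from the post); calling reset_index() is one way to turn the index back into a regular column.

```python
import pandas as pd

# Hypothetical stand-in for titles.csv; only the column names come from the question.
netflix = pd.DataFrame({
    "type": ["MOVIE", "MOVIE", "MOVIE", "SHOW", "MOVIE"],
    "release_year": [2019, 2019, 2020, 2020, 2020],
    "runtime": [90, 110, 100, 45, 50],
})

# Same filter as in the question: movies longer than 60 minutes.
movie_filter = netflix[(netflix["type"] == "MOVIE") & (netflix["runtime"] > 60)]

# groupby(...).mean() returns a Series indexed by release_year.
averages_over_time = movie_filter.groupby("release_year")["runtime"].mean()

# Wrapping the Series keeps release_year as the index, not as a column.
average_film_runtime = pd.DataFrame(averages_over_time)
columns_before = list(average_film_runtime.columns)

# reset_index() moves release_year out of the index into a regular column.
average_film_runtime = averages_over_time.reset_index()
columns_after = list(average_film_runtime.columns)

print(columns_before)
print(columns_after)
```

With release_year available as a column, the original plt.plot(average_film_runtime["release_year"], average_film_runtime["runtime"]) call should no longer raise the KeyError. Since a bar chart was the goal, plt.bar with the same two arguments may be closer to what is wanted; alternatively, skipping reset_index() and plotting averages_over_time.index against averages_over_time.values should also work.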