Filling null values in pandas, adding empty columns in pandas

Fill the null values in each row/column with the mean of that row/column. References:

1: Python: ways to create a dictionary

2: pandas.DataFrame

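Take columns as the example. Put simply, when filling, set the fill value of each column to that column's mean. For row-wise filling, note that the relevant arguments have to be switched to rows (for example, mean(axis=1) to get the row means).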

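    values = dict(zip(group.columns.tolist(), group.mean().tolist()))  # See reference 1: build a dictionary whose keys are the column names and whose values are the column means
    group = group.fillna(value=values)  # See reference 2: fill the null values; value is a dictionary whose keys name the columns and whose values are used to fill those columns

An example is the following data, which contains null values: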

Data download:

Link: https://pan.baidu.com/s/1FiRFHhSsuspIdBIIUEXJ2w

Extraction code: 0a4b

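This dataset has 92 stations (rows), each with 120 months (columns). Some stations have no data for certain months, and those cells are empty in the data. We now want to fill each null value with the mean of the same calendar month at the same station. For example, station G1120 is missing January 2009, so that cell should be filled with the mean of G1120's January values from 2010 to 2018.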
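To make the rule concrete for a single station, here is a small sketch. The January values below are made up and cover fewer years than the real data; only the station name G1120 comes from the text above.

```python
import numpy as np
import pandas as pd

# Hypothetical January readings for one station, 2009-2012; the numbers are made up.
januaries = pd.Series(
    [np.nan, 2.0, 4.0, 6.0],
    index=pd.to_datetime(["2009-01-01", "2010-01-01", "2011-01-01", "2012-01-01"]),
    name="G1120",
)

# mean() skips NaN by default, so the fill value is the mean of the
# years that do have a January reading.
filled = januaries.fillna(januaries.mean())
print(filled)
```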

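Solution:

1. Group the original data wind_df by month, giving 12 monthly groups;

2. For each monthly group, fill the null values with the column means;

3. Combine the filled monthly groups into a new DataFrame: filled_df;

4. Sort filled_df by time so that it has the same format as the original data.

The code is as follows: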

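    import pandas as pd


    def fill_nan():
        file_path = r'g:\Shenzhen-wind\code for paper\EOF\wind_df.csv'
        wind_df = pd.read_csv(file_path, engine='python', index_col=[0], encoding='utf-8-sig')

        # The next few lines are not core code and can be deleted;
        # they are kept here to show which columns have missing values.
        a = wind_df.isnull().any()
        print('-')
        print(a[a])  # print the columns that contain missing values
        print('-')

        wind_df = wind_df.T  # transpose the original data first to make it easier to work with
        wind_df.index = pd.to_datetime(wind_df.index)  # convert the time column (also the index) to datetime
        groups = wind_df.groupby(wind_df.index.month)  # group the original data by month
        filled_groups = []  # collects every filled monthly group
        for mon, group in groups:  # iterate over the monthly groups
            print(mon)  # name of the monthly group
            # print(group.mean())  # by default, the mean of each column
            # See reference 1: build a dictionary whose keys are the column names
            # and whose values are the column means of this group.
            values = dict(zip(group.columns.tolist(), group.mean().tolist()))
            # See reference 2: fill the null values; value is a dictionary whose keys
            # name the columns and whose values are used to fill those columns.
            group = group.fillna(value=values)
            # print(group)
            filled_groups.append(group)
        # Merge the filled monthly groups into filled_df
        # (pd.concat is used because DataFrame.append was removed in pandas 2.0).
        filled_df = pd.concat(filled_groups)
        filled_df.sort_index(inplace=True)  # the index is a datetime, so this sorts by time
        filled_df = filled_df.T  # transpose back so the format matches the original data
        file_path = r'g:\Shenzhen-wind\code for paper\EOF\filled_wind_df.csv'
        filled_df.to_csv(file_path, encoding='utf-8-sig')  # save the filled data


    if __name__ == '__main__':
        fill_nan()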
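The same four steps can probably also be written without an explicit loop. The sketch below is an alternative, not the code used in this post: it swaps the per-group dictionary for groupby(...).transform("mean"), and it runs on a made-up miniature table (the station name G1121 and all values are hypothetical) so that it needs no CSV file.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the layout described above: stations as rows,
# months as columns. The values are made up.
months = pd.to_datetime(["2009-01-01", "2009-02-01", "2010-01-01",
                         "2010-02-01", "2011-01-01", "2011-02-01"])
wind_df = pd.DataFrame(
    [[np.nan, 1.0, 2.0, 3.0, 4.0, np.nan],
     [5.0, 5.0, 5.0, np.nan, 5.0, 7.0]],
    index=["G1120", "G1121"],
    columns=months,
)

by_time = wind_df.T  # months as rows, stations as columns

# transform("mean") returns a frame of the same shape in which every cell holds
# the mean of its station over the same calendar month (NaN cells are skipped).
month_means = by_time.groupby(by_time.index.month).transform("mean")

# fillna with a DataFrame aligns on both index and columns, so each null
# picks up the mean computed for its own station and month.
filled_df = by_time.fillna(month_means).T  # back to stations as rows
print(filled_df)
```

Because transform keeps the original row order, no separate sort_index step is needed in this variant.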
