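I want to check whether all the data in a dataframe is complete, from the top-left element to the bottom-right-most element (the data should fill a rectangle). Blank columns or rows after the body of the data are fine (and there will be some).
Examples of a good and a bad dataframe: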
bad_dataframe = pd.DataFrame([[1,1,1,""],["","","",""],[1,"",1,""],["","","",""]])
good_dataframe = pd.DataFrame([[1,1,1,""],[1,1,1,""],[1,1,1,""],[1,1,1,""],["","","",""]])
This is how I currently do it:
def not_rectangle_data(DataFrame):
    """
    This function will check if the data given to it is a "rectangle".
    Despite its name, it returns True when the data is a rectangle.
    """
    # Remove all rows and columns that contain only blanks
    reduced_dataframe = DataFrame[DataFrame != ""].dropna(how="all", axis=1).dropna(how="all", axis=0)
    # Remove all rows and columns that contain any blanks
    super_reduced_dataframe = reduced_dataframe.dropna(how="any", axis=1).dropna(how="any", axis=0)
    # Check that the dataframe is not empty and that no column or row is partly empty
    if not reduced_dataframe.empty and \
            super_reduced_dataframe.equals(reduced_dataframe):
        # Check that the remaining rows and columns still start at 0 with no gaps
        if ((max(reduced_dataframe.index) + 1) == reduced_dataframe.shape[0]) and \
                ((max(reduced_dataframe.columns) + 1) == reduced_dataframe.shape[1]):
            return True
        else:
            return False
    else:
        return False
However, I feel there should be a more concise way to do this.
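For illustration, one possible shorter formulation is sketched below. It is not taken from the post: the name `is_rectangle_data` is hypothetical, and the sketch assumes that both `""` and NaN count as blank. It builds a boolean mask of filled cells, counts the rows and columns that contain any data, and checks that the top-left block of that size is completely filled. If that block is full, no data can sit outside it, because every row and column that has data is already accounted for.

```python
import pandas as pd


def is_rectangle_data(df: pd.DataFrame) -> bool:
    """Return True if the non-blank cells form a filled rectangle
    anchored at the top-left corner (hypothetical helper)."""
    # True where a cell holds data; "" and NaN are both treated as blank.
    filled = (df.ne("") & df.notna()).to_numpy()
    if not filled.any():
        # A dataframe with no data at all is not a valid rectangle.
        return False
    n_rows = filled.any(axis=1).sum()  # rows containing any data
    n_cols = filled.any(axis=0).sum()  # columns containing any data
    # The top-left n_rows x n_cols block must be completely filled.
    return bool(filled[:n_rows, :n_cols].all())


bad_dataframe = pd.DataFrame([[1, 1, 1, ""], ["", "", "", ""], [1, "", 1, ""], ["", "", "", ""]])
good_dataframe = pd.DataFrame([[1, 1, 1, ""], [1, 1, 1, ""], [1, 1, 1, ""], [1, 1, 1, ""], ["", "", "", ""]])

print(is_rectangle_data(bad_dataframe))   # False
print(is_rectangle_data(good_dataframe))  # True
```

Unlike the function above, this sketch works by position rather than by index and column labels, so it does not rely on the dataframe having the default 0-based integer labels.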