An efficient way to check whether a DataFrame contains a complete grid of data

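I want to check whether all of the data in a DataFrame is complete from the top-left cell to the bottom-right-most data element, i.e. the data should fill a rectangle. Blank columns or rows after the body of the data are fine (and the data will have them).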


Examples of good and bad DataFrames:


import pandas as pd

bad_dataframe = pd.DataFrame([[1,1,1,""],["","","",""],[1,"",1,""],["","","",""]])

good_dataframe = pd.DataFrame([[1,1,1,""],[1,1,1,""],[1,1,1,""],[1,1,1,""],["","","",""]])

This is how I currently do it:


def not_rectangle_data(DataFrame):
    """
    Check whether the data given to it is a "rectangle".

    Note: despite its name, this returns True when the data IS a filled rectangle.
    """
    # Turn blanks into NaN, then remove all rows and columns that contain only blanks
    reduced_dataframe = DataFrame[DataFrame != ""].dropna(how="all", axis=1).dropna(how="all", axis=0)

    # Remove all rows and columns that contain any blanks
    super_reduced_dataframe = reduced_dataframe.dropna(how="any", axis=1).dropna(how="any", axis=0)

    # Check that the DataFrame is not empty and that no row or column is partly empty
    if not reduced_dataframe.empty and super_reduced_dataframe.equals(reduced_dataframe):
        # Check that the remaining rows and columns are contiguous from the top-left corner
        # (this relies on the default 0..n-1 integer index and column labels)
        return ((max(reduced_dataframe.index) + 1) == reduced_dataframe.shape[0]) and \
            ((max(reduced_dataframe.columns) + 1) == reduced_dataframe.shape[1])
    return False

But I feel there should be a more concise way to do this.


人到中年有点甜
1 Answer

哔哔one

Using numpy:

import numpy as np

def check_rectangle(df):
    non_zeros = np.nonzero(df.values)
    arr = np.zeros(np.max(non_zeros, 1) + 1)
    np.add.at(arr, non_zeros, 1)
    return np.all(arr)

check_rectangle(good_dataframe)
# True
check_rectangle(bad_dataframe)
# False

np.nonzero gets the indices of all elements that are not zero ('' is treated as zero here). np.zeros(np.max(non_zeros, 1) + 1) creates the smallest rectangle that fits non_zeros. np.add.at adds 1 at every non-zero position. Finally, np.all returns True if the rectangle is completely filled, and False otherwise.
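The same bounding-box idea can also be sketched with a boolean mask, which may help in two edge cases: np.nonzero counts a genuine 0 as a blank, and np.max raises a ValueError when the frame contains no data at all. The helper name is_filled_rectangle and its blank parameter are hypothetical, and returning False for an all-blank frame is an assumption made to match the behaviour of the function in the question. This is a minimal sketch, not a definitive implementation.

```python
import numpy as np
import pandas as pd


def is_filled_rectangle(df: pd.DataFrame, blank="") -> bool:
    """Hypothetical helper: True if the non-blank cells exactly fill the
    rectangle from the top-left cell to the last non-blank row and column."""
    # Boolean grid: True where a cell holds data (neither `blank` nor NaN).
    filled = (df.ne(blank) & df.notna()).to_numpy()
    if not filled.any():
        # Assumption: a frame with no data at all is not a valid rectangle.
        return False
    # Position of the last row and the last column that contain any data.
    last_row = np.flatnonzero(filled.any(axis=1)).max()
    last_col = np.flatnonzero(filled.any(axis=0)).max()
    # Every cell inside that bounding box must be filled.
    return bool(filled[: last_row + 1, : last_col + 1].all())


good_dataframe = pd.DataFrame([[1, 1, 1, ""], [1, 1, 1, ""], [1, 1, 1, ""], [1, 1, 1, ""], ["", "", "", ""]])
bad_dataframe = pd.DataFrame([[1, 1, 1, ""], ["", "", "", ""], [1, "", 1, ""], ["", "", "", ""]])

print(is_filled_rectangle(good_dataframe))  # True
print(is_filled_rectangle(bad_dataframe))   # False
```

Because the check works on positions rather than labels, it does not rely on the DataFrame having the default integer index and column names.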