Following up on this question, I am now running this code:
// Schema 1: column A (long) first, then column B (double)
List<StructField> fields = new ArrayList<>();
fields.add(DataTypes.createStructField("A", DataTypes.LongType, true));
fields.add(DataTypes.createStructField("B", DataTypes.DoubleType, true));
StructType schema1 = DataTypes.createStructType(fields);
Dataset<Row> df1 = spark.sql("select 1 as A, 2.2 as B");
Dataset<Row> finalDf1 = spark.createDataFrame(df1.javaRDD(), schema1);

// Schema 2: the same two columns, declared in the opposite order
fields = new ArrayList<>();
fields.add(DataTypes.createStructField("B", DataTypes.DoubleType, true));
fields.add(DataTypes.createStructField("A", DataTypes.LongType, true));
StructType schema2 = DataTypes.createStructType(fields);
Dataset<Row> df2 = spark.sql("select 2.2 as B, 1 as A");
Dataset<Row> finalDf2 = spark.createDataFrame(df2.javaRDD(), schema2);

// Print both schemas, then compare them with equals()
finalDf1.printSchema();
finalDf2.printSchema();
System.out.println(finalDf1.schema());
System.out.println(finalDf2.schema());
System.out.println(finalDf1.schema().equals(finalDf2.schema()));
This is the output:
root
|-- A: long (nullable = true)
|-- B: double (nullable = true)
root
|-- B: double (nullable = true)
|-- A: long (nullable = true)
StructType(StructField(A,LongType,true), StructField(B,DoubleType,true))
StructType(StructField(B,DoubleType,true), StructField(A,LongType,true))
false
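The columns are in a different order, but the two datasets have exactly the same columns and column types. What comparison is needed here to get true?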
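One possible direction, offered as a sketch rather than a definitive answer: `StructType.equals` compares the fields in order, so an order-insensitive check could compare the fields as sets instead. The example below uses only the Java standard library so it runs without Spark; the `Field` class and the `sameFieldsIgnoringOrder` helper are hypothetical stand-ins, not Spark API. With Spark itself, the same idea would presumably be applied to `schema.fields()`, which returns a `StructField[]` whose elements compare by value, for example `new HashSet<>(Arrays.asList(finalDf1.schema().fields()))`. The sketch assumes column names are unique within a schema.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

class SchemaCompare {
    // Hypothetical stand-in for Spark's StructField: name, type name, nullability.
    static final class Field {
        final String name;
        final String type;
        final boolean nullable;

        Field(String name, String type, boolean nullable) {
            this.name = name;
            this.type = type;
            this.nullable = nullable;
        }

        // Value equality, so two fields with the same contents compare equal.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Field)) return false;
            Field other = (Field) o;
            return nullable == other.nullable
                    && name.equals(other.name)
                    && type.equals(other.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, type, nullable);
        }
    }

    // True when both schemas contain the same fields, regardless of position.
    // Assumes column names are unique within each schema.
    static boolean sameFieldsIgnoringOrder(List<Field> a, List<Field> b) {
        return a.size() == b.size() && new HashSet<>(a).equals(new HashSet<>(b));
    }

    public static void main(String[] args) {
        List<Field> schema1 = Arrays.asList(
                new Field("A", "LongType", true),
                new Field("B", "DoubleType", true));
        List<Field> schema2 = Arrays.asList(
                new Field("B", "DoubleType", true),
                new Field("A", "LongType", true));

        System.out.println(schema1.equals(schema2));                   // ordered comparison: false
        System.out.println(sameFieldsIgnoringOrder(schema1, schema2)); // set comparison: true
    }
}
```

Note that this treats nullability as part of the comparison: two columns with the same name and type but different `nullable` flags would not match.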