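I can't figure out how to join these two DataFrames to each other.
The first DataFrame stores the times at which users sent requests to the service center.
Let's call this DataFrame df1: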
+-----------+---------------------+
| USER_NAME | REQUEST_DATE        |
+-----------+---------------------+
| Alex      | 2018-03-01 00:00:00 |
| Alex      | 2018-09-01 00:00:00 |
| Bob       | 2018-03-01 00:00:00 |
| Mark      | 2018-02-01 00:00:00 |
| Mark      | 2018-07-01 00:00:00 |
| Kate      | 2018-02-01 00:00:00 |
+-----------+---------------------+
The second DataFrame stores the periods during which a user is allowed to use the service center (license periods).
Let's call it df2:
+-----------+---------------------+---------------------+------------+
| USER_NAME | START_SERVICE       | END_SERVICE         | STATUS     |
+-----------+---------------------+---------------------+------------+
| Alex      | 2018-01-01 00:00:00 | 2018-06-01 00:00:00 | Active     |
| Bob       | 2018-01-01 00:00:00 | 2018-02-01 00:00:00 | Not Active |
| Mark      | 2018-01-01 00:00:00 | 2018-05-01 23:59:59 | Active     |
| Mark      | 2018-05-01 00:00:00 | 2018-08-01 23:59:59 | VIP        |
+-----------+---------------------+---------------------+------------+
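How can I join these two DataFrames and return a result like the one below? How do I get the list of users' license types at the time of each request?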
+-----------+---------------------+----------------+
| USER_NAME | REQUEST_DATE        | STATUS         |
+-----------+---------------------+----------------+
| Alex      | 2018-03-01 00:00:00 | Active         |
| Alex      | 2018-09-01 00:00:00 | No information |
| Bob       | 2018-03-01 00:00:00 | Not Active     |
| Mark      | 2018-02-01 00:00:00 | Active         |
| Mark      | 2018-07-01 00:00:00 | VIP            |
| Kate      | 2018-02-01 00:00:00 | No information |
+-----------+---------------------+----------------+
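Code:
import org.apache.spark.sql.DataFrame
// toDF on a Seq needs the session implicits; spark-shell imports them automatically.
// Assumes a SparkSession named `spark` is in scope.
import spark.implicits._

// Request times per user, matching the df1 table above.
val df1: DataFrame = Seq(
  ("Alex", "2018-03-01 00:00:00"),
  ("Alex", "2018-09-01 00:00:00"),
  ("Bob", "2018-03-01 00:00:00"),
  ("Mark", "2018-02-01 00:00:00"),
  ("Mark", "2018-07-01 00:00:00"),
  ("Kate", "2018-02-01 00:00:00")
).toDF("USER_NAME", "REQUEST_DATE")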
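For reference, here is a minimal sketch of one possible way to express the join being asked about. It is not a confirmed solution: it assumes a request matches a license period when REQUEST_DATE lies between START_SERVICE and END_SERVICE (inclusive), uses a left join so that unmatched requests are kept, and fills the missing STATUS with "No information". The object name `LicenseStatusJoin` and the helper `withStatus` are made up for illustration, and the session is a local one created just for the example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, lit}

object LicenseStatusJoin {

  // Hypothetical helper: attaches the license STATUS that was valid at each request time.
  // Assumption: a request matches a period when START_SERVICE <= REQUEST_DATE <= END_SERVICE.
  def withStatus(requests: DataFrame, periods: DataFrame): DataFrame = {
    // Cast the string columns to timestamps so the range check compares times, not text.
    val r = requests
      .withColumn("REQUEST_DATE", col("REQUEST_DATE").cast("timestamp"))
      .as("r")
    val s = periods
      .withColumn("START_SERVICE", col("START_SERVICE").cast("timestamp"))
      .withColumn("END_SERVICE", col("END_SERVICE").cast("timestamp"))
      .as("s")

    // Left join keeps every request; requests without a matching period get a null STATUS,
    // which coalesce replaces with the placeholder text.
    r.join(
        s,
        col("r.USER_NAME") === col("s.USER_NAME") &&
          col("r.REQUEST_DATE").between(col("s.START_SERVICE"), col("s.END_SERVICE")),
        "left"
      )
      .select(
        col("r.USER_NAME"),
        col("r.REQUEST_DATE"),
        coalesce(col("s.STATUS"), lit("No information")).as("STATUS")
      )
  }

  def main(args: Array[String]): Unit = {
    // Local session for the example only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("license-status-join")
      .getOrCreate()
    import spark.implicits._

    val df1 = Seq(
      ("Alex", "2018-03-01 00:00:00"),
      ("Alex", "2018-09-01 00:00:00"),
      ("Bob", "2018-03-01 00:00:00"),
      ("Mark", "2018-02-01 00:00:00"),
      ("Mark", "2018-07-01 00:00:00"),
      ("Kate", "2018-02-01 00:00:00")
    ).toDF("USER_NAME", "REQUEST_DATE")

    val df2 = Seq(
      ("Alex", "2018-01-01 00:00:00", "2018-06-01 00:00:00", "Active"),
      ("Bob", "2018-01-01 00:00:00", "2018-02-01 00:00:00", "Not Active"),
      ("Mark", "2018-01-01 00:00:00", "2018-05-01 23:59:59", "Active"),
      ("Mark", "2018-05-01 00:00:00", "2018-08-01 23:59:59", "VIP")
    ).toDF("USER_NAME", "START_SERVICE", "END_SERVICE", "STATUS")

    withStatus(df1, df2).show(false)
    spark.stop()
  }
}
```

Note that under this strict range condition Bob's request on 2018-03-01 falls after his END_SERVICE of 2018-02-01, so the sketch returns "No information" for that row rather than the "Not Active" shown in the desired result. If that row is intended, the matching rule would need to be different from a plain range check. Also, if two periods for the same user overlap (Mark's two periods overlap on 2018-05-01), a request inside the overlap would produce two rows.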