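I. What is complex event processing (CEP)?
From the official documentation: It allows you to detect event patterns in an endless stream of events, giving you the opportunity to get hold of what's important in your data.
My understanding: put simply, complex event processing matches behavioral data against a series of rules. A matching rule is a pattern, and once data hits a pattern it is processed further. Take a complete ride-hailing trip as an example: the passenger places an order, the driver accepts it, the passenger gets in, the car reaches the destination, and the passenger pays successfully. Each step is one pattern, and a series of patterns combined is complex event processing.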
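To make the "series of patterns" idea concrete, here is a minimal plain-Java sketch (not Flink code) that checks whether a stream of event names contains the ride-hailing steps in order. The stage names and the `RideTripMatcher` class are assumptions made up for illustration; unrelated events between two steps are simply skipped, which roughly corresponds to what Flink CEP calls relaxed contiguity (`followedBy`).

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of matching an ordered series of patterns; not Flink code. */
final class RideTripMatcher {
    // Hypothetical stage names for the ride-hailing example.
    static final List<String> STAGES = Arrays.asList(
            "ORDER_PLACED", "DRIVER_ACCEPTED", "PASSENGER_ON_BOARD", "ARRIVED", "PAID");

    /**
     * Returns true when every stage occurs in order within the event list.
     * Events that do not match the stage currently awaited are skipped.
     */
    static boolean matches(List<String> events) {
        int next = 0; // index of the stage we are waiting for
        for (String event : events) {
            if (next < STAGES.size() && STAGES.get(next).equals(event)) {
                next++;
            }
        }
        return next == STAGES.size();
    }
}
```

For example, a trip with an extra `GPS_PING` event between acceptance and boarding still matches, while a trip in which boarding is reported before the driver accepts does not.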
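II. Use cases for complex event processing (CEP)
1. Abnormal behavior detection: a credit card is used for purchases in different regions within a short period of time.
2. Operational decisions: a user browses the same product in several shops; such price-sensitive users can be sent coupons.
3. Alerting and monitoring: for stuck-order detection, for example in a half-day delivery scenario, monitor every stage from logistics receipt, to dispatch, to station, to the last mile, and raise an alert as soon as any stage times out. The health of a cluster can be monitored in the same way.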
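The first use case can be sketched in plain Java to show what such a rule checks. This is only an illustration under stated assumptions: the `Txn` record, its field names, and the `suspiciousCards` helper are hypothetical, and the input is assumed to be sorted by time. A real job would express the same rule as a CEP pattern with a time window.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of "same card, different region, short time"; not Flink code. */
final class CardRegionCheck {
    /** Minimal hypothetical transaction record. */
    static final class Txn {
        final String cardId;
        final String region;
        final long timestampMs;

        Txn(String cardId, String region, long timestampMs) {
            this.cardId = cardId;
            this.region = region;
            this.timestampMs = timestampMs;
        }
    }

    /**
     * Returns the card ids for which two consecutive transactions happened in
     * different regions no more than windowMs apart. Input must be time-ordered.
     */
    static List<String> suspiciousCards(List<Txn> txns, long windowMs) {
        Map<String, Txn> lastByCard = new HashMap<>();
        List<String> alerts = new ArrayList<>();
        for (Txn t : txns) {
            Txn prev = lastByCard.put(t.cardId, t); // remember the latest txn per card
            if (prev != null
                    && !prev.region.equals(t.region)
                    && t.timestampMs - prev.timestampMs <= windowMs) {
                alerts.add(t.cardId);
            }
        }
        return alerts;
    }
}
```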
III. Official example
1. Maven dependency
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-cep_2.11</artifactId>
    <version>1.10.0</version>
</dependency>
2. The official demo
DataStream<Event> input = ...;
Pattern<Event, ?> pattern = Pattern.<Event>begin("start").where(
        new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event event) {
                return event.getId() == 42;
            }
        }
    ).next("middle").subtype(SubEvent.class).where(
        new SimpleCondition<SubEvent>() {
            @Override
            public boolean filter(SubEvent subEvent) {
                return subEvent.getVolume() >= 10.0;
            }
        }
    ).followedBy("end").where(
        new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event event) {
                return event.getName().equals("end");
            }
        }
    );
PatternStream<Event> patternStream = CEP.pattern(input, pattern);
DataStream<Alert> result = patternStream.process(
    new PatternProcessFunction<Event, Alert>() {
        @Override
        public void processMatch(
                Map<String, List<Event>> pattern,
                Context ctx,
                Collector<Alert> out) throws Exception {
            out.collect(createAlertFrom(pattern));
        }
    });
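IV. Hands-on experience
1. Requirements:
A. Read service-order data from point 20000020 (service order created), keep only the records whose service type srvtype is ZS04 (dismantling service), and take the expected service time srvtime.
B. Using asomorderitemid (the service-order line number) as the join key, match the rescheduling points 20000028 or 20000029. If either one matches, check whether the rescheduled expected service time servicetime is more than 10 days later than the original expected service time. If it is, discard the order; if not, replace the expected service time with the rescheduled one and continue with the logic below.
C. Match the service-completed point 20000031 or the service-cancelled point 20000033. If 20000031 or 20000033 is matched before [servicetime + assessment threshold Y3], the order was closed on time and is discarded. If neither is matched before [servicetime + assessment threshold Y3], the order was not closed on time; keep it and pass it on to the 妙星 system.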
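Before looking at the Flink job, the decision logic of steps B and C can be written down as a small plain-Java function. This is a sketch under assumptions, not the production rule: the `DismantleOrderRule` class, the `Decision` names, and the use of `null` for "no such event" are made up for illustration, and the threshold Y3 is passed in as a parameter because its value is not given here.

```java
import java.time.Duration;
import java.time.LocalDateTime;

/** Plain-Java sketch of requirement steps B and C; names are hypothetical. */
final class DismantleOrderRule {
    enum Decision { DROP_RESCHEDULED_TOO_FAR, DROP_CLOSED_ON_TIME, KEEP_OVERDUE }

    /**
     * @param originalTime    expected service time from the creation point (step A)
     * @param rescheduledTime expected service time after rescheduling, or null if none
     * @param closedAt        time the completed/cancelled point was seen, or null if none
     * @param threshold       assessment threshold Y3 (value assumed by the caller)
     */
    static Decision evaluate(LocalDateTime originalTime,
                             LocalDateTime rescheduledTime,
                             LocalDateTime closedAt,
                             Duration threshold) {
        LocalDateTime expected = originalTime;
        if (rescheduledTime != null) {
            // Step B: a reschedule of more than 10 days removes the order.
            if (rescheduledTime.isAfter(originalTime.plusDays(10))) {
                return Decision.DROP_RESCHEDULED_TOO_FAR;
            }
            expected = rescheduledTime; // otherwise use the rescheduled time
        }
        // Step C: closed before the deadline means closed on time.
        LocalDateTime deadline = expected.plus(threshold);
        if (closedAt != null && closedAt.isBefore(deadline)) {
            return Decision.DROP_CLOSED_ON_TIME;
        }
        return Decision.KEEP_OVERDUE;
    }
}
```

Only orders that end up as `KEEP_OVERDUE` would be forwarded downstream; in the CEP job, that case corresponds to a pattern that times out before the final event arrives.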
Sample data
{"data":{"faultCodeDes":"","cmmdtyLvl":"000030761","machineFaultCode":"","omsOrderItemId":"00168722537501","status_code":"E0001","orderChangeReason":"","cmmdtyCode":"000000011066187508","customerNum":"7252486826","changeId":"","serviceTime":"2019-10-18 01:00:30","asomOrderId":"1000168722537501","mobPhoneNum":"17160912141","asomOrderItemId":"9110374270A","qualityMark":"0","srv_memo":"","cmmdtyCtgry":"R0503002","srv_type":"ZS01","proName":"01","first_srv_time":"2020-03-07 18:00:00"},"id":"20000028","ip":"10.96.30.1","time":"20200304000056352"}
{"data":{"srv_org_dec":"1500杭州苏宁","omsOrderItemId":"00454837299501","orderSource":"B2C","status_code":"E0001","consignee":"葛琳","abnormal_type":"0","srvMemo":"","serviceTime":"2019-10-18 01:00:24","asomOrderId":"1000454837299501","srv_address":"0;;杭州市;拱墅区;全区;;;;中国铁建国际城4-3-1302","asomOrderItemId":"9110374270A","mem_card_id":"6052543414","abnormal_time":"","city_name":"杭州市","cmmdty_ctgry":"R0503002","company_name":"1500杭州大区","srv_type":"ZS04","phone_num":"13958081010","company_code":"1500","mob_phone_num":"13958081010","srv_org":"O 10003878"},"id":"2","ip":"10.96.20.196","time":"20200304000005658"}
{"data":{"srv_org_dec":"1500杭州苏宁","omsOrderItemId":"00454837299501","orderSource":"B2C","status_code":"E0001","consignee":"葛琳","abnormal_type":"0","srvMemo":"","serviceTime":"2019-10-18 01:00:28","asomOrderId":"1000454837299501","srv_address":"0;;杭州市;拱墅区;全区;;;;中国铁建国际城4-3-1302","asomOrderItemId":"9110374270A","mem_card_id":"6052543414","abnormal_time":"","city_name":"杭州市","cmmdty_ctgry":"R0503002","company_name":"1500杭州大区","srv_type":"ZS04","phone_num":"13958081010","company_code":"1500","mob_phone_num":"13958081010","srv_org":"O 10003878"},"id":"20000020","ip":"10.96.20.196","time":"20200304000005658"}
{"data":{"zzcustomer_h0612":"","fix_code":"","zzcustomer_h0632":"","status_code":"E0028","zzcustomer_h0630":"","zzwjh":"","zzcustomer_h0631":"","zzssje":"","job_sit_new":"0000964211","zyry1_bp":"RY00134212","nvoice_no":"","zzysje":"0","account_object":"","srv_compie_time":"","fault_code_des":"","fix_code_des":"","zznjh":"","customer_code":"10136611","zzcustomer_h0614":"","omsOrderItemId":"00453909587501","arrival_time":"","provide_use7":"","sales_date":"2020-02-22 15:19:32","zzqtsx8":"","yb_sales_order":"","serviceTime":"2019-10-18 01:00:35","srv_card_id":"","asomOrderId":"1000453909587501","asomOrderItemId":"9110374270A","srv_memo":"","sales_store":"9403","srv_type":"ZS01","zzxdje":"0","zzcustomer_h0629":"","zzqtsx2":"","fault_code":"","job_sit_name_new":"JMJM澄海宇翔","zzcustomer_h0626":"","zzqtsx1":""},"id":"20000031","ip":"10.96.46.185","time":"20200304000248"}
{"data":{"srv_org_dec":"1500杭州苏宁","omsOrderItemId":"00454837299501","orderSource":"B2C","status_code":"E0001","consignee":"葛琳","abnormal_type":"0","srvMemo":"","serviceTime":"2019-10-18 01:00:45","asomOrderId":"1000454837299501","srv_address":"0;;杭州市;拱墅区;全区;;;;中国铁建国际城4-3-1302","asomOrderItemId":"9110374270A","mem_card_id":"6052543414","abnormal_time":"","city_name":"杭州市","cmmdty_ctgry":"R0503002","company_name":"1500杭州大区","srv_type":"ZS04","phone_num":"13958081010","company_code":"1500","mob_phone_num":"13958081010","srv_org":"O 10003878"},"id":"5","ip":"10.96.20.196","time":"20200304000005658"}
{"data":{"srv_org_dec":"1500杭州苏宁","omsOrderItemId":"00454837299501","orderSource":"B2C","status_code":"E0001","consignee":"葛琳","abnormal_type":"0","srvMemo":"","serviceTime":"2019-10-18 02:00:00","asomOrderId":"1000454837299501","srv_address":"0;;杭州市;拱墅区;全区;;;;中国铁建国际城4-3-1302","asomOrderItemId":"9110374270A","mem_card_id":"6052543414","abnormal_time":"","city_name":"杭州市","cmmdty_ctgry":"R0503002","company_name":"1500杭州大区","srv_type":"ZS04","phone_num":"13958081010","company_code":"1500","mob_phone_num":"13958081010","srv_org":"O 10003878"},"id":"6","ip":"10.96.20.196","time":"20200304000005658"}
2. Code example
// Maximum out-of-orderness tolerated for the data (5 seconds)
final long delay = 5 * 1000L;
final OutputTag<String> outputTag = new OutputTag<String>("side-output") {
};
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(2);
// Use event time
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
Properties orderProp = new Properties();
orderProp.setProperty("bootstrap.servers", Configuration.getString("order.bootstrap.servers"));
orderProp.setProperty("zookeeper.connect", Configuration.getString("order.zookeeper.connect"));
orderProp.setProperty("fetch.message.max.bytes", "10485760");
orderProp.setProperty("group.id", Configuration.getString("order.group.id"));
FlinkKafkaConsumer08<String> source = new FlinkKafkaConsumer08<>(Configuration.getString("dtm.order.links"), new SimpleStringSchema(), orderProp);
Data preprocessing:
DataStream<Event> input = env
        .addSource(source)
        .filter(new MessageHandle.DataFilter())
        .map((MapFunction<String, Event>) message -> {
            // Parse the raw JSON message and copy the fields the pattern needs
            Event event = new Event();
            JSONObject pointMsg = JSON.parseObject(message);
            String id = pointMsg.getString("id");
            JSONObject data = pointMsg.getJSONObject("data");
            String asomOrderItemId = data.getString("asomOrderItemId");
            String srvType = data.getString("srv_type");
            String serviceTime = data.getString("serviceTime");
            event.setAsomOrderItemId(asomOrderItemId);
            event.setServiceTime(serviceTime);
            event.setSrvType(srvType);
            event.setId(id);
            return event;
        });
The entity class:
/**
 * @author 18074935 XU.MIN
 * @date 2020/4/7 11:05
 */
public class Event {
    private String srvType;
    private String serviceTime;
    private String asomOrderItemId;
    private String id;

    // Public no-argument constructor, as expected for a POJO
    public Event() {
    }

    public String getSrvType() {
        return srvType;
    }

    public void setSrvType(String srvType) {
        this.srvType = srvType;
    }

    public String getServiceTime() {
        return serviceTime;
    }

    public void setServiceTime(String serviceTime) {
        this.serviceTime = serviceTime;
    }

    public String getAsomOrderItemId() {
        return asomOrderItemId;
    }

    public void setAsomOrderItemId(String asomOrderItemId) {
        this.asomOrderItemId = asomOrderItemId;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }
}
Assigning watermarks:
KeyedStream<Event, String> watermark = input
        .assignTimestampsAndWatermarks(
                // new AscendingTimestampExtractor<Event>() {
                //     @Override
                //     public long extractAscendingTimestamp(Event event) {
                //         String serviceTime = event.getServiceTime();
                //         SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss SSS");
                //         long timestamp = 0L;
                //         try {
                //             Date date = format.parse(serviceTime);
                //             timestamp = date.getTime();
                //         } catch (ParseException e) {
                //             e.printStackTrace();
                //         }
                //         return timestamp;
                //     }
                // }
                new AssignerWithPeriodicWatermarks<Event>() {
                    private final long maxOutOfOrderness = delay;
                    private long currentMaxTimestamp = 0L;

                    @Override
                    public Watermark getCurrentWatermark() {
                        // The watermark trails the largest timestamp seen so far by the allowed delay
                        return new Watermark(currentMaxTimestamp - maxOutOfOrderness);
                    }

                    @Override
                    public long extractTimestamp(Event element, long previousElementTimestamp) {
                        // serviceTime has no millisecond part, so append one to fit the format
                        String serviceTime = element.getServiceTime() + " 000";
                        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss SSS");
                        long timestamp = 0L;
                        try {
                            Date date = format.parse(serviceTime);
                            timestamp = date.getTime();
                            currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
                            System.out.println("***" + format.format(getCurrentWatermark().getTimestamp()) + "****" + element.getId() + "***");
                        } catch (ParseException e) {
                            e.printStackTrace();
                        }
                        return timestamp;
                    }
                })
        .keyBy((KeySelector<Event, String>) Event::getAsomOrderItemId);
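The watermark logic in the assigner can be tried out without a Flink cluster. The following plain-Java sketch mirrors the same bookkeeping (largest timestamp seen so far minus the allowed delay). The `BoundedWatermarkSketch` class is hypothetical, and parsing the timestamps as UTC is an assumption made only to keep the example deterministic.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Plain-Java sketch of a bounded-out-of-orderness watermark; not Flink code. */
final class BoundedWatermarkSketch {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private final long maxOutOfOrdernessMs;
    private long currentMaxTimestamp = 0L;

    BoundedWatermarkSketch(long maxOutOfOrdernessMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
    }

    /** Parses a serviceTime string to epoch milliseconds (UTC is an assumption). */
    static long parseMillis(String serviceTime) {
        return LocalDateTime.parse(serviceTime, FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    /** Records one event and returns its timestamp. */
    long onEvent(String serviceTime) {
        long timestamp = parseMillis(serviceTime);
        currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
        return timestamp;
    }

    /** The watermark never moves backwards because it follows the running maximum. */
    long currentWatermark() {
        return currentMaxTimestamp - maxOutOfOrdernessMs;
    }
}
```

With a 5-second delay, an event at `2019-10-18 01:00:30` puts the watermark at `01:00:25`; a later-arriving event stamped `01:00:24` leaves the watermark unchanged.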
Writing the pattern:
Pattern<Event, ?> pattern = Pattern.<Event>begin("start").where(
        new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event event) {
                System.out.println("start" + event.getId());
                // Service order created (20000020) with service type ZS04
                return (StringUtils.equals(event.getSrvType(), "ZS04") && StringUtils.equals(event.getId(), "20000020"));
            }
        }
).followedBy("middle").where(
        new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event event) {
                System.out.println("middle" + event.getId());
                // Either rescheduling point
                return (StringUtils.equals(event.getId(), "20000028") || StringUtils.equals(event.getId(), "20000029"));
            }
        }
).followedBy("end").where(
        new IterativeCondition<Event>() {
            @Override
            public boolean filter(Event event, Context<Event> context) throws Exception {
                if (StringUtils.equals(event.getId(), "20000031") || StringUtils.equals(event.getId(), "20000033")) {
                    System.out.println("end" + event.getId());
                    String finishTime = event.getServiceTime();
                    // Find the latest serviceTime among the events matched by "middle"
                    String changeTime = "";
                    for (Event e : context.getEventsForPattern("middle")) {
                        if (e.getServiceTime().compareTo(changeTime) > 0) {
                            changeTime = e.getServiceTime();
                        }
                    }
                    // Accept only a completion/cancellation later than that reschedule
                    return finishTime.compareTo(changeTime) > 0;
                }
                return false;
            }
        }
).within(Time.minutes(30L));
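The "end" condition compares `serviceTime` values as plain strings. That works because the fixed-width `yyyy-MM-dd HH:mm:ss` format sorts lexicographically in the same order as chronologically. A small plain-Java sketch of that comparison, with a hypothetical `LatestRescheduleCheck` helper standing in for the loop over `getEventsForPattern("middle")`:

```java
import java.util.List;

/** Plain-Java sketch of the string comparison used in the "end" condition. */
final class LatestRescheduleCheck {
    /** Returns the greatest serviceTime string, or "" when the list is empty. */
    static String latest(List<String> rescheduleTimes) {
        String latest = "";
        for (String time : rescheduleTimes) {
            if (time.compareTo(latest) > 0) {
                latest = time;
            }
        }
        return latest;
    }

    /** True when finishTime is strictly later than every reschedule time. */
    static boolean finishedAfterLatestReschedule(String finishTime, List<String> rescheduleTimes) {
        return finishTime.compareTo(latest(rescheduleTimes)) > 0;
    }
}
```

Note that the trick depends on every value using exactly the same zero-padded format; mixed formats would need real date parsing.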
Apply the pattern to the stream; a pattern-matching strategy can also be specified at this point.
PatternStream<Event> patternStream = CEP.pattern(watermark, pattern);
Processing the matched data:
DataStream<String> result = patternStream.process(
        new PatternProcessFunction<Event, String>() {
            @Override
            public void processMatch(
                    Map<String, List<Event>> pattern,
                    Context ctx,
                    Collector<String> out) throws Exception {
                out.collect(pattern.toString());
            }
        });
If timed-out data also needs to be handled, do it as follows:
SingleOutputStreamOperator<String> flatResult = patternStream.select(
        outputTag,
        new PatternTimeoutFunction<Event, String>() {
            @Override
            public String timeout(Map<String, List<Event>> map, long l) throws Exception {
                // Partial match that timed out: list the event ids matched per pattern name
                JSONObject output = new JSONObject();
                for (String key : map.keySet()) {
                    JSONArray collect = new JSONArray();
                    for (Event event : map.get(key)) {
                        collect.add(event.getId());
                    }
                    output.put(key, collect);
                }
                return output.toString();
            }
        },
        new PatternSelectFunction<Event, String>() {
            @Override
            public String select(Map<String, List<Event>> map) throws Exception {
                // Complete match: list the event ids matched per pattern name
                JSONObject output = new JSONObject();
                for (String key : map.keySet()) {
                    JSONArray collect = new JSONArray();
                    for (Event event : map.get(key)) {
                        collect.add(event.getId());
                    }
                    output.put(key, collect);
                }
                return output.toString();
            }
        }
);
Output of the timed-out data and the normally matched data:
DataStream<String> timeoutFlatResult = flatResult.getSideOutput(outputTag);
timeoutFlatResult.print();
flatResult.print();
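A special note on the call below: if input is a non-keyed stream, the downstream operator runs with parallelism 1 and the watermark advances in the order produced by the configured assigner. Once keyBy is applied, each parallel instance has its own watermark.
The watermark value follows two rules: across different parallel instances the minimum watermark is taken, and within a single parallel instance the maximum watermark is taken.
PatternStream<Event> patternStream = CEP.pattern(input, pattern, comparator);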
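The two watermark rules can be illustrated with a short plain-Java sketch. It is a simplified model rather than Flink's actual implementation, and the `OperatorWatermarkSketch` class with its channel indexing is hypothetical: each input channel keeps the largest watermark it has received, and the operator's watermark is the minimum over all channels.

```java
import java.util.Arrays;

/** Simplified model of how an operator combines watermarks from parallel inputs. */
final class OperatorWatermarkSketch {
    private final long[] channelWatermarks;

    OperatorWatermarkSketch(int channels) {
        channelWatermarks = new long[channels];
        // No watermark received yet on any channel
        Arrays.fill(channelWatermarks, Long.MIN_VALUE);
    }

    /** Applies a watermark from one channel and returns the operator watermark. */
    long onWatermark(int channel, long watermark) {
        // Within one channel: keep the maximum, so the watermark never regresses
        channelWatermarks[channel] = Math.max(channelWatermarks[channel], watermark);
        return current();
    }

    /** Across channels: the slowest channel determines the operator watermark. */
    long current() {
        long min = Long.MAX_VALUE;
        for (long w : channelWatermarks) {
            min = Math.min(min, w);
        }
        return min;
    }
}
```

This is why, with parallelism 2, a pattern timeout may fire later than expected: the operator watermark only advances once both inputs have moved forward.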
The complete code:
package com.suning.flink.starter;
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import com.suning.flink.function.MessageHandle;
import com.suning.flink.util.Configuration;
import org.apache.commons.lang.StringUtils;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.PatternTimeoutFunction;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.IterativeCondition;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;
import org.apache.flink.util.OutputTag;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.Properties;
/**
* @author 18074935 XU.MIN
* @date 2020/4/7 11:00
*/
public class LogisticsAbnormalCEPdeal {
    public static void main(String[] args) throws Exception {
        final long delay = 5 * 1000L;
        final OutputTag<String> outputTag = new OutputTag<String>("side-output") {
        };
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(2);
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
        Properties orderProp = new Properties();
        orderProp.setProperty("bootstrap.servers", Configuration.getString("order.bootstrap.servers"));
        orderProp.setProperty("zookeeper.connect", Configuration.getString("order.zookeeper.connect"));
        orderProp.setProperty("fetch.message.max.bytes", "10485760");
        orderProp.setProperty("group.id", Configuration.getString("order.group.id"));
        FlinkKafkaConsumer08<String> source = new FlinkKafkaConsumer08<>(Configuration.getString("dtm.order.links"), new SimpleStringSchema(), orderProp);
        source.setStartFromLatest();
        DataStream<Event> input = env
                .addSource(source)
                .filter(new MessageHandle.DataFilter())
                .map((MapFunction<String, Event>) message -> {
                    Event event = new Event();
                    JSONObject pointMsg = JSON.parseObject(message);
                    String id = pointMsg.getString("id");
                    JSONObject data = pointMsg.getJSONObject("data");
                    String asomOrderItemId = data.getString("asomOrderItemId");
                    String srvType = data.getString("srv_type");
                    String serviceTime = data.getString("serviceTime");
                    event.setAsomOrderItemId(asomOrderItemId);
                    event.setServiceTime(serviceTime);
                    event.setSrvType(srvType);
                    event.setId(id);
                    return event;
                });
        KeyedStream<Event, String> watermark = input
                .assignTimestampsAndWatermarks(
                        // new AscendingTimestampExtractor<Event>() {
                        //     @Override
                        //     public long extractAscendingTimestamp(Event event) {
                        //         String serviceTime = event.getServiceTime();
                        //         SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss SSS");
                        //         long timestamp = 0L;
                        //         try {
                        //             Date date = format.parse(serviceTime);
                        //             timestamp = date.getTime();
                        //         } catch (ParseException e) {
                        //             e.printStackTrace();
                        //         }
                        //         return timestamp;
                        //     }
                        // }
                        new AssignerWithPeriodicWatermarks<Event>() {
                            private final long maxOutOfOrderness = delay;
                            private long currentMaxTimestamp = 0L;

                            @Override
                            public Watermark getCurrentWatermark() {
                                return new Watermark(currentMaxTimestamp - maxOutOfOrderness);
                            }

                            @Override
                            public long extractTimestamp(Event element, long previousElementTimestamp) {
                                String serviceTime = element.getServiceTime() + " 000";
                                SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss SSS");
                                long timestamp = 0L;
                                try {
                                    Date date = format.parse(serviceTime);
                                    timestamp = date.getTime();
                                    currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
                                    System.out.println("***" + format.format(getCurrentWatermark().getTimestamp()) + "****" + element.getId() + "***");
                                } catch (ParseException e) {
                                    e.printStackTrace();
                                }
                                return timestamp;
                            }
                        })
                .keyBy((KeySelector<Event, String>) Event::getAsomOrderItemId);
        Pattern<Event, ?> pattern = Pattern.<Event>begin("start").where(
                new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event event) {
                        System.out.println("start" + event.getId());
                        return (StringUtils.equals(event.getSrvType(), "ZS04") && StringUtils.equals(event.getId(), "20000020"));
                    }
                }
        ).followedBy("middle").where(
                new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event event) {
                        System.out.println("middle" + event.getId());
                        return (StringUtils.equals(event.getId(), "20000028") || StringUtils.equals(event.getId(), "20000029"));
                    }
                }
        ).followedBy("end").where(
                new IterativeCondition<Event>() {
                    @Override
                    public boolean filter(Event event, Context<Event> context) throws Exception {
                        if (StringUtils.equals(event.getId(), "20000031") || StringUtils.equals(event.getId(), "20000033")) {
                            System.out.println("end" + event.getId());
                            String finishTime = event.getServiceTime();
                            String changeTime = "";
                            for (Event e : context.getEventsForPattern("middle")) {
                                if (e.getServiceTime().compareTo(changeTime) > 0) {
                                    changeTime = e.getServiceTime();
                                }
                            }
                            return finishTime.compareTo(changeTime) > 0;
                        }
                        return false;
                    }
                }
        ).within(Time.minutes(30L));
        // Debug output: print every event again after watermark assignment
        watermark.map((MapFunction<Event, JSONObject>) event -> {
            JSONObject result = new JSONObject();
            result.put("id", event.getId());
            result.put("asomOrderItemId", event.getAsomOrderItemId());
            result.put("serviceTime", event.getServiceTime());
            result.put("srvType", event.getSrvType());
            result.put("second pass", "print");
            return result;
        }).print();
        PatternStream<Event> patternStream = CEP.pattern(watermark, pattern);
        // DataStream<String> result = patternStream.process(
        //         new PatternProcessFunction<Event, String>() {
        //             @Override
        //             public void processMatch(
        //                     Map<String, List<Event>> pattern,
        //                     Context ctx,
        //                     Collector<String> out) throws Exception {
        //
        //
        //                 out.collect(pattern.toString());
        //             }
        //         });
        SingleOutputStreamOperator<String> flatResult = patternStream.select(
                outputTag,
                new PatternTimeoutFunction<Event, String>() {
                    @Override
                    public String timeout(Map<String, List<Event>> map, long l) throws Exception {
                        JSONObject output = new JSONObject();
                        for (String key : map.keySet()) {
                            JSONArray collect = new JSONArray();
                            for (Event event : map.get(key)) {
                                collect.add(event.getId());
                            }
                            output.put(key, collect);
                        }
                        return output.toString();
                    }
                },
                new PatternSelectFunction<Event, String>() {
                    @Override
                    public String select(Map<String, List<Event>> map) throws Exception {
                        JSONObject output = new JSONObject();
                        for (String key : map.keySet()) {
                            JSONArray collect = new JSONArray();
                            for (Event event : map.get(key)) {
                                collect.add(event.getId());
                            }
                            output.put(key, collect);
                        }
                        return output.toString();
                    }
                }
        );
        DataStream<String> timeoutFlatResult = flatResult.getSideOutput(outputTag);
        timeoutFlatResult.print();
        flatResult.print();
        env.execute("Flink CEP");
    }
}
Event entity class:
package com.suning.flink.starter;
/**
 * @author 18074935 XU.MIN
 * @date 2020/4/7 11:05
 */
public class Event {
    private String srvType;
    private String serviceTime;
    private String asomOrderItemId;
    private String id;

    // Public no-argument constructor, as expected for a POJO
    public Event() {
    }

    public String getSrvType() {
        return srvType;
    }

    public void setSrvType(String srvType) {
        this.srvType = srvType;
    }

    public String getServiceTime() {
        return serviceTime;
    }

    public void setServiceTime(String serviceTime) {
        this.serviceTime = serviceTime;
    }

    public String getAsomOrderItemId() {
        return asomOrderItemId;
    }

    public void setAsomOrderItemId(String asomOrderItemId) {
        this.asomOrderItemId = asomOrderItemId;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }
}