Xiao T's intro: Want to connect Flink to TDengine? Here is a hands-on, step-by-step tutorial.
0. Preface
TDengine is a high-performance, distributed time-series database with SQL support, developed and open-sourced by TAOS Data.
Beyond the core time-series database engine, TDengine also provides a series of additional capabilities that a big data platform needs. Even so, for architectural reasons many users still want to export data to platforms such as Apache Flink or Apache Spark for further computation and analysis.
To help with that integration, we have put together this step-by-step tutorial.

1. Technical Approach
Apache Flink provides the SourceFunction and SinkFunction interfaces to connect Flink with external data sources: a SourceFunction reads data from a source, while a SinkFunction writes data to one. On top of these, Flink offers the richer variants RichSourceFunction and RichSinkFunction, which add an initialization method, open(), and a teardown method, close(). By overriding these two methods, we can avoid re-establishing the database connection for every read or write.
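In other words, the connection is created once in open(), reused for every record, and released in close(). The following minimal skeleton (class and field names are illustrative, not taken from the project) shows where each step of that lifecycle lives:

import java.sql.Connection;
import java.sql.DriverManager;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext;

// Illustrative skeleton of a RichSourceFunction: connect once, emit many records, close once.
public class JdbcSourceSketch extends RichSourceFunction<String> {
    private Connection connection;

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        connection = DriverManager.getConnection("jdbc:...");   // initialize once per task (URL elided here)
    }

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        // query the connection and emit each row with ctx.collect(...)
    }

    @Override
    public void cancel() {
        // signal run() to stop, if it loops
    }

    @Override
    public void close() throws Exception {
        super.close();
        if (connection != null) {
            connection.close();                                   // release once per task
        }
    }
}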
2. Implementation
Full source code:
Code walkthrough:
1) Custom class SourceFromTDengine
Purpose: connect to the data source and read data
package com.taosdata.flink;

import com.taosdata.model.Sensor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.sql.*;
import java.util.Properties;

public class SourceFromTDengine extends RichSourceFunction<Sensor> {
    private static final Logger LOG = LoggerFactory.getLogger(SourceFromTDengine.class);

    private Connection connection;
    private Statement statement;

    public SourceFromTDengine() {
        super();
    }

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        // RESTful JDBC driver: no TDengine client installation is needed on the Flink nodes
        String driver = "com.taosdata.jdbc.rs.RestfulDriver";
        String host = "u05";   // FQDN of the TDengine server

        String prop = System.getProperty("java.library.path");
        LOG.info("java.library.path: {}", prop);

        Class.forName(driver);
        Properties properties = new Properties();
        // Connect once when the task starts; the connection is reused by run()
        connection = DriverManager.getConnection(
                "jdbc:TAOS-RS://" + host + ":6041/tt?user=root&password=taosdata", properties);
        statement = connection.createStatement();
    }

    @Override
    public void close() throws Exception {
        super.close();
        // Release resources when the task shuts down (statement first, then connection)
        if (statement != null) {
            statement.close();
        }
        if (connection != null) {
            connection.close();
        }
    }

    @Override
    public void run(SourceContext<Sensor> sourceContext) throws Exception {
        try {
            // Read the whole super table and emit each row as a Sensor record
            String sql = "select * from tt.meters";
            ResultSet resultSet = statement.executeQuery(sql);
            while (resultSet.next()) {
                Sensor sensor = new Sensor(
                        resultSet.getLong(1),
                        resultSet.getInt("vol"),
                        resultSet.getFloat("current"),
                        resultSet.getString("location").trim());
                sourceContext.collect(sensor);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public void cancel() {
        // run() issues a single bounded query, so there is nothing to interrupt here
    }
}
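The run() method above executes a single bounded query and then returns, so the empty cancel() is sufficient. If you wanted the source to poll TDengine continuously instead, the usual Flink pattern (a hypothetical variant, not part of the original code) is a volatile flag that cancel() clears:

// Hypothetical polling variant of SourceFromTDengine: run() loops until cancel() is called.
private volatile boolean running = true;

@Override
public void run(SourceContext<Sensor> sourceContext) throws Exception {
    while (running) {
        // re-issue the query here (for example, only rows newer than the last emitted timestamp),
        // collect the new rows, then wait before polling again
        Thread.sleep(1000);
    }
}

@Override
public void cancel() {
    running = false;   // invoked by Flink when the job is cancelled
}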
2) Custom class SinkToTDengine
Purpose: connect to the database and write data
package com.taosdata.flink;

import com.taosdata.model.Sensor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.sql.*;
import java.util.Properties;

public class SinkToTDengine extends RichSinkFunction<Sensor> {
    private Connection connection;
    private Statement statement;

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        // RESTful JDBC driver: no TDengine client installation is needed on the Flink nodes
        String driver = "com.taosdata.jdbc.rs.RestfulDriver";
        String host = "TAOS-FQDN";   // replace with the FQDN of the TDengine server

        Class.forName(driver);
        Properties properties = new Properties();
        // Connect once when the task starts; the connection is reused by invoke()
        connection = DriverManager.getConnection(
                "jdbc:TAOS-RS://" + host + ":6041/tt?user=root&password=taosdata", properties);
        statement = connection.createStatement();
    }

    @Override
    public void close() throws Exception {
        super.close();
        // Release resources when the task shuts down (statement first, then connection)
        if (statement != null) {
            statement.close();
        }
        if (connection != null) {
            connection.close();
        }
    }

    @Override
    public void invoke(Sensor sensor, Context context) throws Exception {
        try {
            // Write each record into a sub table named after its location,
            // creating the sub table automatically via the "using ... tags" clause
            String sql = String.format(
                    "insert into sinktest.%s using sinktest.meters tags('%s') values(%d,%d,%f)",
                    sensor.getLocation(),
                    sensor.getLocation(),
                    sensor.getTs(),
                    sensor.getVal(),
                    sensor.getCurrent());
            statement.executeUpdate(sql);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
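Note that both classes hard-code the host and credentials inside open(). Because a Rich function is serialized and shipped to the TaskManagers, a common refinement (a sketch; the class name and constructor are illustrative, under the assumption that only a plain string needs to travel with the function) is to pass the JDBC URL in through the constructor:

import com.taosdata.model.Sensor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Illustrative variant: the connection settings travel with the serialized sink.
public class ConfigurableTDengineSink extends RichSinkFunction<Sensor> {
    private final String jdbcUrl;                 // e.g. "jdbc:TAOS-RS://host:6041/tt?user=root&password=taosdata"
    private transient Connection connection;      // created in open(), never serialized
    private transient Statement statement;

    public ConfigurableTDengineSink(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        Class.forName("com.taosdata.jdbc.rs.RestfulDriver");
        connection = DriverManager.getConnection(jdbcUrl);
        statement = connection.createStatement();
    }

    // invoke() and close() would stay the same as in SinkToTDengine above
}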
3) Custom class Sensor
Purpose: the data structure used to carry records
package com.taosdata.model;

public class Sensor {
    public long ts;
    public int val;
    public float current;
    public String location;

    public Sensor() {
    }

    public Sensor(long ts, int val, float current, String location) {
        this.ts = ts;
        this.val = val;
        this.current = current;
        this.location = location;
    }

    public long getTs() {
        return ts;
    }

    public void setTs(long ts) {
        this.ts = ts;
    }

    public int getVal() {
        return val;
    }

    public void setVal(int val) {
        this.val = val;
    }

    public float getCurrent() {
        return current;
    }

    public void setCurrent(float current) {
        this.current = current;
    }

    public String getLocation() {
        return location;
    }

    public void setLocation(String location) {
        this.location = location;
    }

    @Override
    public String toString() {
        return "Sensor{" +
                "ts=" + ts +
                ", val=" + val +
                ", current=" + current +
                ", location='" + location + '\'' +
                '}';
    }
}
4) Main class ReadFromTDengine
Purpose: the Flink job that reads from and writes to TDengine
package com.taosdata;

import com.taosdata.model.Sensor;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ReadFromTDengine {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Read all rows from tt.meters through the custom source
        DataStreamSource<Sensor> sensorList = env.addSource(new com.taosdata.flink.SourceFromTDengine());
        // Print each record to the TaskManager log, then write it back out through the custom sink
        sensorList.print();
        sensorList.addSink(new com.taosdata.flink.SinkToTDengine());
        env.execute();
    }
}
3. A Quick Test with the RESTful Interface
1) Environment setup:
a) Install and start Flink:
- wget
- tar zxf flink-1.14.3-bin-scala_2.12.tgz -C /usr/local
- /usr/local/flink-1.14.3/bin/start-cluster.sh
b) Prepare the TDengine Database environment:
- Create the source data:
- create database tt;
- use tt;
- create table `meters` (`ts` TIMESTAMP,`vol` INT,`current` FLOAT) TAGS (`location` BINARY(20));
- insert into beijing using meters tags('beijing') values(now,220,30.2);
- Create the target database and table:
- create database sinktest;
- use sinktest;
- create table `meters` (`ts` TIMESTAMP,`vol` INT,`current` FLOAT) TAGS (`location` BINARY(20));
2) Build and package:
Source code location:
mvn clean package
3) Run the job:
flink run target/test-flink-1.0-SNAPSHOT-dist.jar
- Reading data:
- vi log/flink-root-taskexecutor-0-xxxxx.out
- The printed records appear, e.g. Sensor{ts=1645166073101, val=220, current=5.7, location='beijing'}
- Writing data:
- show sinktest.tables;
- The beijing sub table has been created
- select * from sinktest.beijing;
- The newly inserted data can be queried
4. Using the JNI Interface
Readers who like to extrapolate will have guessed by now that, in principle, only the JDBC URL needs to change.
However, Flink uses a new ClassLoader every time it dispatches a job, so on the compute nodes you will run into the error "Native library already loaded in another classloader".
To avoid this, put the JDBC jar into Flink's lib directory and run the job with the non-dist jar:
- cp taos-jdbcdriver-2.0.37-dist.jar /usr/local/flink-1.14.3/lib
- flink run target/test-flink-1.0-SNAPSHOT.jar
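Concretely, only the driver class and the connection URL inside open() change; everything else stays the same. A sketch of the JNI connection (assuming the default native port 6030 and the same database tt):

// JNI (native) connection instead of RESTful; requires the TDengine client installed on every Flink node.
String driver = "com.taosdata.jdbc.TSDBDriver";
Class.forName(driver);
connection = DriverManager.getConnection(
        "jdbc:TAOS://" + host + ":6030/tt?user=root&password=taosdata",
        new Properties());
statement = connection.createStatement();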
5. Summary
By bringing the SourceFromTDengine and SinkToTDengine classes into your project, you can read from and write to TDengine directly in Flink. A follow-up article will cover connecting Spark to TDengine.
Note: this article uses the RESTful interface of the JDBC driver, so TDengine does not have to be installed on the Flink nodes; the JNI interface requires the TDengine Database client to be installed on every Flink node.