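Implementation Status
The following tables summarize the features available in the various official Arrow libraries. All libraries currently follow version 1.0.0 of the Arrow format, or a later minor version compatible with 1.0.0. See Format Versioning and Stability for details about versioning. Unless otherwise stated, the Python, R, Ruby and C/GLib libraries follow the C++ Arrow library.
Data Types
Data type (primitive) |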
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Null |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Boolean |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Int8/16/32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
UInt8/16/32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Float16 |
✓ |
✓ (1) |
✓ |
✓ |
✓ (2) |
✓ |
✓ |
✓ |
|
Float32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Decimal128 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Decimal256 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Date32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Time32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Timestamp |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Duration |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Interval |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Fixed Size Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Large Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Utf8 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Large Utf8 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Binary View |
✓ |
✓ |
✓ |
||||||
Large Binary View |
✓ |
✓ |
|||||||
Utf8 View |
✓ |
✓ |
✓ |
||||||
Large Utf8 View |
✓ |
✓ |
Data type (nested) |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Fixed Size List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Large List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
List View |
✓ |
✓ |
✓ |
||||||
Large List View |
✓ |
✓ |
|||||||
Struct |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Map |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Dense Union |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Sparse Union |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Data type (special) |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Dictionary |
✓ |
✓ (3) |
✓ |
✓ |
✓ |
✓ (3) |
✓ |
✓ |
|
Extension |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Run-End Encoded |
✓ |
✓ |
Canonical extension types |
C++ |
Java |
Go |
JavaScript |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
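Fixed shape tensor |
✓ |
|||||||
Variable shape tensor |
Notes:
(1) Casting to/from Float16 in Java is not supported.
(2) Float16 support in C# is only available when targeting .NET 6+.
(3) Nested dictionaries are not supported.
See also
The Arrow Columnar Format and the Canonical Extension Types specifications.
IPC Format
IPC Feature |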
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Arrow stream format |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ (4) |
Arrow file format |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Record batches |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Dictionaries |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Replacement dictionaries |
✓ |
✓ |
✓ |
✓ |
|||||
Delta dictionaries |
✓ (1) |
✓ (1) |
✓ |
✓ |
✓ |
||||
Tensors |
✓ |
||||||||
Sparse tensors |
✓ |
||||||||
Buffer compression |
✓ |
✓ (3) |
✓ |
✓ |
✓ |
✓ |
|||
Endianness conversion |
✓ (2) |
✓ (2) |
✓ (2) |
||||||
Custom schema metadata |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
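Notes:
(1) Delta dictionaries are not supported on nested dictionaries.
(2) Data with non-native endianness can be byte-swapped automatically when reading.
(3) The LZ4 codec is currently quite inefficient. ARROW-11901 tracks improving its performance.
(4) The nanoarrow IPC implementation only supports reading IPC streams.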
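Note (2) mentions byte swapping on read. As a rough illustration of what that conversion involves for a buffer of 32-bit integers, here is a minimal Python sketch. It is not how any Arrow library implements the feature: the function names and the `buffer_byteorder` parameter are made up for this example, and a real reader has to handle every fixed-width type, not only 4-byte elements.

```python
import sys


def byte_swap_int32(raw: bytes) -> bytes:
    """Reverse the byte order of every 4-byte element in the buffer."""
    if len(raw) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4 bytes")
    return b"".join(raw[i:i + 4][::-1] for i in range(0, len(raw), 4))


def to_native_int32(raw: bytes, buffer_byteorder: str) -> bytes:
    """Return the buffer in native byte order, swapping only when needed.

    `buffer_byteorder` ("little" or "big") is a hypothetical parameter
    standing in for the endianness a reader learns from the stream.
    """
    if buffer_byteorder not in ("little", "big"):
        raise ValueError("buffer_byteorder must be 'little' or 'big'")
    if buffer_byteorder == sys.byteorder:
        return raw  # already native, nothing to do
    return byte_swap_int32(raw)


def decode_int32(raw: bytes) -> list[int]:
    """Interpret a native-order buffer as signed 32-bit integers."""
    return [
        int.from_bytes(raw[i:i + 4], sys.byteorder, signed=True)
        for i in range(0, len(raw), 4)
    ]


# A big-endian buffer holding the values 1 and -2.
big_endian = (1).to_bytes(4, "big", signed=True) + (-2).to_bytes(4, "big", signed=True)
values = decode_int32(to_native_int32(big_endian, "big"))
print(values)  # [1, -2]
```

The swap is skipped when the buffer already matches the host, so a reader on a matching platform pays no conversion cost.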
See also
The Serialization and Interprocess Communication (IPC) specification.
Flight RPC
Flight RPC Transport |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
gRPC transport (grpc:, grpc+tcp:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
gRPC domain socket transport (grpc+unix:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
gRPC + TLS transport (grpc+tls:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
UCX transport (ucx:) |
✓ |
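The transports listed above are selected by the URI scheme of a Flight location. The following Python sketch shows one way to map a location string to the transport it names. It uses only the standard library and is not part of any Arrow API: `TRANSPORT_BY_SCHEME`, `transport_for`, and the host names and paths are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Scheme -> transport, taken from the transport table above.
TRANSPORT_BY_SCHEME = {
    "grpc": "gRPC over TCP",
    "grpc+tcp": "gRPC over TCP",
    "grpc+unix": "gRPC over a Unix domain socket",
    "grpc+tls": "gRPC over TLS",
    "ucx": "UCX",
}


def transport_for(location: str) -> str:
    """Return a description of the transport a location URI selects."""
    scheme = urlsplit(location).scheme
    try:
        return TRANSPORT_BY_SCHEME[scheme]
    except KeyError:
        raise ValueError(f"unknown Flight transport scheme: {scheme!r}") from None


# Hypothetical locations, for illustration only.
print(transport_for("grpc+tls://flight.example.com:443"))
print(transport_for("grpc+unix:///tmp/flight.sock"))
```

Whether a given library accepts a scheme still depends on the support shown in the table; the lookup only classifies the URI.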
Supported features in the gRPC transport:
Flight RPC Feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
All RPC methods |
✓ |
✓ |
✓ |
✓ (1) |
✓ |
|||
Authentication handlers |
✓ |
✓ |
✓ |
✓ (2) |
✓ |
|||
Call timeouts |
✓ |
✓ |
✓ |
✓ |
||||
Call cancellation |
✓ |
✓ |
✓ |
✓ |
||||
Concurrent client calls (3) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Custom middleware |
✓ |
✓ |
✓ |
✓ |
||||
RPC error codes |
✓ |
✓ |
✓ |
✓ |
✓ |
Supported features in the UCX transport:
Flight RPC Feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
All RPC methods |
✓ (4) |
|||||||
Authentication handlers |
||||||||
Call timeouts |
||||||||
Call cancellation |
||||||||
Concurrent client calls |
✓ (5) |
|||||||
Custom middleware |
||||||||
RPC error codes |
✓ |
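Notes:
(1) No support for Handshake or DoExchange.
(2) Supported through AspNetCore authentication handlers.
(3) Whether a single client can support multiple concurrent calls.
(4) Only DoExchange, DoGet, DoPut, and GetFlightInfo are supported.
(5) Each concurrent call is a separate connection to the server (unlike gRPC, where concurrent calls are multiplexed over a single connection). This generally provides better throughput but consumes more resources on both the server and the client.
See also
The Arrow Flight RPC specification.
Flight SQL
Note
Flight SQL is still experimental.
The feature support refers to the client/server libraries only; databases that implement the Flight SQL protocol will in turn support or not support individual features.
Feature |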
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
BeginSavepoint |
✓ |
✓ |
||||||
BeginTransaction |
✓ |
✓ |
||||||
CancelQuery |
✓ |
✓ |
||||||
ClosePreparedStatement |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
CreatePreparedStatement |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
CreatePreparedSubstraitPlan |
✓ |
✓ |
||||||
EndSavepoint |
✓ |
✓ |
||||||
EndTransaction |
✓ |
✓ |
||||||
GetCatalogs |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetCrossReference |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetDbSchemas |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetExportedKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetImportedKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetPrimaryKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetSqlInfo |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetTables |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetTableTypes |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetXdbcTypeInfo |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
PreparedStatementQuery |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
PreparedStatementUpdate |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
StatementSubstraitPlan |
✓ |
✓ |
||||||
StatementQuery |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
StatementUpdate |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The Arrow Flight SQL specification.
C Data Interface
Feature |
C++ |
Python |
R |
Rust |
Go |
Java |
C/GLib |
Ruby |
Julia |
C# |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Schema export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Array export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Schema import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Array import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The C Data Interface specification.
C Stream Interface
Feature |
C++ |
Python |
R |
Rust |
Go |
Java |
C/GLib |
Ruby |
Julia |
C# |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Stream export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Stream import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The C Stream Interface specification.
Third-Party Data Formats
Format |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
Avro |
R |
|||||||
CSV |
R/W |
R (2) |
R/W |
R/W |
R/W |
|||
ORC |
R/W |
R (1) |
||||||
Parquet |
R/W |
R (2) |
R/W |
R/W |
Notes:
R = Read supported
W = Write supported
(1) Through JNI bindings. (Provided by org.apache.arrow.orc:arrow-orc)
(2) Through JNI bindings to Arrow C++ Datasets. (Provided by org.apache.arrow:arrow-dataset)