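Implementation Status#
The following tables summarize the features available in the various official Arrow libraries. All libraries currently follow version 1.0.0 of the Arrow format, or a later minor version that is compatible with 1.0.0. See Format Versioning and Stability for details about versioning. Unless otherwise stated, the Python, R, Ruby and C/GLib libraries follow the C++ Arrow library.
Data Types#
Data type (primitive) |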
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
|---|---|---|---|---|---|---|---|---|---|
Null |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Boolean |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Int8/16/32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
UInt8/16/32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Float16 |
✓ |
✓ (1) |
✓ |
✓ |
✓ (2) |
✓ |
✓ |
✓ |
|
Float32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Decimal32 |
✓ |
✓ |
✓ |
✓ |
|||||
Decimal64 |
✓ |
✓ |
✓ |
✓ |
|||||
Decimal128 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Decimal256 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Date32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Time32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Timestamp |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Duration |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Interval |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Fixed Size Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Large Binary |
✓ |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
|
Utf8 String |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Large Utf8 |
✓ |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
|
Binary View |
✓ |
✓ |
✓ |
✓ |
✓ |
||||
Utf8 View |
✓ |
✓ |
✓ |
✓ |
✓ |
Data type (nested) |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
|---|---|---|---|---|---|---|---|---|---|
Fixed Size List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Large List |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
||
List View |
✓ |
✓ |
✓ |
✓ |
|||||
Large List View |
✓ |
✓ |
✓ |
||||||
Struct |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Map |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Dense Union |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Sparse Union |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Data type (special) |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
|---|---|---|---|---|---|---|---|---|---|
Dictionary |
✓ |
✓ (3) |
✓ |
✓ |
✓ |
✓ (3) |
✓ |
✓ |
|
Extension |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Run-End Encoded |
✓ |
✓ |
✓ |
Canonical Extension Types |
C++ |
Java |
Go |
JavaScript |
C# |
Rust |
Julia |
Swift |
|---|---|---|---|---|---|---|---|---|
Fixed Shape Tensor |
✓ |
|||||||
Variable Shape Tensor |
||||||||
JSON |
✓ |
✓ |
||||||
Opaque |
✓ |
✓ |
✓ |
|||||
UUID |
✓ |
✓ |
||||||
8-bit Boolean |
✓ |
✓ |
||||||
Parquet Variant |
✓ |
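Notes
(1) Casting to or from Float16 is not supported in Java.
(2) Float16 support in C# is only available when targeting .NET 6+.
(3) Nested dictionaries are not supported.
(4) The C# large array types exist to help with interoperability with other libraries, but they do not support buffers larger than 2 GiB, and an exception is raised if you try to import an array that is too large.
See also
The Arrow Columnar Format and Canonical Extension Types specifications.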
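Note (4) turns on the difference between the regular and "Large" variable-length types: in the Arrow columnar format, Binary and Utf8 use 32-bit offsets while Large Binary and Large Utf8 use 64-bit offsets. The following is a minimal standard-library sketch of that arithmetic only. The helper names (`build_offsets`, `fits_32bit_offsets`, `pack_offsets`) are hypothetical and this is not how any Arrow library is implemented.

```python
import struct

# Largest value a signed 32-bit offset can hold (just under 2 GiB).
INT32_MAX = 2**31 - 1


def build_offsets(values):
    """Return cumulative byte offsets for a list of byte strings."""
    offsets = [0]
    for value in values:
        offsets.append(offsets[-1] + len(value))
    return offsets


def fits_32bit_offsets(offsets):
    """True when the final offset is representable as a signed int32."""
    return offsets[-1] <= INT32_MAX


def pack_offsets(offsets):
    """Pack offsets as little-endian int32 if they fit, otherwise int64."""
    code = "i" if fits_32bit_offsets(offsets) else "q"
    return struct.pack(f"<{len(offsets)}{code}", *offsets)


values = [b"arrow", b"", b"ipc"]
offsets = build_offsets(values)   # [0, 5, 5, 8]
packed = pack_offsets(offsets)    # four int32 values, 16 bytes
```

A data buffer whose total length exceeds `INT32_MAX` bytes can only be addressed with 64-bit offsets, which is the situation the "Large" types are meant for.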
IPC Format#
IPC Feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
|---|---|---|---|---|---|---|---|---|---|
Arrow stream format |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ (4) |
Arrow file format |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Record batches |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Dictionaries |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Replacement dictionaries |
✓ |
✓ |
✓ |
✓ |
|||||
Delta dictionaries |
✓ (1) |
✓ (1) |
✓ |
✓ |
✓ |
||||
Tensors |
✓ |
||||||||
Sparse tensors |
✓ |
||||||||
Buffer compression |
✓ |
✓ (3) |
✓ |
✓ |
✓ |
✓ |
|||
Endianness conversion |
✓ (2) |
✓ (2) |
✓ (2) |
||||||
Custom schema metadata |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
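Notes
(1) Delta dictionaries are not supported on nested dictionaries.
(2) Data with non-native endianness can be byte-swapped automatically when reading.
(3) The LZ4 codec is currently quite inefficient. ARROW-11901 tracks improving performance.
(4) The nanoarrow IPC implementation is only for reading IPC streams.
See also
The Serialization and Interprocess Communication (IPC) specification.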
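Note (2) refers to byte-swapping buffers whose endianness differs from the reader's. As a rough illustration of what that means for a buffer of 32-bit integers, here is a standard-library sketch. The function names are hypothetical, and the Arrow libraries perform this conversion internally through their own code paths.

```python
import struct


def swap_int32_buffer(buf):
    """Reverse the byte order of every 4-byte element in buf."""
    if len(buf) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4")
    count = len(buf) // 4
    # Decoding with one byte order and re-encoding with the other
    # reverses the bytes inside each element.
    values = struct.unpack(f">{count}i", buf)
    return struct.pack(f"<{count}i", *values)


def read_int32_values(buf, producer_is_little_endian):
    """Decode int32 values using the byte order the producer declared."""
    order = "<" if producer_is_little_endian else ">"
    return list(struct.unpack(f"{order}{len(buf) // 4}i", buf))


# A buffer written by a big-endian producer.
big_endian = struct.pack(">3i", 1, -2, 300)
swapped = swap_int32_buffer(big_endian)
decoded = read_int32_values(big_endian, producer_is_little_endian=False)
```

After the swap, `swapped` holds the same three values in little-endian layout, so a little-endian reader can use the buffer directly.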
Flight RPC#
Flight RPC Transport |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
|---|---|---|---|---|---|---|---|---|
gRPC transport (grpc:, grpc+tcp:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
gRPC domain socket transport (grpc+unix:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
gRPC + TLS transport (grpc+tls:) |
✓ |
✓ |
✓ |
✓ |
✓ |
Supported features in the gRPC transport:
Flight RPC Feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
|---|---|---|---|---|---|---|---|---|
All RPC methods |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Authentication handlers |
✓ |
✓ |
✓ |
✓ (1) |
✓ |
|||
Call timeouts |
✓ |
✓ |
✓ |
✓ |
||||
Call cancellation |
✓ |
✓ |
✓ |
✓ |
||||
Concurrent client calls (2) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Custom middleware |
✓ |
✓ |
✓ |
✓ |
||||
RPC error codes |
✓ |
✓ |
✓ |
✓ |
✓ |
Notes
(1) Supported using AspNetCore authentication handlers.
(2) Whether a single client can support multiple concurrent calls.
See also
The Arrow Flight RPC specification.
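The transport table in this section identifies each transport by its URI scheme (`grpc:`, `grpc+tcp:`, `grpc+unix:`, `grpc+tls:`). The sketch below shows one way to classify a location URI by scheme using only the Python standard library. The `TRANSPORTS` mapping, the `describe_location` helper and the example URIs are assumptions made for illustration and are not part of any Flight client API.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from the schemes in the table to a description.
TRANSPORTS = {
    "grpc": "gRPC over TCP",
    "grpc+tcp": "gRPC over TCP",
    "grpc+unix": "gRPC over a Unix domain socket",
    "grpc+tls": "gRPC over TCP with TLS",
}


def describe_location(uri):
    """Return (transport description, endpoint) for a Flight-style URI."""
    parts = urlsplit(uri)
    if parts.scheme not in TRANSPORTS:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if parts.scheme == "grpc+unix":
        # Domain sockets are addressed by filesystem path, not host and port.
        return TRANSPORTS[parts.scheme], parts.path
    return TRANSPORTS[parts.scheme], f"{parts.hostname}:{parts.port}"


tls_location = describe_location("grpc+tls://example.com:443")
unix_location = describe_location("grpc+unix:///tmp/flight.sock")
```

Rejecting unknown schemes up front keeps an unsupported transport from being silently treated as plain TCP.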
Flight SQL#
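Note
Flight SQL is still experimental.
Feature support refers only to the client/server libraries; databases that implement the Flight SQL protocol will in turn support or not support individual features.
Feature |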
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
|---|---|---|---|---|---|---|---|---|
BeginSavepoint |
✓ |
✓ |
||||||
BeginTransaction |
✓ |
✓ |
||||||
CancelQuery |
✓ |
✓ |
||||||
ClosePreparedStatement |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
CreatePreparedStatement |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
CreatePreparedSubstraitPlan |
✓ |
✓ |
||||||
EndSavepoint |
✓ |
✓ |
||||||
EndTransaction |
✓ |
✓ |
||||||
GetCatalogs |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetCrossReference |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetDbSchemas |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetExportedKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetImportedKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetPrimaryKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetSqlInfo |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetTables |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetTableTypes |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetXdbcTypeInfo |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
PreparedStatementQuery |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
PreparedStatementUpdate |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
StatementSubstraitPlan |
✓ |
✓ |
||||||
StatementQuery |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
StatementUpdate |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The Arrow Flight SQL specification.
C Data Interface#
Feature |
C++ |
Python |
R |
Rust |
Go |
Java |
C/GLib |
Ruby |
Julia |
C# |
Swift |
nanoarrow |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
Schema export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Array export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Schema import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Array import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The C Data Interface specification.
C Stream Interface#
Feature |
C++ |
Python |
R |
Rust |
Go |
Java |
C/GLib |
Ruby |
Julia |
C# |
Swift |
nanoarrow |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
Stream export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Stream import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The C Stream Interface specification.
Third-Party Data Formats#
Format |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
|---|---|---|---|---|---|---|---|---|
Avro |
R |
R |
||||||
CSV |
R/W |
R (2) |
R/W |
R/W |
R/W |
|||
ORC |
R/W |
R (1) |
||||||
Parquet |
R/W |
R (2) |
R/W |
R/W |
Notes
R = Read supported
W = Write supported
(1) Through JNI bindings. (Provided by org.apache.arrow.orc:arrow-orc)
(2) Through JNI bindings to Arrow C++ Datasets. (Provided by org.apache.arrow:arrow-dataset)