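Implementation Status
The following tables summarize the features available in the various official Arrow libraries. All libraries currently follow version 1.0.0 of the Arrow format, or a later minor version that is compatible with 1.0.0. See Format Versioning and Stability for details about versioning. Unless otherwise stated, the Python, R, Ruby and C/GLib libraries follow the C++ Arrow library.
Data Types
Data type (primitive) |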
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Null |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Boolean |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Int8/16/32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
UInt8/16/32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Float16 |
✓ |
✓ (1) |
✓ |
✓ |
✓ (2) |
✓ |
✓ |
✓ |
|
Float32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Decimal32 |
✓ |
✓ |
✓ |
||||||
Decimal64 |
✓ |
✓ |
✓ |
||||||
Decimal128 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Decimal256 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Date32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Time32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Timestamp |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Duration |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Interval |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Fixed Size Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Large Binary |
✓ |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
|
Utf8 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Large Utf8 |
✓ |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
|
Binary View |
✓ |
✓ |
✓ |
✓ |
|||||
Large Binary View |
✓ |
✓ |
|||||||
Utf8 View |
✓ |
✓ |
✓ |
✓ |
|||||
Large Utf8 View |
✓ |
✓ |
Data type (nested) |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Fixed Size List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Large List |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
||
List View |
✓ |
✓ |
✓ |
||||||
Large List View |
✓ |
✓ |
|||||||
Struct |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Map |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Dense Union |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Sparse Union |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Data type (special) |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Dictionary |
✓ |
✓ (3) |
✓ |
✓ |
✓ |
✓ (3) |
✓ |
✓ |
|
Extension |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Run-End Encoded |
✓ |
✓ |
Canonical extension types |
C++ |
Java |
Go |
JavaScript |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
Fixed Shape Tensor |
✓ |
|||||||
Variable Shape Tensor |
||||||||
JSON |
✓ |
✓ |
||||||
UUID |
✓ |
✓ |
||||||
8-bit Boolean |
✓ |
✓ |
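Notes
(1) Casting to and from Float16 is not supported in Java.
(2) Float16 support in C# is only available when targeting .NET 6+.
(3) Nested dictionaries are not supported.
(4) The C# large array types exist to help interoperability with other libraries, but they do not support buffers larger than 2 GiB; attempting to import an array that is too large raises an exception.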
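Note (4) is easier to picture with the offset widths in mind. In the Arrow columnar format, Binary and Utf8 address their value bytes with 32-bit signed offsets, while the Large variants use 64-bit offsets, so a non-large array can address at most 2^31 - 1 bytes of values. The following is a minimal standard-library Python sketch of that arithmetic; `build_offsets`, `fits_in_int32` and `pack_offsets` are hypothetical helper names chosen for illustration and are not part of any Arrow library.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a 32-bit signed offset can hold


def build_offsets(values):
    """Return Arrow-style offsets for a sequence of bytes values.

    An array of n variable-size values has n + 1 offsets; value i occupies
    data[offsets[i]:offsets[i + 1]].
    """
    offsets = [0]
    for v in values:
        offsets.append(offsets[-1] + len(v))
    return offsets


def fits_in_int32(offsets):
    """True if the final (largest) offset fits in a 32-bit signed integer."""
    return offsets[-1] <= INT32_MAX


def pack_offsets(offsets, large=False):
    """Serialize offsets as little-endian int32, or int64 when large=True."""
    fmt = "<%d%s" % (len(offsets), "q" if large else "i")
    return struct.pack(fmt, *offsets)


values = [b"arrow", b"", b"ipc"]
offsets = build_offsets(values)
print(offsets)                                 # [0, 5, 5, 8]
print(len(pack_offsets(offsets)))              # 16 (4 offsets * 4 bytes)
print(len(pack_offsets(offsets, large=True)))  # 32 (4 offsets * 8 bytes)
print(fits_in_int32([0, 2**31]))               # False
```

A reader that only handles buffers below 2 GiB could apply a check like `fits_in_int32` to the final offset before accepting a large array, which is roughly the situation the note describes.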
See also
The Arrow Columnar Format and Canonical Extension Types specifications.
IPC Format
IPC Feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Arrow stream format |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ (4) |
Arrow file format |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Record batches |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Dictionaries |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Replacement dictionaries |
✓ |
✓ |
✓ |
✓ |
|||||
Delta dictionaries |
✓ (1) |
✓ (1) |
✓ |
✓ |
✓ |
||||
Tensors |
✓ |
||||||||
Sparse tensors |
✓ |
||||||||
Buffer compression |
✓ |
✓ (3) |
✓ |
✓ |
✓ |
✓ |
|||
Endianness conversion |
✓ (2) |
✓ (2) |
✓ (2) |
||||||
Custom schema metadata |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
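Notes
(1) Delta dictionaries are not supported on nested dictionaries.
(2) Data with non-native endianness can be byte-swapped automatically when reading.
(3) The LZ4 codec is currently quite inefficient. ARROW-11901 tracks improving its performance.
(4) The nanoarrow IPC implementation only targets reading IPC streams.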
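As a rough illustration of what the byte-swapping in note (2) involves for one fixed-width buffer, the sketch below decodes Int32 values written in big-endian order on whatever host it runs on, using only Python's standard `array` module. `read_int32_buffer` is a hypothetical helper, not an Arrow API, and an actual reader also has to handle offsets, nested children and other buffer types, none of which is shown here.

```python
import sys
from array import array


def read_int32_buffer(raw, source_is_little_endian):
    """Decode a raw Int32 buffer, byte-swapping if it is not in native order."""
    values = array("i")
    if values.itemsize != 4:
        raise RuntimeError("this sketch assumes a 4-byte C int")
    values.frombytes(raw)
    host_is_little_endian = sys.byteorder == "little"
    if source_is_little_endian != host_is_little_endian:
        values.byteswap()  # reverse the bytes of each 4-byte element in place
    return values.tolist()


# Three Int32 values as a big-endian producer would have written them.
big_endian_raw = (
    (1).to_bytes(4, "big")
    + (256).to_bytes(4, "big")
    + (-2).to_bytes(4, "big", signed=True)
)
print(read_int32_buffer(big_endian_raw, source_is_little_endian=False))  # [1, 256, -2]
```

The swap only happens when the producer's byte order differs from the host's, so data that is already native is returned untouched.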
See also
Flight RPC
Flight RPC Transport |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
gRPC transport (grpc:, grpc+tcp:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
gRPC domain socket transport (grpc+unix:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
gRPC + TLS transport (grpc+tls:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
UCX transport (ucx:) |
✓ |
Supported features in the gRPC transport
Flight RPC Feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
All RPC methods |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Authentication handlers |
✓ |
✓ |
✓ |
✓ (1) |
✓ |
|||
Call timeouts |
✓ |
✓ |
✓ |
✓ |
||||
Call cancellation |
✓ |
✓ |
✓ |
✓ |
||||
Concurrent client calls (2) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Custom middleware |
✓ |
✓ |
✓ |
✓ |
||||
RPC error codes |
✓ |
✓ |
✓ |
✓ |
✓ |
Supported features in the UCX transport
Flight RPC Feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
All RPC methods |
✓ (3) |
|||||||
Authentication handlers |
||||||||
Call timeouts |
||||||||
Call cancellation |
||||||||
Concurrent client calls |
✓ (4) |
|||||||
Custom middleware |
||||||||
RPC error codes |
✓ |
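Notes
(1) Supported using AspNetCore authentication handlers.
(2) Whether a single client can support multiple concurrent calls.
(3) Only DoExchange, DoGet, DoPut, and GetFlightInfo are supported.
(4) Each concurrent call is a separate connection to the server (unlike gRPC, where concurrent calls are multiplexed over a single connection). This generally provides better throughput but consumes more resources on both the server and the client.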
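The transport table above identifies each transport by the URI scheme of a Flight location. As a small sketch of how a client might tell them apart, the following standard-library Python maps the listed schemes to their transports. `FLIGHT_TRANSPORTS` and `describe_location` are hypothetical names, and the hosts, port numbers and socket path are made up for illustration.

```python
from urllib.parse import urlsplit

# URI scheme -> transport, as listed in the transport table above.
FLIGHT_TRANSPORTS = {
    "grpc": "gRPC",
    "grpc+tcp": "gRPC",
    "grpc+unix": "gRPC domain socket",
    "grpc+tls": "gRPC + TLS",
    "ucx": "UCX",
}


def describe_location(uri):
    """Return (transport, target) for a Flight location URI."""
    parts = urlsplit(uri)
    try:
        transport = FLIGHT_TRANSPORTS[parts.scheme]
    except KeyError:
        raise ValueError("unknown Flight scheme: %r" % parts.scheme) from None
    # Domain socket locations carry a filesystem path instead of host:port.
    target = parts.path if parts.scheme == "grpc+unix" else parts.netloc
    return transport, target


print(describe_location("grpc+tcp://localhost:8815"))     # ('gRPC', 'localhost:8815')
print(describe_location("grpc+unix:///tmp/flight.sock"))  # ('gRPC domain socket', '/tmp/flight.sock')
print(describe_location("grpc+tls://example.com:443"))    # ('gRPC + TLS', 'example.com:443')
```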
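See also
Flight SQL
Note
Flight SQL is still experimental.
Feature support refers to the client/server libraries only; databases that implement the Flight SQL protocol will, in turn, support or not support individual features.
Feature |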
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
BeginSavepoint |
✓ |
✓ |
||||||
BeginTransaction |
✓ |
✓ |
||||||
CancelQuery |
✓ |
✓ |
||||||
ClosePreparedStatement |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
CreatePreparedStatement |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
CreatePreparedSubstraitPlan |
✓ |
✓ |
||||||
EndSavepoint |
✓ |
✓ |
||||||
EndTransaction |
✓ |
✓ |
||||||
GetCatalogs |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetCrossReference |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetDbSchemas |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetExportedKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetImportedKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetPrimaryKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetSqlInfo |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetTables |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetTableTypes |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetXdbcTypeInfo |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
PreparedStatementQuery |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
PreparedStatementUpdate |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
StatementSubstraitPlan |
✓ |
✓ |
||||||
StatementQuery |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
StatementUpdate |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
C Data Interface
Feature |
C++ |
Python |
R |
Rust |
Go |
Java |
C/GLib |
Ruby |
Julia |
C# |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Schema export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Array export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Schema import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Array import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The C Data Interface specification.
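To make "schema export" in the table above more concrete, the sketch below mirrors the `ArrowSchema` struct from the C Data Interface with Python's standard `ctypes` module and fills it in for a single nullable Int32 field. It is an illustration under stated assumptions rather than how any listed library implements export: the field name `"x"` and the `release_schema` callback are made up, and the metadata pointer is left NULL because the binary metadata encoding is not modeled here.

```python
import ctypes


class ArrowSchema(ctypes.Structure):
    pass  # fields are assigned below because the struct refers to itself


# void (*release)(struct ArrowSchema*)
ReleaseFn = ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowSchema))

ArrowSchema._fields_ = [
    ("format", ctypes.c_char_p),
    ("name", ctypes.c_char_p),
    ("metadata", ctypes.c_char_p),
    ("flags", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowSchema))),
    ("dictionary", ctypes.POINTER(ArrowSchema)),
    ("release", ReleaseFn),
    ("private_data", ctypes.c_void_p),
]

ARROW_FLAG_NULLABLE = 2  # flag value defined by the C Data Interface


@ReleaseFn
def release_schema(schema_ptr):
    # A real producer would free children and private data here. Marking the
    # struct as released means setting its release callback to NULL.
    schema_ptr.contents.release = ReleaseFn()


schema = ArrowSchema()
schema.format = b"i"  # format string "i" denotes Int32
schema.name = b"x"    # hypothetical field name
schema.metadata = None
schema.flags = ARROW_FLAG_NULLABLE
schema.n_children = 0
schema.release = release_schema

print(schema.format, schema.name, schema.flags)  # b'i' b'x' 2

# The consumer calls release exactly once when it is done with the struct.
schema.release(ctypes.byref(schema))
print(bool(schema.release))  # False: the struct is now marked as released
```

A consumer importing such a struct reads `format`, `name` and `flags`, walks `children` if `n_children` is non-zero, and finally invokes `release`.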
C Stream Interface
Feature |
C++ |
Python |
R |
Rust |
Go |
Java |
C/GLib |
Ruby |
Julia |
C# |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Stream export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Stream import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The C Stream Interface specification.
Third-Party Data Formats
Format |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
Avro |
R |
R |
||||||
CSV |
R/W |
R (2) |
R/W |
R/W |
R/W |
|||
ORC |
R/W |
R (1) |
||||||
Parquet |
R/W |
R (2) |
R/W |
R/W |
Notes
R = Read supported
W = Write supported
(1) Through JNI bindings. (Provided by org.apache.arrow.orc:arrow-orc)
(2) Through JNI bindings to Arrow C++ Datasets. (Provided by org.apache.arrow:arrow-dataset)