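Implementation Status
The following tables summarize the features available in the various official Arrow libraries. All libraries currently follow version 1.0.0 of the Arrow format, or a later minor version that is compatible with 1.0.0. See Format Versioning and Stability for details on versioning. Unless otherwise stated, the Python, R, Ruby and C/GLib libraries follow the C++ Arrow library.
Data Types
Data type (primitive) |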
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Null |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Boolean |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Int8/16/32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
UInt8/16/32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Float16 |
✓ |
✓ (1) |
✓ |
✓ |
✓ (2) |
✓ |
✓ |
✓ |
|
Float32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Decimal32 |
✓ |
✓ |
✓ |
✓ |
|||||
Decimal64 |
✓ |
✓ |
✓ |
✓ |
|||||
Decimal128 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Decimal256 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Date32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Time32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Timestamp |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Duration |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Interval |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Fixed Size Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Large Binary |
✓ |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
|
Utf8 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Large Utf8 |
✓ |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
|
Binary View |
✓ |
✓ |
✓ |
✓ |
✓ |
||||
Utf8 View |
✓ |
✓ |
✓ |
✓ |
✓ |
Data type (nested) |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Fixed Size List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Large List |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
||
List View |
✓ |
✓ |
✓ |
✓ |
|||||
Large List View |
✓ |
✓ |
✓ |
||||||
Struct |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Map |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Dense Union |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Sparse Union |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Data type (special) |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Dictionary |
✓ |
✓ (3) |
✓ |
✓ |
✓ |
✓ (3) |
✓ |
✓ |
|
Extension |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Run-End Encoded |
✓ |
✓ |
✓ |
Canonical extension types |
C++ |
Java |
Go |
JavaScript |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
Fixed shape tensor |
✓ |
|||||||
Variable shape tensor |
||||||||
JSON |
✓ |
✓ |
||||||
Opaque |
✓ |
✓ |
✓ |
|||||
UUID |
✓ |
✓ |
||||||
8-bit Boolean |
✓ |
✓ |
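Notes
(1) Casting to and from Float16 is not supported in Java.
(2) Float16 is supported in C# only when targeting .NET 6+.
(3) Nested dictionaries are not supported.
(4) The C# large array types are provided to help with interoperability with other libraries, but they do not support buffers larger than 2 GiB, and an exception is raised on an attempt to import an array that is too large.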
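To make the 2 GiB figure in note (4) concrete: in the Arrow columnar format, Binary, Utf8 and List use 32-bit offsets, while their Large counterparts use 64-bit offsets. The sketch below is illustrative arithmetic only, written in Python for brevity. `fits_in_int32` is a hypothetical helper, not part of any Arrow library, and treating the limit as a signed 32-bit length is an assumption made for this example rather than something the table states.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit length or offset can hold
GIB = 2**30            # bytes in one GiB


def fits_in_int32(total_bytes: int) -> bool:
    """Return True if a buffer of this many bytes is addressable with int32.

    Hypothetical helper for illustration; no Arrow library exposes it.
    """
    return 0 <= total_bytes <= INT32_MAX


print(fits_in_int32(1 * GIB))  # True: well under the limit
print(fits_in_int32(2 * GIB))  # False: 2**31 exceeds INT32_MAX by one
```

Under that assumption, a buffer of exactly 2 GiB is already one byte past what a signed 32-bit length can describe, which is why data of that size needs the 64-bit Large types in implementations that fully support them.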
See also
The Arrow Columnar Format and the Canonical Extension Types specifications.
IPC Format
IPC feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Arrow stream format |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ (4) |
Arrow file format |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Record batches |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Dictionaries |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Replacement dictionaries |
✓ |
✓ |
✓ |
✓ |
|||||
Delta dictionaries |
✓ (1) |
✓ (1) |
✓ |
✓ |
✓ |
||||
Tensors |
✓ |
||||||||
Sparse tensors |
✓ |
||||||||
Buffer compression |
✓ |
✓ (3) |
✓ |
✓ |
✓ |
✓ |
|||
Endianness conversion |
✓ (2) |
✓ (2) |
✓ (2) |
||||||
Custom schema metadata |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
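Notes
(1) Delta dictionaries are not supported on nested dictionaries.
(2) Data with non-native endianness can be byte-swapped automatically when reading.
(3) The LZ4 codec is currently quite inefficient. ARROW-11901 tracks improving its performance.
(4) The nanoarrow IPC implementation supports reading IPC streams only.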
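As an illustration of what the byte swapping in note (2) amounts to, the following sketch converts a buffer of big-endian 32-bit integers to little-endian using only the Python standard library. It is not how any Arrow library implements the conversion; `swap_int32_buffer` is a hypothetical name, and real implementations must handle every fixed-width type and each buffer of an array.

```python
import struct


def swap_int32_buffer(buf: bytes) -> bytes:
    """Re-encode a buffer of big-endian int32 values as little-endian.

    Hypothetical helper for illustration only.
    """
    if len(buf) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4 bytes")
    count = len(buf) // 4
    values = struct.unpack(f">{count}i", buf)  # decode as big-endian int32
    return struct.pack(f"<{count}i", *values)  # encode as little-endian int32


big_endian = struct.pack(">3i", 1, 2, 3)
little_endian = swap_int32_buffer(big_endian)
print(struct.unpack("<3i", little_endian))  # (1, 2, 3)
```

The values are unchanged by the conversion; only the order of the bytes within each 4-byte element is reversed.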
Flight RPC
Flight RPC transport |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
gRPC transport (grpc:, grpc+tcp:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
gRPC domain socket transport (grpc+unix:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
gRPC + TLS transport (grpc+tls:) |
✓ |
✓ |
✓ |
✓ |
✓ |
Supported features in the gRPC transport:
Flight RPC feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
All RPC methods |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Authentication handlers |
✓ |
✓ |
✓ |
✓ (1) |
✓ |
|||
Call timeouts |
✓ |
✓ |
✓ |
✓ |
||||
Call cancellation |
✓ |
✓ |
✓ |
✓ |
||||
Concurrent client calls (2) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Custom middleware |
✓ |
✓ |
✓ |
✓ |
||||
RPC error codes |
✓ |
✓ |
✓ |
✓ |
✓ |
Notes
(1) Supported using AspNetCore authentication handlers.
(2) Whether a single client can support multiple concurrent calls.
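Flight SQL
Note
Flight SQL is still experimental.
Feature support here refers to the client/server libraries only; databases that implement the Flight SQL protocol will, in turn, support or not support individual features.
Feature |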
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
BeginSavepoint |
✓ |
✓ |
||||||
BeginTransaction |
✓ |
✓ |
||||||
CancelQuery |
✓ |
✓ |
||||||
ClosePreparedStatement |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
CreatePreparedStatement |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
CreatePreparedSubstraitPlan |
✓ |
✓ |
||||||
EndSavepoint |
✓ |
✓ |
||||||
EndTransaction |
✓ |
✓ |
||||||
GetCatalogs |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetCrossReference |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetDbSchemas |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetExportedKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetImportedKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetPrimaryKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetSqlInfo |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetTables |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetTableTypes |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetXdbcTypeInfo |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
PreparedStatementQuery |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
PreparedStatementUpdate |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
StatementSubstraitPlan |
✓ |
✓ |
||||||
StatementQuery |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
StatementUpdate |
✓ |
✓ |
✓ |
✓ |
✓ |
C Data Interface
Feature |
C++ |
Python |
R |
Rust |
Go |
Java |
C/GLib |
Ruby |
Julia |
C# |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Schema export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Array export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Schema import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Array import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The C Data Interface specification.
C Stream Interface
Feature |
C++ |
Python |
R |
Rust |
Go |
Java |
C/GLib |
Ruby |
Julia |
C# |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Stream export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Stream import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The C Stream Interface specification.
Third-Party Data Formats
Format |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
Avro |
R |
R |
||||||
CSV |
R/W |
R (2) |
R/W |
R/W |
R/W |
|||
ORC |
R/W |
R (1) |
||||||
Parquet |
R/W |
R (2) |
R/W |
R/W |
Notes
R = Read supported
W = Write supported
(1) Through JNI bindings. (Provided by org.apache.arrow.orc:arrow-orc)
(2) Through JNI bindings to Arrow C++ Datasets. (Provided by org.apache.arrow:arrow-dataset)