Arrow File I/O#
Apache Arrow provides file I/O functions so that you can work with Arrow across an application's lifecycle. In this article, you will learn how to:

Read an Arrow file into a RecordBatch, then write it back out
Read a CSV file into a Table, then write it back out
Read a Parquet file into a Table, then write it back out

Prerequisites#
Before continuing, make sure you have:

An installed Arrow, which you can set up here: Using Arrow C++ in your own project
An understanding of basic Arrow data structures from Basic Arrow Data Structures
A directory in which to run the final application; this program will generate some files, so be prepared for that.

Setup#
Before writing out some file I/O, we need to fill in a couple of gaps:

We need to include the necessary headers.
We need a main() function to glue everything together.
We need some files to play with.

Includes#
Before writing C++ code, we need some includes. We'll get iostream for output, then import Arrow's I/O functionality for each file type we'll work with in this article:
#include <arrow/api.h>
#include <arrow/csv/api.h>
#include <arrow/io/api.h>
#include <arrow/ipc/api.h>
#include <parquet/arrow/reader.h>
#include <parquet/arrow/writer.h>
#include <iostream>
Main()#
For our glue, we'll use the main() pattern from the previous tutorial on data structures:
int main() {
arrow::Status st = RunMain();
if (!st.ok()) {
std::cerr << st << std::endl;
return 1;
}
return 0;
}
Which, just like before, is paired with a RunMain():
arrow::Status RunMain() {
Generating Files for Reading#
We need some files to actually play with. In practice, you'll likely have some input for your own application. Here, however, we want to explore file I/O, so let's generate some files to make things easy to follow. To create those, we'll define a helper function that we'll run first. Feel free to read through it, but the concepts used will be explained later in this article. Note that we're reusing the day/month/year data from the previous tutorial. For now, just copy the function:
arrow::Status GenInitialFile() {
// Make a couple 8-bit integer arrays and a 16-bit integer array -- just like
// basic Arrow example.
arrow::Int8Builder int8builder;
int8_t days_raw[5] = {1, 12, 17, 23, 28};
ARROW_RETURN_NOT_OK(int8builder.AppendValues(days_raw, 5));
std::shared_ptr<arrow::Array> days;
ARROW_ASSIGN_OR_RAISE(days, int8builder.Finish());
int8_t months_raw[5] = {1, 3, 5, 7, 1};
ARROW_RETURN_NOT_OK(int8builder.AppendValues(months_raw, 5));
std::shared_ptr<arrow::Array> months;
ARROW_ASSIGN_OR_RAISE(months, int8builder.Finish());
arrow::Int16Builder int16builder;
int16_t years_raw[5] = {1990, 2000, 1995, 2000, 1995};
ARROW_RETURN_NOT_OK(int16builder.AppendValues(years_raw, 5));
std::shared_ptr<arrow::Array> years;
ARROW_ASSIGN_OR_RAISE(years, int16builder.Finish());
// Get a vector of our Arrays
std::vector<std::shared_ptr<arrow::Array>> columns = {days, months, years};
// Make a schema to initialize the Table with
std::shared_ptr<arrow::Field> field_day, field_month, field_year;
std::shared_ptr<arrow::Schema> schema;
field_day = arrow::field("Day", arrow::int8());
field_month = arrow::field("Month", arrow::int8());
field_year = arrow::field("Year", arrow::int16());
schema = arrow::schema({field_day, field_month, field_year});
// With the schema and data, create a Table
std::shared_ptr<arrow::Table> table;
table = arrow::Table::Make(schema, columns);
// Write out test files in IPC, CSV, and Parquet for the example to use.
std::shared_ptr<arrow::io::FileOutputStream> outfile;
ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_in.arrow"));
ARROW_ASSIGN_OR_RAISE(std::shared_ptr<arrow::ipc::RecordBatchWriter> ipc_writer,
arrow::ipc::MakeFileWriter(outfile, schema));
ARROW_RETURN_NOT_OK(ipc_writer->WriteTable(*table));
ARROW_RETURN_NOT_OK(ipc_writer->Close());
ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_in.csv"));
ARROW_ASSIGN_OR_RAISE(auto csv_writer,
arrow::csv::MakeCSVWriter(outfile, table->schema()));
ARROW_RETURN_NOT_OK(csv_writer->WriteTable(*table));
ARROW_RETURN_NOT_OK(csv_writer->Close());
ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_in.parquet"));
PARQUET_THROW_NOT_OK(
parquet::arrow::WriteTable(*table, arrow::default_memory_pool(), outfile, 5));
return arrow::Status::OK();
}
To get the rest of your code to work, make sure to call GenInitialFile() as the very first line of RunMain() to initialize the environment:
// Generate initial files for each format with a helper function -- don't worry,
// we'll also write a table in this example.
ARROW_RETURN_NOT_OK(GenInitialFile());
I/O with Arrow Files#
We're going to walk through this step by step, reading then writing:

Reading a file
Open the file
Bind the file to an ipc::RecordBatchFileReader
Read the file into a RecordBatch
Writing a file
Get an io::FileOutputStream
Write to the file from the RecordBatch
Opening a File#
To actually read a file, we need some way to point at it. In Arrow, that means getting an io::ReadableFile object; much like an ArrayBuilder can clear and make new arrays, we can rebind this object to new files, so we'll reuse this instance throughout the example.
// First, we have to set up a ReadableFile object, which just lets us point our
// readers to the right data on disk. We'll be reusing this object, and rebinding
// it to multiple files throughout the example.
std::shared_ptr<arrow::io::ReadableFile> infile;
An io::ReadableFile does little on its own; we actually bind it to a file with io::ReadableFile::Open(). For our purposes here, the default arguments are sufficient:
// Get "test_in.arrow" into our file pointer
ARROW_ASSIGN_OR_RAISE(infile, arrow::io::ReadableFile::Open(
"test_in.arrow", arrow::default_memory_pool()));
Opening an Arrow File Reader#
An io::ReadableFile is too generic to offer all the functionality needed to read an Arrow file. We use it to get an ipc::RecordBatchFileReader object, which implements all the logic needed to read an Arrow file with the correct formatting. We get one through ipc::RecordBatchFileReader::Open():
// Open up the file with the IPC features of the library, gives us a reader object.
ARROW_ASSIGN_OR_RAISE(auto ipc_reader, arrow::ipc::RecordBatchFileReader::Open(infile));
Reading an Opened Arrow File into a RecordBatch#
We have to use a RecordBatch to read an Arrow file, so we'll get one. Once we have it, we can actually read the file. Arrow files can contain multiple RecordBatches, so we must pass an index. This file only has one, so pass 0:
// Using the reader, we can read Record Batches. Note that this is specific to IPC;
// for other formats, we focus on Tables, but here, RecordBatches are used.
std::shared_ptr<arrow::RecordBatch> rbatch;
ARROW_ASSIGN_OR_RAISE(rbatch, ipc_reader->ReadRecordBatch(0));
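For a file with more than one batch, ipc::RecordBatchFileReader::num_record_batches() reports how many the file contains, so a loop can visit them all. A minimal sketch, continuing from the ipc_reader above (our test file happens to hold just one batch):

```cpp
// Read every RecordBatch in the file, not just the first one.
// num_record_batches() reports how many batches the file's footer lists.
for (int i = 0; i < ipc_reader->num_record_batches(); i++) {
  std::shared_ptr<arrow::RecordBatch> batch;
  ARROW_ASSIGN_OR_RAISE(batch, ipc_reader->ReadRecordBatch(i));
  std::cout << "Batch " << i << ": " << batch->num_rows() << " rows" << std::endl;
}
```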
Prepare a FileOutputStream#
For output, we need an io::FileOutputStream. Just like our io::ReadableFile, we'll be reusing it, so be prepared for that. We open files for output the same way as for reading:
// Just like with input, we get an object for the output file.
std::shared_ptr<arrow::io::FileOutputStream> outfile;
// Bind it to "test_out.arrow"
ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_out.arrow"));
Write Arrow File from RecordBatch#
Now, we grab the data we read into our RecordBatch earlier and use it, along with our target file, to create an ipc::RecordBatchWriter. The ipc::RecordBatchWriter needs two things:

the target file
the Schema of our RecordBatch (in case we need to write more RecordBatches of the same format)

The Schema comes from our existing RecordBatch, and the target file is the output stream we just created:
// Set up a writer with the output file -- and the schema! We're defining everything
// here, loading to fire.
ARROW_ASSIGN_OR_RAISE(std::shared_ptr<arrow::ipc::RecordBatchWriter> ipc_writer,
arrow::ipc::MakeFileWriter(outfile, rbatch->schema()));
We can just call ipc::RecordBatchWriter::WriteRecordBatch() with our RecordBatch to fill up our file:
// Write the record batch.
ARROW_RETURN_NOT_OK(ipc_writer->WriteRecordBatch(*rbatch));
For IPC in particular, the writer has to be closed, since it anticipates that more than one batch may be written. To do that:
// Specifically for IPC, the writer needs to be explicitly closed.
ARROW_RETURN_NOT_OK(ipc_writer->Close());
And now we've read and written an IPC file!
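Because the writer stays open until Close(), a single ipc::RecordBatchWriter can accept several batches in a row, with each WriteRecordBatch() call appending one more. A hedged sketch; the file name "test_multi.arrow" and the repeated writes of the same batch are purely illustrative:

```cpp
// An IPC file can hold many batches: keep calling WriteRecordBatch()
// before Close(), and each call appends another batch to the file.
ARROW_ASSIGN_OR_RAISE(auto multi_out,
                      arrow::io::FileOutputStream::Open("test_multi.arrow"));
ARROW_ASSIGN_OR_RAISE(auto multi_writer,
                      arrow::ipc::MakeFileWriter(multi_out, rbatch->schema()));
for (int i = 0; i < 3; i++) {
  // Re-writing the same batch three times, just to show repeated appends.
  ARROW_RETURN_NOT_OK(multi_writer->WriteRecordBatch(*rbatch));
}
ARROW_RETURN_NOT_OK(multi_writer->Close());
```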
I/O with CSV#
We're going to walk through this step by step, reading then writing:

Reading a file
Open the file
Prepare a Table
Read the file with a csv::TableReader
Writing a file
Get an io::FileOutputStream
Write to the file from the Table

Opening a CSV File#
For a CSV file, we need to open an io::ReadableFile, just like for an Arrow file, and we reuse our io::ReadableFile object from before to do so:
// Bind our input file to "test_in.csv"
ARROW_ASSIGN_OR_RAISE(infile, arrow::io::ReadableFile::Open("test_in.csv"));
Preparing a Table#
CSV can be read into a Table, so declare a pointer to a Table:
std::shared_ptr<arrow::Table> csv_table;
Read a CSV File into a Table#
The CSV reader has option structs that need to be passed; fortunately, we can just use the defaults for all of them. For a reference on other options, go here: File Formats. Without any special delimiters, and with a small file, we can make our reader with the defaults:
// The CSV reader has several objects for various options. For now, we'll use defaults.
ARROW_ASSIGN_OR_RAISE(
auto csv_reader,
arrow::csv::TableReader::Make(
arrow::io::default_io_context(), infile, arrow::csv::ReadOptions::Defaults(),
arrow::csv::ParseOptions::Defaults(), arrow::csv::ConvertOptions::Defaults()));
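When the defaults don't fit, the option structs can be adjusted before being handed to Make(). As a hedged illustration only (our test_in.csv is comma-delimited with a header row, so these exact settings don't apply to it), a semicolon-delimited, headerless file might be configured like this:

```cpp
// Customize the option structs instead of taking the defaults.
auto read_options = arrow::csv::ReadOptions::Defaults();
// No header row in the file, so supply the column names ourselves.
read_options.column_names = {"Day", "Month", "Year"};
auto parse_options = arrow::csv::ParseOptions::Defaults();
parse_options.delimiter = ';';  // semicolon-separated rather than commas
auto convert_options = arrow::csv::ConvertOptions::Defaults();
// Only materialize the columns we care about.
convert_options.include_columns = {"Day", "Year"};
ARROW_ASSIGN_OR_RAISE(
    auto custom_reader,
    arrow::csv::TableReader::Make(arrow::io::default_io_context(), infile,
                                  read_options, parse_options, convert_options));
```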
With the CSV reader ready, we can use its csv::TableReader::Read() method to fill our Table:
// Read the table.
ARROW_ASSIGN_OR_RAISE(csv_table, csv_reader->Read());
Write a CSV File from a Table#
CSV writing to a Table looks exactly like IPC writing to a RecordBatch, except with our Table, and using ipc::RecordBatchWriter::WriteTable() instead of ipc::RecordBatchWriter::WriteRecordBatch(). Note that the same writer class is used; we write with ipc::RecordBatchWriter::WriteTable() because we have a Table. We'll target a file, use our Table's Schema, and then write the Table:
// Bind our output file to "test_out.csv"
ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_out.csv"));
// The CSV writer has simpler defaults, review API documentation for more complex usage.
ARROW_ASSIGN_OR_RAISE(auto csv_writer,
arrow::csv::MakeCSVWriter(outfile, csv_table->schema()));
ARROW_RETURN_NOT_OK(csv_writer->WriteTable(*csv_table));
// Not necessary, but a safe practice.
ARROW_RETURN_NOT_OK(csv_writer->Close());
And now we've read and written a CSV file!
File I/O with Parquet#
We're going to walk through this step by step, reading then writing.
Opening a Parquet File#
Once more, this file format, Parquet, needs an io::ReadableFile, which we already have, and for us to call io::ReadableFile::Open() on a file:
// Bind our input file to "test_in.parquet"
ARROW_ASSIGN_OR_RAISE(infile, arrow::io::ReadableFile::Open("test_in.parquet"));
Setting up a Parquet Reader#
As always, we need a Reader to actually read the file. We've been getting Readers for each file format from the Arrow namespace. This time, we enter the Parquet namespace to get a parquet::arrow::FileReader:
std::unique_ptr<parquet::arrow::FileReader> reader;
Now, to set up our reader, we call parquet::arrow::OpenFile(). Yes, this is necessary even though we already used io::ReadableFile::Open(). Note that we assign the resulting reader with PARQUET_ASSIGN_OR_THROW, since the Parquet side of the library throws exceptions on failure rather than returning a Status:
// Note that Parquet's OpenFile() returns a Result holding the reader;
// PARQUET_ASSIGN_OR_THROW throws on failure instead of returning a Status.
PARQUET_ASSIGN_OR_THROW(reader,
parquet::arrow::OpenFile(infile, arrow::default_memory_pool()));
Reading a Parquet File into a Table#
With the prepared parquet::arrow::FileReader in hand, we can read into a Table; note that we must pass the Table by reference instead of receiving it as output:
std::shared_ptr<arrow::Table> parquet_table;
// Read the table.
PARQUET_THROW_NOT_OK(reader->ReadTable(&parquet_table));
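ReadTable() also has an overload that takes column indices, letting the reader skip decoding columns we don't need. A hedged sketch, reusing the reader from above:

```cpp
// Read only columns 0 and 2 ("Day" and "Year") by index; the skipped
// column is never decoded.
std::shared_ptr<arrow::Table> partial_table;
PARQUET_THROW_NOT_OK(reader->ReadTable({0, 2}, &partial_table));
```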
Writing a Parquet File from a Table#
For a single write, writing a Parquet file does not need a writer object. Instead, we hand it our Table, point it at the memory pool it will use for any necessary memory consumption, tell it where to write, and give it a chunk size in case it needs to break up the file at all:
// Parquet writing does not need a declared writer object. Just get the output
// file bound, then pass in the table, memory pool, output, and chunk size for
// breaking up the Table on-disk.
ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_out.parquet"));
PARQUET_THROW_NOT_OK(parquet::arrow::WriteTable(
*parquet_table, arrow::default_memory_pool(), outfile, 5));
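parquet::arrow::WriteTable() additionally accepts a parquet::WriterProperties argument for on-disk settings such as compression. A hedged sketch; it assumes your Arrow build includes Snappy support, and "test_out_compressed.parquet" is just an illustrative name:

```cpp
// Build WriterProperties to control the on-disk encoding -- here, Snappy
// compression. build() finalizes the builder into a properties object.
std::shared_ptr<parquet::WriterProperties> props =
    parquet::WriterProperties::Builder()
        .compression(parquet::Compression::SNAPPY)
        ->build();
ARROW_ASSIGN_OR_RAISE(
    outfile, arrow::io::FileOutputStream::Open("test_out_compressed.parquet"));
PARQUET_THROW_NOT_OK(parquet::arrow::WriteTable(
    *parquet_table, arrow::default_memory_pool(), outfile, 5, props));
```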
Ending the Program#
At the end, we just return Status::OK() so main() knows that we're done and that everything's okay, just like in the first tutorial:
return arrow::Status::OK();
}
With that, you have read and written IPC, CSV, and Parquet files with Arrow, and can properly load data and write output! Now, we can move on to processing data with compute functions in the next article.
Refer to the below for a copy of the complete code:
// (Doc section: Includes)
#include <arrow/api.h>
#include <arrow/csv/api.h>
#include <arrow/io/api.h>
#include <arrow/ipc/api.h>
#include <parquet/arrow/reader.h>
#include <parquet/arrow/writer.h>

#include <iostream>
// (Doc section: Includes)

// (Doc section: GenInitialFile)
arrow::Status GenInitialFile() {
  // Make a couple 8-bit integer arrays and a 16-bit integer array -- just like
  // basic Arrow example.
  arrow::Int8Builder int8builder;
  int8_t days_raw[5] = {1, 12, 17, 23, 28};
  ARROW_RETURN_NOT_OK(int8builder.AppendValues(days_raw, 5));
  std::shared_ptr<arrow::Array> days;
  ARROW_ASSIGN_OR_RAISE(days, int8builder.Finish());

  int8_t months_raw[5] = {1, 3, 5, 7, 1};
  ARROW_RETURN_NOT_OK(int8builder.AppendValues(months_raw, 5));
  std::shared_ptr<arrow::Array> months;
  ARROW_ASSIGN_OR_RAISE(months, int8builder.Finish());

  arrow::Int16Builder int16builder;
  int16_t years_raw[5] = {1990, 2000, 1995, 2000, 1995};
  ARROW_RETURN_NOT_OK(int16builder.AppendValues(years_raw, 5));
  std::shared_ptr<arrow::Array> years;
  ARROW_ASSIGN_OR_RAISE(years, int16builder.Finish());

  // Get a vector of our Arrays
  std::vector<std::shared_ptr<arrow::Array>> columns = {days, months, years};

  // Make a schema to initialize the Table with
  std::shared_ptr<arrow::Field> field_day, field_month, field_year;
  std::shared_ptr<arrow::Schema> schema;

  field_day = arrow::field("Day", arrow::int8());
  field_month = arrow::field("Month", arrow::int8());
  field_year = arrow::field("Year", arrow::int16());

  schema = arrow::schema({field_day, field_month, field_year});
  // With the schema and data, create a Table
  std::shared_ptr<arrow::Table> table;
  table = arrow::Table::Make(schema, columns);

  // Write out test files in IPC, CSV, and Parquet for the example to use.
  std::shared_ptr<arrow::io::FileOutputStream> outfile;
  ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_in.arrow"));
  ARROW_ASSIGN_OR_RAISE(std::shared_ptr<arrow::ipc::RecordBatchWriter> ipc_writer,
                        arrow::ipc::MakeFileWriter(outfile, schema));
  ARROW_RETURN_NOT_OK(ipc_writer->WriteTable(*table));
  ARROW_RETURN_NOT_OK(ipc_writer->Close());

  ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_in.csv"));
  ARROW_ASSIGN_OR_RAISE(auto csv_writer,
                        arrow::csv::MakeCSVWriter(outfile, table->schema()));
  ARROW_RETURN_NOT_OK(csv_writer->WriteTable(*table));
  ARROW_RETURN_NOT_OK(csv_writer->Close());

  ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_in.parquet"));
  PARQUET_THROW_NOT_OK(
      parquet::arrow::WriteTable(*table, arrow::default_memory_pool(), outfile, 5));

  return arrow::Status::OK();
}
// (Doc section: GenInitialFile)

// (Doc section: RunMain)
arrow::Status RunMain() {
  // (Doc section: RunMain)
  // (Doc section: Gen Files)
  // Generate initial files for each format with a helper function -- don't worry,
  // we'll also write a table in this example.
  ARROW_RETURN_NOT_OK(GenInitialFile());
  // (Doc section: Gen Files)

  // (Doc section: ReadableFile Definition)
  // First, we have to set up a ReadableFile object, which just lets us point our
  // readers to the right data on disk. We'll be reusing this object, and rebinding
  // it to multiple files throughout the example.
  std::shared_ptr<arrow::io::ReadableFile> infile;
  // (Doc section: ReadableFile Definition)
  // (Doc section: Arrow ReadableFile Open)
  // Get "test_in.arrow" into our file pointer
  ARROW_ASSIGN_OR_RAISE(infile, arrow::io::ReadableFile::Open(
                                    "test_in.arrow", arrow::default_memory_pool()));
  // (Doc section: Arrow ReadableFile Open)
  // (Doc section: Arrow Read Open)
  // Open up the file with the IPC features of the library, gives us a reader object.
  ARROW_ASSIGN_OR_RAISE(auto ipc_reader, arrow::ipc::RecordBatchFileReader::Open(infile));
  // (Doc section: Arrow Read Open)
  // (Doc section: Arrow Read)
  // Using the reader, we can read Record Batches. Note that this is specific to IPC;
  // for other formats, we focus on Tables, but here, RecordBatches are used.
  std::shared_ptr<arrow::RecordBatch> rbatch;
  ARROW_ASSIGN_OR_RAISE(rbatch, ipc_reader->ReadRecordBatch(0));
  // (Doc section: Arrow Read)

  // (Doc section: Arrow Write Open)
  // Just like with input, we get an object for the output file.
  std::shared_ptr<arrow::io::FileOutputStream> outfile;
  // Bind it to "test_out.arrow"
  ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_out.arrow"));
  // (Doc section: Arrow Write Open)
  // (Doc section: Arrow Writer)
  // Set up a writer with the output file -- and the schema! We're defining everything
  // here, loading to fire.
  ARROW_ASSIGN_OR_RAISE(std::shared_ptr<arrow::ipc::RecordBatchWriter> ipc_writer,
                        arrow::ipc::MakeFileWriter(outfile, rbatch->schema()));
  // (Doc section: Arrow Writer)
  // (Doc section: Arrow Write)
  // Write the record batch.
  ARROW_RETURN_NOT_OK(ipc_writer->WriteRecordBatch(*rbatch));
  // (Doc section: Arrow Write)
  // (Doc section: Arrow Close)
  // Specifically for IPC, the writer needs to be explicitly closed.
  ARROW_RETURN_NOT_OK(ipc_writer->Close());
  // (Doc section: Arrow Close)

  // (Doc section: CSV Read Open)
  // Bind our input file to "test_in.csv"
  ARROW_ASSIGN_OR_RAISE(infile, arrow::io::ReadableFile::Open("test_in.csv"));
  // (Doc section: CSV Read Open)
  // (Doc section: CSV Table Declare)
  std::shared_ptr<arrow::Table> csv_table;
  // (Doc section: CSV Table Declare)
  // (Doc section: CSV Reader Make)
  // The CSV reader has several objects for various options. For now, we'll use defaults.
  ARROW_ASSIGN_OR_RAISE(
      auto csv_reader,
      arrow::csv::TableReader::Make(
          arrow::io::default_io_context(), infile, arrow::csv::ReadOptions::Defaults(),
          arrow::csv::ParseOptions::Defaults(), arrow::csv::ConvertOptions::Defaults()));
  // (Doc section: CSV Reader Make)
  // (Doc section: CSV Read)
  // Read the table.
  ARROW_ASSIGN_OR_RAISE(csv_table, csv_reader->Read());
  // (Doc section: CSV Read)

  // (Doc section: CSV Write)
  // Bind our output file to "test_out.csv"
  ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_out.csv"));
  // The CSV writer has simpler defaults, review API documentation for more complex usage.
  ARROW_ASSIGN_OR_RAISE(auto csv_writer,
                        arrow::csv::MakeCSVWriter(outfile, csv_table->schema()));
  ARROW_RETURN_NOT_OK(csv_writer->WriteTable(*csv_table));
  // Not necessary, but a safe practice.
  ARROW_RETURN_NOT_OK(csv_writer->Close());
  // (Doc section: CSV Write)

  // (Doc section: Parquet Read Open)
  // Bind our input file to "test_in.parquet"
  ARROW_ASSIGN_OR_RAISE(infile, arrow::io::ReadableFile::Open("test_in.parquet"));
  // (Doc section: Parquet Read Open)
  // (Doc section: Parquet FileReader)
  std::unique_ptr<parquet::arrow::FileReader> reader;
  // (Doc section: Parquet FileReader)
  // (Doc section: Parquet OpenFile)
  // Note that Parquet's OpenFile() returns a Result holding the reader;
  // PARQUET_ASSIGN_OR_THROW throws on failure instead of returning a Status.
  PARQUET_ASSIGN_OR_THROW(reader,
                          parquet::arrow::OpenFile(infile, arrow::default_memory_pool()));
  // (Doc section: Parquet OpenFile)

  // (Doc section: Parquet Read)
  std::shared_ptr<arrow::Table> parquet_table;
  // Read the table.
  PARQUET_THROW_NOT_OK(reader->ReadTable(&parquet_table));
  // (Doc section: Parquet Read)

  // (Doc section: Parquet Write)
  // Parquet writing does not need a declared writer object. Just get the output
  // file bound, then pass in the table, memory pool, output, and chunk size for
  // breaking up the Table on-disk.
  ARROW_ASSIGN_OR_RAISE(outfile, arrow::io::FileOutputStream::Open("test_out.parquet"));
  PARQUET_THROW_NOT_OK(parquet::arrow::WriteTable(
      *parquet_table, arrow::default_memory_pool(), outfile, 5));
  // (Doc section: Parquet Write)
  // (Doc section: Return)
  return arrow::Status::OK();
}
// (Doc section: Return)

// (Doc section: Main)
int main() {
  arrow::Status st = RunMain();
  if (!st.ok()) {
    std::cerr << st << std::endl;
    return 1;
  }
  return 0;
}
// (Doc section: Main)