Converting ranges of std::tuple-like objects to Table instances

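While the example above shows a fairly manual approach to row-to-columnar conversion, Arrow also provides template logic for converting ranges of std::tuple<..>-like objects to tables.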

In the simplest case, you only need to provide the input data; the type conversion is then inferred at compile time.

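std::vector<std::tuple<double, std::string>> rows = ..
// Column names, one per tuple element and in tuple order.
// The names used here are placeholders for illustration.
std::vector<std::string> names = {"value", "label"};
std::shared_ptr<arrow::Table> table;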

if (!arrow::stl::TableFromTupleRange(
      arrow::default_memory_pool(),
      rows, names, &table).ok()
) {
  // Error handling code should go here.
}
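To make "inferred at compile time" concrete, here is a minimal standard-library sketch of the same idea: the element types of the tuple are deduced from the row type, and each row is appended to one column per tuple position. This is not Arrow's implementation. `Columns`, `AppendTuple` and `ColumnsFromTupleRange` are hypothetical names, and plain `std::vector` columns stand in for Arrow builders. It assumes a C++17 compiler (fold expressions).

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical helper type, not part of Arrow: one std::vector per tuple element.
template <typename... Ts>
using Columns = std::tuple<std::vector<Ts>...>;

// Append each element of `row` to the column at the same position.
template <typename... Ts, std::size_t... Is>
void AppendTuple(Columns<Ts...>& columns, const std::tuple<Ts...>& row,
                 std::index_sequence<Is...>) {
  (std::get<Is>(columns).push_back(std::get<Is>(row)), ...);
}

// Ts... is deduced from the row type, so the caller never spells out
// the column types.
template <typename... Ts>
Columns<Ts...> ColumnsFromTupleRange(
    const std::vector<std::tuple<Ts...>>& rows) {
  Columns<Ts...> columns;
  for (const auto& row : rows) {
    AppendTuple(columns, row, std::index_sequence_for<Ts...>{});
  }
  return columns;
}
```

Calling `ColumnsFromTupleRange(rows)` with a `std::vector<std::tuple<double, std::string>>` yields a `std::tuple<std::vector<double>, std::vector<std::string>>` without any explicit type arguments.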

In the other direction, you can use TupleRangeFromTable to fill an already pre-allocated range with the data from a Table instance.

// An important aspect here is that the table columns need to be in the
// same order as the columns will later appear in the tuple. As the tuple
// is unnamed, matching is done on positions.
std::shared_ptr<Table> table = ..

// The range needs to be pre-allocated to the respective amount of rows.
// This allows us to pass in an arbitrary range object, not only
// `std::vector`.
std::vector<std::tuple<double, std::string>> rows(2);
if (!arrow::stl::TupleRangeFromTable(*table, &rows).ok()) {
  // Error handling code should go here.
}
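The two points made in the comments above, matching by position and filling a pre-allocated range of any container type, can be sketched with the standard library alone. This is an illustration rather than Arrow's code. `FillTupleRange` and its helpers are hypothetical names, `std::vector` columns stand in for table columns, and returning `false` on a row-count mismatch is an assumption of this sketch, not a statement about Arrow's behavior. It assumes C++17.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// True when every column holds exactly `n` values.
template <typename... Ts>
bool AllColumnsHaveLength(const std::tuple<std::vector<Ts>...>& columns,
                          std::size_t n) {
  return std::apply(
      [n](const auto&... column) { return ((column.size() == n) && ...); },
      columns);
}

// Copy the i-th value of every column into the row element at the same
// position. There are no names involved: column k goes to element k.
template <typename Row, typename... Ts, std::size_t... Is>
void FillRow(const std::tuple<std::vector<Ts>...>& columns, std::size_t i,
             Row& row, std::index_sequence<Is...>) {
  ((std::get<Is>(row) = std::get<Is>(columns)[i]), ...);
}

// Hypothetical helper, not an Arrow API. `Range` can be any iterable of
// tuple-like rows that the caller has already sized.
template <typename Range, typename... Ts>
bool FillTupleRange(const std::tuple<std::vector<Ts>...>& columns,
                    Range* rows) {
  const auto n_rows = static_cast<std::size_t>(
      std::distance(std::begin(*rows), std::end(*rows)));
  if (!AllColumnsHaveLength(columns, n_rows)) {
    return false;  // assumption of this sketch: sizes must match exactly
  }
  std::size_t i = 0;
  for (auto& row : *rows) {
    FillRow(columns, i++, row, std::index_sequence_for<Ts...>{});
  }
  return true;
}
```

Because the function only iterates over `*rows` and assigns through `std::get`, it would accept containers other than `std::vector`, which is the reason for pre-allocating the range instead of letting the function resize it.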

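Arrow itself already supports some C(++) data types for this conversion. If you want to support additional data types, you need to implement a specialization of arrow::stl::ConversionTraits<T> and the more general arrow::CTypeTraits<T>.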

namespace arrow {

template<>
struct CTypeTraits<boost::posix_time::ptime> {
  using ArrowType = ::arrow::TimestampType;

  static std::shared_ptr<::arrow::DataType> type_singleton() {
    return ::arrow::timestamp(::arrow::TimeUnit::MICRO);
  }
};

}

namespace arrow { namespace stl {

template <>
struct ConversionTraits<boost::posix_time::ptime> : public CTypeTraits<boost::posix_time::ptime> {
  constexpr static bool nullable = false;

  // This is the specialization to load a scalar value into an Arrow builder.
  static Status AppendRow(
        typename TypeTraits<TimestampType>::BuilderType& builder,
        boost::posix_time::ptime cell) {
    boost::posix_time::ptime const epoch({1970, 1, 1}, {0, 0, 0, 0});
    return builder.Append((cell - epoch).total_microseconds());
  }

  // Specify how we can fill the tuple from the values stored in the Arrow
  // array.
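  static boost::posix_time::ptime GetEntry(
        const TimestampArray& array, size_t j) {
    // Mirror AppendRow: the stored value is the number of microseconds
    // since the Unix epoch, so add it back as an explicit microsecond
    // duration instead of relying on Boost's configured tick resolution.
    boost::posix_time::ptime const epoch({1970, 1, 1}, {0, 0, 0, 0});
    return epoch + boost::posix_time::microseconds(array.Value(j));
  }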
};

}}
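The key property of such a specialization is that AppendRow and GetEntry are exact inverses and agree with the unit declared in type_singleton() (microseconds here). The sketch below shows that round trip with std::chrono instead of Boost, so it compiles without extra dependencies. The function names are hypothetical, and it assumes that std::chrono::system_clock counts from 1970-01-01, which is guaranteed only from C++20 onward (though common in practice before that).

```cpp
#include <chrono>
#include <cstdint>

using TimePoint = std::chrono::system_clock::time_point;

// Encode direction (the role AppendRow plays): time point to
// microseconds since the clock's epoch.
std::int64_t ToMicrosSinceEpoch(TimePoint t) {
  return std::chrono::duration_cast<std::chrono::microseconds>(
             t.time_since_epoch())
      .count();
}

// Decode direction (the role GetEntry plays): microseconds since the
// epoch back to a time point.
TimePoint FromMicrosSinceEpoch(std::int64_t micros) {
  return TimePoint(std::chrono::duration_cast<TimePoint::duration>(
      std::chrono::microseconds(micros)));
}
```

If the two directions used different units, for example appending microseconds but reading the stored value back as milliseconds, the conversion would still compile and would silently return wrong timestamps, so a round-trip test is a cheap safeguard.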