Pandas 2.0 and the Arrow revolution
Date: 3/1/2023 · Tags: #python, #news
Pandas 2.0 with the new Arrow backend shows significant performance improvements on various operations.
Apache Arrow (including Arrow Flight) is eating the data science and OLAP world: Pandas, Polars, DuckDB, InfluxDB IOx, Databend ...
All problems in computer science can be solved by another level of indirection --- David Wheeler, as often quoted by Butler Lampson
One interesting point: in the future, constructing or converting containers among Pandas, Polars and DuckDB could become cheaper, or even zero-copy, because they can share the same low-level data types and memory layout.
Refs:
- Pandas 2.0 and the Arrow revolution (part I)
- Apache Arrow and the "10 Things I Hate About pandas"
- Debatable benchmark from Ritchie Vink
Updated: 2023-04-20