Apache Iceberg version
main (development)
Please describe the bug 🐞
PyIceberg assumes that the same filesystem (FS) implementation is used for reading both metadata and data files.
However, I want to use a catalog whose warehouse is on the local FS while the data files it references live on S3.
See this example Jupyter notebook to reproduce
Problem
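The FS implementation is determined once from the table location stored in the metadata (table_metadata.location) and is then passed down to the function that reads the data files, regardless of where those files actually live.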
    scheme, netloc, _ = PyArrowFileIO.parse_location(table_metadata.location)
    if isinstance(io, PyArrowFileIO):
        fs = io.fs_by_scheme(scheme, netloc)
Possible solution
Determine the FS implementation from the path of each file being read, rather than once from the table location.
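A rough sketch of what I mean, using only the standard library. This is not PyIceberg code: `parse_location` here is a simplified stand-in for `PyArrowFileIO.parse_location`, and `PerFileResolver` and `fs_factory` are hypothetical names made up for illustration. The idea is to resolve the filesystem per file path and cache it by (scheme, netloc) so the same filesystem is reused for files in the same bucket.

```python
from typing import Any, Callable, Dict, Tuple
from urllib.parse import urlparse


def parse_location(location: str) -> Tuple[str, str, str]:
    """Split a location into (scheme, netloc, path).

    Simplified stand-in: a location without a scheme is treated as a local file.
    """
    uri = urlparse(location)
    if not uri.scheme:
        return "file", "", location
    return uri.scheme, uri.netloc, uri.path


class PerFileResolver:
    """Hypothetical helper: picks a filesystem for each file instead of once per table."""

    def __init__(self, fs_factory: Callable[[str, str], Any]) -> None:
        # fs_factory is an assumed callback that builds a filesystem
        # for a given (scheme, netloc), e.g. something like io.fs_by_scheme.
        self._fs_factory = fs_factory
        self._cache: Dict[Tuple[str, str], Any] = {}

    def resolve(self, file_path: str) -> Tuple[Any, str]:
        """Return (filesystem, path) for this specific file."""
        scheme, netloc, path = parse_location(file_path)
        key = (scheme, netloc)
        if key not in self._cache:
            # Build the filesystem only once per (scheme, netloc) pair.
            self._cache[key] = self._fs_factory(scheme, netloc)
        return self._cache[key], path
```

With something like this, the metadata file under a local warehouse and a data file on S3 would each resolve to their own filesystem, instead of both inheriting the one derived from `table_metadata.location`.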