Feature Request / Improvement
Currently, `overwrite` is implemented as a delete followed by an append:
```python
self.delete(delete_filter=overwrite_filter, snapshot_properties=snapshot_properties)

with self.update_snapshot(snapshot_properties=snapshot_properties).fast_append() as update_snapshot:
    # skip writing data files if the dataframe is empty
    if df.shape[0] > 0:
        data_files = _dataframe_to_data_files(
            table_metadata=self.table_metadata, write_uuid=update_snapshot.commit_uuid, df=df, io=self._table.io
        )
        for data_file in data_files:
            update_snapshot.append_data_file(data_file)
```
As an optimization, we could support dynamic overwrite for the case where an entire partition is replaced.
Here's an example from @koenvo:
https://gist.github.com/koenvo/e23bfab32c7e7810eb52f82c6304fc22
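
To make the intended semantics concrete, here is a toy sketch in plain Python. It is not the PyIceberg API and is not taken from the gist above: the table is modelled as a dict mapping a partition tuple to a list of rows, and `partition_key` and `dynamic_overwrite` are hypothetical names used only for illustration. The idea, as I understand it, is that only the partitions present in the incoming data are replaced wholesale, and all other partitions are left untouched.

```python
from collections import defaultdict


def partition_key(row, partition_cols):
    """Return the partition tuple for a row (hypothetical helper)."""
    return tuple(row[col] for col in partition_cols)


def dynamic_overwrite(table, new_rows, partition_cols):
    """Replace every partition that appears in new_rows; keep the rest.

    `table` maps partition tuple -> list of rows. This is a stand-in for
    the data files of a partitioned table, not a real Iceberg structure.
    """
    # Group the incoming rows by the partition they belong to.
    incoming = defaultdict(list)
    for row in new_rows:
        incoming[partition_key(row, partition_cols)].append(row)

    # Keep partitions the incoming data does not touch.
    result = {key: rows for key, rows in table.items() if key not in incoming}
    # Touched partitions are replaced wholesale; new partitions are added.
    result.update(incoming)
    return result


table = {
    ("2024-01-01",): [{"day": "2024-01-01", "value": 1}],
    ("2024-01-02",): [{"day": "2024-01-02", "value": 2}],
}
new_rows = [
    {"day": "2024-01-02", "value": 20},
    {"day": "2024-01-03", "value": 30},
]
result = dynamic_overwrite(table, new_rows, ["day"])
```

In this sketch the `2024-01-01` partition survives unchanged, `2024-01-02` is replaced, and `2024-01-03` is added. No row-level filter is evaluated, which is where the saving over the generic delete + append path would be expected to come from.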