Feature Request / Improvement
A common feedback ive heard about our testing infra is that the integration tests often fails when running on different environments. I think a big reason for that is how we run tests using spark.
There are 2 places where we run spark queries:
provision.py is ran inside a docker container so the env is consistent.
pytests are ran locally and will install spark (via pyspark) and its packages on the machine running these tests. This step is brittle.
#2491 changes pytest to use spark connect which will run all spark queries based on the spark server env that is inside the docker container.