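Scenario 2: df.repartition(5).write.partitionBy("col5").format("parquet").save("s3_path_2"). Note that partitionBy is a method of the DataFrameWriter returned by df.write, not of the DataFrame, so it must come after .write. Tension: repartition(5) redistributes the data across 5 in-memory partitions before the write. Outcome: each col5=<value> directory on S3 will contain between 1 and 5 part-files, because each of the 5 write tasks produces at most one file per distinct col5 value it holds.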
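The 1-to-5 bound can be checked with a small pure-Python simulation. This is a sketch of a simplified model, not Spark itself: it assumes rows are dealt round-robin to the 5 tasks starting at task 0, and that each task writes exactly one file per distinct col5 value it holds (no file-size or record-count splitting). The function name `simulate_part_files` and the sample values are hypothetical.

```python
from collections import defaultdict


def simulate_part_files(col5_values, num_partitions=5):
    """Count part-files per col5 directory under a simplified model.

    Assumptions (a model, not Spark's actual implementation):
    - rows are assigned round-robin to `num_partitions` tasks;
    - each task writes one file per distinct col5 value it holds.
    """
    tasks_per_value = defaultdict(set)
    for row_index, value in enumerate(col5_values):
        task_id = row_index % num_partitions
        tasks_per_value[value].add(task_id)
    # One file per (task, value) pair, so the file count for a directory
    # is the number of distinct tasks that hold that value.
    return {value: len(tasks) for value, tasks in tasks_per_value.items()}


# Hypothetical data: a frequent value, a rarer one, and a singleton.
rows = ["a"] * 12 + ["b"] * 3 + ["c"]
files = simulate_part_files(rows)
print(files)  # {'a': 5, 'b': 3, 'c': 1}
```

Under this model a value present in at least 5 rows can reach the upper bound of 5 files, while a value that appears in a single row always yields exactly 1 file.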