Author Archives: mx
Assume your Redshift tables are uncompressed, because most people never set up compression. First, list all the big tables by size; here’s a script. Then run analyze compression on each one:

analyze compression public.report_table_name;

Example results:

table              column       encoding  est_reduction_pct
report_table_name  rowid        raw       0
report_table_name  create_date  zstd      9.49
… Continue reading
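The script this excerpt refers to is cut off, so what follows is not the author's script. It is a minimal Python sketch of the same workflow, under stated assumptions: table sizes are read from the Redshift system view svv_table_info (whose size column is in 1 MB blocks), and the helper names, the LIMIT of 20 and the 5% threshold are all hypothetical choices made for illustration.

```python
# Assumption: sizes come from SVV_TABLE_INFO, where "size" is in 1 MB blocks.
# The LIMIT of 20 is an arbitrary illustrative cut-off.
BIG_TABLES_SQL = """
SELECT "schema", "table", size AS size_mb, tbl_rows
FROM svv_table_info
ORDER BY size DESC
LIMIT 20;
"""


def analyze_compression_statements(tables):
    """Build one ANALYZE COMPRESSION statement per (schema, table) pair."""
    return [f"ANALYZE COMPRESSION {schema}.{table};" for schema, table in tables]


def columns_worth_encoding(rows, min_reduction_pct=5.0):
    """Keep result rows whose estimated reduction meets the threshold.

    Each row is (table, column, encoding, est_reduction_pct), matching the
    columns of the example results. The 5% default is a hypothetical cut-off.
    """
    return [row for row in rows if float(row[3]) >= min_reduction_pct]


# The two example result rows quoted in the excerpt.
example_rows = [
    ("report_table_name", "rowid", "raw", 0),
    ("report_table_name", "create_date", "zstd", 9.49),
]

statements = analyze_compression_statements([("public", "report_table_name")])
candidates = columns_worth_encoding(example_rows)
```

With the two example rows, only create_date passes the filter, since rowid is reported as raw with an estimated reduction of 0. Running the generated statements against a cluster is left to whatever SQL client is already in use.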
Best practices for PySpark programming: “Programming in Spark using PySpark” from Mostafa Elzoghbi.
History of SQL and all the advanced features added over the last 30 years among the big vendors: “Modern SQL in Open Source and Commercial Databases” from Markus Winand.
What I think of every time I hear “Major Stakeholder”. Continue reading
Credits to Peter James Thomas: https://www.linkedin.com/pulse/10-risks-beset-data-programmes-peter-james-thomas
Not establishing a dedicated team. The team never escapes from “the day job” or legacy / BAU issues; the past prevents the future from being built. Staff lack skills and prior experience of data … Continue reading
Taken from a Dataiku meetup slide. This picture hit close to home.
I switched to wordpress.com as my host. I will most likely switch to AWS later.