Author Archives: mx

Improve Amazon Redshift table performance fast & easy.

Assume your Redshift tables are uncompressed, because most people never set column encodings. List all the big tables by size (here’s a script), then run analyze compression on each one:

analyze compression public.report_table_name;

Example results:

table              column       encoding  est_reduction_pct
report_table_name  rowid        raw       0
report_table_name  create_date  zstd      9.49
…
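As a rough illustration of what to do with those results, here is a minimal Python sketch that turns analyze compression rows into candidate encoding changes. The `CompressionRow` type, the `suggest_encodings` helper, and the 5% threshold are hypothetical names and values invented for this example, and the `ALTER TABLE ... ALTER COLUMN ... ENCODE` statement it emits is an assumption about what your Redshift version accepts, so check the Redshift documentation before running the output. The sample rows are the two shown in the example results.

```python
from typing import List, NamedTuple


class CompressionRow(NamedTuple):
    """One row of analyze compression output (hypothetical container)."""
    table: str
    column: str
    encoding: str
    est_reduction_pct: float


def suggest_encodings(rows: List[CompressionRow],
                      schema: str = "public",
                      min_reduction_pct: float = 5.0) -> List[str]:
    """Build ALTER statements for columns worth re-encoding.

    min_reduction_pct is an arbitrary cut-off chosen for this sketch.
    """
    statements = []
    for row in rows:
        # Skip columns where Redshift suggests no encoding or little gain.
        if row.encoding == "raw" or row.est_reduction_pct < min_reduction_pct:
            continue
        statements.append(
            f"ALTER TABLE {schema}.{row.table} "
            f"ALTER COLUMN {row.column} ENCODE {row.encoding};"
        )
    return statements


# The two rows from the example results.
rows = [
    CompressionRow("report_table_name", "rowid", "raw", 0.0),
    CompressionRow("report_table_name", "create_date", "zstd", 9.49),
]

statements = suggest_encodings(rows)
for statement in statements:
    print(statement)
```

Only `create_date` passes the filter here, since `rowid` comes back as raw with no estimated reduction. Review the generated statements by hand before applying them to a production table.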

Posted in big data

SlideShare is full of useful knowledge

Best practices for PySpark programming: “Programming in Spark using PySpark” from Mostafa Elzoghbi

History of SQL and all the advanced features added over the last 30 years among the big vendors: “Modern SQL in Open Source and Commercial Databases” from Markus Winand

Posted in spark, Uncategorized

What I think of every time I hear Stakeholders

What I think of every time I hear “Major Stakeholder”

Re-Blog: 10 Risks that Beset Data Programmes

Credits to Peter James Thomas: https://www.linkedin.com/pulse/10-risks-beset-data-programmes-peter-james-thomas

Not establishing a dedicated team. The team never escapes from “the day job” or legacy / BAU issues; the past prevents the future from being built.

Staff lack skills and prior experience of data …

Posted in Business Intelligence

The reality of a data worker.

Taken from a Dataiku meetup slide. This picture hit close to home.

Things to note when migrating web hosts

Posted in Uncategorized

New Year, New Site

I switched to wordpress.com as my host. I will most likely switch to AWS later.

Posted in Uncategorized