Author Archives: mx
What I think of every time I hear Major Stakeholder Continue reading
Credits to: Peter James Thomas: https://www.linkedin.com/pulse/10-risks-beset-data-programmes-peter-james-thomas Not establishing a dedicated team. The team never escapes from “the day job” or legacy / BAU issues; the past prevents the future from being built. Staff lack skills and prior experience of data … Continue reading
Taken from a Dataiku meetup slide. This picture hit close to home.
I switched to wordpress.com as my host. I will most likely switch to AWS later.
Redshift is based off branch of PostGreSQL 8.0.2 [ PostgreSQL 8.0.2 was released in 2005] here’s all the unsupported fancy PostGres Stuff: taken directly from amazon’s manual. The bigs ones are: No Store Procedures, No Constraints enforcement, No triggers and no … Continue reading
Best Practices for Micro-Batch Loading on Amazon Redshift Article by AWS blog I work with Redshift everyday now at Amazon. It’s very useful big data warehouse tool. Here’s a blog post about loading data into it. It’s very s3 dependent … Continue reading
Redshift is : Fast like Ferrari Cheap like a Ford Fiesta Useful like a Minivan Self Driving Auto-magics like Tesla with Autopilot Key features: Really fancy features under-the-hood: -interleaved sort keys -columnar distributed storage -smart parallel execution -IO optimization (return … Continue reading
Snowflake: data warehouse in the cloud (specially amazon) Snowflake compute is basically an analytics computing database that has scalability. Data is stored / shared on AWS S3 buckets instead of in snowfalek. You spin up snowflake tool injest and load into it’s proprietary … Continue reading
TensorFLow : google’s machine learning api can be super powerful. Just make json REST calls to the end-point and get results based on google’s machine learning lib. Uses cases: 1. identify an image (image classification) 2. parse speech into text … Continue reading