Category Archives: Uncategorized
The Data Engineering 2021
I consider these tools the current standard for the New Data Engineering:
Scheduling Tools: Airflow
ETL-adjacent processes: dbt
Data Quality Testing: Great Expectations
Infrastructure: Terraform
Data Catalog/Discovery: Amundsen
Here’s a visual guideline for a modern data engineer roadmap: https://github.com/datastacktv/data-engineer-roadmap credit to …
How Netflix does data in AWS
GCP for AWS professionals
Someone made a GCP lookup list for AWS cloud people.
Service comparisons
The following table provides a side-by-side comparison of the various services available on AWS and Google Cloud.
Service Category | Service | AWS | Google Cloud
Compute | IaaS | Amazon Elastic Compute …
Modern Data Engineering is Complicated
Modern Data Engineering is Complicated. There are so many things to know to be good at it.
Languages: SQL, Python, Scala
Operating Systems: Linux, bash shell
Cloud: AWS, Azure, GCP
Data Pipelines: Airflow, Kubeflow
DevOps: Kubernetes, …
My other hobby is Stocks
My other hobby is investing and picking stocks. Here’s my other blog: http://hunandelightmd.com/
MySQL tricks: do an instant table swap to mitigate MySQL deadlock errors.
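The post body is not shown in this listing, so the following is only a minimal sketch of one common way to do such a swap, not necessarily the author's method. It assumes the heavy writes go into a staging table and that the names are then exchanged with a single multi-table RENAME TABLE statement, which MySQL applies atomically. The helper build_table_swap_sql and the table names orders, orders_new, and orders_old are hypothetical.

```python
def build_table_swap_sql(live: str, staging: str, retired: str) -> list[str]:
    """Return MySQL statements for a staging-table swap.

    All table names are hypothetical placeholders supplied by the caller.
    """

    def quote(name: str) -> str:
        # Quote an identifier with backticks, doubling any embedded backtick.
        return "`" + name.replace("`", "``") + "`"

    return [
        # Create an empty staging table with the live table's structure.
        f"CREATE TABLE {quote(staging)} LIKE {quote(live)}",
        # (Load or rebuild the data in the staging table at this point,
        # away from the traffic hitting the live table.)
        # Exchange both names in one RENAME TABLE statement.
        f"RENAME TABLE {quote(live)} TO {quote(retired)}, "
        f"{quote(staging)} TO {quote(live)}",
        # Drop the old copy once nothing needs it any more.
        f"DROP TABLE {quote(retired)}",
    ]


if __name__ == "__main__":
    for statement in build_table_swap_sql("orders", "orders_new", "orders_old"):
        print(statement + ";")
```

Generating the statements as strings keeps the sketch independent of any particular MySQL driver; in practice they would be run through whatever client the pipeline already uses. Whether this actually reduces deadlocks depends on the workload, since it only moves the bulk writes off the live table.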
SlideShare’s full of useful knowledge
Best practices for PySpark programming: Programming in Spark using PySpark, from Mostafa Elzoghbi.
History of SQL and all the advanced features over the last 30 years among big vendors: Modern SQL in Open Source and Commercial Databases, from Markus Winand.
The reality of a data worker.
Taken from a Dataiku meetup slide. This picture hit close to home.
Things to note when migrating web hosts
New Year, New Site
I switched to wordpress.com as my host. I will most likely switch to AWS later.