Tag Archives: database

Improve Amazon Redshift table performance fast & easy.

Assume your Redshift tables are uncompressed, because most people never set column encodings. List all the big tables by size (here’s a script), then run analyze compression on each one:

analyze compression public.report_table_name;

Example results:

table              column       encoding  est_reduction_pct
report_table_name  rowid        raw       0
report_table_name  create_date  zstd      9.49
… Continue reading
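Once the analyze compression results are in hand, the next step is deciding which columns are worth re-encoding. Here is a minimal sketch of that filter, using only the two example rows from the excerpt. The `EncodingSuggestion` type, the `worthwhile` helper, and the 5% cutoff are assumptions made for illustration, not part of the original post.

```python
from typing import NamedTuple


class EncodingSuggestion(NamedTuple):
    """One row of ANALYZE COMPRESSION output (hypothetical helper type)."""
    table: str
    column: str
    encoding: str
    est_reduction_pct: float


# Rows copied from the example results in the excerpt.
SAMPLE_ROWS = [
    ("report_table_name", "rowid", "raw", "0"),
    ("report_table_name", "create_date", "zstd", "9.49"),
]


def worthwhile(rows, min_reduction_pct=5.0):
    """Keep suggestions whose estimated reduction meets the cutoff.

    The 5% default is an arbitrary assumption; pick your own threshold.
    """
    suggestions = [
        EncodingSuggestion(table, column, encoding, float(pct))
        for table, column, encoding, pct in rows
    ]
    return [s for s in suggestions if s.est_reduction_pct >= min_reduction_pct]


for s in worthwhile(SAMPLE_ROWS):
    print(f"{s.table}.{s.column}: consider {s.encoding}")
```

With the sample rows, only create_date passes the cutoff; rowid stays raw because its estimated reduction is 0.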

Posted in big data

Amazon Redshift’s Unsupported PostgreSQL Features

Redshift is based on a branch of PostgreSQL 8.0.2 (PostgreSQL 8.0.2 was released in 2005). Here’s all the unsupported fancy PostgreSQL stuff, taken directly from Amazon’s manual. The big ones are: no stored procedures, no constraint enforcement, no triggers and no … Continue reading

Posted in data wrangling, mpp databases

Best Practices for Micro-Batch Loading on Amazon Redshift

Best Practices for Micro-Batch Loading on Amazon Redshift, an article from the AWS blog. I work with Redshift every day now at Amazon. It’s a very useful big data warehouse tool. Here’s a blog post about loading data into it. It’s very S3 dependent … Continue reading
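To show what "S3 dependent" means in practice: a load into Redshift usually ends in a COPY statement that reads files from an S3 prefix. The sketch below only builds that statement as a string. The table name, bucket path, and IAM role ARN are hypothetical placeholders, and this is not taken from the AWS article itself.

```python
def build_copy_statement(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement for CSV files under an S3 prefix.

    Values are interpolated without escaping, so this assumes trusted,
    hard-coded inputs; it is a sketch, not production code.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV;"
    )


# All three arguments are made-up placeholders.
statement = build_copy_statement(
    "public.report_table_name",
    "s3://example-bucket/microbatch/",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```

Pointing COPY at a prefix rather than a single file lets one statement pick up every file in the batch.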

Posted in big data, data wrangling, etl

Amazon Redshift is an amazing database product

Redshift is:
-Fast like a Ferrari
-Cheap like a Ford Fiesta
-Useful like a minivan
-Self-driving auto-magic like a Tesla with Autopilot

Key features: really fancy features under the hood:
-interleaved sort keys
-columnar distributed storage
-smart parallel execution
-IO optimization (return … Continue reading

Posted in big data, Business Intelligence, Cloud, data analysis, relational databases

Basic database table creation and load from CSV using MySQL and PostgreSQL

Basic database table creation with MySQL and PostgreSQL. The starting point of most data applications is getting the data feeds and populating the tables. Here’s an example of the process: I’m loading a stock_history table from the Yahoo Finance API. … Continue reading
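The create-then-load pattern can be sketched end to end without a database server. The example below uses Python's built-in sqlite3 as a stand-in for MySQL or PostgreSQL, which is an assumption made so the sketch runs anywhere. The stock_history column names and the two CSV rows are made up for illustration and are not real Yahoo Finance data.

```python
import csv
import io
import sqlite3

# Made-up sample rows standing in for a downloaded CSV feed.
CSV_TEXT = """symbol,trade_date,close_price
XYZ,2020-01-02,10.50
XYZ,2020-01-03,10.75
"""

# SQLite in memory is a stand-in for a MySQL or PostgreSQL connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE stock_history (
        symbol      TEXT NOT NULL,
        trade_date  TEXT NOT NULL,
        close_price REAL
    )
    """
)

# Parse the CSV by header name and convert the price to a number.
reader = csv.DictReader(io.StringIO(CSV_TEXT))
rows = [(r["symbol"], r["trade_date"], float(r["close_price"])) for r in reader]

# Parameterized inserts avoid quoting problems in the data.
conn.executemany("INSERT INTO stock_history VALUES (?, ?, ?)", rows)
conn.commit()

loaded = conn.execute("SELECT COUNT(*) FROM stock_history").fetchone()[0]
print(loaded)
```

For large files, the databases' own bulk loaders (LOAD DATA INFILE in MySQL, COPY in PostgreSQL) are usually a better fit than row-by-row inserts.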

Posted in data wrangling, relational databases

Why PostgreSQL is the better MySQL

Ever since MySQL was purchased by Oracle, it has been lagging in development in the open source space. MariaDB, Percona, and Aurora are spin-offs that try to address it. MySQL is the original M of the LAMP stack. … Continue reading

Posted in data wrangling, relational databases

SQL tip: Get the first 10 rows from a table and profile the columns

For people working with database tables: most will want to check out the columns in the table and do a quick scan of 10 rows to sample the data. Here’s the SQL syntax for doing that with … Continue reading
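As a rough illustration of the sample-and-profile idea, here is a sketch that pulls 10 rows with LIMIT and then counts total, non-null, and distinct values per column. It uses Python's built-in sqlite3 purely as an assumption so it runs without a server. The demo table and its contents are invented, and the syntax for other databases may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, label TEXT)")
# 25 invented rows; every fifth label is NULL so the profile has something to show.
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [(i, None if i % 5 == 0 else f"row{i}") for i in range(25)],
)

# Quick scan: the first 10 rows, plus column names from the cursor metadata.
cursor = conn.execute("SELECT * FROM demo LIMIT 10")
column_names = [d[0] for d in cursor.description]
sample = cursor.fetchall()

# Simple profile: row count, non-null count, and distinct count per column.
profile = {}
for name in column_names:
    total, non_null, distinct = conn.execute(
        f'SELECT COUNT(*), COUNT("{name}"), COUNT(DISTINCT "{name}") FROM demo'
    ).fetchone()
    profile[name] = {"rows": total, "non_null": non_null, "distinct": distinct}

print(column_names)
print(profile)
```

COUNT(column) skips NULLs while COUNT(*) does not, so the gap between the two is a cheap null check for each column.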

Posted in data wrangling