Category Archives: data wrangling
Taken from a Dataiku meetup slide. This picture hit close to home.
Redshift is based on a branch of PostgreSQL 8.0.2 (PostgreSQL 8.0.2 was released in 2005). Here’s all the unsupported fancy Postgres stuff, taken directly from Amazon’s manual. The big ones are: no stored procedures, no constraint enforcement, no triggers and no … Continue reading
Best Practices for Micro-Batch Loading on Amazon Redshift, an article from the AWS blog. I work with Redshift every day now at Amazon. It’s a very useful big data warehouse tool. Here’s a blog post about loading data into it. It’s very S3 dependent … Continue reading
Basic database table creation with MySQL and PostgreSQL. The starting point for most data applications is getting the data feeds and populating the tables. Here’s an example of the process: I’m loading a stock_history table from the Yahoo Finance API source. … Continue reading
Ever since MySQL was purchased by Oracle, it has been lagging in development in the open source space. MariaDB, Percona, and Aurora are spin-offs that try to address this. MySQL is the original M of the LAMP stack. … Continue reading
A general rule for BI reports and dashboards is a response time of 10 seconds or less for a report and 30 seconds or less for a dashboard. But quite often an analyst will run a report and it never comes back. They’ll say something like so … Continue reading
For people working with database tables: most will want to check out the columns in the table and do a quick scan of 10 rows to sample the data. Here’s the SQL syntax for doing that with … Continue reading