Assuming Redshift tables are un-compressed because most people don’t do it.
- list all the big tables by size, here’s a script
- run analyze compression
analyze compression public.report_table_name; Example results:
table column encoding est_reduction_pct report_table_name rowid raw 0 report_table_name create_date zstd 9.49 report_table_name create_day delta 38.72 report_table_name create_date_key zstd 28.98 report_table_name type zstd 67.59 report_table_name type_desc zstd 58.25 report_table_name created_by zstd 35.17 - recreate table with new compression encoding and see performance gains.
DROP TABLE IF EXISTS public.report_table_name cascade;
CREATE TABLE public.report_table_name
( rowid INT IDENTITY(1,1),
create_date timestamp encode zstd,
create_day date encode zstd,
type integer encode zstd,
type_desc varchar(28) encode zstd,
created_by varchar(30) encode zstd,
data_refresh_date date encode raw
) ;
Amazon manual regarding compression command:
http://docs.aws.amazon.com/redshift/latest/dg/r_ANALYZE_COMPRESSION.html