# TimescaleDB Complete Setup Guide for AI Coding Assistants
You are tasked with setting up a complete TimescaleDB time-series database solution. This guide provides step-by-step instructions for creating hypertables, configuring compression, setting up retention policies, and implementing continuous aggregates with their associated policies. Adapt the schema and configurations to your specific use case.
## Step 1: Create Base Table and Hypertable
Create a table schema appropriate for your time-series data, then convert it to a hypertable:
```sql
-- Create hypertable with compression settings directly using WITH clause
-- IMPORTANT: Choose segment_by column carefully (see guidance below)
CREATE TABLE your_table_name (
    timestamp TIMESTAMPTZ NOT NULL,
    entity_id TEXT NOT NULL,      -- device_id, user_id, symbol, etc.
    category TEXT,                -- sensor_type, event_type, asset_class, etc.
    value_1 DOUBLE PRECISION,     -- price, temperature, latency, etc.
    value_2 DOUBLE PRECISION,     -- volume, humidity, throughput, etc.
    value_3 INTEGER,              -- count, status, level, etc.
    metadata JSONB                -- flexible additional data
) WITH (
    tsdb.hypertable,
    tsdb.partition_column='timestamp',
    tsdb.enable_columnstore=true,
    tsdb.segmentby='entity_id',   -- Usually prefer single column - see selection guide below
    tsdb.orderby='timestamp DESC'
);
-- HOW TO CHOOSE PARTITION COLUMN:
--
-- The partition column determines how data is divided into chunks over time.
-- This is almost always a timestamp column in time-series workloads.
--
-- Requirements:
-- - Must be a time-based column (TIMESTAMP, TIMESTAMPTZ, DATE) or integer (INT, BIGINT)
-- - Should represent when the event actually occurred or sequential ordering
-- - Must have good temporal/sequential distribution (not all the same value)
--
-- Common patterns:
-- - 'timestamp' - when the measurement/event happened
-- - 'created_at' - when the record was created
-- - 'event_time' - when the business event occurred
-- - 'ingested_at' - when data entered the system (less ideal)
-- - 'id' - autoincrement integer key (for sequential data without timestamps)
-- - 'sequence_number' - monotonically increasing integer
--
-- AVOID using 'updated_at' as partition column:
-- - Records can be updated out of time order
-- - Creates uneven chunk distribution
-- - Breaks time-based query optimization
--
-- Use 'updated_at' only if:
-- - It's your primary query dimension
-- - You rarely query by creation time
-- - Update patterns are predictable and time-ordered
--
-- HOW TO CHOOSE SEGMENT_BY COLUMN:
--
-- The segment_by column determines how data is grouped during compression.
-- PREFER SINGLE COLUMN - multi-column segment_by is rarely optimal.
--
-- Multi-column segment_by can work when columns are highly correlated
-- (e.g., metric_name + metric_type where they always appear together),
-- but requires careful analysis of row density patterns.
--
-- Choose a column that:
-- 1. Is frequently used in WHERE clauses (your most common filter)
-- 2. Has good row density per segment (>100 rows per segment_by value per chunk)
-- 3. Represents the primary way you partition/group your data logically
-- 4. Balances compression ratio with query performance
--
-- EXAMPLES BY USE CASE:
-- IoT/Sensors: 'device_id'
-- Finance/Trading: 'symbol'
-- Application Metrics:
-- - 'service_name'
-- - 'service_name + metric_type' (if sufficient row density)
-- - 'metric_name + metric_type' (if sufficient row density)
-- User Analytics: 'user_id' if sufficient row density, otherwise 'session_id'
-- E-commerce: 'product_id' if sufficient row density, otherwise 'category_id'
--
-- ROW DENSITY GUIDELINES:
-- - Target >100 rows per segment_by value within each chunk
-- - Poor compression: segment_by values with <10 rows per chunk
-- - Good compression: segment_by values with 100-10,000+ rows per chunk
-- - If your entity_id only has 5-10 rows per chunk, choose a less granular column
--
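-- A quick sanity check (a sketch; assumes a 1-day chunk interval and the entity_id
-- column above): estimate rows per candidate segment_by value per chunk interval on a
-- representative data sample before committing to a choice:
--
-- SELECT entity_id,
--        time_bucket(INTERVAL '1 day', timestamp) AS chunk_interval,
--        COUNT(*) AS rows_per_chunk
-- FROM your_table_name
-- GROUP BY entity_id, chunk_interval
-- ORDER BY rows_per_chunk ASC
-- LIMIT 20;
--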
-- QUERY PATTERN ANALYSIS:
-- Your most common query pattern should drive the choice:
-- SELECT * FROM table WHERE entity_id = 'X' AND timestamp > ...
-- ↳ Good segment_by: 'entity_id' (if entity_id has >100 rows per chunk)
--
-- BAD CHOICES FOR SEGMENT_BY:
-- - Timestamp columns (already time-partitioned)
-- - Unique identifiers (transaction_id, uuid fields)
-- - Columns with low row density (<100 rows per value per chunk)
-- - Columns rarely used in filtering
-- - Multiple columns (creates too many small segments)
--
-- HOW TO CHOOSE ORDER_BY COLUMN:
--
-- The order_by column should create a natural time-series progression when
-- combined with segment_by. This ensures adjacent rows have similar values,
-- which compress well.
--
-- The combination (segment_by, order_by) should form a sequence where
-- values change gradually between consecutive rows.
--
-- EXAMPLES:
-- - segment_by='device_id', order_by='timestamp DESC'
-- ↳ Forms natural progression: device readings over time
-- - segment_by='symbol', order_by='timestamp DESC'
-- ↳ Forms natural progression: stock prices over time
-- - segment_by='user_id', order_by='session_timestamp DESC'
-- ↳ Forms natural progression: user events over time
--
-- Most common pattern: 'timestamp DESC' (newest data first)
-- This works well because time-series data naturally has temporal correlation.
--
-- Alternative patterns when timestamp isn't the natural ordering:
-- - 'sequence_id DESC' for event streams with sequence numbers
-- - 'timestamp DESC, event_order DESC' for sub-ordering within time
--
-- IMPORTANT: When a column can't be used in segment_by due to low row density,
-- consider prepending it to order_by to preserve natural progression:
--
-- Example: metric_name has only 20 rows per chunk (too low for segment_by)
-- - segment_by='service_name' (has >100 rows per chunk)
-- - order_by='metric_name, timestamp DESC'
--
-- This creates natural progression within each service: all temperature readings
-- together, then all pressure readings, etc. Values are more similar when
-- grouped by metric type, improving compression.
--
-- ADVANCED: Append columns that benefit from min/max indexing for query optimization:
-- After the natural progression columns, you can append additional columns that:
-- - Are frequently used in WHERE clauses for filtering
-- - Have some correlation with the main progression
-- - Can help exclude compressed chunks during queries
--
-- Example: 'created_at DESC, updated_at DESC'
-- - created_at provides the main natural progression
-- - updated_at is appended because it often correlates and is used for filtering
-- - TimescaleDB tracks min/max of updated_at per compressed chunk
-- - Queries like "WHERE updated_at > '2024-01-01'" can exclude entire compressed batches.
--
-- Other examples:
-- - 'timestamp DESC, price DESC' (for financial data where price filters are common)
-- - 'timestamp DESC, severity DESC' (for logs where severity filtering is frequent)
--
-- BAD CHOICES FOR ORDER_BY:
-- - Random columns that break natural progression
-- - Columns that create high variance between adjacent rows
-- - Columns unrelated to the segment_by grouping
-- Optionally set chunk time interval (default is 7 days)
-- Adjust based on data volume: 1 hour to 1 day for high frequency, 1 day to 1 week for medium, 1 week to 1 month for low frequency
SELECT set_chunk_time_interval('your_table_name', INTERVAL '1 day');
-- Note: You can also configure compression later using ALTER TABLE:
-- ALTER TABLE your_table_name SET (
--     timescaledb.enable_columnstore,
--     timescaledb.segmentby = 'entity_id, category',
--     timescaledb.orderby = 'timestamp DESC'
-- );
-- Create indexes for your common query patterns
CREATE INDEX idx_entity_timestamp ON your_table_name (entity_id, timestamp DESC);
CREATE INDEX idx_category_timestamp ON your_table_name (category, timestamp DESC);
-- IMPORTANT: Primary Key Considerations for Hypertables
-- Any primary key or unique index MUST include the partitioning column.
-- Single-column primary keys (like 'id SERIAL PRIMARY KEY') don't work well with hypertables
-- UNLESS the primary key column is also the partitioning column.
--
-- Options for primary keys:
-- 1. If using timestamp partitioning, use composite primary key:
-- ALTER TABLE your_table_name ADD PRIMARY KEY (entity_id, timestamp);
--
-- 2. If using integer partitioning, single-column PK works:
-- -- Example: CREATE TABLE ... (id SERIAL PRIMARY KEY, ...) WITH (tsdb.hypertable, tsdb.partition_column='id');
--
-- 3. Use unique constraints for business logic uniqueness (must include partition column):
-- ALTER TABLE your_table_name ADD CONSTRAINT unique_entity_time UNIQUE (entity_id, timestamp);
--
-- 4. No primary key (often acceptable for time-series data):
-- -- Many time-series use cases don't require strict uniqueness constraints
```
## Step 2: Configure Compression Policy
Add automatic compression policy (compression settings were configured in Step 1):
```sql
-- Add compression policy - compress when BOTH criteria are typically met:
-- (a) Most data will not be updated again (some updates/backfill is ok but not regular)
-- (b) You no longer need fine-grained B-tree indexes for queries (less common criterion)
--
-- Include a comment like the following alongside this SQL so developers know to adjust
-- 'after' to be longer than the window in which most updates occur:
--
-- Adjust the 'after' interval based on your update patterns so that most data has
-- finished being updated before it is converted to columnstore.
SELECT add_columnstore_policy('your_table_name', after => INTERVAL '1 day');
```
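If some chunks are already past the cutoff and you don't want to wait for the policy's next scheduled run, they can be converted manually with the new API listed at the end of this guide (a sketch; the chunk name below is hypothetical):
```sql
-- List chunks that are old enough to convert
SELECT show_chunks('your_table_name', older_than => INTERVAL '1 day');
-- Convert one of them to columnstore (chunk name is hypothetical)
-- CALL convert_to_columnstore('_timescaledb_internal._hyper_1_1_chunk');
```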
## Step 3: Set Up Data Retention Policy
Configure automatic data retention based on your specific requirements:
```sql
-- IMPORTANT: Don't guess retention periods - either:
-- 1. Look for user specifications/requirements in the project
-- 2. Ask the user about their data retention needs
--
-- If you aren't sure of the data retention period, then
-- include the add_retention_policy call but you MUST comment it out.
--
-- Common patterns (for reference only):
-- - High-frequency IoT data: 30-90 days to 1 year
-- - Financial data: 7+ years for regulatory compliance
-- - Application metrics: 30-180 days
-- - User analytics: 1-2 years
-- - Log data: 30-90 days
--
-- Example (replace with actual requirements):
SELECT add_retention_policy('your_table_name', INTERVAL '365 days');
```
## Step 4: Create Continuous Aggregates
Set up continuous aggregates for different time granularities:
### Short-term Aggregates (Minutes/Hours)
```sql
-- For high-frequency data (IoT sensors, trading data, application metrics)
CREATE MATERIALIZED VIEW your_table_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket(INTERVAL '1 hour', timestamp) AS bucket,
    entity_id,
    category,
    COUNT(*) as record_count,
    AVG(value_1) as avg_value_1,
    MIN(value_1) as min_value_1,
    MAX(value_1) as max_value_1,
    STDDEV(value_1) as stddev_value_1,
    SUM(value_2) as sum_value_2, -- useful for volumes, counts
    AVG(value_3) as avg_value_3
FROM your_table_name
GROUP BY bucket, entity_id, category;
```
### Long-term Aggregates (Days/Weeks)
```sql
-- For trend analysis and reporting
CREATE MATERIALIZED VIEW your_table_daily
WITH (timescaledb.continuous) AS
SELECT
    time_bucket(INTERVAL '1 day', timestamp) AS bucket,
    entity_id,
    category,
    COUNT(*) as record_count,
    AVG(value_1) as avg_value_1,
    MIN(value_1) as min_value_1,
    MAX(value_1) as max_value_1,
    STDDEV(value_1) as stddev_value_1,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY value_1) as median_value_1,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY value_1) as p95_value_1,
    SUM(value_2) as sum_value_2,
    AVG(value_3) as avg_value_3
FROM your_table_name
GROUP BY bucket, entity_id, category;
```
## Step 5: Configure Continuous Aggregate Policies
Set up refresh policies based on your data freshness requirements:
```sql
-- Hourly aggregates - refresh frequently for near real-time dashboards
-- Most common case: no start_offset (refreshes all data as needed)
SELECT add_continuous_aggregate_policy('your_table_hourly',
    end_offset => INTERVAL '15 minutes',        -- lag from real-time
    schedule_interval => INTERVAL '15 minutes'); -- how often to refresh
-- Daily aggregates - refresh less frequently for reports
SELECT add_continuous_aggregate_policy('your_table_daily',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');
-- Alternative: Use start_offset only if you don't care about refreshing old data
-- Example: Only refresh the last 7 days for a high-volume system where users don't care
-- about query result accuracy on older data.
-- SELECT add_continuous_aggregate_policy('your_table_hourly',
--     start_offset => INTERVAL '7 days',       -- only refresh last 7 days
--     end_offset => INTERVAL '15 minutes',
--     schedule_interval => INTERVAL '15 minutes');
```
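Scheduled policies only refresh on their own cadence; for a one-off backfill (for example, right after creating the aggregate over existing data), a manual refresh can be run, as in this sketch:
```sql
-- One-off refresh of the last 30 days of the hourly aggregate (window is illustrative)
CALL refresh_continuous_aggregate('your_table_hourly', now() - INTERVAL '30 days', now());
```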
## Step 6: Configure Real-Time Aggregation (Optional)
Enable real-time aggregation to include the most recent raw data in continuous aggregate queries:
```sql
-- Real-time aggregates combine materialized data with recent raw data at query time
-- This provides up-to-date results but with slightly higher query cost
--
-- Note: In TimescaleDB v2.13+, real-time aggregates are DISABLED by default
-- In earlier versions, they were ENABLED by default
-- Enable real-time aggregation for more current results:
ALTER MATERIALIZED VIEW your_table_hourly SET (timescaledb.materialized_only = false);
ALTER MATERIALIZED VIEW your_table_daily SET (timescaledb.materialized_only = false);
-- Disable real-time aggregation (materialized data only) for better performance:
-- ALTER MATERIALIZED VIEW your_table_hourly SET (timescaledb.materialized_only = true);
-- ALTER MATERIALIZED VIEW your_table_daily SET (timescaledb.materialized_only = true);
```
**When to use real-time aggregation:**
- Need to query data newer than your refresh policy's end_offset/lag time
- Need up-to-the-minute results in dashboards
- Can tolerate slightly higher query latency
- Want to include the most recent raw data that hasn't been materialized yet
**When to disable real-time aggregation:**
- Performance is more important than data freshness
- Refresh policies provide sufficient data currency
- High query volume where every millisecond matters
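For example, with `materialized_only = false`, a dashboard query like the following (a sketch against the hourly view from Step 4) also reflects rows that arrived after the last scheduled refresh:
```sql
-- Buckets newer than the last refresh are computed from raw data at query time
SELECT bucket, entity_id, avg_value_1, record_count
FROM your_table_hourly
WHERE bucket >= now() - INTERVAL '6 hours'
ORDER BY bucket DESC, entity_id;
```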
## Step 7: Enable Compression on Continuous Aggregates
Compress aggregated data for storage efficiency:
```sql
-- Compress hourly aggregates
-- Rule of thumb: segment_by = all GROUP BY columns except time_bucket, order_by = time_bucket DESC
ALTER MATERIALIZED VIEW your_table_hourly SET (
    timescaledb.enable_columnstore,
    timescaledb.segmentby = 'entity_id, category', -- all non-time GROUP BY columns
    timescaledb.orderby = 'bucket DESC'            -- time_bucket column
);
SELECT add_columnstore_policy('your_table_hourly', after => INTERVAL '3 days');
-- Compress daily aggregates
ALTER MATERIALIZED VIEW your_table_daily SET (
    timescaledb.enable_columnstore,
    timescaledb.segmentby = 'entity_id, category', -- all non-time GROUP BY columns
    timescaledb.orderby = 'bucket DESC'            -- time_bucket column
);
SELECT add_columnstore_policy('your_table_daily', after => INTERVAL '7 days');
```
## Step 8: Set Retention Policies for Aggregates
Keep aggregates longer than raw data for historical analysis:
```sql
-- IMPORTANT: Base retention periods on user requirements, not guesses
-- Aggregates are typically kept longer than raw data for historical analysis
--
-- Common approach: Aggregates retained 2-5x longer than raw data
-- Ask user about long-term analytical needs before setting these
--
-- If you aren't sure of the data retention period, then
-- include the add_retention_policy call but you MUST comment it out.
--
-- Keep hourly aggregates (example - replace with actual requirements)
SELECT add_retention_policy('your_table_hourly', INTERVAL '2 years');
-- Keep daily aggregates for longer-term trends (example - replace with actual requirements)
SELECT add_retention_policy('your_table_daily', INTERVAL '5 years');
```
## Step 9: Create Performance Indexes
Add indexes based on your actual query patterns:
```sql
-- HOW TO FIGURE OUT WHAT INDEXES TO CREATE:
-- 1. Analyze your most common queries against the continuous aggregates
-- 2. Look for WHERE clause patterns in your application code
-- 3. Create indexes that match your query filters + time ordering
--
-- Common pattern: (filter_column, time_bucket DESC)
-- This supports queries like: SELECT ... WHERE entity_id = 'X' AND bucket >= '...' ORDER BY bucket DESC
-- Example indexes on continuous aggregates (replace with your actual query patterns):
CREATE INDEX idx_hourly_entity_bucket ON your_table_hourly (entity_id, bucket DESC);
CREATE INDEX idx_hourly_category_bucket ON your_table_hourly (category, bucket DESC);
CREATE INDEX idx_daily_entity_bucket ON your_table_daily (entity_id, bucket DESC);
CREATE INDEX idx_daily_category_bucket ON your_table_daily (category, bucket DESC);
-- For multi-column filters, create composite indexes:
-- Example: if you query WHERE entity_id = 'X' AND category = 'Y'
-- CREATE INDEX idx_hourly_entity_category_bucket ON your_table_hourly (entity_id, category, bucket DESC);
-- DON'T create indexes blindly - each index has maintenance overhead
-- Only create indexes you'll actually use in queries
```
## Step 10: Optional Performance Enhancements
### Enable Chunk Skipping for Compressed Data
```sql
-- Enable chunk skipping on compressed chunks to skip entire chunks during queries
-- This creates min/max indexes that help exclude chunks based on column ranges
-- WHEN TO USE CHUNK SKIPPING (in order of importance):
-- 1. Column values have correlation/ordering within chunks (MOST IMPORTANT)
-- 2. Column is frequently used in WHERE clauses with range queries (>, <, =)
--
-- BEST CANDIDATES:
-- - updated_at (when created_at is the partitioning column - they often correlate)
-- - Sequential IDs or counters
-- - Any column that tends to have similar values within the same time period
-- Example 1: created_at is the partitioning column, enable chunk skipping on updated_at
-- This works because records created around the same time often have similar update times
SELECT enable_chunk_skipping('your_table_name', 'updated_at');
-- Example 2: id (serial) is the partitioning column, enable chunk skipping on created_at
-- This works because sequential IDs are often created around the same time
SELECT enable_chunk_skipping('your_table_name', 'created_at');
-- Both allow efficient chunk exclusion:
-- "WHERE updated_at > '2024-01-01'" skips chunks where max(updated_at) < '2024-01-01'
-- "WHERE created_at > '2024-01-01'" skips chunks where max(created_at) < '2024-01-01'
```
### Add Space-Partitioning (NOT RECOMMENDED)
```sql
-- Space partitioning is generally NOT RECOMMENDED for most use cases
-- It adds complexity and can hurt performance more than it helps
--
-- Only consider space partitioning if:
-- - You have very specific query patterns that ALWAYS filter by the space dimension
-- - You have expert-level TimescaleDB knowledge
-- - You've measured that it actually improves your specific workload
--
-- For most users: stick with time-only partitioning
-- Example (NOT recommended for typical use):
-- SELECT add_dimension('your_table_name', 'entity_id', number_partitions => 4);
```
## Step 11: Verify Configuration
```sql
-- Check hypertable configuration
SELECT * FROM timescaledb_information.hypertables
WHERE hypertable_name = 'your_table_name';
-- Verify compression settings
SELECT * FROM timescaledb_information.columnstore_settings
WHERE hypertable_name LIKE 'your_table%';
-- Check continuous aggregates
SELECT * FROM timescaledb_information.continuous_aggregates;
-- Review all automated policies
SELECT * FROM timescaledb_information.jobs ORDER BY job_id;
-- Monitor chunk ranges and compression status
SELECT chunk_name, range_start, range_end, is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'your_table_name';
-- For per-chunk compressed sizes, see chunk_columnstore_stats('your_table_name')
```
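Once chunks have been converted to columnstore, a query like the following can confirm the achieved compression ratio (a sketch; it assumes `hypertable_columnstore_stats()` exposes the same byte columns as the older `hypertable_compression_stats()`):
```sql
SELECT
    pg_size_pretty(before_compression_total_bytes) AS before_size,
    pg_size_pretty(after_compression_total_bytes)  AS after_size,
    round(before_compression_total_bytes::numeric
          / nullif(after_compression_total_bytes, 0), 1) AS compression_ratio
FROM hypertable_columnstore_stats('your_table_name');
```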
## Use Case Specific Adaptations
### IoT/Sensor Data
- entity_id → device_id
- category → sensor_type
- Short chunk intervals (1 hour - 1 day)
- Frequent compression (1-7 days)
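For instance, an IoT adaptation of the Step 1 template might look like this (a sketch with assumed column names):
```sql
CREATE TABLE sensor_readings (
    timestamp TIMESTAMPTZ NOT NULL,
    device_id TEXT NOT NULL,
    sensor_type TEXT,
    reading DOUBLE PRECISION,
    metadata JSONB
) WITH (
    tsdb.hypertable,
    tsdb.partition_column='timestamp',
    tsdb.enable_columnstore=true,
    tsdb.segmentby='device_id',
    tsdb.orderby='timestamp DESC'
);
SELECT set_chunk_time_interval('sensor_readings', INTERVAL '1 day');
SELECT add_columnstore_policy('sensor_readings', after => INTERVAL '7 days');
```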
### Financial/Trading Data
- entity_id → symbol/instrument
- category → exchange/market
- Very short chunk intervals (1-6 hours)
- Longer retention for compliance
- More percentile aggregations
### Application Metrics/DevOps
- entity_id → service_name/hostname
- category → metric_type
- Medium chunk intervals (1-7 days)
- Focus on percentiles and error rates
### User Analytics
- entity_id → user_id/session_id
- category → event_type
- Variable chunk intervals based on traffic
- Privacy-compliant retention periods
## Performance Guidelines
- **Chunk Size**: Size chunks so that the indexes of all recent hypertable chunks fit within 25% of machine RAM
- **Compression Ratio**: Expect 90%+ compression (10x or better reduction) with properly configured columnstore
- **Query Performance**: Use continuous aggregates for common queries that span large amounts of historical data (for example, user-facing dashboards)
- **Memory Usage**: Run `timescaledb-tune` if self-hosting (automatically configured on cloud)
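As a rough check of the chunk-sizing rule, per-chunk index sizes can be compared against available RAM (a sketch; filter to recent chunks as needed):
```sql
SELECT chunk_name,
       pg_size_pretty(index_bytes) AS index_size,
       pg_size_pretty(total_bytes) AS total_size
FROM chunks_detailed_size('your_table_name')
ORDER BY total_bytes DESC;
```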
This configuration provides a robust foundation for time-series workloads with automatic maintenance, optimal query performance, and efficient storage management.
## Schema Design Best Practices
### Column Types and Naming
**❌ Don't use `timestamp` (without time zone)** - Use `timestamptz` instead:
```sql
-- Bad: Stores local time without timezone context
CREATE TABLE sensors (time timestamp, ...);
-- Good: Stores point-in-time with timezone awareness
CREATE TABLE sensors (time timestamptz, ...);
```
`timestamptz` records a single moment in time and handles timezone conversions properly. `timestamp` without timezone can cause incorrect arithmetic across time zones and DST changes.
**❌ Don't use `BETWEEN` with timestamps** - Use `>=` and `<` instead:
```sql
-- Bad: Includes exact midnight of end date, may double-count
SELECT * FROM sensors WHERE time BETWEEN '2024-06-01' AND '2024-06-08';
-- Good: Clear exclusive upper bound
SELECT * FROM sensors WHERE time >= '2024-06-01' AND time < '2024-06-08';
```
**❌ Don't use `char(n)` or `varchar(n)` by default** - Use `text` with constraints:
```sql
-- Bad: Fixed padding, arbitrary length limits
device_id char(10), category varchar(50)
-- Good: Flexible length with meaningful constraints
device_id text CHECK (length(device_id) BETWEEN 3 AND 20),
category text CHECK (category IN ('temperature', 'humidity', 'pressure'))
```
**❌ Don't use uppercase in table/column names** - Use `snake_case`:
```sql
-- Bad: Requires double quotes everywhere
CREATE TABLE DeviceReadings (DeviceId text, ReadingValue float);
-- Good: Natural PostgreSQL style
CREATE TABLE device_readings (device_id text, reading_value float);
```
**❌ Don't use `serial` types** - Use `identity` columns for PostgreSQL 10+:
```sql
-- Bad: Creates hidden sequences with complex dependencies
CREATE TABLE events (id serial primary key, ...);
-- Good: Built-in identity column with bigint (recommended default)
CREATE TABLE events (id bigint generated always as identity primary key, ...);
```
**💡 Use `bigint` for ID columns by default** - Even if you don't expect billions of records:
- `bigint` has no performance penalty over `int`
- Prevents future migration pain when you exceed 2.1 billion rows
- Time-series data can accumulate very quickly (millions of IoT readings per day)
**💡 Use `double precision` for floating-point values by default** - Instead of `real` or `float`:
- `double precision` provides 15-17 decimal digits vs 6-7 for `real`
- No significant storage or performance cost for most use cases
- Critical for scientific measurements, financial calculations, and accumulated values
- Time-series aggregations (sums, averages) benefit from higher precision
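Putting both tips together, a measurements table might declare its key and value columns like this (a sketch):
```sql
-- bigint identity for the ID, double precision for the measurement
CREATE TABLE measurements (
    id bigint GENERATED ALWAYS AS IDENTITY,
    time timestamptz NOT NULL,
    reading double precision
);
```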
**❌ Don't use `money` type** - Use `numeric` for monetary values:
```sql
-- Bad: Fixed to database locale, limited precision
price money
-- Good: Precise decimal arithmetic, store currency separately
price numeric(10,2),
currency text default 'USD'
```
### Query Patterns
**❌ Don't use `NOT IN`** - Use `NOT EXISTS` instead:
```sql
-- Bad: Returns 0 rows if any NULL values present
SELECT * FROM devices WHERE device_id NOT IN (SELECT device_id FROM offline_devices);
-- Good: Handles NULLs correctly and optimizes better
SELECT * FROM devices d
WHERE NOT EXISTS (SELECT 1 FROM offline_devices o WHERE o.device_id = d.device_id);
```
**❌ Don't use precision specifications like `timestamp(0)`** - Use `date_trunc()` or `time_bucket()`:
```sql
-- Bad: Rounds fractional seconds (0.5s or more rounds up to a future second)
time_rounded timestamp(0)
-- Good: Truncates to desired precision with PostgreSQL date_trunc()
date_trunc('second', timestamp_col) AS time_rounded,
-- Better for TimescaleDB: Use time_bucket() for time-series aggregations
time_bucket(INTERVAL '5 minutes', timestamp_col) AS five_min_bucket,
```
## Using the latest TimescaleDB APIs
This guide uses the NEW TimescaleDB API. Here is a mapping from the old API to the new API:
**Deprecated Parameters → New Parameters:**
- `timescaledb.compress` → `timescaledb.enable_columnstore`
- `timescaledb.compress_segmentby` → `timescaledb.segmentby`
- `timescaledb.compress_orderby` → `timescaledb.orderby`
**Deprecated Functions → New Functions:**
- `add_compression_policy()` → `add_columnstore_policy()`
- `remove_compression_policy()` → `remove_columnstore_policy()`
- `compress_chunk()` → `convert_to_columnstore()`
- `decompress_chunk()` → `convert_to_rowstore()`
**Deprecated Views → New Views:**
- `compression_settings` → `columnstore_settings`
- `hypertable_compression_settings` → `hypertable_columnstore_settings`
- `chunk_compression_settings` → `chunk_columnstore_settings`
**Deprecated Stats Functions → New Stats Functions:**
- `hypertable_compression_stats()` → `hypertable_columnstore_stats()`
- `chunk_compression_stats()` → `chunk_columnstore_stats()`
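As a concrete illustration of the mapping above, migrating an older compression setup to the new API looks like this (a sketch using the template table from Step 1):
```sql
-- Old (deprecated) API:
-- ALTER TABLE your_table_name SET (
--     timescaledb.compress,
--     timescaledb.compress_segmentby = 'entity_id',
--     timescaledb.compress_orderby = 'timestamp DESC'
-- );
-- SELECT add_compression_policy('your_table_name', INTERVAL '1 day');

-- New API (used throughout this guide):
ALTER TABLE your_table_name SET (
    timescaledb.enable_columnstore,
    timescaledb.segmentby = 'entity_id',
    timescaledb.orderby = 'timestamp DESC'
);
SELECT add_columnstore_policy('your_table_name', after => INTERVAL '1 day');
```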
# Questions to ask the user
Ask the following questions if the answers haven't been provided:
- What kind of data will you be storing?
- How do you expect to use the data?
- What kind of queries will you be running?
- How long do you expect to keep the data?
- If the column types are not clear, ask the user to provide them.