db-troubleshooting.md

Everything I knew to do here, I learned from this pluralsight course:https://app.pluralsight.com/library/courses/postgresql-playbook/table-of-contents and this blog post: http://www.craigkerstiens.com/2012/10/01/understanding-postgres-performance/

I used cg run -e <env> bash and then psql $DATABASE_URL to connect to the DB

I used the postgres query here to find big tables with low rates of using indexes, see “Understanding Index Usage” : http://www.craigkerstiens.com/2012/10/01/understanding-postgres-performance/

googledrivelti=> SELECT
googledrivelti->   relname,
googledrivelti->   100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used,
googledrivelti->   n_live_tup rows_in_table
googledrivelti-> FROM
googledrivelti->   pg_stat_user_tables
googledrivelti-> WHERE
googledrivelti->     seq_scan + idx_scan > 0
googledrivelti-> ORDER BY
googledrivelti->   n_live_tup DESC;
          relname           | percent_of_times_index_used | rows_in_table
----------------------------+-----------------------------+---------------
 google_permissions         |                          99 |       5576237
 credentials                |                          99 |       2994397
 canvas_users_credentials   |                          99 |       2970155
 share_items                |                          93 |       2850149
 canvas_users               |                          99 |       2597312
 cloud_assignment_instances |                          99 |       2148947
 collaborators              |                          99 |         44248
 collaborations             |                          99 |         20144
 lti_credentials            |                         100 |         10173
 schema_migrations          |                           0 |            41
 delayed_jobs               |                          12 |             2
 failed_jobs                |                           0 |             0
(12 rows)

I went to kibana, and searched the logs for the name of that table, “share_items”.

I excluded anything with a method of “GET” or a @stream of “stdout” to exclude request logs, leaving just db related logs.

I scroll through the list, and within the first 50 spotted many of these two queries.

A Load running for over 10ms is a problem.

CloudAssignmentTemplate Load (344.6ms) SELECT "share_items".* FROM "share_items" WHERE "share_items"."type" IN ('CloudAssignmentTemplate') AND "share_items"."resource_id" = $1 AND "share_items"."context_id" = $2 LIMIT 1 [["resource_id", "9ee978b4-be4e-45bb-8bbf-2de75334a922"], ["context_id", "00c4d39b049b685fa3237d071a9721ea2f80f2ee”]]

ShareItem Load (226.6ms) SELECT "share_items".* FROM "share_items" WHERE "share_items"."resource_id" = $1 AND "share_items"."context_id" = $2 LIMIT 1 [["resource_id", "4c7cc5f2-805f-472f-acc6-991f92585550"], ["context_id", "4aeda0c17e715f21bcbfade079ea53fe4e46c8e2”]]

An EXPLAIN ANALYZE SELECT * FROM "share_items" WHERE "share_items"."type" IN ('CloudAssignmentTemplate') AND "share_items"."resource_id" = '9ee978b4-be4e-45bb-8bbf-2de75334a922' AND "share_items"."context_id" = '00c4d39b049b685fa3237d071a9721ea2f80f2ee' LIMIT 1;

 Limit  (cost=0.00..105985.88 rows=1 width=147) (actual time=290.635..290.636 rows=1 loops=1)
   ->  Seq Scan on share_items  (cost=0.00..105985.88 rows=1 width=147) (actual time=290.631..290.631 rows=1 loops=1)
         Filter: (((type)::text = 'CloudAssignmentTemplate'::text) AND (resource_id = '9ee978b4-be4e-45bb-8bbf-2de75334a922'::uuid) AND ((context_id)::text = '00c4d39b049b685fa3237d071a9721ea2f80f2ee'::text))
         Rows Removed by Filter: 2826120
 Planning time: 0.117 ms
 Execution time: 291.073 ms

This suggests the need for an index on type, resoure_id, and context_id. Or perhaps a conditional index on resource_id and context_id where type in ('CloudAssignmentTemplate')? only 3.5% of rows have type='CloudAssignmentTemplate'. only 0.007% have a non-null type that is something else. All rows have a non-null resource_id and context_id. I asked about this in #devops, we'll see what the response is.

An EXPLAIN ANALYZE SELECT * FROM "share_items" WHERE "share_items"."resource_id" = '4c7cc5f2-805f-472f-acc6-991f92585550' AND "share_items"."context_id" = '4aeda0c17e715f21bcbfade079ea53fe4e46c8e2' LIMIT 1;

returns

 Limit  (cost=0.00..98827.18 rows=1 width=147) (actual time=186.026..186.027 rows=1 loops=1)
   ->  Seq Scan on share_items  (cost=0.00..98827.18 rows=1 width=147) (actual time=186.022..186.022 rows=1 loops=1)
         Filter: ((resource_id = '4c7cc5f2-805f-472f-acc6-991f92585550'::uuid) AND ((context_id)::text = '4aeda0c17e715f21bcbfade079ea53fe4e46c8e2'::text))
         Rows Removed by Filter: 1042038
 Planning time: 0.104 ms
 Execution time: 186.053 ms
(6 rows)

This suggests the need for an index on resource_id & context_id.

I did a describe on the table, just to get my bearings

googledrivelti=> \d share_items
                                       Table "public.share_items"
     Column     |            Type             |                        Modifiers
----------------+-----------------------------+----------------------------------------------------------
 id             | integer                     | not null default nextval('share_items_id_seq'::regclass)
 canvas_user_id | integer                     | not null
 context_id     | character varying           | not null
 google_id      | character varying           | not null
 created_at     | timestamp without time zone | not null
 updated_at     | timestamp without time zone | not null
 type           | character varying           |
 resource_id    | uuid                        | default gen_random_uuid()
Indexes:
    "share_items_pkey" PRIMARY KEY, btree (id)
    "index_share_items_on_canvas_user_id" btree (canvas_user_id)
Foreign-key constraints:
    "fk_rails_1a94f3144b" FOREIGN KEY (canvas_user_id) REFERENCES canvas_users(id)
Referenced by:
    TABLE "cloud_assignment_instances" CONSTRAINT "fk_rails_75cd0bb701" FOREIGN KEY (share_item_id) REFERENCES share_items(id)
    TABLE "google_permissions" CONSTRAINT "fk_rails_bf348d4eae" FOREIGN KEY (share_item_id) REFERENCES share_items(id)
    TABLE "cloud_assignment_instances" CONSTRAINT "fk_rails_fa5578b35a" FOREIGN KEY (template_id) REFERENCES share_items(id)
    ```

crwang/db-troubleshooting.md