Skip to content

Instantly share code, notes, and snippets.

@boonhapus
Last active October 30, 2024 17:58
Show Gist options
  • Save boonhapus/d5cd967fbffff058d96306f0665da02a to your computer and use it in GitHub Desktop.
Save boonhapus/d5cd967fbffff058d96306f0665da02a to your computer and use it in GitHub Desktop.
Parse ThoughtSpot Query comment tag

Example Query

Here is an example query that ThoughtSpot generated, with the attached query comment that the pyUDF is parsing.

Search in ThoughtSpot

response size
incident id

Generated SQL

SELECT 
  "ta_1"."INCIDENT_ID" "ca_1", 
  CASE
    WHEN sum("ta_1"."RESPONSE_SIZE") IS NOT NULL THEN sum("ta_1"."RESPONSE_SIZE")
    ELSE 0
  END "ca_2"
FROM "GTM_DB"."CS_TOOLS_V150"."TS_BI_SERVER" "ta_1"
GROUP BY "ca_1"
LIMIT 5000
/* type: QUESTION_ANSWER_BOOK,
 id: {"id":"5075a96a-f888-4126-93e5-da933c3d6405","name":"Untitled","owner":"5075a96a-f888-4126-93e5-da933c3d6405","type":"ImmutableAnswerSpec"},
 vizId: 54051fe3-03e5-4903-bfa2-06f1a4a84793,
 userId: 27741c8c-cc2c-44b9-8a0f-803e770e319e,
 requestId: 49feb89b-36fe-48fc-a3ea-6d3ecfb27ea5,
 task: SEARCH_DATA*/

Field Description

comment description
requestId A uuid for the event in ThoughtSpot's query data manager.
task The activity that ThoughtSpot was taking when firing the query.
type The ThoughtSpot object type which fired this query.
vizId The specific visualization object which fired this query.
id Additional information about the object which fired the query.
userId The ThoughtSpot user which fired this query.
CREATE OR REPLACE FUNCTION ts_query_tag_comment_json(query STRING)
returns variant not null
language python
runtime_version = '3.8'
handler = 'extract_query_tag_as_json'
AS
$$
from __future__ import annotations
import re
# ThoughtSpot query tag always appears at the end of the SQL text,
# wrapped in a multi-line quote.
TAG_REGEX = re.compile(r"(?P<query_tag_comment>\/\*(?:.|[\n])+\*/)$")
def extract_query_tag_as_json(query: str) -> dict[str, Any]:
"""Convert JSON-like query tag to a VARIANT."""
match = TAG_REGEX.search(query)
if match is None:
return {}
comment = match.group("query_tag_comment")
trimmed = comment[3:-2]
variant = {}
for element in trimmed.split(",\n"):
key, val = map(str.strip, element.split(":", 1))
variant[key] = val
return variant
$$
SELECT
QUERY_ID
, TS_QUERY_TAG_COMMENT_JSON(QUERY_TEXT):requestId :: STRING AS TS_INTERNAL_ID
, TS_QUERY_TAG_COMMENT_JSON(QUERY_TEXT):task :: STRING AS TS_ACTIVITY_TYPE
, TS_QUERY_TAG_COMMENT_JSON(QUERY_TEXT):type :: STRING AS TS_METADATA_TYPE
, TS_QUERY_TAG_COMMENT_JSON(QUERY_TEXT):vizId :: STRING AS TS_METADATA_VIZ_GUID
, TS_QUERY_TAG_COMMENT_JSON(QUERY_TEXT):id :: STRING AS TS_METADATA_INFO
, TS_QUERY_TAG_COMMENT_JSON(QUERY_TEXT):userId :: STRING AS TS_USER_GUID
FROM
TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
WHERE
QUERY_TYPE = 'SELECT'
AND EXECUTION_STATUS = 'SUCCESS'
AND TS_INTERNAL_ID IS NOT NULL
ORDER BY
START_TIME
LIMIT 10
;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment