{
  "_description: Name of City, or Town": {
    "State": "The Indian State under which this City/Town exists.",
    "GeoCode": [
      "Latitude",
      "Longitude"
    ],
    "PinCodes": [
      "All",
      "The",
id | ngram | freq
---|---|---
106951 | the aggregate average advances | 3
109565 | This Petition | 2
105276 | M/s Gillette India Ltd. | 11
105869 | that expenditure | 3
107390 | the trading loss | 2
110562 | a shareholder | 3
109200 | the Co-ordinate Bench | 3
104688 | the direction | 3
106465 | his business | 3
| ngram |
|---|
| service charges |
| labour charges |
| late payment charges |
| the additional discount charges |
| development charges |
| additional discount charges |
| advertisement charges |
| Process Fee Charges |
| metal labour charges |
| ngram |
|---|
| service charges |
| chargeable interest |
| labour charges |
| late payment charges |
| the chargeable accounting periods |
| the additional discount charges |
| development charges |
| additional discount charges |
| advertisement charges |
def prompts(cursor, search_string):
    '''
    In: cursor module, search_string a tsquery acceptable search string
    Out: Prompts dict
    '''
    default_data = {'prompts': [{'id': 0, 'key': 'No prompts available'}]}
    prompts_data = {}
    query_get_prompts = """
import psycopg2
from collections import Counter

query_populate = """
INSERT INTO <schema_name>.prompts (ngram, freq)
VALUES (%(ngram)s, %(freq)s)
ON CONFLICT (ngram) DO UPDATE
SET freq = prompts.freq + %(freq)s
WHERE prompts.ngram = %(ngram)s
"""
import spacy

# Load the (small) model
_nlp = spacy.load('en_core_web_sm')
# Adjust the max length of the document
# _nlp.max_length = 1000000
# text contains the data that we want to extract phrases from, in string/buffer format
all_prompts = []
-- Create the materialized view
CREATE MATERIALIZED VIEW <schema_name>.mv_prompts AS
SELECT id, ngram, freq
FROM <schema_name>.prompts
WHERE add_conditions_here
ORDER BY freq DESC;
-- Create a search index using GIN
CREATE INDEX prompts_ngram_idx
ON <schema_name>.mv_prompts
DELETE FROM <table> a USING (
    SELECT MIN(ctid) AS ctid, var1, var2
    FROM <same_table> b
    GROUP BY 2, 3 HAVING COUNT(*) > 1
) b
WHERE a.var1 = b.var1
AND a.var2 = b.var2
AND a.ctid <> b.ctid
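The idea behind the statement above (keep one physical row per duplicated `(var1, var2)` pair and delete the rest) can be sketched in a form that runs anywhere. This is a rough analogue only: it assumes SQLite, where the implicit `rowid` plays the role that `ctid` plays in PostgreSQL, and it uses a `NOT IN` subquery instead of `DELETE ... USING`. The `items` table, the `dedupe` and `demo_dedupe` helpers, and the sample rows are hypothetical.

```python
import sqlite3


def dedupe(conn):
    """Delete every row except the lowest-rowid one in each (var1, var2) group."""
    with conn:
        cur = conn.execute(
            """
            DELETE FROM items
            WHERE rowid NOT IN (
                SELECT MIN(rowid) FROM items GROUP BY var1, var2
            )
            """
        )
    return cur.rowcount  # number of rows removed


def demo_dedupe():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (var1 TEXT, var2 TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [("a", "x"), ("a", "x"), ("b", "y"), ("a", "x"), ("b", "z")],
    )
    deleted = dedupe(conn)
    remaining = conn.execute(
        "SELECT var1, var2 FROM items ORDER BY var1, var2"
    ).fetchall()
    conn.close()
    return deleted, remaining


if __name__ == "__main__":
    print(demo_dedupe())
```

Running it on a scratch copy first is a cheap way to confirm which rows survive before issuing the real `DELETE` against a production table.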
DO $$
DECLARE r RECORD;
BEGIN
    FOR r IN <query that will return a record set for r to iterate over>
    LOOP
        BEGIN
            RAISE NOTICE 'Dealing with ID: %', r.id;
            UPDATE <table> AS t1
            SET var1 = t2.var1
            FROM <table2> t2