Skip to content

Instantly share code, notes, and snippets.

@adityajn105
Last active September 4, 2021 08:08
Show Gist options
  • Save adityajn105/4dede6d53e0d142946b130e63995aebb to your computer and use it in GitHub Desktop.
Save adityajn105/4dede6d53e0d142946b130e63995aebb to your computer and use it in GitHub Desktop.
Basic SQL and Mongo DB
SQL COMMANDS TYPE
1. DDL (Data Definition Language)
Data Definition Language, DDL, is the part of SQL that allows a database user to create and restructure database objects,
such as the creation or the deletion of a table.
CREATE TABLE, ALTER TABLE, DROP TABLE, CREATE INDEX, ALTER INDEX, DROP INDEX, CREATE VIEW, DROP VIEW
2. DML (Data Manipulation Language)
Data Manipulation Language, DML, is the part of SQL used to manipulate data within objects of a relational database.
INSERT, UPDATE, DELETE
3. DQL (Data Query Language)
SELECT
4. DCL (Data Control Language)
Data control commands in SQL allow you to control access to data within the database.
ALTER PASSWORD, GRANT, REVOKE, CREATE SYNONYM
5. Data administration commands
6. Transactional control commands
COMMIT Saves database transactions
ROLLBACK Undoes database transactions
_______________________________________________________________________
DDL-
1. CREATE DATABASE
mysql> CREATE DATABASE testDB;
2. DROP DATABASE
mysql> DROP DATABASE databasename;
3. CREATE TABLE
mysql> mysql> CREATE TABLE BOOKINGS(
hotel_no int(11) UNIQUE AUTO_INCREMENT,
room_no int(11) NOT NULL,
guest_no int(11),
date_from DATE DEFAULT GETDATE(),
date_to DATE,
PRIMARY KEY(hotel_no,guest_no,date_from),
FOREIGN KEY(hotel_no) REFERENCES HOTEL(hotel_no),
FOREIGN KEY(guest_no) REFERENCES GUESTS(guest_no)
);
4. DROP TABLE
mysql> DROP TABLE Shippers;
5. ALTER TABLE
a. ADD COLUMN
mysql> ALTER TABLE table_name
ADD col_name datatype;
b. DROP COLUMN
mysql> ALTER TABLE table_name
DROP COLUMN col_name;
c. MODIFY COLUMN
mysql> ALTER TABLE table_name
MODIFY COLUMN column_name datatype;
6. CREATE INDEX
Indexes are used to retrieve data from the database very fast.
The users cannot see the indexes, they are just used to speed up searches/queries.
mysql> CREATE INDEX index_name
ON table_name (column1, column2);
7. CREATE VIEW
In SQL, a view is a virtual table based on the result-set of an SQL statement.
A view contains rows and columns, just like a real table.
The fields in a view are fields from one or more real tables in the database.
You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were coming from one single table.
mysql> CREATE VIEW view_name AS
SELECT column1, column2
FROM table_name
WHERE condition;
_______________________________________________________________________________
DQL/DML -
1. SELECT
mysql> SELECT col_1,col_2 FROM table_name;
mysql> SELECT * FROM table_name;
2. WHERE with AND,OR,NOT,LIKE,BETWEEN,<=,>=,<>,IN
mysql> SELECT * FROM HOTEL
WHERE city='London' AND no=1234;
3. ORDER BY
mysql> SELECT * FROM HOTEL WHERE city='London'
ORDER BY col_1 ASC|DESC;
4. INSERT
mysql> INSERT INTO table_name (column1, column2, column3)
VALUES (value1, value2, value3);
mysql> INSERT INTO table_name
VALUES (value1, value2, value3);
5. NULL VALUES
A field with a NULL value is a field with no value. Not possible to test with comparision operators
mysql> SELECT column_names FROM table_name
WHERE col_1 IS NULL;
mysql> SELECT column_names FROM table_name
WHERE col_1 IS NOT NULL;
6. UPDATE
mysql> UPDATE table_name
SET col_1=val1, col_2=val2
WHERE condition;
7. DELETE
mysql> DELETE FROM table_name
WHERE condition;
8. LIMIT - different for mysql server,mysql,oracle
mysql> SELECT * FROM table LIMIT 5;
9. MIN and MAX
mysql> SELECT MIN(col) FROM table_name;
mysql> SELECT MAX(column_name) FROM table_name;
10. COUNT,AVG,SUM
mysql> SELECT COUNT(col) FROM table_name
mysql> SELECT AVG(col) FROM table_name
mysql> SELECT SUM(col) FROM table_name
11. LIKE
%- represent zero or more chars
_- represent 1 char
mysql> SELECT * FROM table_name
WHERE col_1 LIKE pattern;
12. IN - it is used to specify multiple values in where clause
mysql> SELECT * FROM table_name
WHERE col_1 IN (value1, value2);
mysql> SELECT * FROM table_name
WHERE col_1 IN (SELECT col from table_2);
13. JOINS
a. (INNER) JOIN -
Returns records that have matching values in both tables
mysql> SELECT t1.col_1,t2.col_2 FROM t1
INNER JOIN t2 ON t1.col_1= t2.col_2;
b. LEFT (OUTER) JOIN:
Return all records from the left table, and the matched records from the right table
mysql> SELECT t1.col_1,t2.col_2 FROM t1
LEFT JOIN t2 ON t1.col_1= t2.col_2;
c. RIGHT (OUTER) JOIN:
Return all records from the right table, and the matched records from the left table
mysql> SELECT t1.col_1,t2.col_2 FROM t1
RIGHT JOIN t2 ON t1.col_1= t2.col_2;
d. FULL (OUTER) JOIN:
Return all records when there is a match in either left or right table
mysql> SELECT t1.col_1,t2.col_2 FROM t1
FULL JOIN t2 ON t1.col_1= t2.col_2;
e. Cross JOIN:
Return Carteisan Product of both tables
mysql> SELECT * FROM table1
CROSS JOIN table2;
14. GROUP BY
mysql> SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
ORDER BY COUNT(CustomerID) DESC;
/*Use GROUP BY field1, field2 for unique combination of 2 columns*/
15. HAVING
The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate functions.
mysql> SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
HAVING COUNT(CustomerID) > 5
ORDER BY COUNT(CustomerID) DESC;
16. EXISTS
The EXISTS operator is used to test for the existence of any record in a subquery.
The EXISTS operator returns true if the subquery returns one or more records.
mysql> SELECT column_name(s)
FROM table_name
WHERE EXISTS (SELECT column_name FROM table_name WHERE condition);
17. ANY,ALL
18. COMMENTS
comments begin with --
_____________________________________________________________________________________________
EXAMPLES
1. Give city and length of city with longest and shortest city names, give result in alphabetical order if length are same
SELECT CITY,LENGTH(CITY) from STATION ORDER BY LENGTH(CITY) ASC, CITY ASC LIMIT 1;
SELECT CITY,LENGTH(CITY) from STATION ORDER BY LENGTH(CITY) DESC, CITY ASC LIMIT 1;
2. Selecting top 50 percent rows
SELECT TOP 50 PERCENT * FROM Employee;
3. Selecting top 5 rows
SELECT TOP 5 * FROM EMPLOYEE;
4. SELECT subtring from varchar
SELECT SUBSTRING(CITY,2,3) from STATION; /* this will select substring of length 3 form 2nd position */
5. LEFT or RIGHT substring
SELECT RIGHT(CITY,2) FROM STATION; /* gives ending substring of length 2 */
SELECT LEFT(CITY,3) FROM STATION; /* gives starting substring of length 2 */
6. Concat string, lower string, Group BY, 2 orders of sorting
SELECT CONCAT('There are a total of ',COUNT(Occupation),' ',lower(Occupation),'s.') FROM OCCUPATIONS
GROUP BY Occupation ORDER BY COUNT(Occupation) ASC,Occupation ASC;
7. UNION - used to concat table with same columns one upon other, subquery
SELECT xx.N,xx.T from (
SELECT N,"Root" as T from BST WHERE P IS NULL
UNION
SELECT N,"Leaf" as T from BST WHERE N NOT IN (SELECT DISTINCT P FROM BST where P IS NOT NULL)
UNION
SELECT N,"Inner" as T from BST WHERE N IN (SELECT DISTINCT P FROM BST) and P IS NOT NULL
) xx ORDER BY xx.N ASC;
8. JOIN
SELECT f.cc, f.founder, lm.lm_cnt, sm.sm_cnt, m.m_cnt, emp.emp_cnt from
(SELECT company_code as cc, founder from Company) as f
LEFT JOIN
(SELECT COUNT(DISTINCT lead_manager_code) as lm_cnt,company_code as cc from Lead_Manager GROUP BY company_code) as lm
ON f.cc=lm.cc LEFT JOIN
(SELECT COUNT(DISTINCT senior_manager_code) as sm_cnt,company_code as cc from Senior_Manager GROUP BY company_code ) as sm
ON sm.cc=f.cc LEFT JOIN
(SELECT COUNT(DISTINCT manager_code) as m_cnt,company_code as cc from Manager GROUP BY company_code ) as m
ON m.cc=f.cc LEFT JOIN
(SELECT COUNT(DISTINCT employee_code) as emp_cnt,company_code as cc from Employee GROUP BY company_code ) as emp
ON emp.cc=f.cc ORDER BY f.cc ASC;
9. Aggregation- SUM,ROUND,AVG,MAX,MIN,CEIL,TRUNCATE,FLOOR
SELECT ROUND(AVG(POPULATION),2) from CITY; /*Round upto 2 decimal points*/
SELECT MAX(POPULATION)-MIN(POPULATION) FROM CITY;
SELECT TRUNCATE(SUM(LAT_N),4) from STATION where LAT_N<137.2345 AND LAT_N>38.7880;
SELECT ROUND(SQRT(POWER(MAX(LAT_N)-MIN(LAT_N),2)+POWER(MAX(LONG_W)-MIN(LONG_W),2)),4) FROM STATION; /*Eucladuan Distance*/
10. Build temprory table - returns max salary, no of emp with max salary where salary = months*monthly_salary
SELECT sal,COUNT(id) FROM (SELECT months*salary as sal,employee_id as id FROM Employee) AS Emp
GROUP BY sal ORDER BY sal DESC LIMIT 1;
11. Get median where there are 499 entries of LAT_N
SELECT ROUND(S.LAT_N,4) FROM STATION AS S WHERE
(SELECT COUNT(LAT_N) FROM STATION WHERE LAT_N < S.LAT_N)=(SELECT COUNT(LAT_N) FROM STATION WHERE LAT_N > S.LAT_N);
12. WHEN, CASE
SELECT CASE
WHEN A + B > C THEN
CASE
WHEN A = B AND B = C THEN 'Equilateral'
WHEN A = B OR B = C OR A = C THEN 'Isosceles'
WHEN A != B OR B != C OR A != C THEN 'Scalene'
END
ELSE
'Not A Triangle'
END FROM TRIANGLES;
13. JOIN
LEFT JOIN - select all rows of left table
RIGHT JOIN - select all rows of right table
INNER JOIN - select all rows common with ON Condition
FULL JOIN - select all left and right table rows (mysql do not support)
Gives CONTINENT and avg pop of cities in that continent
SELECT t2.CONTINENT, FLOOR(SUM(t1.ALLPOP)/SUM(t1.TOT)) as AVGPOP
FROM (SELECT COUNT(NAME) as TOT, SUM(POPULATION) as ALLPOP, COUNTRYCODE as CODE FROM CITY GROUP BY COUNTRYCODE) AS t1
INNER JOIN
COUNTRY as t2
ON t1.CODE=t2.CODE
GROUP BY t2.CONTINENT
ORDER BY AVGPOP ASC;
14. JOIN + CASE + BETWEEN + 2 order by
Give list of students with grade,marks and name for grade>8
SELECT CASE
WHEN t2.Grade >= 8 THEN t1.Name
WHEN t2.Grade >= 8 THEN NULL
END, t2.Grade, t1.Marks FROM Students AS t1
LEFT JOIN
Grades AS t2 ON t1.Marks BETWEEN t2.Min_Mark AND t2.Max_Mark
ORDER BY t2.Grade DESC,t1.Name ASC;
_________________________________________________________________________________________
TSQL
```
window_function_name ( expression ) OVER (
partition_clause
order_clause
frame_clause
)
window_function_name : The name of the supported window function.
Value: FIRST_VALUE(), LAST_VALUE(), LAG(), LEAD()
Ranking: DENSE_RANK(), RANK(), ROW_NUMBER(), PERCENT_RANK(), NTILE()
Aggregate: AVG(), COUNT(), MAX(), MIN(), SUM()
expression : The target expression or column on which the window function operates.
PARTITION BY : The partition clause divides the rows into partitions to which the window function applies. If the PARTITION BY clause is not specified, then the whole result set is treated as a single partition.
PARTITION BY expr1, expr2, ...
ORDER BY : The order clause specifies the orders of rows in a partition on which the window function operates
ORDER BY expression [ASC | DESC] [NULL {FIRST| LAST}]
```
1. ROW_NUMBER() : The ROW_NUMBER function simply returns the row number of the sorted records starting with 1
2. RANK() : The RANK function is used to retrieve ranked rows based on the condition of the ORDER BY clause. If there is a tie between N previous records for the value in the ORDER BY column, the RANK functions skips the next N-1 positions before incrementing the counter.
3. DENSE_RANK() : The DENSE_RANK function is similar to RANK function however the DENSE_RANK function does not skip any ranks if there is a tie between the ranks of the preceding records.
4. PERCENT_RANK() : The PERCENT_RANK() is a window function that calculates the percentile ranking of rows in a result set.
5. NTILE() : The SQL NTILE() is a window function that allows you to break the result set into a specified number of approximately equal groups, or buckets. It assigns each group a bucket number starting from one
_________________________________________________________________________________________
Nice Problems
*. https://www.hackerrank.com/challenges/average-population-of-each-continent/problem
*. https://www.hackerrank.com/challenges/harry-potter-and-wands/problem
*. https://www.hackerrank.com/challenges/challenges/problem
*. https://www.hackerrank.com/challenges/symmetric-pairs/problem
_________________________________________________________________________________________
Normalized form
why?
If data is unorganized in database there can be different problems or anomaly can happen in database.
Normalization?
Normalization is a process of organizing the data in database to avoid :
a. Data Redundancy
b. Insertion anomaly
c. Update anomaly
d. Deletion anomaly
1. INVOICE ( c_no, c_name, c_addr, ISBN, title, auth_name, auth_country, qty, price )
Types?
a. 1NF
A relation is in 1NF if it has no repeating groups.
Here 1 has repeating groups.
1. INVOICE ( c_no(p), ISBN )
2. CUSTOMER ( c_no(p), c_name, c_addr )
3. CUSTOMER_BOOK ( c_no(p), ISBN(p), title, auth_name, auth_country, qty, price )
Repearting groups lead to inconsistent data.
b. 2NF
2NF = 1NF + no partial dependency ( one particular set of non-key fields or a non-key field depend on only one of the key)
Here 3 has partial dependency. qty depend on both rest depend on only ISBN
3. CUSTOMER_BOOK ( c_no(p), ISBN(p), qty )
4. BOOK ( ISBN(p), title, auth_name, auth_country, price )
c. 3NF
3NF = 2NF + no transitive dependency (relation among non key fields)
Here 4 has transitive dependency. auth_name,auth_country depend on each other
4. BOOK ( ISBN(p), title, auth_name, price )
5. AUTHOR (auth_name(p), auth_country)
//When you use field name in value put $ in front of it
//eg. "{$sum:"$likes"}" here likes is a field so in value field $ is put in front of it
Mongo Queries
1. create a db
> use db
2. drop db
> db.dropDatabase()
3. show all dbs
> show dbs
4. create collections
> db.createCollection("post")
5. show collections
> show collections
6. drop collections
> db.post.drop()
7. insert document
> db.post.insert({
"title":"MongoDb Overview" ,
"description":"MongoDb is a noSql Database",
"by":"aditya",
"url":"http://www.adityajain.com" ,
"tags":['mongo','nosql','aditya'],
"likes":100,
"comments":[{ "user":"user1", "message":"good content", "dateCreated":new Date(2016,1,20,2,15), "like":0}]
})
8. find document
find-> find( { CRITERIA } ,{ PROJECTION } );
a. find all
> db.post.find()
b. final specific
> db.post.find({by:"aditya"},{_id:0,by:1,likes:1})
9. and, or, comparsion ops
a. and
> db.post.find({
$and: [
{key1: value1}, {key2:value2}
]
})
b. or
> db.post.find({
$or: [
{key1: value1}, {key2:value2}
]
})
c. $gt, $lt, $lte, $gte, $ne
> db.post.find({
likes:{$gt:1000}
})
10. update document
update-> update( {CRITERIA} , {UPDATED_DATA} , {multi:false} )
FIELD update
a. $inc -> inc value
{ $inc: { <field1>: <amount1>, <field2>: <amount2>, ... } }
b. $mul -> multiple value by number
{ $mul : { field: <number> } }
c. $rename-> rename field name
{ $rename: { 'nickname': 'alias', 'cell': 'mobile' } }
d. $set-> change field value
{ $set: { <field1>: <value1>,...} }
e. $unset-> remove field
{ $unset: { <field1>: "", ... } }
Array updates
a. $ -> The positional $ operator identifies an element in an array to update without explicitly specifying the position of
the element in the array.
> db.collection.update(
{ <array>: value ... },
{ <update operator>: { "<array>.$" : value } }
)
b. $pop -> The $pop operator removes the first or last element of an array. -1 first and 1 last
{ $pop: { <field>: <-1 | 1>, ... } }
c. $pull -> remove specific entries
{ $pull: { <field1>: <value|condition>, <field2>: <value|condition>, ... } }
d. $push
{ $push: { <field1>: <value1>, ... } }
e. $each
{ $push: { <field>: { $each: [ <value1>, <value2> ... ] } } }
> db.post.update(
{'title':'MongoDB Overview'},
{$set:{'title':'New MongoDB Tutorial'}},
{multi:true}
)
> db.post.update(
{"by":"arya"},
{$inc:{"likes":1}}
)
> db.post.update(
{"by":"arya"},
{$push:
{"tags":"dynamic webpages"}
}
)
> db.post.update(
{"comments.user":"aditya"},
{$set:{ "comments.$.user":"jack" }}
)
11. delete Document
delete-> delete( {CRITERIA} , JUST_ONE ) #JUST_ONE is true by default
> db.post.remove(
{"by":"arya"},true
)
12. limit and skip records
> db.post.find().limit(5)
> db.post.find().skip(5)
13. sort records
sort-> sort( {KEY:1} )
> db.mycol.find(
{},
{"title":1,_id:0}
).sort(
{"title":-1}
)
14. indexing
> db.mycol.ensureIndex({"title":1})
15. aggregate
Pipelinig stages appear
a. $group
{ $group: { _id: <expression>, <field1>: { <accumulator1> : <expression1> }, ... } }
field is name of new field containing accumulated result
accumulator
$sum -> returns sum of numeric value
$avg -> returns avg of numeric value
$first-> return value from first document of each group
$max
$min
b. $project
{ $project: { <field>: <1|0> } }
c. $match
{ $match: { <query> } }
d. $sort
Sorts all input documents and returns them to the pipeline in sorted order.
$sort takes a document that specifies the field(s) to sort by and the respective sort order.
{ $sort: { <field1>: <sort order>, <field2>: <sort order> ... } }
e. $limit
limit documents at this stage
{ $limit: <positive integer> }
f. $unwind
Deconstructs an array field from the input documents to output a document for each element.
Each output document is the input document with the value of the array field replaced by the element.
#use $ for field path
{ $unwind: <field path> }
g. $skip
skip number of documents
{ $skip: <positive integer> }
h. $count
{ $count: <string> }
i. $lookup
Perform left outer join to another collection in same database
> db.post.aggregate([
{$group:{_id:"$by",tutorial_count:{$sum:1}}}
])
> >db.post.aggregate([
{$unwind:"$comments"},
{$match:{"comments.user":"sam"}},
{$project:{title:1,_id:0,"comments.message":1}}
])
> db.post.aggregate([
{$unwind:"$comments"},
{$match:{"comments.dateCreated":{$gt:new Date(2015,11,12,3,34)}}},
{$project:{title:1,"comments.message":1,"comments.dateCreated":1,"comments.user":1}}
]).pretty()
> db.post.aggregate([
{$unwind:"$comments"},
{$group:{_id:"$comments.user",count:{$sum:1}}},
{$sort:{count:-1}}
])
16. Map Reduce
Data processing Paradigm for condensing large volume of data into useful aggregate results.
MongoDb uses map reduce for processsing large set of data.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment