# az vm create command to create a Linux VM:
az vm create \
  --resource-group learn-85594f60-ef0f-4f1e-ad12-08bf2ea66630 \
  --name myvmanji \
  --image UbuntuLTS \
  --admin-username azureuser \
  --generate-ssh-keys

# Run the following az vm extension set command to configure Nginx on your VM:
https://app.pluralsight.com/library/courses/preparing-google-cloud-professional-data-engineer-exam-1/recommended-courses ---> ML
https://app.pluralsight.com/profile/author/vitthal-srinivasan
https://app.pluralsight.com/profile/author/james-wilson
https://app.pluralsight.com/profile/author/janani-ravi
Table :
====================
CREATE EXTERNAL TABLE tweets (
  createddate string,
  geolocation string,
  tweetmessage string,
  user_name struct<geoenabled:boolean, id:int, name:string, screenname:string, userlocation:string>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 'gs://iwinner-data/json_data';
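
The JSON SerDe reads one JSON object per line, with keys matching the column names in the table definition. The sketch below shows one way a matching input line might be produced. All field values are hypothetical placeholders, not data from the actual bucket.

```python
import json

# Hypothetical record shaped like the tweets table; every value is a placeholder.
record = {
    "createddate": "2020-01-01 00:00:00",
    "geolocation": "unknown",
    "tweetmessage": "hello",
    "user_name": {  # maps to the struct<...> column
        "geoenabled": False,
        "id": 1,
        "name": "Example User",
        "screenname": "example_user",
        "userlocation": "nowhere",
    },
}

# json.dumps emits a single line by default, which is the layout the SerDe expects.
line = json.dumps(record)
```

Writing one such line per tweet into a file under the table's LOCATION would give Hive rows to parse.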
Query :
Database pioneer and Turing Award winner Jim Gray is credited with a famous adage: when you have lots of data, bring the [machine learning] computations to the data, rather than the data to the computations.
According to him, nothing is closer to the data than the database, so the computations should be done inside the database.
Now all major cloud and database vendors are:
🔸 offering SQL data pipelines in the data warehouse
🔸 expanding their in-database ML offerings
Running ML and analytics inside the data warehouse can be cheaper and more efficient, because the data does not have to be moved to a separate system first.
C:\Users\anjai>gcloud config get-value project
iwinner-data
Updates are available for some Cloud SDK components. To install them,
please run:
$ gcloud components update
package com.mts.matrix.spark.stream

import com.mts.matrix.spark.utils.SparkUtils
import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.sql.functions.{col, from_json, lit}
import org.apache.spark.sql.streaming.{StreamingQuery, Trigger}
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}
import org.apache.spark.sql.SparkSession

def getSparkSessionMongoDbConfig(parms: Map[String, String]): SparkSession = {
  // Build a local SparkSession with the MongoDB connector input and output URIs set.
  val spark = SparkSession
    .builder
    .appName(parms("JOB_NAME"))
    .master("local[*]")
    .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/retaildb.orders?authSource=admin")
    .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/retaildb.orders?authSource=admin")
    .getOrCreate()
  // Read the S3_OPERATION_ENABLE job parameter as a boolean.
  val isS3Enable = parms("S3_OPERATION_ENABLE").toBoolean
####################################################################################
UDF VS UDAF VS UDTF
1. UDF: A UDF works on a single row of a table and produces a single row as output, so there is a one-to-one relationship between the function's input and output, e.g. Hive's built-in TRIM() function.
Extends UDF.
We have to overload a method called evaluate() inside our class.
2. UDAF: A user-defined aggregate function works on more than one row and gives a single row as output, e.g. Hive's built-in MAX() or COUNT() functions.
Extends UDAF.
We need to implement five methods: init(), iterate(), terminatePartial(), merge() and terminate().
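
The difference between the two contracts can be sketched outside Hive. The following is a plain-Python analogy, not Hive's Java API: `trim_udf`, `MaxUdaf` and `run_max` are hypothetical names, and the partition-then-merge driver is an assumption about how the five methods are typically called.

```python
def trim_udf(value):
    """UDF-style evaluate(): one input row in, one output value out."""
    return None if value is None else value.strip()


class MaxUdaf:
    """UDAF-style evaluator mirroring the five-method lifecycle."""

    def init(self):
        # Reset the aggregation state before any rows are seen.
        self.current = None

    def iterate(self, value):
        # Fold one input row into the running state; None (NULL) is skipped.
        if value is not None and (self.current is None or value > self.current):
            self.current = value

    def terminate_partial(self):
        # Return the partial result computed over one split of the data.
        return self.current

    def merge(self, partial):
        # Combine a partial result produced by another evaluator.
        self.iterate(partial)

    def terminate(self):
        # Return the final aggregated value.
        return self.current


def run_max(partitions):
    """Aggregate each partition separately, then merge the partial results."""
    final = MaxUdaf()
    final.init()
    for rows in partitions:
        partial = MaxUdaf()
        partial.init()
        for row in rows:
            partial.iterate(row)
        final.merge(partial.terminate_partial())
    return final.terminate()


trimmed = [trim_udf(v) for v in ["  a ", "b  "]]  # two rows in, two rows out
largest = run_max([[3, 9, 1], [7, None], []])     # many rows in, one value out
```

For MAX the merge step can reuse iterate(), because a partial maximum is itself just another candidate value. Aggregates such as an average would need a richer partial state (for example a sum and a count).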
MongoDB :
Host: localhost
Port: 27017
username: admin
password: admin
Databasename: meetup
collectionName(Table_Name): meetup_rsvp_tbl
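
These settings can be combined into a MongoDB connection string. The sketch below uses only the Python standard library. The helper name `build_mongo_uri` is hypothetical, and `authSource=admin` is an assumption that the admin user is defined in the `admin` database.

```python
from urllib.parse import quote_plus


def build_mongo_uri(host, port, username, password, database, auth_source="admin"):
    """Return a mongodb:// URI with percent-encoded credentials."""
    user = quote_plus(username)    # escape characters such as '@' or ':'
    secret = quote_plus(password)
    return f"mongodb://{user}:{secret}@{host}:{port}/{database}?authSource={auth_source}"


# Values taken from the settings listed above.
uri = build_mongo_uri("localhost", 27017, "admin", "admin", "meetup")
collection_name = "meetup_rsvp_tbl"
```

The resulting string can be handed to a MongoDB client or driver, with `collection_name` selecting the collection inside the `meetup` database.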