| package com.mts.matrix.spark.stream | |
| import com.mts.matrix.spark.utils.SparkUtils | |
| import org.apache.spark.sql.{DataFrame, SaveMode} | |
| import org.apache.spark.sql.functions.{col, lit, from_json} | |
| import org.apache.spark.sql.streaming.{StreamingQuery, Trigger} | |
| import org.apache.spark.sql.types.{IntegerType, StringType, StructType} | |
| import org.apache.spark.sql.SparkSession | |
| // Builds a local SparkSession wired to the MongoDB connector's input and output URIs. | |
| def getSparkSessionMongoDbConfig(parms: Map[String, String]): SparkSession = { | |
| val spark = SparkSession | |
| .builder | |
| .appName(parms("JOB_NAME")) | |
| .master("local[*]") | |
| .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/retaildb.orders?authSource=admin") | |
| .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/retaildb.orders?authSource=admin") | |
| .getOrCreate() | |
| // Flag read from the job parameters; throws if the key is missing or not a boolean. | |
| val isS3Enable = parms("S3_OPERATION_ENABLE").toBoolean |
| #################################################################################### | |
| UDF vs. UDAF vs. UDTF | |
| 1. UDF: A user-defined function works on a single row and produces a single row as output, so input and output are in a one-to-one relationship, e.g. Hive's built-in TRIM() function. | |
| The class extends UDF. | |
| We have to implement one or more evaluate() methods inside our class. | |
| 2. UDAF: A user-defined aggregate function works on more than one row and produces a single row as output, e.g. Hive's built-in MAX() or COUNT() functions. | |
| The class extends UDAF. | |
| We need to implement five methods: init(), iterate(), terminatePartial(), merge() and terminate(). |
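The order in which the five UDAF methods are called is easier to see in code. The following is only a plain-Scala sketch of that lifecycle for a MAX-style aggregate; it does not use any Hive classes, and `MaxEvaluator` and `UdafLifecycleDemo` are hypothetical names chosen for illustration. Each inner sequence stands in for the rows seen by one task.

```scala
// Plain-Scala sketch of the UDAF evaluator lifecycle; no Hive classes are used.
// MaxEvaluator is a hypothetical name, not part of the Hive API.
final class MaxEvaluator {
  private var current: Option[Int] = None

  // init(): reset the aggregation state.
  def init(): Unit = {
    current = None
  }

  // iterate(): fold one input row into the state.
  def iterate(value: Int): Boolean = {
    current = Some(current.fold(value)(math.max(_, value)))
    true
  }

  // terminatePartial(): hand the partial result to the next stage.
  def terminatePartial(): Option[Int] = current

  // merge(): combine a partial result produced by another evaluator.
  def merge(partial: Option[Int]): Boolean = {
    partial.foreach(v => iterate(v))
    true
  }

  // terminate(): return the final result.
  def terminate(): Option[Int] = current
}

object UdafLifecycleDemo {
  // Runs one evaluator per simulated partition, then merges the partials.
  def maxOf(partitions: Seq[Seq[Int]]): Option[Int] = {
    val partials = partitions.map { rows =>
      val evaluator = new MaxEvaluator
      evaluator.init()
      rows.foreach(v => evaluator.iterate(v))
      evaluator.terminatePartial()
    }
    val reducer = new MaxEvaluator
    reducer.init()
    partials.foreach(p => reducer.merge(p))
    reducer.terminate()
  }
}
```

Calling `UdafLifecycleDemo.maxOf(Seq(Seq(3, 9), Seq(7)))` walks through init, iterate and terminatePartial once per inner sequence, then merge and terminate on the combined state.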
| MongoDB: | |
| Host: localhost | |
| Port: 27017 | |
| Username: admin | |
| Password: admin | |
| Database name: meetup | |
| Collection name (table name): meetup_rsvp_tbl |
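The settings above can be combined into a single connection URI of the `mongodb://host/database.collection` form used with the Spark connector. This is a minimal sketch, assuming the credentials authenticate against the `admin` database (`authSource=admin` is an assumption, not something the notes state); `MongoSettings` and `MongoUri` are hypothetical helper names.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// Hypothetical holder for the connection settings listed in the notes.
final case class MongoSettings(
  host: String,
  port: Int,
  user: String,
  password: String,
  database: String,
  collection: String
)

object MongoUri {
  // Percent-encode credentials so reserved characters such as '@' or ':' stay valid.
  // URLEncoder emits '+' for spaces, so that is rewritten to "%20".
  private def enc(s: String): String =
    URLEncoder.encode(s, StandardCharsets.UTF_8.name()).replace("+", "%20")

  // authSource defaults to "admin"; this default is an assumption.
  def build(s: MongoSettings, authSource: String = "admin"): String =
    s"mongodb://${enc(s.user)}:${enc(s.password)}@${s.host}:${s.port}/" +
      s"${s.database}.${s.collection}?authSource=$authSource"
}
```

With the values from the notes, `MongoUri.build(MongoSettings("localhost", 27017, "admin", "admin", "meetup", "meetup_rsvp_tbl"))` yields `mongodb://admin:admin@localhost:27017/meetup.meetup_rsvp_tbl?authSource=admin`.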
| Write to Cassandra using foreachBatch() in Scala | |
| import org.apache.spark.sql._ | |
| import org.apache.spark.sql.cassandra._ | |
| import com.datastax.spark.connector.cql.CassandraConnectorConf | |
| import com.datastax.spark.connector.rdd.ReadConf | |
| import com.datastax.spark.connector._ | |
| val host = "<ip address>" |
| Spark Cassandra Filter | |
| CREATE TABLE data_storage.stack_overflow_test_table ( | |
| id int, | |
| text_id text, | |
| clustering date, | |
| some_other text, | |
| PRIMARY KEY (( id, text_id ), clustering) | |
| ) |
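In the table above, `(id, text_id)` is the composite partition key and `clustering` is the clustering column. Cassandra can route a query to a single partition only when every partition key column is restricted by equality. The sketch below encodes just that rule in plain Scala, with no connector involved; `CassandraFilterCheck` and its members are hypothetical names, and the check deliberately ignores `IN` restrictions and secondary indexes.

```scala
// Plain-Scala sketch of the partition-key rule for the table above.
// All names here are hypothetical; no Cassandra or Spark classes are used.
object CassandraFilterCheck {
  // Column roles taken from the PRIMARY KEY (( id, text_id ), clustering) definition.
  val partitionKey: Set[String] = Set("id", "text_id")
  val clusteringColumns: Set[String] = Set("clustering")

  // True when every partition key column has an equality filter,
  // so the query can be served from a single partition.
  def targetsSinglePartition(equalityColumns: Set[String]): Boolean =
    partitionKey.subsetOf(equalityColumns)
}
```

For example, filtering on `id` and `text_id` passes the check, while filtering on `id` alone or only on `some_other` does not, which in practice means a scan across partitions.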
| https://www.guru99.com/deep-learning-libraries.html | |
| ################################################################ | |
| TensorFlow | |
| Created by Google. | |
| Version 1.0 was released in February 2017. | |
| TensorFlow is an open-source software library for dataflow programming across a range of tasks. | |
| It is a symbolic math library that is used for machine learning applications like neural networks. |
| /* | |
| * Licensed to the Apache Software Foundation (ASF) under one or more | |
| * contributor license agreements. See the NOTICE file distributed with | |
| * this work for additional information regarding copyright ownership. | |
| * The ASF licenses this file to You under the Apache License, Version 2.0 | |
| * (the "License"); you may not use this file except in compliance with | |
| * the License. You may obtain a copy of the License at | |
| * | |
| * http://www.apache.org/licenses/LICENSE-2.0 | |
| * |
| https://www.youtube.com/playlist?list=PLZoTAELRMXVPUyxuK8AphGMuIJHTyuWna | |
| https://www.youtube.com/watch?v=p_tpQSY1aTs&list=PLZoTAELRMXVPUyxuK8AphGMuIJHTyuWna&index=3&t=0s |