Union only those rows of a large table whose keys appear in a small (left) table, i.e. union two DataFrames, but take from the large one only the rows whose key exists in the small table.
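One possible approach, sketched under assumptions: filter the large table with a left semi join against the keys of the small table, then union the result with the small table. The table contents and the column names `key` and `value` are made up for illustration, and both tables are assumed to share the same schema.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

val spark = SparkSession.builder().master("local[*]").appName("semi-join-union").getOrCreate()
import spark.implicits._

// Hypothetical sample data: both tables share the (key, value) schema
val small = Seq((1, "a"), (2, "b")).toDF("key", "value")
val large = Seq((1, "x"), (3, "y"), (2, "z"), (4, "w")).toDF("key", "value")

// Keep only the rows of `large` whose key exists in `small`.
// A left semi join returns columns from the left side only and never duplicates rows.
val matching = large.join(broadcast(small.select("key")), Seq("key"), "left_semi")

// union is positional, so both sides must have the same column order
val result = small.union(matching)
result.show()
```

The `broadcast` hint is optional; it only suggests that Spark ship the small key set to every executor instead of shuffling the large table.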
Aggregation over an array of nested JSON objects: how to sum the quantities across all order lines of a given order (which gives 1 + 3 = 4 for the sample dataset below):
{
  "orderId": "oi1",
  "orderLines": [
    {
      "productId": "p1",
      "quantity": 1,
      "sequence": 1,
      "totalPrice": {
        "gross": 50,
        "net": 40,
        "tax": 10
      }
    },
    {
      "productId": "p2",
      "quantity": 3,
      "sequence": 2,
      "totalPrice": {
        "gross": 300,
        "net": 240,
        "tax": 60
      }
    }
  ]
}
Stack Overflow: https://stackoverflow.com/questions/43758982/how-to-aggregate-over-array-in-json
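A minimal sketch of one way to do it, assuming Spark 2.2 or later (for `spark.read.json` on a `Dataset[String]`): explode the `orderLines` array into one row per line, then group by `orderId` and sum `quantity`. The sample order is inlined as a single-line string here; the output column name `totalQuantity` is an arbitrary choice.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{explode, sum}

val spark = SparkSession.builder().master("local[*]").appName("nested-json-sum").getOrCreate()
import spark.implicits._

// The sample order from above, one JSON document per Dataset element
val json = """{"orderId":"oi1","orderLines":[{"productId":"p1","quantity":1,"sequence":1,"totalPrice":{"gross":50,"net":40,"tax":10}},{"productId":"p2","quantity":3,"sequence":2,"totalPrice":{"gross":300,"net":240,"tax":60}}]}"""
val orders = spark.read.json(Seq(json).toDS)

// One row per order line, then sum the quantities per order
val totals = orders
  .select($"orderId", explode($"orderLines") as "line")
  .groupBy("orderId")
  .agg(sum($"line.quantity") as "totalQuantity")

totals.show()
```

For the sample order this yields a single row with `totalQuantity` equal to 4.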
Create a DataFrame whose column names and values are both defined in a file. The schema (the column names) is on the field5 line; the values come after field6.
field1 value1;
field2 value2;
field3 value3;
field4 value4;
field5 17 col1 col2 col3 col4 col5 col6 col7 col8;
field6
val1 val2 val3 val4 val5 val6 val7 val8
val9 val10 val11 val12 val13 val14 val15 val16
val17 val18 val19 val20 val21 val22 val23 val24;
EndOfFile;
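A rough sketch of one way to parse such a file, under several assumptions: statements are terminated by `;`, the first two tokens of the field5 statement (`field5` and the number `17`) are not column names, all values are whitespace-separated strings, and every column is typed as a string. The file content is inlined as a `Seq` of lines to keep the example self-contained; in practice the lines would be read from the file.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").appName("schema-from-file").getOrCreate()

// Stand-in for the file content (assumes the values contain no embedded spaces)
val lines = Seq(
  "field1 value1;", "field2 value2;", "field3 value3;", "field4 value4;",
  "field5 17 col1 col2 col3 col4 col5 col6 col7 col8;",
  "field6",
  "val1 val2 val3 val4 val5 val6 val7 val8",
  "val9 val10 val11 val12 val13 val14 val15 val16",
  "val17 val18 val19 val20 val21 val22 val23 val24;",
  "EndOfFile;")

// Split the text into ';'-terminated statements
val statements = lines.mkString("\n").split(";").map(_.trim).filter(_.nonEmpty)

// Assumption: skip the "field5" keyword and the number that follows it
val header = statements.find(_.startsWith("field5")).get.split("\\s+").drop(2)
// Everything after the "field6" keyword is data
val values = statements.find(_.startsWith("field6")).get.split("\\s+").drop(1)

val schema = StructType(header.map(name => StructField(name, StringType, nullable = true)))
// Cut the flat value list into rows of header.length values each
val rows = values.grouped(header.length).map(v => Row.fromSeq(v.toSeq)).toList

val df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
df.show()
```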
Calculate consecutiveness in a dataset, i.e. how many times in a row a record meets a certain status. If an exchange ID has the Risky or Unstable status in consecutive weeks, increment the count by 1 for that week and merge the result with the dataset.
scala> solution.show
+---------+-----------+--------+---------------+
|     Date|Exchange_Id|  Status|Consecutiveness|
+---------+-----------+--------+---------------+
|5/05/2017|          a|   RISKY|              0|
|5/05/2017|          b|  Stable|              0|
|5/05/2017|          c|  Stable|              0|
|5/05/2017|          d|UNSTABLE|              0|
|5/05/2017|          e| UNKNOWN|              0|
|5/05/2017|          f| UNKNOWN|              0|
|6/05/2017|          a|   RISKY|              1|
|6/05/2017|          b|  Stable|              0|
|6/05/2017|          c|  Stable|              0|
|6/05/2017|          d|UNSTABLE|              1|
|6/05/2017|          e|UNSTABLE|              0|
|6/05/2017|          f| UNKNOWN|              0|
+---------+-----------+--------+---------------+
The preferred solution uses window aggregate functions.
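A sketch of a window-based approach, with assumptions stated up front: dates use the `d/MM/yyyy` format, there is one row per exchange per date, status matching is case-insensitive, and a run of RISKY and UNSTABLE rows in any mix counts as consecutive (the sample output does not settle that last point). The helper columns `d`, `flag` and `run` are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{sum, to_date, upper, when}

val spark = SparkSession.builder().master("local[*]").appName("consecutiveness").getOrCreate()
import spark.implicits._

val input = Seq(
  ("5/05/2017", "a", "RISKY"), ("5/05/2017", "b", "Stable"), ("5/05/2017", "c", "Stable"),
  ("5/05/2017", "d", "UNSTABLE"), ("5/05/2017", "e", "UNKNOWN"), ("5/05/2017", "f", "UNKNOWN"),
  ("6/05/2017", "a", "RISKY"), ("6/05/2017", "b", "Stable"), ("6/05/2017", "c", "Stable"),
  ("6/05/2017", "d", "UNSTABLE"), ("6/05/2017", "e", "UNSTABLE"), ("6/05/2017", "f", "UNKNOWN")
).toDF("Date", "Exchange_Id", "Status")

val flagged = input
  .withColumn("d", to_date($"Date", "d/MM/yyyy")) // assumed date format
  .withColumn("flag", upper($"Status").isin("RISKY", "UNSTABLE"))

// Running count of non-flagged rows per exchange: it changes whenever a streak is broken,
// so it identifies the streak ("run") each row belongs to
val byExchange = Window.partitionBy("Exchange_Id").orderBy("d")
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)
val withRuns = flagged.withColumn("run", sum(when($"flag", 0).otherwise(1)).over(byExchange))

// Within a run, count the flagged rows seen so far; the first one gets 0
val byRun = Window.partitionBy("Exchange_Id", "run").orderBy("d")
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)
val solution = withRuns
  .withColumn("Consecutiveness",
    when($"flag", sum($"flag".cast("int")).over(byRun) - 1).otherwise(0))
  .orderBy("d", "Exchange_Id")
  .select("Date", "Exchange_Id", "Status", "Consecutiveness")

solution.show()
```

On the sample data this assigns 1 to exchanges a and d on 6/05/2017 and 0 everywhere else, matching the table above.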
val dates = Seq(
"08/11/2015",
"09/11/2015",
"09/12/2015").toDF("date_string")
Calculate the difference in days between the dates and the current day.
TIP: Use the functions object.
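A minimal sketch using `to_date`, `current_date` and `datediff` from `org.apache.spark.sql.functions`. It assumes Spark 2.2 or later (for `to_date` with a format argument) and that the strings are in `dd/MM/yyyy` format; the source does not say whether the day or the month comes first, so adjust the pattern if needed. The output column names `to_date` and `diff` are arbitrary.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{current_date, datediff, to_date}

val spark = SparkSession.builder().master("local[*]").appName("date-diff").getOrCreate()
import spark.implicits._

val dates = Seq(
  "08/11/2015",
  "09/11/2015",
  "09/12/2015").toDF("date_string")

val withDiff = dates
  // Parse the string into a DateType column (assumed pattern: day/month/year)
  .withColumn("to_date", to_date($"date_string", "dd/MM/yyyy"))
  // datediff(end, start) returns the number of days from start to end
  .withColumn("diff", datediff(current_date(), $"to_date"))

withDiff.show()
```

The values in `diff` depend on the day the code is run, so no fixed output is shown.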