Last active
March 17, 2021 00:35
-
-
Save devin-petersohn/b680d0c167ada97baa33e0e93a4d8d2d to your computer and use it in GitHub Desktop.
Modin SQL API Demo
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import modin.pandas as pd | |
gstore_apps_df = pd.read_csv("https://tinyurl.com/googleplaystorecsv") |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
"SELECT App, Category, Rating FROM gstore_apps WHERE Price = '0'" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Select columns | |
result_df1 = gstore_apps_df.loc[,: ['App', 'Category', 'Rating']] | |
# From the resulting dataframe filter where Price = 0 | |
result_df = result_df1['Price'] == 0 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import modin.sql as mdsql |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
sql_str = "SELECT App,Category,Rating FROM gstore_apps WHERE Price = '0'" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
result_df = mdsql.query(sql_str, gstore_apps=gstore_apps_df) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
query_str = "SELECT App, Category, Rating WHERE Price = '0'" | |
result_df = mdsql.query(query_str, from=gstore_apps_df) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
gstore_reviews_df = pd.read_csv("https://tinyurl.com/gstorereviewscsv") |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
sql_str = ( | |
"""SELECT category, | |
avg(sentiment_polarity) as avg_sentiment_polarity, | |
avg(sentiment_subjectivity) as avg_sentiment_subjectivity | |
FROM ( | |
SELECT category, sentiment, sentiment_polarity | |
FROM google_items_df INNER JOIN gstore_reviews_df | |
ON google_apps_df.app = gstore_reviews_df.app | |
) | |
GROUP BY category | |
HAVING CAST(avg_sentiment_subjectivity as float) < 0.5 | |
ORDER BY avg_sentiment_polarity DESC | |
LIMIT 10""" | |
) | |
result_df = mdsql.query( | |
sql_str, | |
google_apps_df=google_apps_df, | |
gstore_reviews_df=gstore_reviews_df | |
) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# join the items and reviews | |
result_df = mdsql.query( | |
"""SELECT category, sentiment, sentiment_polarity | |
FROM google_items_df INNER JOIN gstore_reviews_df | |
ON google_apps_df.app = gstore_reviews_df.app""", | |
google_apps_df=google_apps_df, | |
gstore_reviews_df=gstore_reviews_df | |
) | |
# group by category and calculate averages | |
result_df = mdsql.query( | |
"""SELECT category, | |
avg(sentiment_polarity) as avg_sentiment_polarity, | |
avg(sentiment_subjectivity) as avg_sentiment_subjectivity | |
GROUP BY category | |
HAVING CAST(avg_sentiment_subjectivity as float) < 0.5 | |
ORDER BY avg_sentiment_polarity DESC | |
LIMIT 10""", | |
from=result_df | |
) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment