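# 0: import the translator class
from googletrans import Translator
# 1: create a translator instance
translator = Translator()
# 2: translate a sentence (with no destination given, googletrans translates to English)
translated = translator.translate("Hello there, I like pie.")
# 3: detect the language of the same sentence
detected = translator.detect("Hello there, I like pie.")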
import emoji

# strip_symbols is a helper defined outside this excerpt
def keep_emoji_strip_nonalphanum(text):
    # split the text by emoji; multi-part emoji will be kept together
    split_on_emoji = emoji.get_emoji_regexp().split(text)
    # collapse all whitespace and split the text into tokens
    rejoined_no_whitespace = " ".join(split_on_emoji).split()
    # strip symbols from each token unless the token starts with an emoji
    cleaned = [strip_symbols(sub) if sub[0] not in emoji.UNICODE_EMOJI \
               else sub \
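The excerpt above is cut off before the list comprehension closes. As a rough, self-contained sketch of the same idea, the version below uses only the standard library. It is not the author's code: `EMOJI_PATTERN` is a hand-written approximation of common emoji code-point ranges (an assumption, far less complete than the `emoji` package), and `strip_symbols` is a hypothetical stand-in for the helper the original references.

```python
import re

# Assumption: approximate emoji ranges (flags, pictographs, misc symbols, dingbats),
# each optionally followed by a variation selector or zero-width joiner so that
# multi-part emoji stay together. The capturing group makes re.split keep them.
EMOJI_PATTERN = re.compile(
    "((?:[\U0001F1E6-\U0001F1FF\U0001F300-\U0001FAFF\u2600-\u27BF]"
    "[\uFE0F\u200D]*)+)"
)


def strip_symbols(text):
    # Hypothetical helper: keep only alphanumeric characters.
    return "".join(ch for ch in text if ch.isalnum())


def keep_emoji_strip_nonalphanum(text):
    # Split on emoji; the emoji themselves are kept as separate pieces.
    pieces = EMOJI_PATTERN.split(text)
    # Normalize whitespace and break the text into tokens.
    tokens = " ".join(pieces).split()
    # Leave emoji tokens untouched; strip symbols from everything else.
    cleaned = [tok if EMOJI_PATTERN.match(tok) else strip_symbols(tok)
               for tok in tokens]
    # Drop tokens that became empty after stripping.
    return " ".join(tok for tok in cleaned if tok)


print(keep_emoji_strip_nonalphanum("Hello!!! \U0001F600 world..."))
```

Tokens made only of punctuation disappear entirely, which may or may not match what the original function returned.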
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# set readability and styling
sns.set(font_scale=1.4)
sns.set_style("whitegrid")
f = plt.figure(figsize=(12, 10), dpi=1000)
# sum Replies Count per calendar day (blackpink needs a datetime index) and plot it
grouper = blackpink.groupby([pd.Grouper(freq='D')])
grouper['Replies Count'].sum().plot()
# more styling and labeling
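To make the daily aggregation concrete, here is a minimal standard-library sketch of what grouping by day and summing does. It is an illustration rather than the pandas implementation: the function name `daily_totals` and the sample records are made up, and unlike `pd.Grouper(freq='D')` it does not emit zero-valued rows for days with no data.

```python
from collections import defaultdict
from datetime import datetime


def daily_totals(records):
    """Sum reply counts per calendar day.

    records: iterable of (timestamp, replies_count) pairs.
    Returns a dict mapping each date to its total, ordered by date.
    """
    totals = defaultdict(int)
    for timestamp, replies in records:
        # Truncate the timestamp to its calendar day, then accumulate.
        totals[timestamp.date()] += replies
    return dict(sorted(totals.items()))


# Made-up sample data for illustration only.
sample = [
    (datetime(2020, 6, 26, 9, 0), 3),
    (datetime(2020, 6, 26, 23, 30), 2),
    (datetime(2020, 6, 27, 0, 5), 4),
]
print(daily_totals(sample))
```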
# get the 20 most common text lengths, ranked by relative frequency
text_lengths = copy_detected_langs['text_length']\
    .value_counts(normalize=True)\
    .sort_values(ascending=False)\
    .to_frame()[:20].index
# set styling
sns.set(font_scale=1.4)
sns.set_style("whitegrid")
fig, axes = plt.subplots(1, figsize=(15, 10), dpi=1000)
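For readers unfamiliar with `value_counts(normalize=True)`, the sketch below shows the same calculation in plain Python: count each distinct value, divide by the total, and keep the most frequent entries. The function name `top_relative_frequencies` and the sample lengths are assumptions for illustration; the original keeps only the index (the values themselves), while this sketch also returns the frequencies.

```python
from collections import Counter


def top_relative_frequencies(values, n=20):
    """Return the n most common values with their share of all observations."""
    counts = Counter(values)
    total = sum(counts.values())
    # most_common sorts by count, descending, and truncates to n entries.
    return [(value, count / total) for value, count in counts.most_common(n)]


# Made-up text lengths for illustration only.
lengths = [5, 5, 5, 3, 3, 8]
print(top_relative_frequencies(lengths, n=2))
```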
target_y = 'predicted_conf'
sns.boxplot(x='predicted_lang',
            y=target_y,
            data=copy_detected_langs[(copy_detected_langs['predicted_conf'] >= 0) &
                                     (copy_detected_langs['predicted_lang'].isin(names))],
            order=names
            )