hannesdatta · September 2, 2022 11:24
diff --git a/exercise_3.9.ipynb b/exercise_3.9.ipynb
 {
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.9 Tying things together\n",
    "\n",
    "Now it's your turn. Use the concepts from above to...\n",
    "\n",
    "- Create an array, holding ten subreddit names of your choice\n",
    "- Write a function that returns as a dictionary the following data points from the about page of a subreddit: `display_name`, `title`, `subscribers`, and the date of creation, `created` (e.g., this is the link to the viewable about page for the [subreddit \"University\"](https://www.reddit.com/r/University/about), and this is the link to the [JSON version of the same page](https://www.reddit.com/r/University/about/.json)).\n",
    "- Write a loop to retrieve data for the ten subreddits, and store the data in a new-line separated JSON file called `my_first_web_data.json`.\n",
    "\n",
    "<div class=\"alert alert-block alert-info\"><b>Tips:</b>\n",
    "    \n",
    "<ul>\n",
    "    <li>Did you know you can \"look\" at the API output directly in Firefox or Chrome? Just open the URL that is called for a particular subreddit in your browser. Try it with <a href='https://www.reddit.com/r/University/about.json'>this one first (click)!</a></li>\n",
    "  <li>You can use <code>f.write</code> multiple times in your code. To write a new line to the file, use <code>f.write('\\n')</code>.</li>\n",
    "    <li>Please pay attention to where you open the file for the first time, and how (<code>'a'</code> vs. <code>'w'</code>)</li>\n",
    "  \n",
    "</ul> \n",
    " \n",
    "</div>\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__Solution__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import relevant packages\n",
    "import requests\n",
    "import json\n",
    "\n",
    "subreddits = ['skateboarding', 'climbing', 'tennis']\n",
    "\n",
    "# function to retrieve some data from reddit\n",
    "def get_data(subreddit):\n",
    "    url = 'https://www.reddit.com/r/' + subreddit + '/about.json'\n",
    "    print(url)\n",
    "    content = requests.get(url, headers = {'User-agent': 'I am learning Python.'}).json()\n",
    "\n",
    "    result = {\"display_name\": content['data']['display_name'],\n",
    "     \"title\": content['data']['title'],\n",
    "     \"subscribers\": content['data']['subscribers'],\n",
    "     \"timestamp\": content['data']['created']}\n",
    "    \n",
    "    return(result)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "skateboarding\n",
      "https://www.reddit.com/r/skateboarding/about.json\n",
      "climbing\n",
      "https://www.reddit.com/r/climbing/about.json\n",
      "tennis\n",
      "https://www.reddit.com/r/tennis/about.json\n"
     ]
    }
   ],
   "source": [
    "# write data\n",
    "f=open('my_data.json','w',encoding='utf-8')\n",
    "\n",
    "# loop through all subreddits\n",
    "for subreddit in subreddits:\n",
    "    print(subreddit)\n",
    "    f.write(json.dumps(get_data(subreddit)))\n",
    "    f.write('\\n')\n",
    "\n",
    "# close data file\n",
    "f.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
 }
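The tips about `f.write('\n')` and `'a'` vs. `'w'` may be easier to follow with a small offline example. The sketch below is not part of the notebook: it writes a few made-up records (hypothetical values, not retrieved from Reddit) to a temporary file named `demo.json`, appends one more, and reads the file back line by line. It also assumes that `created` is a Unix timestamp in seconds, which is how the value appears in the JSON output.

```python
import json
import os
import tempfile
from datetime import datetime, timezone

# Hypothetical records shaped like the dictionaries the exercise asks for;
# the values are made up for illustration.
records = [
    {"display_name": "example_one", "title": "First example",
     "subscribers": 10, "created": 0.0},
    {"display_name": "example_two", "title": "Second example",
     "subscribers": 20, "created": 86400.0},
]

# Write to a throwaway directory so no existing file is touched.
path = os.path.join(tempfile.mkdtemp(), "demo.json")

# 'w' truncates the file, so it is opened once, before the loop.
with open(path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record))  # one JSON object ...
        f.write("\n")                # ... per line

# 'a' keeps the existing lines and adds new ones at the end.
extra = {"display_name": "example_three", "title": "Third example",
         "subscribers": 30, "created": 172800.0}
with open(path, "a", encoding="utf-8") as f:
    f.write(json.dumps(extra))
    f.write("\n")

# Read the file back: every non-empty line is parsed on its own.
with open(path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f if line.strip()]

# Assumption: 'created' holds Unix epoch seconds.
created_dates = [
    datetime.fromtimestamp(r["created"], tz=timezone.utc).date().isoformat()
    for r in loaded
]

print(len(loaded), created_dates)
```

Opening the file with `'w'` inside the loop instead would leave only the last record in the file, which is why the mode and the place where the file is opened both matter.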