Last active
March 21, 2020 12:37
Housing Data – an introduction to scraping in Python
| { | |
| "cells": [ | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## Scraping with Python\n", | |
| "\n", | |
| "Web scraping used to be an ugly business, and it still is. However, when the webpages you are scraping behave in a nice and predictable way, we can take advantage of HTML parsers to get at what we want in a simple and robust manner.\n", | |
| "\n", | |
| "An HTML document, if properly formatted, is basically a large tree (much like an XML tree). Searching and traversing this tree is the most reliable way to do web scraping, provided the documents we wish to scrape are well formed (unfortunately this is not always the case, but fortunately for us, it is the case here). Python has a library called `BeautifulSoup` which helps us do exactly that. R has comparable packages, such as `rvest` and `XML`, but we shall be using Python for this demonstration.\n", | |
| "\n", | |
| "Let us start by loading the library." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 1, | |
| "metadata": { | |
| "collapsed": true | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "from bs4 import BeautifulSoup\n", | |
| "from pandas import DataFrame\n", | |
| "import re\n", | |
| "\n", | |
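| "# read the saved page; `lines` is kept for inspecting the raw source below\n", | |
| "with open(\"newdata/4.html\") as f:\n", | |
| "    lines = f.readlines()\n", | |
| "html = \"\".join(lines)\n", | |
| "# turn <br> tags into newlines so multi-line fields keep their line breaks\n", | |
| "html = re.sub('<br/?>', '\\n', html)" | |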
| ] | |
| }, | |
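The `<br>` substitution in the cell above matters because multi-line fields, such as the mailing address, are separated by `<br>` tags rather than by newlines. Here is a minimal sketch of the same idea on an inline string. The helper name `br_to_newline` and the sample string are made up for illustration, and the pattern is deliberately more permissive than the notebook's `'<br/?>'`, on the assumption that variants such as `<br />` or `<BR>` might also appear:

```python
import re

# Hypothetical helper, not part of the notebook: replace every <br> variant
# (<br>, <br/>, <br />, in any letter case) with a newline.
BR_PATTERN = re.compile(r"<br\s*/?>", flags=re.IGNORECASE)


def br_to_newline(html):
    return BR_PATTERN.sub("\n", html)


sample = "11 URIAH ST<br />NEW HAVEN, CT 06512"
print(repr(br_to_newline(sample)))

# The stricter pattern used in the notebook leaves "<br />" untouched,
# because it does not allow a space before the slash.
print(repr(re.sub("<br/?>", "\n", sample)))
```

If the saved pages only ever contain `<br>` or `<br/>`, the notebook's simpler pattern is enough.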
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "The first thing you should do with any HTML file is open up the source code and browse around. Here, you will notice that the markup is consistently structured.\n", | |
| "\n", | |
| "Case in point: in the table below, the two things we want (the label \"Owner\" and its value) are each wrapped in a `td` element, with the classes `plabel` and `data` respectively." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 2, | |
| "metadata": { | |
| "collapsed": false, | |
| "scrolled": true | |
| }, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| " <table>\n", | |
| "\t\t<tr id=\"MainContent_rowOwn\">\n", | |
| "\t\t\t<td class=\"plabel\">Owner</td><td class=\"data\"><span id=\"MainContent_lblOwner\">RODRIGUEZ WILLIAM & LYSIE</span></td>\n", | |
| "\t\t</tr><tr id=\"MainContent_rowCoOwn\">\n", | |
| "\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "print(\"\".join(lines[296:300]))" | |
| ] | |
| }, | |
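To see how the `plabel`/`data` classes can be used for selection, here is a small sketch on an inline snippet modelled on the rows printed above. The snippet and the names `soup_demo`, `label_cells` and `data_cells` are illustrative only, and it uses Python's built-in `html.parser` backend (rather than `lxml`) so that it runs without extra dependencies beyond `bs4`:

```python
from bs4 import BeautifulSoup

# Inline snippet modelled on the table rows shown in the notebook output.
snippet = """
<table>
  <tr><td class="plabel">Owner</td><td class="data"><span>RODRIGUEZ WILLIAM &amp; LYSIE</span></td></tr>
  <tr><td class="plabel">Co-Owner</td><td class="data"><span></span></td></tr>
</table>
"""

soup_demo = BeautifulSoup(snippet, "html.parser")

# class_ (with a trailing underscore, since `class` is a Python keyword)
# filters on the tag's CSS class.
label_cells = soup_demo.find_all("td", class_="plabel")
data_cells = soup_demo.find_all("td", class_="data")

# Pair each label cell with the data cell at the same position.
pairs = [(label.text.strip(), value.text.strip())
         for label, value in zip(label_cells, data_cells)]
print(pairs)
```

Note that `.text` also reaches into nested tags such as the `span`, so no extra step is needed to unwrap the value.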
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "Therefore, instead of writing convoluted regular expressions, we can simply use HTML selectors to get what we want. The first set of information is formatted as `dt`/`dd` pairs, which are easy to select with the BeautifulSoup library:\n", | |
| "\n", | |
| "```python\n", | |
| "soup(\"dt\")\n", | |
| "```\n", | |
| "\n", | |
| "This call returns a list-like collection of all the `dt` tags (calling `soup(...)` is shorthand for `soup.find_all(...)`). We then need to loop over them and extract the text. A list comprehension is the idiomatic shortcut for this:\n", | |
| "\n", | |
| "```python\n", | |
| "[x.text.strip() for x in dt]\n", | |
| "```\n", | |
| "\n", | |
| "The `strip()` method simply removes leading and trailing whitespace. Putting it all together gives:" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 3, | |
| "metadata": { | |
| "collapsed": false, | |
| "scrolled": true | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "soup = BeautifulSoup(html, \"lxml\")\n", | |
| "\n", | |
| "# selecting the relevant html tags\n", | |
| "dt = soup(\"dt\")\n", | |
| "dd = soup(\"dd\")\n", | |
| "labels = soup(\"td\", class_=\"plabel\")\n", | |
| "data = soup(\"td\", class_=\"data\")\n", | |
| "\n", | |
| "# doing some simple text extraction and cleanup\n", | |
| "name = [x.text.strip() for x in dt] + \\\n", | |
| " [x.text.replace(\"\\n\", \"\").strip() for x in labels]\n", | |
| "value = [x.text.strip() for x in dd] + \\\n", | |
| " [x.text.strip() for x in data]" | |
| ] | |
| }, | |
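The pairing of labels and values relies on the two lists lining up position by position. Here is a hedged sketch of that step on a tiny `dl` block. The snippet is made up (its values are copied from the notebook's output), the names `soup_demo` and `record` are illustrative, and turning the pairs into a dictionary is one possible choice, not something the notebook does at this point:

```python
from bs4 import BeautifulSoup

# Inline sample modelled on the page's summary block.
snippet = """
<dl>
  <dt>Location</dt><dd> 11 URIAH ST </dd>
  <dt>Assessment</dt><dd>$144,200</dd>
</dl>
"""
soup_demo = BeautifulSoup(snippet, "html.parser")

# Calling the soup object directly is shorthand for find_all().
assert soup_demo("dt") == soup_demo.find_all("dt")

names = [x.text.strip() for x in soup_demo("dt")]
values = [x.text.strip() for x in soup_demo("dd")]

# zip pairs the i-th label with the i-th value. It would silently misalign
# if a <dt> had no matching <dd>, so a length check is a cheap safeguard.
assert len(names) == len(values)
record = dict(zip(names, values))
print(record)
```

One caveat with a dictionary: in the notebook's combined list the label "Owner" appears twice, and a `dict` would keep only the last occurrence. A list of pairs, as used in the notebook, preserves both.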
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "This is what it looks like:" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 4, | |
| "metadata": { | |
| "collapsed": false, | |
| "scrolled": false | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>0</th>\n", | |
| " <th>1</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>0</th>\n", | |
| " <td>Location</td>\n", | |
| " <td>11 URIAH ST</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>1</th>\n", | |
| " <td>Mblu</td>\n", | |
| " <td>014/ 0853/ 00101/ /</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2</th>\n", | |
| " <td>Acct#</td>\n", | |
| " <td></td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>3</th>\n", | |
| " <td>Owner</td>\n", | |
| " <td>RODRIGUEZ WILLIAM & LYSIE</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>4</th>\n", | |
| " <td>Assessment</td>\n", | |
| " <td>$144,200</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>5</th>\n", | |
| " <td>Appraisal</td>\n", | |
| " <td>$206,000</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>6</th>\n", | |
| " <td>PID</td>\n", | |
| " <td>4</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7</th>\n", | |
| " <td>Building Count</td>\n", | |
| " <td>1</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8</th>\n", | |
| " <td>Owner</td>\n", | |
| " <td>RODRIGUEZ WILLIAM & LYSIE</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>9</th>\n", | |
| " <td>Co-Owner</td>\n", | |
| " <td></td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>10</th>\n", | |
| " <td>Address</td>\n", | |
| " <td>11 URIAH ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>11</th>\n", | |
| " <td>Sale Price</td>\n", | |
| " <td>$140,000</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>12</th>\n", | |
| " <td>Certificate</td>\n", | |
| " <td></td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>13</th>\n", | |
| " <td>Book & Page</td>\n", | |
| " <td>6012/ 266</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>14</th>\n", | |
| " <td>Sale Date</td>\n", | |
| " <td>12/31/2001</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>15</th>\n", | |
| " <td>Instrument</td>\n", | |
| " <td>10</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>16</th>\n", | |
| " <td>Year Built:</td>\n", | |
| " <td>1989</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>17</th>\n", | |
| " <td>Living Area:</td>\n", | |
| " <td>1792</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>18</th>\n", | |
| " <td>Replacement Cost:</td>\n", | |
| " <td>$185,046</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>19</th>\n", | |
| " <td>Building Percent Good:</td>\n", | |
| " <td>81</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>20</th>\n", | |
| " <td>Replacement Cost Less Depre...</td>\n", | |
| " <td>$149,900</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>21</th>\n", | |
| " <td>Use Code</td>\n", | |
| " <td>1010</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>22</th>\n", | |
| " <td>Description</td>\n", | |
| " <td>Single Family</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>23</th>\n", | |
| " <td>Zone</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>24</th>\n", | |
| " <td>Neighborhood</td>\n", | |
| " <td>0101</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>25</th>\n", | |
| " <td>Alt Land Appr</td>\n", | |
| " <td>No</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>26</th>\n", | |
| " <td>Category</td>\n", | |
| " <td></td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>27</th>\n", | |
| " <td>Size (Acres)</td>\n", | |
| " <td>0.18</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>28</th>\n", | |
| " <td>Frontage</td>\n", | |
| " <td>57</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>29</th>\n", | |
| " <td>Depth</td>\n", | |
| " <td>79</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>30</th>\n", | |
| " <td>Assessed Value</td>\n", | |
| " <td>$39,270</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>31</th>\n", | |
| " <td>Appraised Value</td>\n", | |
| " <td>$56,100</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " 0 \\\n", | |
| "0 Location \n", | |
| "1 Mblu \n", | |
| "2 Acct# \n", | |
| "3 Owner \n", | |
| "4 Assessment \n", | |
| "5 Appraisal \n", | |
| "6 PID \n", | |
| "7 Building Count \n", | |
| "8 Owner \n", | |
| "9 Co-Owner \n", | |
| "10 Address \n", | |
| "11 Sale Price \n", | |
| "12 Certificate \n", | |
| "13 Book & Page \n", | |
| "14 Sale Date \n", | |
| "15 Instrument \n", | |
| "16 Year Built: \n", | |
| "17 Living Area: \n", | |
| "18 Replacement Cost: \n", | |
| "19 Building Percent Good: \n", | |
| "20 Replacement Cost Less Depre... \n", | |
| "21 Use Code \n", | |
| "22 Description \n", | |
| "23 Zone \n", | |
| "24 Neighborhood \n", | |
| "25 Alt Land Appr \n", | |
| "26 Category \n", | |
| "27 Size (Acres) \n", | |
| "28 Frontage \n", | |
| "29 Depth \n", | |
| "30 Assessed Value \n", | |
| "31 Appraised Value \n", | |
| "\n", | |
| " 1 \n", | |
| "0 11 URIAH ST \n", | |
| "1 014/ 0853/ 00101/ / \n", | |
| "2 \n", | |
| "3 RODRIGUEZ WILLIAM & LYSIE \n", | |
| "4 $144,200 \n", | |
| "5 $206,000 \n", | |
| "6 4 \n", | |
| "7 1 \n", | |
| "8 RODRIGUEZ WILLIAM & LYSIE \n", | |
| "9 \n", | |
| "10 11 URIAH ST\\nNEW HAVEN, CT 06512 \n", | |
| "11 $140,000 \n", | |
| "12 \n", | |
| "13 6012/ 266 \n", | |
| "14 12/31/2001 \n", | |
| "15 10 \n", | |
| "16 1989 \n", | |
| "17 1792 \n", | |
| "18 $185,046 \n", | |
| "19 81 \n", | |
| "20 $149,900 \n", | |
| "21 1010 \n", | |
| "22 Single Family \n", | |
| "23 RS2 \n", | |
| "24 0101 \n", | |
| "25 No \n", | |
| "26 \n", | |
| "27 0.18 \n", | |
| "28 57 \n", | |
| "29 79 \n", | |
| "30 $39,270 \n", | |
| "31 $56,100 " | |
| ] | |
| }, | |
| "execution_count": 4, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "DataFrame(list(zip(name, value)))" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "Now wasn't that nice and clean?\n", | |
| "\n", | |
| "Extracting sales information is going to be a little trickier. We first notice that the table with this information always has the id `MainContent_grdSales`, which helps immensely. After locating that table, we can find all the table rows (`tr`) within it by chaining the `find_all` method. As we only want the first 10 sales, we set `limit=11` and drop the first row, which is the header. Ideally, HTML tables would put the header row inside `thead` (with `th` cells) and the data rows inside `tbody`, which would make this hack unnecessary. But then again, if everyone followed HTML conventions, scrapers would probably be out of a job." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 5, | |
| "metadata": { | |
| "collapsed": false, | |
| "scrolled": false | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/plain": [ | |
| "'RODRIGUEZ WILLIAM & LYSIE,$140000,\\xa0, 6012/ 266,10,12/31/2001;REMINGTON DEBORAH ,$0,\\xa0, 5222/ 120,1,10/22/1997;DELUCIA ANTHONY M + PATRICIA A,$84000,\\xa0, 4885/ 117,\\xa0,07/13/1995;UNKNOWN,$0,\\xa0, 4198/ 53,\\xa0,01/09/1990'" | |
| ] | |
| }, | |
| "execution_count": 5, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "people_tr = soup.find(id=\"MainContent_grdSales\") \\\n", | |
| " .find_all(\"tr\", limit=11)[1:]\n", | |
| "# collapse the 2-d list into one string: cells joined by ',' and rows by ';' (commas inside cells are removed first)\n", | |
| "people_sales = \";\".join([\",\".join([td.text.replace(\",\",\"\") for td in p.find_all(\"td\")]) for p in people_tr])\n", | |
| "people_sales" | |
| ] | |
| }, | |
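The collapsing step is easier to follow on plain lists. The sketch below reproduces it without BeautifulSoup and adds an inverse that splits the string back into rows. The function names `collapse` and `expand` are hypothetical, and the pre-collapse cell values (for example `"$140,000"` with its comma) are inferred from the notebook output rather than taken from the page:

```python
# One list of cell texts per sale. "\xa0" is a non-breaking space, which
# shows up in the notebook output for otherwise empty cells.
rows = [
    ["RODRIGUEZ WILLIAM & LYSIE", "$140,000", "\xa0", " 6012/ 266", "10", "12/31/2001"],
    ["REMINGTON DEBORAH ", "$0", "\xa0", " 5222/ 120", "1", "10/22/1997"],
]


def collapse(rows):
    """Join cells with ',' and rows with ';' after removing commas inside cells."""
    return ";".join(",".join(cell.replace(",", "") for cell in row) for row in rows)


def expand(flat):
    """Split a collapsed string back into rows of trimmed cells.

    str.strip() also removes the non-breaking space, so such cells come
    back as empty strings.
    """
    return [[cell.strip() for cell in row.split(",")] for row in flat.split(";")]


flat = collapse(rows)
print(flat)
print(expand(flat))
```

The encoding is only reversible because commas are stripped from each cell first. A semicolon inside a cell would still break it, which is worth checking before relying on this format.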
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "The building data is straightforward: we find the table by its id, then extract its rows, skipping the header." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 6, | |
| "metadata": { | |
| "collapsed": false | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>0</th>\n", | |
| " <th>1</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>0</th>\n", | |
| " <td>Style</td>\n", | |
| " <td>Colonial</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>1</th>\n", | |
| " <td>Model</td>\n", | |
| " <td>Single Family</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2</th>\n", | |
| " <td>Grade:</td>\n", | |
| " <td>Average</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>3</th>\n", | |
| " <td>Stories:</td>\n", | |
| " <td>2 Stories</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>4</th>\n", | |
| " <td>Occupancy</td>\n", | |
| " <td>1</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>5</th>\n", | |
| " <td>Exterior Wall 1</td>\n", | |
| " <td>Vinyl Siding</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>6</th>\n", | |
| " <td>Exterior Wall 2</td>\n", | |
| " <td></td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7</th>\n", | |
| " <td>Roof Structure:</td>\n", | |
| " <td>Gable/Hip</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8</th>\n", | |
| " <td>Roof Cover</td>\n", | |
| " <td>Asphalt</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>9</th>\n", | |
| " <td>Interior Wall 1</td>\n", | |
| " <td>Drywall/Plaste</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>10</th>\n", | |
| " <td>Interior Wall 2</td>\n", | |
| " <td></td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>11</th>\n", | |
| " <td>Interior Flr 1</td>\n", | |
| " <td>Fin WD/Carpet</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>12</th>\n", | |
| " <td>Interior Flr 2</td>\n", | |
| " <td></td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>13</th>\n", | |
| " <td>Heat Fuel</td>\n", | |
| " <td>Gas/Oil</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>14</th>\n", | |
| " <td>Heat Type:</td>\n", | |
| " <td>Hot Water</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>15</th>\n", | |
| " <td>AC Type:</td>\n", | |
| " <td>None</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>16</th>\n", | |
| " <td>Total Bedrooms:</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>17</th>\n", | |
| " <td>Total Bthrms:</td>\n", | |
| " <td>2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>18</th>\n", | |
| " <td>Total Half Baths:</td>\n", | |
| " <td>1</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>19</th>\n", | |
| " <td>Total Xtra Fixtrs:</td>\n", | |
| " <td></td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>20</th>\n", | |
| " <td>Total Rooms:</td>\n", | |
| " <td>6</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>21</th>\n", | |
| " <td>Bath Style:</td>\n", | |
| " <td>Average</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>22</th>\n", | |
| " <td>Kitchen Style:</td>\n", | |
| " <td>Average</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>23</th>\n", | |
| " <td>Interior Condition</td>\n", | |
| " <td>Average</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>24</th>\n", | |
| " <td>Fin Bsmnt Area</td>\n", | |
| " <td>805</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>25</th>\n", | |
| " <td>Fin Bsmnt Qual</td>\n", | |
| " <td>Fin Rec Room</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>26</th>\n", | |
| " <td>NBHD Code</td>\n", | |
| " <td>M COVE BLDG</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " 0 1\n", | |
| "0 Style Colonial\n", | |
| "1 Model Single Family\n", | |
| "2 Grade: Average\n", | |
| "3 Stories: 2 Stories\n", | |
| "4 Occupancy 1\n", | |
| "5 Exterior Wall 1 Vinyl Siding\n", | |
| "6 Exterior Wall 2 \n", | |
| "7 Roof Structure: Gable/Hip\n", | |
| "8 Roof Cover Asphalt\n", | |
| "9 Interior Wall 1 Drywall/Plaste\n", | |
| "10 Interior Wall 2 \n", | |
| "11 Interior Flr 1 Fin WD/Carpet\n", | |
| "12 Interior Flr 2 \n", | |
| "13 Heat Fuel Gas/Oil\n", | |
| "14 Heat Type: Hot Water\n", | |
| "15 AC Type: None\n", | |
| "16 Total Bedrooms: 3 Bedrooms\n", | |
| "17 Total Bthrms: 2\n", | |
| "18 Total Half Baths: 1\n", | |
| "19 Total Xtra Fixtrs: \n", | |
| "20 Total Rooms: 6\n", | |
| "21 Bath Style: Average\n", | |
| "22 Kitchen Style: Average\n", | |
| "23 Interior Condition Average\n", | |
| "24 Fin Bsmnt Area 805\n", | |
| "25 Fin Bsmnt Qual Fin Rec Room\n", | |
| "26 NBHD Code M COVE BLDG" | |
| ] | |
| }, | |
| "execution_count": 6, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "building_tr = soup.find(id=\"MainContent_ctl01_grdCns\") \\\n", | |
| " .find_all(\"tr\")[1:]\n", | |
| "building = [[td.text for td in tr(\"td\")] for tr in building_tr]\n", | |
| "DataFrame(building)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
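| "Dealing with extras: we sum the fourth column of the extras table, after removing the thousands separators and the leading currency symbol." | |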
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 7, | |
| "metadata": { | |
| "collapsed": false | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/plain": [ | |
| "0" | |
| ] | |
| }, | |
| "execution_count": 7, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
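| "extras_tr = soup.find(id=\"MainContent_grdXf\") \\\n", | |
| " .find_all(\"tr\")[1:]\n", | |
| "# column 3 holds the dollar value; drop the thousands separators and the leading symbol before converting\n", | |
| "sum([int(tr.find_all(\"td\")[3].text.replace(\",\", \"\")[1:]) for tr in extras_tr])" | |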
| ] | |
| }, | |
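The cell above converts a dollar string to an integer by dropping commas and slicing off the first character. A slightly more defensive variant is sketched below. `parse_dollars` is a hypothetical helper, not part of the notebook, and it assumes whole-dollar, non-negative amounts:

```python
import re


def parse_dollars(text):
    """Convert a string such as '$1,200' to an int; return 0 for blank cells.

    Keeps only the digits, so it does not depend on the first character
    being '$'. Assumes whole dollars (no cents) and no negative values.
    """
    digits = re.sub(r"\D", "", text)
    return int(digits) if digits else 0


values = ["$144,200", "$0", "\xa0", ""]
total = sum(parse_dollars(v) for v in values)
print(total)
```

An output of `0`, as in the cell above, is also what `sum` returns for an empty list, so it is consistent with a table that has no data rows after the header.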
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
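| "Dealing with the garage: among the outbuilding rows, we keep those that mention \"GARAGE\" and sum their value column (index 5)." | |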
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 8, | |
| "metadata": { | |
| "collapsed": false | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/plain": [ | |
| "0" | |
| ] | |
| }, | |
| "execution_count": 8, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "out_tr = soup.find(id=\"MainContent_grdOb\") \\\n", | |
| " .find_all(\"tr\")[1:]\n", | |
| "out_bs = [\",\".join([td.text.replace(\",\",\"\") for td in tr.find_all(\"td\")]) for tr in out_tr]\n", | |
| "sum([int(out_b.split(\",\")[5][1:]) for out_b in out_bs if \"GARAGE\" in out_b])" | |
| ] | |
| }, | |
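The garage calculation is a filter followed by a sum, which can be sketched on plain lists. Everything about the sample rows below is hypothetical: the notebook only tells us that the dollar value sits at index 5 and that the keyword is "GARAGE", so the other columns and all the values are invented for illustration:

```python
# Hypothetical outbuilding rows, already split into cell texts.
rows = [
    ["GAR1", "GARAGE-AVE", "", "", "440 S.F.", "$7,900", "1"],
    ["SHD1", "SHED FRAME", "", "", "96 S.F.", "$600", "1"],
    ["GAR2", "GARAGE-GOOD", "", "", "240 S.F.", "$5,100", "1"],
]


def garage_value(rows, value_index=5, keyword="GARAGE"):
    """Sum the dollar column over the rows in which any cell mentions the keyword."""
    total = 0
    for row in rows:
        if any(keyword in cell for cell in row):
            amount = row[value_index].replace("$", "").replace(",", "")
            total += int(amount)
    return total


print(garage_value(rows))
```

Keeping the rows as lists, instead of joining them into comma-separated strings and splitting again, avoids the need to strip commas from every cell first.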
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "We now have all the data we want. Crucially, we have extracted the accompanying labels as well as the data points, so each value can be matched by name. The flip side is that we depend on the labels being consistent: stray punctuation or spacing in a label makes us more susceptible to mislabelling. A first step towards reducing such errors is to clean the labels, removing extraneous symbols and spaces. We introduce a function that uses regular expressions to do so." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 9, | |
| "metadata": { | |
| "collapsed": false | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/plain": [ | |
| "['Location',\n", | |
| " 'Mblu',\n", | |
| " 'Acct',\n", | |
| " 'Owner',\n", | |
| " 'Assessment',\n", | |
| " 'Appraisal',\n", | |
| " 'PID',\n", | |
| " 'Building_Count',\n", | |
| " 'Owner',\n", | |
| " 'CoOwner',\n", | |
| " 'Address',\n", | |
| " 'Sale_Price',\n", | |
| " 'Certificate',\n", | |
| " 'Book_Page',\n", | |
| " 'Sale_Date',\n", | |
| " 'Instrument',\n", | |
| " 'Year_Built',\n", | |
| " 'Living_Area',\n", | |
| " 'Replacement_Cost',\n", | |
| " 'Building_Percent_Good',\n", | |
| " 'Replacement_Cost_Less_Depreciation',\n", | |
| " 'Use_Code',\n", | |
| " 'Description',\n", | |
| " 'Zone',\n", | |
| " 'Neighborhood',\n", | |
| " 'Alt_Land_Appr',\n", | |
| " 'Category',\n", | |
| " 'Size_Acres',\n", | |
| " 'Frontage',\n", | |
| " 'Depth',\n", | |
| " 'Assessed_Value',\n", | |
| " 'Appraised_Value']" | |
| ] | |
| }, | |
| "execution_count": 9, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "def prettify(s):\n", | |
| "    s = re.sub(r\"[^\\w\\s]\", '', s)  # Remove everything except word characters (letters, digits, underscores) and whitespace\n", | |
| "    s = re.sub(r\"\\s+\", '_', s)  # Replace each run of whitespace with a single underscore\n", | |
| "\n", | |
| "    return s\n", | |
| "[prettify(t) for t in name]" | |
| ] | |
| }, | |
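To check what the two substitutions do, here is the same function applied to a handful of raw labels taken from the earlier output, followed by a duplicate check. The duplicate check is an addition for illustration, not something the notebook performs:

```python
import re
from collections import Counter


def prettify(s):
    s = re.sub(r"[^\w\s]", "", s)  # drop punctuation and symbols
    s = re.sub(r"\s+", "_", s)     # collapse each whitespace run into one underscore
    return s


raw_labels = ["Owner", "Acct#", "Co-Owner", "Book & Page",
              "Year Built:", "Size (Acres)", "Owner"]
clean = [prettify(label) for label in raw_labels]
print(clean)

# Labels that occur more than once would collide if they were used as
# dictionary keys or DataFrame column names.
duplicates = [label for label, count in Counter(clean).items() if count > 1]
print(duplicates)
```

In "Book & Page", removing the ampersand leaves two adjacent spaces. Matching a whole whitespace run with `\s+` is what keeps the result at a single underscore.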
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "Thus, the raw data we have extracted is a list of label: value pairs, with the labels cleanly formatted. In this exercise, we will be restricting our attention to a subset of the labels.\n", | |
| "\n", | |
| "## Starting the engine\n", | |
| "\n", | |
| "Now it's time to repeat this over all the files." | |
| ] | |
| }, | |
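One way to organise the loop over files is sketched below. It is not the notebook's own code: `collect_records`, `numeric_first`, and the `WANTED` subset are hypothetical names, and the per-file extraction is passed in as a function so that the sketch stays independent of the page layout:

```python
from pathlib import Path

# Hypothetical subset of labels to keep; the notebook's actual subset may differ.
WANTED = ("PID", "Location", "Sale_Price")


def numeric_first(path):
    """Sort key: numeric file stems in numeric order (4.html before 10.html),
    followed by any non-numeric names in alphabetical order."""
    stem = path.stem
    return (0, int(stem), "") if stem.isdecimal() else (1, 0, stem)


def collect_records(folder, extract, wanted=WANTED):
    """Run `extract` (a function: path -> dict of label: value) on every
    .html file in `folder`, keeping only the wanted labels."""
    records = []
    for path in sorted(Path(folder).glob("*.html"), key=numeric_first):
        try:
            row = extract(path)
        except Exception as err:  # one malformed page should not stop the run
            print(f"skipping {path.name}: {err}")
            continue
        # Missing labels become empty strings, so every record has the same keys.
        records.append({label: row.get(label, "") for label in wanted})
    return records


# Usage (scrape_file is a hypothetical function wrapping the steps above):
# records = collect_records("newdata", scrape_file)
```

The resulting list of dictionaries can be handed directly to `pandas.DataFrame(records)`, which yields one row per file and one column per label.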
| { | |
| "cell_type": "code", | |
| "execution_count": 10, | |
| "metadata": { | |
| "collapsed": false | |
| }, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>AC_Type</th>\n", | |
| " <th>Address</th>\n", | |
| " <th>Appraisal</th>\n", | |
| " <th>Appraised_Value</th>\n", | |
| " <th>Bath_Style</th>\n", | |
| " <th>Building_Percent_Good</th>\n", | |
| " <th>Grade</th>\n", | |
| " <th>Kitchen_Style</th>\n", | |
| " <th>Living_Area</th>\n", | |
| " <th>Location</th>\n", | |
| " <th>...</th>\n", | |
| " <th>PID</th>\n", | |
| " <th>Replacement_Cost</th>\n", | |
| " <th>Sale_Price</th>\n", | |
| " <th>Size_Acres</th>\n", | |
| " <th>Style</th>\n", | |
| " <th>Total_Bedrooms</th>\n", | |
| " <th>Total_Bthrms</th>\n", | |
| " <th>Total_Half_Baths</th>\n", | |
| " <th>Year_Built</th>\n", | |
| " <th>Zone</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>0</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$1,237,500</td>\n", | |
| " <td>$1,237,500</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>51 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>1</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>3.8</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>AIRP</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>1</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$553,600</td>\n", | |
| " <td>$553,600</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>75 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>2</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>1.7</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>AIRP</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2</th>\n", | |
| " <td>None</td>\n", | |
| " <td>199 SOUTH END RD\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$197,900</td>\n", | |
| " <td>$58,400</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>73</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1475</td>\n", | |
| " <td>199 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>3</td>\n", | |
| " <td>$176,395</td>\n", | |
| " <td>$276,040</td>\n", | |
| " <td>0.25</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1950</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>3</th>\n", | |
| " <td>None</td>\n", | |
| " <td>11 URIAH ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$206,000</td>\n", | |
| " <td>$56,100</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>81</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1792</td>\n", | |
| " <td>11 URIAH ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>4</td>\n", | |
| " <td>$185,046</td>\n", | |
| " <td>$140,000</td>\n", | |
| " <td>0.18</td>\n", | |
| " <td>Colonial</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1989</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>4</th>\n", | |
| " <td>None</td>\n", | |
| " <td>181 SOUTH END RD\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$155,400</td>\n", | |
| " <td>$61,000</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>73</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>864</td>\n", | |
| " <td>181 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>5</td>\n", | |
| " <td>$129,247</td>\n", | |
| " <td>$183,000</td>\n", | |
| " <td>0.36</td>\n", | |
| " <td>Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1945</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>5</th>\n", | |
| " <td>None</td>\n", | |
| " <td>169 SOUTH END ROAD\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$140,800</td>\n", | |
| " <td>$60,000</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>63</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1040</td>\n", | |
| " <td>169 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>6</td>\n", | |
| " <td>$111,858</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.32</td>\n", | |
| " <td>Bungalow</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1940</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>6</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>173 SOUTH END RD\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$185,500</td>\n", | |
| " <td>$60,700</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>81</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1512</td>\n", | |
| " <td>173 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>7</td>\n", | |
| " <td>$154,017</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.35</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>5 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1986</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>7</th>\n", | |
| " <td>None</td>\n", | |
| " <td>165 SOUTH END RD\\nNEW HAVEN, CT 06511</td>\n", | |
| " <td>$132,000</td>\n", | |
| " <td>$55,800</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>63</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1080</td>\n", | |
| " <td>165 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>8</td>\n", | |
| " <td>$121,000</td>\n", | |
| " <td>$64,000</td>\n", | |
| " <td>0.17</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1940</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>8</th>\n", | |
| " <td>None</td>\n", | |
| " <td>161 SOUTH END RD\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$133,100</td>\n", | |
| " <td>$54,500</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>63</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1040</td>\n", | |
| " <td>161 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>9</td>\n", | |
| " <td>$124,783</td>\n", | |
| " <td>$142,000</td>\n", | |
| " <td>0.14</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>2 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1930</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>9</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>157 SOUTH END RD\\nNEW HAVEN, CT 06511</td>\n", | |
| " <td>$126,000</td>\n", | |
| " <td>$56,900</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>63</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>985</td>\n", | |
| " <td>157 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>10</td>\n", | |
| " <td>$109,710</td>\n", | |
| " <td>$157,000</td>\n", | |
| " <td>0.2</td>\n", | |
| " <td>Ranch</td>\n", | |
| " <td>1 Bedroom</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1900</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>10</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$58,000</td>\n", | |
| " <td>$58,000</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>153 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>11</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.24</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>11</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$191,800</td>\n", | |
| " <td>$191,800</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>12</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.09</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>AIRP</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>12</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$62,000</td>\n", | |
| " <td>$62,000</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>107 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>13</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.43</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>13</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$54,500</td>\n", | |
| " <td>$54,500</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>101 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>14</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.14</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>14</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$58,600</td>\n", | |
| " <td>$58,600</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>95 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>15</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.26</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>15</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$3,200</td>\n", | |
| " <td>$3,200</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>16</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.03</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>16</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$62,000</td>\n", | |
| " <td>$62,000</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>91 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>17</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.43</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>17</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$59,300</td>\n", | |
| " <td>$59,300</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>103 SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>18</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.29</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>18</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$68,700</td>\n", | |
| " <td>$68,700</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>19</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.99</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>19</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$2,621,500</td>\n", | |
| " <td>$2,621,500</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>URIAH ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>20</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>8.05</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>AIRP</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>20</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$325,600</td>\n", | |
| " <td>$325,600</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>21</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>1</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>AIRP</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>21</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$814,100</td>\n", | |
| " <td>$814,100</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>SOUTH END RD</td>\n", | |
| " <td>...</td>\n", | |
| " <td>22</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>2.5</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>AIRP</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>22</th>\n", | |
| " <td>None</td>\n", | |
| " <td>5 DOUGLASS AVE\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$179,200</td>\n", | |
| " <td>$58,300</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>68</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1484</td>\n", | |
| " <td>5 DOUGLASS AV</td>\n", | |
| " <td>...</td>\n", | |
| " <td>23</td>\n", | |
| " <td>$171,446</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.16</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>2 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1940</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>23</th>\n", | |
| " <td>None</td>\n", | |
| " <td>11 DOUGLASS AVE\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$239,100</td>\n", | |
| " <td>$55,300</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>75</td>\n", | |
| " <td>Ave/Good</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>2124</td>\n", | |
| " <td>11 DOUGLASS AV</td>\n", | |
| " <td>...</td>\n", | |
| " <td>24</td>\n", | |
| " <td>$227,013</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.13</td>\n", | |
| " <td>Colonial</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1940</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>24</th>\n", | |
| " <td>None</td>\n", | |
| " <td>17 DOUGLASS AV\\nNEW HAVEN, CT 06511</td>\n", | |
| " <td>$124,400</td>\n", | |
| " <td>$55,300</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>65</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>998</td>\n", | |
| " <td>17 DOUGLASS AV</td>\n", | |
| " <td>...</td>\n", | |
| " <td>25</td>\n", | |
| " <td>$104,512</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.13</td>\n", | |
| " <td>Bungalow</td>\n", | |
| " <td>2 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1940</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>25</th>\n", | |
| " <td>None</td>\n", | |
| " <td>23 DOUGLASS AVE\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$123,400</td>\n", | |
| " <td>$55,300</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>65</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>886</td>\n", | |
| " <td>23 DOUGLASS AV</td>\n", | |
| " <td>...</td>\n", | |
| " <td>26</td>\n", | |
| " <td>$99,943</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.13</td>\n", | |
| " <td>Bungalow</td>\n", | |
| " <td>2 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1925</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>26</th>\n", | |
| " <td>None</td>\n", | |
| " <td>29 DOUGLASS AV\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$166,300</td>\n", | |
| " <td>$53,800</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>63</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1680</td>\n", | |
| " <td>29 DOUGLASS AV</td>\n", | |
| " <td>...</td>\n", | |
| " <td>27</td>\n", | |
| " <td>$178,623</td>\n", | |
| " <td>$245,000</td>\n", | |
| " <td>0.12</td>\n", | |
| " <td>Colonial</td>\n", | |
| " <td>4 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1940</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>27</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$58,800</td>\n", | |
| " <td>$58,800</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>32 URIAH ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>28</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.27</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>28</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>24 URIAH ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$150,000</td>\n", | |
| " <td>$54,100</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>77</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1170</td>\n", | |
| " <td>24 URIAH ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>29</td>\n", | |
| " <td>$157,165</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.13</td>\n", | |
| " <td>Raised Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1975</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>29</th>\n", | |
| " <td>None</td>\n", | |
| " <td>20 URIAH ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$159,800</td>\n", | |
| " <td>$55,300</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>73</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1144</td>\n", | |
| " <td>20 URIAH ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>30</td>\n", | |
| " <td>$143,100</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.13</td>\n", | |
| " <td>Raised Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1969</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>...</th>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>69</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$394,000</td>\n", | |
| " <td>$394,000</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>269 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>70</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>1.21</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>AIRP</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>70</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$852,600</td>\n", | |
| " <td>$849,900</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>353 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>71</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>2.61</td>\n", | |
| " <td>Outbuildings</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>AIRP</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>71</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$1,266,900</td>\n", | |
| " <td>$1,263,500</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>425 DODGE AV</td>\n", | |
| " <td>...</td>\n", | |
| " <td>72</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>3.88</td>\n", | |
| " <td>Outbuildings</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>AIRP</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>72</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$900</td>\n", | |
| " <td>$900</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>392 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>73</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.01</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>73</th>\n", | |
| " <td></td>\n", | |
| " <td>23 MULLIGAN DR\\nWALLINGFORD, CT 06492</td>\n", | |
| " <td>$620</td>\n", | |
| " <td>$620</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>400 BURR ST #1</td>\n", | |
| " <td>...</td>\n", | |
| " <td>74</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.62</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>74</th>\n", | |
| " <td></td>\n", | |
| " <td>23 MULLIGAN DR\\n \\nWALLINFORD, CT 06492</td>\n", | |
| " <td>$340</td>\n", | |
| " <td>$340</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>400 BURR ST #2</td>\n", | |
| " <td>...</td>\n", | |
| " <td>75</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.34</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>75</th>\n", | |
| " <td></td>\n", | |
| " <td>23 MULLIGAN DR\\n \\nWALLINGFORD, CT 06492</td>\n", | |
| " <td>$410</td>\n", | |
| " <td>$410</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>400 BURR ST #3</td>\n", | |
| " <td>...</td>\n", | |
| " <td>76</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.41</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>76</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>C/O HERMAN DOSTIE\\n23 MULLIGAN DR\\nWALLINGFORD...</td>\n", | |
| " <td>$263,500</td>\n", | |
| " <td>$64,900</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>88</td>\n", | |
| " <td>Ave/Good</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1611</td>\n", | |
| " <td>396 BURR ST #4</td>\n", | |
| " <td>...</td>\n", | |
| " <td>77</td>\n", | |
| " <td>$219,851</td>\n", | |
| " <td>$245,000</td>\n", | |
| " <td>0.62</td>\n", | |
| " <td>Raised Ranch</td>\n", | |
| " <td>4 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1990</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>77</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>23 MULLIGAN DR\\n \\nWALLINGFORD, CT 06492</td>\n", | |
| " <td>$286,500</td>\n", | |
| " <td>$61,500</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>98</td>\n", | |
| " <td>Ave/Good</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1828</td>\n", | |
| " <td>400 BURR ST #5</td>\n", | |
| " <td>...</td>\n", | |
| " <td>78</td>\n", | |
| " <td>$229,597</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.35</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>1</td>\n", | |
| " <td>2009</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>78</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>23 MULLIGAN DR\\nWALLINGFORD, CT 06492</td>\n", | |
| " <td>$298,400</td>\n", | |
| " <td>$57,200</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>98</td>\n", | |
| " <td>Ave/Good</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>2038</td>\n", | |
| " <td>400 BURR ST #6</td>\n", | |
| " <td>...</td>\n", | |
| " <td>79</td>\n", | |
| " <td>$246,073</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.19</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>1</td>\n", | |
| " <td>2009</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>79</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>23 MULLIGAN DR\\n \\nWALLINGFORD, CT 06492</td>\n", | |
| " <td>$312,700</td>\n", | |
| " <td>$58,900</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>98</td>\n", | |
| " <td>Ave/Good</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>2025</td>\n", | |
| " <td>400 BURR ST #7</td>\n", | |
| " <td>...</td>\n", | |
| " <td>80</td>\n", | |
| " <td>$258,942</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.24</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>1</td>\n", | |
| " <td>2009</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>80</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>23 MULLIGAN DR\\nWALLINGFORD, CT 06492</td>\n", | |
| " <td>$344,100</td>\n", | |
| " <td>$60,700</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>98</td>\n", | |
| " <td>Good</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>2312</td>\n", | |
| " <td>400 BURR ST #8</td>\n", | |
| " <td>...</td>\n", | |
| " <td>81</td>\n", | |
| " <td>$289,153</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.31</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>1</td>\n", | |
| " <td>2009</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>81</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>23 MULLIGAN DR\\nWALLINGFORD, CT 06492</td>\n", | |
| " <td>$287,200</td>\n", | |
| " <td>$62,800</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>98</td>\n", | |
| " <td>Ave/Good</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1828</td>\n", | |
| " <td>400 BURR ST #9</td>\n", | |
| " <td>...</td>\n", | |
| " <td>82</td>\n", | |
| " <td>$229,004</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.42</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>1</td>\n", | |
| " <td>2009</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>82</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>392 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$129,200</td>\n", | |
| " <td>$51,400</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>58</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1040</td>\n", | |
| " <td>392 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>83</td>\n", | |
| " <td>$134,145</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.18</td>\n", | |
| " <td>Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1955</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>83</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>384 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$124,600</td>\n", | |
| " <td>$53,000</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>58</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1040</td>\n", | |
| " <td>384 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>84</td>\n", | |
| " <td>$123,528</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.24</td>\n", | |
| " <td>Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1957</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>84</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>374 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$125,000</td>\n", | |
| " <td>$53,000</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>58</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1040</td>\n", | |
| " <td>374 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>85</td>\n", | |
| " <td>$124,145</td>\n", | |
| " <td>$148,000</td>\n", | |
| " <td>0.24</td>\n", | |
| " <td>Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1957</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>85</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>366 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$128,100</td>\n", | |
| " <td>$53,000</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>58</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>988</td>\n", | |
| " <td>366 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>86</td>\n", | |
| " <td>$129,540</td>\n", | |
| " <td>$93,280</td>\n", | |
| " <td>0.24</td>\n", | |
| " <td>Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1956</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>86</th>\n", | |
| " <td>None</td>\n", | |
| " <td>22 CONIFER DR\\nHAMDEN, CT 06518</td>\n", | |
| " <td>$131,600</td>\n", | |
| " <td>$53,700</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>68</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1536</td>\n", | |
| " <td>360 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>87</td>\n", | |
| " <td>$114,600</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.27</td>\n", | |
| " <td>2 Family</td>\n", | |
| " <td>4 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1930</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>87</th>\n", | |
| " <td>None</td>\n", | |
| " <td>354 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$141,700</td>\n", | |
| " <td>$53,000</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>58</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1434</td>\n", | |
| " <td>354 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>88</td>\n", | |
| " <td>$147,419</td>\n", | |
| " <td>$129,900</td>\n", | |
| " <td>0.24</td>\n", | |
| " <td>Colonial</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1950</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>88</th>\n", | |
| " <td>None</td>\n", | |
| " <td>348 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$155,700</td>\n", | |
| " <td>$53,000</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>68</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1352</td>\n", | |
| " <td>348 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>89</td>\n", | |
| " <td>$142,187</td>\n", | |
| " <td>$138,500</td>\n", | |
| " <td>0.24</td>\n", | |
| " <td>Colonial</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1941</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>89</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06510</td>\n", | |
| " <td>$4,500</td>\n", | |
| " <td>$4,500</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>344 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>90</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.04</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>90</th>\n", | |
| " <td></td>\n", | |
| " <td>165 CHURCH ST\\nNEW HAVEN, CT 06511</td>\n", | |
| " <td>$66,100</td>\n", | |
| " <td>$66,100</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>0</td>\n", | |
| " <td>342 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>91</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.73</td>\n", | |
| " <td>Vacant Land</td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td></td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>91</th>\n", | |
| " <td>None</td>\n", | |
| " <td>340 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$145,700</td>\n", | |
| " <td>$52,100</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>56</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1624</td>\n", | |
| " <td>340 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>92</td>\n", | |
| " <td>$156,458</td>\n", | |
| " <td>$30,000</td>\n", | |
| " <td>0.21</td>\n", | |
| " <td>Cape Cod</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1950</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>92</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>336 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$181,500</td>\n", | |
| " <td>$53,200</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>76</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1274</td>\n", | |
| " <td>336 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>93</td>\n", | |
| " <td>$164,403</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.25</td>\n", | |
| " <td>Raised Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1968</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>93</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>330 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$152,700</td>\n", | |
| " <td>$53,200</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>62</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1040</td>\n", | |
| " <td>330 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>94</td>\n", | |
| " <td>$160,494</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.25</td>\n", | |
| " <td>Raised Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1964</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>94</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>320 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$159,200</td>\n", | |
| " <td>$53,200</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>66</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1040</td>\n", | |
| " <td>320 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>95</td>\n", | |
| " <td>$149,198</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.25</td>\n", | |
| " <td>Raised Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1967</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>95</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>310 BURR ST\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$170,700</td>\n", | |
| " <td>$52,000</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>66</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1560</td>\n", | |
| " <td>310 BURR ST</td>\n", | |
| " <td>...</td>\n", | |
| " <td>96</td>\n", | |
| " <td>$179,867</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.20</td>\n", | |
| " <td>Raised Ranch</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>0</td>\n", | |
| " <td>1965</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>96</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>175 TOWNSEND TER\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$287,900</td>\n", | |
| " <td>$57,300</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>76</td>\n", | |
| " <td>Ave/Good</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>2880</td>\n", | |
| " <td>175 TOWNSEND TER</td>\n", | |
| " <td>...</td>\n", | |
| " <td>97</td>\n", | |
| " <td>$298,980</td>\n", | |
| " <td>$330,000</td>\n", | |
| " <td>0.19</td>\n", | |
| " <td>Split-Level</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1968</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>97</th>\n", | |
| " <td>Central</td>\n", | |
| " <td>155 TOWNSEND TERR\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$320,300</td>\n", | |
| " <td>$59,100</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>80</td>\n", | |
| " <td>Ave/Good</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>2916</td>\n", | |
| " <td>155 TOWNSEND TER</td>\n", | |
| " <td>...</td>\n", | |
| " <td>98</td>\n", | |
| " <td>$321,943</td>\n", | |
| " <td>$0</td>\n", | |
| " <td>0.25</td>\n", | |
| " <td>Colonial</td>\n", | |
| " <td>4 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1970</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>98</th>\n", | |
| " <td>None</td>\n", | |
| " <td>147 TOWNSEND TERRACE\\nNEW HAVEN, CT 06512</td>\n", | |
| " <td>$243,200</td>\n", | |
| " <td>$60,100</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>86</td>\n", | |
| " <td>Above Average</td>\n", | |
| " <td>Average</td>\n", | |
| " <td>1632</td>\n", | |
| " <td>147 TOWNSEND TER</td>\n", | |
| " <td>...</td>\n", | |
| " <td>99</td>\n", | |
| " <td>$208,412</td>\n", | |
| " <td>$138,000</td>\n", | |
| " <td>0.29</td>\n", | |
| " <td>Split-Level</td>\n", | |
| " <td>3 Bedrooms</td>\n", | |
| " <td>2</td>\n", | |
| " <td>1</td>\n", | |
| " <td>1969</td>\n", | |
| " <td>RS2</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "<p>99 rows × 24 columns</p>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " AC_Type Address Appraisal \\\n", | |
| "0 165 CHURCH ST\\nNEW HAVEN, CT 06510 $1,237,500 \n", | |
| "1 165 CHURCH ST\\nNEW HAVEN, CT 06510 $553,600 \n", | |
| "2 None 199 SOUTH END RD\\nNEW HAVEN, CT 06512 $197,900 \n", | |
| "3 None 11 URIAH ST\\nNEW HAVEN, CT 06512 $206,000 \n", | |
| "4 None 181 SOUTH END RD\\nNEW HAVEN, CT 06512 $155,400 \n", | |
| "5 None 169 SOUTH END ROAD\\nNEW HAVEN, CT 06512 $140,800 \n", | |
| "6 Central 173 SOUTH END RD\\nNEW HAVEN, CT 06512 $185,500 \n", | |
| "7 None 165 SOUTH END RD\\nNEW HAVEN, CT 06511 $132,000 \n", | |
| "8 None 161 SOUTH END RD\\nNEW HAVEN, CT 06512 $133,100 \n", | |
| "9 Central 157 SOUTH END RD\\nNEW HAVEN, CT 06511 $126,000 \n", | |
| "10 165 CHURCH ST\\nNEW HAVEN, CT 06510 $58,000 \n", | |
| "11 165 CHURCH ST\\nNEW HAVEN, CT 06510 $191,800 \n", | |
| "12 165 CHURCH ST\\nNEW HAVEN, CT 06510 $62,000 \n", | |
| "13 165 CHURCH ST\\nNEW HAVEN, CT 06510 $54,500 \n", | |
| "14 165 CHURCH ST\\nNEW HAVEN, CT 06510 $58,600 \n", | |
| "15 165 CHURCH ST\\nNEW HAVEN, CT 06510 $3,200 \n", | |
| "16 165 CHURCH ST\\nNEW HAVEN, CT 06510 $62,000 \n", | |
| "17 165 CHURCH ST\\nNEW HAVEN, CT 06510 $59,300 \n", | |
| "18 165 CHURCH ST\\nNEW HAVEN, CT 06510 $68,700 \n", | |
| "19 165 CHURCH ST\\nNEW HAVEN, CT 06510 $2,621,500 \n", | |
| "20 165 CHURCH ST\\nNEW HAVEN, CT 06510 $325,600 \n", | |
| "21 165 CHURCH ST\\nNEW HAVEN, CT 06510 $814,100 \n", | |
| "22 None 5 DOUGLASS AVE\\nNEW HAVEN, CT 06512 $179,200 \n", | |
| "23 None 11 DOUGLASS AVE\\nNEW HAVEN, CT 06512 $239,100 \n", | |
| "24 None 17 DOUGLASS AV\\nNEW HAVEN, CT 06511 $124,400 \n", | |
| "25 None 23 DOUGLASS AVE\\nNEW HAVEN, CT 06512 $123,400 \n", | |
| "26 None 29 DOUGLASS AV\\nNEW HAVEN, CT 06512 $166,300 \n", | |
| "27 165 CHURCH ST\\nNEW HAVEN, CT 06510 $58,800 \n", | |
| "28 Central 24 URIAH ST\\nNEW HAVEN, CT 06512 $150,000 \n", | |
| "29 None 20 URIAH ST\\nNEW HAVEN, CT 06512 $159,800 \n", | |
| ".. ... ... ... \n", | |
| "69 165 CHURCH ST\\nNEW HAVEN, CT 06510 $394,000 \n", | |
| "70 165 CHURCH ST\\nNEW HAVEN, CT 06510 $852,600 \n", | |
| "71 165 CHURCH ST\\nNEW HAVEN, CT 06510 $1,266,900 \n", | |
| "72 165 CHURCH ST\\nNEW HAVEN, CT 06510 $900 \n", | |
| "73 23 MULLIGAN DR\\nWALLINGFORD, CT 06492 $620 \n", | |
| "74 23 MULLIGAN DR\\n \\nWALLINFORD, CT 06492 $340 \n", | |
| "75 23 MULLIGAN DR\\n \\nWALLINGFORD, CT 06492 $410 \n", | |
| "76 Central C/O HERMAN DOSTIE\\n23 MULLIGAN DR\\nWALLINGFORD... $263,500 \n", | |
| "77 Central 23 MULLIGAN DR\\n \\nWALLINGFORD, CT 06492 $286,500 \n", | |
| "78 Central 23 MULLIGAN DR\\nWALLINGFORD, CT 06492 $298,400 \n", | |
| "79 Central 23 MULLIGAN DR\\n \\nWALLINGFORD, CT 06492 $312,700 \n", | |
| "80 Central 23 MULLIGAN DR\\nWALLINGFORD, CT 06492 $344,100 \n", | |
| "81 Central 23 MULLIGAN DR\\nWALLINGFORD, CT 06492 $287,200 \n", | |
| "82 Central 392 BURR ST\\nNEW HAVEN, CT 06512 $129,200 \n", | |
| "83 Central 384 BURR ST\\nNEW HAVEN, CT 06512 $124,600 \n", | |
| "84 Central 374 BURR ST\\nNEW HAVEN, CT 06512 $125,000 \n", | |
| "85 Central 366 BURR ST\\nNEW HAVEN, CT 06512 $128,100 \n", | |
| "86 None 22 CONIFER DR\\nHAMDEN, CT 06518 $131,600 \n", | |
| "87 None 354 BURR ST\\nNEW HAVEN, CT 06512 $141,700 \n", | |
| "88 None 348 BURR ST\\nNEW HAVEN, CT 06512 $155,700 \n", | |
| "89 165 CHURCH ST\\nNEW HAVEN, CT 06510 $4,500 \n", | |
| "90 165 CHURCH ST\\nNEW HAVEN, CT 06511 $66,100 \n", | |
| "91 None 340 BURR ST\\nNEW HAVEN, CT 06512 $145,700 \n", | |
| "92 Central 336 BURR ST\\nNEW HAVEN, CT 06512 $181,500 \n", | |
| "93 Central 330 BURR ST\\nNEW HAVEN, CT 06512 $152,700 \n", | |
| "94 Central 320 BURR ST\\nNEW HAVEN, CT 06512 $159,200 \n", | |
| "95 Central 310 BURR ST\\nNEW HAVEN, CT 06512 $170,700 \n", | |
| "96 Central 175 TOWNSEND TER\\nNEW HAVEN, CT 06512 $287,900 \n", | |
| "97 Central 155 TOWNSEND TERR\\nNEW HAVEN, CT 06512 $320,300 \n", | |
| "98 None 147 TOWNSEND TERRACE\\nNEW HAVEN, CT 06512 $243,200 \n", | |
| "\n", | |
| " Appraised_Value Bath_Style Building_Percent_Good Grade \\\n", | |
| "0 $1,237,500 \n", | |
| "1 $553,600 \n", | |
| "2 $58,400 Average 73 Above Average \n", | |
| "3 $56,100 Average 81 Average \n", | |
| "4 $61,000 Average 73 Average \n", | |
| "5 $60,000 Average 63 Average \n", | |
| "6 $60,700 Average 81 Average \n", | |
| "7 $55,800 Average 63 Average \n", | |
| "8 $54,500 Average 63 Above Average \n", | |
| "9 $56,900 Average 63 Average \n", | |
| "10 $58,000 \n", | |
| "11 $191,800 \n", | |
| "12 $62,000 \n", | |
| "13 $54,500 \n", | |
| "14 $58,600 \n", | |
| "15 $3,200 \n", | |
| "16 $62,000 \n", | |
| "17 $59,300 \n", | |
| "18 $68,700 \n", | |
| "19 $2,621,500 \n", | |
| "20 $325,600 \n", | |
| "21 $814,100 \n", | |
| "22 $58,300 Average 68 Above Average \n", | |
| "23 $55,300 Average 75 Ave/Good \n", | |
| "24 $55,300 Average 65 Average \n", | |
| "25 $55,300 Average 65 Average \n", | |
| "26 $53,800 Average 63 Above Average \n", | |
| "27 $58,800 \n", | |
| "28 $54,100 Average 77 Average \n", | |
| "29 $55,300 Average 73 Average \n", | |
| ".. ... ... ... ... \n", | |
| "69 $394,000 \n", | |
| "70 $849,900 \n", | |
| "71 $1,263,500 \n", | |
| "72 $900 \n", | |
| "73 $620 \n", | |
| "74 $340 \n", | |
| "75 $410 \n", | |
| "76 $64,900 Average 88 Ave/Good \n", | |
| "77 $61,500 Average 98 Ave/Good \n", | |
| "78 $57,200 Average 98 Ave/Good \n", | |
| "79 $58,900 Above Average 98 Ave/Good \n", | |
| "80 $60,700 Average 98 Good \n", | |
| "81 $62,800 Average 98 Ave/Good \n", | |
| "82 $51,400 Average 58 Average \n", | |
| "83 $53,000 Average 58 Average \n", | |
| "84 $53,000 Average 58 Average \n", | |
| "85 $53,000 Average 58 Average \n", | |
| "86 $53,700 Average 68 Average \n", | |
| "87 $53,000 Average 58 Average \n", | |
| "88 $53,000 Average 68 Average \n", | |
| "89 $4,500 \n", | |
| "90 $66,100 \n", | |
| "91 $52,100 Average 56 Average \n", | |
| "92 $53,200 Average 76 Above Average \n", | |
| "93 $53,200 Average 62 Above Average \n", | |
| "94 $53,200 Average 66 Above Average \n", | |
| "95 $52,000 Average 66 Above Average \n", | |
| "96 $57,300 Average 76 Ave/Good \n", | |
| "97 $59,100 Average 80 Ave/Good \n", | |
| "98 $60,100 Average 86 Above Average \n", | |
| "\n", | |
| " Kitchen_Style Living_Area Location ... PID Replacement_Cost \\\n", | |
| "0 0 51 SOUTH END RD ... 1 $0 \n", | |
| "1 0 75 SOUTH END RD ... 2 $0 \n", | |
| "2 Average 1475 199 SOUTH END RD ... 3 $176,395 \n", | |
| "3 Average 1792 11 URIAH ST ... 4 $185,046 \n", | |
| "4 Average 864 181 SOUTH END RD ... 5 $129,247 \n", | |
| "5 Average 1040 169 SOUTH END RD ... 6 $111,858 \n", | |
| "6 Average 1512 173 SOUTH END RD ... 7 $154,017 \n", | |
| "7 Average 1080 165 SOUTH END RD ... 8 $121,000 \n", | |
| "8 Average 1040 161 SOUTH END RD ... 9 $124,783 \n", | |
| "9 Average 985 157 SOUTH END RD ... 10 $109,710 \n", | |
| "10 0 153 SOUTH END RD ... 11 $0 \n", | |
| "11 0 SOUTH END RD ... 12 $0 \n", | |
| "12 0 107 SOUTH END RD ... 13 $0 \n", | |
| "13 0 101 SOUTH END RD ... 14 $0 \n", | |
| "14 0 95 SOUTH END RD ... 15 $0 \n", | |
| "15 0 SOUTH END RD ... 16 $0 \n", | |
| "16 0 91 SOUTH END RD ... 17 $0 \n", | |
| "17 0 103 SOUTH END RD ... 18 $0 \n", | |
| "18 0 SOUTH END RD ... 19 $0 \n", | |
| "19 0 URIAH ST ... 20 $0 \n", | |
| "20 0 SOUTH END RD ... 21 $0 \n", | |
| "21 0 SOUTH END RD ... 22 $0 \n", | |
| "22 Average 1484 5 DOUGLASS AV ... 23 $171,446 \n", | |
| "23 Average 2124 11 DOUGLASS AV ... 24 $227,013 \n", | |
| "24 Average 998 17 DOUGLASS AV ... 25 $104,512 \n", | |
| "25 Average 886 23 DOUGLASS AV ... 26 $99,943 \n", | |
| "26 Average 1680 29 DOUGLASS AV ... 27 $178,623 \n", | |
| "27 0 32 URIAH ST ... 28 $0 \n", | |
| "28 Average 1170 24 URIAH ST ... 29 $157,165 \n", | |
| "29 Average 1144 20 URIAH ST ... 30 $143,100 \n", | |
| ".. ... ... ... ... .. ... \n", | |
| "69 0 269 BURR ST ... 70 $0 \n", | |
| "70 0 353 BURR ST ... 71 $0 \n", | |
| "71 0 425 DODGE AV ... 72 $0 \n", | |
| "72 0 392 BURR ST ... 73 $0 \n", | |
| "73 0 400 BURR ST #1 ... 74 $0 \n", | |
| "74 0 400 BURR ST #2 ... 75 $0 \n", | |
| "75 0 400 BURR ST #3 ... 76 $0 \n", | |
| "76 Average 1611 396 BURR ST #4 ... 77 $219,851 \n", | |
| "77 Average 1828 400 BURR ST #5 ... 78 $229,597 \n", | |
| "78 Average 2038 400 BURR ST #6 ... 79 $246,073 \n", | |
| "79 Above Average 2025 400 BURR ST #7 ... 80 $258,942 \n", | |
| "80 Average 2312 400 BURR ST #8 ... 81 $289,153 \n", | |
| "81 Average 1828 400 BURR ST #9 ... 82 $229,004 \n", | |
| "82 Average 1040 392 BURR ST ... 83 $134,145 \n", | |
| "83 Average 1040 384 BURR ST ... 84 $123,528 \n", | |
| "84 Average 1040 374 BURR ST ... 85 $124,145 \n", | |
| "85 Average 988 366 BURR ST ... 86 $129,540 \n", | |
| "86 Average 1536 360 BURR ST ... 87 $114,600 \n", | |
| "87 Average 1434 354 BURR ST ... 88 $147,419 \n", | |
| "88 Average 1352 348 BURR ST ... 89 $142,187 \n", | |
| "89 0 344 BURR ST ... 90 $0 \n", | |
| "90 0 342 BURR ST ... 91 $0 \n", | |
| "91 Average 1624 340 BURR ST ... 92 $156,458 \n", | |
| "92 Average 1274 336 BURR ST ... 93 $164,403 \n", | |
| "93 Average 1040 330 BURR ST ... 94 $160,494 \n", | |
| "94 Average 1040 320 BURR ST ... 95 $149,198 \n", | |
| "95 Average 1560 310 BURR ST ... 96 $179,867 \n", | |
| "96 Average 2880 175 TOWNSEND TER ... 97 $298,980 \n", | |
| "97 Average 2916 155 TOWNSEND TER ... 98 $321,943 \n", | |
| "98 Average 1632 147 TOWNSEND TER ... 99 $208,412 \n", | |
| "\n", | |
| " Sale_Price Size_Acres Style Total_Bedrooms Total_Bthrms \\\n", | |
| "0 $0 3.8 Vacant Land \n", | |
| "1 $0 1.7 Vacant Land \n", | |
| "2 $276,040 0.25 Cape Cod 3 Bedrooms 2 \n", | |
| "3 $140,000 0.18 Colonial 3 Bedrooms 2 \n", | |
| "4 $183,000 0.36 Ranch 3 Bedrooms 1 \n", | |
| "5 $0 0.32 Bungalow 3 Bedrooms 1 \n", | |
| "6 $0 0.35 Cape Cod 5 Bedrooms 2 \n", | |
| "7 $64,000 0.17 Cape Cod 3 Bedrooms 1 \n", | |
| "8 $142,000 0.14 Cape Cod 2 Bedrooms 1 \n", | |
| "9 $157,000 0.2 Ranch 1 Bedroom 1 \n", | |
| "10 $0 0.24 Vacant Land \n", | |
| "11 $0 0.09 Vacant Land \n", | |
| "12 $0 0.43 Vacant Land \n", | |
| "13 $0 0.14 Vacant Land \n", | |
| "14 $0 0.26 Vacant Land \n", | |
| "15 $0 0.03 Vacant Land \n", | |
| "16 $0 0.43 Vacant Land \n", | |
| "17 $0 0.29 Vacant Land \n", | |
| "18 $0 0.99 Vacant Land \n", | |
| "19 $0 8.05 Vacant Land \n", | |
| "20 $0 1 Vacant Land \n", | |
| "21 $0 2.5 Vacant Land \n", | |
| "22 $0 0.16 Cape Cod 2 Bedrooms 1 \n", | |
| "23 $0 0.13 Colonial 3 Bedrooms 2 \n", | |
| "24 $0 0.13 Bungalow 2 Bedrooms 1 \n", | |
| "25 $0 0.13 Bungalow 2 Bedrooms 1 \n", | |
| "26 $245,000 0.12 Colonial 4 Bedrooms 1 \n", | |
| "27 $0 0.27 Vacant Land \n", | |
| "28 $0 0.13 Raised Ranch 3 Bedrooms 1 \n", | |
| "29 $0 0.13 Raised Ranch 3 Bedrooms 1 \n", | |
| ".. ... ... ... ... ... \n", | |
| "69 $0 1.21 Vacant Land \n", | |
| "70 $0 2.61 Outbuildings \n", | |
| "71 $0 3.88 Outbuildings \n", | |
| "72 $0 0.01 Vacant Land \n", | |
| "73 $0 0.62 Vacant Land \n", | |
| "74 $0 0.34 Vacant Land \n", | |
| "75 $0 0.41 Vacant Land \n", | |
| "76 $245,000 0.62 Raised Ranch 4 Bedrooms 2 \n", | |
| "77 $0 0.35 Cape Cod 3 Bedrooms 2 \n", | |
| "78 $0 0.19 Cape Cod 3 Bedrooms 2 \n", | |
| "79 $0 0.24 Cape Cod 3 Bedrooms 2 \n", | |
| "80 $0 0.31 Cape Cod 3 Bedrooms 2 \n", | |
| "81 $0 0.42 Cape Cod 3 Bedrooms 2 \n", | |
| "82 $0 0.18 Ranch 3 Bedrooms 1 \n", | |
| "83 $0 0.24 Ranch 3 Bedrooms 1 \n", | |
| "84 $148,000 0.24 Ranch 3 Bedrooms 1 \n", | |
| "85 $93,280 0.24 Ranch 3 Bedrooms 1 \n", | |
| "86 $0 0.27 2 Family 4 Bedrooms 2 \n", | |
| "87 $129,900 0.24 Colonial 3 Bedrooms 1 \n", | |
| "88 $138,500 0.24 Colonial 3 Bedrooms 1 \n", | |
| "89 $0 0.04 Vacant Land \n", | |
| "90 $0 0.73 Vacant Land \n", | |
| "91 $30,000 0.21 Cape Cod 3 Bedrooms 1 \n", | |
| "92 $0 0.25 Raised Ranch 3 Bedrooms 1 \n", | |
| "93 $0 0.25 Raised Ranch 3 Bedrooms 1 \n", | |
| "94 $0 0.25 Raised Ranch 3 Bedrooms 1 \n", | |
| "95 $0 0.20 Raised Ranch 3 Bedrooms 1 \n", | |
| "96 $330,000 0.19 Split-Level 3 Bedrooms 1 \n", | |
| "97 $0 0.25 Colonial 4 Bedrooms 2 \n", | |
| "98 $138,000 0.29 Split-Level 3 Bedrooms 2 \n", | |
| "\n", | |
| " Total_Half_Baths Year_Built Zone \n", | |
| "0 AIRP \n", | |
| "1 AIRP \n", | |
| "2 1 1950 RS2 \n", | |
| "3 1 1989 RS2 \n", | |
| "4 0 1945 RS2 \n", | |
| "5 0 1940 RS2 \n", | |
| "6 0 1986 RS2 \n", | |
| "7 1 1940 RS2 \n", | |
| "8 0 1930 RS2 \n", | |
| "9 0 1900 RS2 \n", | |
| "10 RS2 \n", | |
| "11 AIRP \n", | |
| "12 RS2 \n", | |
| "13 RS2 \n", | |
| "14 RS2 \n", | |
| "15 RS2 \n", | |
| "16 RS2 \n", | |
| "17 RS2 \n", | |
| "18 RS2 \n", | |
| "19 AIRP \n", | |
| "20 AIRP \n", | |
| "21 AIRP \n", | |
| "22 1 1940 RS2 \n", | |
| "23 0 1940 RS2 \n", | |
| "24 0 1940 RS2 \n", | |
| "25 0 1925 RS2 \n", | |
| "26 0 1940 RS2 \n", | |
| "27 RS2 \n", | |
| "28 0 1975 RS2 \n", | |
| "29 0 1969 RS2 \n", | |
| ".. ... ... ... \n", | |
| "69 AIRP \n", | |
| "70 AIRP \n", | |
| "71 AIRP \n", | |
| "72 RS2 \n", | |
| "73 RS2 \n", | |
| "74 RS2 \n", | |
| "75 RS2 \n", | |
| "76 1 1990 RS2 \n", | |
| "77 1 2009 RS2 \n", | |
| "78 1 2009 RS2 \n", | |
| "79 1 2009 RS2 \n", | |
| "80 1 2009 RS2 \n", | |
| "81 1 2009 RS2 \n", | |
| "82 0 1955 RS2 \n", | |
| "83 0 1957 RS2 \n", | |
| "84 0 1957 RS2 \n", | |
| "85 1 1956 RS2 \n", | |
| "86 0 1930 RS2 \n", | |
| "87 1 1950 RS2 \n", | |
| "88 0 1941 RS2 \n", | |
| "89 RS2 \n", | |
| "90 RS2 \n", | |
| "91 0 1950 RS2 \n", | |
| "92 1 1968 RS2 \n", | |
| "93 0 1964 RS2 \n", | |
| "94 1 1967 RS2 \n", | |
| "95 0 1965 RS2 \n", | |
| "96 1 1968 RS2 \n", | |
| "97 1 1970 RS2 \n", | |
| "98 1 1969 RS2 \n", | |
| "\n", | |
| "[99 rows x 24 columns]" | |
| ] | |
| }, | |
| "execution_count": 10, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "# Parallel?\n", | |
| "from joblib import Parallel, delayed \n", | |
| "import multiprocessing\n", | |
| "\n", | |
| "CORES = multiprocessing.cpu_count()\n", | |
| "\n", | |
| "COLS = set(['Location', 'Appraisal', 'PID', 'Owner', 'Address', 'Sale_Price', 'Year_Built', 'Living_Area', 'Replacement_Cost', 'Building_Percent_Good', 'Size_Acres', 'Zone', 'Neighborhood', 'Appraised_Value'] + \\\n", | |
| " ['Style', 'Model', 'Grade', 'Occupancy', 'AC_Type', 'Total_Bedrooms', 'Total_Bthrms', 'Total_Half_Baths', 'Bath_Style', 'Kitchen_Style'])\n", | |
| "DICT = {key: [] for key in COLS}\n", | |
| "\n", | |
| "def prettify(s):\n", | |
| "    \"\"\"Converts strings into nice column names\"\"\"\n", | |
| "    s = re.sub(r\"[^\\w\\s]\", '', s) # Remove everything except word characters (letters, digits, underscore) and whitespace\n", | |
| "    s = re.sub(r\"\\s+\", '_', s) # Replace each run of whitespace with a single underscore\n", | |
| "    return s\n", | |
| "\n", | |
| "def extract(html):\n", | |
| "    \"\"\"Given a html string, extracts all the relevant information and returns a dictionary of items\"\"\"\n", | |
| "    items = {}\n", | |
| "    soup = BeautifulSoup(html, \"lxml\")\n", | |
| "\n", | |
| "    # GENERAL INFORMATION\n", | |
| "    dt = soup(\"dt\")\n", | |
| "    dd = soup(\"dd\")\n", | |
| "    labels = soup(\"td\", class_=\"plabel\")\n", | |
| "    data = soup(\"td\", class_=\"data\")\n", | |
| "\n", | |
| "    names = [prettify(x.text.strip()) for x in dt] + \\\n", | |
| "        [prettify(x.text.replace(\"\\n\", \"\").strip()) for x in labels]\n", | |
| "    values = [x.text.strip() for x in dd] + \\\n", | |
| "        [x.text.strip() for x in data]\n", | |
| "    items.update(dict(zip(names, values)))\n", | |
| "\n", | |
| "    # BUILDING\n", | |
| "    building_tr = soup.find(id=\"MainContent_ctl01_grdCns\") \\\n", | |
| "        .find_all(\"tr\")[1:]\n", | |
| "    building = [[td.text for td in tr(\"td\")] for tr in building_tr]\n", | |
| "    building = [(prettify(a),b) for a,b in building]\n", | |
| "    items.update(dict(building))\n", | |
| "\n", | |
| "    # SALES\n", | |
| "    people_tr = soup.find(id=\"MainContent_grdSales\") \\\n", | |
| "        .find_all(\"tr\", limit=11)[1:]\n", | |
| "    people_sales = \";\".join([\",\".join([td.text.replace(\",\",\"\") for td in p.find_all(\"td\")]) for p in people_tr])\n", | |
| "    items['sales'] = people_sales\n", | |
| "\n", | |
| "    # EXTRAS\n", | |
| "    people_tr = soup.find(id=\"MainContent_grdXf\") \\\n", | |
| "        .find_all(\"tr\")[1:]\n", | |
| "    if len(people_tr) > 0:\n", | |
| "        items['extras'] = sum([int(tr.find_all(\"td\")[3].text.replace(\",\", \"\")[1:]) for tr in people_tr])\n", | |
| "\n", | |
| "    # GARAGE\n", | |
| "    out_tr = soup.find(id=\"MainContent_grdOb\") \\\n", | |
| "        .find_all(\"tr\")[1:]\n", | |
| "    if len(out_tr) > 0:\n", | |
| "        out_bs = [\",\".join([td.text.replace(\",\",\"\") for td in tr.find_all(\"td\")]) for tr in out_tr]\n", | |
| "        items['garage'] = sum([int(out_b.split(\",\")[5][1:]) for out_b in out_bs if \"GARAGE\" in out_b])\n", | |
| "\n", | |
| "    return items\n", | |
| "\n", | |
| "def scrape(fname):\n", | |
| "    \"\"\"Scrape an HTML file, appending the results to DICT\"\"\"\n", | |
| "    with open(\"newdata/\" + fname) as f:\n", | |
| "        html = f.read()\n", | |
| "\n", | |
| "    # converting <br>'s to \\n, so that the address' newline doesn't get gobbled up\n", | |
| "    html = re.sub('<br/?>', '\\n', html)\n", | |
| "    items = extract(html)\n", | |
| "    for key in COLS:\n", | |
| "        # a field missing from this page is stored as None so every column keeps the same length\n", | |
| "        DICT[key].append(items.get(key))\n", | |
| "\n", | |
| "files = [str(i) + \".html\" for i in range(1, 100)]\n", | |
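| "# n_jobs=1 runs the files one after another in this process; scrape() appends to the\n", | |
| "# module-level DICT, which separate worker processes would not share\n", | |
| "Parallel(n_jobs=1)(delayed(scrape)(f) for f in files)\n", | |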
| "\n", | |
| "DataFrame.from_dict(DICT)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "collapsed": true | |
| }, | |
| "source": [ | |
| "There we have it: all 99 pages scraped into a single `DataFrame` of 24 columns." | |
| ] | |
| } | |
| ], | |
| "metadata": { | |
| "kernelspec": { | |
| "display_name": "Python 3", | |
| "language": "python", | |
| "name": "python3" | |
| }, | |
| "language_info": { | |
| "codemirror_mode": { | |
| "name": "ipython", | |
| "version": 3 | |
| }, | |
| "file_extension": ".py", | |
| "mimetype": "text/x-python", | |
| "name": "python", | |
| "nbconvert_exporter": "python", | |
| "pygments_lexer": "ipython3", | |
| "version": "3.4.3" | |
| } | |
| }, | |
| "nbformat": 4, | |
| "nbformat_minor": 0 | |
| } |
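The last code cell keeps `n_jobs=1` because `scrape` appends to a shared `DICT`. One way to make the step safe to run on several workers is to have each call return its own dictionary and to build the columns afterwards. The following is a minimal sketch of that pattern, not the notebook's own code: `scrape_one` and `to_columns` are hypothetical names, the column list is shortened, the values are made up, and the standard library's `ThreadPoolExecutor` stands in for `joblib` so the example is self-contained.

```python
from concurrent.futures import ThreadPoolExecutor

# Shortened, illustrative column list (the notebook's COLS has 24 entries).
COLS = ["PID", "Location", "Zone"]


def scrape_one(fname):
    """Hypothetical stand-in for scrape(): returns a dict instead of
    appending to a module-level dictionary."""
    pid = int(fname.split(".")[0])
    # Made-up values; the real function would call extract() on the file.
    return {"PID": pid, "Location": "%d EXAMPLE ST" % pid}


def to_columns(rows, cols):
    """Turn a list of per-file dicts into a dict of equal-length column
    lists, using None where a file did not provide a field."""
    return {key: [row.get(key) for row in rows] for key in cols}


files = [str(i) + ".html" for i in range(1, 4)]

# pool.map yields results in the same order as the inputs.
with ThreadPoolExecutor(max_workers=2) as pool:
    rows = list(pool.map(scrape_one, files))

table = to_columns(rows, COLS)
```

With `joblib`, the analogous call would be `rows = Parallel(n_jobs=CORES)(delayed(scrape_one)(f) for f in files)`, which also returns one result per input in input order, and `DataFrame.from_dict(table)` would then build the frame as before.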