Created September 13, 2017 08:14
A tutorial for scraping information on all the S&P 500 listed companies
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": true | |
},
"source": [
"This tutorial covers how to use Pandas' read_html function to pull tables straight from a URL. \n",
"Wikipedia has a nice, regularly updated page listing the S&P 500 companies. \n",
"\n",
"You can find more info on pandas.read_html here: https://pandas.pydata.org/docs/reference/api/pandas.read_html.html\n",
"\n", | |
"Link for S&P 500 table: https://en.wikipedia.org/wiki/List_of_S%26P_500_companies\n", | |
"\n", | |
"Let's start exploring the tables Pandas grabs from the wiki page" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"import pandas as pd" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": { | |
"collapsed": true | |
},
"outputs": [],
"source": [
"# read_html returns a list of DataFrames, one for each table it finds on the page\n",
"data = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": { | |
"scrolled": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style>\n", | |
" .dataframe thead tr:only-child th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: left;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>0</th>\n", | |
" <th>1</th>\n", | |
" <th>2</th>\n", | |
" <th>3</th>\n", | |
" <th>4</th>\n", | |
" <th>5</th>\n", | |
" <th>6</th>\n", | |
" <th>7</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>Ticker symbol</td>\n", | |
" <td>Security</td>\n", | |
" <td>SEC filings</td>\n", | |
" <td>GICS Sector</td>\n", | |
" <td>GICS Sub Industry</td>\n", | |
" <td>Address of Headquarters</td>\n", | |
" <td>Date first added</td>\n", | |
" <td>CIK</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>MMM</td>\n", | |
" <td>3M Company</td>\n", | |
" <td>reports</td>\n", | |
" <td>Industrials</td>\n", | |
" <td>Industrial Conglomerates</td>\n", | |
" <td>St. Paul, Minnesota</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000066740</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>ABT</td>\n", | |
" <td>Abbott Laboratories</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Health Care Equipment</td>\n", | |
" <td>North Chicago, Illinois</td>\n", | |
" <td>1964-03-31</td>\n", | |
" <td>0000001800</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>ABBV</td>\n", | |
" <td>AbbVie Inc.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Pharmaceuticals</td>\n", | |
" <td>North Chicago, Illinois</td>\n", | |
" <td>2012-12-31</td>\n", | |
" <td>0001551152</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>ACN</td>\n", | |
" <td>Accenture plc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>IT Consulting & Other Services</td>\n", | |
" <td>Dublin, Ireland</td>\n", | |
" <td>2011-07-06</td>\n", | |
" <td>0001467373</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" 0 1 2 3 \\\n", | |
"0 Ticker symbol Security SEC filings GICS Sector \n", | |
"1 MMM 3M Company reports Industrials \n", | |
"2 ABT Abbott Laboratories reports Health Care \n", | |
"3 ABBV AbbVie Inc. reports Health Care \n", | |
"4 ACN Accenture plc reports Information Technology \n", | |
"\n", | |
" 4 5 6 \\\n", | |
"0 GICS Sub Industry Address of Headquarters Date first added \n", | |
"1 Industrial Conglomerates St. Paul, Minnesota NaN \n", | |
"2 Health Care Equipment North Chicago, Illinois 1964-03-31 \n", | |
"3 Pharmaceuticals North Chicago, Illinois 2012-12-31 \n", | |
"4 IT Consulting & Other Services Dublin, Ireland 2011-07-06 \n", | |
"\n", | |
" 7 \n", | |
"0 CIK \n", | |
"1 0000066740 \n", | |
"2 0000001800 \n", | |
"3 0001551152 \n", | |
"4 0001467373 " | |
] | |
}, | |
"execution_count": 3, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"table = data[0]\n", | |
"table.head()" | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Indexing the first element of the list that Pandas returns gives us a nice data frame (think kickass Excel) with all the information we need for this exercise. \n",
"\n",
"However, there's a small catch!\n",
"\n",
"Notice how the header row ([Ticker symbol, ..., CIK]) is treated as the first row of data? The power of Pandas is the ease with which data can be manipulated. Let's quickly correct this by slicing off that row and using it to rename the columns."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": { | |
"collapsed": true | |
},
"outputs": [],
"source": [
"# Everything after row 0 is actual company data\n",
"sliced_table = table[1:]\n",
"# Row 0 holds the real column names\n",
"header = table.iloc[0]"
] | |
}, | |
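{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a side note, the same header-promotion idea can be tried out on a tiny hand-made frame, without hitting Wikipedia. \n",
"The cell below is only a minimal sketch: the `toy_raw` and `toy_table` names and the three sample rows are made up for illustration and are not part of the scraped data. \n",
"\n",
"`pd.read_html` also accepts a `header` argument (for example `header=0`), which may let you skip the manual fix entirely; that variant is not run here."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy data that mimics the scraped table: row 0 holds the column names\n",
"toy_raw = pd.DataFrame([\n",
"    ['Ticker symbol', 'Security', 'CIK'],\n",
"    ['MMM', '3M Company', '0000066740'],\n",
"    ['ABT', 'Abbott Laboratories', '0000001800'],\n",
"])\n",
"\n",
"# Drop the header row, then reuse its values as the column labels\n",
"toy_table = toy_raw[1:].copy()\n",
"toy_table.columns = toy_raw.iloc[0].tolist()\n",
"\n",
"# Restart the row index at 0 now that the old first row is gone\n",
"toy_table = toy_table.reset_index(drop=True)\n",
"toy_table"
]
},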
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": { | |
"scrolled": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style>\n", | |
" .dataframe thead tr:only-child th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: left;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Ticker symbol</th>\n", | |
" <th>Security</th>\n", | |
" <th>SEC filings</th>\n", | |
" <th>GICS Sector</th>\n", | |
" <th>GICS Sub Industry</th>\n", | |
" <th>Address of Headquarters</th>\n", | |
" <th>Date first added</th>\n", | |
" <th>CIK</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>MMM</td>\n", | |
" <td>3M Company</td>\n", | |
" <td>reports</td>\n", | |
" <td>Industrials</td>\n", | |
" <td>Industrial Conglomerates</td>\n", | |
" <td>St. Paul, Minnesota</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000066740</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>ABT</td>\n", | |
" <td>Abbott Laboratories</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Health Care Equipment</td>\n", | |
" <td>North Chicago, Illinois</td>\n", | |
" <td>1964-03-31</td>\n", | |
" <td>0000001800</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>ABBV</td>\n", | |
" <td>AbbVie Inc.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Pharmaceuticals</td>\n", | |
" <td>North Chicago, Illinois</td>\n", | |
" <td>2012-12-31</td>\n", | |
" <td>0001551152</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>ACN</td>\n", | |
" <td>Accenture plc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>IT Consulting & Other Services</td>\n", | |
" <td>Dublin, Ireland</td>\n", | |
" <td>2011-07-06</td>\n", | |
" <td>0001467373</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>5</th>\n", | |
" <td>ATVI</td>\n", | |
" <td>Activision Blizzard</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Home Entertainment Software</td>\n", | |
" <td>Santa Monica, California</td>\n", | |
" <td>2015-08-31</td>\n", | |
" <td>0000718877</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>6</th>\n", | |
" <td>AYI</td>\n", | |
" <td>Acuity Brands Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Industrials</td>\n", | |
" <td>Electrical Components & Equipment</td>\n", | |
" <td>Atlanta, Georgia</td>\n", | |
" <td>2016-05-03</td>\n", | |
" <td>0001144215</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>7</th>\n", | |
" <td>ADBE</td>\n", | |
" <td>Adobe Systems Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Application Software</td>\n", | |
" <td>San Jose, California</td>\n", | |
" <td>1997-05-05</td>\n", | |
" <td>0000796343</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>8</th>\n", | |
" <td>AMD</td>\n", | |
" <td>Advanced Micro Devices Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Semiconductors</td>\n", | |
" <td>Sunnyvale, California</td>\n", | |
" <td>2017-03-20</td>\n", | |
" <td>0000002488</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>9</th>\n", | |
" <td>AAP</td>\n", | |
" <td>Advance Auto Parts</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Discretionary</td>\n", | |
" <td>Automotive Retail</td>\n", | |
" <td>Roanoke, Virginia</td>\n", | |
" <td>2015-07-09</td>\n", | |
" <td>0001158449</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>10</th>\n", | |
" <td>AES</td>\n", | |
" <td>AES Corp</td>\n", | |
" <td>reports</td>\n", | |
" <td>Utilities</td>\n", | |
" <td>Independent Power Producers & Energy Traders</td>\n", | |
" <td>Arlington, Virginia</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000874761</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>11</th>\n", | |
" <td>AET</td>\n", | |
" <td>Aetna Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Managed Health Care</td>\n", | |
" <td>Hartford, Connecticut</td>\n", | |
" <td>1976-06-30</td>\n", | |
" <td>0001122304</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>12</th>\n", | |
" <td>AMG</td>\n", | |
" <td>Affiliated Managers Group Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Financials</td>\n", | |
" <td>Asset Management & Custody Banks</td>\n", | |
" <td>Beverly, Massachusetts</td>\n", | |
" <td>2014-07-01</td>\n", | |
" <td>0001004434</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>13</th>\n", | |
" <td>AFL</td>\n", | |
" <td>AFLAC Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Financials</td>\n", | |
" <td>Life & Health Insurance</td>\n", | |
" <td>Columbus, Georgia</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000004977</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>14</th>\n", | |
" <td>A</td>\n", | |
" <td>Agilent Technologies Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Health Care Equipment</td>\n", | |
" <td>Santa Clara, California</td>\n", | |
" <td>2000-06-05</td>\n", | |
" <td>0001090872</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>15</th>\n", | |
" <td>APD</td>\n", | |
" <td>Air Products & Chemicals Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Materials</td>\n", | |
" <td>Industrial Gases</td>\n", | |
" <td>Allentown, Pennsylvania</td>\n", | |
" <td>1985-04-30</td>\n", | |
" <td>0000002969</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>16</th>\n", | |
" <td>AKAM</td>\n", | |
" <td>Akamai Technologies Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Internet Software & Services</td>\n", | |
" <td>Cambridge, Massachusetts</td>\n", | |
" <td>2007-07-12</td>\n", | |
" <td>0001086222</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>17</th>\n", | |
" <td>ALK</td>\n", | |
" <td>Alaska Air Group Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Industrials</td>\n", | |
" <td>Airlines</td>\n", | |
" <td>Seattle, Washington</td>\n", | |
" <td>2016-05-13</td>\n", | |
" <td>0000766421</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>18</th>\n", | |
" <td>ALB</td>\n", | |
" <td>Albemarle Corp</td>\n", | |
" <td>reports</td>\n", | |
" <td>Materials</td>\n", | |
" <td>Specialty Chemicals</td>\n", | |
" <td>Baton Rouge, Louisiana</td>\n", | |
" <td>2016-07-01</td>\n", | |
" <td>0000915913</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>19</th>\n", | |
" <td>ARE</td>\n", | |
" <td>Alexandria Real Estate Equities Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Real Estate</td>\n", | |
" <td>Office REITs</td>\n", | |
" <td>Pasadena, California</td>\n", | |
" <td>2017-03-20</td>\n", | |
" <td>0001035443</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>20</th>\n", | |
" <td>ALXN</td>\n", | |
" <td>Alexion Pharmaceuticals</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Biotechnology</td>\n", | |
" <td>Cheshire, Connecticut</td>\n", | |
" <td>2012-05-25</td>\n", | |
" <td>0000899866</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>21</th>\n", | |
" <td>ALGN</td>\n", | |
" <td>Align Technology</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Health Care Supplies</td>\n", | |
" <td>San Jose, California</td>\n", | |
" <td>2017-06-19</td>\n", | |
" <td>0001097149</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>22</th>\n", | |
" <td>ALLE</td>\n", | |
" <td>Allegion</td>\n", | |
" <td>reports</td>\n", | |
" <td>Industrials</td>\n", | |
" <td>Building Products</td>\n", | |
" <td>Dublin, Ireland</td>\n", | |
" <td>2013-12-02</td>\n", | |
" <td>0001579241</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>23</th>\n", | |
" <td>AGN</td>\n", | |
" <td>Allergan, Plc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Pharmaceuticals</td>\n", | |
" <td>Dublin, Ireland</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000884629</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>24</th>\n", | |
" <td>ADS</td>\n", | |
" <td>Alliance Data Systems</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Data Processing & Outsourced Services</td>\n", | |
" <td>Plano, Texas</td>\n", | |
" <td>2013-12-23</td>\n", | |
" <td>0001101215</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>25</th>\n", | |
" <td>LNT</td>\n", | |
" <td>Alliant Energy Corp</td>\n", | |
" <td>reports</td>\n", | |
" <td>Utilities</td>\n", | |
" <td>Electric Utilities</td>\n", | |
" <td>Madison, Wisconsin</td>\n", | |
" <td>2016-07-01</td>\n", | |
" <td>0000352541</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>26</th>\n", | |
" <td>ALL</td>\n", | |
" <td>Allstate Corp</td>\n", | |
" <td>reports</td>\n", | |
" <td>Financials</td>\n", | |
" <td>Property & Casualty Insurance</td>\n", | |
" <td>Northfield Township, Illinois</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000899051</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>27</th>\n", | |
" <td>GOOGL</td>\n", | |
" <td>Alphabet Inc Class A</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Internet Software & Services</td>\n", | |
" <td>Mountain View, California</td>\n", | |
" <td>2014-04-03</td>\n", | |
" <td>0001652044</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>28</th>\n", | |
" <td>GOOG</td>\n", | |
" <td>Alphabet Inc Class C</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Internet Software & Services</td>\n", | |
" <td>Mountain View, California</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0001652044</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>29</th>\n", | |
" <td>MO</td>\n", | |
" <td>Altria Group Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Staples</td>\n", | |
" <td>Tobacco</td>\n", | |
" <td>Richmond, Virginia</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000764180</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>30</th>\n", | |
" <td>AMZN</td>\n", | |
" <td>Amazon.com Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Discretionary</td>\n", | |
" <td>Internet & Direct Marketing Retail</td>\n", | |
" <td>Seattle, Washington</td>\n", | |
" <td>2005-11-18</td>\n", | |
" <td>0001018724</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>...</th>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>476</th>\n", | |
" <td>VIAB</td>\n", | |
" <td>Viacom Inc.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Discretionary</td>\n", | |
" <td>Cable & Satellite</td>\n", | |
" <td>New York, New York</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0001339947</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>477</th>\n", | |
" <td>V</td>\n", | |
" <td>Visa Inc.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Internet Software & Services</td>\n", | |
" <td>San Francisco, California</td>\n", | |
" <td>2009-12-21</td>\n", | |
" <td>0001403161</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>478</th>\n", | |
" <td>VNO</td>\n", | |
" <td>Vornado Realty Trust</td>\n", | |
" <td>reports</td>\n", | |
" <td>Real Estate</td>\n", | |
" <td>Office REITs</td>\n", | |
" <td>New York, New York</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000899689</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>479</th>\n", | |
" <td>VMC</td>\n", | |
" <td>Vulcan Materials</td>\n", | |
" <td>reports</td>\n", | |
" <td>Materials</td>\n", | |
" <td>Construction Materials</td>\n", | |
" <td>Birmingham, Alabama</td>\n", | |
" <td>1999-06-30</td>\n", | |
" <td>0001396009</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>480</th>\n", | |
" <td>WMT</td>\n", | |
" <td>Wal-Mart Stores</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Staples</td>\n", | |
" <td>Hypermarkets & Super Centers</td>\n", | |
" <td>Bentonville, Arkansas</td>\n", | |
" <td>1982-08-31</td>\n", | |
" <td>0000104169</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>481</th>\n", | |
" <td>WBA</td>\n", | |
" <td>Walgreens Boots Alliance</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Staples</td>\n", | |
" <td>Drug Retail</td>\n", | |
" <td>Deerfield, Illinois</td>\n", | |
" <td>1979-12-31</td>\n", | |
" <td>0001618921</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>482</th>\n", | |
" <td>DIS</td>\n", | |
" <td>The Walt Disney Company</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Discretionary</td>\n", | |
" <td>Cable & Satellite</td>\n", | |
" <td>Burbank, California</td>\n", | |
" <td>1976-06-30</td>\n", | |
" <td>0001001039</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>483</th>\n", | |
" <td>WM</td>\n", | |
" <td>Waste Management Inc.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Industrials</td>\n", | |
" <td>Environmental & Facilities Services</td>\n", | |
" <td>Houston, Texas</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000823768</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>484</th>\n", | |
" <td>WAT</td>\n", | |
" <td>Waters Corporation</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Health Care Distributors</td>\n", | |
" <td>Milford, Massachusetts</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0001000697</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>485</th>\n", | |
" <td>WEC</td>\n", | |
" <td>Wec Energy Group Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Utilities</td>\n", | |
" <td>Electric Utilities</td>\n", | |
" <td>Milwaukee, Wisconsin</td>\n", | |
" <td>2008-10-31</td>\n", | |
" <td>0000783325</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>486</th>\n", | |
" <td>WFC</td>\n", | |
" <td>Wells Fargo</td>\n", | |
" <td>reports</td>\n", | |
" <td>Financials</td>\n", | |
" <td>Diversified Banks</td>\n", | |
" <td>San Francisco, California</td>\n", | |
" <td>1976-06-30</td>\n", | |
" <td>0000072971</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>487</th>\n", | |
" <td>HCN</td>\n", | |
" <td>Welltower Inc.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Real Estate</td>\n", | |
" <td>Health Care REITs</td>\n", | |
" <td>Toledo, Ohio</td>\n", | |
" <td>2009-01-30</td>\n", | |
" <td>0000766704</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>488</th>\n", | |
" <td>WDC</td>\n", | |
" <td>Western Digital</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Technology Hardware, Storage & Peripherals</td>\n", | |
" <td>Irvine, California</td>\n", | |
" <td>2009-07-01</td>\n", | |
" <td>0000106040</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>489</th>\n", | |
" <td>WU</td>\n", | |
" <td>Western Union Co</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Internet Software & Services</td>\n", | |
" <td>Englewood, Colorado</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0001365135</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>490</th>\n", | |
" <td>WRK</td>\n", | |
" <td>WestRock Company</td>\n", | |
" <td>reports</td>\n", | |
" <td>Materials</td>\n", | |
" <td>Paper Packaging</td>\n", | |
" <td>Richmond, Virginia</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0001636023</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>491</th>\n", | |
" <td>WY</td>\n", | |
" <td>Weyerhaeuser Corp.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Real Estate</td>\n", | |
" <td>Specialized REITs</td>\n", | |
" <td>Federal Way, Washington</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000106535</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>492</th>\n", | |
" <td>WHR</td>\n", | |
" <td>Whirlpool Corp.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Discretionary</td>\n", | |
" <td>Household Appliances</td>\n", | |
" <td>Benton Harbor, Michigan</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000106640</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>493</th>\n", | |
" <td>WMB</td>\n", | |
" <td>Williams Cos.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Energy</td>\n", | |
" <td>Oil & Gas Storage & Transportation</td>\n", | |
" <td>Tulsa, Oklahoma</td>\n", | |
" <td>1975-03-31</td>\n", | |
" <td>0000107263</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>494</th>\n", | |
" <td>WLTW</td>\n", | |
" <td>Willis Towers Watson</td>\n", | |
" <td>reports</td>\n", | |
" <td>Financials</td>\n", | |
" <td>Insurance Brokers</td>\n", | |
" <td>London, United Kingdom</td>\n", | |
" <td>2016-01-05</td>\n", | |
" <td>0001140536</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>495</th>\n", | |
" <td>WYN</td>\n", | |
" <td>Wyndham Worldwide</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Discretionary</td>\n", | |
" <td>Hotels, Resorts & Cruise Lines</td>\n", | |
" <td>Parsippany, New Jersey</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0001361658</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>496</th>\n", | |
" <td>WYNN</td>\n", | |
" <td>Wynn Resorts Ltd</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Discretionary</td>\n", | |
" <td>Casinos & Gaming</td>\n", | |
" <td>Paradise, Nevada</td>\n", | |
" <td>2008-11-14</td>\n", | |
" <td>0001174922</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>497</th>\n", | |
" <td>XEL</td>\n", | |
" <td>Xcel Energy Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Utilities</td>\n", | |
" <td>Multi-Utilities</td>\n", | |
" <td>Minneapolis, Minnesota</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000072903</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>498</th>\n", | |
" <td>XRX</td>\n", | |
" <td>Xerox Corp.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Technology Hardware, Storage & Peripherals</td>\n", | |
" <td>Norwalk, Connecticut</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000108772</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>499</th>\n", | |
" <td>XLNX</td>\n", | |
" <td>Xilinx Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Information Technology</td>\n", | |
" <td>Semiconductors</td>\n", | |
" <td>San Jose, California</td>\n", | |
" <td>1999-11-08</td>\n", | |
" <td>0000743988</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>500</th>\n", | |
" <td>XL</td>\n", | |
" <td>XL Capital</td>\n", | |
" <td>reports</td>\n", | |
" <td>Financials</td>\n", | |
" <td>Property & Casualty Insurance</td>\n", | |
" <td>Hamilton, Bermuda</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000875159</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>501</th>\n", | |
" <td>XYL</td>\n", | |
" <td>Xylem Inc.</td>\n", | |
" <td>reports</td>\n", | |
" <td>Industrials</td>\n", | |
" <td>Industrial Machinery</td>\n", | |
" <td>White Plains, New York</td>\n", | |
" <td>2011-11-01</td>\n", | |
" <td>0001524472</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>502</th>\n", | |
" <td>YUM</td>\n", | |
" <td>Yum! Brands Inc</td>\n", | |
" <td>reports</td>\n", | |
" <td>Consumer Discretionary</td>\n", | |
" <td>Restaurants</td>\n", | |
" <td>Louisville, Kentucky</td>\n", | |
" <td>1997-10-06</td>\n", | |
" <td>0001041061</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>503</th>\n", | |
" <td>ZBH</td>\n", | |
" <td>Zimmer Biomet Holdings</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Health Care Equipment</td>\n", | |
" <td>Warsaw, Indiana</td>\n", | |
" <td>2001-08-07</td>\n", | |
" <td>0001136869</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>504</th>\n", | |
" <td>ZION</td>\n", | |
" <td>Zions Bancorp</td>\n", | |
" <td>reports</td>\n", | |
" <td>Financials</td>\n", | |
" <td>Regional Banks</td>\n", | |
" <td>Salt Lake City, Utah</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0000109380</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>505</th>\n", | |
" <td>ZTS</td>\n", | |
" <td>Zoetis</td>\n", | |
" <td>reports</td>\n", | |
" <td>Health Care</td>\n", | |
" <td>Pharmaceuticals</td>\n", | |
" <td>Florham Park, New Jersey</td>\n", | |
" <td>2013-06-21</td>\n", | |
" <td>0001555280</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"<p>505 rows × 8 columns</p>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Ticker symbol Security SEC filings \\\n", | |
"1 MMM 3M Company reports \n", | |
"2 ABT Abbott Laboratories reports \n", | |
"3 ABBV AbbVie Inc. reports \n", | |
"4 ACN Accenture plc reports \n", | |
"5 ATVI Activision Blizzard reports \n", | |
"6 AYI Acuity Brands Inc reports \n", | |
"7 ADBE Adobe Systems Inc reports \n", | |
"8 AMD Advanced Micro Devices Inc reports \n", | |
"9 AAP Advance Auto Parts reports \n", | |
"10 AES AES Corp reports \n", | |
"11 AET Aetna Inc reports \n", | |
"12 AMG Affiliated Managers Group Inc reports \n", | |
"13 AFL AFLAC Inc reports \n", | |
"14 A Agilent Technologies Inc reports \n", | |
"15 APD Air Products & Chemicals Inc reports \n", | |
"16 AKAM Akamai Technologies Inc reports \n", | |
"17 ALK Alaska Air Group Inc reports \n", | |
"18 ALB Albemarle Corp reports \n", | |
"19 ARE Alexandria Real Estate Equities Inc reports \n", | |
"20 ALXN Alexion Pharmaceuticals reports \n", | |
"21 ALGN Align Technology reports \n", | |
"22 ALLE Allegion reports \n", | |
"23 AGN Allergan, Plc reports \n", | |
"24 ADS Alliance Data Systems reports \n", | |
"25 LNT Alliant Energy Corp reports \n", | |
"26 ALL Allstate Corp reports \n", | |
"27 GOOGL Alphabet Inc Class A reports \n", | |
"28 GOOG Alphabet Inc Class C reports \n", | |
"29 MO Altria Group Inc reports \n", | |
"30 AMZN Amazon.com Inc reports \n", | |
".. ... ... ... \n", | |
"476 VIAB Viacom Inc. reports \n", | |
"477 V Visa Inc. reports \n", | |
"478 VNO Vornado Realty Trust reports \n", | |
"479 VMC Vulcan Materials reports \n", | |
"480 WMT Wal-Mart Stores reports \n", | |
"481 WBA Walgreens Boots Alliance reports \n", | |
"482 DIS The Walt Disney Company reports \n", | |
"483 WM Waste Management Inc. reports \n", | |
"484 WAT Waters Corporation reports \n", | |
"485 WEC Wec Energy Group Inc reports \n", | |
"486 WFC Wells Fargo reports \n", | |
"487 HCN Welltower Inc. reports \n", | |
"488 WDC Western Digital reports \n", | |
"489 WU Western Union Co reports \n", | |
"490 WRK WestRock Company reports \n", | |
"491 WY Weyerhaeuser Corp. reports \n", | |
"492 WHR Whirlpool Corp. reports \n", | |
"493 WMB Williams Cos. reports \n", | |
"494 WLTW Willis Towers Watson reports \n", | |
"495 WYN Wyndham Worldwide reports \n", | |
"496 WYNN Wynn Resorts Ltd reports \n", | |
"497 XEL Xcel Energy Inc reports \n", | |
"498 XRX Xerox Corp. reports \n", | |
"499 XLNX Xilinx Inc reports \n", | |
"500 XL XL Capital reports \n", | |
"501 XYL Xylem Inc. reports \n", | |
"502 YUM Yum! Brands Inc reports \n", | |
"503 ZBH Zimmer Biomet Holdings reports \n", | |
"504 ZION Zions Bancorp reports \n", | |
"505 ZTS Zoetis reports \n", | |
"\n", | |
" GICS Sector GICS Sub Industry \\\n", | |
"1 Industrials Industrial Conglomerates \n", | |
"2 Health Care Health Care Equipment \n", | |
"3 Health Care Pharmaceuticals \n", | |
"4 Information Technology IT Consulting & Other Services \n", | |
"5 Information Technology Home Entertainment Software \n", | |
"6 Industrials Electrical Components & Equipment \n", | |
"7 Information Technology Application Software \n", | |
"8 Information Technology Semiconductors \n", | |
"9 Consumer Discretionary Automotive Retail \n", | |
"10 Utilities Independent Power Producers & Energy Traders \n", | |
"11 Health Care Managed Health Care \n", | |
"12 Financials Asset Management & Custody Banks \n", | |
"13 Financials Life & Health Insurance \n", | |
"14 Health Care Health Care Equipment \n", | |
"15 Materials Industrial Gases \n", | |
"16 Information Technology Internet Software & Services \n", | |
"17 Industrials Airlines \n", | |
"18 Materials Specialty Chemicals \n", | |
"19 Real Estate Office REITs \n", | |
"20 Health Care Biotechnology \n", | |
"21 Health Care Health Care Supplies \n", | |
"22 Industrials Building Products \n", | |
"23 Health Care Pharmaceuticals \n", | |
"24 Information Technology Data Processing & Outsourced Services \n", | |
"25 Utilities Electric Utilities \n", | |
"26 Financials Property & Casualty Insurance \n", | |
"27 Information Technology Internet Software & Services \n", | |
"28 Information Technology Internet Software & Services \n", | |
"29 Consumer Staples Tobacco \n", | |
"30 Consumer Discretionary Internet & Direct Marketing Retail \n", | |
".. ... ... \n", | |
"476 Consumer Discretionary Cable & Satellite \n", | |
"477 Information Technology Internet Software & Services \n", | |
"478 Real Estate Office REITs \n", | |
"479 Materials Construction Materials \n", | |
"480 Consumer Staples Hypermarkets & Super Centers \n", | |
"481 Consumer Staples Drug Retail \n", | |
"482 Consumer Discretionary Cable & Satellite \n", | |
"483 Industrials Environmental & Facilities Services \n", | |
"484 Health Care Health Care Distributors \n", | |
"485 Utilities Electric Utilities \n", | |
"486 Financials Diversified Banks \n", | |
"487 Real Estate Health Care REITs \n", | |
"488 Information Technology Technology Hardware, Storage & Peripherals \n", | |
"489 Information Technology Internet Software & Services \n", | |
"490 Materials Paper Packaging \n", | |
"491 Real Estate Specialized REITs \n", | |
"492 Consumer Discretionary Household Appliances \n", | |
"493 Energy Oil & Gas Storage & Transportation \n", | |
"494 Financials Insurance Brokers \n", | |
"495 Consumer Discretionary Hotels, Resorts & Cruise Lines \n", | |
"496 Consumer Discretionary Casinos & Gaming \n", | |
"497 Utilities Multi-Utilities \n", | |
"498 Information Technology Technology Hardware, Storage & Peripherals \n", | |
"499 Information Technology Semiconductors \n", | |
"500 Financials Property & Casualty Insurance \n", | |
"501 Industrials Industrial Machinery \n", | |
"502 Consumer Discretionary Restaurants \n", | |
"503 Health Care Health Care Equipment \n", | |
"504 Financials Regional Banks \n", | |
"505 Health Care Pharmaceuticals \n", | |
"\n", | |
" Address of Headquarters Date first added CIK \n", | |
"1 St. Paul, Minnesota NaN 0000066740 \n", | |
"2 North Chicago, Illinois 1964-03-31 0000001800 \n", | |
"3 North Chicago, Illinois 2012-12-31 0001551152 \n", | |
"4 Dublin, Ireland 2011-07-06 0001467373 \n", | |
"5 Santa Monica, California 2015-08-31 0000718877 \n", | |
"6 Atlanta, Georgia 2016-05-03 0001144215 \n", | |
"7 San Jose, California 1997-05-05 0000796343 \n", | |
"8 Sunnyvale, California 2017-03-20 0000002488 \n", | |
"9 Roanoke, Virginia 2015-07-09 0001158449 \n", | |
"10 Arlington, Virginia NaN 0000874761 \n", | |
"11 Hartford, Connecticut 1976-06-30 0001122304 \n", | |
"12 Beverly, Massachusetts 2014-07-01 0001004434 \n", | |
"13 Columbus, Georgia NaN 0000004977 \n", | |
"14 Santa Clara, California 2000-06-05 0001090872 \n", | |
"15 Allentown, Pennsylvania 1985-04-30 0000002969 \n", | |
"16 Cambridge, Massachusetts 2007-07-12 0001086222 \n", | |
"17 Seattle, Washington 2016-05-13 0000766421 \n", | |
"18 Baton Rouge, Louisiana 2016-07-01 0000915913 \n", | |
"19 Pasadena, California 2017-03-20 0001035443 \n", | |
"20 Cheshire, Connecticut 2012-05-25 0000899866 \n", | |
"21 San Jose, California 2017-06-19 0001097149 \n", | |
"22 Dublin, Ireland 2013-12-02 0001579241 \n", | |
"23 Dublin, Ireland NaN 0000884629 \n", | |
"24 Plano, Texas 2013-12-23 0001101215 \n", | |
"25 Madison, Wisconsin 2016-07-01 0000352541 \n", | |
"26 Northfield Township, Illinois NaN 0000899051 \n", | |
"27 Mountain View, California 2014-04-03 0001652044 \n", | |
"28 Mountain View, California NaN 0001652044 \n", | |
"29 Richmond, Virginia NaN 0000764180 \n", | |
"30 Seattle, Washington 2005-11-18 0001018724 \n", | |
".. ... ... ... \n", | |
"476 New York, New York NaN 0001339947 \n", | |
"477 San Francisco, California 2009-12-21 0001403161 \n", | |
"478 New York, New York NaN 0000899689 \n", | |
"479 Birmingham, Alabama 1999-06-30 0001396009 \n", | |
"480 Bentonville, Arkansas 1982-08-31 0000104169 \n", | |
"481 Deerfield, Illinois 1979-12-31 0001618921 \n", | |
"482 Burbank, California 1976-06-30 0001001039 \n", | |
"483 Houston, Texas NaN 0000823768 \n", | |
"484 Milford, Massachusetts NaN 0001000697 \n", | |
"485 Milwaukee, Wisconsin 2008-10-31 0000783325 \n", | |
"486 San Francisco, California 1976-06-30 0000072971 \n", | |
"487 Toledo, Ohio 2009-01-30 0000766704 \n", | |
"488 Irvine, California 2009-07-01 0000106040 \n", | |
"489 Englewood, Colorado NaN 0001365135 \n", | |
"490 Richmond, Virginia NaN 0001636023 \n", | |
"491 Federal Way, Washington NaN 0000106535 \n", | |
"492 Benton Harbor, Michigan NaN 0000106640 \n", | |
"493 Tulsa, Oklahoma 1975-03-31 0000107263 \n", | |
"494 London, United Kingdom 2016-01-05 0001140536 \n", | |
"495 Parsippany, New Jersey NaN 0001361658 \n", | |
"496 Paradise, Nevada 2008-11-14 0001174922 \n", | |
"497 Minneapolis, Minnesota NaN 0000072903 \n", | |
"498 Norwalk, Connecticut NaN 0000108772 \n", | |
"499 San Jose, California 1999-11-08 0000743988 \n", | |
"500 Hamilton, Bermuda NaN 0000875159 \n", | |
"501 White Plains, New York 2011-11-01 0001524472 \n", | |
"502 Louisville, Kentucky 1997-10-06 0001041061 \n", | |
"503 Warsaw, Indiana 2001-08-07 0001136869 \n", | |
"504 Salt Lake City, Utah NaN 0000109380 \n", | |
"505 Florham Park, New Jersey 2013-06-21 0001555280 \n", | |
"\n", | |
"[505 rows x 8 columns]" | |
] | |
}, | |
"execution_count": 5, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"corrected_table = sliced_table.rename(columns=header)\n", | |
"corrected_table" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Great! Now the table has been cleaned and for whatever extration we require of it. Let's say make a list of all the ticker symbols of the S&P 500 companies. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"['MMM', 'ABT', 'ABBV', 'ACN', 'ATVI', 'AYI', 'ADBE', 'AMD', 'AAP', 'AES', 'AET', 'AMG', 'AFL', 'A', 'APD', 'AKAM', 'ALK', 'ALB', 'ARE', 'ALXN', 'ALGN', 'ALLE', 'AGN', 'ADS', 'LNT', 'ALL', 'GOOGL', 'GOOG', 'MO', 'AMZN', 'AEE', 'AAL', 'AEP', 'AXP', 'AIG', 'AMT', 'AWK', 'AMP', 'ABC', 'AME', 'AMGN', 'APH', 'APC', 'ADI', 'ANDV', 'ANSS', 'ANTM', 'AON', 'AOS', 'APA', 'AIV', 'AAPL', 'AMAT', 'ADM', 'ARNC', 'AJG', 'AIZ', 'T', 'ADSK', 'ADP', 'AZO', 'AVB', 'AVY', 'BHGE', 'BLL', 'BAC', 'BK', 'BCR', 'BAX', 'BBT', 'BDX', 'BRK.B', 'BBY', 'BIIB', 'BLK', 'HRB', 'BA', 'BWA', 'BXP', 'BSX', 'BHF', 'BMY', 'AVGO', 'BF.B', 'CHRW', 'CA', 'COG', 'CPB', 'COF', 'CAH', 'CBOE', 'KMX', 'CCL', 'CAT', 'CBG', 'CBS', 'CELG', 'CNC', 'CNP', 'CTL', 'CERN', 'CF', 'SCHW', 'CHTR', 'CHK', 'CVX', 'CMG', 'CB', 'CHD', 'CI', 'XEC', 'CINF', 'CTAS', 'CSCO', 'C', 'CFG', 'CTXS', 'CLX', 'CME', 'CMS', 'COH', 'KO', 'CTSH', 'CL', 'CMCSA', 'CMA', 'CAG', 'CXO', 'COP', 'ED', 'STZ', 'COO', 'GLW', 'COST', 'COTY', 'CCI', 'CSRA', 'CSX', 'CMI', 'CVS', 'DHI', 'DHR', 'DRI', 'DVA', 'DE', 'DLPH', 'DAL', 'XRAY', 'DVN', 'DLR', 'DFS', 'DISCA', 'DISCK', 'DISH', 'DG', 'DLTR', 'D', 'DOV', 'DWDP', 'DPS', 'DTE', 'DRE', 'DUK', 'DXC', 'ETFC', 'EMN', 'ETN', 'EBAY', 'ECL', 'EIX', 'EW', 'EA', 'EMR', 'ETR', 'EVHC', 'EOG', 'EQT', 'EFX', 'EQIX', 'EQR', 'ESS', 'EL', 'ES', 'RE', 'EXC', 'EXPE', 'EXPD', 'ESRX', 'EXR', 'XOM', 'FFIV', 'FB', 'FAST', 'FRT', 'FDX', 'FIS', 'FITB', 'FE', 'FISV', 'FLIR', 'FLS', 'FLR', 'FMC', 'FL', 'F', 'FTV', 'FBHS', 'BEN', 'FCX', 'GPS', 'GRMN', 'IT', 'GD', 'GE', 'GGP', 'GIS', 'GM', 'GPC', 'GILD', 'GPN', 'GS', 'GT', 'GWW', 'HAL', 'HBI', 'HOG', 'HRS', 'HIG', 'HAS', 'HCA', 'HCP', 'HP', 'HSIC', 'HSY', 'HES', 'HPE', 'HLT', 'HOLX', 'HD', 'HON', 'HRL', 'HST', 'HPQ', 'HUM', 'HBAN', 'IDXX', 'INFO', 'ITW', 'ILMN', 'IR', 'INTC', 'ICE', 'IBM', 'INCY', 'IP', 'IPG', 'IFF', 'INTU', 'ISRG', 'IVZ', 'IRM', 'JEC', 'JBHT', 'SJM', 'JNJ', 'JCI', 'JPM', 'JNPR', 'KSU', 'K', 'KEY', 'KMB', 'KIM', 'KMI', 'KLAC', 'KSS', 'KHC', 'KR', 'LB', 'LLL', 
'LH', 'LRCX', 'LEG', 'LEN', 'LVLT', 'LUK', 'LLY', 'LNC', 'LKQ', 'LMT', 'L', 'LOW', 'LYB', 'MTB', 'MAC', 'M', 'MRO', 'MPC', 'MAR', 'MMC', 'MLM', 'MAS', 'MA', 'MAT', 'MKC', 'MCD', 'MCK', 'MDT', 'MRK', 'MET', 'MTD', 'MGM', 'KORS', 'MCHP', 'MU', 'MSFT', 'MAA', 'MHK', 'TAP', 'MDLZ', 'MON', 'MNST', 'MCO', 'MS', 'MOS', 'MSI', 'MYL', 'NDAQ', 'NOV', 'NAVI', 'NTAP', 'NFLX', 'NWL', 'NFX', 'NEM', 'NWSA', 'NWS', 'NEE', 'NLSN', 'NKE', 'NI', 'NBL', 'JWN', 'NSC', 'NTRS', 'NOC', 'NRG', 'NUE', 'NVDA', 'ORLY', 'OXY', 'OMC', 'OKE', 'ORCL', 'PCAR', 'PKG', 'PH', 'PDCO', 'PAYX', 'PYPL', 'PNR', 'PBCT', 'PEP', 'PKI', 'PRGO', 'PFE', 'PCG', 'PM', 'PSX', 'PNW', 'PXD', 'PNC', 'RL', 'PPG', 'PPL', 'PX', 'PCLN', 'PFG', 'PG', 'PGR', 'PLD', 'PRU', 'PEG', 'PSA', 'PHM', 'PVH', 'QRVO', 'PWR', 'QCOM', 'DGX', 'Q', 'RRC', 'RJF', 'RTN', 'O', 'RHT', 'REG', 'REGN', 'RF', 'RSG', 'RMD', 'RHI', 'ROK', 'COL', 'ROP', 'ROST', 'RCL', 'CRM', 'SBAC', 'SCG', 'SLB', 'SNI', 'STX', 'SEE', 'SRE', 'SHW', 'SIG', 'SPG', 'SWKS', 'SLG', 'SNA', 'SO', 'LUV', 'SPGI', 'SWK', 'SPLS', 'SBUX', 'STT', 'SRCL', 'SYK', 'STI', 'SYMC', 'SYF', 'SNPS', 'SYY', 'TROW', 'TGT', 'TEL', 'FTI', 'TXN', 'TXT', 'TMO', 'TIF', 'TWX', 'TJX', 'TMK', 'TSS', 'TSCO', 'TDG', 'TRV', 'TRIP', 'FOXA', 'FOX', 'TSN', 'UDR', 'ULTA', 'USB', 'UA', 'UAA', 'UNP', 'UAL', 'UNH', 'UPS', 'URI', 'UTX', 'UHS', 'UNM', 'VFC', 'VLO', 'VAR', 'VTR', 'VRSN', 'VRSK', 'VZ', 'VRTX', 'VIAB', 'V', 'VNO', 'VMC', 'WMT', 'WBA', 'DIS', 'WM', 'WAT', 'WEC', 'WFC', 'HCN', 'WDC', 'WU', 'WRK', 'WY', 'WHR', 'WMB', 'WLTW', 'WYN', 'WYNN', 'XEL', 'XRX', 'XLNX', 'XL', 'XYL', 'YUM', 'ZBH', 'ZION', 'ZTS']\n" | |
] | |
} | |
], | |
"source": [ | |
"tickers = corrected_table['Ticker symbol'].tolist()\n", | |
"print (tickers)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": true | |
}, | |
"source": [ | |
"# Go ahead and play with the cleaned data. You only get better with practice which exposes you to various methods. Happy coding!" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.6.2" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 1 | |
} |
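As one possible way to keep playing with the cleaned data, the sketch below counts companies per headquarters location and picks out tickers that share a CIK, which here indicates several share classes of one company. `sample` is a hypothetical stand-in for `corrected_table`, assembled from six rows and the ticker list shown in the outputs above so that the code runs without fetching the Wikipedia page. The names `sample`, `hq_counts`, and `multi_class` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Hypothetical stand-in for `corrected_table`: six rows taken from the
# notebook output, using only column names that appear in that output.
sample = pd.DataFrame(
    {
        "Ticker symbol": ["MMM", "ABT", "ABBV", "ACN", "GOOGL", "GOOG"],
        "Address of Headquarters": [
            "St. Paul, Minnesota",
            "North Chicago, Illinois",
            "North Chicago, Illinois",
            "Dublin, Ireland",
            "Mountain View, California",
            "Mountain View, California",
        ],
        # CIK values are kept as strings so the leading zeros survive.
        "CIK": [
            "0000066740",
            "0000001800",
            "0001551152",
            "0001467373",
            "0001652044",
            "0001652044",
        ],
    }
)

# Number of listed tickers per headquarters location.
hq_counts = sample["Address of Headquarters"].value_counts()

# Tickers whose CIK appears more than once in the table.
shared_cik = sample.duplicated("CIK", keep=False)
multi_class = sample.loc[shared_cik, "Ticker symbol"].tolist()

print(hq_counts)
print(multi_class)
```

Swapping `sample` for the full `corrected_table` should work the same way, assuming the column names still match what Wikipedia serves when the page is scraped.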