Please download the 2TB dataset from here: https://www.kaggle.com/c/passenger-screening-algorithm-challenge/data
:: START BAT FILE
sqlplus username/password@host:port/service -i @run.sql
:: END BAT FILE
--START SQL FILE run.sql
SET VERIFY OFF;
SET FEEDBACK OFF
SET TERMOUT OFF
SET PAGESIZE 0
import numpy as np

# Write a function that takes as input two lists Y, P,
# and returns the float corresponding to their cross-entropy.
def cross_entropy(Y, P):
    # Use a separate accumulator so the local name does not shadow the function.
    total = 0.0
    for i in range(len(Y)):
        # Binary cross-entropy term for label Y[i] and predicted probability P[i].
        total -= Y[i]*np.log(P[i]) + (1-Y[i])*np.log(1-P[i])
    return total
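The loop above can also be written with NumPy array operations. The version below is a minimal sketch, not part of the original gist: the name `cross_entropy_vectorized` and the `eps` clipping parameter are assumptions added here so that a predicted probability of exactly 0 or 1 does not produce `log(0)`.

```python
import numpy as np

def cross_entropy_vectorized(Y, P, eps=1e-12):
    """Binary cross-entropy of labels Y against predicted probabilities P.

    eps is an assumed safeguard: probabilities are clipped to
    [eps, 1 - eps] so np.log never receives 0.
    """
    Y = np.asarray(Y, dtype=float)
    P = np.clip(np.asarray(P, dtype=float), eps, 1.0 - eps)
    # Same per-element term as the loop version, summed in one call.
    return float(-np.sum(Y * np.log(P) + (1.0 - Y) * np.log(1.0 - P)))
```

For probabilities strictly between 0 and 1 this should agree with the loop version up to floating-point rounding; clipping only changes the result at the extremes.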
import numpy as np

def softmax(L):
    expL = np.exp(L)
    sumExpL = sum(expL)
    result = []
    for i in expL:
        result.append(i*1.0/sumExpL)
    return result
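One caveat with the function above is that `np.exp` overflows for large scores. A common workaround is to subtract the maximum score before exponentiating, which leaves the result unchanged mathematically. The sketch below illustrates that idea; the name `softmax_stable` is an assumption, and it returns a NumPy array rather than a list.

```python
import numpy as np

def softmax_stable(L):
    """Softmax of the scores in L, shifted by max(L) to avoid overflow."""
    z = np.asarray(L, dtype=float)
    # Subtracting a constant from every score does not change the softmax,
    # but it keeps the largest exponent at exp(0) = 1.
    shifted = z - np.max(z)
    exp_z = np.exp(shifted)
    return exp_z / np.sum(exp_z)
```

With scores such as `[1000, 1000]`, the unshifted version overflows to `inf` and yields `nan`, while the shifted version still returns equal probabilities.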
import numpy as np

# Setting the random seed, feel free to change it and see different solutions.
np.random.seed(42)

def stepFunction(t):
    if t >= 0:
        return 1
    return 0

def prediction(X, W, b):
I don't think that alternative data or its use violates any SEC regulations. For example, people fly planes with infrared cameras over oil storage tanks to measure and predict output and consumption levels. It is both legal and allowed. People also legally build faster networks to transfer data between exchanges for extra profit.
I guess in some ways it is like one country showing up to battle with an army of 'I, Robots' while the other side shows up with M-16s and tanks. There is no Geneva Convention rule for it. People are just finding ways to achieve information dominance in their competitive market. I posted the issue to the class because I saw a gap and thought the purpose of computer ethics was to form frameworks for challenging problems that allow society to move forward.
Re: FTC, I've read the fine print for my Visa credit card. Selling my anonymized transaction data is 100% allowed.
I appreciate the definitions here. Having a standard body of knowledge and shared understanding is critical for an organization of professionals. Sometimes agreeing on what something is can be challenging. Reading through and accepting these definitions makes this conformity less of a hurdle.
I like how an understanding of objectives is called out in the Data Science Code of Conduct. It is easy to hire robot workers to apply algorithms to data flows and pop out results. It is harder to have engaged partners who can articulate the how and why to senior leaders and account for the influence of the tactics and strategies the client organization employs.
Computer Ethics, Chapter 7
1. #14, applied to 7.1 or 7.2
Carl Adonis has committed to work on this project and see it through to completion. The problem needs to be sorted into its programmatic and technical pieces to help clarify the best path forward. Further, this situation carries a high level of security concern, given that this software system is replacing a human decision-making process and will launch nuclear missiles automatically in response to a signal.
It is assumed Adonis has the mastery of the knowledge needed to identify these issues. He has documented them and presented them to his supervisor. The government is gearing up for the next phase of the effort, and this supervisor has asked him to channel these inputs into a form that can be taken on in future project work.
This feedback seems reasonable, but it doesn't address the technical or program management issues separately. Perhaps he could say, "Hey boss, I've got these changes documented as you asked, but we really need to sit do
As a software developer, I struggled with the idea of external algorithmists. The idea that people will continue to build algorithms, and that those algorithms will stay in use long enough without being improved for people to review them, seems odd. I guess people may continue to build them, but I will take a bet that the job I do will not be available for my kids.
If you think this is all theoretical, I urge you to check out the Planning Domain Definition Language (PDDL). It is a way to describe a deterministic problem so that a planner can solve it as a search problem. After getting hands-on experience with PDDL, I realized that it could be used to reproduce many of the decisions I make every day.
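To make the "decision as search" idea concrete, here is a minimal sketch in Python rather than in PDDL itself. It assumes a STRIPS-style model, where each action has preconditions, facts it adds, and facts it deletes, and it finds a plan with breadth-first search. The `Action` type, the `plan` function, and the toy morning-routine facts are all hypothetical names invented for this illustration; real planners use far more sophisticated search.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional

class Action(NamedTuple):
    """A hypothetical STRIPS-style action over a set of string facts."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    delete_effects: FrozenSet[str]

def plan(initial: Iterable[str], goal: Iterable[str],
         actions: List[Action]) -> Optional[List[str]]:
    """Breadth-first search from the initial facts to any state containing the goal."""
    start = frozenset(initial)
    goal_facts = frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal_facts <= state:
            return steps
        for action in actions:
            # An action applies only when all its preconditions hold.
            if action.preconditions <= state:
                nxt = (state - action.delete_effects) | action.add_effects
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, steps + [action.name]))
    return None  # No sequence of actions reaches the goal.

# Toy domain (illustrative only): get to work with coffee in hand.
ACTIONS = [
    Action("brew_coffee", frozenset({"at_home"}),
           frozenset({"has_coffee"}), frozenset()),
    Action("drive_to_work", frozenset({"at_home", "has_coffee"}),
           frozenset({"at_work"}), frozenset({"at_home"})),
]

morning_plan = plan({"at_home"}, {"at_work", "has_coffee"}, ACTIONS)
```

In this toy domain the search should return the two action names in order, brewing first and driving second, because driving requires the coffee fact to hold. An unreachable goal returns `None`.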
Improvements in machine performance, like those Google's DeepMind published in Nature, have demonstrated that by using the same patterns, we can train machines to expert-level performance across different problem domains. Specifically, reinforcement learning can tune performance iteratively based on new experience, better and faster than we can.
Here i
Drawing on the internet readings of this lesson, what lessons can be extracted that might apply to data science as it applies to private business?
This week, we were introduced to the concept of behavior search in the context of data mining. Behavior mining caused me to pause and visualize what the author was saying. My conclusion was that, to infer behavior, one would need full-scope access to the same information the individual experiences.
I did some research to see how this subject-behavior data mining (contrasted with limited-scope search) applies to private business. One huge market is in network communications. Many companies perform SSL traffic inspection. This means that any traffic 'secured' with HTTPS/TLS encryption is decrypted and inspected in the same manner as non-encrypted traffic. Simply put, this means that the green lock in your browser is just a Maginot Line.
When viewing this depth of information, as the reading suggests, other details may emerge that raise subject-behavior