Samuel Rodriguez samlexrod

samlexrod / SF_OpenMapProject.md
Last active December 9, 2017 22:16
Wrangling data for an area of the world from https://www.openstreetmap.org using data munging techniques, such as assessing the quality of the data for validity, accuracy, completeness, consistency, and uniformity.
samlexrod / Rideshare_scenario.md
Last active December 10, 2017 08:39
Inspired to create a ride-share data model and queries

Rideshare Scenario - PostgreSQL

After working through a case study, I was inspired to keep building the data model and to write more queries against the made-up dataset.

Below are some questions adapted from the case study, along with custom questions I wrote to guide the queries.

Query Examples

The Red Wine Journey by Samuel Rodriguez - R

========================================================

Selecting the best red wine is tricky. Many variables can push the quality of a red wine toward the best or the worst.

This analysis is designed to explore the factors that contribute to a bad or good red wine. The dataset includes only physicochemical variables to predict the sensory output. The sensory output is the quality of a red wine measured subjectively by at least 3 wine experts with a rating from 0 (very bad) to 10 (very excellent).

The purpose is to create a model that can differentiate a good red wine from a bad red wine using attribute information within the data provided. Let's start exploring!

samlexrod / Useful Codes and Source.md
Last active March 12, 2018 22:05
Here is a list of useful code snippets found online. I will add more as I run into problems and do more research.

Anaconda Distribution

Adding Extensions to Jupyter Notebooks

  • To install notebook extensions, run the following commands in your shell, then relaunch Jupyter Notebook:
conda install -c conda-forge jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
samlexrod / PostgreSQL Challenges Completed.md
Last active October 28, 2019 20:59
Here I show my commitment to challenging myself to build better queries, using problems from places such as Codewars, HackerRank, and others.

PostgreSQL Challenges

  1. You need to build a pivot table WITHOUT using the CROSSTAB function. Given two tables, products and details, select a pivot table of products with counts of each detail value (possible detail values are ['good', 'ok', 'bad']).
SELECT
  DISTINCT pro.name,
  (SELECT count(*) FROM details det2 WHERE det2.detail = 'good' AND det2.product_id = det.product_id) AS good,
  (SELECT count(*) FROM details det2 WHERE det2.detail = 'ok' AND det2.product_id = det.product_id) AS ok,
  (SELECT count(*) FROM details det2 WHERE det2.detail = 'bad' AND det2.product_id = det.product_id) AS bad
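The query above is cut off in this preview. As a separate, minimal sketch of the same kind of pivot, conditional aggregation (SUM over CASE) can replace the three correlated subqueries. The example below runs against an in-memory SQLite database from Python only so that it is self-contained. The column names `products.id`, `details.product_id`, and `details.detail`, and all sample rows, are assumptions made for illustration, not the challenge's actual schema or data.

```python
import sqlite3

# Hypothetical schema and rows, made up for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE details (id INTEGER PRIMARY KEY, product_id INTEGER, detail TEXT);
    INSERT INTO products (id, name) VALUES (1, 'chair'), (2, 'desk');
    INSERT INTO details (product_id, detail) VALUES
        (1, 'good'), (1, 'good'), (1, 'bad'), (2, 'ok');
""")

# One pass over the join; each CASE counts only the rows matching its label.
pivot_sql = """
    SELECT pro.name,
           SUM(CASE WHEN det.detail = 'good' THEN 1 ELSE 0 END) AS good,
           SUM(CASE WHEN det.detail = 'ok'   THEN 1 ELSE 0 END) AS ok,
           SUM(CASE WHEN det.detail = 'bad'  THEN 1 ELSE 0 END) AS bad
    FROM products pro
    LEFT JOIN details det ON det.product_id = pro.id
    GROUP BY pro.name
    ORDER BY pro.name
"""
rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
# ('chair', 2, 0, 1)
# ('desk', 0, 1, 0)
```

The same SUM(CASE ...) expressions are standard SQL and should carry over to PostgreSQL; the LEFT JOIN keeps products that have no details rows, with all three counts at 0.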
samlexrod / Setting up Working Environment.md
Last active March 12, 2018 22:28
Here I show how I set up my working environment, both to share how I do it and to remember how I do it, just in case. :-)

Setting my Work Environment

  1. Install Anaconda Distribution
  2. Install Spyder
conda install spyder

1. Enable interactive 3D plots in Spyder by going to Tools > Preferences > IPython Console > Graphics.
From this page, set Backend to Automatic.
2. When you run scripts, if they load up other files (such as datasets or other scripts), 
samlexrod / My R Statistics Tutorial.md
Last active March 14, 2018 23:10
Here is my R Statistics Tutorial for those interested in learning basic statistics with R. I will develop the structure of the lectures as I come up with new ideas and work on the drafts. I will explain statistical concepts in depth, as well as the R code used.

Understanding the Cumulative Distribution Function (CDF)

Understanding the Normal Distribution

$$\Pr(a<x<b) = \int_a^b \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-m}{s}\right)^2}\,dx$$
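As a rough numerical illustration of this probability, the integral can be evaluated as a difference of two CDF values. The sketch below is written in Python rather than R and uses only the standard library. It reads m as the mean and s as the standard deviation, which is an assumption about the notation; the function names and the example values (m = 100, s = 15) are made up for illustration.

```python
import math

def normal_cdf(x, m=0.0, s=1.0):
    """Pr(X <= x) for a normal distribution with mean m and standard deviation s."""
    return 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))

def normal_interval_probability(a, b, m=0.0, s=1.0):
    """Pr(a < X < b): the density integrated from a to b, as CDF(b) - CDF(a)."""
    return normal_cdf(b, m, s) - normal_cdf(a, m, s)

# Hypothetical example: mean 100, standard deviation 15, interval (85, 130).
p = normal_interval_probability(85.0, 130.0, m=100.0, s=15.0)
print(round(p, 4))  # 0.8186
```

Working through the CDF avoids numerical integration: the interval runs from one standard deviation below the mean to two above it, so the result is the standard normal CDF at 2 minus its value at -1.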

Understanding the Empirical Rule
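
As a compact statement of the rule for a normal distribution, reading m as the mean and s as the standard deviation (an assumption about the notation used in the formula above), the approximate probabilities are:

$$\begin{aligned}
\Pr(m - s < x < m + s) &\approx 0.6827 \\
\Pr(m - 2s < x < m + 2s) &\approx 0.9545 \\
\Pr(m - 3s < x < m + 3s) &\approx 0.9973
\end{aligned}$$

These are the values usually rounded to 68%, 95%, and 99.7%.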

samlexrod / My Python Data Methodology.md
Last active March 23, 2018 00:06
Here I will show the techniques I use to manipulate data using Python. These techniques might change if I find a more efficient way of performing a task.

Loading and Dumping Data

Magic Command

# interactive plots rendered inside the notebook
%matplotlib notebook
# static plots rendered directly below the cell
%matplotlib inline
# plots opened in a separate GTK window
%matplotlib gtk

Import Useful Packages

list_one = ['eggs', 'bacon', 'ham', 'spam']
list_two = [1, 'egg', 2, 'bacon', 'bacon']
list_tree = [list_one, list_two]
list_four = [['eggs', 'bacon'], ['ham', 'spam'], list_tree]
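
As a small sketch of how these nested lists behave, the example below indexes into them and shows that `list_tree` and `list_four` hold references to the inner lists, not copies. The variable names `first_item`, `deep_item`, `shared_item`, and `bacon_count`, and the appended value 'toast', are introduced here only for illustration.

```python
list_one = ['eggs', 'bacon', 'ham', 'spam']
list_two = [1, 'egg', 2, 'bacon', 'bacon']
list_tree = [list_one, list_two]
list_four = [['eggs', 'bacon'], ['ham', 'spam'], list_tree]

# Chain indexes to walk into nested lists.
first_item = list_tree[0][0]      # list_one -> 'eggs'
deep_item = list_four[2][1][3]    # list_tree -> list_two -> 'bacon'

# Nested lists are stored by reference, so a change to list_one
# is visible through every list that contains it.
list_one.append('toast')
shared_item = list_four[2][0][-1]  # 'toast'

# count() tallies occurrences of a value within a single list.
bacon_count = list_two.count('bacon')  # 2

print(first_item, deep_item, shared_item, bacon_count)
```

If independent copies are needed instead of shared references, `copy.deepcopy` from the standard library is one option.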