Duplicates indicate multiple editions.
A Classical Introduction to Modern Number Theory, Kenneth Ireland and Michael Rosen
A Classical Introduction to Modern Number Theory, Kenneth Ireland and Michael Rosen
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python 3 and compatibility with Python 2
from __future__ import unicode_literals, print_function
import os
import sys
import re
import logging
data {
  int N;               // number of observations (length of Y)
  int M;
  real<lower=0> Y[N];  // non-negative outcomes
}
parameters {
  real<lower=0> mu;
  real<lower=0> phi;
  real<lower=1, upper=2> theta;
}
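The model and generated quantities blocks are not shown, but as a minimal sketch of how the data block above gets fed from Python, assuming the full program is saved in a hypothetical model.stan and using cmdstanpy as one of several possible Stan interfaces (the placeholder values for Y and M are my own; M's meaning depends on the omitted blocks):

import numpy as np
from cmdstanpy import CmdStanModel

Y = np.abs(np.random.randn(50))        # placeholder positive observations
stan_data = {"N": len(Y), "M": 3, "Y": Y}  # keys must match the data block

model = CmdStanModel(stan_file="model.stan")  # hypothetical file name
fit = model.sample(data=stan_data)
print(fit.summary())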
tl;dr I want to use Rust to program robots. Help me find the best core libraries to build on.
Robotic systems require high performance and reliability, but they are also enormously complex along many dimensions: the algorithms employed, the number of subsystems, embedded hardware control, and more. Development is mostly split between C++ for performance- and safety-critical components, and MATLAB or Python for quick research and task iteration.
""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """ | |
import numpy as np | |
import cPickle as pickle | |
import gym | |
# hyperparameters | |
H = 200 # number of hidden layer neurons | |
batch_size = 10 # every how many episodes to do a param update? | |
learning_rate = 1e-4 | |
gamma = 0.99 # discount factor for reward |
How do we solve the policy optimization problem, that is, how do we maximize the total expected reward under some parametrized policy?
To begin with, the total reward for an episode is the sum of all its rewards. If our environment is stochastic, we can never be sure we will get the same rewards the next time we perform the same actions, so the further we look into the future, the more the total future reward may diverge. For that reason it is common to use the discounted future reward

R_t = r_t + γ r_{t+1} + γ² r_{t+2} + ... = Σ_{k≥0} γ^k r_{t+k},

where the parameter γ is called the discount factor and lies between 0 and 1.
A good strategy for an agent is to always choose the action that maximizes the (discounted) future reward. In other words, we want to maximize the expected reward per episode.
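As a concrete sketch of this computation (the helper name and NumPy usage are my own illustration, not necessarily the exact code in the training script above), the discounted return for every step can be accumulated backwards over an episode's rewards using the recurrence R_t = r_t + γ R_{t+1}:

import numpy as np

def discount_rewards(r, gamma=0.99):
    """Compute discounted returns R_t = sum_k gamma^k * r_{t+k} for each step t."""
    discounted = np.zeros_like(r, dtype=np.float64)
    running_add = 0.0
    for t in reversed(range(len(r))):
        running_add = running_add * gamma + r[t]  # R_t = r_t + gamma * R_{t+1}
        discounted[t] = running_add
    return discounted

# e.g. three steps with a reward only at the end:
print(discount_rewards(np.array([0.0, 0.0, 1.0])))  # -> [0.9801, 0.99, 1.0]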
I upgraded my iPhone 5s to iOS 10 and could no longer retrieve photos from it. This was unacceptable to me, so I worked out how to get my photos back. This document is my story (on Ubuntu 16.04).
The solution is to compile libimobiledevice and ifuse from source.
Who is this guide intended for?
# "impute" missing binary predictor | |
# really just marginalizes over missingness | |
# imputed values produced in generated quantities | |
N <- 1000 # number of cases | |
N_miss <- 100 # number missing values | |
x_baserate <- 0.25 # prob x==1 in total sample | |
a <- 0 # intercept in y ~ N( a+b*x , 1 ) | |
b <- 1 # slope in y ~ N( a+b*x , 1 ) |
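To make "marginalizes over missingness" concrete, here is a minimal sketch in Python/NumPy (the function name and structure are my own illustration; the generative model matches the simulation parameters above). For a case whose binary predictor x is missing, the likelihood of y is a two-component mixture over x ∈ {0, 1}, weighted by the base rate:

import numpy as np
from scipy.stats import norm

def loglik_y(y, x, a=0.0, b=1.0, x_baserate=0.25):
    """Log-likelihood of one outcome y given binary x, or x=None when missing."""
    if x is not None:
        return norm.logpdf(y, loc=a + b * x, scale=1.0)
    # x missing: sum over both possible values of x,
    # p(y) = P(x=1) * N(y | a+b, 1) + P(x=0) * N(y | a, 1)
    return np.logaddexp(
        np.log(x_baserate) + norm.logpdf(y, loc=a + b, scale=1.0),
        np.log1p(-x_baserate) + norm.logpdf(y, loc=a, scale=1.0),
    )

print(loglik_y(0.5, 1))     # x observed
print(loglik_y(0.5, None))  # x missing: marginalized out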