@yoavg
yoavg / stochastic-critique.md
Last active January 5, 2025 10:43
A criticism of Stochastic Parrots

A criticism of "On the Dangers of Stochastic Parrots: Can Languae Models be Too Big"

Yoav Goldberg, Jan 23, 2021.

The FAccT paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" by Bender, Gebru, McMillan-Major and Shmitchell has been the center of a controversy recently. The final version is now out, and, owing a lot to this controversy, it will undoubtedly become very widely read. I read an earlier draft of the paper, and I think that the new and updated final version is much improved in many ways: kudos to the authors for this upgrade. I also agree with and endorse most of the content. This is important stuff, you should read it.

However, I do find some aspects of the paper (and the resulting discourse around it and around the technology) to be problematic. These weren't clear to me when I initially read the first draft several months ago, but they have become very clear to me now. These points are for the most part

@muralisc
muralisc / install-tmux.sh
Last active April 2, 2025 18:36 — forked from pokev25/install-tmux.sh
Install tmux 3.0a on Amazon Linux 2 / RHEL / CentOS
# Install tmux 3.0a on Centos
# install deps
sudo yum install -y gcc kernel-devel make ncurses-devel
# DOWNLOAD SOURCES FOR LIBEVENT AND MAKE AND INSTALL
curl -LOk https://github.com/libevent/libevent/releases/download/release-2.1.11-stable/libevent-2.1.11-stable.tar.gz
tar -xf libevent-2.1.11-stable.tar.gz
cd libevent-2.1.11-stable
./configure --prefix=/usr/local
@notoraptor
notoraptor / tf_prof_example.md
Created November 29, 2017 19:30
Tensorflow Profiling example

I think I finally know how to do profiling while getting the average execution times: you have to use tf.profiler.Profiler.

As an example, I tried profiling the MNIST softmax example provided by tensorflow: https://github.com/tensorflow/tensorflow/blob/r1.4/tensorflow/examples/tutorials/mnist/mnist_softmax.py

The code is adapted below, with the lines needed for profiling added:

# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
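The preview cuts off at the license header. As a rough, self-contained sketch of the tf.profiler.Profiler pattern described in the note above (not the gist's actual adaptation), the idea is to collect RunMetadata for each training step and hand it to a Profiler, which can then report timings averaged over the recorded steps; the toy graph, step count, and option choices below are assumptions:

import numpy as np
import tensorflow as tf

# tiny stand-in graph (the gist profiles the MNIST softmax model instead)
x = tf.placeholder(tf.float32, [None, 784])
w = tf.Variable(tf.zeros([784, 10]))
loss = tf.reduce_mean(tf.square(tf.matmul(x, w)))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    profiler = tf.profiler.Profiler(sess.graph)
    for step in range(10):
        run_meta = tf.RunMetadata()
        sess.run(train_step,
                 feed_dict={x: np.random.rand(32, 784).astype(np.float32)},
                 options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE),
                 run_metadata=run_meta)
        # hand each step's metadata to the profiler so timings can be averaged
        profiler.add_step(step, run_meta)
    # report per-op execution time and memory over the recorded steps
    profiler.profile_operations(options=tf.profiler.ProfileOptionBuilder.time_and_memory())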
@Tushar-N
Tushar-N / pad_packed_demo.py
Last active October 27, 2024 15:17
How to use pad_packed_sequence in pytorch<1.1.0
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence
seqs = ['gigantic_string','tiny_str','medium_str']
# make <pad> idx 0
vocab = ['<pad>'] + sorted(set(''.join(seqs)))
# make model
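The preview stops once the vocabulary is built. Continuing from the imports and vocab above, a minimal sketch of how such a demo typically proceeds looks like the following; the embedding size, hidden size, and variable names are illustrative assumptions, not necessarily the gist's:

# vectorize each sequence and pad with the <pad> index (0)
vectorized = [[vocab.index(ch) for ch in seq] for seq in seqs]
lengths = torch.LongTensor([len(v) for v in vectorized])
padded = torch.zeros(len(vectorized), int(lengths.max()), dtype=torch.long)
for i, v in enumerate(vectorized):
    padded[i, :len(v)] = torch.LongTensor(v)

# pytorch < 1.1.0 requires sequences sorted by decreasing length before packing
lengths, sort_idx = lengths.sort(descending=True)
padded = padded[sort_idx]

embed = nn.Embedding(len(vocab), 4)     # assumed embedding size
lstm = nn.LSTM(4, 5, batch_first=True)  # assumed hidden size

# pack -> run the LSTM -> unpack back to a padded (batch, max_len, hidden) tensor
packed = pack_padded_sequence(embed(padded), lengths, batch_first=True)
packed_out, _ = lstm(packed)
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)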
@mjdietzx
mjdietzx / waya-dl-setup.sh
Last active May 15, 2025 14:38
Install CUDA Toolkit v8.0 and cuDNN v6.0 on Ubuntu 16.04
#!/bin/bash
# install CUDA Toolkit v8.0
# instructions from https://developer.nvidia.com/cuda-downloads (linux -> x86_64 -> Ubuntu -> 16.04 -> deb (network))
CUDA_REPO_PKG="cuda-repo-ubuntu1604_8.0.61-1_amd64.deb"
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/${CUDA_REPO_PKG}
sudo dpkg -i ${CUDA_REPO_PKG}
sudo apt-get update
sudo apt-get -y install cuda
@spitis
spitis / bnlstm.py
Created February 2, 2017 03:05
Batch normalized LSTM Cell for Tensorflow
"""adapted from https://github.com/OlavHN/bnlstm to store separate population statistics per state"""
import tensorflow as tf, numpy as np
RNNCell = tf.nn.rnn_cell.RNNCell
class BNLSTMCell(RNNCell):
'''Batch normalized LSTM as described in arxiv.org/abs/1603.09025'''
def __init__(self, num_units, is_training_tensor, max_bn_steps, initial_scale=0.1, activation=tf.tanh, decay=0.95):
"""
* max bn steps is the maximum number of steps for which to store separate population stats
"""
@gyglim
gyglim / tensorboard_logging.py
Last active August 23, 2023 21:29
Logging to tensorboard without tensorflow operations. Uses manually generated summaries instead of summary ops
"""Simple example on how to log scalars and images to tensorboard without tensor ops.
License: BSD License 2.0
"""
__author__ = "Michael Gygli"
import tensorflow as tf
from StringIO import StringIO
import matplotlib.pyplot as plt
import numpy as np
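The preview ends at the imports. The pattern the description refers to builds Summary protobufs by hand and writes them with a FileWriter, so no summary ops (or even a session) are needed; the class and method names below are illustrative, not necessarily the gist's:

import tensorflow as tf  # TF 1.x API assumed, matching the imports above

class Logger(object):
    """Logs scalars to tensorboard without any summary ops."""
    def __init__(self, log_dir):
        # FileWriter writes event files that tensorboard can display
        self.writer = tf.summary.FileWriter(log_dir)

    def log_scalar(self, tag, value, step):
        # construct the Summary proto directly instead of running a summary op
        summary = tf.Summary(value=[tf.Summary.Value(tag=tag, simple_value=value)])
        self.writer.add_summary(summary, step)
        self.writer.flush()

# usage (hypothetical path/values): Logger('/tmp/logs').log_scalar('loss', 0.42, step=1)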
@pannous
pannous / hello_sequence.py
Last active March 22, 2018 17:59
Simple "Hello World" for tensorflow seq2seq model
"""Sequence-to-sequence model with an attention mechanism."""
# see https://www.tensorflow.org/versions/r0.10/tutorials/seq2seq/index.html
# compare https://github.com/tflearn/tflearn/blob/master/examples/nlp/seq2seq_example.py
from __future__ import print_function
import numpy as np
import tensorflow as tf
vocab_size=256 # We are lazy, so we avoid fancy mapping and just use one *class* per character/byte
target_vocab_size=vocab_size
learning_rate=0.1
@tsl0922
tsl0922 / .tmux.conf
Last active June 8, 2025 12:14
vim style tmux config
# vim style tmux config
# use C-a, since it's on the home row and easier to hit than C-b
set-option -g prefix C-a
unbind-key C-a
bind-key C-a send-prefix
set -g base-index 1
# Easy config reload
bind-key R source-file ~/.tmux.conf \; display-message "tmux.conf reloaded."
@leommoore
leommoore / Express_Logging.md
Last active November 14, 2024 05:46
Express - Logging

# Express - Logging

The express.js node.js web application framework comes with a built-in logging module called logger, which is the connect.js logger. It is really handy to enable, and you can use it just like any other Express module:

app.use(express.logger());

Without any configuration, the logger middleware will generate a detailed log using what is called the default format. The logger actually supports four predefined log formats: default, short, tiny, and dev. Each of these predefined formats shows a different amount of detail. You can specify one of them this way:

app.use(express.logger('dev'));

If you prefer, you can customize the precise details to be logged using the following options to format the output of the logger: