@ref: Become a Git Guru and Simple Git
$ git init <directory>
will initialize an empty working directory, where one does the work. $ git init --bare <directory>
will initialize an empty directory that is only for sharing purposes. It does not contain any actual work. For local usage, one can create such a directory to serve as a remote repository.
$ git add <file>
or $ git add <directory>
or $ git add *
will add changes to the index (staging area), which will later be committed to HEAD. $ git commit -m "message"
will commit changes to HEAD with a message.
- For local usage, one can point the remote repository to the sharing directory by
$ git remote add origin <path/to/bare directory>
. Here origin
is the name one chooses for the remote repository. Now the sharing directory becomes our remote repository. - Then, one can push changes stored in HEAD to the remote repository
$ git push origin master
- One can inspect commit ID using
$ git log --oneline
. - Then one can check each commit using
$ git checkout <commit ID>
- To undo the changes introduced by a specific commit, use
$ git revert <commit ID>
, which creates a new commit that reverses them.
- After cloning a repo, try
$ git branch
; one only sees the master branch. This does not mean the other branches are missing; they are just hidden. Use $ git branch -a
to show all branches. - If one wants to work on some specific branch, use
$ git checkout branch-name
to track the remote branch locally. - See this post for reference.
- Check out this website
- A branch allows one to make changes to a repo without affecting the original code. Only after the changes have been made and accepted are they merged back. By default, the original branch is named
master
. - One can create a branch, say
test
, using $ git branch test
. - One can list the current branches by
$ git branch
. - To move from one branch to another, one can do
$ git checkout test
, which moves from master
to test
. - Check local branches by
$ git branch
, remote branches by $ git branch -r
, and all branches by $ git branch -a
. - (TODO) how to push and merge and delete
- Push local branch to remote by
$ git push -u origin <branch>
, where origin
is the name of the remote (it is usually named origin
) and <branch>
is replaced by the name of the branch.
- Submodule adds external repositories to an existing repository, that is, adds a git repo to another git repo. This is a nice way of managing external packages.
- To add, use
$ git submodule add repo/location path/location
- To remove and update, see this post for details.
MacTeX is a TeX distribution for the MacOS system, and is maintained by the MacTeX Technical Working Group. The full MacTeX package contains the full distribution of TeX Live, which is over 2GB of materials. BasicTeX is an alternative distribution that includes a subset of TeX Live of size 100MB. Due to lack of disk space, I always prefer the much smaller BasicTeX.
Installation is easy if you follow the instructions on the web. The main problem is the packages that are included in MacTeX but missing from BasicTeX, although BasicTeX contains most of the standard tools needed to write TeX documents. You can use the package management tool (tlmgr) to install additional packages. In this post, I will introduce some tips that are enough for daily work.
- First of all, you need to install BasicTeX by searching for it and downloading it online. A new version is released every year, so you need to upgrade it manually every year.
- The biggest problem with using BasicTeX is installing missing packages. To do so, you need to use tlmgr.
- Update tlmgr itself in bash via
$ sudo tlmgr update --self
- Install LaTeX packages via
$ sudo tlmgr install packname
- If packname is not found, you can search for the package that provides a given file by
$ tlmgr search --global --file packname
- For example, to search for optparams.sty, run
$ tlmgr search --global --file optparams.sty
, which gives
tlmgr: package repository http://mirrors.opencas.cn/ctan/systems/texlive/tlnet
sauerj: texmf-dist/tex/latex/sauerj/optparams.sty
- So the package name is sauerj.
- Build the .tex and .bib files in bash
$ pdflatex main.tex
, $ bibtex main
, $ pdflatex main.tex
, and finally $ open main.pdf
(note, there are four commands). - See this post and this blog for references.
- A new BasicTeX is released every year, and no upgrade script is provided for the installed version. That means users need to update it manually (a fresh install of the new version). Check /usr/local/texlive/ to find out the currently installed version of BasicTeX.
- The caveat of a fresh install is that all packages installed through tlmgr will be lost. To avoid this, you can manually upgrade TeX Live by following the instructions. However, the procedure seems very complex and is not recommended by the TeX Live team.
- The other option is to stay with the current version and point tlmgr at the historic repository for that year (2017 in this example), using
$ sudo tlmgr option repository ftp://tug.org/historic/systems/texlive/2017/tlnet-final
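The build sequence described above (pdflatex, bibtex, pdflatex) can be wrapped in a small script. This is only a sketch: the function names build_commands and build are made up for illustration, and it assumes pdflatex and bibtex are on the PATH. Many documents need one more pdflatex pass at the end to resolve all citations, which is not included here.

```python
import subprocess


def build_commands(stem):
    """Return the build steps from the notes for a document named <stem>.tex."""
    return [
        ["pdflatex", f"{stem}.tex"],
        ["bibtex", stem],
        ["pdflatex", f"{stem}.tex"],
    ]


def build(stem, dry_run=True):
    """Run each step in order; with dry_run=True only print the commands."""
    for cmd in build_commands(stem):
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True stops the build at the first failing step
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    build("main")  # dry run: prints the commands without executing them
```

Calling build("main", dry_run=False) would actually run the tools in the directory that contains main.tex and its .bib file.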
- On the command line, change the working directory to the MALLET root directory.
- Make sure the .mallet file and the data files are inside the MALLET directory.
- To import data and generate a .mallet file, type
$ bin/mallet import-dir --input c:/mallet/mydata --output name.mallet --keep-sequence --remove-stopwords
- To train a topic model, type
$ bin/mallet train-topics --input name.mallet --num-topics 7 --optimize-interval 20 --output-topic-keys name_keys.txt --output-doc-topics name_composition.txt
- How to delete files in a folder?
- Use
d
to mark a file for deletion. - Use
x
to execute the delete action
- What if installing some packages through MELPA shows "NOT FOUND"?
- Try to refresh package list using
M-x package-refresh-contents
then install.
EWW is a web browser written in Emacs Lisp. It comes with Emacs by default. One can open an eww buffer with M-x eww
and then type the URL or search keywords.
- Press
q
to exit the browser. - Press
g
to reload the current webpage. - Press
w
to copy the current URL to the kill ring. - Press
R
to detect the "readable" part of the page and display only that part, which gets rid of menus and similar clutter. - Press
d
to download. The default download directory is ~/Downloads/. - Press
l
or r
to go backward or forward in the browsing history. - Press
H
to see browsing history. - Press
b
to bookmark websites. Bookmarks can be viewed with B
.
- Start SQLite via
M-x sql-sqlite
and it will start a SQLi buffer; sql-mode comes with Emacs. - Instructions on how to use SQLite/MySQL in Emacs can be found here
- Following the instructions, you can set up either login parameters (usually for a single database/server) or multiple database/server options (usually for multiple databases/servers).
- To do so, you need to modify .emacs file
- To start MySQL in Emacs, type
M-x my-sql-server1
to choose a server or M-x sql-mysql
to use default parameters
- virtualenvwrapper.el is an Emacs counterpart of the virtualenvwrapper package for Python. To use it, you need to install Virtualenv, but virtualenvwrapper itself is not necessary.
- If you do not want to keep all virtual environments in the same place, i.e., ~/.virtualenvs, you don't need to install virtualenvwrapper. You just need virtualenvwrapper.el for Emacs. In this case, please specify the
venv-location
in your .emacs file. - Install virtualenvwrapper.el via
M-x package-install virtualenvwrapper
. - Edit .emacs to load and enable (check its GitHub page).
- After specifying the
venv-location
, use M-x venv-mkvirtualenv RET venv1
to create a virtual environment named venv1. - To activate the virtual environment, use
M-x venv-workon
and specify the name of the virtual environment, in this case, venv1. - To deactivate the virtual environment, use
M-x venv-deactivate
. - To show the list of virtual environments, use
M-x venv-lsvirtualenv
. - To move to the virtual environment directory, use
M-x venv-cdvirtualenv
. - To remove the virtual environment, use
M-x venv-rmvirtualenv RET venv1
to delete venv1. - Other commands can be found here. These commands are very similar to the ones of virtualenvwrapper.
- There is an issue calling IPython kernel within virtual environment inside Emacs. It is because matplotlib has a bug when interacting with jupyter kernel inside some text editors, such as Emacs. A possible fix is to add line
backend: TkAgg
in ~/.matplotlib/matplotlibrc.
- To add image to Jekyll posts, especially for Github Pages, we need to use baseurl and append it to the address of the image, like

. See this post for reference.
- What do
public
and static
mean in Java? public
means that all classes and objects can access the method. static
means that the method is associated with the class, not an instance of the class. That is, you can call the method without having an object of that class. - See stack overflow
- What are the access modifiers?
private
indicates that only the same class can access the contents. default
means no modifier is declared; it indicates that code in the same package can access the contents. protected
provides the same access as default
but also allows subclasses to access the contents, even if they are located in different packages. public
means all code can access the contents.
- What is the difference between
char
and Character
? char
is a primitive type representing a 16-bit Unicode character, while Character
is a wrapper class that allows the primitive type to be used where an object is required. - When using
HashMap
, if you want to use char
as the key, you need to declare the map as HashMap<Character, Object>
- See this post for more information
- Single quote vs double quote
- Single quote is for character literal
char
- Double quote is for string literal
String
- This applies to C++ as well
(TODO)
- (TODO) about pointer,
*
,&
- Templates include function templates, class templates, and template specializations. They are well explained in the book "The C++ Language Tutorial".
- CMake is the de facto standard build tool for C++.
- C++ does not have a universal package management system. Different OS and different building systems require different ways to manage packages.
- On MacOS and Linux, packages can be managed via system-wide package manager, such as Homebrew.
- Google suggests Conan for C++ package management. However, I found Conan a bit complex and more suitable for very complex projects. Conan is also suitable for Windows.
- One good thing about Conan is that it provides a fairly good tutorial on Conan integration with CLion.
- A good way of managing external packages in C++ development is using CMake
find_package()
(pre-compiled binary) and $ git submodule
(), see this post for details. - If the package is header-only or has a "proper CMake setup," then
$ git submodule
is the way to go.
- To allow different projects to have their own library environment, one can use virtualenv. PyCharm comes with virtualenv.
- A Python project may use setup.py or requirements.txt. If you are distributing your package via PyPI, use setup.py to specify the required dependencies. If you want to document which packages should be installed for a development environment, use requirements.txt (see this post for reference).
- To maintain zeros on the left of a number, you can first convert the number to string, then use
zfill(n)
to fill the number with zeros to n digits. - Convert unicode date to string
- Assume
df['date']
contains unicode date object. - To convert, one can do:
from datetime import datetime
df['date'].update(df['date'].apply(lambda x: datetime.strptime(x, '%Y-%m-%d').strftime('%Y%m%d')))
- Or use
pd.to_datetime()
, e.g., df['date'] = pd.to_datetime(df['date'])
.
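The per-element conversion inside the lambda above can be tried on its own with only the standard library. The sketch below assumes the dates arrive as 'YYYY-MM-DD' strings; the helper name compact_date is made up for illustration.

```python
from datetime import datetime


def compact_date(text):
    """Convert a 'YYYY-MM-DD' string to 'YYYYMMDD'."""
    # strptime parses the string into a datetime; strftime formats it back out
    return datetime.strptime(text, "%Y-%m-%d").strftime("%Y%m%d")


dates = ["2017-03-01", "2017-12-25"]
print([compact_date(d) for d in dates])
```

A string that does not match the format makes strptime raise ValueError, which is a quick way to spot malformed entries before applying the conversion to a whole column.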
- Convert float to datetime
- Assume
df['date']
contains float number indicating datetime. - To convert, one can do
df['date'].update(pd.to_datetime(df['date'].astype(str)))
.
- To add an element to the beginning of a list, one can do
lst = [1,2,3]
x = 5
result = [x] + lst  # result: [5, 1, 2, 3]
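The zero-padding and list-prepending tips above can be sketched together as follows. The helper name zero_pad is made up for illustration; insert(0, x) and deque.appendleft are shown as in-place alternatives to building a new list with [x] + lst.

```python
from collections import deque


def zero_pad(number, width):
    """Left-pad an integer with zeros up to `width` characters."""
    return str(number).zfill(width)


print(zero_pad(7, 3))
print(zero_pad(1234, 3))  # already wider than 3, so it is left unchanged

# In-place alternatives to result = [x] + lst
lst = [1, 2, 3]
lst.insert(0, 5)  # modifies lst itself instead of building a new list

queue = deque([1, 2, 3])
queue.appendleft(5)  # constant-time prepend, handy when prepending often
print(lst, list(queue))
```

For a one-off prepend on a short list any of the three forms is fine; the deque only pays off when elements are added at the front repeatedly.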
- Packages can be installed via
$ pip install packname
or $ pip install -r requirements.txt
. - If one wants to uninstall a package and its dependencies, one can do
$ pip install pip-autoremove
and$ pip-autoremove packname -y
. - See this post for reference.
- To check existing Python packages, use
$ pip freeze
or $ pipdeptree
or open a Python interpreter and type help("modules")
. - To check outdated packages, use
$ pip list --outdated --format=columns
. - To upgrade packages, use
$ pip install PackageName --upgrade
. - To remove all installed packages:
$ pip freeze | xargs pip uninstall -y
to uninstall all $ pip freeze
packages. $ pip uninstall -r requirements.txt
to uninstall all packages specified in requirements.txt.
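Installed packages can also be listed from inside Python, without calling pip. The sketch below uses importlib.metadata from the standard library (Python 3.8 or later); the function name installed_packages is made up for illustration, and the output only resembles that of pip freeze, it is not guaranteed to match it line for line.

```python
from importlib import metadata


def installed_packages():
    """Return sorted 'name==version' strings for the installed distributions."""
    entries = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        version = meta.get("Version") if meta is not None else None
        if name and version:  # skip distributions with incomplete metadata
            entries.add(f"{name}=={version}")
    return sorted(entries, key=str.lower)


if __name__ == "__main__":
    for line in installed_packages():
        print(line)
```

Run inside an activated virtual environment, this reports the packages visible to that environment's interpreter.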
- When plotting timeseries using
sns.tsplot()
, you need to make a dataframe with a string date object and a unit object. The unit object is used to calculate the error bars; see this post for an explanation.
import seaborn as sns
gammas = sns.load_dataset("gammas")
ax = sns.tsplot(time="timepoint", value="BOLD signal", unit="subject", condition="ROI", data=gammas)
bokeh.plotting.Figure
vs bokeh.models.Plot
Figure
is a subclass of Plot
that simplifies plot creation with default axes, grids, tools, etc.
- On a plot legend, if the name of the legend is the same as the column name, the legend will display the column values instead of the text name. Read more about this feature here.
- To deal with time series, one usually uses the function
pd.DataFrame.shift()
to shift a dataframe. - For example, if one needs to calculate daily stock return, i.e.,
$(p_1-p_0)/p_0$, we can do the following:
df2 = df.shift(-1)  # shift dataframe upwards to align p1 with p0
daily_return = (df2 - df) / df  # (p1 - p0) / p0
-
pd.DataFrame.shift(1)
will shift the dataframe downwards by 1 row; conversely, pd.DataFrame.shift(-1)
will shift the dataframe upwards by 1 row.
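To see what the upward shift aligns, here is a plain-Python sketch of the same daily return calculation on a short list of prices. It does not use pandas: the slice prices[1:] stands in for shift(-1), the function name daily_returns is made up, and the prices are invented for illustration.

```python
def daily_returns(prices):
    """Return (p1 - p0) / p0 for each pair of consecutive prices."""
    shifted = prices[1:]  # next day's price lined up with today's, like shift(-1)
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, shifted)]


prices = [100.0, 110.0, 99.0]
print(daily_returns(prices))
```

The list version simply drops the last day, which has no following price; with pandas the same row would hold NaN after the shift.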
statsmodels.regression.linear_model.OLS
does not include an intercept by default. Therefore, one needs to add a constant manually using statsmodels.api.add_constant()
. See this post for reference.
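Why the constant matters can be seen with a tiny least-squares fit written in plain Python. This is not the statsmodels implementation, just the closed-form formulas for one regressor; both function names and the data are made up for illustration.

```python
def fit_through_origin(xs, ys):
    """Least-squares slope for y = b*x (no intercept term)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def fit_with_intercept(xs, ys):
    """Least-squares (intercept, slope) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


xs = [0.0, 1.0, 2.0, 3.0]
ys = [5.0, 7.0, 9.0, 11.0]  # generated from y = 5 + 2x

print(fit_through_origin(xs, ys))  # slope is pulled upwards to absorb the offset
print(fit_with_intercept(xs, ys))  # recovers the offset and slope separately
```

Without the intercept the fitted line is forced through the origin, so the slope is biased whenever the data have a nonzero offset. Adding a column of ones, which is what add_constant() does, lets the model estimate that offset.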
- Sphinx is a Python documentation tool that generates documentation for a Python project automatically.
- It can be installed via
$ pip install sphinx
. - Sphinx generates HTML documentation that can be hosted on Read the Docs.
- See this blog and this website for nice tutorials.
- Install via
$ pip install virtualenv
. - After installation,
$ cd
to the project directory and use $ virtualenv --system-site-packages venv
to create a virtual environment named venv with access to all system site packages; use $ virtualenv --no-site-packages venv
to create a virtual environment without system site packages (in recent virtualenv releases this is the default and the flag has been removed). - A virtual environment initialized this way is located in the project directory. In order to keep all virtual environments in the same directory, we need virtualenvwrapper.
- Under the project directory, use
$ source venv/bin/activate
to enable the virtual environment. - Use
$ deactivate
to exit the virtual environment. - To remove the virtual environment, just delete the venv folder by
$ rm -r venv
. - Remember to add venv to .gitignore file to exclude the virtual environment from git.
- See this website for an introductory tutorial.
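The same idea, a virtual environment living inside the project directory, can be tried with the venv module from the Python standard library. Note that this is a different tool from virtualenv, used here only because it needs no installation. The helper name make_env is made up, and with_pip=False means pip is not installed into the new environment.

```python
import tempfile
import venv
from pathlib import Path


def make_env(project_dir, name="venv", system_site_packages=False):
    """Create <project_dir>/<name> as a virtual environment and return its path."""
    env_dir = Path(project_dir) / name
    # with_pip=False keeps creation fast; pip is NOT installed into the environment
    venv.create(env_dir, system_site_packages=system_site_packages, with_pip=False)
    return env_dir


if __name__ == "__main__":
    # A throwaway directory stands in for the project directory
    with tempfile.TemporaryDirectory() as project:
        env = make_env(project)
        print((env / "pyvenv.cfg").read_text())
```

The pyvenv.cfg file printed at the end records, among other things, whether system site packages are visible from inside the environment.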
- An extension of Virtualenv, named virtualenvwrapper, can be installed. It extends Virtualenv functionality, such as moving all virtual environments to one place.
- A simple tutorial explains its usage.
- After installing via
$ pip install virtualenvwrapper
, initialize it by putting the following lines in your shell startup file:
export WORKON_HOME=$HOME/.virtualenvs
export PROJECT_HOME=$HOME/Devel
source /usr/local/bin/virtualenvwrapper.sh
- Then, reload the shell startup file (
source .bash_profile
) or restart shell.
- Use
$ workon
to check list of virtual environments. - Use
$ mkvirtualenv nameofenv
to create virtual environment. - To bind an existing working directory to an existing virtual environment, use
$ setvirtualenvproject [virtualenv_path project_path]
. If no parameters are given, the current working directory and the current virtual environment are assumed. - This step is a key difference from Virtualenv, which creates the virtual environment under the project directory, so the two are bound at creation.
- For Emacs users, use virtualenvwrapper.el, which includes a large subset of the functionality of virtualenvwrapper. See Emacs section for details.
- See this website for nice tutorials.
- See this post for an introductory tutorial on Pandas+SQLAlchemy.
- To use SQLAlchemy, you need to first create an engine, which requires a configuration. The default driver for the configuration is mysql-python, which requires MySQLdb. However, installing MySQLdb is a pain in the ass (a probable fix can be found here as of 2017). So, try to use non-default driver mysql-connector-python, install it via
$ pip install mysql-connector-python
and create the engine using create_engine('mysql+mysqlconnector://user:password@localhost/dbname')
.
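The connection string passed to create_engine is an ordinary URL, so special characters in the user name or password need to be escaped. A minimal sketch, assuming the mysql+mysqlconnector scheme from above; the helper name mysql_connector_url and the credentials are made up for illustration.

```python
from urllib.parse import quote_plus


def mysql_connector_url(user, password, host, dbname):
    """Build a 'mysql+mysqlconnector://' URL, escaping special characters."""
    return (
        f"mysql+mysqlconnector://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{dbname}"
    )


url = mysql_connector_url("user", "p@ss/word", "localhost", "dbname")
print(url)
# The string would then be passed to sqlalchemy.create_engine(url), which
# requires SQLAlchemy and mysql-connector-python to be installed.
```

Escaping matters because characters such as @ and / in a password would otherwise be read as URL separators.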
- How to save a model object?
- Some R models are computationally costly, therefore, it is wise to save them as a .RData file for future reuse.
- Save can be done via
save(model, file="save_location")
. - This post explains how to save, reuse, and update a model object.
- How to factorize a feature?
- When doing classification, we usually need one or more categorical variables.
- In this case, we can factorize variables to create such categories. To do so, simply use
factor(df$col.name)
to factorize a column.
- How to select rows, columns, or subsets of a dataframe?
- Selecting rows, one can do
df[1, ]
for the first row and all columns, or df[1:2, ]
for the first two rows and all columns, or df[2:nrow(df), ]
for rows 2 through the last and all columns. - To select columns, one can do
df[, 1]
for the first column and all rows, or df[, 1:2]
for the first two columns and all rows, or df[, 2:ncol(df)]
for 2 to last columns and all rows. - Selecting specific columns, one can do
df[, c('col1','col3','col5')]
for first, third, and fifth columns. - Selecting a subset, one can do
df[5:nrow(df), 'col.name']
to select rows 5 through the last of the specific column named col.name
.
- About R packages, DataCamp has a nice beginner's guide.
- When you upgrade R, all installed packages will be gone. To avoid this, either change the library location or use a virtual environment.
- R stores its packages via
.libPaths()
variable. See this post for reference. To store packages in another location, see Startup Management section below. - To uninstall R packages, one can do
remove.packages("package-name", lib="library-path")
. The default "library-path" is the first element in .libPaths()
. In order to remove a package and its unneeded dependencies, one can do the following:
    library("tools")
    removeDepends <- function(pkg, lib, recursive = FALSE){
      d <- package_dependencies(, installed.packages(), recursive = recursive)
      depends <- if(!is.null(d[[pkg]])) d[[pkg]] else character()
      needed <- unique(unlist(d[!names(d) %in% c(pkg, depends)]))
      toRemove <- depends[!depends %in% needed]
      if(length(toRemove)){
        toRemove <- select.list(c(pkg, sort(toRemove)), multiple = TRUE, title = "Select packages to remove")
        remove.packages(toRemove, lib = lib)
        return(toRemove)
      } else {
        invisible(character())
      }
    }
- This snippet is copied from this post, which explains how to remove a R package and its dependencies.
- R also supports dependency management like Python virtual environment. There are many R virtual environment tools. Packrat, created by RStudio team, seems nice. See this post for more information.
- Initializing Packrat via
packrat::init()
for an existing project will scan through all .R files in the project and check all require()
and library()
calls to find all package dependencies. - packrat::snapshot()
will create a snapshot of the current dependency structure. A snapshot can be used to restore the project in a new environment. - Installing packages under Packrat works the same as in a normal R session, via
install.packages("package-name")
. - To check the status under Packrat, use
packrat::status()
. - To remove unused packages, use
packrat::clean()
.
- In order to specify a location other than default to store R packages, you need to know about R startup process.
- Details about R startup can be found here.
- Also, to add an external library path to
.libPaths()
, one can do .libPaths( c( .libPaths(), "~/Documents/RLib/") )
within an R session. Note that this only adds the external library path for the current session. When a new session starts, the path is gone. - Another way of adding an external library path is to create a ~/.Rprofile and specify the following startup command:
.libPaths("~/Documents/RLib/")
. This will always place the external library path at the front of .libPaths()
upon startup. - To install packages in the non-default location, use
install.packages("package-name", lib="~/Documents/RLib/")
; if ~/Documents/RLib/ is the first element in .libPaths()
, then install.packages("package-name")
will do it.
- This website gives a nice introduction about how to use R Markdown.
- See this website and this website for nice tutorials.
- See this website for an introduction of ggplot.
- See this website for a Chinese version tutorial.
- SQLite is included in MacOS X; start SQLite in the terminal by typing
sqlite3
- MySQL needs to be installed on MacOS X. The installer can be downloaded from the official website
- To install MySQL via installer, check this website for some necessary details, such as aliases. To activate MySQL in terminal, after adding aliases, use
$ mysql -u root -p
if the server is on the same machine. - To use MySQL, see this post for logging in, creating users, granting privileges, and basic database management.
- MySQL comes with default databases information_schema, mysql, performance_schema, and sys. DO NOT put any data in them, they are reserved for MySQL setup.
- Follow this website to add an example database for test purpose.
In the Linux world, a terminal, a shell, and a console can all look the same from the point of view of the user at the keyboard. The differences are in how they interact with each other.
The shell is the program which actually processes commands and returns output. Most shells also manage foreground and background processes, command history and command line editing. These features (and many more) are standard in bash, the most common shell in modern Linux systems.
A terminal refers to a wrapper program which runs a shell. Decades ago, this was a physical device consisting of little more than a monitor and keyboard. As Unix/Linux systems added better multiprocessing and windowing systems, this terminal concept was abstracted into software. Now you have programs such as Gnome Terminal which launches a window in a Gnome windowing environment which will run a shell into which you can enter commands.
The console is a special sort of terminal. Historically, the console was a single keyboard and monitor plugged into a dedicated serial console port on a computer used for direct communication at a low level with the operating system. Modern Linux systems provide virtual consoles. These are accessed through key combinations (e.g. Alt+F1 or Ctrl+Alt+F1; the function key number selects the console) which are handled at low levels of the Linux operating system -- this means that there is no special service which needs to be installed and configured to run. Interacting with the console is also done using a shell program.
In short, terminal is a container, shell is a program running in the terminal, console is a special type of terminal. Bash/zsh/ksh/etc. are all shell programs. iTerm2 is a terminal program. MacOS default terminal runs a bash.
See this website for reference.
See this website for reference.
A shell is the generic name for any program that gives you a text-interface to interact with the computer. You type a command and the output is shown on screen.
Many shells have scripting abilities: Put multiple commands in a script and the shell executes them as if they were typed from the keyboard. Most shells offer additional programming constructs that extend the scripting feature into a programming language.
On most Unix/Linux systems multiple shells are available: bash, csh, ksh, sh, tcsh, zsh just to name a few. They differ in the various options they give the user to manipulate the commands and in the complexity and capabilities of the scripting language.
Interactive: As the term implies, Interactive means that the commands are run with user-interaction from keyboard. E.g. the shell can prompt the user to enter input.
Non-interactive: the shell is probably run from an automated process, so it can't assume that it can request input or that someone will see the output. E.g., it may be best to write output to a log file.
Login: Means that the shell is run as part of the login of the user to the system. Typically used to do any configuration that a user needs/wants to establish their work environment. On MacOS, opening the Terminal app gives an interactive login shell.
Non-login: Any other shell run by the user after logging on, or which is run by any automated process which is not coupled to a logged in user. For example, the shell in Emacs is an interactive, non-login shell.
See this website for reference.
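A program can make the same interactive versus non-interactive distinction for itself by checking whether its input is attached to a terminal. The sketch below shows that check from the Python side; it is not how bash classifies its own shells, and the function names can_prompt and ask_or_default are made up for illustration.

```python
import io
import sys


def can_prompt(stream=None):
    """Return True when the stream is attached to a terminal."""
    stream = sys.stdin if stream is None else stream
    return stream.isatty()


def ask_or_default(question, default):
    """Prompt only when standard input is a terminal; otherwise use the default."""
    if can_prompt():
        return input(f"{question} [{default}]: ") or default
    return default


# An in-memory stream stands in for input redirected from a file or a pipe
print(can_prompt(io.StringIO("")))
```

Scripts started from cron or from another program usually have no terminal attached, so falling back to a default (and writing to a log file instead of the screen) avoids hanging on a prompt nobody will answer.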
.bash_profile and .bashrc are both bash configuration files. .bash_profile is executed for login shells, while .bashrc is executed for interactive non-login shells.
When you login (type username and password) via console, either sitting at the machine, or remotely via ssh: .bash_profile is executed to configure your shell before the initial command prompt.
But, if you’ve already logged into your machine and open a new terminal window (xterm), then .bashrc is executed before the window command prompt. .bashrc is also run when you start a new bash instance by typing $ /bin/bash
in Terminal.
On MacOS X, Terminal by default runs a login shell every time, so this is a little different from most other systems, but you can configure that in the preferences. That is to say, using .bash_profile is good enough on MacOS X.
See this website and this post for reference.
- Set everything in .bash_profile.
- Create .bashrc and type
$ source ~/.bash_profile
- Doing so will allow non-login shells to use the .bash_profile setup.
There are many places to put binary files in Linux, for example /bin/, /usr/bin, /sbin/, etc. Why is that so?
- /bin (and /sbin) were intended for programs that needed to be on a small / partition before the larger /usr, etc. partitions were mounted. These days, it mostly serves as a standard location for key programs like /bin/sh, although the original intent may still be relevant for e.g. installations on small embedded devices.
- /sbin, as distinct from /bin, is for system management programs (not normally used by ordinary users) needed before /usr is mounted.
- /usr/bin is for distribution-managed normal user programs.
- There is a /usr/sbin with the same relationship to /usr/bin as /sbin has to /bin.
- /usr/local/bin is for normal user programs not managed by the distribution package manager, e.g. locally compiled packages. You should not install them into /usr/bin because future distribution upgrades may modify or delete them without warning.
- /usr/local/sbin, as you can probably guess at this point, is to /usr/local/bin as /usr/sbin to /usr/bin.
- In addition, there is also /opt which is for monolithic non-distribution packages, although before they were properly integrated various distributions put Gnome and KDE there. Generally you should reserve it for large, poorly behaved third party packages such as Oracle.
See this website for reference.
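To check which of these directories actually provides a given command on a particular machine, one can search the PATH from Python. This is a small sketch using only the standard library; the function names locate and path_entries are made up for illustration, and the result depends entirely on the local system.

```python
import os
import shutil


def locate(command):
    """Return the directory on PATH that provides `command`, or None."""
    path = shutil.which(command)
    return os.path.dirname(path) if path else None


def path_entries():
    """Return the PATH directories in the order they are searched."""
    return [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]


if __name__ == "__main__":
    for name in ("sh", "python3", "brew"):
        print(name, "->", locate(name))
    print(path_entries())
```

Because PATH is searched in order, a copy of a program in /usr/local/bin is only picked up ahead of the one in /usr/bin when /usr/local/bin comes first in the list.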
- Do not mess with the system Python. Instead, install Homebrew to manage MacOS packages. Use
$ brew
to install Python, and use $ pip
to install Python packages. - In /usr/bin/, you can find the system Python, and system Python packages can be found in /usr/lib/python27/.
- If Python is installed via Homebrew, that copy of Python appears in /usr/local/bin/ and its libraries in /usr/local/lib/python27/.
- When the OneDrive or Dropbox sync green check mark is missing, go to System Preferences -> Extensions -> Finder to enable the Finder integration.
-
Jupyter Notebook change default browser:
- In terminal, type
jupyter notebook --generate-config
to create jupyter_notebook_config.py
. - Go to
~/.jupyter/jupyter_notebook_config.py
and change the line # c.NotebookApp.browser = u''
. - For example, to use Safari as the Jupyter Notebook browser, set
c.NotebookApp.browser = u'open -a /Applications/Safari.app %s'
.
-
Homebrew
-
brew install graphviz
for *dot application
-
- Use Homebrew to manage packages, see installation-log in this repo to find details about MacOS package setup.
- To install all packages using Homebrew at once, first, you need to use
$ brew leaves
to find all top-level non dependency packages. - Then, store the list of package names in list.txt file, and use
$ xargs brew install < list.txt
or $ brew install $(< list.txt)
to install (not tested). - See this post for reference.
- Use Caskroom to install some software binaries, such as Java and VLC.
- To check homebrewed packages, use
$ brew list
. - To check casked packages, use
$ brew cask list
.