VH vhbui02

@vhbui02
vhbui02 / xargs.md
Last active October 17, 2023 14:49
[xargs command] again, I might not remember it by tomorrow, still i noted it #linux

Parallel download and rename at the same time

https://stackoverflow.com/a/51584237/9122512

Intro

xargs reads items from STDIN (usually the piped output of another command), builds command lines from them, and executes those commands.

xargs is usually paired with the find command.
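
A small sketch of the find + xargs pairing, just for illustration. The temp directory and file names are made up; `-print0` and `-0` are extensions (GNU and BSD both have them) that keep paths with spaces intact.

```shell
# Hypothetical scratch directory with two .txt files and one .log file
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/c.log"

# find emits NUL-terminated paths; xargs -0 passes them as arguments to ls
count=$(find "$tmp" -name '*.txt' -print0 | xargs -0 ls | wc -l | tr -d ' ')
echo "matched files: $count"

rm -rf "$tmp"
```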

Syntax

xargs [options] [command [initial-arguments]]
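
To make the syntax concrete, here is a minimal sketch where `echo` plays the role of `command` and `item:` / `all:` are the initial-arguments (both chosen only for illustration). `-n 1` limits each invocation to one input item.

```shell
# Without options: all items are appended to a single command line
all=$(printf 'a\nb\nc\n' | xargs echo all:)
echo "$all"

# With -n 1: one command is run per input item
each=$(printf 'a\nb\nc\n' | xargs -n 1 echo item:)
echo "$each"
```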

@vhbui02
vhbui02 / pip.md
Created September 30, 2023 14:08
[pip command] Maybe I will forget it after today, but i'm still gonna note it now

Update outdated package?

  • DO NOT blindly upgrade every package listed by pip list --outdated, since there is no assurance that nothing will break.

Revert to an old version

pip install --force-reinstall -v "<package_name>==<version>"

Use pip or pip3

  • pip3 installs packages for python3
  • pip installs packages for whichever Python it is bound to, which can be python or python2 ONLY IF you installed multiple versions of Python; otherwise it's the same as pip3
@vhbui02
vhbui02 / awk.md
Last active October 17, 2023 14:27
[awk tutorial] Maybe i won't remember after today but I still wanna note

Pre-defined and automatic variables

RS: Record Separator

  • AWK processes data 1 record at a time; records are split out of the whole input stream by RS
  • By default, RS=\n

NR: Record Number

  • If you're using RS=\n by default, NR will be the current input line number.
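
A quick sketch of RS and NR together, using a made-up input where records are separated by `;` instead of newlines:

```shell
# RS=";" makes each ;-separated chunk a record; NR counts records, not lines
records=$(printf 'a;b;c' | awk 'BEGIN { RS=";" } { print NR ": " $0 }')
echo "$records"
```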

FS/OFS: Field Separator/Output Field Separator

  • AWK splits 1 record into multiple fields based on the value of FS
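
A minimal sketch of FS and OFS, assuming a made-up `name:age` input. FS controls how the record is split into `$1`, `$2`, and OFS is what `print` puts between comma-separated arguments.

```shell
# Split on ":" when reading, join with "," when printing, and swap the two fields
swapped=$(printf 'alice:30\nbob:25\n' | awk 'BEGIN { FS=":"; OFS="," } { print $2, $1 }')
echo "$swapped"
```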
@vhbui02
vhbui02 / linux-cheatsheet.md
Last active August 28, 2023 11:28
[Linux cheatsheet] Very very many trivial things
@vhbui02
vhbui02 / stat-cheat-sheet.md
Last active May 31, 2023 04:02
[Statistics Cheat Sheet] #statistic

95% CI of 1 CAT

prop.test(x, n, p)
binom.test(x, n, p)
p = 1/num. of levels

Check Ratio also

table(c(x1, n1), c(x2, n2))
prop.test(table)

Z-test

@vhbui02
vhbui02 / mongodb-text-search-on-atlas-search.md
Last active May 18, 2023 04:05
[MongoDB Text Search on Atlas Search] #mongodb
@vhbui02
vhbui02 / mongodb-non-atlas-text-search.md
Last active May 18, 2023 03:59
[MongoDB Text Search on non-Atlas Self-Managed Deployments] #mongodb

Create a text index; it can include any field whose value is a string or an array of string elements.

A collection can have only 1 text index, but the good news is that this index can cover multiple fields.

Cons

  • does not support fuzzy search
  • does not support autocomplete
  • does not work well with languages that use diacritic marks

Using find()

@vhbui02
vhbui02 / mongodb-on-demand-materialized-view.md
Created May 17, 2023 19:45
[MongoDB On-Demand Materialized View] #mongodb

On-demand Materialized View

  • pre-computed aggregation pipeline
  • results are stored on and read from disk

Actually, it is simply a normal collection produced by a $merge or $out stage. There is no special data structure called Materialized View, unlike in some relational databases (e.g. PostgreSQL).

Examples

// create mock data
@vhbui02
vhbui02 / mongodb-agg-pipeline-optimization.md
Created May 17, 2023 18:19
[MongoDB Aggregation Pipeline Optimization] #mongodb

Worth noting

  • aggregation can use indexes on the input collection to improve performance => indexing is essential in aggregation.

  • if the indexes are good enough, they can cover a stage: the index alone is enough to return all matching data, which removes the need to read the documents themselves => this makes the query very fast

$match, $sort, $group can benefit from indexes

  • an index on the $match query field => identifies relevant documents very fast.
  • an index on the sorted field can be used to return data for the $sort stage => no need to sort again
  • an index on the grouped field that matches the $sort order can return all of the field values needed to execute the $group stage (if the data is sorted by the grouped field, all identical values of that field sit next to each other). This is a covered stage.
@vhbui02
vhbui02 / mongodb-clustered-collections.md
Created May 17, 2023 17:27
[MongoDB Clustered Collections] #mongodb

CLUSTERED COLLECTIONS

A Collection with 1 clustered index

Pros

    1. fast queries without a secondary index; the clustered index key is used instead for range or equality comparisons.
    2. lower storage size, very good for bulk inserts
    3. eliminates the need for a TTL index, since: