The goal of this training is to understand how Git works internally, so that teams can make informed decisions about how to use Git in their projects and understand the impact of their branching and merging strategies.
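Git stores commits as a linked structure, where each commit points to its parent commit(s). Because a merge commit has more than one parent, the result is strictly a directed acyclic graph rather than a simple linked list. Knowing how Git stores data as objects in a key-value store, why are commits and hashes so important?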
https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.linkedlist-1?view=net-9.0#examples https://github.com/microsoft/referencesource/blob/master/System/compmod/system/collections/generic/linkedlist.cs
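To make the "each commit points to its parent" idea concrete, here is a minimal sketch of a hash-linked chain. It is a toy model, not Git's real format: `commit_id` and `build_chain` are hypothetical helpers, and the id covers only the parent ids and the message, whereas a real commit also covers the tree, author, and dates.

```python
import hashlib


def commit_id(parent_ids, message):
    """Toy commit id: SHA-1 over the parent ids and the message (a simplification)."""
    payload = "parents:" + ",".join(parent_ids) + "\nmessage:" + message
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def build_chain(messages):
    """Return the ids of a straight line of commits, oldest first."""
    ids = []
    parents = []
    for message in messages:
        new_id = commit_id(parents, message)
        ids.append(new_id)
        parents = [new_id]  # the next commit points back at this one
    return ids


original = build_chain(["Add a git ignore", "Update FileA.1", "Create FileB"])
rewritten = build_chain(["Add a .gitignore", "Update FileA.1", "Create FileB"])

# Only the first message differs, yet every later id changes too,
# because each id is computed over its parent's id.
changed = [old != new for old, new in zip(original, rewritten)]
print(changed)  # [True, True, True]
```

This is the property worth drawing out for the audience: an id identifies not just one commit but the whole history behind it, which is why rewriting an early commit produces new ids for everything after it.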
Git uses SHA-1 hashes to uniquely identify commits, trees, and blobs.
What is a hash and why is it important in Git? https://gist.github.com/masak/2415865
Git uses a Key-Value store to manage its objects, where the key is the SHA-1 hash of the object and the value is the object itself.
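As a rough illustration of the key-value idea, the sketch below computes an object key the way `git hash-object` does for a blob (SHA-1 over a `<type> <size>` header, a NUL byte, and the content) and keeps the compressed value in a dictionary. `ToyObjectStore` and `loose_object_path` are hypothetical names used only for this example; the class is an in-memory stand-in, not Git's storage code.

```python
import hashlib
import zlib


def encode_object(obj_type, payload):
    """Prefix the payload with the '<type> <size>' header and a NUL byte."""
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\x00"
    return header + payload


def object_id(obj_type, payload):
    """The key: SHA-1 over header plus payload, as 40 hex characters."""
    return hashlib.sha1(encode_object(obj_type, payload)).hexdigest()


class ToyObjectStore:
    """Hypothetical in-memory stand-in for the .git/objects directory."""

    def __init__(self):
        self._objects = {}

    def put(self, obj_type, payload):
        oid = object_id(obj_type, payload)
        # Loose objects are zlib-compressed on disk; mirror that here.
        self._objects[oid] = zlib.compress(encode_object(obj_type, payload))
        return oid

    def get(self, oid):
        raw = zlib.decompress(self._objects[oid])
        header, _, payload = raw.partition(b"\x00")
        obj_type, _size = header.decode("ascii").split(" ")
        return obj_type, payload

    def __len__(self):
        return len(self._objects)


def loose_object_path(oid):
    """Loose objects live under a directory named after the first two hex digits."""
    return f".git/objects/{oid[:2]}/{oid[2:]}"


store = ToyObjectStore()
first = store.put("blob", b"test content\n")
second = store.put("blob", b"test content\n")  # same content, same key

print(first)                        # d670460b4b4aece5915caf5c68d12f560a9fe3e4
print(first == second, len(store))  # True 1
print(store.get(first))             # ('blob', b'test content\n')
print(loose_object_path(first))     # .git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4
```

Because the key is derived from the content, storing identical content twice yields a single entry, which is how Git avoids duplicating unchanged files between commits.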
The objects are stored in the .git/objects directory and include the following types:
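- Commit: Points to the top-level tree (the snapshot of the repository at a point in time) and to its parent commit(s), and includes metadata such as the author, date, and commit message.
- Tree: A directory listing, which maps names to blobs and other trees.
- Blob: The content of a file, stored as a binary object. The file name is not part of the blob; it lives in the tree that references it.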
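The relationships between the three types can be sketched by building each object's payload and hashing it. The layouts below follow Git's object formats as I understand them (a tree entry is `<mode> <name>`, a NUL byte, then the 20 raw bytes of the child's id; a commit is text with `tree` and `parent` lines), but the helper names, the author string, and the file contents are assumptions made up for this example. The sort is simplified and only correct for flat file names.

```python
import hashlib


def object_id(obj_type, payload):
    """SHA-1 over '<type> <size>', a NUL byte, and the payload."""
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


def tree_payload(entries):
    """Serialise (mode, name, object id) entries; simplified name-only sort."""
    body = b""
    for mode, name, oid in sorted(entries, key=lambda entry: entry[1]):
        body += f"{mode} {name}".encode("utf-8") + b"\x00" + bytes.fromhex(oid)
    return body


def commit_payload(tree_id, parent_ids, author, message):
    """Serialise a commit: one tree, zero or more parents, metadata, message."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {parent}" for parent in parent_ids]
    lines += [f"author {author}", f"committer {author}", "", message]
    return ("\n".join(lines) + "\n").encode("utf-8")


AUTHOR = "Presenter <presenter@example.com> 0 +0000"  # hypothetical identity


def snapshot(file_content, parent_ids, message):
    """Build blob -> tree -> commit for a repository holding a single file."""
    blob_id = object_id("blob", file_content)
    tree_id = object_id("tree", tree_payload([("100644", "FileA.txt", blob_id)]))
    commit = object_id("commit", commit_payload(tree_id, parent_ids, AUTHOR, message))
    return blob_id, tree_id, commit


blob1, tree1, commit1 = snapshot(b"FileA\n", [], "Add FileA")
blob2, tree2, commit2 = snapshot(b"Update FileA.1\n", [commit1], "Update FileA.1")

print(commit_payload(tree2, [commit1], AUTHOR, "Update FileA.1").decode("utf-8"))
print(blob1 != blob2, tree1 != tree2, commit1 != commit2)  # True True True
print(object_id("tree", b""))  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

Changing one file changes its blob id, which changes the tree id, which changes the commit id. The last line prints the id of the empty tree, a handy value to compare against a real repository.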
npm install -g tree-cli
watch "clear | treee -l 10 --ignore hooks"
cd .git; treee -l 10 --ignore hooks
winget install gource
git cat-file -p <hash>  # pretty-print the object's content
git cat-file -t <hash>  # show the object's type
git cat-file -s <hash>  # show the object's size in bytes
git cat-file -p <hash> > commit.bin
git reflog --all
git log --oneline --graph --decorate --boundary `
    feature-squash..feature-rebase
# Show the log for all branches, including remotes
git log --graph --oneline --all --remotes --decorate --date=short --pretty=format:"%C(auto)%h %d %s %Cgreen(%ad, %cr) %C(bold blue)<%an>"
# Show a commit, including the files changed
git show --name-status --pretty=short <hash>
git log --walk-reflogs --pretty=oneline --all -- FileA.txt
git hash-object FileA.txt
git cat-file -p <hash>
gource -a 1 -c 4 -s 1
The goal is to demo creating a repository, adding files, and making commits, then creating branches and merging, rebasing, and squashing commits. Finally, show the logs and the tree structure of the repository to see how the branches differ depending on the merge type.
Use this time to discuss how Git stores data as objects in a key-value store. Then create a repository, add files, and make commits as follows, whilst reviewing the .git/objects directory structure.
echo "Let's create a new repository"
mkdir src
cd .\src\
dotnet new gitignore
git init -b master  # -b needs Git 2.28+; pins the branch name the rest of the demo assumes
git add .\.gitignore
git commit -m "Add a git ignore"
echo "Let's add a file and make some commits"
echo "FileA" | Out-File -Encoding UTF8 FileA.txt
git add .\FileA.txt
# Presenter: Show how the file is added to the staging area
echo "Update FileA.1"
echo "Update FileA.1" | Out-File -Encoding UTF8 FileA.txt
git add .\FileA.txt
git commit -m "Update FileA.1"
echo "Create FileB"
echo "Create FileB" | Out-File -Encoding UTF8 FileB.txt
git add .\FileB.txt
git commit -m "Create FileB"
# Let's show how a simple merge works
git checkout -b feature-c
echo "FileC" | Out-File -Encoding UTF8 FileC.txt
git add .\FileC.txt
git commit -m "Adding FileC to feature-c"
echo "FileC.1" | Out-File -Encoding UTF8 -Append FileC.txt
git add .\FileC.txt
echo "FileC.2" | Out-File -Encoding UTF8 -Append FileC.txt
git add .\FileC.txt
git commit -m "Updating FileC.2 in feature-c"
echo "FileC.3" | Out-File -Encoding UTF8 -Append FileC.txt
git add .\FileC.txt
git commit -m "Updating FileC.3 in feature-c"
# Let's review the diffs
git diff master..feature-c
# Review commits in master but not in feature-c - should be nothing, right?
git log --oneline --graph --decorate --boundary feature-c..master
git log --oneline --graph --decorate --boundary master..feature-c
# Let's do a simple merge now - using a new branch so we can isolate the changes
git checkout master
# Important: add some new commits to master first, to simulate real-world activity
# Without new commits, this would be a fast-forward merge
echo "Update FileA.2" | Out-File -Encoding UTF8 -Append FileA.txt
git add .\FileA.txt
git commit -m "Updating FileA.2 in master"
echo "Update FileA.3" | Out-File -Encoding UTF8 -Append FileA.txt
git add .\FileA.txt
git commit -m "Updating FileA.3 in master"
# Now let's create some branches to show the different merge types
git checkout -b feature-merge
git merge feature-c
git checkout master
git checkout -b feature-rebase
git rebase feature-c
git checkout master
git checkout -b feature-squash
git merge --squash feature-c
git commit -m "Squashed changes from feature-c"
git checkout master
# Presenter: Show how each merged branch differs, and how commits are changed or missing depending on the merge type.
git log --oneline --graph --decorate --boundary feature-rebase..feature-c
git log --oneline --graph --decorate --boundary feature-squash..feature-c
git log --oneline --graph --decorate master
git log --oneline --graph --decorate feature-merge
git log --oneline --graph --decorate feature-rebase
git log --oneline --graph --decorate feature-squash
# To rebase or not to rebase, that is the question
# Presenter: show why the answer to 'rebase, merge, or squash?' is 'it depends'
git checkout feature-c
git checkout -b feature-c-master-rebase
git rebase master
git log --oneline --graph --decorate master
git log --oneline --graph --decorate feature-c-master-rebase
Presenter: Explain how Git uses a linked list to store commits. Knowing how Git stores data as blobs in a key-value store, why are commits and hashes so important? How are they used to identify the state of the repository at a point in time?
https://gist.github.com/masak/2415865
Add a file and make multiple commits to it. Show the logs and the objects in the tree.
Create a new branch and commit some changes. Create a new branch and merge the feature branch in. Create a new branch from master and rebase it onto the feature branch. Create a new branch and squash the commits from the feature branch into one commit.
Then show the logs with --graph and --boundary to compare the branches. Use git diff to show that there are no file content changes between the branches, then use the boundary markers in the graph to show that the commits still differ.
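One way to explain why the branches can hold the same content yet show different commits is to treat `A..B` as "commits reachable from B but not from A". The sketch below applies that rule to a toy history shaped like the demo. The single-letter labels stand in for real hashes and are made up for this example, and `ancestors` and `commit_range` are hypothetical helpers, not Git commands.

```python
def ancestors(parents, tip):
    """Every commit reachable from tip by following parent links (tip included)."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def commit_range(parents, exclude_tip, include_tip):
    """Commits selected by 'exclude_tip..include_tip'."""
    return ancestors(parents, include_tip) - ancestors(parents, exclude_tip)


# Toy history; labels are placeholders, not real commit ids.
parents = {
    "A": [], "B": ["A"], "C": ["B"],          # shared history on master
    "M1": ["C"], "M2": ["M1"],                # later commits on master
    "F1": ["C"], "F2": ["F1"], "F3": ["F2"],  # feature-c
    "X": ["M2", "F3"],                        # feature-merge: merge commit with two parents
    "S": ["M2"],                              # feature-squash: one new commit on top of master
    "M1r": ["F3"], "M2r": ["M1r"],            # feature-rebase: master's commits replayed on feature-c
}
tips = {
    "master": "M2", "feature-c": "F3", "feature-merge": "X",
    "feature-squash": "S", "feature-rebase": "M2r",
}

for branch in ("feature-merge", "feature-rebase", "feature-squash"):
    missing = commit_range(parents, tips[branch], tips["feature-c"])
    print(f"{branch}..feature-c ->", sorted(missing))
# feature-merge..feature-c -> []
# feature-rebase..feature-c -> []
# feature-squash..feature-c -> ['F1', 'F2', 'F3']
```

The squash branch contains the same file content as the others, but none of feature-c's commits are in its ancestry, so the range still lists them. The merge and rebase branches both have feature-c's tip as an ancestor, so their ranges are empty.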
Then demo a merge conflict, using an XML file with a conflicting change on the same line in two branches.
Show how git gc cleans up the repository by removing unreachable (dangling) objects once they are older than the prune grace period (two weeks by default), and how it packs loose objects into packfiles to save space.
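The idea of a dangling object can be sketched as a reachability check: start from the branch tips, follow every reference, and whatever is never visited is a candidate for cleanup. This is a simplified model under stated assumptions. The object names are invented, `reachable` and `unreachable` are hypothetical helpers, and real Git also treats reflog entries and the index as starting points and applies the grace period before deleting anything.

```python
def reachable(references, roots):
    """All objects that can be reached from the given root objects."""
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(references[obj])
    return seen


def unreachable(references, roots):
    """Objects in the store that no root leads to."""
    return set(references) - reachable(references, roots)


# Toy object store: each object lists the objects it points to.
references = {
    "commit1": ["tree1"],
    "tree1": ["blobA"],
    "blobA": [],
    "commit2": ["commit1", "tree2"],
    "tree2": ["blobA", "blobB"],
    "blobB": [],
    "orphan_commit": ["commit1", "tree3"],  # e.g. left behind after its branch was deleted
    "tree3": ["blobC"],
    "blobC": [],
    "staged_blob": [],                      # staged once, then replaced before any commit
}
branch_tips = ["commit2"]  # what refs/heads/master points at in this toy example

print(sorted(unreachable(references, branch_tips)))
# ['blobC', 'orphan_commit', 'staged_blob', 'tree3']
```

Note that `blobA` survives even though the orphaned commit's history also leads to it, because it is still reachable from the branch tip through both trees.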
- find a better way to see the tree and include the date created and updated