Git internals
Git objects
$ ls .git/
branches
config
description
HEAD
hooks
index
info
logs
objects
refs
The blob
$ git init
Initialized empty Git repository in /tmp/test/.git/

$ echo "test_content" | git hash-object -w -t blob --stdin
915e94ff1ac3818f1e458534b0228a12a99cd6c5
$ ls .git/objects/91/5e94ff1ac3818f1e458534b0228a12a99cd6c5
.git/objects/91/5e94ff1ac3818f1e458534b0228a12a99cd6c5
$ cat .git/objects/91/5e94ff1ac3818f1e458534b02* | zlib_inflate -d
blob 13\0test_content
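The layout shown on this slide can be reproduced without Git. What follows is a minimal Python sketch, assuming the standard loose-object format: a "type size" header terminated by a NUL byte, followed by the content, hashed with SHA-1 and stored zlib-compressed. The function names are illustrative and are not part of Git.

```python
import hashlib
import zlib


def encode_object(obj_type: str, content: bytes) -> bytes:
    """Prepend the NUL-terminated "<type> <size>" header to the content."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return header + content


def object_id(obj_type: str, content: bytes) -> str:
    """SHA-1 of header + content, as 40 hex digits."""
    return hashlib.sha1(encode_object(obj_type, content)).hexdigest()


def loose_object_path(oid: str) -> str:
    """First two hex digits name the directory, the remaining 38 the file."""
    return f".git/objects/{oid[:2]}/{oid[2:]}"


# echo appends a newline, so the stored content is 13 bytes long.
content = b"test_content\n"
store = encode_object("blob", content)
oid = object_id("blob", content)

# On disk the object is zlib-compressed; inflating it gives back header + content.
compressed = zlib.compress(store)
assert zlib.decompress(compressed) == store

print(repr(store))  # b'blob 13\x00test_content\n'
print(loose_object_path(oid))
```

The size in the header counts only the content bytes, which is why the slide shows 13 for "test_content" plus its trailing newline.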
The tree
tree [content size]\0
100644 blob a906cb README
100755 blob 6f4e32 run
040000 tree 1f7a4e src
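The listing on this slide is the human-readable view of a tree. As a rough sketch of the binary form, assuming the usual encoding (each entry is "mode name", a NUL byte, then the 20 raw bytes of the object id, with directories stored as mode 40000 and displayed as 040000), a tree id could be computed as below. The entry ids are placeholders (the id of the empty blob is reused), not the abbreviated ids from the slide.

```python
import hashlib


def tree_body(entries):
    """Serialize (mode, name, hex_oid) entries into the body of a tree object."""
    def sort_key(entry):
        mode, name, _ = entry
        # Git orders entries by name, comparing directories as if they ended in "/".
        return (name + "/" if mode == "40000" else name).encode()

    body = b""
    for mode, name, hex_oid in sorted(entries, key=sort_key):
        # "<mode> <name>", a NUL byte, then the 20 raw bytes of the object id.
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_oid)
    return body


def tree_id(entries):
    """SHA-1 of the "tree <size>" header, a NUL byte, and the serialized entries."""
    body = tree_body(entries)
    header = f"tree {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


# Placeholder ids: the empty blob stands in for both files, the empty tree for src.
EMPTY_BLOB = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
EMPTY_TREE = tree_id([])
entries = [
    ("100644", "README", EMPTY_BLOB),
    ("100755", "run", EMPTY_BLOB),
    ("40000", "src", EMPTY_TREE),
]
print(tree_id(entries))
```

Because entries are sorted before hashing, the same set of files always yields the same tree id, whatever order they were added in.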
The commit
$ git commit file
[master dbaf944] This is a commit message.
$ cat .git/objects/db/af944a4a9eb72af64042b1e3a128936000dfc2 | zlib_inflate -d
commit 318
tree 47ec7a250164a21cb14eb64618c3a903db0b7420
parent 402b26df0644f09fc62842c0a4a44a0a3345c530
author Manu <m.cupcic@criteo.com> 1380977766 +0200
committer Manu <m.cupcic@criteo.com> 1380977766 +0200

This is a commit message.
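Reading such an object back is mostly string splitting. The sketch below is a simplified parser, assuming single-line headers only (it does not handle multi-line headers such as signatures). The author identity is a made-up placeholder, and `parse_commit` is an illustrative name, not a Git API.

```python
import zlib


def parse_commit(raw: bytes):
    """Split an inflated commit object into (headers, message)."""
    header, _, body = raw.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if obj_type != "commit" or int(size) != len(body):
        raise ValueError("not a well-formed commit object")
    head, _, message = body.decode().partition("\n\n")
    fields = {}
    for line in head.split("\n"):
        key, _, value = line.partition(" ")
        fields.setdefault(key, []).append(value)  # "parent" repeats on merges
    return fields, message


# Rebuild a commit similar to the one on the slide (placeholder identity).
body = (
    b"tree 47ec7a250164a21cb14eb64618c3a903db0b7420\n"
    b"parent 402b26df0644f09fc62842c0a4a44a0a3345c530\n"
    b"author A U Thor <author@example.com> 1380977766 +0200\n"
    b"committer A U Thor <author@example.com> 1380977766 +0200\n"
    b"\n"
    b"This is a commit message.\n"
)
raw = b"commit " + str(len(body)).encode() + b"\x00" + body
stored = zlib.compress(raw)  # what would sit under .git/objects/

fields, message = parse_commit(zlib.decompress(stored))
print(fields["tree"], fields["parent"], message)
```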
The commit
• Is identified by
• a snapshot of the repo state (the tree).
• parent commit(s)
• a commit message

• Is immutable
• Has a deterministic hash (SHA1)
• Commits form a linked list (a graph once merges add several parents): the history
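The "deterministic hash" point can be checked with a few lines of Python. This is a sketch under the assumption that a commit is hashed as "commit size", a NUL byte, then the text shown on the previous slide. `commit_id` and the identity string are hypothetical names used for illustration only.

```python
import hashlib


def commit_id(tree, parents, author, committer, message):
    """Hash a commit: "commit <size>" header, NUL byte, then the commit text."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode()
    return hashlib.sha1(b"commit %d\x00" % len(body) + body).hexdigest()


who = "A U Thor <author@example.com> 1380977766 +0200"  # placeholder identity
base = dict(
    tree="47ec7a250164a21cb14eb64618c3a903db0b7420",
    parents=["402b26df0644f09fc62842c0a4a44a0a3345c530"],
    author=who,
    committer=who,
    message="This is a commit message.\n",
)

a = commit_id(**base)
b = commit_id(**base)                                       # same inputs
c = commit_id(**{**base, "message": "Another message.\n"})  # new message
d = commit_id(**{**base, "parents": []})                    # new parent list

print(a == b, a != c, a != d)  # True True True
```

Since the parent id is part of the hashed text, rewriting one commit changes the id of every commit that descends from it.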
Git References
$ cat .git/refs/heads/master
dbaf944a4a9eb72af64042b1e3a128936000dfc2

$ cat .git/HEAD
ref: refs/heads/master
$ echo "dbaf944a4a9eb72af64042b1e3a128936000dfc2" > .git/refs/heads/newbranch
$ git checkout newbranch
Switched to branch 'newbranch'
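Resolving HEAD is therefore just a matter of following files. The sketch below builds a throwaway directory that mimics the layout above and follows the "ref:" indirection. It is a simplification that only looks at loose ref files and ignores packed refs; `resolve_ref` is an illustrative name.

```python
import os
import tempfile


def resolve_ref(git_dir: str, name: str = "HEAD") -> str:
    """Follow "ref: ..." indirections until an object id is reached."""
    with open(os.path.join(git_dir, name)) as f:
        value = f.read().strip()
    if value.startswith("ref: "):
        return resolve_ref(git_dir, value[len("ref: "):])
    return value


# Build a throwaway directory that mimics the layout from the slide.
git_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(git_dir, "refs", "heads"))
with open(os.path.join(git_dir, "refs", "heads", "master"), "w") as f:
    f.write("dbaf944a4a9eb72af64042b1e3a128936000dfc2\n")
with open(os.path.join(git_dir, "HEAD"), "w") as f:
    f.write("ref: refs/heads/master\n")

print(resolve_ref(git_dir))  # dbaf944a4a9eb72af64042b1e3a128936000dfc2
```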
Git References
$ git tag 1.0 dbaf944a4a9eb72af64042b1e3a128936000dfc2

$ cat .git/refs/tags/1.0
dbaf944a4a9eb72af64042b1e3a128936000dfc2
Take home message
• Git stores a snapshot of the whole repo at each commit.
• The SHA1 of a commit depends only on its content (the tree), message,
author, committer (timestamps included) and parent(s).
• A Git branch/tag is a 40-digit hex number stored in a file.
Things we can play with
git reflog
git fsck

git pack
git config

git rebase -i
git reset
git refspecs
git stash

git add -p
git log (advanced stuff)

git pull --rebase

Editor's Notes

  • #8: Internal model of the commit object. DEMO: git cat-file -p <commit sha1>
  • #9: The most important Git object is the COMMIT. The most important thing about the commit is that it is IMMUTABLE.

    So why is it important? A commit is primarily defined by 3 things: a snapshot of the working directory (the "disk" state), a commit message, and most importantly, a parent commit. Every commit has a pointer to its parent. This is what defines a history of commits: a linked list of commits.

    So, if you change
    • a single file -> different commit
    • the commit message -> different commit
    • the commit parent or parents -> different commit

    A commit is uniquely identified by its SHA1. A SHA1 is deterministic: a snapshot with the exact same content will have the same SHA1. A commit referring to the same snapshot, the same parent commit and the same commit message (and carrying the same author, committer and timestamps) will be identified by the same SHA1.

    So why is this important? It matters because of the initial Git design choices. Git is primarily a content-addressable filesystem that stores every version of every object as a distinct object, accessible forever. Any file, any snapshot, any commit that has been archived in Git can be retrieved forever by its SHA1.

    These files are stored in full, every time. This is a major difference from other versioning systems such as Subversion and Perforce, which store diffs of files. Now, I'm making simplifications, but this is true to a first approximation.

    Why do we care? This means that all snapshots that have been committed once can always be retrieved. Keep this in mind as it will be important later.
  • #10: A commit has a pointer to a tree, which describes the entire content of the Git repo.