Git Tutorial: Objects, References, The Index

To understand the core of Git internals, there are three things we should know: objects, references, and the index.

I find this model elegant. It fits in a small diagram, and in my head.

Objects


Every file you commit to a Git repository, along with the commit info itself, is stored as an object in .git/objects/.

An object is identified by a 40-character hexadecimal string: the SHA-1 hash of the object’s content, prefixed with a short header that records the object’s type and size.
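To make the hashing concrete, here is a minimal sketch in Python (not Git's own code). The function name git_object_id is my own; the scheme it follows is that Git hashes a "<type> <size>" header, a NUL byte, and then the raw content.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the 40-character hex id Git would assign to an object."""
    # The hashed bytes are "<type> <size>\0" followed by the raw body.
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


if __name__ == "__main__":
    # Same value that `echo 'hello world' | git hash-object --stdin` reports.
    print(git_object_id("blob", b"hello world\n"))
```

Because the id depends only on the type and the bytes, two files with identical content always map to the same blob.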

There are 4 types of objects:

  1. blob – stores file content.
  2. tree – stores directory layouts and filenames.
  3. commit – stores commit info and forms the Git commit graph.
  4. tag – stores an annotated tag.
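As a rough illustration of how a tree refers to blobs, the sketch below serializes tree entries in Python. The helper names (object_id, tree_body) are mine, and the sort is a simplification of Git's real ordering rule, so treat it as a picture of the layout rather than a drop-in replacement for Git.

```python
import hashlib


def object_id(obj_type: str, body: bytes) -> str:
    """Hash "<type> <size>\\0" + body, as Git does for every object type."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries) -> bytes:
    """Serialize (mode, name, hex_id) entries into a tree object's body."""
    body = b""
    # Simplification: sort by raw name bytes. Real Git compares directory
    # names as if they ended with "/".
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1].encode()):
        # Each entry is "<mode> <name>\0" followed by the 20 raw hash bytes.
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_id)
    return body


readme_id = object_id("blob", b"hello world\n")
root_tree = tree_body([("100644", "README", readme_id)])
root_tree_id = object_id("tree", root_tree)
```

The tree holds only a filename, a mode, and the blob's id. The file content itself lives in the blob.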

The examples below illustrate how these objects relate to each other.

References


A branch, a remote branch, or a tag (also called a lightweight tag) in Git is just a pointer to an object, usually a commit object.

They are stored as plain text files in .git/refs/.
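Since references are plain text files, reading them takes only a few lines. The sketch below is a minimal Python illustration with names of my own choosing (read_loose_refs, make_demo_git_dir). It builds a throwaway directory that mimics .git/refs/ rather than touching a real repository, and it ignores .git/packed-refs, where Git may also keep references.

```python
import tempfile
from pathlib import Path


def read_loose_refs(git_dir: Path) -> dict:
    """Map each reference name to the object hash stored in its file."""
    refs = {}
    for path in sorted((git_dir / "refs").rglob("*")):
        if path.is_file():
            name = path.relative_to(git_dir).as_posix()
            refs[name] = path.read_text().strip()
    return refs


def make_demo_git_dir() -> Path:
    """Create a fake .git directory holding one branch reference."""
    git_dir = Path(tempfile.mkdtemp()) / ".git"
    heads = git_dir / "refs" / "heads"
    heads.mkdir(parents=True)
    # The hash is the first commit from the walkthrough in this article.
    (heads / "master").write_text("d9976cfe0430557885d162927dd70186d0f521e8\n")
    return git_dir


if __name__ == "__main__":
    print(read_loose_refs(make_demo_git_dir()))
```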

Symbolic References


Git has a special kind of reference, called a symbolic reference. It doesn’t point to an object directly. Instead, it points to another reference.

For instance, .git/HEAD is a symbolic reference. It points to the current branch you are working on.
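Resolving a symbolic reference means following "ref: ..." lines until a file contains an object hash. Here is a small Python sketch of that idea. The function resolve_ref and its max_depth guard are my own assumptions, not Git's implementation, and the demo files live in a temporary directory.

```python
import tempfile
from pathlib import Path


def resolve_ref(git_dir: Path, name: str = "HEAD", max_depth: int = 5) -> str:
    """Follow 'ref: ...' indirections until a file holds an object hash."""
    for _ in range(max_depth):
        content = (git_dir / name).read_text().strip()
        if not content.startswith("ref: "):
            return content  # direct reference: the object hash itself
        name = content[len("ref: "):]  # symbolic reference: keep following
    raise ValueError("too many levels of symbolic references")


def make_demo_git_dir() -> Path:
    """Create a fake .git directory where HEAD points at branch master."""
    git_dir = Path(tempfile.mkdtemp()) / ".git"
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    (git_dir / "refs" / "heads" / "master").write_text(
        "d9976cfe0430557885d162927dd70186d0f521e8\n"
    )
    return git_dir


if __name__ == "__main__":
    print(resolve_ref(make_demo_git_dir()))
```

In a freshly initialized repository the branch file does not exist yet, so this sketch would raise FileNotFoundError until the first commit is made.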

The Index


The index is a staging area, stored as a binary file in .git/index.

When you run git add on a file, Git adds the file info to the index. When you run git commit, Git commits only what is listed in the index.
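The index is binary, but its first 12 bytes are easy to decode: a "DIRC" signature, a version number, and the number of entries, all in big-endian order. The Python sketch below parses just that header. parse_index_header is a name I made up, and the sample bytes are constructed by hand rather than read from a real repository.

```python
import struct


def parse_index_header(data: bytes):
    """Return (version, entry_count) from the first 12 bytes of an index."""
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    return version, entry_count


# Hand-built header for illustration: version 2 with one staged entry.
sample_header = struct.pack(">4sII", b"DIRC", 2, 1)

if __name__ == "__main__":
    print(parse_index_header(sample_header))
```

On a real repository you could pass in the bytes of .git/index instead of the sample. The entries that follow the header carry each staged path together with the hash of its blob.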

Examples

Let’s walk through a simple example: create a Git repository, commit some files, and see what happens behind the scenes in the .git directory.

Initialize New Repository


$ git init canai

What happened:

  • Empty .git/objects/ and .git/refs/ created.
  • No index file yet.
  • HEAD symbolic reference created.
    $ cat .git/HEAD
    ref: refs/heads/master

Add New File


$ echo "A roti canai project." >> README
$ git add README

What happened:

  • Index file created.
    It has a SHA1 hash that points to a blob object.
  • Blob object created.
    The content of README file is stored in this blob.
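To see roughly where that blob ends up on disk, here is a Python sketch of writing and reading a loose object. The function names are mine and the objects directory is a temporary one. The layout it mimics is a zlib-compressed file whose path is the first two hex characters of the id as a directory and the remaining 38 as the filename.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_loose_object(objects_dir: Path, obj_type: str, body: bytes) -> str:
    """Store an object the way `git add` stores a blob; return its id."""
    store = f"{obj_type} {len(body)}".encode("ascii") + b"\x00" + body
    hex_id = hashlib.sha1(store).hexdigest()
    path = objects_dir / hex_id[:2] / hex_id[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return hex_id


def read_loose_object(objects_dir: Path, hex_id: str):
    """Return (type, body) for a loose object written in this layout."""
    raw = zlib.decompress((objects_dir / hex_id[:2] / hex_id[2:]).read_bytes())
    header, _, body = raw.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, body


if __name__ == "__main__":
    objects = Path(tempfile.mkdtemp()) / "objects"
    blob_id = write_loose_object(objects, "blob", b"A roti canai project.\n")
    print(blob_id, read_loose_object(objects, blob_id))
```

This covers loose objects only. Git may later move objects into packfiles, which this sketch does not handle.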

First Commit


$ git commit -m 'first commit'
[master (root-commit) d9976cf] first commit
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 README

What happened:

  • Branch ‘master’ reference created.
    It points to the latest commit object in the ‘master’ branch.
  • First commit object created. It points to the root tree object.
  • Tree object created.
    This tree represents the ‘canai’ directory.
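The commit object itself is a short piece of text that names a tree, any parent commits, and the author. The Python sketch below assembles such a body. commit_body is my own helper, the author and timestamp are placeholders, and the tree id is Git's well-known empty-tree id, used only to keep the example small.

```python
import hashlib

# Id of the empty tree, used here purely as a stand-in root tree.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def object_id(obj_type: str, body: bytes) -> str:
    """Hash "<type> <size>\\0" + body, as Git does for every object type."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree_id, parent_ids, author, timestamp, message) -> bytes:
    """Build the text stored inside a commit object."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]  # a root commit has none
    lines.append(f"author {author} {timestamp} +0000")
    lines.append(f"committer {author} {timestamp} +0000")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


# Placeholder identity and timestamp, not taken from the walkthrough.
AUTHOR = "A U Thor <author@example.com>"
first = commit_body(EMPTY_TREE, [], AUTHOR, 0, "first commit")
first_id = object_id("commit", first)
```

Changing anything in the body (the tree, a parent, the message, even the timestamp) produces a different commit id.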

Add Modified File


$ echo "Welcome everyone." >> README
$ git add README

What happened:

  • Index file updated.
    Notice it points to a new blob?
  • Blob object created.
    The entire README content is stored as a new blob.

Add File into Subdirectory


$ mkdir doc
$ echo "[[TBD]] manual toc" >> doc/manual.txt
$ git add doc

What happened:

  • Index file updated.
  • Blob object created.

Second Commit


$ git commit -m 'second commit'
[master 556eaf3] second commit
 2 files changed, 2 insertions(+), 0 deletions(-)
 create mode 100644 doc/manual.txt

What happened:

  • Branch ‘master’ reference updated.
    It points to the latest commit in this branch.
  • Second commit object created. Notice its ‘parent’ points to the first commit object. This forms a commit graph.
  • New root tree object created.
  • New subdir tree object created.
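The parent pointers are all that is needed to recover history. As a toy model in Python, the sketch below keeps commits in a dictionary keyed by the abbreviated ids from this walkthrough and follows the parents from the branch tip. walk_history is a hypothetical helper, and the traversal order is a plain depth-first walk, not the date-based order that git log uses.

```python
def walk_history(commits: dict, start: str) -> list:
    """Return every commit id reachable from `start` via parent pointers."""
    seen, order, stack = set(), [], [start]
    while stack:
        commit_id = stack.pop()
        if commit_id in seen:
            continue  # a merge can make a commit reachable twice
        seen.add(commit_id)
        order.append(commit_id)
        stack.extend(commits[commit_id]["parents"])
    return order


# In-memory stand-in for the two commits created in this walkthrough.
commits = {
    "556eaf3": {"parents": ["d9976cf"], "message": "second commit"},
    "d9976cf": {"parents": [], "message": "first commit"},
}

if __name__ == "__main__":
    for commit_id in walk_history(commits, "556eaf3"):
        print(commit_id, commits[commit_id]["message"])
```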

Add Annotated Tag


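$ git tag -a -m 'this is annotated tag' v0.1 d9976

What happened:

  • Tag object created.
    It points to the commit object d9976.
  • Tag reference created.
    For an annotated tag, the reference points to the tag object, which in turn points to the commit. A lightweight tag reference holds the commit hash directly, as the root-commit tag does here:
    $ cat .git/refs/tags/root-commit
    d9976cfe0430557885d162927dd70186d0f521e8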
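An annotated tag is stored as its own object, which records the object it tags, that object's type, the tag name, the tagger, and the message. The Python sketch below builds such a body. tag_body is a name I picked, and the tagger and timestamp are placeholders rather than values from this walkthrough.

```python
import hashlib


def object_id(obj_type: str, body: bytes) -> str:
    """Hash "<type> <size>\\0" + body, as Git does for every object type."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tag_body(target_id, target_type, name, tagger, timestamp, message) -> bytes:
    """Build the text stored inside an annotated tag object."""
    lines = [
        f"object {target_id}",
        f"type {target_type}",
        f"tag {name}",
        f"tagger {tagger} {timestamp} +0000",
    ]
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


# Target is the first commit of the walkthrough; the tagger is a placeholder.
v01 = tag_body(
    "d9976cfe0430557885d162927dd70186d0f521e8",
    "commit",
    "v0.1",
    "A U Thor <author@example.com>",
    0,
    "this is annotated tag",
)
v01_id = object_id("tag", v01)
```

The file .git/refs/tags/v0.1 would then hold the id of this tag object, not the id of the commit.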
Rajesh Kumar