Information seems to flow into a Git repository along this general workflow: Working Directory → Staging → Local Repository → Remote Repository. In my previous post, I looked at how to create my initial local repository and the basic information it contained. In this post, I’m going to create some files in my working directory and move them into the staging area.
In my post Poking Around the Git Interface, I set the basic identification information (name and email) that Git needs in order to track my changes. I can see this by running the git config --list command.
Before I can move anything to Staging, I need some content to move. I’m going to create two files: file1.c and file2.c. It doesn’t really matter what is in the files; I just need a couple of files to move around.
To stage these files, I use the add command. I can stage each file individually (git add <filename>), or I can stage them all at once (git add .). I’ll do them all at once.
So what is happening behind the scenes when you do a git add? If I change into the .git subdirectory, I can see that there is a new file called index.
This file is basically the staging area with metadata, including the SHA1 values, timestamps, etc. This is not a human-readable file, and while you can open it, you won’t get much value out of it.
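Although the index is a binary file, its first few bytes are easy to decode. Here is a minimal sketch, assuming the layout described in Git's index-format documentation: a 12-byte header made of the signature DIRC, a 4-byte version number, and a 4-byte count of entries, all big-endian. The function name and the sample bytes are hypothetical and stand in for a real .git/index.

```python
import struct


def parse_index_header(data: bytes) -> dict:
    """Decode the 12-byte header at the start of a Git index file."""
    if len(data) < 12:
        raise ValueError("index data is shorter than the 12-byte header")
    # Big-endian: 4-byte signature, 4-byte version, 4-byte entry count.
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a Git index file (missing DIRC signature)")
    return {"version": version, "entries": entry_count}


# Hypothetical header bytes standing in for a real .git/index:
# version 2 with two staged entries (for example file1.c and file2.c).
sample_header = b"DIRC" + struct.pack(">II", 2, 2)
print(parse_index_header(sample_header))
```

To try this against a real repository, read the bytes with open(".git/index", "rb").read() and pass them in. For a readable listing of what is staged, git ls-files --stage is usually the easier route.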
If I change to the objects folder I see the following:
Remember, Git is essentially a key-value store (it’s more than that, but for what we are trying to explain, this simplification helps). When you add a file to staging, Git creates an SHA1 hash for the file. This is a 40-character hexadecimal checksum of the content I am trying to store plus some header information. When Git stores my information, it takes the first two characters of the hash and creates a folder with that name. The file inside that folder is named with the remaining 38 characters of the SHA1 value.
I’m guessing (and I say guessing because, remember, I’m learning as I go) that each file I add to staging will get stored in its own subfolder. I do wonder what happens if two files’ hashes start with the same two characters. Based on what I’m seeing above, I’m guessing they get put in the same directory.
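One way to explore that question is to hash a batch of made-up contents and look for two that share a prefix. Since two hexadecimal characters allow only 256 different folder names, a shared prefix has to turn up within 257 distinct hashes. This is a small sketch under the same blob-header assumption as before; the file contents are invented for illustration.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the SHA1 that Git would assign to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def first_shared_prefix(max_files: int = 300):
    """Hash invented contents until two hashes share their first two characters."""
    seen = {}  # two-character prefix -> first hash seen with that prefix
    for i in range(max_files):
        content = f"hypothetical file {i}\n".encode("ascii")
        digest = git_blob_sha1(content)
        prefix = digest[:2]
        if prefix in seen:
            return prefix, seen[prefix], digest
        seen[prefix] = digest
    return None


result = first_shared_prefix()
if result:
    prefix, first, second = result
    # Both objects would live in the same folder, under different file names.
    print(f"objects/{prefix}/{first[2:]}")
    print(f"objects/{prefix}/{second[2:]}")
```

The two printed paths share a folder but differ in file name, which is consistent with the guess that objects with the same two-character prefix sit side by side in one directory.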
In a nutshell, when I stage a file, the following happens:
- If this is the first time staging something to the repository, Git creates an index file.
- Each file I’m staging has an SHA1 checksum hash created for it.
- The files are stored, based on hash name, in the objects folder.
At this point I’ve taken files and moved them from my working directory to my staging area. The next step is to commit them to the local repository.
(Looking for a good Git reference? I’m using Professional Git as my guide on this journey.)