Have you ever modified working code, broken it completely, and had no idea what you did wrong? Have you ever wondered how you could test a new feature without touching the primary code? If you are a developer, you have probably been through both. Developers use git to solve these real-world problems. In this blog I will introduce you to git, its core concepts and architecture, and how it works, with some examples. So, let us begin with the introduction to git.
What is Git?
In short, Git is an open-source distributed version control system created by Linus Torvalds. In simple language, git is a piece of software to track the changes you have made to your files and folders. Here we have three terminologies: Open Source, Distributed and Version Control System (VCS). So, let us understand them one by one.
Open Source - Open source describes software whose original source code is made freely available under a license that allows anyone to view, modify and redistribute it. For example: Python Programming Language, Linux, Mozilla Firefox, Chromium etc.
Distributed - Distributed is the opposite of centralized. In the context of git, every user working on a project has a complete copy of the repository, including its history, on their own machine, so each user can maintain their own version of the files. It will be clearer to you when you learn to collaborate with git.
Version Control System - A Version Control System (VCS) is simply software that helps you track different versions of the same file. For example: Let us assume you created an empty text file, which is the first version of that file. If you later add some text to it, that is the second version of the file. After multiple edits, it becomes very difficult to remember what you have changed since the first version. So, Git does it for you. It keeps a record of all the modifications to your file since its first version and allows you to go back to a previous version when you need to.
Is git the only tool that can perform these tasks?
No, there are many other version control systems like Source Code Control System (SCCS), Revision Control System (RCS), Concurrent Versions System (CVS), Apache Subversion (SVN), BitKeeper SCM etc. But git is the most widely used VCS around the world.
Git Installation
To run git on your system, you need to install it. The git installer is available from the official Git website (git-scm.com) for Windows, Linux and macOS. You can download it and install it in a few simple steps.
Git Concepts and Architecture
Git uses a three-tier tree architecture, in contrast to many other VCSs, which use a two-tier architecture. So, let us understand what git's three-tier tree architecture is.
When we are working on a project, we have a lot of files and folders saved on our computer's hard disk. This is called the Working directory. To track changes to those files and folders, we need to initialize git in that directory. Once git is initialized in a project directory, it can track the changes made in that directory.
In the above figure we can see Repository at the top. Repository is the main tracking directory created by git when git is initialized. It is placed inside your project directory under the name .git/. It has all the history of changes. If you delete the .git directory, your working directory will not be tracked anymore, and your change history will also be lost.
Now in the above figure, let's come down to the middle, where we can see the Staging Area. The staging area is like a cache that holds your changes before you commit them to the repository. It contains the changes you have selected to go into the next commit. Also, in the figure we can see an arrow coming straight down from the repository to the working directory. This indicates the ability to undo changes we have made to our files.
Before we get started with the working of git, you need to know some basic concepts or terminologies which you will see when you are working with git. They are:
Hash Values
Git generates a hash value each time you commit some changes. The hash value is calculated based on SHA-1 (Secure Hash Algorithm 1). The hash value is calculated with the help of the data in your commit. With this hash value you can see specific commits and changes done at that commit.
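To get a feel for these hash values, here is a minimal sketch. It hashes file contents rather than a commit, because a commit hash also depends on the author, date and message and so differs on every machine. The sample text is only an illustration.

```shell
# Hash some content the way git does (SHA-1 by default).
# --stdin makes git read the content from standard input instead of a file.
blob_hash=$(printf 'hello world\n' | git hash-object --stdin)
echo "$blob_hash"        # 40 hexadecimal characters

# Changing even one character of the content gives a different hash.
changed_hash=$(printf 'hello world!\n' | git hash-object --stdin)
echo "$changed_hash"
```

The same content always produces the same hash, which is why a hash can identify a specific version uniquely.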
HEAD pointer
I'll show you the HEAD pointer when working with git, but you need to know that the HEAD pointer is a pointer which indicates your current commit. That means as you keep changing your files and committing, git's HEAD pointer keeps moving to the most recent commit.
Now let's briefly see how git's three-tier architecture works. I am using git bash, which is a command line tool for using git. It is installed automatically when you install git on Windows. Let me create a directory named Blog, and I will create one text file inside it to track.
As you can see in the figure above, I am inside my Blog directory, inside which I have an empty text file file.txt. So, let's initialize git in this directory.
The command to initialize git in the working directory is git init. As you can see, after I initialized git in the directory, a .git folder was created. The .git folder has all the git configuration needed to track the changes, as well as the history of the changes. The above figure also shows the items inside the .git folder, where you can see the HEAD file. This file is the same pointer I explained earlier.
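The steps so far can be sketched as the commands below. The temporary directory is an assumption made so the example does not touch your real files. The branch name stored in HEAD may be master or main, depending on your git version and configuration.

```shell
set -e
work_dir=$(mktemp -d)        # hypothetical scratch location for this example
mkdir "$work_dir/Blog"
cd "$work_dir/Blog"
touch file.txt               # the empty text file we want to track

git init -q                  # creates the .git folder (the repository)
ls -a                        # .git now sits next to file.txt
cat .git/HEAD                # a reference to the current branch
```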
Now let us see the status of the repository and staging area. The command to do so is git status.
As the above snippet shows, when I entered the command git status, git says that we haven't committed anything yet, but there are some files in the directory which are not tracked. So, let's add file.txt to the staging area and commit it to the repository. The command to add a file to the staging area is git add <filename> and the command to commit it is git commit -m "<commit message>". The commit message explains what this commit changes.
In the above snippet I first added the file to the staging area and then committed it to the repository. Then, when I entered the command git status, git showed that the working directory (or working tree) is clean and there is nothing to commit.
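For reference, the whole cycle described above can be sketched as follows. The scratch directory, the user name and email, and the commit message are placeholders chosen for this example.

```shell
set -e
cd "$(mktemp -d)"                        # hypothetical scratch directory
git init -q
git config user.name "Example User"      # placeholder identity for commits
git config user.email "user@example.com"

touch file.txt
git status --short                       # ?? file.txt  (untracked)

git add file.txt                         # working directory -> staging area
git status --short                       # A  file.txt  (staged)

git commit -q -m "Add empty file.txt"    # staging area -> repository
git status                               # reports nothing left to commit
```

The --short flag prints a compact, one line per file version of the status, which makes the move from untracked to staged easy to see.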
This is the basic use of git's three-tier architecture. Now we will look at how git works in a little more depth.
Working with Git
Add Files
Adding files means adding them to the staging area so that they can be committed. To add files, you can use the command git add <filename1> <filename2>, with the file names separated by spaces. This is only practical when we have very few files to commit. If there are many files, adding them one by one is tedious. So, to add all the files in the current directory, we use the command git add . (with a dot). We also have other commands for similar tasks, such as git add -A or git add -u. You can see the Git Documentation or use the git help command in git bash to see their usage.
To commit the added files, we use the same command, git commit -m "<commit message>". We can also use a shortcut to add files and commit in one step. The command for this is git commit -am "<commit message>". But this only picks up modifications and deletions of files that git already tracks. New (untracked) files are not included, so you still have to add those with git add.
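Here is a small sketch of that limitation of git commit -am. The file names, identity and messages are made up for this example.

```shell
set -e
cd "$(mktemp -d)"                        # hypothetical scratch directory
git init -q
git config user.name "Example User"      # placeholder identity for commits
git config user.email "user@example.com"

echo "first" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "second" >> tracked.txt             # modify a file git already tracks
echo "new" > untracked.txt               # create a file git has never seen

git commit -q -am "Update tracked.txt"   # stages and commits tracked.txt only
git status --short                       # untracked.txt is still listed with ??
```

To include untracked.txt as well, run git add untracked.txt (or git add .) before committing.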
So, let's change the contents of the file.txt and commit them to the repository.
As you can see when I modified the file and added to the staging area, git tells that it has been modified.
View Git History
As we now have two commits in the repository, let's see the history of commits. The command to do this is git log.
The figure above shows the output of the git log command. It shows the full history of commits so far. Here you can see the two things that I explained earlier. Next to the word commit in every entry there is a hash value which identifies that commit uniquely. Also, at the far right of the latest commit, you can see HEAD, which is pointing to that commit. There is also a term master, which is the name of the default branch, the main line of development in the repository. You will get to know more about this when you start to work with Git Branches.
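If you want to reproduce a history like this yourself, a minimal sketch might look like the following. The file contents, identity and commit messages are placeholders. Your hashes will differ from anyone else's, and your branch may be called main instead of master.

```shell
set -e
cd "$(mktemp -d)"                        # hypothetical scratch directory
git init -q
git config user.name "Example User"      # placeholder identity for commits
git config user.email "user@example.com"

touch file.txt
git add file.txt
git commit -q -m "Add empty file.txt"    # first commit

echo "Hello, World!" > file.txt
git commit -q -am "Add greeting"         # second commit

git log --oneline                        # one line per commit: short hash and message
git log -1 --format='%H %D'              # full hash of the latest commit and the names pointing at it
```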
View Changes
To view the changes you have made to a file, you can use the command git diff <commit hash>. This command compares the current contents of your working directory with the specified commit. There are many other ways to use the git diff command. Refer to the Git diff documentation. Here I have added a line "File Changed" to file1.txt. Now let us compare it with the previous commit and see the output.
Here, git says that I have removed the "Hello, World!" text and added two lines of text to file1.txt, which is true.
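A comparison like this can be sketched as below. The replacement text "Hello, Git!" is an assumption for this example, since the figure with the real contents is not reproduced here. Lines removed since the chosen commit start with a minus sign, and added lines start with a plus sign.

```shell
set -e
cd "$(mktemp -d)"                        # hypothetical scratch directory
git init -q
git config user.name "Example User"      # placeholder identity for commits
git config user.email "user@example.com"

echo "Hello, World!" > file1.txt
git add file1.txt
git commit -q -m "Add file1.txt"
first_commit=$(git rev-parse HEAD)       # remember the hash of this commit

# Replace the original line and add a second one (not committed yet).
printf 'Hello, Git!\nFile Changed\n' > file1.txt

# Compare the working directory with the remembered commit.
git --no-pager diff "$first_commit"
```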
Delete Files
To delete a file in the directory, you can simply delete it from the file explorer and then add and commit the change. But the more efficient way is to use the command git rm <filename>, which deletes the file and adds the deletion to the staging area automatically, so you don't have to add it. You can directly commit this change.
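As a rough sketch, deleting a tracked file with git rm looks like this. The file name, identity and messages are placeholders.

```shell
set -e
cd "$(mktemp -d)"                        # hypothetical scratch directory
git init -q
git config user.name "Example User"      # placeholder identity for commits
git config user.email "user@example.com"

echo "temporary notes" > file.txt
git add file.txt
git commit -q -m "Add file.txt"

git rm -q file.txt                       # deletes the file and stages the deletion
git status --short                       # D  file.txt  (deletion is staged)
git commit -q -m "Remove file.txt"
```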
Rename Files
Similarly, to rename a file in the directory, you can simply rename it from the file explorer and then add and commit the change. But in this case also, the more efficient way is to use the command git mv <old-filename> <new-filename>, which renames the file and adds the rename to the staging area automatically, so you don't have to add it. You can directly commit this change.
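A rename with git mv can be sketched the same way. The file names old-name.txt and new-name.txt are made up for this example.

```shell
set -e
cd "$(mktemp -d)"                        # hypothetical scratch directory
git init -q
git config user.name "Example User"      # placeholder identity for commits
git config user.email "user@example.com"

echo "some content" > old-name.txt
git add old-name.txt
git commit -q -m "Add old-name.txt"

git mv old-name.txt new-name.txt         # renames the file and stages the rename
git status --short                       # the rename is already staged
git commit -q -m "Rename old-name.txt to new-name.txt"
```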
BONUS
What is GitHub?
People often confuse git and GitHub, so let us sort it out now. As we saw, git is a command line tool (GUI versions are also available) used to track changes locally on our own machine. It doesn't require an internet connection to perform its tasks. GitHub, on the other hand, is a cloud-based hosting service for git repositories. GitHub allows repositories to be hosted in the cloud and lets many developers work on the same project remotely. You need an internet connection to synchronize your code with the code in the remote repository.
Is GitHub the only service that does this?
No, there are many providers which offer similar services, such as GitLab, SourceForge, Apache Allura etc.
Git is really useful and one of the most essential tools for any developer. It makes remote collaboration between developers around the world far easier, and it makes changes to code in a production environment much safer to manage. Git has countless uses, which is why it is so widely used in the software development world. I have tried to introduce git from a basic level, but this is only very basic use of Git. There is a lot more you can do with it. Tell me how much this article helped you to understand git in the comment section below.
Thank you for reading.
Happy Learning! ❤️