
Git Research

oykuyilmaz1 edited this page Feb 19, 2020 · 10 revisions

Research: Git as a version management system

Have you ever heard about Git or GitHub? Do you ever wonder how Git stores all the versions of your work? Or have you ever lost time because you made a mistake somewhere in your project and could not find the old version of your code? No worries, you can find the answers you need in this research.


What’s a version control system?

A version control system tracks the history of changes as people and teams collaborate on projects together. Anyone on the team can look up an older version of the project and revert the code to it. The system can also answer these questions about a change: who made it, when it was made, and what the developer wrote about it at the time.

Version control software helps prevent concurrent work from conflicting. Two developers can change different parts of the same code at the same time.

Here is an image that shows how a distributed version control system works:

What is Git and why do we use it?

According to the official Git website, "Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency."

  • Collaboration: With a Version Control System, everybody on the team is able to work absolutely freely - on any file at any time. The VCS will later allow you to merge all the changes into a common version. There's no question where the latest version of a file or the whole project is. It's in a common, central place: your version control system.

  • Storing Versions: Saving a version of your project after making changes is an essential habit. But without a VCS, this becomes tedious and confusing very quickly. A version control system acknowledges that there is only one project. Therefore, there's only the one version on your disk that you're currently working on. Everything else - all the past versions and variants - are neatly packed up inside the VCS. When you need it, you can request any version at any time and you'll have a snapshot of the complete project right at hand.

  • Restoring Previous Versions: Being able to restore older versions of a file (or even the whole project) effectively means one thing: you can't mess up! If the changes you've made lately prove to be garbage, you can simply undo them in a few clicks. Knowing this should make you a lot more relaxed when working on important bits of a project.

  • Understanding What Happened: Every time you save a new version of your project, your VCS requires you to provide a short description of what was changed. Additionally (if it's a code / text file), you can see what exactly was changed in the file's content. This helps you understand how your project evolved between versions.

  • Backup: A side-effect of using a distributed VCS like Git is that it can act as a backup; every team member has a full-blown version of the project on their disk, including the project's complete history. Should your beloved central server break down (and your backup drives fail), all you need for recovery is one of your teammates' local Git repositories.
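
The "Understanding What Happened" and "Restoring Previous Versions" points can be tried out in a throwaway repository. The following is only a minimal sketch: the temporary directory, the file name notes.txt, and the demo identity are placeholders chosen for illustration.

```shell
# Throwaway repository; path, file name, and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Save two versions of the same file.
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "second draft" > notes.txt
git add notes.txt
git commit -q -m "Rewrite notes"

# Who made each change, when, and with what description.
git log --format='%h  %an  %ad  %s'

# What exactly changed between the two versions.
git diff HEAD~1 HEAD

# Bring back the older version of the file.
git checkout HEAD~1 -- notes.txt
restored=$(cat notes.txt)
echo "$restored"
```

The last command only restores the file in the working directory and staging area; both commits stay in the history.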

Basic Terminology

  • Commit: From Git's perspective, data is a set of snapshots. For every commit (or save), Git takes a snapshot of the project to keep track of every modification.

  • Repository: A directory that contains the project files plus a hidden .git directory where Git keeps the history and its own bookkeeping data.

  • Staging Area: A file (also called the index) that records which changes from your working directory will go into the next commit.

  • SHA (Secure Hash Algorithm): Git identifies each commit by a hash of its contents, computed with SHA-1 by default. The resulting 40-character hexadecimal string serves as the commit's unique ID.

  • Tagging: We can attach tags to commits so that important commits are easy to find again. For example, at the end of the development of each version of the software, a tag can be added to mark the last commit of that version.

  • Branching: A branch can be thought of as a series of commits together with a pointer to the latest commit in that series. When a new branch is created, a new pointer is initialized at your current commit. If you commit to the new branch, the old branch is not affected by that commit.

  • Merging: Merging is combining two branches. To merge two branches, Git creates a new commit that combines both of them. Conflicts can occur during a merge, for example when both branches changed the same lines, and all conflicts must be resolved before the merge can be completed. Git marks the conflicting regions with merge conflict indicators, which are explained in the picture:

[Figure: merge conflict indicators]
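
The conflict indicators can also be produced on purpose. This is a sketch under assumptions: the file greeting.txt, its contents, and the commit messages are made up for illustration, and the first branch is named master only to match the rest of this page.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "Hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# Change the line on a new branch...
git checkout -q -b feature
echo "Hello from feature" > greeting.txt
git commit -q -am "Change greeting on feature"

# ...and change the same line differently on master.
git checkout -q master
echo "Hello from master" > greeting.txt
git commit -q -am "Change greeting on master"

# The merge stops because both branches edited the same line.
git merge feature || true

# Git writes the indicators into the file:
#   <<<<<<< HEAD      start of the version on the current branch
#   =======           separator between the two versions
#   >>>>>>> feature   end of the version from the branch being merged
cat greeting.txt
markers=$(grep -c '^<<<<<<< ' greeting.txt)

# Resolve by editing the file, then stage it and commit to finish the merge.
echo "Hello from both" > greeting.txt
git add greeting.txt
git commit -q -m "Merge feature into master"
```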

Basic Commands

git config - sets configuration values such as your user name and email

git init - initializes a repository

git clone - copies a repository from a remote source

git status - gives you the status of your files in your repository

git add - adds changes to the staging area

git commit - records the staged changes as a new commit object and gives it an ID

git branch branch_name - creates a branch

git checkout branch_name - makes your desired branch active

git push - uploads your local commits to the remote

git pull - fetches the updates from your remote and merges them into your current branch

git merge - combines two branches

git stash - sets aside changes that you don't want to commit immediately

git reset - moves the current branch and the index back to a commit you choose, for example when you committed or staged changes that were not complete; with no arguments it simply unstages everything

git remote - lists the remotes/sources you have, or adds a new remote

For more commands, you can check this link
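
Three of the commands above (git stash, git reset, git remote) do not appear in the walkthrough further down, so here is a small sketch of them. The file app.txt and the URL https://example.com/demo.git are placeholders; nothing is actually contacted at that address.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Add app"

# git stash: park unfinished edits and get a clean working directory.
echo "half-finished idea" >> app.txt
git stash -q
git status --short                 # prints nothing: the edit is parked
git stash pop -q                   # bring the edit back
git status --short                 # " M app.txt"

# git reset: staging was premature, so unstage (the edit stays in the file).
git add app.txt
git reset -q
after_reset=$(git status --short)
echo "$after_reset"                # " M app.txt" again

# git remote: register a remote under the name "origin" and list remotes.
git remote add origin https://example.com/demo.git
git remote -v
```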

What do these commands do behind the curtains?

A Git repository can be viewed as 3 main areas: the working directory, the staging area, and the repository where the commits are saved.
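
One way to watch a file move through these three areas is git status --short, which prints a two-letter code per file. A minimal sketch, with report.txt as a made-up file name:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "draft" > report.txt

# 1. Working directory only: Git sees the file but does not track it yet.
untracked=$(git status --short)
echo "$untracked"                  # "?? report.txt"

# 2. Staging area: the file is queued for the next commit.
git add report.txt
staged=$(git status --short)
echo "$staged"                     # "A  report.txt"

# 3. Repository: the snapshot is stored and nothing is pending.
git commit -q -m "Add report"
committed=$(git status --short)
echo "$committed"                  # empty
```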

To make any change in the repository, the modified files must first be added to the staging area. For each commit, Git stores a snapshot of the project (files that did not change are reused rather than copied) and identifies the commit by a 40-character hash, which is usually shown abbreviated to its first 7 characters. Each commit also has a commit message, and it is important to write a meaningful message for each commit.
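
The two forms of a commit ID can be printed with git rev-parse. This sketch assumes a repository using Git's default SHA-1 object format; hello.txt is a placeholder file.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > hello.txt
git add hello.txt
git commit -q -m "Add hello"

full=$(git rev-parse HEAD)               # full commit ID
short=$(git rev-parse --short=7 HEAD)    # abbreviated form of the same ID
echo "$full"
echo "$short"
echo "${#full} characters, abbreviated to ${#short}"
```

The actual hash differs on every run because it depends on the commit's content, author, and timestamp.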

When you type git init, it creates a repository. This resembles the root of a tree:

[Figure 1]

When you git add files, they are placed in the staging area:

[Figure 2]

When you git commit, Git gives the commit an ID (A in the figure, but in reality a long hash; see Basic Terminology). Your default branch (master in the figure) and HEAD, the pointer to whatever you currently have checked out, move along with you on your journey:

[Figure 3]

Then suppose you made a change and added it. The staged change now sits on top of your first commit:

[Figure 4]

And when you git commit, HEAD and master move forward with you to your latest commit. The new commit has its own ID (B in the figure):

[Figure 5]
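
The steps so far can be replayed on the command line. In this sketch the commit messages "A" and "B" stand in for the figure labels (the real IDs are hashes), main.py is a placeholder file, and the first branch is explicitly named master to match the figures.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q                                # the root of the tree
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "print('hi')" > main.py
git add main.py                            # stage the new file
git commit -q -m "A"                       # first commit
first=$(git rev-parse HEAD)

echo "print('bye')" >> main.py
git add main.py                            # stage the change
git commit -q -m "B"                       # second commit, on top of A

# HEAD and master both point at the newest commit.
git log --oneline --decorate
head_commit=$(git rev-parse HEAD)
master_commit=$(git rev-parse master)
```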

Now, if you want to make changes but you do not want to mess with your original files, you create a branch. Let's call it feature. The command for that is git branch feature. If you want to continue on your new branch, you need to git checkout feature. Or you can type git checkout -b feature to do both tasks in one line:

[Figure 6]
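
A short sketch of the two ways to create a branch. The second branch name, hotfix, is hypothetical and only there to show the one-line form; app.txt is a placeholder file.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "A"

# Two steps: create the branch, then switch to it.
git branch feature
active_before=$(git symbolic-ref --short HEAD)   # still master
git checkout -q feature
active_after=$(git symbolic-ref --short HEAD)    # now feature

# One step: create and switch at once.
git checkout -q -b hotfix

# "*" marks the active branch; all three point at the same commit for now.
git branch
```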

Now that you have a new branch, you can git add your changed files:

[Figure 7]

When you git commit, this time your selected branch (which is green) moves with the commit, and your inactive branch (master, blue) stays where you left it:

[Figure 8]

Now suppose you made some changes and added them and committed them three times:

git add files

git commit

git add files

git commit

git add files

git commit

Then your tree should look like this:

[Figure 9]

Now comes the good part. Suppose you want to go back to the point where you left the master branch. All you need to do is git checkout master. Your master branch becomes active (green) and your feature branch becomes inactive. Remember, HEAD follows you no matter which branch is selected:

[Figure 10]
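
To see HEAD follow the active branch, git symbolic-ref HEAD prints the branch HEAD currently points at. In this sketch the file app.txt and the commit messages are placeholders, and feature gets one commit instead of several to keep it short.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "A"

# Work on the feature branch.
git checkout -q -b feature
echo "feature work" >> app.txt
git commit -q -am "C"
on_feature=$(git symbolic-ref HEAD)        # refs/heads/feature

# Go back to master: HEAD moves, and so do the files on disk.
git checkout -q master
on_master=$(git symbolic-ref HEAD)         # refs/heads/master
cat app.txt                                # only "v1" again

# Commits that exist on feature but not on master.
ahead=$(git rev-list --count master..feature)
echo "$on_feature -> $on_master, feature is $ahead commit(s) ahead"
```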

You can git add files and git commit and your tree becomes:

[Figure 11]

And if you think that all changes are okay and you no longer need to keep them separate, you can git merge feature while on master and ta-daa, the changes from both branches come together in one last commit:

[Figure 12]
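
A sketch of a merge without conflicts, where each branch adds its own file. The file names and commit messages are made up for illustration. Because master also received a commit after the branches split, Git creates a merge commit with two parents rather than just moving the master pointer forward.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "A"

# One commit on feature...
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# ...and one on master.
git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Master work"

# Merge feature into the active branch (master).
git merge -q -m "Merge feature into master" feature

# The merge commit has two parents: the old master tip and the feature tip.
git log --graph --oneline
ls
```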

Conclusion

Git is an open source tool that empowers individuals and groups working on a detailed project. Each commit records the name of its author and a description of the change. Git is especially useful when a team works from different locations, since anyone who has been given access to the remote repository can contribute changes.

Additional links and References

https://www.atlassian.com/git/tutorials/what-is-version-control

https://medium.com/analytics-vidhya/git-version-control-system-in-15-minutes-ed60aa9e009a

https://confluence.atlassian.com/bitbucketserver/basic-git-commands-776639767.html

https://dev.to/dhruv/essential-git-commands-every-developer-should-know-2fl

https://guides.github.com/introduction/git-handbook/#basic-git

https://git-scm.com

https://www.git-tower.com/learn/git/ebook/en/command-line/basics/why-use-version-control

https://towardsdatascience.com/git-version-control-system-666a1ffd85d3

https://opensource.com/article/19/2/git-terminology








