A Git repository is just a very fancy way to store files. You can make any directory on any computer into a Git repository. Most of the time, it'll behave exactly like it's still a regular directory. But when you run the git commands, Git can keep a record of the history of that directory. You can tell Git to make a checkpoint of what's in the directory now, or tell it to change the directory to look like it did when you checkpointed it some time in the past. And you can share the directory with colleagues, so they can also see this history. You and colleagues can all have your own copies of the directory, and share with each other all (or some, or none) of the checkpoints of that directory you've made.
"repository": A directory that git knows about, which git can (optionally) share with other people. A repository is the directory + the history of that directory.
"commit": Instructing git to create a checkpoint of your directory, making a record of how it looks right now. You can later instruct git to return you to the way the folder looked now.
"commit log": git's history of all of the commits (checkpoints) you've done in your repository (directory).
"version control": The general name for the kind of tool Git is. These tools create record versions of directories, and let you change between newer and older versions of the directory.
That, on its own, is all Git (and other version control tools, like Subversion (SVN)) do. On its own, it doesn't run any programs, and nothing happens automatically. It just stores files when you tell it to, and lets you share the directory with colleagues.
How you use it
So here's the basic day-to-day workflow with git:
- You have a directory (repository) that git knows about (a directory that is "under version control").
- You put some new files into the repository, or maybe you update a file that's already there. This step doesn't have anything to do with git; you're just doing work for your job.
- You make sure git knows about the files you want it to keep a history for (it won't automatically remember everything in the directory).
- You tell git to commit (i.e., checkpoint, or record a new version of) the directory as it is now.
- Rinse & repeat
And that's it. Git is just keeping track of what you did.
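Here's roughly what that workflow looks like at the command line. Treat it as a minimal sketch: the file name message.txt, the commit message, and the name and email address are placeholders made up for this example, and it runs in a throwaway temporary directory so it can't touch your real work.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Turn the directory into a Git repository.
git init --quiet

# Placeholder identity, for this repository only (Git records a name on each commit).
git config user.name "Example User"
git config user.email "user@example.com"

# Do some work: create a file. Git is not involved in this step.
echo "Welcome to our site!" > message.txt

# Tell Git to keep track of the file, then create a checkpoint.
git add message.txt
git commit --quiet -m "Add welcome message"

# Show the commit log: one line per checkpoint.
git log --oneline
```

Run the add and commit steps again after each piece of work, and the commit log grows by one line each time.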
Why would you want to do this? Well, from here, you have a couple of options for what could happen next:
- You can tell git to share this commit where your colleagues can see it, so they can download a copy of the work you did.
- Maybe you did some more work, but messed up.
  - Git can show you the differences between your current work and your last checkpoint (or any previous checkpoint). So if you knew your code used to work, you can see where you went wrong.
  - If you messed up your work too badly, Git can revert individual files, or the entire repository, to the way they were at your last checkpoint.
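The "messed up" case can be sketched like this. The file name and its contents are invented for the example, and it only shows one way to put a single file back; there are others.

```shell
# Set up a throwaway repository with one good checkpoint.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity
echo "Welcome to our site!" > message.txt
git add message.txt
git commit --quiet -m "Add welcome message"

# Do some more work, and mess it up.
echo "Welcmoe to our stie!" > message.txt

# Show the differences between the current file and the last checkpoint.
git --no-pager diff

# Put the file back the way it was at the last checkpoint.
git checkout -- message.txt

# The file is back to its checkpointed contents.
cat message.txt
```

Newer versions of Git (2.23 and later) also offer "git restore message.txt", which does the same job as the checkout line here.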
"push": Send the history of your git repository (all of your checkpoints) to another git repository. Usually this will be a central server, like Github or GitLab, because that's very convenient, but you can also send your history directly to a colleagues' own git repository.
So that's the core Git workflow in a nutshell. Things can get much fancier, but we won't worry about that for now.
Putting your code somewhere else
OK, so let's say you're doing something like writing some code, and now you need to deploy it on a server. What happens there?
- Log onto the server
- If the server doesn't already have a copy of your git repository, clone one
- Pull down the latest history from the repository
"clone": Create a new copy of a repository, usually from Github or GitLab.
"pull": Update your repository to have all of the history that's in a remote repository.
"remote": Another Git repository somewhere on the network.
"branch": The history of a Git repository doesn't have to be a straight line. You can tell Git to keep separate histories for different portions of your project. You can separate development of different code features up into different "branches", so the work on the different features don't interfere with each other.
Now your server has a copy of your repository. This copy is also a repository, since you downloaded the directory + the history of that directory. Your repository, and the repository that you put onto the server, are different repositories, even if they look similar. They don't have to stay similar -- you can do work in one that you never share with the other -- or you can keep them in lock-step. It all depends on what commands you give to Git.
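The clone-then-pull steps can be tried out without a real server. In this sketch, a "bare" repository in a temporary directory stands in for GitHub or GitLab, and two ordinary directories stand in for your own computer and the server. The directory names (central.git, laptop, server) and the file run.sh are all invented for the example.

```shell
base=$(mktemp -d)

# Stand-in for the central repository (what GitHub or GitLab would normally be).
git init --quiet --bare "$base/central.git"

# Your own copy, where you do your work.
git clone --quiet "$base/central.git" "$base/laptop"
cd "$base/laptop"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity
echo 'echo "Welcome to our site!"' > run.sh
git add run.sh
git commit --quiet -m "Add program"
git push --quiet -u origin HEAD

# On the server: there is no copy yet, so clone one.
git clone --quiet "$base/central.git" "$base/server"

# Later, you commit and push more work from your own copy...
echo 'echo "Goodbye!"' >> run.sh
git commit --quiet -am "Say goodbye too"
git push --quiet

# ...and the server pulls down the latest history.
cd "$base/server"
git pull --quiet
git log --oneline
```

The laptop and server directories are two separate repositories. The server copy only changes when someone runs "git pull" inside it.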
So now your server has a copy of your code, and can do anything you want with it. Usually, you want the server to run the code. The server doesn't need to know anything at all about Git to do that. All it needs to know is your code lives in a directory.
Your code can be run by creating a cron job that looks in your Git repository's directory for the program to run. Or, you could run the program directly from that directory. At this point, Git isn't involved at all. It already did its job of putting the code on the server.
As you work, you'll add new files, and change existing files. You'll want Git to know about all of this, and you'll want other people or servers to get the latest stuff you did.
Git doesn't do anything for you automatically. You have to tell it all of this stuff. When you're ready to create a commit, you need to add the latest files to Git (use "git status" to see what files Git doesn't know about, and tell Git about them with "git add your_file.txt").
Once you're ready to share your files, run "git push" to send them to GitHub/GitLab.
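Here is a small sketch of that status-then-add step, in a throwaway repository. The "--short" option just makes the output of "git status" more compact.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet

# Create a file. Git can see it, but isn't keeping a history for it yet.
echo "some notes" > your_file.txt

# Files Git doesn't know about are listed with "??".
git status --short

# Tell Git about the file.
git add your_file.txt

# Now it is listed with "A": added, and ready to go into the next commit.
git status --short
```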
Other computers won't get your changes automatically - you have to run "git pull" on them to make them download your changes.
Common Git commands
|Create a Git repository||Move into the directory you want, then run: git init|
|See what files have changed||git status|
|See how the files have changed||git diff|
|Tell Git to keep track of a file||git add your_file.txt|
|Create a checkpoint||git commit -m "Describe what you changed"|
|Send your commits to GitHub/GitLab||git push|
|Download someone else's commits||git pull|
Some terms you'll hear come up in discussions about Git are "merging", "merge conflicts", and "conflict resolution".
The thing about everyone having their own copy of a repository, and being able to work independently in them, is that when you want to download someone else's commits, Git has to figure out how to combine their history with yours. (This is called "merging"). Sometimes it's not clear how it should do that.
Here's the situation: You and I both have a copy of a repository on our computers. Say we both modify the same part of the file, like maybe changing a message that gets printed out by our program.
- Original message: "Welcome to our site!"
- You changed it to: "Welcome to Our Site!"
- I changed it to: "Welcome to oursite.com!"
If you commit and push your changes to GitHub/GitLab first, well, good for you. Git's happy with whatever you did.
But when I come after you and commit and push my changes, Git will refuse: the central repository has your commit, and my copy doesn't. When I pull to get your commit, Git will complain that it can't combine your history and my history. The same thing (the message in the program) was changed two different ways, and Git doesn't know which change it should keep, and which change it should throw away. This is called a "conflict".
Fixing this is called "conflict resolution". Git will save the file in a broken state - it'll have both of our changes in it, set apart by marker lines. You open up the file and modify it to look how you want (you could choose to keep your changes, or mine, or you could do something that combines the things each of us was trying to do).
Once that's complete, you create a new commit that saves the changed version of the file, and you try pushing again.
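You can reproduce the whole situation on one computer. In this sketch a bare repository in a temporary directory stands in for GitHub/GitLab, and the directories "you" and "me" are our two copies. Those names, the file message.txt, and the final combined greeting are all made up for the example. The "--no-rebase" option tells "git pull" to combine the histories by merging, which is what this section describes.

```shell
base=$(mktemp -d)
git init --quiet --bare "$base/central.git"   # stand-in for GitHub/GitLab

# Your copy: create the original message and share it.
git clone --quiet "$base/central.git" "$base/you"
cd "$base/you"
git config user.name "You"                 # placeholder identity
git config user.email "you@example.com"    # placeholder identity
echo "Welcome to our site!" > message.txt
git add message.txt
git commit --quiet -m "Add welcome message"
git push --quiet -u origin HEAD

# My copy, cloned before either of us changes anything.
git clone --quiet "$base/central.git" "$base/me"
git -C "$base/me" config user.name "Me"
git -C "$base/me" config user.email "me@example.com"

# You change the message and push first. Git is happy.
echo "Welcome to Our Site!" > message.txt
git commit --quiet -am "Capitalize the greeting"
git push --quiet

# I change the same line and commit.
cd "$base/me"
echo "Welcome to oursite.com!" > message.txt
git commit --quiet -am "Mention the domain"

# My push is refused: the central repository has a commit I don't have.
git push --quiet || echo "push rejected"

# Pulling tries to merge your commit into my history, and hits the conflict.
git pull --quiet --no-rebase || echo "merge conflict"

# message.txt now holds both versions, set apart by conflict marker lines.
cat message.txt

# Resolve it by writing the version we want, then commit and push again.
echo "Welcome to OurSite.com!" > message.txt
git add message.txt
git commit --quiet -m "Merge the two greetings"
git push --quiet
```

In real life you would open the file in an editor and delete the marker lines by hand; overwriting the file with echo is just the shortest way to show the resolved result.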
If you're using Git at a company that already uses it, you might see something called "commit hooks". This is the one place where automation can happen with Git. You may write programs that Git will run whenever you make a commit.
Many companies use these programs to do simple things like clean up extra whitespace, or force project ticket numbers to be referenced in the commit message. Some places go a little further and run their project's unit tests for every commit.
Anything you can build into a program can be triggered with a commit hook. But Git doesn't run any commit hooks by default.
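As an illustration of the ticket-number idea, here is a sketch of one kind of hook: a "commit-msg" hook, which Git runs with the path of the commit message file as its first argument, and which cancels the commit if it exits with an error. The ticket format (capital letters, a dash, and a number, like PROJ-123) is an assumption made up for this example; your company's format will differ.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

# Hooks are programs stored in the repository's .git/hooks directory.
mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
# Reject the commit unless the message mentions a ticket such as PROJ-123.
# (Hypothetical ticket format, chosen for this example.)
if ! grep -Eq '[A-Z]+-[0-9]+' "$1"; then
    echo "Commit message must reference a ticket (e.g. PROJ-123)." >&2
    exit 1
fi
EOF
chmod +x .git/hooks/commit-msg

echo "Welcome to our site!" > message.txt
git add message.txt

# Rejected: there is no ticket number in the message.
git commit --quiet -m "Add welcome message" || echo "commit rejected by hook"

# Accepted: the message references a ticket.
git commit --quiet -m "PROJ-123: Add welcome message"
```

The hook lives inside this one repository's .git directory and isn't copied by "git clone", so each copy of the repository has to have it installed separately.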