Maintained by: David J. Birnbaum (djbpitt@gmail.com) Last modified: 2022-03-06T17:32:58+0000
This tutorial describes five core Git commands: status, add (also rm), commit, push, and pull. It does not cover all features of these commands; it concentrates on the ways you are most likely to use them in your course projects.
Because you’ll be working on the command line, this tutorial also assumes that you already know about command-line navigation. If you are new to the command line, or uncertain about your knowledge, we recommend first working through parts 1–3 of the Software Carpentry tutorial about the Unix shell.
Your local Git repo has three types of spaces: the working area (sometimes called the working tree), the staging area (sometimes also called the index), and the local branch. You won’t understand how Git works without understanding the relationship among these areas, so take a moment now to read about them at Lucas Maurer’s Git Gud: The Working Tree, Staging Area, and Local Repo. The descriptions below assume that you have read this introduction.
There is also a fourth type of space in your local repo, a remote tracking branch, with which you interact explicitly primarily when you use git fetch. We won’t practice fetching here, but we’ll refer below to remote tracking branches where appropriate.
In our course you must do all work on the command line inside your local Git repo. Specifically:
Do not use a Git graphical client (GUI; graphical user interface). Graphical clients are easier for beginners, but you won’t be a beginner for very long, and once you become comfortable with the basic commands, the command line will be both faster and more capable.
Do not use the browser to download from or upload individual files to GitHub. After you clone a repo, the only way you should transfer information between your local machine and GitHub is by pushing (to upload) and pulling (to download).
Do not edit any files directly on GitHub, that is, while inside your browser looking at a file on GitHub. The only editing you should do is in the files inside your local repo on your own machine.
Do not work in other space on your system and then copy the files into your repo when you are ready to share them. Do all of your work inside your local repo.
Users often find Git disorienting at first because it is not like other systems we may know for sharing files, like Dropbox or Google Docs, but whatever the virtues of those cloud storage systems, Git is better for collaborative code development because that is what it was designed for. The commands below are enough to get you started, and you will become comfortable with them quickly as you use them regularly to develop your course projects. Don’t be shy about asking any of the instructors if you get stuck or confused!
Branches are different versions of a project that coexist, and Git allows you to work on one, then switch and work on another, and then merge branches to bring all of your work together. Every GitHub repo always has a main branch. A common use of branches might be that the main branch holds the latest stable version of your project, and you create separate branches to develop different features. Once each feature is stable, you merge its branch into the main branch and delete the feature branch. The reason for this workflow is that it keeps the main branch stable; you add new content to the main branch only after you have verified in your feature branch that it is working properly.
Some projects, especially small ones created by small teams, are developed entirely in the main branch, without creating feature branches to develop individual features. This simplifies your repo in obvious ways, at the expense of having different things going on in the main branch at the same time, some of which may be broken because you are still working on them. Whether your project team will use feature branches or work entirely in main is up to each team. We do not discuss here how to create and select feature branches; if you decide to use them, consult with your project mentor for more information.
When we use Git, we typically synchronize our local repo with a repo of the same name on GitHub, and our teammates do the same. GitHub thus serves as the middle man, so that we share new content with teammates by pushing our changes to GitHub, after which our teammates pull them from there, even as we pull our teammates’ changes after they push them. It might appear, then, as if each member of the team is working with two versions of the content: the local repo on the local file system (that is, the files and other information on your own computer) and the version on GitHub (which you can see by visiting GitHub in a web browser). In fact, though, we are working with three versions: 1) the version on GitHub (the remote branch), 2) what we know locally about what’s on GitHub (the local tracking branch), and 3) our local files (the local branch, including our working tree and our staging area). To keep these in sync we need to push new local work to GitHub, and we need to pull new work by our teammates from GitHub.
In Git terms the remote branch, the local tracking branch, and the local branch are all branches, which may be confusing because they may all appear to have the same name. For example, if you have a local branch called main, there is typically a branch called main on GitHub as well as a copy of that GitHub branch on your local machine called remotes/origin/main (your local tracking branch, which you update every time you fetch or pull from GitHub). Additionally, your local workspace may contain changes that are not part of your local branch or any other. Changes become part of your local branch only after you commit them, and before that they are located in either the working tree or the staging area. They are in your Git repo directory, but they aren’t part of any branch.
If you run git branch in your local repo, it will list all local branches on your machine, with an asterisk next to the one you are working in. In most cases in this course you won’t have any feature branches, so your only local branch will be called main and your output will look like:
* main
If you run git branch -a (the -a switch stands for ‘all’), you see all branches that your local machine knows about, which means your local branches plus any local tracking branches, that is, the state of branches with the same name on GitHub as of the last time you fetched or pulled. (This command also shows information about the location of your HEAD, which we won’t discuss here.) The output of git branch -a might look like:
* main
  remotes/origin/HEAD -> origin/main
  remotes/origin/main
Ignoring the line about HEAD for the moment, the line that reads simply main with a leading asterisk is your local branch called main, and remotes/origin/main is your local tracking branch, which records the state of the main branch on GitHub as of the last time you fetched or pulled. If your teammates have pushed new content since your last fetch or pull, those recent changes will not be reflected in remotes/origin/main until you fetch or pull again.
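The difference between the two listings can be tried out safely in a throwaway sandbox. The following is a minimal sketch, not part of any course repo: it assumes a bare repository in a temporary directory as a stand-in for GitHub, and the file name and user identity are hypothetical.

```shell
# Throwaway sandbox; the bare repo stands in for GitHub (an assumption for illustration).
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"
git init -q "$sandbox/local"
cd "$sandbox/local"
git symbolic-ref HEAD refs/heads/main        # make sure the first branch is called main
git config user.name "Example Student"       # hypothetical identity, needed to commit
git config user.email "student@example.com"
git remote add origin "$sandbox/remote.git"

echo "first note" > notes.txt
git add notes.txt
git commit -q -m "Add notes file"
git push -q -u origin main                   # also creates remotes/origin/main locally

local_branches=$(git branch)                 # local branches only
all_branches=$(git branch -a)                # local branches plus tracking branches
echo "$local_branches"
echo "$all_branches"
```

Because this sandbox was not created by cloning, the second listing contains main and remotes/origin/main but no remotes/origin/HEAD line.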
In the discussion below we will need to distinguish your local branch, your local tracking branch (what your machine knows about what’s in the corresponding branch on GitHub), and your remote branch (what’s actually in the corresponding branch on GitHub).
You can run git status at any time to get information about the current state of your local repo, including:
The branch you are on. This will usually be main (or, in older repos, master) unless you have created additional branches. In the discussion below, we’ll assume that you are working in the main branch, and that it is your only local branch.
Whether your current local branch, main, is up to date with the corresponding local tracking branch, remotes/origin/main, which is what your local machine knows about what’s on GitHub. Your local branch may be behind the version you last fetched or pulled from GitHub (in which case you can tell it to catch up with git pull or git merge), ahead of the version you last fetched or pulled from GitHub (in which case you can run git push to add your new information to the repository on GitHub and update your remote tracking branch), or up to date with the remote tracking branch, that is, the most recent version you fetched or pulled from GitHub. Your local branch may also, simultaneously, be both behind your local tracking branch (there is new work on GitHub that you have not yet merged into your local branch) and ahead (you have new work in your local branch that you have not yet pushed to GitHub).
Importantly, up to date does not mean up to date with what is on GitHub, because git status doesn’t look at GitHub; it compares the work in your local branch (e.g., main) with your remote tracking branch (e.g., remotes/origin/main). If your teammates have pushed anything to GitHub since your last fetch or pull, that newest information is not yet in your remote tracking branch, which means that git status won’t know about it. For this reason, it is helpful to pull often.
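To see that git status consults only the tracking branch, here is a hedged sketch with two hypothetical teammates, alice and bob, who share a bare repository that stands in for GitHub. The new_clone helper, the names, and the file are all invented for the illustration.

```shell
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/main

new_clone() {   # hypothetical helper: clone the stand-in remote for one teammate
  git clone -q "$sandbox/remote.git" "$sandbox/$1" 2>/dev/null
  git -C "$sandbox/$1" symbolic-ref HEAD refs/heads/main
  git -C "$sandbox/$1" config user.name "$1"
  git -C "$sandbox/$1" config user.email "$1@example.com"
}

new_clone alice
cd "$sandbox/alice"
echo "line 1" > shared.txt
git add shared.txt
git commit -q -m "Start shared file"
git push -q -u origin main

new_clone bob                        # bob starts with alice's first commit

echo "line 2" >> shared.txt          # alice keeps working and pushes again
git add shared.txt
git commit -q -m "Add second line"
git push -q

cd "$sandbox/bob"
before=$(git rev-list --count main..origin/main)  # commits bob knows he is behind
git status                           # still reports the branch as up to date
git fetch -q                         # refresh remotes/origin/main
after=$(git rev-list --count main..origin/main)
git status                           # now reports the branch as behind
echo "behind before fetch: $before, after fetch: $after"
```

Nothing changed in bob’s local branch between the two status reports; only his tracking branch was refreshed.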
If you have changes that are not in GitHub, git status will tell you which changed files:
Are ready to be pushed to GitHub, so that GitHub will catch up on any changes that you have committed locally with git commit but not yet pushed. (git status reports these as the number of commits by which your branch is ahead, rather than file by file.)
Are ready to be committed, that is, have been moved onto your staging area with git add but have not yet been committed.
Have changes not staged for commit, that is, the files have been committed previously, after which you changed them, but you haven’t yet run git add to move your changes to those files into your staging area.
Are untracked, that is, have never been added or committed.
For example, when I run git status in one of my local repos, I see:
On branch verb-xproc-for-each
Your branch is ahead of 'origin/verb-xproc-for-each' by 1 commit.
  (use "git push" to publish your local commits)
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   pos/verb/paradigm-values.md
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   pos/verb/verb-generate.xspec
        modified:   pos/verb/verb-lib.xspec
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        pos/verb/modules/verb-2b.xsl
Reading from the top down, this status message reports the following:
I am working in a local feature branch called verb-xproc-for-each. If you have not created additional branches, you will normally be on a branch called main.
My verb-xproc-for-each branch is ahead of my remotes/origin/verb-xproc-for-each local tracking branch, that is, ahead of what my machine knows about what’s on GitHub. This means that I have made one local commit that I have not yet pushed to GitHub. If my teammates have pushed new content to GitHub since my last fetch or pull, my machine doesn’t know about that yet.
I have staged changes in one file that I have not yet committed. This means that I have edited this file and run git add to move it into my staging area, but I have not yet run git commit to save it into my local repository.
I have unstaged changes (changes not staged for commit) in two files, which means that the files were committed previously and I then edited them in my working tree, but I have not yet moved the edited versions into my staging area with git add.
I have one untracked file, which is a new file that I created in my working tree but have never added to the staging area.
Run git status as often as you’d like to keep track of the state of your work. We recommend running it before adding, committing, pushing, or pulling, and looking at the results to ensure that you are about to do what you think you are about to do. Note that running git status prompts you about things you might want to do next.
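The three file states described above can be reproduced in a throwaway repo. This is a minimal sketch with hypothetical file names; it also prints the compact git status --short form, which this tutorial does not otherwise cover, because its two-character codes make the states easy to compare.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
git config user.name "Example Student"       # hypothetical identity, needed to commit
git config user.email "student@example.com"

echo "v1" > edited.txt
echo "v1" > staged.txt
git add edited.txt staged.txt
git commit -q -m "Add two files"

echo "v2" >> staged.txt
git add staged.txt               # staged: listed under "Changes to be committed"
echo "v2" >> edited.txt          # edited, not added: "Changes not staged for commit"
echo "new" > untracked.txt       # never added: listed under "Untracked files"

git status                       # the long report discussed above
short=$(git status --short)      # compact form: one line per file
echo "$short"
```

In the short form, a letter in the first column describes the staging area, a letter in the second column describes the working tree, and ?? marks an untracked file.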
git add, followed by a directory or file name, moves the specified files from the working tree into the staging area. git add is the only command discussed here that can refer to individual files and directories, so if you want to commit or push only some new or changed files and not others, you have to specify that at the moment when you add the files.
We commonly use git add in the following ways:
git add plus a filename adds a file to the staging area, so that it is ready to be committed.
git add plus a directory name adds all new content in the directory (and everything below it, if there are subdirectories) to the staging area.
git add . (the dot means ‘current directory’) adds all new content in the current directory and below; if you are in the top-level directory of your repo, it adds all new content in the repo.
git add -A adds all new content (new files and changed files) anywhere in the local repo, regardless of where you are located. This differs from the preceding command, which only adds new content in the current working directory and below.
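The difference between git add . and git add -A only shows up when you are not in the top-level directory. A small sketch in a throwaway repo (directory and file names are hypothetical) makes it visible by listing the staging area after each command:

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
mkdir docs
echo "top" > top.txt
echo "inner" > docs/inner.txt

cd docs                                        # work from inside a subdirectory
git add .                                      # stages docs/ and below only
after_dot=$(git diff --cached --name-only)     # names of the staged files
git add -A                                     # stages everything in the repo
after_all=$(git diff --cached --name-only)

echo "after 'git add .':"
echo "$after_dot"
echo "after 'git add -A':"
echo "$after_all"
```

After git add . only docs/inner.txt is staged; top.txt joins it only after git add -A.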
Using git add -A will ensure that you haven’t forgotten to add changes you want to add, but you don’t always want to add everything that you’ve changed at once (see below, under the discussion of commits, for details). Don’t get in the habit of using git add -A automatically; when you are about to use it, run git status first and verify that you want to stage (add) and commit all of your changes at once. If you accidentally add something you wanted to add only later, or not at all, you can remove it from the staging area before you commit; git status will tell you how.
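As one illustration of taking a file back out of the staging area, the sketch below uses git restore --staged, the command that the git status report quoted earlier suggests. It assumes Git 2.23 or later (older versions suggest a different command), and the file names are hypothetical.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
git config user.name "Example Student"       # hypothetical identity, needed to commit
git config user.email "student@example.com"

echo "v1" > ready.txt
echo "v1" > not-yet.txt
git add -A
git commit -q -m "Add two files"

echo "v2" >> ready.txt
echo "v2" >> not-yet.txt
git add -A                          # oops: this stages both files
git restore --staged not-yet.txt    # unstage one of them; the edit itself is kept
staged=$(git diff --cached --name-only)
echo "still staged: $staged"
git status --short
```

Unstaging does not discard your work: not-yet.txt still contains the new line, and it simply returns to the changes not staged for commit group.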
Note that:
You should create a .gitignore file in the main directory of your repo (it must have exactly this name, with the leading dot and no filename extension) and commit and push it. See How to use a .gitignore file. Your .gitignore should list at least .DS_Store (note the leading dot); this is an annoying macOS housekeeping file that can confuse Git about whether your files have changed, so you want to tell Git to ignore it.
To remove files or directories from your repo, use git rm instead of the regular shell deletion commands: git rm with the filename for a single file, and git rm -r with the directory name for a directory with content (add -f to force removal of files that have uncommitted changes, and be careful with this one!). There is no git rmdir command; Git does not track empty directories, so an empty directory can be removed with the regular shell rmdir. If you use the plain shell rm instead of git rm, you’ll remove the content from your local filesystem, but you won’t have told Git that you’ve removed it, much as if you create a new file you haven’t told Git about it until you run git add. This means that this type of deletion will be an unstaged change until you run git rm, so you might as well do that from the start to delete the file and notify Git with a single command.
Removing content from a repo in this way does not remove it from the Git history, and that means that you (or anyone who has cloned your repo) can restore it later. This is a feature; Git is designed to provide version control, and you want it to maintain a full history of every change you ever make to your repo. Should you need to remove something permanently and without a trace (for example, if you accidentally upload confidential information that should not have been shared), you need to rewrite your Git history, and not just delete the file, to make it irretrievable (ask your instructors for advice if you need to do this).
Because of the way Git tracks changes, you cannot create and commit empty directories. If you want to add a directory that will eventually hold content, but that doesn’t contain anything yet, create a small placeholder file and remove it later (with git rm) once you’ve added real content.
You can call a placeholder file whatever you want, and it can have any content, since you’re just putting it there to enable Git to see the directory, and you’re going to remove it later. The convention, though, is to create an empty (zero-byte) file called .gitkeep (with a leading dot). You can do that by running the command touch .gitkeep in the directory where you want to create it.
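The notes above about .gitignore, git rm, and .gitkeep can be tried together in a throwaway repo. This is a sketch under assumptions: the file and directory names are hypothetical, and the .DS_Store file is created by hand only to simulate what macOS would leave behind.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
git config user.name "Example Student"       # hypothetical identity, needed to commit
git config user.email "student@example.com"

echo ".DS_Store" > .gitignore       # tell Git to ignore the housekeeping file
touch .DS_Store                     # simulate the file macOS creates
echo "draft" > old.txt
mkdir images
touch images/.gitkeep               # placeholder so Git can see the empty directory
git add -A
git commit -q -m "Add .gitignore, draft, and images placeholder"
tracked=$(git ls-files)             # everything Git now tracks
echo "$tracked"

git rm -q old.txt                   # deletes the file and stages the deletion
after_rm=$(git status --short)      # D in the first column: a staged deletion
echo "$after_rm"
```

The list of tracked files includes images/.gitkeep but not .DS_Store, and after git rm the deletion is already staged and ready to commit.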
We recommend committing by running git commit -m "commit message goes here", that is, always specifying the commit message as part of the command. We explain why below.
git commit moves all files in your staging area (you put them there with git add) into your local branch, that is, it makes them part of the history of the project. All commits must be accompanied by a commit message, typically a single line that describes what the changes do, possibly also with more detailed information. The quality of your commit message matters because Git keeps a history of all commits, and if you later want to undo a change, the commit message will help you find it. Stop now to read Chris Beams’s How to write a Git commit message.
When you run git commit you cannot specify that you want to commit some files and not others; git commit always commits everything in the staging area at once. The way you control which files to commit is with the add instruction; you should add only the files you want to commit together, with a shared commit message, and then run git commit to commit all of them. This is why you don’t want to run git add -A carelessly. If, for example, you work on two different features at once, you should add only the files from one feature, commit them with a meaningful message, and then add the files from the other and commit them separately, with a separate meaningful commit message.
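The two-feature workflow just described might look like the following in a throwaway repo. The feature names, file names, and commit messages are hypothetical.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
git config user.name "Example Student"       # hypothetical identity, needed to commit
git config user.email "student@example.com"

echo "nav v1" > nav.txt
echo "search v1" > search.txt
git add -A
git commit -q -m "Add navigation and search stubs"

echo "nav v2" >> nav.txt            # work on one feature...
echo "search v2" >> search.txt      # ...and on another, in the same session

git add nav.txt                     # stage only the first feature
git commit -q -m "Expand navigation"
git add search.txt                  # then stage and commit the second
git commit -q -m "Expand search"

git log --oneline                   # one line per commit, newest first
last_files=$(git diff --name-only HEAD~1 HEAD)   # files changed by the last commit
commit_count=$(git rev-list --count HEAD)
```

Each feature ends up in its own commit with its own message, so either one can later be found, and if necessary undone, without touching the other.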
If you run git commit with the -m switch, followed by text in quotation marks, that text becomes your commit message. This is the easiest way to add the message. If you run just git commit by itself, Git will open an editor so that you can compose your commit message separately. By default Git uses an editor called Vim, which can be confusing to new users, and that’s the editor that will open unless you’ve configured your instance of Git to use a different editor (if you can’t find instructions online that make sense about how to do this, ask your project mentor for help). If you accidentally forget the -m switch, then, you’ll find yourself in the Vim editing environment, and possibly confused about how to get out. To get out of Vim, hit the Escape key followed by a colon (your cursor should move to the bottom line when you do that), then a lower-case q, and then the Enter key. (If Vim objects that you have unsaved changes, type q! instead of q.) This should deposit you back at the command line and display a message that the commit has been aborted. You can then try again with the -m switch.
The commit command also has an -a switch, which can be combined with the -m switch. The -a switch adds all changed files in the repo and commits them as a single operation, so if you use it, you don’t need to use git add first. git commit -a differs, though, from git add -A followed by git commit because git commit -a adds all changed files, but not any new files, that is, files that have never been added or committed (the technical term for these is untracked files, and you’ll see them referred to that way when you run git status). If you are tempted to use git commit -a, you should first run git status to verify that it will add all the files you want to add, and no others.
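A short sketch in a throwaway repo shows what git commit -a leaves behind. The file names and messages are hypothetical.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
git config user.name "Example Student"       # hypothetical identity, needed to commit
git config user.email "student@example.com"

echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked file"

echo "v2" >> tracked.txt            # change to a file Git already tracks
echo "new" > brand-new.txt          # new file that has never been added

git commit -q -a -m "Update tracked file"   # stages and commits tracked changes only
leftover=$(git status --short)      # what is still uncommitted afterwards
echo "$leftover"
```

The change to tracked.txt is committed, but brand-new.txt is still reported as untracked; it needs an explicit git add.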
You can see a history of commits with git log. There are ways to customize the output of this command, and we won’t go into detail here, but if you need to revert a commit (that is, undo a change you committed), you can use git log to find the commit you want to revert by its commit message (see why meaningful commit messages matter?!) and then use the appropriate Git commands to revert it.
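One command that can undo a committed change is git revert, which adds a new commit that reverses an earlier one. The tutorial does not say which command your team should use, so treat the following as a sketch of one option in a throwaway repo, with hypothetical file names and messages, and check with your project mentor before reverting in a shared repo.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
git config user.name "Example Student"       # hypothetical identity, needed to commit
git config user.email "student@example.com"

echo "good" > page.txt
git add page.txt
git commit -q -m "Add page"
echo "mistake" >> page.txt
git add page.txt
git commit -q -m "Add experimental paragraph"

git log --oneline                                     # browse the history
bad=$(git log -1 --format=%H --grep="experimental")   # find the commit by its message
git revert --no-edit "$bad" > /dev/null               # new commit that undoes it
content=$(cat page.txt)
git log --oneline                                     # the revert appears as a third commit
```

The original commit stays in the history; the revert is recorded as an additional commit, so nothing is lost.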
git push uploads to GitHub the changes you have committed to your local branch. This is how you share your work with your teammates, who then fetch or pull those changes down to their local machines. git push pushes all commits at the same time. That is, just as git commit commits all changes you have moved to the staging area with git add, and you can’t pick and choose among files you want to commit at the commit stage, git push pushes all commits you have made, so you can’t pick and choose among commits at the push stage. In other words, if you want to share only some of your work at a time, you have to start thinking about that when you add.
If your teammates have pushed content to GitHub since your last fetch or pull, Git will refuse to let you push until you first fetch or pull that new content. If you get a message to that effect, run git pull and then try again to push.
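The refused push and its remedy can be rehearsed with two hypothetical teammates and a bare repository standing in for GitHub. One assumption to note: the sketch runs git pull with --no-rebase and --no-edit so that it merges without opening an editor; recent Git versions may otherwise ask you to choose how to reconcile the two histories. The new_clone helper and all names are invented for the illustration.

```shell
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/main

new_clone() {   # hypothetical helper: clone the stand-in remote for one teammate
  git clone -q "$sandbox/remote.git" "$sandbox/$1" 2>/dev/null
  git -C "$sandbox/$1" symbolic-ref HEAD refs/heads/main
  git -C "$sandbox/$1" config user.name "$1"
  git -C "$sandbox/$1" config user.email "$1@example.com"
}

new_clone alice
cd "$sandbox/alice"
echo "alice 1" > alice.txt
git add alice.txt
git commit -q -m "Add alice file"
git push -q -u origin main
new_clone bob

echo "alice 2" >> alice.txt          # alice pushes more work
git add alice.txt
git commit -q -m "Extend alice file"
git push -q

cd "$sandbox/bob"                    # bob commits without pulling first
echo "bob 1" > bob.txt
git add bob.txt
git commit -q -m "Add bob file"
if git push -q 2>/dev/null; then first_push=accepted; else first_push=rejected; fi
git pull -q --no-rebase --no-edit    # fetch alice's commit and merge it
if git push -q 2>/dev/null; then second_push=accepted; else second_push=rejected; fi
echo "first push: $first_push, second push: $second_push"
```

The first push is refused because the remote has a commit bob has not seen; after the pull, the same push goes through.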
git pull does two things at once: it fetches changes from GitHub to your local version of the corresponding GitHub branch (your local tracking branch, i.e., remotes/origin/main) and it merges your local tracking branch into your local branch. This is how you get your teammates’ work into your local branch. You can fetch and merge separately, and you can execute other types of merges (that is, not only between branches that share a name), but most commonly in course projects git pull will do what you want.
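To make the two halves of a pull visible, the sketch below performs them separately with git fetch and git merge. As before, it assumes a bare repository in a temporary directory as a stand-in for GitHub, and the new_clone helper, teammate names, and file are hypothetical.

```shell
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/main

new_clone() {   # hypothetical helper: clone the stand-in remote for one teammate
  git clone -q "$sandbox/remote.git" "$sandbox/$1" 2>/dev/null
  git -C "$sandbox/$1" symbolic-ref HEAD refs/heads/main
  git -C "$sandbox/$1" config user.name "$1"
  git -C "$sandbox/$1" config user.email "$1@example.com"
}

new_clone alice
cd "$sandbox/alice"
echo "line 1" > shared.txt
git add shared.txt
git commit -q -m "Start shared file"
git push -q -u origin main
new_clone bob

echo "line 2" >> shared.txt          # alice pushes a second commit
git add shared.txt
git commit -q -m "Add second line"
git push -q

cd "$sandbox/bob"
git fetch -q                         # step 1: update remotes/origin/main
fetched_only=$(wc -l < shared.txt)   # bob's working file is still unchanged
git merge -q origin/main             # step 2: merge the tracking branch into main
merged=$(wc -l < shared.txt)         # now the file has alice's new line
```

After the fetch alone, bob’s copy of shared.txt still has one line; only the merge brings the second line into his local branch.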
When you pull you may get an error report about a merge conflict. Merge conflicts happen when the version of a file that you pull from GitHub differs from your local version and Git doesn’t know which version you want to keep as the result of the merge. Git can merge versions of the same file with different changes as long as the changes are on different lines, but because Git tracks changes at the level of lines, you’ll raise a merge conflict if the two versions have different changes on the same line. Here’s how to think about merge conflicts:
Most merge conflicts arise because of poor Git hygiene, that is, from failing to pull before every working session and push at the end of every working session. The more time that elapses between the pull and push activities that synchronize your local files with those on GitHub, the greater the opportunity for you and your teammates to edit the same lines in the same file and cause a merge conflict.
Even with proper Git hygiene, merge conflicts may arise through accidental bad timing. Should you encounter a merge conflict, don’t panic, but do ask your project mentor for guidance in resolving it. All developers encounter merge conflicts occasionally, and they are not difficult to recover from, but they are a nuisance, so you should be mindful about pushing and pulling regularly in order to avoid them as much as possible.
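If you would like to see what a merge conflict looks like before you meet one in a real project, the following sketch provokes one on purpose in a throwaway sandbox and then backs out with git merge --abort, which returns the branch to its state before the pull. It is an illustration only, not a substitute for your mentor’s guidance on resolving conflicts; the new_clone helper, names, and file contents are hypothetical, and --no-rebase is an assumption that makes the pull merge on recent Git versions.

```shell
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/main

new_clone() {   # hypothetical helper: clone the stand-in remote for one teammate
  git clone -q "$sandbox/remote.git" "$sandbox/$1" 2>/dev/null
  git -C "$sandbox/$1" symbolic-ref HEAD refs/heads/main
  git -C "$sandbox/$1" config user.name "$1"
  git -C "$sandbox/$1" config user.email "$1@example.com"
}

new_clone alice
cd "$sandbox/alice"
echo "original line" > shared.txt
git add shared.txt
git commit -q -m "Start shared file"
git push -q -u origin main
new_clone bob

echo "alice version" > shared.txt    # alice rewrites the line and pushes
git add shared.txt
git commit -q -m "Reword line (alice)"
git push -q

cd "$sandbox/bob"
echo "bob version" > shared.txt      # bob rewrites the same line differently
git add shared.txt
git commit -q -m "Reword line (bob)"
if git pull -q --no-rebase --no-edit > /dev/null 2>&1
then outcome=merged
else outcome=conflict
fi
conflicted=$(git status --short)     # UU marks a file with an unresolved conflict
echo "$outcome: $conflicted"
git merge --abort                    # back out; bob's own commit is untouched
after_abort=$(git status --short)
```

Both teammates changed the same line, so Git cannot choose between the versions and stops the merge instead of guessing.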