Version Control

using git, for researchers

Thanks to Software Carpentry

Today's Goals

Single person, no internet

Single person, backed up online

Multiple people, collaborative

Why version control using git?

  • Version history ("undo" over larger periods of time)
    Like Dropbox!
  • Backup (in case of your computer failing)
    Like Dropbox!
  • Sharing with collaborators (rather than using email)
    Like Dropbox!

but git is different...

  • Ensure collaborators never overwrite each other's changes
  • Comparing what has changed between specific revisions
  • Full version history on your local computer
  • Collaborators have full version history on their local computer
  • Individual changes have labels
  • Individual changes have descriptions
  • Each part of the document can be attributed to the individual that introduced it
  • Each part of the document can be traced to the date it was introduced
  • Each part of the document is associated with a description of why it was introduced
  • Backup in multiple places
  • Needn't trust a single organisation with your backups
  • Choice between the services/tools you use to manage versions
  • Tooling allows people to discuss changes before incorporating them
  • Tooling makes it easier to share results with the world
  • Multiple versions of the same document coexist simultaneously
  • Many different advanced workflows to choose from for more complex projects
  • ...

How Dropbox thinks of files...

Each revision is an entire document

How Dropbox thinks of files...

No concept of what changed between revisions

How git thinks of files...

A document is a series of changes

How git thinks of files...

Each change either adds, removes, or modifies some number of lines

How git thinks of files...

git constructs documents by applying series of changes

"Commits"

Each of these changes is called a "commit" and contains:

  • What lines were added/removed/modified
  • Author
  • Date
  • Description
  • The change that came before it

This is important

We will talk about this at every opportunity

Let's begin

Boring stuff first, that needs to be done once

Exercise: Install git

https://git-scm.com/downloads

Exercise: Configure git

Git needs to know about you to attribute your changes


git config --global user.name "Peter Serwylo"
git config --global user.email "peter.serwylo@monash.edu"

If you use Windows and are unfamiliar with vim:


git config --global core.editor notepad

If you use macOS and are unfamiliar with vim:


git config --global core.editor "open -W -n"

The Situation

Dracula, Wolfman, and the Mummy are researching a move to another planet and want to share their research

Creating a "Repository"

Repositories store all the commits for a project.

From the directory you wish to turn into a git repo:

git init

Initialized empty Git repository in .../.git/
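
To see what git init actually creates, here is a minimal sketch. It assumes a throwaway directory made with mktemp (not part of the original exercise) so nothing important is touched.

```shell
# Work in a throwaway directory (an assumption of this sketch).
repo="$(mktemp -d)"
cd "$repo"

# Turn the directory into a git repository.
git init

# The repository itself lives in a hidden ".git" folder.
ls -a
```

Deleting the .git folder would delete the repository and its history, but not your files.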

What does git know so far?

Nothing.

git only saves to the repository when we explicitly ask it to create a commit

What does git know so far?

git log:   Show commits in your repository

git log

fatal: your current branch 'master' does not have any commits yet

What does git know so far?

git status:   Show changes in your directory that are yet
to be saved to the repository in a commit

git status

On branch master

Initial commit

nothing to commit (create/copy files and use "git add" to track)

Exercise: Your first repo

Create a repo using git init.

Run git status and git log to ask about your repository.

Add content for git to track

Put the following text in a file "mars.txt"


Cold and dry, but everything is my favourite colour

Add content for git to track

Ask git what it knows now...

git status

On branch master

Initial commit

Untracked files:
 (use "git add <file>..." to include in what will
  be committed)

	mars.txt

nothing added to commit but untracked files present (use
"git add" to track)
								

Add content for git to track

Tell git we want to prepare (but not yet create) a commit


git add mars.txt
								

Then ask git again:

git status

On branch master

Initial commit

Changes to be committed:
 (use "git rm --cached <file>..." to unstage)

	new file:   mars.txt
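
A compact way to watch a file move from "untracked" to "prepared for a commit" is git status --short. The sketch below assumes a scratch repository created with mktemp; the file name and text are the ones used above.

```shell
# Scratch repository (an assumption of this sketch).
repo="$(mktemp -d)"
cd "$repo"
git init

echo "Cold and dry, but everything is my favourite colour" > mars.txt

# Before "git add": "??" marks a file git is not tracking.
before="$(git status --short)"
echo "$before"

git add mars.txt

# After "git add": "A" marks a new file prepared for the next commit.
after="$(git status --short)"
echo "$after"
```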
									

Add content for git to track

Now a commit can be created

git commit

Enter a description of this commit in your text editor
(e.g. "First entry about Mars.") then git will output:


[master (root-commit) e95d869] First entry about Mars.
1 file changed, 1 insertion(+)
create mode 100644 mars.txt
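
If you would rather not open an editor, the description can be given on the command line with -m. A minimal sketch, assuming a scratch repository and a placeholder name and email (both hypothetical, set for this repository only):

```shell
# Scratch repository (an assumption of this sketch).
repo="$(mktemp -d)"
cd "$repo"
git init

# Placeholder identity, stored in this scratch repository only.
git config user.name "Example Researcher"
git config user.email "researcher@example.com"

echo "Cold and dry, but everything is my favourite colour" > mars.txt
git add mars.txt

# -m supplies the description directly, so no editor opens.
git commit -m "First entry about Mars."

# Read back the description of the newest commit.
subject="$(git log -1 --format=%s)"
echo "$subject"
```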
								

What does git know now?

git log:   Show commits in your repository

git log

commit e95d8690fcad1374a3b0d1a0231ef63118367b28
Author: Peter Serwylo <peter@serwylo.com>
Date:   Mon Jul 11 16:33:24 2016 +1000

    First entry about Mars.
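
git log can also be asked for just the pieces of a commit you care about. The sketch below pulls out the author, date, and description one at a time; the scratch repository and the "Example Researcher" identity are assumptions made for illustration.

```shell
# Scratch repository with a single commit (assumptions of this sketch).
repo="$(mktemp -d)"
cd "$repo"
git init
git config user.name "Example Researcher"
git config user.email "researcher@example.com"

echo "Cold and dry, but everything is my favourite colour" > mars.txt
git add mars.txt
git commit -m "First entry about Mars."

# One line per commit: short label plus description.
git log --oneline

# Pick out individual fields of the newest commit.
author="$(git log -1 --format='%an <%ae>')"   # author name and email
when="$(git log -1 --format=%ad)"             # author date
what="$(git log -1 --format=%s)"              # description (first line)
echo "$author | $when | $what"
```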

What does git know now?

git status:   Show changes in your directory that are yet
to be saved to the repository in a commit

git status

On branch master
nothing to commit, working directory clean
								

Exercise: Your first commit

Create a text file and add some content to it

Note: Make sure to only use text editors such as notepad or vim. Don't use MS Word or other word processors.

Prepare a commit using git add

Create a commit and save it to the repo using git commit

Ask git about the commits in your repo using git log.

More changes

Add the following text to "mars.txt"


The two moons may be a problem for Wolfman

The full contents of "mars.txt" should now be:


Cold and dry, but everything is my favourite colour
The two moons may be a problem for Wolfman
									

More changes

Ask git what it knows now...

Remember, this command can and should be run as often as you want

git status

On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in
   working directory)

	modified:   mars.txt

no changes added to commit (use "git add" and/or "git commit -a")
								

More changes

Ask git what has changed since our last commit

git diff

diff --git a/mars.txt b/mars.txt
index 382ccd9..8f7b000 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1 +1,2 @@
 Cold and dry, but everything is my favourite colour
+The two moons may be a problem for Wolfman
								

More changes

Tell git we want to prepare (but not yet create) a commit


git add mars.txt
								

Then ask git again:

git status

On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   mars.txt
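
Once a change has been prepared with git add, plain git diff no longer shows it. Adding --staged shows what is waiting to go into the next commit. A minimal sketch, assuming a scratch repository and a placeholder identity:

```shell
# Scratch repository with one commit (assumptions of this sketch).
repo="$(mktemp -d)"
cd "$repo"
git init
git config user.name "Example Researcher"
git config user.email "researcher@example.com"

echo "Cold and dry, but everything is my favourite colour" > mars.txt
git add mars.txt
git commit -m "First entry about Mars."

# Make a change and prepare it for the next commit.
echo "The two moons may be a problem for Wolfman" >> mars.txt
git add mars.txt

# Changes that are NOT yet prepared: nothing left to show.
unstaged="$(git diff)"

# Changes that ARE prepared for the next commit.
staged="$(git diff --staged)"
echo "$staged"
```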
									

More changes

Now a commit can be created

git commit

Enter a description of this commit in your text editor
then git will output:


[master c2d0b8b] Info about moons and their impact on Wolfman.
 1 file changed, 1 insertion(+)
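
Remember that every commit records "the change that came before it". The sketch below checks this by asking the second commit for its parent; the scratch repository and placeholder identity are assumptions.

```shell
# Scratch repository (an assumption of this sketch).
repo="$(mktemp -d)"
cd "$repo"
git init
git config user.name "Example Researcher"
git config user.email "researcher@example.com"

echo "Cold and dry, but everything is my favourite colour" > mars.txt
git add mars.txt
git commit -m "First entry about Mars."

# Remember the label (hash) of the first commit.
first="$(git rev-parse HEAD)"

echo "The two moons may be a problem for Wolfman" >> mars.txt
git add mars.txt
git commit -m "Info about moons and their impact on Wolfman."

# %P prints the parent of a commit: it should be the first commit.
parent="$(git log -1 --format=%P)"
echo "first:  $first"
echo "parent: $parent"
```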
								

Exercise: Another commit

Edit your text file, adding more content.

Check that git has noticed the change with git status

Ask specifically what has changed with git diff

Prepare and create another commit
(git add, git commit, git log)

Exploring History

The goals include:

  • Comparing different revisions
  • Reverting to past revisions

Comparing revisions

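  • Edit the file again, but don't commit
  • git diff: what has changed since the last commit
  • git diff HEAD~1: what has changed since the commit before the last one
  • git diff [sha hash]: what has changed since a specific commit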
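
The three comparisons above can be sketched as follows. The scratch repository, the placeholder identity, and the third line of text are assumptions made for illustration.

```shell
# Scratch repository with two commits (assumptions of this sketch).
repo="$(mktemp -d)"
cd "$repo"
git init
git config user.name "Example Researcher"
git config user.email "researcher@example.com"

echo "Cold and dry, but everything is my favourite colour" > mars.txt
git add mars.txt
git commit -m "First entry about Mars."
first="$(git rev-parse HEAD)"    # hash of the first commit

echo "The two moons may be a problem for Wolfman" >> mars.txt
git add mars.txt
git commit -m "Info about moons and their impact on Wolfman."

# Edit again, but do not commit (hypothetical extra line).
echo "But the Mummy will appreciate the lack of humidity" >> mars.txt

# Compared with the last commit: one new line.
since_last="$(git diff)"

# Compared with the commit before the last one: two new lines.
since_first="$(git diff HEAD~1)"

# Same comparison, naming the commit by its hash instead.
by_hash="$(git diff "$first")"

echo "$since_first"
```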

Reverting to past revisions

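  • While the change is still uncommitted...
  • git checkout -- file: discard the change and restore the last committed version
  • git checkout [sha hash] -- file: restore the file as it was in a specific commit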
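
A minimal sketch of both kinds of revert, assuming a scratch repository, a placeholder identity, and a made-up "unwanted" edit:

```shell
# Scratch repository with two commits (assumptions of this sketch).
repo="$(mktemp -d)"
cd "$repo"
git init
git config user.name "Example Researcher"
git config user.email "researcher@example.com"

echo "Cold and dry, but everything is my favourite colour" > mars.txt
git add mars.txt
git commit -m "First entry about Mars."
first="$(git rev-parse HEAD)"    # hash of the first commit

echo "The two moons may be a problem for Wolfman" >> mars.txt
git add mars.txt
git commit -m "Info about moons and their impact on Wolfman."

# An uncommitted edit we decide we do not want (hypothetical).
echo "An unwanted edit" >> mars.txt

# Throw the edit away: back to the last committed version.
git checkout -- mars.txt
after_discard="$(cat mars.txt)"

# Go further back: the file as it was in the first commit.
# git also prepares this older version for the next commit.
git checkout "$first" -- mars.txt
after_revert="$(cat mars.txt)"

echo "$after_revert"
```

Careful: git checkout -- file discards uncommitted work for good, so run git diff first to see what you are about to lose.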

That concludes phase 1

Single person project, without internet

Let's go online

Your repo has a series of commits

A "remote" repository also has a series of commits

Looks just like a local repository

Online services for git repos

GitHub (https://github.com)

GitLab (https://gitlab.com)

Bitbucket (https://bitbucket.com)

Sign up for GitLab

Create a new "Project"

Empty repo

They've effectively done the following on their server:

git init

Connecting your repo to GitLab's

We use a git concept called a "remote"

Remotes tell git where to look for commits other than your local repo

git remote add origin https://gitlab.com/pserwylo/learning-git.git

Connecting your repo to GitLab's

Verify you told git about the GitLab remote:

git remote
origin

Add the -v (for "verbose") option for more detail:

git remote -v

origin	https://gitlab.com/pserwylo/learning-git.git (fetch)
origin	https://gitlab.com/pserwylo/learning-git.git (push)
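
A remote does not have to be online. To try the idea without an account, the sketch below uses a second folder on your own computer as a stand-in for GitLab. The folder names "server.git" and "laptop" are hypothetical, and the --bare option (a repository with history but no working files, which is typically how hosting services store them) is an assumption of this sketch rather than something shown above.

```shell
# Throwaway parent directory (an assumption of this sketch).
base="$(mktemp -d)"

# Stand-in for the hosting service: history only, no working files.
git init --bare "$base/server.git"

# Your own repository.
git init "$base/laptop"
cd "$base/laptop"

# Tell git where the other repository lives, under the name "origin".
git remote add origin "$base/server.git"

# List the remotes git knows about, with their locations.
git remote -v

names="$(git remote)"
url="$(git config --get remote.origin.url)"
```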
								

Pushing commits to the remote

What is in the remote initially?

Pushing commits to the remote

Ensure that all your local commits exist in the remote repo

git push origin master

Counting objects: 6, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (6/6), 545 bytes | 0 bytes/s, done.
Total 6 (delta 1), reused 0 (delta 0)
To https://gitlab.com/pserwylo/learning-git.git
 * [new branch]      master -> master
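
To check that a push really copied your commits, compare the newest commit on each side. The sketch below again uses a local folder as a stand-in for GitLab (the folder names and the placeholder identity are hypothetical). Because newer versions of git may not call the first branch "master", it asks git for the current branch name instead of assuming one.

```shell
# Throwaway parent directory (an assumption of this sketch).
base="$(mktemp -d)"
git init --bare "$base/server.git"    # stand-in for the remote
git init "$base/laptop"               # your own repository
cd "$base/laptop"
git config user.name "Example Researcher"
git config user.email "researcher@example.com"

echo "Cold and dry, but everything is my favourite colour" > mars.txt
git add mars.txt
git commit -m "First entry about Mars."

git remote add origin "$base/server.git"

# Name of the branch we are on ("master" in the slides).
branch="$(git symbolic-ref --short HEAD)"

# Copy our commits to the remote.
git push origin "$branch"

# Newest commit here, and newest commit on the remote.
local_tip="$(git rev-parse HEAD)"
remote_tip="$(git ls-remote origin "refs/heads/$branch" | cut -f1)"
echo "local:  $local_tip"
echo "remote: $remote_tip"
```

If the two labels match, the remote has everything you have.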

After pushing