Git and GitHub LiveLesson Review and Q&A with the Author

Git and GitHub LiveLessons, published by Addison-Wesley Professional, is a video course based on a live workshop given by Peter Bell. The course is slow-paced, with Peter taking great care to ensure that each new concept comes through clearly to his audience. No previous experience with Git or other source code management tools is assumed. Here, we present the course content and then put a few questions to the course's author, Peter Bell.

The course comprises 11 lessons, for an overall duration of about 5 hours.

First lesson: Git configuration

In the first lesson, Peter introduces Git's three levels of configuration and explains when to use each of them. Besides showing a few options that are likely to be useful often, Peter also explains how you can define command aliases.
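
As a minimal sketch of configuration levels and aliases (not taken from the course; the user name, e-mail address, and the alias name st are made up for illustration), the following sets repository-level values in a throwaway repo so that no per-user or system-wide setting is touched:

```shell
# Work in a throwaway repository so nothing outside it is modified.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Repository-level ("local") settings live in .git/config and take
# precedence over --global (per-user) and --system (per-machine) values.
git config --local user.name "Example Dev"
git config --local user.email "dev@example.com"

# An alias is just another configuration entry: "git st" now runs
# "git status --short". The alias name is hypothetical.
git config --local alias.st "status --short"

# Reading a value back returns the most specific level that defines it.
git config user.name
git st
```

Replacing --local with --global would write the same keys to the per-user configuration instead.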

Second lesson: getting started

This lesson starts from the very beginning: creating a local repo through git init, populating it with some files, and explaining very carefully how Git's staging area acts as a buffer before staged changes are committed to Git's history.

What is characteristic of Peter's presentation throughout all the lessons is the great care he takes to explain certain Git design decisions that might not be self-evident at first. This is the case, e.g., of Git's staging area, which is a key Git feature that makes it easier to write meaningful commit messages. Another example of Peter's capacity to give his listeners real insight into Git is his explanation of why commits are identified by hash-based unique IDs (SHA-1 hashes) rather than by a simple counter, as SVN does, and how this is essential to Git's behaviour.

Commands reviewed: init, add, commit, status, log.
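
The commands above can be sketched as follows. This is an illustration under assumptions, not the course's own example: the file names and identity are invented. It shows how the staging area lets you choose what goes into a commit:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"

echo "first draft" > notes.txt
echo "todo" > todo.txt

# Stage only notes.txt; todo.txt stays untracked and is left out of the commit.
git add notes.txt
git status --short
git commit -q -m "Add notes"

# The log lists the single commit, identified by its hash.
git log --oneline
```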

Third lesson: GitHub

GitHub is introduced as a way to keep an off-site backup of your code, and Peter shows how you can create a repo on GitHub and connect it to your local repo.

Commands reviewed: remote add origin, push, init.
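
A rough sketch of connecting a local repo to a remote follows. To keep it runnable without a GitHub account, a local bare repository stands in for the GitHub repo; that substitution, the paths, and the file name are assumptions of this example. With GitHub, the URL passed to git remote add would be the one shown on the new repo's page.

```shell
work=$(mktemp -d)

# A local bare repository plays the role of the GitHub repo in this sketch.
git init -q --bare "$work/remote.git"

git init -q "$work/local"
cd "$work/local"
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"

# Register the remote under the conventional name "origin" and publish
# the current branch, whatever its default name is on this machine.
git remote add origin "$work/remote.git"
branch=$(git symbolic-ref --short HEAD)
git push -q -u origin "$branch"
git remote -v
```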

Fourth lesson: renaming, deleting, and .gitignore

The fourth lesson covers the basics of working with files in Git and shows how you can rename a file inside the staging area or remove it altogether. Additionally, .gitignore is introduced to identify files that Git should ignore.

Commands reviewed: mv, rm, .gitignore, commit -a.
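
As an illustration of these commands (the file names and ignore pattern are invented for this sketch), a rename, a removal, and an ignore rule could look like this:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
echo "a" > old-name.txt
echo "b" > obsolete.txt
git add .
git commit -q -m "Add two files"

# Rename one file and delete the other; both changes are staged automatically.
git mv old-name.txt new-name.txt
git rm -q obsolete.txt

# Ignore log files, then create one to check that Git does not report it.
echo "*.log" > .gitignore
echo "noise" > debug.log
git add .gitignore
git status --short
git commit -q -m "Rename, remove, and ignore logs"

# commit -a stages modifications to already tracked files before committing.
echo "more" >> new-name.txt
git commit -q -a -m "Update new-name.txt"
```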

Fifth lesson: branches, merges, and rebasing

With his usual slow pace and plenty of detail, Peter explains what a branch is and why you want to use one to manage the development of a feature. The full branch lifecycle is described hands-on, from creating a branch to merging it after solving all conflicts, and eventually deleting it.

Working with branches also offers a good opportunity to explain what information git log provides and how it represents a branch head, last commit, and last push.

Merging deserves a more thorough analysis, with a discussion of both fast-forward and recursive merges, including their general applicability and how to prevent a fast-forward merge so that the merge information is kept in the history. An important tool for merging is diff, which can be run to show staged changes, unstaged changes, or both; it also handles files in a mixed state, containing both staged and unstaged changes.

The last command presented in the fifth lesson is git rebase, which is useful to keep the history in a cleaner and more readable state. Peter is very clear in explaining the relationship between rebasing and merging and why you would want to rebase to linearize the change history.
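
The branch lifecycle and the non-fast-forward merge described above can be sketched as follows. The branch name feature, the file, and the commit messages are assumptions made for this example, not material from the course:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
main=$(git symbolic-ref --short HEAD)    # default branch name on this machine

# Create a feature branch and change a file on it.
git checkout -q -b feature
echo "v2" > app.txt

git diff             # unstaged changes
git add app.txt
git diff --staged    # staged changes
git commit -q -m "Update app on feature branch"

# Back on the main line a plain merge would fast-forward;
# --no-ff forces a merge commit so the branch stays visible in the history.
git checkout -q "$main"
git merge -q --no-ff -m "Merge branch 'feature'" feature
git branch -d feature
git log --oneline --graph
```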
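
To make the relationship between rebasing and merging concrete, here is a small sketch in which a feature branch and the main line diverge. Branch and file names are hypothetical. After the rebase, the merge can fast-forward and the history stays linear:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"
main=$(git symbolic-ref --short HEAD)

# One commit on the feature branch...
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# ...and one on the main line, so the two histories diverge.
git checkout -q "$main"
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Add fix"

# Replay the feature commit on top of the latest main-line commit.
git checkout -q feature
git rebase -q "$main"

# The merge can now fast-forward: no merge commit is created.
git checkout -q "$main"
git merge -q --ff-only feature
git log --oneline --graph
```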

Commands: branch, checkout, merge, diff, commit, log, rebase.

Sixth lesson: Git under the hood

The sixth lesson is dedicated to learning how Git works under the hood, starting with the .git/objects directory, whose elements correspond to actions that Git has carried out: adding a file, creating a commit, etc. Peter introduces git cat-file -p, the command that lets you relate each element in .git/objects to its corresponding operation.

Besides showing how you can find interesting information by inspecting Git objects, Peter also describes how each object is uniquely identified and why two objects, say two commits, with the same identifier are almost surely the same object. Peter also explains how Git makes efficient use of storage thanks to packfiles, which use compression techniques to store similar files efficiently.

Next, Peter shows how you can associate multiple remote repos with your local repo by using git remote add, so you can pull changes from different remote repositories if need be. This gives Peter the chance to explain how git pull is actually equivalent to git fetch followed by git merge, which is a sensible default option. There are cases, though, when you want to rebase before merging, as explained earlier, so you can use the split form of the command or the handy git pull --rebase.
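
A brief sketch of inspecting objects follows; the file name and content are made up. It uses git cat-file to look at a commit and its tree, and git hash-object to show that an object's identifier is derived from its content:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
echo "hello" > hello.txt
git add hello.txt
git commit -q -m "Add hello"

# A commit object points to a tree, which in turn points to blobs.
git cat-file -t HEAD             # object type
git cat-file -p HEAD             # commit: tree, author, committer, message
git cat-file -p 'HEAD^{tree}'    # tree: one entry per file

# The ID is computed from the content: hashing the same bytes again
# yields the ID already recorded for hello.txt in the commit's tree.
blob_in_tree=$(git rev-parse HEAD:hello.txt)
blob_from_content=$(git hash-object hello.txt)
echo "$blob_in_tree"
echo "$blob_from_content"
```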
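
The split form of git pull can be sketched like this. A second local repository named upstream plays the role of the remote; that setup and all names are assumptions of this example:

```shell
work=$(mktemp -d)

# "upstream" stands in for the remote repository.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
branch=$(git symbolic-ref --short HEAD)

# Clone it, then add a second commit upstream that the clone does not have yet.
git clone -q "$work/upstream" "$work/clone"
echo "two" >> file.txt
git commit -q -a -m "Second commit"

cd "$work/clone"
# The two steps that a default "git pull" performs in one go:
git fetch -q origin                        # download commits, update origin/<branch>
git merge -q --ff-only "origin/$branch"    # integrate them into the current branch

git log --oneline
```

With local commits of your own in the clone, git pull --rebase would replay them on top of the fetched commits instead of merging.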

Commands: cat-file

Seventh lesson: GitHub

The seventh lesson is dedicated to basic workflows inside of GitHub, namely how to clone a repo, how to fork one, how to create a pull request, and how to accept it.

Peter then reviews several models of collaboration, such as sharing the master branch, using feature branches, or using forks and pull requests.

Commands: clone, fork

Eighth and ninth lessons: GitHub features and configuration

What do you do to get a first look at a repo and get an idea of its state? Peter suggests three steps: first, have a look at the README, then at the latest commits and merged pull requests, and finally at outstanding pull requests.

Another GitHub feature is its issue tracker, which is essential, simple, and well integrated. Peter goes to great lengths to show how to create an issue, for bug tracking or feature management, and how to deal with its lifecycle.

Interesting information can also be found in the analytics about contributors, commits, code change, and punch cards (commit rate per weekday/hour of day), as well as the pulse, which gives you a summary of project status for a given period of time.

GitHub webhooks and services are a great way to build workflows on top of GitHub. Webhooks are a low-level mechanism that lets you define actions for a plethora of events (commits, pull requests, delete, add file...). Services, on the other hand, are predefined workflows that you can use right away, e.g., to integrate with CI platforms or with JIRA.

Tenth lesson: the release process

Back to Git, this lesson is about the release process, starting with tagging a release so you can retrieve it later. Peter introduces the three types of tags supported by Git: basic (lightweight), annotated, and signed, and explains in which situations you might want to use each one.

Release branches are another useful mechanism that Peter presents, while also describing in which scenarios you might want to use release branches instead of tags to manage your releases. Having introduced release branches, Peter goes into detail to explain how you can copy changes from one branch to another through cherry-picking, and how to use git stash to manage a stack of changes.
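
As a small sketch of the first two tag types (the tag names and message are invented here), note how a basic tag is only a name for a commit, while an annotated tag is an object of its own. A signed tag, created with git tag -s, additionally requires a GPG key and is therefore left out of this example:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Release candidate"

# Basic (lightweight) tag: just a name pointing at a commit.
git tag v1.0-light

# Annotated tag: a full object with tagger, date, and message.
git tag -a v1.0 -m "Release 1.0"

git cat-file -t v1.0-light    # prints "commit"
git cat-file -t v1.0          # prints "tag"
git tag -l
```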
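
Cherry-picking and stashing can be sketched together as follows. The scenario is an assumption of this example, not the course's own: a bug fix made on the main line is copied to a release branch while some unfinished work is set aside. Branch, file, and commit names are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
main=$(git symbolic-ref --short HEAD)

# A release branch with a commit of its own.
git checkout -q -b release
echo "1.0" > VERSION
git add VERSION
git commit -q -m "Set version 1.0"

# A bug fix lands on the main line.
git checkout -q "$main"
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix bug"
fix_commit=$(git rev-parse HEAD)

# Half-finished work: stash it to get a clean working tree.
echo "wip" >> app.txt
git stash -q

# Copy only the fix onto the release branch.
git checkout -q release
git cherry-pick "$fix_commit" > /dev/null

# Return to the main line and restore the stashed work.
git checkout -q "$main"
git stash pop -q
git status --short
```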

Eleventh lesson: reverting things

One of the most important features of an SCM is the ability to revert things, and Peter devotes a whole lesson to the topic. First of all, he explains the difference between private and public history and makes clear that the only safe way to undo changes in public history is git revert, which adds a new commit instead of rewriting existing ones. For private history there are more possibilities, such as git commit --amend, git reset, and git rebase -i, and Peter explores them all.

Commands: commit --amend, reset, rebase -i.
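
The difference between undoing public and private history can be sketched as follows. File contents and commit messages are invented, and git rebase -i is omitted because it is interactive:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "Add file"
echo "two" >> file.txt
git commit -q -a -m "Add line two"

# Public history: undo the last commit by adding a new commit that reverses it.
git revert --no-edit HEAD > /dev/null

# Private history: rewriting is an option as long as nothing was shared.
echo "three" >> file.txt
git commit -q -a -m "Add line thre"
git commit -q --amend -m "Add line three"    # replace the last commit to fix its message
git reset -q --hard HEAD~1                   # discard the last commit and its changes

git log --oneline
```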

InfoQ had the opportunity to talk to Peter Bell about his background and his personal views on Git, GitHub, and software development.

Hi Peter, you define yourself as a “Startup Technologist”. Could you explain to us what this is about?

I help companies to build better software. I work with business people to help them to better hire and manage developers and with developers to help them to build better products and build products better. I help startups to implement best practices and enterprises to learn from the best startups. In addition to founding Pragmatic Learning, an enterprise training company, and being a contract member of the GitHub training team, I also founded Speak Geek to help business people better hire and manage developers and the CTO Summit Series to help engineering leaders to be more effective, and I’m an Adjunct Professor at the Columbia Graduate School of Business.

Could you tell us a few words about your professional background and what you like the most in the current technology landscape?

I used to be a CTO, most recently running the engineering team for General Assembly. The biggest opportunity in technology is also the biggest challenge. There is a proliferation of open source languages, frameworks, libraries and tools to make developers substantially more effective. The challenge is how to review, select, learn, and adopt the new technologies to keep up with the latest tools without becoming overwhelmed by the speed of change.

Git is a very powerful tool. In your “Git and GitHub LiveLessons,” while not requiring any previous exposure to Git, you still find a way to introduce advanced mechanisms, like rebase, and to hint at Git internals, all with great clarity and coherence. You seem to have a gift for making Git look easy to use. But Git can be daunting to the beginner. What is the secret to getting into Git without much hassle?

Git is an amazing version control system and GitHub is a wonderful platform for collaborating on projects, but Git requires some training to get the best out of it. Most people who “learn the ten basic commands” end up having trouble with Git because they don’t understand the way Git thinks about things like branches, checkouts and remotes. I’d strongly recommend taking the time to read a book, take a class or attend a training. It takes a solid day of introductory materials to “get” the basics of Git, and then some dedicated ongoing practice to really internalize and own the skills.

How can you explain that Git has had such a huge adoption? What is the key to its success?

Distributed version control systems like Git provide substantial benefits in terms of ease, flexibility and speed over centralized VCSs. Every distributed VCS has its own strengths and weaknesses but Git seemed to hit the sweet spot and get the majority of the adoption. At this point, it’s not even a matter of technical merit: if you’re doing modern software development it probably makes sense to use Git, if only because almost everyone else is doing so.

What are the basic techniques enabled by Git that any Git user should absolutely know about? 

Firstly, you need to understand the staging area, and why adding files before committing them gives you more control over exactly what changes go into any given commit to tell a better story about how you’re developing your software.
Secondly, you need to understand the purpose of branches and realize that a very simple “never commit directly to master, always work on feature branches, do all integration testing on feature branches and don’t create release branches unless you really need them” workflow works really well for a very wide range of use cases.
Thirdly, you need to understand how remotes work and to generalize your understanding of branches and merges to what happens when you pull from a remote. Finally, you want to learn how to undo almost everything using safe commands like git revert and more dangerous ones like commit --amend, reset, rebase -i and the reflog.

How well does Git, in your opinion, support advanced/complex working environments? What is, if any, a distinctive Git advantage?

Git's real power is its flexibility. It supports a really wide range of complex workflows and strategies. The challenge is picking the right strategies for any given use case. Now the biggest Git advantage is its ubiquity. Technically I love cheap, fast and easy branching and the ability to create repos and commit even when I’m offline.

Last year, Wired wrote about a GitHub revolution. How would you go about describing the shift that GitHub brought in the software development world?

Only a few short years ago, if you wanted to commit to an open source project you needed to find the email of a core committer, email your patch and correspond with the team to get it integrated. It was a completely unscalable, high friction model. With GitHub, anyone can fork and submit a pull request to any open source project, transforming the quantity, quality and ease of collaboration on open source projects. That in turn has transformed the pace of innovation in tools and the potential productivity of developers leveraging those new tools.

You are specifically concerned with agile and startups. How would you describe the value that tools like Git and GitHub bring to agile and to startups in particular?

One of the first things I recommend to non-technical founders is that they create a GitHub account and set up the repo where their code is going to be so (a) they always have access to the code, and (b) they can keep an eye on what’s happening without bugging their dev team. Git allows development teams of all sizes to collaborate effectively and GitHub allows everyone, including the management team, to keep up to date with what’s being built.

Say that some people watching your Git and GitHub lessons would like to go deeper into Git and learn more advanced techniques; what could be their next step? What useful resources could you suggest to them?

Well, I’m working on a book for Pearson that should be finished later this year. Other than that, there are a number of other books out there and I’d also just recommend getting as much hands-on practice as possible. Create a test repo and just try things both to see what happens and to build the confidence with the edge cases that sometimes come up.

Thank you, Peter

Thanks so much for taking the time to conduct the interview!

Git is a distributed revision control and source code management (SCM) system that was initially developed by Linus Torvalds for Linux kernel development. It is currently the most widely used version control system for software development.

GitHub is a web-based hosting service built on Git. Besides offering all of the functionality of Git, it adds its own features, such as wikis, task management, and bug tracking. GitHub has been hugely successful: in December 2013, it announced that the site was hosting 10 million repositories.

About the Interviewee

Peter Bell is a startup technologist. He’s a contract member of the GitHub training team and the founder of Pragmatic Learning, an enterprise training company that helps enterprises to connect to the best open source technologies and processes. He is also an adjunct professor at Columbia Business School and teaches business people how to better hire and manage developers.

He has presented at a range of conferences including DLD conference, ooPSLA, QCon NY, QCon SF, RubyNation, SpringOne2GX, Code Generation, Practical Product Lines, the British Computer Society Software Practices Advancement conference, GraphConnect, DevNexus, cf.Objective(), CF United, Scotch on the Rocks, WebDU, WebManiacs, UberConf, the Rich Web Experience and the No Fluff Just Stuff Enterprise Java tour. He has been published in IEEE Software, Dr. Dobbs, IBM developerWorks, Information Week, Methods & Tools, Mashed Code, the Open Source Journal, NFJS the Magazine and GroovyMag.
