October 31, 2011

GitHub hack: speed up git push and git pull

Filed under: git, linux, ssh. Posted by coderrr @ 7:27 pm


`git pull` and `git push` to github can be greatly sped up (especially for users with >100ms round trip times to github.com) by keeping an ssh master connection open to github.com at all times. To do this, add these two lines to your ~/.ssh/config:

ControlMaster auto
ControlPath /tmp/ssh_mux_%h_%p_%r

and then leave `ssh git@github.com git-receive-pack your_github_username/your_repo` running in the background.


For users with high latency, pulls and pushes to github can be quite slow to start. For example, with an RTT to github.com of 250ms, `git pull/push` usually takes a minimum of 4.5s to tell you ‘Already up-to-date’. This is largely because git runs over ssh, and setting up an ssh connection requires many round trips. How many round trips exactly? We could read the RFC and OpenSSH implementation details… or we could just check what actually happens.

`ssh -v` shows you what ssh is doing at each step, but it’s not timestamped. We can use this little script to prefix each line with a timestamp.

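# time.rb
# Prefix each input line with the milliseconds elapsed since the script started.
start = Time.now
puts "#{((Time.now - start) * 1000).to_i}\t#$_" while $<.gets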

To make it easier to tell whether time is spent on a network round trip or on client/server CPU work, we can artificially increase the RTT to 1000ms using tc:

$ sudo tc qdisc add dev eth0 root netem delay 1000ms

Now we can look at the timestamped `ssh -v` output. I’ve annotated it to show where the round-trips occur.

$ ssh -v git@github.com echo hi 2>&1 | ruby time.rb

0	OpenSSH_5.5p1 Debian-4ubuntu6, OpenSSL 0.9.8o 01 Jun 2010
0	debug1: Reading configuration data /home/steve/.ssh/config
0	debug1: Reading configuration data /etc/ssh/ssh_config
0	debug1: Applying options for *
0	debug1: auto-mux: Trying existing master
0	debug1: Control socket "/tmp/ssh_mux_github.com_22_git" does not exist

DNS lookup

2331	debug1: Connecting to [] port 22.

1

3322	debug1: Connection established.
3322	debug1: identity file /home/steve/.ssh/id_rsa type 1
3322	debug1: Checking blacklist file /usr/share/ssh/blacklist.RSA-2048
3322	debug1: Checking blacklist file /etc/ssh/blacklist.RSA-2048
3322	debug1: identity file /home/steve/.ssh/id_rsa-cert type -1
3322	debug1: identity file /home/steve/.ssh/id_dsa type -1
3322	debug1: identity file /home/steve/.ssh/id_dsa-cert type -1

2

4318	debug1: Remote protocol version 2.0, remote software version OpenSSH_5.1p1 Debian-5github2
4318	debug1: match: OpenSSH_5.1p1 Debian-5github2 pat OpenSSH*
4318	debug1: Enabling compatibility mode for protocol 2.0
4318	debug1: Local version string SSH-2.0-OpenSSH_5.5p1 Debian-4ubuntu6
4318	debug1: SSH2_MSG_KEXINIT sent

3

5318	debug1: SSH2_MSG_KEXINIT received
5318	debug1: kex: server->client aes128-ctr hmac-md5 none
5318	debug1: kex: client->server aes128-ctr hmac-md5 none
5318	debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
5318	debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP

4/5 (two round trips)

7335	debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
7335	debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY

6

8334	debug1: Host '' is known and matches the RSA host key.
8334	debug1: Found key in /home/steve/.ssh/known_hosts:1
8334	debug1: ssh_rsa_verify: signature correct
8334	debug1: SSH2_MSG_NEWKEYS sent
8334	debug1: expecting SSH2_MSG_NEWKEYS
8334	debug1: SSH2_MSG_NEWKEYS received
8334	debug1: Roaming not allowed by server
8334	debug1: SSH2_MSG_SERVICE_REQUEST sent

7/8 (two round trips)

10350	debug1: SSH2_MSG_SERVICE_ACCEPT received

9

11344	debug1: Authentications that can continue: publickey
11344	debug1: Next authentication method: publickey
11344	debug1: Offering public key: /home/steve/.ssh/id_rsa

10

12376	debug1: Remote: Forced command: gerve coderrr

11

13398	debug1: Remote: Port forwarding disabled.
13398	debug1: Remote: X11 forwarding disabled.
13398	debug1: Remote: Agent forwarding disabled.
13398	debug1: Remote: Pty allocation disabled.
13398	debug1: Server accepts key: pkalg ssh-rsa blen 277

12

14398	debug1: Remote: Forced command: gerve coderrr

13

15420	debug1: Remote: Port forwarding disabled.
15420	debug1: Remote: X11 forwarding disabled.
15420	debug1: Remote: Agent forwarding disabled.
15420	debug1: Remote: Pty allocation disabled.
15420	debug1: Authentication succeeded (publickey).
15420	debug1: channel 0: new [client-session]
15420	debug1: setting up multiplex master socket
15420	debug1: channel 1: new [/tmp/ssh_mux_github.com_22_git]
15420	debug1: Entering interactive session.

14

16416	debug1: Sending environment.
16416	debug1: Sending env LANG = en_US.utf8
16416	debug1: Sending command: echo hi

15

17417	debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
17417	debug1: client_input_channel_req: channel 0 rtype reply 0
17417	Invalid command: 'echo hi'
17417	  You appear to be using ssh to clone a git:// URL.
17417	  Make sure your core.gitProxy config option and the
17417	  GIT_PROXY_COMMAND environment variable are NOT set.
17417	debug1: channel 0: free: client-session, nchannels 2
17417	debug1: channel 1: free: /tmp/ssh_mux_github.com_22_git, nchannels 1
17417	debug1: fd 1 clearing O_NONBLOCK
17417	Transferred: sent 2296, received 2952 bytes, in 2.0 seconds
17417	Bytes per second: sent 1151.0, received 1479.9
17417	debug1: Exit status 1

So we see a total of 15 round trips before github responds to the actual command we sent.
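
The count can be double-checked mechanically. The sketch below is only an illustration: `count_round_trips` is a made-up helper, and it assumes every gap between bursts of log output is a whole number of round trips at the inflated 1000ms RTT, ignoring CPU time. The timestamps are copied from the log above, starting at the "Connecting to" line so that the DNS lookup is left out.

```ruby
# Estimate how many network round trips separate consecutive bursts of log output.
# timestamps_ms: millisecond offsets at which each burst appeared
# rtt_ms:        the (artificially inflated) round-trip time
def count_round_trips(timestamps_ms, rtt_ms)
  timestamps_ms.each_cons(2).sum { |a, b| ((b - a) / rtt_ms.to_f).round }
end

# Offsets from the ssh -v log above, beginning at "Connecting to".
TIMESTAMPS = [2331, 3322, 4318, 5318, 7335, 8334, 10350, 11344,
              12376, 13398, 14398, 15420, 16416, 17417]

trips = count_round_trips(TIMESTAMPS, 1000)
puts "round trips: #{trips}"
puts "handshake cost at 250ms RTT: #{trips * 250 / 1000.0}s"
```

This prints 15 round trips and a handshake cost of 3.75s at a 250ms RTT, which accounts for most of the 4.5s minimum mentioned earlier.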

Luckily we can skip most of these by using ssh master connections. Just add this to the top of your ~/.ssh/config:

ControlMaster auto
ControlPath /tmp/ssh_mux_%h_%p_%r
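
To see which socket file this ControlPath produces, here is a rough Ruby sketch of the token expansion. `control_path` is a hypothetical helper written for this post, not part of OpenSSH, and it only handles the three tokens used here (%h host, %p port, %r remote user); OpenSSH itself supports more.

```ruby
# Expand the ControlPath tokens used in this post.
# Only %h (host), %p (port) and %r (remote user) are handled.
def control_path(template, host:, port:, user:)
  template.gsub(/%[hpr]/, "%h" => host, "%p" => port.to_s, "%r" => user)
end

path = control_path("/tmp/ssh_mux_%h_%p_%r", host: "github.com", port: 22, user: "git")
puts path
# A master connection is (probably) up if a socket file exists at that path.
puts File.socket?(path) ? "master socket present" : "no master connection yet"
```

The expanded path is the same /tmp/ssh_mux_github.com_22_git that shows up in the `ssh -v` output above.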

Now, as long as you have an ssh connection open to github.com, it will be reused when starting a new ssh session. But how do we keep a connection open to github.com when they don’t give you shell access? Well, we know from experience that a long `git pull` must keep the git/ssh session open for a while. So worst case, we could just continually do long `git pull`s in the background. But there’s gotta be a better way. Maybe starting a push but never sending any data will keep the session open indefinitely. How do we test this? First we need to find out what command git actually sends when doing a push. Let’s be lazy and not RTFM, but experiment instead.

The command pointed to in the GIT_SSH env var will be used instead of `ssh` if it is set. So let’s make a little script which writes the arguments passed to it to a file:

$ cat > git_out.rb
#!/usr/bin/env ruby
File.write("/tmp/git.txt", ARGV*" ")
$ chmod +x !$
$ GIT_SSH=./git_out.rb git push
fatal: The remote end hung up unexpectedly
$ cat /tmp/git.txt
git@github.com git-receive-pack 'coderrr/test.git'

Now let’s see what happens if we call `ssh git@github.com git-receive-pack coderrr/test.git` ourselves:

$ ssh git@github.com git-receive-pack coderrr/test
007ac836660a1d498131a934badab139fc0d347d2c29 refs/heads/master report-status delete-refs side-band-64k ofs-delta
003e4e454cba21ca64b1eda7d4042c9f86abf3987e8b refs/heads/stats
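
A note on the output above: it is git's ref advertisement in pkt-line framing, where the first four hex digits give the length of the whole line, including those four digits. Below is a minimal Ruby sketch of parsing one such line. `parse_pkt_line` is a name made up for illustration, and it assumes the capability list (if any) follows a NUL byte after the ref name, which the terminal does not display.

```ruby
# Parse a single pkt-line from a ref advertisement.
# The 4-hex-digit prefix is the total length, counting the prefix itself.
def parse_pkt_line(line)
  length  = line[0, 4].to_i(16)
  payload = line[4, length - 4].chomp
  sha, rest = payload.split(" ", 2)
  ref, caps = rest.split("\0", 2)   # capabilities, when present, follow a NUL
  { length: length, sha: sha, ref: ref, capabilities: caps.to_s.split }
end

info = parse_pkt_line("003e4e454cba21ca64b1eda7d4042c9f86abf3987e8b refs/heads/stats\n")
puts "#{info[:ref]} -> #{info[:sha]} (#{info[:length]} bytes)"
```

For the second line of the output, the prefix 003e decodes to 62 bytes: 4 for the prefix, 40 for the SHA, 1 for the space, 16 for the ref name and 1 for the trailing newline.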

The connection stays open! Now let’s check how fast `git push` is with our 1000ms latency.

$ time git push
Everything up-to-date

real	0m3.906s

Close the ssh connection and try again:

$ time git push
Everything up-to-date

real	0m23.402s

Down to 4 seconds from 23, not bad.

You can also use autossh to make sure the ssh connection reconnects in case it drops.
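
If you would rather not install autossh, a plain restart loop gets you part of the way. To be clear, the sketch below is not autossh and does none of its connection monitoring; it simply reruns a command whenever it exits. `keep_alive` and its parameters are made up for this example, and it assumes stdin stays open (for example, when run from a spare terminal) so that the remote git-receive-pack does not see end-of-input straight away.

```ruby
# Re-run a command every time it exits, pausing between attempts.
# max_restarts exists only so the loop can be bounded; nil means run forever.
def keep_alive(command, delay: 5, max_restarts: nil)
  runs = 0
  loop do
    system(*command)   # blocks until the command exits
    runs += 1
    break if max_restarts && runs >= max_restarts
    sleep delay
  end
  runs
end

# Hypothetical invocation; substitute your own user and repository:
# keep_alive(%w[ssh git@github.com git-receive-pack your_github_username/your_repo])
```

The delay between attempts keeps the loop from hammering the server if the connection is refused outright.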

While this won’t be so useful for people in the US, who have an RTT to github of <50ms, it can be very helpful for people in other countries where RTTs are regularly more than 250ms.

Further investigation could include checking how long GitHub allows the git-receive-pack connection to remain open and possibly very slowly sending valid git protocol data into the git-receive-pack ssh connection to keep it open for longer periods of time.

Ethical implications of this are left as an exercise to the reader.
