coderrr

October 31, 2011

GitHub hack: speed up git push and git pull

Filed under: git, linux, ssh | coderrr @ 7:27 pm

tldr

`git pull` and `git push` to GitHub can be greatly sped up (especially for users with >100ms round trip times to github.com) by keeping an ssh master connection open to github.com at all times. To do this, add these two lines to your ~/.ssh/config:

ControlMaster auto
ControlPath /tmp/ssh_mux_%h_%p_%r

and then leave `ssh git@github.com git-receive-pack your_github_username/your_repo` running in the background.

 

For users with high latency, pulls and pushes to GitHub can be quite slow to start. For example, with an RTT to github.com of 250ms, `git pull` or `git push` usually takes at least 4.5s to tell you ‘Already up-to-date’. This is largely because git runs over ssh, and setting up an ssh connection requires many round trips. How many round trips exactly? We could read the RFC and OpenSSH implementation details… or we could just check what actually happens.

`ssh -v` shows you what ssh is doing at each step, but it’s not timestamped. We can use this little script to prefix each line with a timestamp.

# time.rb
# Prefix each input line with the milliseconds elapsed since startup.
start = Time.now
while (line = ARGF.gets)
  puts "#{((Time.now - start) * 1000).to_i}\t#{line}"
end

To make it easier to tell whether time is spent on a network round trip rather than on client/server CPU time, we can artificially increase the RTT to 1000ms using tc:

$ sudo tc qdisc add dev eth0 root netem delay 1000ms

Now we can look at the timestamped `ssh -v` output. I’ve annotated it to show where the round-trips occur.

$ ssh -v git@github.com echo hi 2>&1 | ruby time.rb

0	OpenSSH_5.5p1 Debian-4ubuntu6, OpenSSL 0.9.8o 01 Jun 2010
0	debug1: Reading configuration data /home/steve/.ssh/config
0	debug1: Reading configuration data /etc/ssh/ssh_config
0	debug1: Applying options for *
0	debug1: auto-mux: Trying existing master
0	debug1: Control socket "/tmp/ssh_mux_github.com_22_git" does not exist

DNS lookup

2331	debug1: Connecting to github.com [207.97.227.239] port 22.

1

3322	debug1: Connection established.
3322	debug1: identity file /home/steve/.ssh/id_rsa type 1
3322	debug1: Checking blacklist file /usr/share/ssh/blacklist.RSA-2048
3322	debug1: Checking blacklist file /etc/ssh/blacklist.RSA-2048
3322	debug1: identity file /home/steve/.ssh/id_rsa-cert type -1
3322	debug1: identity file /home/steve/.ssh/id_dsa type -1
3322	debug1: identity file /home/steve/.ssh/id_dsa-cert type -1

2

4318	debug1: Remote protocol version 2.0, remote software version OpenSSH_5.1p1 Debian-5github2
4318	debug1: match: OpenSSH_5.1p1 Debian-5github2 pat OpenSSH*
4318	debug1: Enabling compatibility mode for protocol 2.0
4318	debug1: Local version string SSH-2.0-OpenSSH_5.5p1 Debian-4ubuntu6
4318	debug1: SSH2_MSG_KEXINIT sent

3

5318	debug1: SSH2_MSG_KEXINIT received
5318	debug1: kex: server->client aes128-ctr hmac-md5 none
5318	debug1: kex: client->server aes128-ctr hmac-md5 none
5318	debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
5318	debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP

4/5 (two round trips)

7335	debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
7335	debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY

6

8334	debug1: Host 'github.com' is known and matches the RSA host key.
8334	debug1: Found key in /home/steve/.ssh/known_hosts:1
8334	debug1: ssh_rsa_verify: signature correct
8334	debug1: SSH2_MSG_NEWKEYS sent
8334	debug1: expecting SSH2_MSG_NEWKEYS
8334	debug1: SSH2_MSG_NEWKEYS received
8334	debug1: Roaming not allowed by server
8334	debug1: SSH2_MSG_SERVICE_REQUEST sent

7/8 (two round trips)

10350	debug1: SSH2_MSG_SERVICE_ACCEPT received

9

11344	debug1: Authentications that can continue: publickey
11344	debug1: Next authentication method: publickey
11344	debug1: Offering public key: /home/steve/.ssh/id_rsa

10

12376	debug1: Remote: Forced command: gerve coderrr

11

13398	debug1: Remote: Port forwarding disabled.
13398	debug1: Remote: X11 forwarding disabled.
13398	debug1: Remote: Agent forwarding disabled.
13398	debug1: Remote: Pty allocation disabled.
13398	debug1: Server accepts key: pkalg ssh-rsa blen 277

12

14398	debug1: Remote: Forced command: gerve coderrr

13

15420	debug1: Remote: Port forwarding disabled.
15420	debug1: Remote: X11 forwarding disabled.
15420	debug1: Remote: Agent forwarding disabled.
15420	debug1: Remote: Pty allocation disabled.
15420	debug1: Authentication succeeded (publickey).
15420	debug1: channel 0: new [client-session]
15420	debug1: setting up multiplex master socket
15420	debug1: channel 1: new [/tmp/ssh_mux_github.com_22_git]
15420	debug1: Entering interactive session.

14

16416	debug1: Sending environment.
16416	debug1: Sending env LANG = en_US.utf8
16416	debug1: Sending command: echo hi

15

17417	debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
17417	debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
17417	Invalid command: 'echo hi'
17417	  You appear to be using ssh to clone a git:// URL.
17417	  Make sure your core.gitProxy config option and the
17417	  GIT_PROXY_COMMAND environment variable are NOT set.
17417	debug1: channel 0: free: client-session, nchannels 2
17417	debug1: channel 1: free: /tmp/ssh_mux_github.com_22_git, nchannels 1
17417	debug1: fd 1 clearing O_NONBLOCK
17417	Transferred: sent 2296, received 2952 bytes, in 2.0 seconds
17417	Bytes per second: sent 1151.0, received 1479.9
17417	debug1: Exit status 1

So we see a total of 15 round trips before github responds to the actual command we sent.
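To double-check the count, the gaps between timestamps can be tallied mechanically. Below is a minimal Ruby sketch, assuming input lines in the `<ms>\t<text>` format produced by time.rb and an RTT much larger than any CPU time. `count_round_trips` and `sample` are illustrative names, not part of any tool. The sample holds the distinct timestamps from the log above, starting at the `Connecting to` line so the DNS lookup is excluded.

```ruby
# Estimate round trips from timestamped `ssh -v` output by dividing each
# gap between consecutive timestamps by the (artificially large) RTT.
def count_round_trips(lines, rtt_ms)
  stamps = lines.map { |line| line[/\A\d+/] }.compact.map(&:to_i)
  stamps.each_cons(2).sum { |a, b| ((b - a) / rtt_ms.to_f).round }
end

# Distinct timestamps from the log above, from "Connecting to" onward.
sample = %w[2331 3322 4318 5318 7335 8334 10350 11344 12376
            13398 14398 15420 16416 17417].map { |ms| "#{ms}\tdebug1: ..." }

puts count_round_trips(sample, 1000)  # prints 15
```

Lines sharing a timestamp contribute a gap of zero, so feeding it the full output of time.rb (minus the lines before the connection starts) should give the same result.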

Luckily we can skip most of these by using ssh master connections. Just add this to the top of your ~/.ssh/config:

ControlMaster auto
ControlPath /tmp/ssh_mux_%h_%p_%r

Now as long as you have an ssh connection open to github.com, it will be reused when starting a new ssh session. But how do we keep a connection open to github.com, since they don’t give you shell access? Well, we know from experience that a long `git pull` must keep the git/ssh session open for a while. So worst case, we could just continually do long `git pull`s in the background. But there’s gotta be a better way. Maybe starting a push but never sending any data will keep the session open indefinitely. How do we test this? First we need to find out what command git actually sends when doing a push. Let’s be lazy and not RTFM, but experiment instead.

The command pointed to in the GIT_SSH env var will be used instead of `ssh` if it is set. So let’s make a little script which writes the arguments passed to it to a file:

$ cat > git_out.rb
#!/usr/bin/env ruby
File.write("/tmp/git.txt", ARGV*" ")
$ chmod +x !$
$ GIT_SSH=./git_out.rb git push
fatal: The remote end hung up unexpectedly
$ cat /tmp/git.txt
git@github.com git-receive-pack 'coderrr/test.git'

Now let’s see what happens if we call `ssh git@github.com git-receive-pack coderrr/test.git` ourselves:

$ ssh git@github.com git-receive-pack coderrr/test.git
007ac836660a1d498131a934badab139fc0d347d2c29 refs/heads/master report-status delete-refs side-band-64k ofs-delta
003e4e454cba21ca64b1eda7d4042c9f86abf3987e8b refs/heads/stats
...
0000

The connection stays open! Now let’s check how fast `git push` is with our 1000ms latency.

$ time git push
Everything up-to-date

real	0m3.906s

Close the ssh connection and try again:

$ time git push
Everything up-to-date

real	0m23.402s

Down to 4 seconds from 23, not bad.

You can also use autossh to make sure the ssh connection reconnects in case it drops.
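If autossh isn't available, the same idea can be approximated with a small supervisor loop. This is only a sketch: `keep_alive` and its parameters are hypothetical names, and it assumes the ssh command keeps running as long as its stdin stays open, which matches what was observed above but isn't something GitHub guarantees.

```ruby
# Hypothetical helper: rerun cmd each time it exits, up to max_runs times.
# Returns the number of runs performed.
def keep_alive(cmd, max_runs: Float::INFINITY, pause: 5)
  runs = 0
  loop do
    # Give the child a stdin that stays open but never delivers data.
    reader, writer = IO.pipe
    pid = Process.spawn(*cmd, in: reader, out: File::NULL)
    reader.close
    Process.wait(pid)
    writer.close
    runs += 1
    return runs if runs >= max_runs
    sleep pause
  end
end

# Example (placeholder repo path; substitute your own):
# keep_alive(%w[ssh git@github.com git-receive-pack your_github_username/your_repo])
```

The pause between restarts is there to avoid hammering the server if the connection is refused outright.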

While this won’t be so useful for people in the US, who have an RTT to GitHub of <50ms, it can be very helpful for people in other countries, where RTTs are regularly more than 250ms.
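As a rough back-of-the-envelope check, the connection setup cost is about the number of round trips times the RTT. The sketch below assumes the 15 round trips counted in the trace above and ignores DNS, CPU time, and the git protocol exchange itself; `ssh_setup_seconds` is an illustrative name.

```ruby
SSH_SETUP_ROUND_TRIPS = 15  # counted from the `ssh -v` trace above

# Approximate time spent on ssh connection setup for a given RTT.
def ssh_setup_seconds(rtt_ms, round_trips = SSH_SETUP_ROUND_TRIPS)
  round_trips * rtt_ms / 1000.0
end

[50, 250, 1000].each do |rtt|
  puts "RTT #{rtt}ms: ~#{ssh_setup_seconds(rtt)}s of setup"
end
```

At 250ms that comes to 3.75s, which accounts for most of the 4.5s minimum mentioned earlier; at 50ms it is under a second.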

Further investigation could include checking how long GitHub allows the git-receive-pack connection to remain open and possibly very slowly sending valid git protocol data into the git-receive-pack ssh connection to keep it open for longer periods of time.

Ethical implications of this are left as an exercise to the reader.

July 29, 2009

How to force Flash or any program to use a SOCKS proxy using Transocks and iptables in Linux

Filed under: linux, network | coderrr @ 5:18 pm

The Flash plugin for Linux does not respect any browser’s SOCKS proxy settings. This means sites which stream video through a protocol other than HTTP will go direct to the host rather than through your SOCKS proxy.

One way to force Flash or any program through a SOCKS proxy is to use iptables in combination with transocks_em.

First download transocks_em here: http://github.com/coderrr/transocks_em/tarball/master

Running it requires ruby and the eventmachine gem (gem install eventmachine). Now that transocks is ready, we need to set up iptables rules which will redirect our traffic to be handled by transocks. You can put the following rules in a sh script.

iptables_transocks.sh:

#!/bin/sh

LOCAL_NET=192.168.0.0/16

# Flush all previous nat rules; you might not want to include this line if you already have other rules set up
iptables -t nat --flush

iptables -t nat -X SOCKSIFY
iptables -t nat -N SOCKSIFY

# Exceptions for local traffic
iptables -t nat -A SOCKSIFY -o lo -j RETURN
iptables -t nat -A SOCKSIFY --dst 127.0.0.1 -j RETURN
iptables -t nat -A SOCKSIFY --dst $LOCAL_NET -j RETURN
# Add extra local nets here as necessary

# Only proxy traffic for programs run with group 'transocks'
iptables -t nat -A SOCKSIFY -m owner ! --gid-owner transocks -j RETURN

# Send to transocks
iptables -t nat -A SOCKSIFY -p tcp -j REDIRECT --to-port 1212

# Socksify traffic leaving this host:
iptables -t nat -A OUTPUT -p tcp --syn -j SOCKSIFY

Once you’ve created the script, run it:

chmod +x iptables_transocks.sh
sudo ./iptables_transocks.sh

Note, if you need to, you can clear out all these rules with:

sudo iptables -t nat --flush

The setup I have chosen here is to only proxy traffic for programs run with the group-id of group ‘transocks’. This makes it easy to socksify any program by just running it as a specific group. So the first thing we’ll want to do is create this group:

sudo addgroup transocks
sudo gpasswd transocks
# set an empty password

Next, we need to start transocks and point it to our socks server. Let’s assume our socks server is running at localhost:1080:

ruby transocks_em.rb 127.0.0.1 1080 1212

Now that we have created the group with an empty password and started transocks we are ready to socksify whatever program we want:

sg transocks 'firefox'
sg transocks 'opera'
sg transocks 'lynx http://whatismyip.com'

sg (set group) will run the program with your current user but with the group you specify. This is a semi-non-invasive way of notifying iptables you want it to proxy the traffic from this program. Note that any files this program writes out will have the group of transocks. In most cases this won’t matter but you should be aware of this.

Although sg will prompt you for a password (even though you set a blank one), if you create an application launcher through your windowing system it should launch without showing a prompt or requiring a response.

Note, if your kernel supports it, you can tell iptables to only proxy traffic for programs with certain names by using the -m owner --cmd-owner [cmd name] option. The other option is to use UIDs instead of GIDs (-m owner --uid-owner) to notify iptables which traffic to socksify. This of course means you’ll have to run programs as a different user which will probably cause you more pain.

So… a quick overview of how this will work. You start your browser with sg transocks 'firefox'. Now when firefox tries to make a connection, Linux will intercept it based on the iptables rules we have defined and redirect the connection to transocks on port 1212. Transocks will then inspect the connection to determine its original address (for example hulu.com) and proxy it through the SOCKS server you specified. This will happen for any TCP connection coming out of firefox, even ones from Flash.
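For the curious, on Linux the original destination of a redirected connection can be read back with the `SO_ORIGINAL_DST` socket option. The following is a minimal sketch of that lookup, not code taken from transocks_em; `original_destination` and `decode_sockaddr` are hypothetical names, and the constant value 80 comes from linux/netfilter_ipv4.h.

```ruby
require 'socket'

SO_ORIGINAL_DST = 80  # from <linux/netfilter_ipv4.h>; not exported by Ruby's Socket

# Turn a packed sockaddr_in into [host, port].
def decode_sockaddr(raw)
  port, host = Socket.unpack_sockaddr_in(raw)
  [host, port]
end

# For a TCP connection accepted after an iptables REDIRECT, ask the kernel
# where the client was originally trying to connect (IPv4, Linux only).
def original_destination(sock)
  decode_sockaddr(sock.getsockopt(Socket::SOL_IP, SO_ORIGINAL_DST).data)
end
```

Note that the kernel reports an IP address and port here; any hostname has already been resolved by the time the connection is made.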

December 20, 2008

Automatically flushing redirected or piped stdout

Filed under: c, linux | coderrr @ 12:12 am

Whenever you pipe or redirect a program’s stdout, it will automatically be set to fully buffered mode, meaning the output won’t be written until a certain number of bytes have been buffered. I think the default on most systems is 4096.

For example:

ruby -e 'loop{puts 1; sleep 1}' | cat

With a 4096-byte buffer you wouldn’t see any output for about 2048 seconds, since each `puts 1` writes only two bytes (the 1 plus a newline). This can be annoying if you have a long-running program whose output you want to tee to a file (tee writes the output to a file while at the same time printing it to the console). If you just pipe the output to tee you’ll see the output come in chunks. Some programs come with options to automatically flush stdout after every write. Some scripting languages allow you to do this as well:

# ruby
STDOUT.sync = true
# perl
$| = 1;

And in C you can set the buffering of a stream with:

/* unbuffered */
setvbuf(stdout, NULL, _IONBF, 0);
/* line buffered (portable equivalent: setvbuf(stdout, NULL, _IOLBF, 0)) */
setlinebuf(stdout);

But if you can’t modify the program or script, a generic solution is required.

expect_unbuffer, or just unbuffer depending on where you get it, does just that. I’m not 100% clear on exactly how it works, but it’s something like this: it sets up a pseudo terminal (pty) and redirects your program’s output to it. Since the program thinks it’s writing to a console, it remains in line buffered mode. This way, even after you redirect or pipe the program’s output, it will be flushed as it is being written.
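The pty trick can be sketched in a few lines of Ruby with the standard `pty` library. This is an illustration of the idea, assuming a Linux-like system, and not how unbuffer itself is implemented; `run_unbuffered` is a made-up name.

```ruby
require 'pty'

# Run cmd attached to a pseudo terminal and yield each line as it arrives.
def run_unbuffered(*cmd)
  PTY.spawn(*cmd) do |reader, _writer, _pid|
    begin
      reader.each_line { |line| yield line.chomp }
    rescue Errno::EIO
      # On Linux, reading the pty master raises EIO once the child exits.
    end
  end
end

# Example: the child sees a terminal, so it flushes after every line.
# $stdout.sync = true  # keep this relay's own output unbuffered too
# run_unbuffered('ruby', '-e', 'loop{puts 1; sleep 1}') { |l| puts l }
```

Because the child's stdout is a terminal rather than a pipe, it stays line buffered, and the relay can pass each line along to a pipe or file as soon as it appears.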

How you get expect_unbuffer depends on the system you use. On Ubuntu 8.10, apt-get install expect-dev is enough. On systems whose expect package doesn’t include it, you may need to install the expect-dev package and then download and compile unbuffer.c with:

sudo apt-get install expect-dev
wget http://expect.nist.gov/example/unbuffer.c
sudo gcc -o /usr/bin/expect_unbuffer unbuffer.c -I$(dirname $(find /usr/include -name expect.h | head -n 1)) -lexpect

Here’s the man page.

If anyone knows of a better or simpler way to do this generically I’d like to hear it.

April 20, 2008

Getting idle time in linux

Filed under: c, linux, ruby | coderrr @ 4:20 pm

For a little alarm prog I’m writing to help myself out with my polyphasic sleep endeavor I needed to get the idle time of an X session, the length of time since the user last used the mouse or keyboard. I searched for a while and there were a few people asking about it but no answers.

So I hacked around and found a simple way to do it in C. This requires you have the Xss (Xscreensaver, not cross site scripting :P) library and includes. (ubuntu: sudo apt-get install libxss-dev)

I wrote this on linux but in theory it should work in any unix X environment with libXss.

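#include <stdio.h>
#include <X11/Xlib.h>
#include <X11/extensions/scrnsaver.h>

int main(void) {
  Display *display = XOpenDisplay(NULL);
  if (!display) return 1;  /* no X display available */

  XScreenSaverInfo *info = XScreenSaverAllocInfo();
  XScreenSaverQueryInfo(display, DefaultRootWindow(display), info);
  printf("%lu ms\n", info->idle);  /* idle is an unsigned long, in milliseconds */

  XFree(info);
  XCloseDisplay(display);
  return 0;
}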

Compile with: gcc -o idle idle.c -lXss -lX11

And here’s a simple Ruby wrapper for it using RubyInline:

require 'inline'

class XScreenSaver
  class << self
    inline do |builder|
      builder.add_link_flags '-lXss'
      builder.include '<X11/extensions/scrnsaver.h>'
      builder.c %{
        double idle_time() {
          static Display *display;
          static XScreenSaverInfo *info;  /* allocated once, reused on every call */

          if (!display)  display = XOpenDisplay(0);
          if (!display)  return -1;
          if (!info)  info = XScreenSaverAllocInfo();
          if (!info)  return -1;

          XScreenSaverQueryInfo(display, DefaultRootWindow(display), info);

          return info->idle / 1000.0;
        }
      }
    end
  end
end

if __FILE__ == $0
  loop { puts XScreenSaver.idle_time; sleep 0.2 }
end

Does anyone know a better way to do it?

February 27, 2008

Control+arrow (control+left-arrow / control+right-arrow) behavior with Firefox location bar in Linux

Filed under: firefox, linux | coderrr @ 1:27 am

In Linux, Firefox doesn’t stop at punctuation (e.g. / ? & .) when you use the control+right-arrow or control+left-arrow key combinations in the location bar. Instead it skips to the next space, which in most URLs means the beginning or end of the line. This annoyed the hell out of me for a while, because with really long URLs (like http://some.url.com/with/a/really/long/path?and&query=string&too) getting to the middle of the URL to modify it meant holding down the arrow key or clicking the correct position with my mouse. It took me a lot of Googling (I gave up a few times) to find the solution, but I finally figured it out:

Go to about:config and set the option layout.word_select.stop_at_punctuation to true.
