December 27, 2010

Canonical redirect pitfalls with HTTP Strict Transport Security and some solutions

Filed under: security — Tags: — coderrr @ 4:21 pm


There is a common pitfall when implementing HTTP Strict Transport Security on sites that 301 redirect from -> which leaves your users open to a MITM attack.  Paypal is an example of one of the sites affected by this issue.  There are some solutions but none are ideal.


HTTP Strict Transport Security (HSTS) is a mechanism for websites to instruct browsers to only ever connect to them using an encrypted HTTPS connection.  It solves the problem of man in the middle (MITM) attacks (except on the browser’s first ever connection to the site), which could allow an attacker to keep the user from connecting to the secure version of the site and do evil things to them.

It’s generally very simple to set up.  You just 301 redirect any HTTP connections to the HTTPS version of your site and then add an HTTP header (Strict-Transport-Security: max-age=TTL_IN_SECONDS) to every HTTPS response.  That’s it.  Note that you cannot put the HSTS header on unencrypted HTTP responses because this would allow an attacker to expire your HSTS policy prematurely.
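As a rough, framework-agnostic sketch of that rule in Ruby: the `hsts_headers` helper and its `scheme` argument are names made up for this example, not part of any real library, and the one month max-age is just a sample value.

```ruby
# Hypothetical helper: extra response headers for a request.
# Only HTTPS responses carry the HSTS header; plain HTTP responses get none.
ONE_MONTH = 30 * 24 * 60 * 60 # 2592000 seconds

def hsts_headers(scheme, max_age = ONE_MONTH)
  return {} unless scheme == :https
  { 'Strict-Transport-Security' => "max-age=#{max_age}" }
end

p hsts_headers(:https) # header present
p hsts_headers(:http)  # empty hash, nothing is added
```

In a real app the same check would live wherever your server or framework lets you add response headers.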

HSTS with Canonical Redirects

Because of that last point it gets a little trickier when dealing with a website that does canonical redirects.  A canonical redirect is just a permanent 301 redirect.  For example, 301 -> or 301 ->  This is fairly standard these days for SEO purposes (go to for an example).

Let’s say your canonical redirect is from to  So you setup the -> and -> 301 redirects and you set the HSTS header on responses.  This protects any user who goes directly to in their browser.  But most often people will type in the domain without the www.  Those users will still be open to MITM attacks as the -> redirect will always be unencrypted.  The HSTS headers on will not affect

Paypal Fail

For a real world example of this let’s check out Paypal.  They were the “first major internet site” to start using HSTS.  To be fair they are still beta testing it (with a max-age of less than 10 minutes) and mention explicitly that it’s only enabled on  Although they don’t mention the ramifications of that.

Using the latest version of Google Chrome (which is the only non beta browser that currently supports HSTS) go to  Then open Developer Tools and go to the Resources tab.  You’ll see an unencrypted 301 redirect from to  The response from will set the HSTS policy.  Now open a new tab and go to  Open Developer Tools again and you’ll see the browser goes directly to the encrypted, HSTS in action!  Now open a new tab and go to (without the www) again.  You’ll see this time Developer Tools shows there’s still an unencrypted 301 redirect, meaning an attacker can still get you with a MITM attack.  This is a perfect example of the canonical redirect pitfall with HSTS.  Another site which has the exact same problem is  My guess is this error will repeat itself many times once more sites start implementing HSTS.

This issue doesn’t seem to be readily apparent to a lot of people, or at least they don’t mention it, even people who should probably be the most aware of it.  Here’s an excerpt from the Security Now podcast by Steve Gibson talking about Paypal HSTS:

“So, for example, if the user put in just, or even, if there is a Strict Transport Security token that has been received previously, the browser has permission to ignore that non-secure query request from the user and to transparently make it secure.”

And after reading Paypal’s HSTS announcement you’d assume that typing in would be secure too.

Solution 1

To fix this we must set an HSTS policy on both and  Here’s how:

1) Set up a 301 redirect from to
2) Set up a 301 redirect from to
3) Set an HSTS header on the redirect (or all responses) from
4) Set up a 301 redirect from to
5) Set an HSTS header on all responses from

So the redirect chains will look like this: -> ->
OR ->

And the request response cycle will look like this (for the first redirect chain):

- browser requests
- server redirects to

HTTP/1.1 301 Moved Permanently

- browser requests
- server redirects to and sets HSTS header

HTTP/1.1 301 Moved Permanently
Strict-Transport-Security: max-age=2592000

- browser stores HSTS policy for and will never connect to it without encryption for a month
- browser requests
- server sets HSTS header and serves up standard homepage

HTTP/1.1 200 OK
Strict-Transport-Security: max-age=2592000
Content-type: text/html

- browser stores HSTS policy for and will never connect to it without encryption for a month

Now you have secured both users who type into the location bar and those who type  The downside to this is that you require two HTTPS connections, which will be slower for the user and more expensive for your servers.  You’ve also still not secured the user who typed in and then a week later typed in  That user’s first ever connection to will still be open to a MITM attack.
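The redirect chains from Solution 1 can be sketched in Ruby roughly like this. It is only a toy model of my reading of the five steps: `example.com` and `www.example.com` are hypothetical stand-ins for the bare and www hosts, and `respond` / `follow` are invented names, not a real server API.

```ruby
require 'uri'

# Hypothetical hosts standing in for the bare and www domains.
BARE = 'example.com'
WWW  = 'www.example.com'
HSTS = { 'Strict-Transport-Security' => 'max-age=2592000' }

# Returns [status, headers] for a request to the given scheme and host.
def respond(scheme, host, path = '/')
  case [scheme, host]
  when ['http', BARE]  then [301, { 'Location' => "https://#{BARE}#{path}" }]
  when ['http', WWW]   then [301, { 'Location' => "https://#{WWW}#{path}" }]
  when ['https', BARE] then [301, HSTS.merge('Location' => "https://#{WWW}#{path}")]
  when ['https', WWW]  then [200, HSTS.merge('Content-Type' => 'text/html')]
  else [404, {}]
  end
end

# Follows 301s and returns every URL origin visited along the way.
def follow(scheme, host)
  chain = []
  loop do
    status, headers = respond(scheme, host)
    chain << "#{scheme}://#{host}"
    break unless status == 301
    target = URI(headers['Location'])
    scheme, host = target.scheme, target.host
  end
  chain
end

puts follow('http', BARE).join(' -> ')
puts follow('http', WWW).join(' -> ')
```

Note that the unencrypted responses never carry the HSTS header; only the two HTTPS responses do, which is why the bare domain needs its own HTTPS hop.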

Solution 2

To secure both and on the initial connection to either here’s what you would have to do.  First make sure you have the simple 301 redirects from or to as most sites already have done.  Then embed an invisible iframe on the landing page (or every page) of which loads  This /hsts path would do nothing other than send back a blank page with the HSTS header set.  The /hsts response could also contain cache headers (Expires and Cache-Control: public) which would make the browser not re-request it for some amount of time which is less than your HSTS policy.  This still has the issue of requiring two HTTPS connections every time a user types in
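Here’s a hedged sketch of what the /hsts response could look like, written as a plain Ruby function returning a [status, headers, body] triple. The function name, the constants and the choice of `Cache-Control: public, max-age=...` (rather than an Expires header) are assumptions for illustration only.

```ruby
HSTS_MAX_AGE  = 2_592_000        # 30 days, same sample value as above
CACHE_MAX_AGE = HSTS_MAX_AGE / 2 # must stay below the HSTS max-age

# Builds a blank, cacheable response whose only job is to set the HSTS policy.
def hsts_ping_response(hsts_max_age = HSTS_MAX_AGE, cache_max_age = CACHE_MAX_AGE)
  unless cache_max_age < hsts_max_age
    raise ArgumentError, 'cache lifetime must be shorter than the HSTS max-age'
  end
  headers = {
    'Strict-Transport-Security' => "max-age=#{hsts_max_age}",
    'Cache-Control'             => "public, max-age=#{cache_max_age}",
    'Content-Type'              => 'text/html',
    'Content-Length'            => '0'
  }
  [200, headers, '']
end

status, headers, _body = hsts_ping_response
puts status
headers.each { |name, value| puts "#{name}: #{value}" }
```

Keeping the cache lifetime shorter than the policy lifetime means the browser re-requests the page, and so refreshes the policy, before the policy can expire.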

How you actually implement any of the previous concepts depends entirely on your web server and framework.  This post is just meant to get the idea out there and make people aware of this potential pitfall.  See the HSTS Wikipedia article for some implementation examples of HSTS.

Considering -> canonical redirects are quite ubiquitous across the web it seems like something HSTS should deal with better.  They include an HSTS option called includeSubDomains which would handle -> canonical redirects.  But even that isn’t perfect because it would also enforce the HSTS policy on other possibly unimportant subdomains like

Solution 3: Add a Hack

I wonder if it’s worth putting an explicit exception into HSTS which allows the www subdomain to set its parent domain’s HSTS policy.  Maybe only allowing it to set or increase the TTL of the policy, and never allowing it to decrease.  I think this would make HSTS much easier to implement for the vast majority of websites.
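To make the idea concrete, here’s a sketch of the rule a browser might apply. This is purely hypothetical behaviour (nothing like it exists in HSTS), and the policy store, a hash of host to expiry time, and the function name are invented for the example.

```ruby
# policies: hypothetical browser-side store, host => policy expiry (Time).
# Lets a www host set or extend its parent domain's policy, never shorten it.
def apply_www_hsts(policies, www_host, max_age, now = Time.now)
  return policies unless www_host.start_with?('www.')

  parent     = www_host.sub(/\Awww\./, '')
  new_expiry = now + max_age
  current    = policies[parent]

  # Only set or extend; a smaller max-age from www is ignored.
  policies[parent] = new_expiry if current.nil? || new_expiry > current
  policies
end

policies = {}
start = Time.now
apply_www_hsts(policies, 'www.example.com', 2_592_000, start) # sets parent policy
apply_www_hsts(policies, 'www.example.com', 60, start)        # ignored, would shorten it
puts policies['example.com'] == start + 2_592_000
```

Ignoring anything that would shorten the policy keeps a compromised or misconfigured www host from expiring the parent’s protection early.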

In my next post I’ll suggest some potential additions to HSTS which could make things more simple for implementers.

Thanks to Victor Costan for reviewing this post and helping me brainstorm.

June 18, 2009

anti anti Frame Busting

Filed under: javascript, security — Tags: , — coderrr @ 4:21 pm

In this post I presented a way to prevent a site you were (i)framing from frame busting out. Jeff Atwood recently contacted me to see if I knew a way to get around the prevention (to prevent sites from framing), which of course I didn’t, but I told him I’d think about it and see if I could come up with something. A week later I had come up with a few ideas, but none actually worked.

See updates below for latest/best solution

But, due to some extra motivation from his post today (which links to my original post), I may have just come up with something.

  if (top != self) {
    top.location = self.location.href
    alert('busting you out, please wait...')
  }

It’s so stupid simple, but it seems to actually work. If you present an alert() box immediately after changing top.location you prevent the browser from running any more JS, either from your framed site or from the framing site. But you don’t prevent the browser from loading the new page. So as long as the alert box stays up for a few seconds until the browser has loaded enough of the new page to stop running scripts on the old page, then the anti frame busting code never runs and you successfully are busted out once the user clicks OK on the alert.

I’ve just done a preliminary test of this in FF3 and IE7 and it worked in both. So I’m calling it, anti-anti-frame-busting is here!

Update: Jason brought up in a comment the issue of a user clicking OK before the page has finished loading, in which case the anti-frame-bust will still prevent you from busting. One way to make the page load so quickly that no user can click OK first is to (pre-)cache it. Here’s an example:

<!-- this is the page being framed -->
    function bust() {
      if (top != self) {
        // this page is now cached and will load immediately
        alert('busting you out, please wait...');
  <!-- cache it! -->

You should have these headers on the page to bust out with.

      Expires: Sun, 19 Aug 2012 14:10:44 GMT    <-- far enough in the future
      Last-modified: Fri, 19 Jun 2009 04:24:20 GMT    <-- now

First the framed page will do the initial load of the cached page in an iframe (which you can make hidden). Now that page will be cached and the next time you visit it the browser should make no network requests, loading the page completely from its local cache.

For this technique to work you’ll probably want to use it with a blank page which contains only a javascript/meta redirect to the actual page that was being framed. For example:

  <script> if (self == top) top.location=''; </script>

Update: On IE7 this caching technique alone is good enough to prevent anti-frame-busting. Meaning no alert() is required after the top.location change. At least this is true for a simple page which only consists of a javascript redirect:

    we've busted out!  redirecting...
      // only redirect if we're busted out
      if (top == self)  top.location = "";

Update: The holy grail of anti-anti-frame-busting: This code, along with the caching technique described above, works in both FF3 and IE7 and has no negative user-experience (ie. alert boxes or frozen browsers):

    function bust() {
      if (top != self)
  <!-- cache it! -->

February 21, 2009

Ridiculous ruby meta programming hack

Filed under: bug, patch, ruby, security — Tags: , , , — coderrr @ 2:55 pm

Ruby 1.8.6 has a bug where Dir.glob will glob on a tainted string in $SAFE level 3 or 4 without raising a SecurityError as would be expected. You can see this from the following code:

lambda { $SAFE = 4; Dir.glob('/**') }.call # raises SecurityError
lambda { $SAFE = 4; Dir.glob(['/**']) }.call # => [ ... files ... ]

So I set out to fix this with pure ruby… and it ended up requiring some really crazy stuff. I’ll first show what I ended up with, then go through and explain why:

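class Dir
  class << self
    # random secret shared only by the two glob definitions below
    # (the number of bytes read is an assumption)
    safe_level_password = File.open('/dev/urandom','r'){|f| f.read(32) }
    m = Dir.method(:glob)
    define_method :glob do |password, safe, *args|
      raise SecurityError if safe_level_password != password
      $SAFE = safe
      # pass along glob opts
      opts = args.last.is_a?(Fixnum) ? args.pop : []
      # call the original glob one pattern at a time so taint is checked
      args.flatten.map do |arg|
        m.call(arg, *opts)
      end.flatten
    end
    eval %{
      alias safe_glob glob
      def glob(*a)
        safe_glob(#{safe_level_password.inspect}, $SAFE, *a)
      end
    }
  end
end
# freeze Dir so that no one can redefine safe_glob* to catch password
Dir.freeze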

So first things first. The simple way to fix this bug is to alias glob, iterate over the array passed to the new glob and then call the original glob with one argument at a time, since we know it correctly checks taint with only one argument.

But wait, if we alias the original method then someone could still access the original and pass it an array. So we have to use a “secure” version of alias method chaining. Essentially, we capture the method object of the original method in the local scope, then we use define_method to overwrite the original method name with our new implementation and call the original method object which we have ‘closed’ over. This allows us access to the original method while preventing anyone else from doing so.

But there’s another problem. $SAFE levels are captured in blocks just as local variables. This means our define_method version of glob will always run at the $SAFE level it was defined in, namely $SAFE level 0. Meaning if you call Dir.glob from a $SAFE level 4 it will still get executed at level 0. This is of course the exact opposite of what we want. We are in a worse position now than before. Now we could call Dir.glob with a single tainted parameter and it would work.

How do we fix this? We need to use def to define the method so that the current $SAFE level 0 isn’t captured. Instead $SAFE will reflect the $SAFE level of the caller. But if we use def we can’t use the secure alias method technique.

One option is to have the define_method version set the $SAFE level explicitly before calling glob. But then we run into the issue of how to know what to set it to? There is no way of determining the $SAFE level of the caller without explicitly passing it in.

Ok, so what if we def a method which then calls the define_method method and passes its $SAFE level as an argument? Well then the problem is how do you give the def method access to the define_method method without giving other evil code access to the define_method method as well. Because then that evil code could just lie and pass a level of 0.

This is the crazy part. The way to prevent the evil method from executing the define_method method is to use a password!

Both the def and define_method methods can share a secret which is only available in the local scope where they are defined. Since the password is only a string we can use eval to create the def method and insert the password into it. As long as the define_method method verifies the secret is correct before continuing we can be sure the only method able to call it is the def method.
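Stripped of the glob details, the password handshake looks something like the sketch below. It leaves $SAFE out entirely so it runs on a current Ruby, and the `Vault` class and its method names are invented for the example. It only illustrates the mechanism and is not a real security boundary.

```ruby
require 'securerandom'

class Vault
  class << self
    # Local variable: visible only to code written inside this class body.
    secret = SecureRandom.hex(16)

    # The define_method method closes over `secret` and rejects any caller
    # that does not present it.
    define_method(:guarded_unlock) do |password, who|
      raise SecurityError, 'bad password' unless password == secret
      "opened for #{who}"
    end

    # The def method is built with eval so the secret can be baked into
    # its source as a string literal.
    eval %{
      def unlock(who)
        guarded_unlock(#{secret.inspect}, who)
      end
    }
  end
end

puts Vault.unlock('alice')

begin
  Vault.guarded_unlock('guess', 'mallory')
rescue SecurityError => e
  puts "rejected: #{e.message}"
end
```

The def method can call the define_method method because its source contains the secret; anyone else has to guess it.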

I never thought I’d be sharing a secret between methods. I know this is a big house of cards. Can anyone figure out how to make it tumble? Or a better way to fix the glob bug (without C)?

And yes, you must also do the same for Dir.[] but I left it out for the sake of brevity.

BTW, here’s the patch if you actually care enough to recompile:

--- dir.c.orig	2009-02-21 21:49:09.000000000
+++ dir.c	2009-02-21 21:49:38.000000000
@@ -1659,7 +1659,7 @@
     for (i = 0; i < argc; ++i) {
 	int status;
 	VALUE str = argv[i];
-	StringValue(str);
+	SafeStringValue(str);
 	status = push_glob(ary, RSTRING(str)->ptr, flags);
 	if (status) GLOB_JUMP_TAG(status);

February 13, 2009

Preventing Frame Busting and Click Jacking (UI Redressing)

Filed under: javascript, security — Tags: , — coderrr @ 2:05 am

Some websites are under the impression this very old frame busting code can prevent click jacking attacks:

try {
  if (top.location.hostname != self.location.hostname) throw 1;
} catch (e) {
  top.location.href = self.location.href;
}

Here’s a very simple way around this which works in both FF and IE7. (Update: a way to work around this prevention is here.)

  var prevent_bust = 0
  window.onbeforeunload = function() { prevent_bust++ }
  setInterval(function() {
    if (prevent_bust > 0) {
      prevent_bust -= 2
      // point the top window at a URL whose server responds with 204
      window.top.location = ''
    }
  }, 1)

The server only needs to respond with:

HTTP/1.1 204 No Content

On most browsers a 204 (No Content) HTTP response will do nothing, meaning it will leave you on the current page. But the request attempt will override the previous frame busting attempt, rendering it useless. If the server responds quickly this will be almost invisible to the user.

Update: If the frame busting code is at the beginning of the page, before any content loads, then even though the frame busting will be prevented, so will the loading of the remainder of the page. This means that your content would be hidden and un-clickjackable (only in FF, see below for IE).

So what can a website do to prevent clickjacking? I’m not a security expert but this seems to cover almost all the cases:

First, have your page load with all content hidden using CSS. Something along the lines of:

<body style="display:none" ...>

Then use some variant of the frame busting code, but instead of busting, use it to determine whether or not to display your content:

try {
  if (top.location.hostname != self.location.hostname)
    throw 1;
  // not framed by another host, so reveal the content
  document.body.style.display = 'block';
} catch (e) {
  // possible clickjack attack, leave content hidden
}

This covers most of the cases. It covers IE’s SECURITY=RESTRICTED which allows you to turn off scripting for an iframe. If your site is loaded like this, your script will not run and your content will remain hidden (as mentioned here). And it covers a standard clickjack attack by not displaying your content if it detects that it has been framed. What it doesn’t cover is a user who comes to your site with javascript disabled (who will see nothing). You of course could present them with a message saying javascript is required (using <noscript>). Sucks, but it seems at this point that is the price to pay for clickjacking protection.

If you have or know of a better solution please let me know.

Note to users: NoScript can protect you from clicking on invisible elements.

December 18, 2008

Rails CSRF vulnerability explanation

Filed under: bug, rails, security — Tags: , , — coderrr @ 9:16 pm

I know this is old and I also usually don’t like posting duplicate info that’s already easily findable elsewhere, but since I discovered this I figured I would blog about it.

Since I find security issues interesting I was reading through a Rails lighthouse ticket dealing with authenticity tokens on AJAX requests. I’ll leave out most of the details of the issue they were discussing, but it came down to them deciding to trust that browsers will only ever send the multipart/form-data or application/x-www-form-urlencoded encoding types (enctype) when submitting forms.

We can use the Content-Type of the request because browsers can’t change it. They will only ever send html, url encoded or multipart.

I was in the mood to find a hole so I immediately thought about trying to submit with a different enctype. I tried a few random ones but the browser would always default back to form-urlencoded. So I decided to open up the source for Firefox. After a bit of digging I found this:

} else if (method == NS_FORM_METHOD_POST &&
              enctype == NS_FORM_ENCTYPE_TEXTPLAIN) {
     *aFormSubmission = new nsFSTextPlain(charset, encoder,
                                          formProcessor, bidiOptions);

So immediately I tried <form method="post" enctype="text/plain" ... and voila I could submit without an auth token. So then I thought well maybe Firefox is just crazy so I dug into the Webkit source and found similar code:

} else if (type.contains("text", false) || type.contains("plain", false)) {
    m_enctype = "text/plain";
    m_multipart = false;

In retrospect it would have been a lot quicker if I had found this webpage which lists exactly how each browser deals with enctype. But never trust a website… only trust source code :P Also digging through code is a lot more fun.

There was one more step to make the CSRF scenario complete. Even though I could submit the form without having Rails raise a security exception, it wouldn’t use any of the input values from the form. But I quickly found that any parameters passed in the query string would still be parsed and available. So something like

<form method="post" enctype="text/plain" action="/update_email?">

would do the trick.

So anyway to briefly overview what the vulnerability means: If you are using an affected Rails version (2.1.0-2.1.2 and 2.2.0-2.2.1) anyone can perform CSRF attacks on your application. Meaning if you have a form on your site which updates a user’s email address and a user of your site (who is logged in) visits a malicious page, the attacker can change that user’s email address (after which they can probably reset their password and gain full access to the account).
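To show the shape of the mistake, here’s a small sketch. It is not Rails’ actual code: the constants and helper names are made up, and it only models the assumption that a token check is needed for just those content types a browser form is believed to be able to send.

```ruby
# Content types the flawed assumption treated as the only ones a form can send.
ASSUMED_FORM_TYPES = ['application/x-www-form-urlencoded', 'multipart/form-data'].freeze
# What browsers can really send from a form, per the source code quoted above.
ACTUAL_FORM_TYPES = (ASSUMED_FORM_TYPES + ['text/plain']).freeze

# Strips parameters such as "; charset=UTF-8" and lowercases the media type.
def media_type(content_type)
  content_type.to_s.split(';').first.to_s.strip.downcase
end

# True when a request with this Content-Type must carry a valid token.
def token_required?(content_type, form_types)
  form_types.include?(media_type(content_type))
end

puts token_required?('text/plain', ASSUMED_FORM_TYPES) # false: the forged form slips through
puts token_required?('text/plain', ACTUAL_FORM_TYPES)  # true: the forged form is checked
```

The safer default is arguably the reverse: require the token unless the request is known to be one a browser form cannot produce.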

October 29, 2008

Secure alias method chaining

Filed under: ruby, security — Tags: , — coderrr @ 9:24 pm

Have you ever wanted to redefine a method, chaining it to the original method, but make sure that the original method was uncallable? No? Well yea, most people probably haven’t. But it’s an interesting idea and I actually have a somewhat legitimate use case for it, so I’m going to talk about it. Please note, the below are examples, not what I actually used it for.

The usual way to chain a method is:

class String
  alias_method :original_upcase, :upcase
  def upcase
    s = original_upcase
    "upcased!: #{s}"
  end
end

But if someone wanted to call the original upcase all they would need to do is call original_upcase. Maybe you think you could remove_method :original_upcase. But no, that would break the new upcase when it tries to call the original.

Luckily there is a way to do this with lambdas, method objects, and enclosed local variables.

class String
  m = instance_method(:upcase)
  define_method :upcase do
    s = m.bind(self).call
    "upcased!: #{s}"
  end
end

We have now overwritten the original upcase method without having to first alias it. The original method only exists in the local variable m, which was enclosed in the block sent to define_method. After the end of class String that local variable is now out of scope and effectively non-existent. It only exists in the block, but there is no way to extract the value of it from the block without being able to modify the block.

Of course the method object is still in existence, which means it could be found with

methods = []
ObjectSpace.each_object(UnboundMethod) { |m| methods << m }

This is averted by simply removing the ObjectSpace constant:

class Object
  remove_const :ObjectSpace
end

Update: Pat Maddox pointed out that you could get access to the original Method objects through modification of the Method or UnboundMethod classes. We can prevent this by freezing both classes so that no further modification of them is possible. This includes adding, removing, redefining methods, etc.

[Method, UnboundMethod].each{|klass| klass.freeze }

So there you have it, secure alias method chaining. Or…. can anyone figure out a way to access the original method without using ObjectSpace (and without using C extensions of course)?

Update: Ok this has already been pwned by Maddox. If you redefine Method#call you can get access to the method object. So to keep things secure we’d have to prevent someone from modifying the methods of the Method class. This might be possible using something like I’m not sure if that will prevent all tampering attempts though, I’ll have to look into this.

September 10, 2008

Get the physical location of wireless router from its MAC address (BSSID)

Filed under: network, ruby, security, slashdotted — Tags: , , , — coderrr @ 1:20 am

Update: Here’s a coverage map showing what areas they have data on.

A nice company called SkyHook Wireless has been war driving the country for years. Basically they’ve been paying people to ride around in cars and record the unique IDs (BSSID, aka MAC address) that your wireless routers broadcast. Doesn’t matter if your router has encryption on or not, its BSSID is still public and in their database (if they’ve driven past your house). They’ve then taken all this information and put it in a huge database. They’ve even made a nice little javascript API which given a BSSID will tell you its longitude and latitude. But it will only let you do this for yourself, only sending BSSIDs which you are in range of.

For their API to work it requires you to install a browser extension, which contains, along with the extension source code (which is fully viewable, for Firefox at least), some compiled C++ code (loki.dll for Windows). So what does the proprietary stuff do? It does the actual query to their API. And what does it send? It asks your wireless card to list all of the BSSIDs that you are in range of and sends those along with the signal strength of each.

So why can’t you just send any BSSID you want? Simple, because they don’t tell you how. The actual query is done inside of their compiled code, so it’s a secret and no one will ever figure it out. Well, only the people that try at least. After reverse engineering their code I did a google search on one of the unique-ish terms in the XML that is used as part of the API call and it seems there are others who know how to use this secret API of theirs.

So to keep things short. Here’s how to query their service to find the physical location of any wireless router whose BSSID you know.

Send an HTTPS POST request to with XML in the following format:

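<?xml version='1.0'?>
<LocationRQ xmlns='' version='2.6' street-address-lookup='full'>
  <authentication version='2.0'>
    <simple><username>beta</username><realm></realm></simple></authentication>
  <access-point><mac>AABBCCDDEEFF</mac><signal-strength>-50</signal-strength></access-point>
</LocationRQ>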

You’ll receive back either this (success!):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<LocationRS version="2.6" xmlns=""><location nap="1"><latitude>49.2422507</latitude><longitude>11.4624963</longitude><hpe>150</hpe></location></LocationRS>

or this (failure!):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<LocationRS version="2.6" xmlns=""><error>Unable to locate location</error></LocationRS>

Here’s a dirty little ruby script which does the query based on a BSSID you pass in from the command line:

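require 'net/https'

# normalise aa:bb:cc:dd:ee:ff to AABBCCDDEEFF
mac = ARGV.first.delete(':').upcase

req_body = %{
<?xml version='1.0'?>
<LocationRQ xmlns='' version='2.6' street-address-lookup='full'>
  <authentication version='2.0'>
    <simple><username>beta</username><realm></realm></simple></authentication>
  <access-point><mac>#{mac}</mac><signal-strength>-50</signal-strength></access-point>
</LocationRQ>
}.gsub(/^\s+|[\r\n]/, '')

http = Net::HTTP.new('', 443)
http.use_ssl = true

http.start do |h|
  resp = h.post '/wps2/location', req_body, 'Content-Type' => 'text/xml'

  if resp.body =~ /<latitude>([^<]+).+<longitude>([^<]+)/
    puts "#$1, #$2"
  else
    puts "not found"
  end
end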

getloc.rb aa:bb:cc:dd:ee:ff

Or here’s a one liner for UNIX thanks to George:

MYMAC=AABBCCDDEEFF && curl --header "Content-Type: text/xml" --data "<?xml version='1.0'?><LocationRQ xmlns='' version='2.6' street-address-lookup='full'><authentication version='2.0'><simple><username>beta</username><realm></realm></simple></authentication><access-point><mac>$MYMAC</mac><signal-strength>-50</signal-strength></access-point></LocationRQ>"

* note: This API is how the iPhone’s “Locate me” feature works

June 28, 2008

Detecting SSH tunnels

Filed under: network, security, slashdotted — Tags: , , — coderrr @ 6:35 pm

Italian researchers have published a paper on the Detection of Encrypted Tunnels across Network Boundaries. I came across it in a google search because I’ve been thinking of writing a program which does something similar. It doesn’t seem like anyone else has picked up on this research yet so I thought I should mention it. Here is a link to the actual paper: pdf or scribd.

They claim their technique can differentiate between “normal” ssh or scp sessions and ssh sessions which are being used to tunnel traffic (through ssh’s port forwarding mechanism). This is accomplished through a naive Bayes classifier, which they first trained with “normal” ssh sessions. The two variables used to classify a session are the size of the packets and the difference in arrival time of two consecutive packets. With just these, they can classify with 99% accuracy whether an ssh session is a tunnel. They were also able to classify the actual protocol (P2P, POP, SMTP, HTTP) of the tunneled connection with close to 90% accuracy.
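To give a feel for the approach, here’s a toy naive Bayes classifier over those two features. Everything specific in it is an assumption of mine and not taken from the paper: the Gaussian likelihoods, the class labels, and the handful of made-up training samples.

```ruby
# Toy Gaussian naive Bayes over [packet_size_bytes, inter_arrival_seconds].
class NaiveBayes
  def initialize
    @stats = {} # label => { prior:, means:, vars: }
  end

  # samples_by_label: { label => [[size, gap], ...] }
  def train(samples_by_label)
    total = samples_by_label.values.map(&:size).sum.to_f
    samples_by_label.each do |label, rows|
      cols  = rows.transpose
      means = cols.map { |c| c.sum / c.size.to_f }
      vars  = cols.zip(means).map do |c, m|
        v = c.map { |x| (x - m)**2 }.sum / c.size.to_f
        [v, 1e-9].max # guard against zero variance
      end
      @stats[label] = { prior: rows.size / total, means: means, vars: vars }
    end
    self
  end

  # Returns the label with the highest log posterior.
  def classify(features)
    @stats.max_by { |_, s| log_posterior(features, s) }.first
  end

  private

  # log prior plus the sum of per-feature Gaussian log likelihoods
  def log_posterior(features, s)
    features.each_with_index.reduce(Math.log(s[:prior])) do |acc, (x, i)|
      m, v = s[:means][i], s[:vars][i]
      acc - 0.5 * Math.log(2 * Math::PI * v) - ((x - m)**2) / (2 * v)
    end
  end
end

# Made-up samples: small, slow packets vs. large, rapid ones.
training = {
  interactive: [[64, 0.20], [80, 0.35], [72, 0.15], [96, 0.40]],
  tunnel:      [[1400, 0.002], [1200, 0.004], [1448, 0.001], [900, 0.010]]
}
model = NaiveBayes.new.train(training)
puts model.classify([88, 0.30])
puts model.classify([1300, 0.003])
```

The "naive" part is treating packet size and arrival gap as independent given the class, which is what lets the two log likelihoods simply be added.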

Although their research is quite interesting there are a few things which limit its practicality. They can only detect tunnels going through ssh servers which they control. This is because their detection mechanism can only handle a single authentication type whereas an ssh server can (and usually does) allow multiple (e.g. public-key or password). This requires admins of the server to limit the allowed authentication options to a single consistent choice. They also require the ssh server _and_ client to disable compression. Their technique will also falsely classify a second login attempt (after a failed login) as a tunnel and drop the connection. In their words: “However, this should not be a major problem: simply, if the user is entitled to connect, they will try again.”

So it seems the use of a tool like this would be limited to an extremely controlled environment where users are limited to a white-list set of network protocols (so that they can’t use a different tunneling mechanism, stunnel for example) and only allowed to ssh to servers under the control of the censoring party. In which case you would wonder why the admin wouldn’t just set the ssh servers’ AllowTcpForwarding option to false.

It sounds like this is just preliminary work so maybe their future research will solve all these problems. If perfected this technology could be used by ISPs to block or throttle even encrypted P2P traffic.

I’d also like to note that it would probably be easy to create a tunneling mechanism which thwarts their detection attempts. Knowing that they use packet size and inter packet intervals you could easily manipulate these to match whatever protocol type you wanted.
Update: This actually might not be that easy with P2P traffic since you’d need to mimic another protocol where there is a large amount of uploading going on at the same time as downloading. This is pretty hard to speculate on without actually trying it out. But if you could limit a BitTorrent connection’s upload to 5% of the download and still get reasonable speed you might be able to mimic a tunneled HTTP connection.

While looking around one of the researchers’ web pages (Francesco Gringoli) I found a pretty cool Linux/OSX utility called sshgate. It allows you to transparently tunnel all your connections over ssh. This is great for programs which do not give you the option to use a SOCKS server and which do not play nice with socksification. I haven’t tested it out so I’m not sure if it actually works.

March 23, 2008

Simple text watermarking with Unicode

Filed under: ruby, security — Tags: , — coderrr @ 8:20 am

There’s quite a few papers on the watermarking of text. Most of them are pretty complex. I was trying to think of a less robust, but simpler solution, which could help track text being cross posted on websites and blogs. The idea was that you could provide the same block of text with a different watermark to each user. So if the text then showed up later on a blog, you could tell who had “leaked” the text.

I chose to use the spaces between words as the bits for storing the watermark and have the on bit be marked by inserting a zero width unicode character after the space. I decided against inserting a character inside of words because if the unicode character showed up after pasting the text in a non unicode editor, the text would be very unreadable. Inserting between words also allowed for the text to be searchable in the browser. If you had the text Pe[invisible unicode character]ter in your browser and tried to search for “Pet”, your search wouldn’t match it even though it would look just like “Peter”. Of course terms with spaces are still unsearchable in my approach.

I tried some other approaches using different unicode space characters but I ran into problems with all of them. This one seems to work the best in Firefox and IE. There’s a crapload of unicode code points so there’s probably a bunch of other possibilities. For example, all the alternative punctuation characters.

Currently the watermark must be an unsigned integer. It would be pretty trivial to make it work with a string.

Here’s the usage:

irb(main):003:0> puts Watermark.apply_watermark('Here is a block of text inside of which a number will be hidden!', 42)
Here is a block of text inside of which a number will be hidden!
irb(main):004:0> Watermark.read_watermark('Here is a block of text inside of which a number will be hidden!')
=> 42

note: The string in the above code actually contains the watermark, but you don’t see it… Try copying the text to a non-unicode aware context

Just in case I’ve also provided a method to convert the unicode characters to HTML entities:

irb(main):011:0> Watermark.apply_watermark('Here is a block of text inside of which a number will be hidden!', 42)
=> "Here is \357\273\277a block \357\273\277of text \357\273\277inside of which a number will be hidden!"
irb(main):012:0> Watermark.escape_unicode _
=> "Here is &#xFEFF;a block &#xFEFF;of text &#xFEFF;inside of which a number will be hidden!"

Here’s the implementation:

class Watermark
  INVISIBLE_SPACE = "\357\273\277"  # U+FEFF
  # index 0 is an "off" bit (plain space), index 1 an "on" bit (space + U+FEFF)
  SPACE_CHARS = [" ", " " + INVISIBLE_SPACE]
  SPACE_REGEX = / (?:#{INVISIBLE_SPACE})?/

  class NotEnoughSpacesError < StandardError; end
  class BadWatermarkError < StandardError; end

  class << self
    def apply_watermark(text, watermark)
      verify_watermark_format!(watermark)
      verify_enough_spaces!(text, watermark)

      # least significant bit goes into the first space
      bits = bit_map(watermark)
      text.gsub(/ /) { SPACE_CHARS[bits.shift || 0] }
    end

    def read_watermark(watermarked_text)
      bit_map = watermarked_text.scan(SPACE_REGEX).map {|c| SPACE_CHARS.index(c) }

      bit = -1
      bit_map.inject(0) { |watermark, on_off| watermark | (on_off << (bit += 1)) }
    end

    def escape_unicode(text)
      text.gsub(INVISIBLE_SPACE, "&#xFEFF;")
    end

    def verify_watermark_format!(watermark)
      raise(BadWatermarkError, "only unsigned integers")  if ! watermark.is_a? Integer or watermark < 0
    end

    def verify_enough_spaces!(text, watermark)
      spaces_count = text.scan(/ /).size
      raise NotEnoughSpacesError  if bits_needed(watermark) > spaces_count
    end

    def bits_needed(integer)
      return 1  if integer == 0
      (Math.log(integer+1)/Math.log(2)).ceil  # solve: integer < 2**bits_needed
    end

    # array of 0/1 values, least significant bit first
    def bit_map(integer)
      (0...bits_needed(integer)).map {|bit| [integer & (1 << bit), 1].min }
    end
  end
end

September 14, 2007

Script Accenting

Filed under: security — Tags: — coderrr @ 3:42 pm

Pretty awesome paper with an idea on how to prevent future cross-domain policy vulnerabilities:

Basically what they propose is to symmetrically “encrypt” (XOR) the javascript from each domain with a key that is unique to each domain. They modify the JS engine to decrypt the javascript before running it with the appropriate key from the domain it’s attempting to be executed on. So this way even if an attacker finds a vuln to execute script in another domain the JS engine won’t be able to execute it since it won’t be decrypted with the correct key and would give an error.
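A tiny sketch of the XOR idea, under my own assumptions: the keys here are plain strings made up for two hypothetical domains, and `accent` is an invented name. The real proposal lives inside the browser’s JS engine, which this doesn’t attempt to model.

```ruby
# XORs every byte of `script` with the repeating bytes of `key`.
# XOR is symmetric, so the same call both "accents" and "de-accents".
def accent(script, key)
  key_bytes = key.bytes
  accented = script.bytes.each_with_index.map do |byte, i|
    byte ^ key_bytes[i % key_bytes.size]
  end
  accented.pack('C*').force_encoding(script.encoding)
end

key_a = 'key-for-a.example' # hypothetical per-domain keys
key_b = 'key-for-b.example'
js    = "alert('hello from domain a')"

stored = accent(js, key_a)
puts accent(stored, key_a) == js # right key recovers the script
puts accent(stored, key_b) == js # wrong key yields garbage instead
```

Script that ends up being run under another domain’s key decodes to junk, so the engine fails to parse it rather than executing it.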

Cool idea.
