The Lazy Man’s URL Parsing in JavaScript

Have you ever needed to parse a URL using regular expressions? Regular expressions aren't easy to write (for a lot of people, including myself), and it's even tougher to test whether one is reliable in every situation. You could, of course, just copy and paste a regular expression (or function or library) that someone else developed, but I propose that there is a simpler and more concise way of parsing URLs that doesn't require any regular expressions.

This method, originally posted on GitHub by John Long (though probably not originally discovered by him), uses the native parsing abilities built into the DOM to give you access to the parts of a URL just by querying properties of an anchor element. Check it out:

var parser = document.createElement('a');
parser.href = "http://example.com:3000/pathname/?search=test#hash";

parser.protocol; // => "http:"
parser.hostname; // => "example.com"
parser.port;     // => "3000"
parser.pathname; // => "/pathname/"
parser.search;   // => "?search=test"
parser.hash;     // => "#hash"
parser.host;     // => "example.com:3000"

This code is pulled directly from the Gist that John Long posted at the above link. I haven't seen any statements about which browsers this works with, but I assume that, at a minimum, it works with all modern browsers. If you don't trust it, you can either test it yourself or use a library such as URI.js.
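If you want the parts as a plain object rather than reading them off the element each time, a small wrapper can copy them over. The sketch below is only an illustration: `readUrlParts` is a hypothetical name, and it assumes you are dealing with http(s)-style URLs. It also adds a leading "/" to the pathname when one is missing, since a reader reports in the comments that IE leaves it off.

```javascript
// Hypothetical helper (not part of the original Gist): copies the URL parts
// off an anchor-like object into a plain object.
// In a browser, pass an <a> element whose href has already been set.
function readUrlParts(anchor) {
    var pathname = anchor.pathname || '';

    // Assumption: http(s)-style URLs, where the path should start with "/".
    // IE reportedly omits the leading slash on anchor.pathname.
    if (pathname.charAt(0) !== '/') {
        pathname = '/' + pathname;
    }

    return {
        protocol: anchor.protocol,
        hostname: anchor.hostname,
        port: anchor.port,
        pathname: pathname,
        search: anchor.search,
        hash: anchor.hash,
        host: anchor.host
    };
}

// Browser usage (needs a DOM, so it is left commented out here):
// var a = document.createElement('a');
// a.href = "http://example.com:3000/pathname/?search=test#hash";
// var parts = readUrlParts(a);
```

Because the function only reads properties, you can also hand it `window.location` or a stub object when testing.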

One of the coolest things about this method is that you can assign a partial/relative URL to the href property and the browser will turn it into a full URL, just like it resolves partial URLs on real HTML links. For example, try this in your browser's console on this page:

var parser = document.createElement('a');
parser.href = "/";

parser.href; // => "http://www.joezimjs.com/"

You could also just use an empty string for the href and it would give you your current URL (not including the hash, though), but this is a waste because window.location has the exact same properties, so you don't even need to create an anchor element for that.

In all of these examples, you still need to parse the query string, but at least you've got it pulled out of the URL.
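For that last step, here is a minimal sketch of turning the search string into an object. `parseQuery` is a hypothetical helper of my own naming, not something built into the browser or taken from the Gist. It assumes simple `key=value` pairs joined by `&`, keeps only the last value when a key repeats, and will throw if the string contains malformed percent-escapes (because `decodeURIComponent` does).

```javascript
// Hypothetical helper: turns a search string such as "?search=test&page=2"
// into a plain object. Repeated keys keep the last value seen.
function parseQuery(search) {
    var result = {};

    // Drop the leading "?" if there is one
    var query = search.charAt(0) === '?' ? search.slice(1) : search;
    if (query === '') {
        return result;
    }

    var pairs = query.split('&');
    for (var i = 0; i < pairs.length; i++) {
        // Skip empty segments such as the middle of "a=1&&b=2"
        if (pairs[i] === '') {
            continue;
        }

        // Split on the first "=" only, so values may contain "="
        var index = pairs[i].indexOf('=');
        var rawKey = index === -1 ? pairs[i] : pairs[i].slice(0, index);
        var rawValue = index === -1 ? '' : pairs[i].slice(index + 1);

        // "+" stands for a space in form-encoded query strings
        var key = decodeURIComponent(rawKey.replace(/\+/g, ' '));
        var value = decodeURIComponent(rawValue.replace(/\+/g, ' '));

        result[key] = value;
    }

    return result;
}

parseQuery("?search=test"); // => { search: "test" }
```

In a browser you would feed it `parser.search` from the anchor element above, or `window.location.search` for the current page.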

UrlParsing.com/Conclusion#Paragraph

I know this is shorter than my usual posts, but I think you still learned something pretty valuable, assuming you hadn't already heard about this somewhere else. I definitely wish I had known about this a while back when I was actually doing a project where I needed to parse a URL. Also, don't forget that you need to sign up for the Wijmo contest by midnight tonight (May 7, 2012) in order to be entered into the drawing. Make sure to spread the parsing technique and Wijmo contest news around to all of your JavaScript programming friends and leave your comments below. Happy Coding!

EDIT May 5, 2012 @ 1:30PM:

I found a post stating that this does not work in IE6 because the href property isn't resolved into a full URL unless it passes through the HTML parser. There is a simple workaround, though, that forces the HTML parser to go over it:

function canonicalize(url) {
    var div = document.createElement('div');
    div.innerHTML = "<a></a>";
    div.firstChild.href = url; // Ensures that the href is properly escaped
    div.innerHTML = div.innerHTML; // Run the current innerHTML back through the parser
    return div.firstChild.href; 
}

About the Author

Author: Joe Zim


Joe Zimmerman has been doing web development ever since he found an HTML book on his dad's shelf when he was 12. Since then, JavaScript has grown in popularity and he has become passionate about it. He also loves to teach others through his blog and other popular blogs. When he's not writing code, he's spending time with his wife and children and leading them in God's Word.


  • http://profiles.google.com/russell.ballestrini Russell Ballestrini

    If you are coding in Python and not JavaScript, check out https://bitbucket.org/russellballestrini/miniuri/src/tip/miniuri.py

    It's a short Python module for URI parsing.

  • http://twitter.com/IORayne Anthony Grimes

    It’s worth noting that the URI spec actually has a regex in the appendix for parsing a URI, so it isn’t that difficult. That said, obviously avoiding regexes is a good thing, so I support this post.

  • pyrotechnick

    node’s CommonJS-compliant url parser is easily ported to the browser: http://nodejs.org/api/url.html

    • http://www.joezimjs.com Joe Zimmerman

       I wasn’t saying that there weren’t libraries (I mentioned libraries twice). I was just noting that there is a simple way to do it without a library or with a very small library based off of this method. Node’s URL Parser function is over 200 lines of code (though well documented) which is several times as much code. I’m currently turning this method into a jQuery plugin and even though I’m adding several things to it (including parsing the search string into an object) and generous comments, it’s still only 70 or so lines of code.

  • http://lukapeharda.com/ Luka Peharda

    Wow, great share!

  • Matt Slocum

    Warning: parser.pathname is not returning a starting ‘/’ in IE, but it does return opening ‘/’ on window.location.pathname.

    • http://www.joezimjs.com Joe Zimmerman

       Yeah, I heard about this. I’m creating a plugin that uses this technique for parsing the URL, and I’ll be fixing this problem. The plugin will also parse the query string into an object, so that should be awesome.

  • Tom

    Lovely code, thanks very much. Really useful. Not too bothered about IE6 support in this situation as I’m using this code as a nice little user interface enhancement, so the page will still work fine for IE6 dinosaurs, but just a little smoother for more modern browsers.

  • Fabianx

    Wow, this is a great technique!

  • JamesMGreene

    I’ve tried this, too. Unfortunately, for those of us who still have to support IE < 10 (yes, 10), this approach has plenty of bugs. I started fixing them one by one and basically ended up writing a whole URI parsing library in the end… should've just used jsUri/URI.js/purl/etc. from the get-go, I guess.