Following my article A String is not an Error, I want to bring attention to an issue that similarly applies to JavaScript in general, but has special relevance in the Node.JS environment.
The problem boils down to the usage of {} as a data-structure where the keys are supplied by untrusted user input, and the mechanisms that are normally used to assert whether a key exists.
Consider the example of a simple blog created with Express. We decide to store blog posts in memory in a {}, indexed by the blog post slug. For example, this blog post would be posts['an-object-is-not-a-hash'].
We start writing the route /create, which takes a few POST fields like “title” and “slug” and passes it to a Post constructor.
var posts = {};
app.post('/create', function (req, res) {
if (req.body.title && req.body.slug) {
// avoid duplicates
if (!posts[req.body.slug]) {
posts[req.body.slug] = new Post(req.body);
res.send(200);
} else {
res.send(500);
}
}
});
Our first stab for trying to avoid duplicates is checking whether the key exists in our object.
if (!posts[req.body.slug]) {
Normally this would work fine, but let’s consider that the user could pick any of the keys that are present in any JavaScript object as a name:
__defineGetter__ __defineSetter__ valueOf
__lookupGetter__ __lookupSetter__
constructor hasOwnProperty
isPrototypeOf propertyIsEnumerable
toLocaleString toString
If the user wanted to, for example, name his blog post "constructor" our program would behave incorrectly. We therefore change our code to leverage hasOwnProperty, which will allows to check whether the property has been set by us:
if (!posts.hasOwnProperty(req.body.slug)) {
posts[req.body.slug] = new Post(req.body);
res.send(200);
}
Most JavaScript programmers are already familiar with hasOwnProperty, since in the browser world it’s the standard way of writing libraries that work well in environments where Object.prototype is modified, so this addition should come to most as no surprise.
The hasOwnProperty trap
Our program, however, is still susceptible to potential misbehaving. Let’s say our user decides to call his blog post "hasOwnProperty". The first time our check executes, everything will behave correctly, since the check will return false:
∞ ~ node
> var a = {};
> a.hasOwnProperty('hasOwnProperty')
false
Our code would therefore set the hasOwnProperty value in our object to the Post instance, which is an object. We can now simulate what would happen if the user tries that again:
> a.hasOwnProperty = {}
{}
> a.hasOwnProperty('hasOwnProperty')
TypeError: Property 'hasOwnProperty' of object #<Object> is not a function
at [object Context]:1:3
As a result:
- Our code would throw a (potentially) uncaught exception.
- Execution of our
/createroute would be aborted and response would not be sent to the user. - The request would be left hanging until it times out on both ends.
The solution to this problem is to avoid relying on the “hasOwnProperty” provided by the object we’re dealing with, and use the generic one from the Object.prototype. If we execute it on our test subject, true will be returned after we set it as expected:
> Object.prototype.hasOwnProperty.call(a, 'hasOwnProperty')
true
Conclusions
The first conclusion from this experiment is that from the moment we decide to use a JavaScript object as a general-purpose hash table we can no longer rely on any of its inherited properties, specially hasOwnProperty (which we’re normally bound to use). This oversight, as a matter of fact, afflicted the Node.JS core querystring library in the past.
If your code relies heavily on data structures where the possibility for collisions like this exist, you might want to consider having a has utility around:
function has (obj, key) {
return Object.prototype.hasOwnProperty.call(obj, key);
}
And use it as follows:
if (has(posts, req.body.slug)) { }
As a side note, you should generally stick to this method in the browser environment as well. Host objects such as window in older Internet Explorer versions do not have hasOwnProperty, leading to potentially inconsistent behavior in your code.
As correctly pointed out by different comments and tweets, another possibility surfaces in Node.JS:
var posts = Object.create(null);
This will yield an object with a null prototype, which therefore does not inherit any methods, and would simplify our code to look like the original example:
if (!posts[req.body.slug]) {}
Finally, a hashing scheme such as a prefix could do the job as well:
if (!posts['posts-' + req.body.slug]) {
posts['posts-' + req.body.slug] = new Post(req.body);
}
30 Comments
As an alternative, in Node, you have the luxury of using Object.create(null) which will create an object that does not inherit from Object.prototype. This is an ES5 feature that has landed in many browsers already.
ES6 will introduce Map, Set, and WeakMap, which have a stratified API (meaning that you use [] for the API and .{get,set,has,delete} for the contents of the structure) and accept objects as keys. WeakMaps are available in Node v0.6 with the –harmony_weakmaps flag. I’ve heard reports that the remaining features are available in Node v0.7. They are all available in latest Chromium with an option under chrome://flags
To create a clean object, without a prototype you can use `var posts = Object.create(null)`.
regardless of how to check for the existence of a member, you should still be validating them to make sure they’re safe to use.
good coverage though. we need more concrete examples of how failing to sanitize input data can compromise an application’s integrity.
Kris Kowal I thought the same about Object.create(null), but WeakMaps is a news! Thank you.
Fwiw, I can’t read the code snippets, there is not enough contrast. I tried to select the text to see if there was some special styling on text-selection to help, but there is none… :-(
Another solution would be to prefix the slug keys with something like ‘_’ to ensure there will be no collisions with the built-in methods.
With __proto__ being increasingly implemented by JavaScript engines, there is one more trap: using “__proto__” as a key:
http://www.2ality.com/2012/01/objects-as-maps.html
That is such good advice!
Great article, however this danger only applies if you use obj.hasOwnProperty(prop) instead of a simple obj[prop] check, e.g. if you need to work with nulls, or if you need to use the Object for anything else than plain property storage. Example: https://gist.github.com/1638914
Also – “The request would be left hanging until it times out on both ends.” – I assume you use catch-all error handler to prevent node from Exiting upon uncaught Error? If you are relying on that, your code is actually very weak. That’s why I always enclose my async code in try { } catch, but not manually, I wrote a library for that: https://github.com/ypocat/laeh
And what about a hashing function?, so the “slug” is not stored raw in the object, which is usually a bad a idea.
You can also write some kind of “Dictionary” object, something very simple, with just two or three methods: “store”, “find” and other for exporting the contents. That would create a level of abstraction and you’ll be able to substitute the {} for this dictionary without much effort (and would be also easier to test).
Good article though.
dude, I didn’t even read the article lol… but this site design is great! i especially like the good use of circles. bravo
If prefer Object.create(null) in combination with the “in” operator. I think it reads better than any other combination.
var posts = Object.create(null);
if(req.body.slug in posts) {
}
It might be smarter (even though storing slugs in an object is a bad idea anyways), to prepend a prefix.
var posts = {};
app.post(‘/create’, function (req, res) {
if (req.body.title && req.body.slug) {
// avoid duplicates
if (!posts["slug_"+req.body.slug]) {
posts["slug_"+req.body.slug] = new Post(req.body);
res.send(200);
} else {
res.send(500);
}
}
});
Another alternative is to use Object.keys() which ignores the prototype chain. Then you can do simple indexOf(‘slug’). I’ve implemented it in my hash library hasshu. (Shameless plug i know, but let me know what you think)
https://github.com/seppuku/node-hasshu
// The property burglar!
Object.prototype.hasOwnProperty = function() { return true; };
I agree with @Thierry above: more contrast on code snippets is advisable. (Chrome on OS-X here).
Pingback: JsMag - the magazine for JavaScript developers
Pingback: Rounded Corners 322 – Wat /by @assaf
function has (obj, key) {
return Object.prototype.hasOwnProperty.call(obj, key);
}
Hey, you can do that in a better-looking way! :D
var has = Function.prototype.bind.call(Object.prototype.hasOwnProperty)
Pingback: Web Design Weekly #27 | Web Design Weekly
Pingback: Weekly #31 | fitml.com Blog
Great article! I really appreciate the way you explain the possible misbehavior of using an object as a Hash.
For a lightweight string-map, you can check out my dict repository:
https://github.com/DomenicDenicola/dict
The readme has links to alternatives as well.
It’s a shame about `__proto__`, since otherwise `Object.create(null)` would work wonderfully.
Excellent information. This is something that I’ve never personally taken into account on side projects where I use the JS object as a hash. Thanks for the info – its definitely something I will be taking into consideration moving forward.
Instead of `Object.prototype.hasOwnProperty.call(x, y)`, wouldn’t it also be possible to use `Object.hasOwnProperty.call(x, y)`? Since, you know, `Object.hasOwnProperty == Object.prototype.hasOwnProperty`. Saves a property lookup, and is shorter/faster to type.
Pingback: An Object is not a Hash | phato.blog`
Pingback: JSSpy » Js101: object.create
Pingback: An Object is not a Hash | Guillermo Rauch’s Devthought | Itsaat
You can, in some cases, just set the prototype of the posts object to null to achieve the same effect as node’s Object.create(null):
posts = {};
posts.__proto__ = null;
The safest way to do this is probably:
var hasOwnProperty = {}.hasOwnProperty;hasOwnProperty.call(object, property);