December 2nd, 2010

JavaScript Serialization

GSerializer is a library that converts complex JavaScript objects into simple, serialized text strings and back, so you can persist them in cookies or on the server side.

— Matt Fellows —

Have you ever had the tedious job of pipe-delimiting data into a cookie, then manually checking and retrieving it, all just to reassemble a JavaScript object? What if you could instead serialize an entire object to a string, store that string (say, in a cookie) and read it straight back into a JavaScript object without rebuilding it by hand? Serialization offers one way of persisting an entire user experience for retrieval at a later time. When that time comes, retrieving the string from the store and deserializing it gives you back the object that you started with. Provided that an object-oriented approach is followed, in principle any object can be serialized and retrieved from an external store.


What is serialization?

Serialization is the process of storing an object, and its internal state at the point of serialization, in another medium so that it can be transported or retrieved at a later date. When the object is deserialized, it is recreated with its internal state intact.

Object persistence with GSerializer

GSerializer is a JavaScript class that provides methods to serialize and deserialize any JavaScript object, using an XML format.

These are the basic steps that are required to serialize any JavaScript object with GSerializer:

  1. Create and populate an object
  2. Create an instance of the GSerializer class
  3. Call the serialize method of the GSerializer object and hold onto the serialized string
  4. Store the serialized string somewhere (e.g. a cookie, database or file)

To deserialize the object:

  1. Retrieve the serialized string from its store
  2. (Optional) Check for code tampering (see Security)
  3. Call the deserialize method of the GSerializer and store the resulting object
  4. You can now use that object!

The Code


// The object to serialize
var myObject = new MyObject();

// The Serializer
var serializer = new GSerializer();

// Grab the serialized XML
var serializedXML = serializer.serialize(myObject, 'MyObject');

// Deserialize the object from the serialized XML string
var deserializedObject = serializer.deserialize(serializedXML);

Download the library from the Google Code project page.

The Examples

  • See a demonstration of JavaScript deserialization using Cookies to persist the object.
  • Download the example from the Google Code project page.

Storage

So where should this serialized object be stored? There are many possibilities, but it comes down to choosing between client-side and server-side:

  1. Cookie (client-side)
  2. Database (server-side)

Storage in Cookies

Since JavaScript is a client-side scripting language, cookies are a logical choice. Cookies are easily accessed from JavaScript, so persisting and retrieving an object is not limited by bandwidth. In fact, in a recent application I used cookies to store a custom Session object. Each time the Session object was updated (due to an interface change, a search, etc.) I wrote the object to the cookie, so it was essentially storing the live session. This kind of speed and responsiveness is not available on the server side, where calls to a database are limited by bandwidth.

However, cookies suffer from the following drawbacks:

  1. Limited storage size (4 KB or so)
  2. Limited number of cookies (30 or so for a particular domain)
  3. They can be accessed on the client’s machine and are thus susceptible to tampering. Provided no sensitive data is stored in these objects, this may not be an issue
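
As a minimal sketch of cookie persistence with the size limit in mind, the helpers below build and parse cookie strings for a serialized object. The names buildCookie and readCookie and the 4096-byte figure are illustrative assumptions, not part of GSerializer, and real limits vary between browsers.

```javascript
// Hypothetical helpers, not part of GSerializer.
// Approximate per-cookie limit; the real limit varies by browser.
const MAX_COOKIE_BYTES = 4096;

// Build a cookie string for the serialized value, or throw if it is too large.
function buildCookie(name, serialized, days) {
  const expires = new Date(Date.now() + days * 24 * 60 * 60 * 1000).toUTCString();
  // encodeURIComponent escapes ';', '=' and other characters that would
  // otherwise break the cookie syntax. The result is plain ASCII, so its
  // length in characters equals its length in bytes.
  const pair = encodeURIComponent(name) + '=' + encodeURIComponent(serialized);
  if (pair.length > MAX_COOKIE_BYTES) {
    throw new Error('Serialized object is too large for a cookie');
  }
  return pair + '; expires=' + expires + '; path=/';
}

// Find a named value in a cookie string such as document.cookie.
// Returns null when the cookie is not present.
function readCookie(cookieString, name) {
  const prefix = encodeURIComponent(name) + '=';
  const parts = cookieString.split(';');
  for (let i = 0; i < parts.length; i++) {
    const part = parts[i].trim();
    if (part.indexOf(prefix) === 0) {
      return decodeURIComponent(part.substring(prefix.length));
    }
  }
  return null;
}
```

In a browser, the string returned by buildCookie would be assigned to document.cookie, and the value returned by readCookie(document.cookie, 'session') would be passed to the deserialize method.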

Storage in a database

If cookies can’t get the job done, then perhaps a database can. A database gets around the three main drawbacks of cookies but is more complicated to implement. For example, you would need some service on the server side to store and retrieve the code for a particular user. Live session writing is probably not feasible with a database either; instead, the session would likely be retrieved once at the start and saved once when the user is leaving, or perhaps at the user’s request. If, however, the browser crashes or the computer shuts down unexpectedly, any changes since the last save will be lost.

Advantages

Using GSerializer provides a number of advantages for a JavaScript application. It:

  1. Provides a means to fix the back button: many AJAX applications suffer if someone accidentally clicks the back button in their browser. Instance variables are lost, the interface might change, etc. Storing the session in an object means that when the user comes back, you can reload that object and the user never has to know the difference.
  2. Provides a common way to store and retrieve any object. For example, one could write a generic method that takes an object, serializes it and stores it, and another method that retrieves the object from its storage and loads it back into memory. These could then be used for any object, removing the need for object-specific string manipulation of cookies.
  3. Provides an extensible and re-usable component.
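
The generic store and retrieve methods mentioned in point 2 might look something like the sketch below. The names saveObject and loadObject are hypothetical, as is the store interface (anything with set and get, such as a Map or a thin wrapper around cookies). The serializer is assumed to expose serialize and deserialize in the shape shown in the earlier example; passing the storage key as the second argument to serialize is also an assumption.

```javascript
// Hypothetical generic helpers, not part of GSerializer.
// `serializer` needs serialize(obj, name) and deserialize(string) methods.
// `store` needs set(key, value) and get(key) methods.
function saveObject(store, serializer, key, obj) {
  store.set(key, serializer.serialize(obj, key));
}

function loadObject(store, serializer, key) {
  const serialized = store.get(key);
  // Nothing has been stored under this key yet.
  if (serialized === undefined || serialized === null) {
    return null;
  }
  return serializer.deserialize(serialized);
}
```

Because neither function knows anything about the object being stored, the same pair can be reused for a session object, user preferences or anything else the serializer can handle.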

Security

There are a number of security risks and precautions that you should take when using this technique:

  1. GSerializer uses the eval function to generate objects, so you must take every precaution to ensure that the data coming into the serializer is from a trusted source, and that the source has not been tampered with (read more)
  2. Interception over a non-secure connection is possible, so think carefully about persisting sensitive information (read more)
  3. Malicious tampering with the serialized object is possible

Security: Generating a unique key

One way to increase confidence that the code has not been tampered with is to generate a unique key from an undisclosed server-side hash function (or something similar) based on the contents of the serialized XML string. This key is stored on the server side, so that when the serialized string is later retrieved, the hash function can be run on it again to produce a second key. If the two keys are identical, then the string has not changed between writing and retrieving.
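
One possible way to realise this idea is sketched below. It swaps the undisclosed hash function for a keyed hash (HMAC-SHA256) from the Node.js crypto module, so the secret key rather than the algorithm is what stays hidden. The choice of Node.js and HMAC, and the names signSerialized and isUntampered, are assumptions made for illustration only.

```javascript
const crypto = require('crypto');

// Compute a keyed hash (HMAC-SHA256) of the serialized string.
// `secret` must never leave the server.
function signSerialized(serialized, secret) {
  return crypto.createHmac('sha256', secret).update(serialized, 'utf8').digest('hex');
}

// Return true only if the string still matches the key stored at write time.
function isUntampered(serialized, storedKey, secret) {
  const expected = Buffer.from(signSerialized(serialized, secret), 'hex');
  const actual = Buffer.from(storedKey, 'hex');
  // timingSafeEqual throws when the lengths differ, so compare lengths first.
  return expected.length === actual.length && crypto.timingSafeEqual(expected, actual);
}
```

The server would call signSerialized when the string is written and keep the result, then call isUntampered before handing a retrieved string to the deserializer.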

Future

I can see a number of ways in which this technique could be enhanced:

  1. Compression of the output string to save space (particularly in the case of using cookies for persistence)
  2. Encryption methods to prevent interception and tampering
  3. Better encoding of functions and objects. Currently, objects of the same type are duplicated in the XML rather than referenced. Ideally only the instance variable/value pairs of the class should be stored and the rest of the class should be referenced to be more efficient on space.
  4. Security function to prevent unsafe code from being evaluated by the eval() function during deserialization

JSON vs JavaScript Serialization

Some of you may be wondering: why not use JSON? Aren’t there serializer libraries for JSON similar to this? The simple answer is yes, there are (json.js from JSON.org), but JSON will not work for our purpose. JSON is a notation for passing data and will not encode functions, which are crucial to deserialization in our case. A second drawback is that one must write in the JavaScript literal notation to produce a JSON string, which in my opinion is awkward and unreadable. That said, JSON is an internationally recognised standard for passing JavaScript object data and may well be the real answer you are looking for.
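
To illustrate the point about functions, here is a small sketch using the JSON object built into current JavaScript engines. The Counter type is a made-up example.

```javascript
// A hypothetical type that holds both data and behaviour.
function Counter(start) {
  this.count = start;
  this.increment = function () {
    this.count += 1;
    return this.count;
  };
}

const original = new Counter(5);

// Function-valued properties are omitted from the JSON output.
const json = JSON.stringify(original);

// Parsing gives back a plain object that holds only the data:
// restored.count is 5, but restored.increment is undefined.
const restored = JSON.parse(json);
```

The data survives the round trip, but the restored value is no longer a Counter and has lost its increment method.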

When to use JSON and when to use serialization

If you only need to store data and not the objects themselves, then JSON might be the way to go. For example, perhaps you have a Session object that takes only a name and a list of parameters describing the user’s session in its constructor. If, however, you require a complex Session object containing a number of other complex objects, then it is much easier to use serialization, as it takes the hassle out of re-populating each of the objects. Upon deserialization, the Session wrapper object will already contain references to these pre-populated objects and their associated methods.
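
To give a feel for the re-populating that the JSON route involves, here is a sketch with two made-up types, Session and Search. Every level of nesting needs its own hand-written rebuilding code to get the methods back.

```javascript
// Hypothetical types for illustration only.
function Search(term) {
  this.term = term;
}
Search.prototype.describe = function () {
  return 'Search for ' + this.term;
};

function Session(name, searches) {
  this.name = name;
  this.searches = searches;
}
Session.prototype.lastSearch = function () {
  return this.searches[this.searches.length - 1];
};

// With JSON, each nested object must be reconstructed through its
// constructor, otherwise it comes back as plain data without methods.
function sessionFromJSON(json) {
  const data = JSON.parse(json);
  const searches = data.searches.map(function (s) {
    return new Search(s.term);
  });
  return new Session(data.name, searches);
}
```

Each new type added to the Session means another block of rebuilding code like this, which is the hassle that serializing the whole object is meant to avoid.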

Acknowledgements

When I first needed a serializer, I naturally started with Google. I found a number of promising starts, but none worked for complex objects and none actually serialized functions along with the objects.

The code used in GSerializer.js is based on code from dotnetremoting.com (which is unfortunately down at the time of writing). I chose this as a starting point because I admired the use of recursion to achieve the goal. With this base, I made the following changes:

  1. turned it into a proper class
  2. tidied up the code to use consistent naming conventions, and commented functions
  3. introduced methods to properly serialize functions
  4. introduced methods to properly deserialize arrays and functions