What is serialization?
Serialization is the process of storing an object, together with its internal state at the point of serialization, in another medium so that it can be transported or retrieved at a later date. When the object is deserialized, it is recreated with its internal state intact.
To serialize an object:
- Create and populate an object
- Create an instance of the GSerializer class
- Call the serialize method of the GSerializer object and hold onto the serialized string
- Store the serialized string somewhere (e.g. a cookie, database or file)
To deserialize the object:
- Retrieve the serialized string from its store
- (Optional) Check for code tampering (see Security)
- Call the deserialize method of the GSerializer and store the resulting object
- You can now use that object!
    // The object to serialize
    var myObject = new MyObject();
    // The serializer
    var serializer = new GSerializer();
    // Grab the serialized XML
    var serializedXML = serializer.serialize(myObject, 'MyObject');
    // Deserialize the object from the serialized XML string
    var deserializedObject = serializer.deserialize(serializedXML);
- Download the library from the Google Code project page.
- Download the example from the Google Code project page.
So where should this serialized object be stored? There are many possibilities, but it comes down to choosing between client-side and server-side:
- Cookie (client-side)
- Database (server-side)
Storage in cookies
Cookies suffer from the following drawbacks:
- Limited storage size (4 KB or so)
- Limited number of cookies (30 or so for a particular domain)
- Can be accessed on the client’s machine and are thus susceptible to tampering. Provided no sensitive data is stored in these objects, this may not be an issue.
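Given these limits, it may be worth checking the size of a serialized string before writing it to a cookie. The following is a minimal sketch, assuming a limit of about 4096 bytes per cookie and an ASCII cookie name. The names `MAX_COOKIE_BYTES`, `buildCookie` and `readCookie` are hypothetical helpers for illustration and are not part of GSerializer.

```javascript
// Hypothetical helpers; not part of GSerializer.
// Assumed limit: roughly 4 KB per cookie (name + value + attributes).
var MAX_COOKIE_BYTES = 4096;

// Build a cookie string for a serialized object, or return null if it
// would exceed the assumed size limit.
function buildCookie(name, serializedString, days) {
  // Cookie values may not contain characters such as ';', ',' or spaces,
  // so the serialized string is percent-encoded first.
  var value = encodeURIComponent(serializedString);
  var expires = new Date(Date.now() + days * 24 * 60 * 60 * 1000);
  var cookie = name + '=' + value +
    '; expires=' + expires.toUTCString() + '; path=/';
  // After encoding, the string is plain ASCII, so length equals bytes.
  return cookie.length <= MAX_COOKIE_BYTES ? cookie : null;
}

// Read a value back out of a "name=value; name2=value2" string,
// such as the browser's document.cookie.
function readCookie(name, cookieHeader) {
  var parts = cookieHeader.split('; ');
  for (var i = 0; i < parts.length; i++) {
    var eq = parts[i].indexOf('=');
    if (parts[i].substring(0, eq) === name) {
      return decodeURIComponent(parts[i].substring(eq + 1));
    }
  }
  return null;
}
```

In a browser, the result of `buildCookie('session', serializedXML, 7)` could be assigned to `document.cookie` when it is not null, and `readCookie('session', document.cookie)` would return the serialized string for deserialization. A null result is a signal to fall back to server-side storage.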
Storage in a database
If cookies can’t get the job done, then perhaps a database can. A database gets around the three main drawbacks of cookies but is more complicated to implement. For example, you would need some service on the server side to store and retrieve the serialized string for a particular user. Live session writing is probably not feasible with a database either; instead the session would likely be retrieved once at the start and saved once when the user is leaving, or perhaps at the user’s request. If, however, the browser crashes or the computer shuts down unexpectedly, the changes since the last save will be lost.
- Provides a means to fix the back button: Many AJAX applications suffer if someone accidentally clicks the back button in their browser. Instance variables are lost, the interface might change, and so on. Storing the user’s session in an object means that when they come back, you can reload that object and the user never has to know the difference.
- Provides a common way to store and retrieve any object. For example, one could write a generic method that takes an object, serializes it and stores it, and another method that retrieves the object from its storage and loads it back into memory. These could then be used for any object, removing the need for object-specific string manipulation of cookies.
- Provides an extensible and re-usable component
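The generic store-and-retrieve idea above can be sketched as follows. This is only an illustration under stated assumptions: `saveObject`, `loadObject` and `MemoryStore` are hypothetical names, the store interface (`setItem`/`getItem`) is an assumption, and a JSON-based stand-in is used in place of GSerializer so that the sketch runs on its own.

```javascript
// Hypothetical helpers; not part of GSerializer.
// `serializer` is any object with serialize(obj, name) and
// deserialize(string) methods, matching the GSerializer calls used earlier.
// `store` is any object with setItem(key, value) and getItem(key) methods,
// e.g. a thin wrapper around cookies or a server-side service.
function saveObject(store, serializer, key, obj, typeName) {
  store.setItem(key, serializer.serialize(obj, typeName));
}

function loadObject(store, serializer, key) {
  var serialized = store.getItem(key);
  // Nothing stored under this key yet.
  if (serialized === null || serialized === undefined) {
    return null;
  }
  return serializer.deserialize(serialized);
}

// In-memory store used only to demonstrate the idea.
function MemoryStore() {
  this.items = {};
}
MemoryStore.prototype.setItem = function (key, value) {
  this.items[key] = value;
};
MemoryStore.prototype.getItem = function (key) {
  return Object.prototype.hasOwnProperty.call(this.items, key)
    ? this.items[key]
    : null;
};

// Stand-in serializer so the sketch runs without GSerializer loaded.
// It uses JSON and therefore only round-trips plain data.
var standInSerializer = {
  serialize: function (obj, name) { return JSON.stringify(obj); },
  deserialize: function (str) { return JSON.parse(str); }
};

var store = new MemoryStore();
saveObject(store, standInSerializer, 'session',
  { user: 'alice', tabs: [1, 2] }, 'Session');
var restored = loadObject(store, standInSerializer, 'session');
```

In real use, a `new GSerializer()` instance would be passed in place of `standInSerializer`, and the store would wrap whichever persistence medium was chosen. Because the two functions know nothing about the object's shape, the same pair works for any type.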
There are a number of security risks to be aware of, and precautions to take, when using this technique:
- GSerializer uses the eval function to generate objects, so you must take every precaution to ensure that the data coming into the serializer is from a trusted source and has not been tampered with.
- Interception over a non-secure connection is possible, so think carefully about persisting sensitive information.
- Malicious tampering of the serialized object is possible.
Security: Generating a unique key
One approach that could be used to increase confidence that the code has not been tampered with is to generate a unique key from an undisclosed server-side hash function (or something similar) based on the contents of the serialized XML string. This key can be stored on the server side so that when the serialized string is retrieved, the hash function can be called again on the data to produce another key. If the two keys are identical, then it is very unlikely that the code has changed between writing and retrieving.
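As a rough illustration of this key check, the sketch below swaps the undisclosed hash function for a standard keyed hash (HMAC-SHA256), so that the secret is the key rather than the function itself. That substitution, the use of Node.js and its built-in `crypto` module on the server, and the names `signSerialized` and `isUntampered` are all assumptions made for the example.

```javascript
// Server-side sketch for Node.js; the function names are hypothetical.
var crypto = require('crypto');

// Compute a keyed hash (HMAC-SHA256) of the serialized string.
// `secretKey` must never leave the server.
function signSerialized(serializedString, secretKey) {
  return crypto.createHmac('sha256', secretKey)
    .update(serializedString, 'utf8')
    .digest('hex');
}

// Recompute the key for the retrieved string and compare it with the
// key that was stored at write time.
function isUntampered(serializedString, storedKey, secretKey) {
  var expected = Buffer.from(signSerialized(serializedString, secretKey), 'hex');
  var actual = Buffer.from(String(storedKey), 'hex');
  // timingSafeEqual throws on a length mismatch, so check length first.
  return expected.length === actual.length &&
    crypto.timingSafeEqual(expected, actual);
}
```

The server would call `signSerialized` when the string is written and keep the result, then call `isUntampered` before passing a retrieved string to `deserialize`. A failed check means the string should be discarded rather than evaluated.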
I can see a number of ways in which this technique could be enhanced:
- Compression of the output string to save space (particularly in the case of using cookies for persistence)
- Encryption methods to prevent interception and tampering
- Better encoding of functions and objects. Currently, objects of the same type are duplicated in the XML rather than referenced. Ideally only the instance variable/value pairs of the class should be stored and the rest of the class should be referenced to be more efficient on space.
- Security function to prevent unsafe code from being evaluated by the eval() function during deserialization
When to use JSON and when to use serialization
If you only need to store data and not the objects themselves, then JSON might be the way to go. For example, perhaps you have a Session object that takes only a name and a list of parameters describing the user’s session in its constructor. If, however, you require a complex Session object containing a number of other complex objects, then it is much easier to use serialization, as it takes the hassle out of re-populating each of the objects. Upon deserialization, the Session wrapper object will already contain references to these pre-populated objects and their associated methods.
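The difference can be seen in a small sketch of a JSON round trip. The `Session` type here is hypothetical and exists only for illustration; the behaviour shown is that of the standard `JSON.stringify` and `JSON.parse` functions.

```javascript
// Hypothetical Session type used only for illustration.
function Session(name) {
  this.name = name;
  // A function stored directly on the instance.
  this.greet = function () { return 'Hello, ' + this.name; };
}
// A method shared through the prototype.
Session.prototype.describe = function () {
  return 'Session for ' + this.name;
};

var original = new Session('alice');
var copy = JSON.parse(JSON.stringify(original));

// Data survives the round trip...
var nameSurvived = copy.name === 'alice';
// ...but the instance function is dropped by JSON.stringify, and the copy
// is a plain object that no longer has Session.prototype behind it.
var greetSurvived = typeof copy.greet === 'function';
var describeSurvived = typeof copy.describe === 'function';
var stillASession = copy instanceof Session;

// To get behaviour back, the object has to be re-populated by hand.
var rebuilt = new Session(copy.name);
```

For a single flat object the manual rebuild is trivial, which is why JSON is a reasonable fit there. The work grows with every nested object that needs its own constructor call, which is the hassle that serializing the objects themselves avoids.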
When I first needed a serializer, I naturally started with Google. I found a number of promising starts, but none worked for complex objects and none actually serialized functions along with the objects.
The code used in GSerializer.js is based on code from dotnetremoting.com (which is unfortunately down at the time of writing). I chose this as a starting point because I admired the use of recursion to achieve the goal. With this base, I made the following changes:
- turned it into a proper class
- tidied up the code to use consistent naming conventions, and commented functions
- introduced methods to properly serialize functions
- introduced methods to properly deserialize arrays and functions