Building a JSON Parser in JavaScript

Hello,

My recent coding time has gone into a new project I’m developing with a good friend of mine. The project is called OpenWeatherJS, an open source JavaScript library that processes weather information retrieved from the OpenWeatherMap API.

I should also mention that the code examples are written in TypeScript.

I was immediately assigned the task of building a JSON parser. I found this very interesting since I had never built one before, so this post documents my progress learning about parsing and building my first parser for a JavaScript library.

First, I learned that to retrieve data by sending a request to a website we can use XMLHttpRequest, which browsers provide to JavaScript. It lets us send a GET request to a specified URL, and error handling is not hard to implement.

So let's look at some code:

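var xmlHttp = new XMLHttpRequest();

xmlHttp.onreadystatechange = function() {
    // readyState 4 means the request is complete; status 200 means it succeeded.
    if (xmlHttp.readyState == 4 && xmlHttp.status == 200) {
        var myParsedJSON = JSON.parse(xmlHttp.responseText);
    }
};
// url is assumed to hold the address to request; true makes the request asynchronous.
xmlHttp.open("GET", url, true);
xmlHttp.send();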

Let's walk through this initial code step by step. First we create a new XMLHttpRequest object. Next we assign the event handler that is called whenever the readystatechange event fires. The readyState values checked in the if statement come from the XMLHttpRequest documentation:

Value  State             Description
0      UNSENT            open() has not been called yet.
1      OPENED            open() has been called; send() has not been called yet.
2      HEADERS_RECEIVED  send() has been called, and headers and status are available.
3      LOADING           Downloading; responseText holds partial data.
4      DONE              The operation is complete.

Inside the if statement we know the request has completed (readyState 4) and succeeded (status 200). We then read responseText, which holds the body of the response as text. We expect the URL to return JSON, so we call JSON.parse on the response text, and our variable myParsedJSON then contains the parsed JSON from the website.
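To make the JSON.parse step concrete, here is a minimal sketch that runs without any network request. The response text and its field names are made up for illustration; they are not the actual OpenWeatherMap format.

```typescript
// Hypothetical response body, shaped loosely like a weather payload.
// The field names are invented for this example only.
const responseText: string = '{"name": "London", "main": {"temp": 280.5}}';

// JSON.parse turns the text into a plain JavaScript object.
const parsed: any = JSON.parse(responseText);

console.log(typeof parsed);    // "object"
console.log(parsed.name);      // "London"
console.log(parsed.main.temp); // 280.5

// Text that is not JSON makes JSON.parse throw a SyntaxError.
let failed: boolean = false;
try {
    JSON.parse("<html>Not JSON</html>");
} catch (e) {
    failed = e instanceof SyntaxError;
}
console.log(failed); // true
```

The second half is the reason the later sections add validation: a server can answer with status 200 and still send something that is not JSON.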

Since the handler only runs when the ready state changes, nothing happens until we start the request. So below the handler we call open to set up the GET request to the target URL; the third argument says whether the request should be asynchronous.

Finally, we send the request with the send function.

So this is a working piece of code and we have built a simple JSON parser, but what if we want to take the next step? In our project OpenWeatherJS we wanted to build a library that works reliably. If we shipped the code above in a library meant for public use, there is a high probability that someone using it would run into a failing request. We might also want to time out a request so it does not wait forever, check that the specified URL really is a URL, or check that the JSON we parsed is actually JSON.

All of this is something we implemented in our library, and I will explain how.

After this initial part was finished I moved on to checking that the URL was valid. So how does one do this? We use regular expressions (regex). A regular expression describes a pattern that we can test a string against. Regular expressions can be difficult and take time to understand. Here is a resource for learning them: http://regexone.com/

Let's look at the code I implemented and then explain it.


/**
* Validates provided value is a URL and throws a new TypeError
* with specified message otherwise.
*
* @param value - a value being tested.
* @param message - a short description of the assertion.
*/
static isUrl(value: string, message: string): void {
    // Optional scheme, dot-separated host, optional port and optional path.
    var URLValidationRegExp = /(^|\s)((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?)/gi;
    var matcher = URLValidationRegExp;
    // match() returns an array of matches, or null when nothing matched.
    var match = value.match(matcher);
    if (!match) {
        throw new TypeError(message);
    }
}

So what we have here is a function that takes a URL to check as well as a message to use if the URL is invalid. The function starts with a variable, URLValidationRegExp, that holds the regular expression I mentioned. Without diving too deep into regex, I’ll just mention that this pattern suits today's URLs: a URL doesn’t have to follow the standard “www.example.com” form, it can be for example “api.example.com” or “example.com”, and this regular expression accepts a wide variety of them. Moving on, the matcher variable refers to the same regular expression, and the match variable holds the result of value.match(matcher), which is an array of matches, or null if nothing matched. If match is null, the value doesn’t match the regular expression and is not a valid URL. That's when we throw a new TypeError with the message from the parameters.

After that we can use the validation function inside our parsing function to check that we are sending the request to a valid URL.

Next, while we are validating values, let's look at how to validate that the JSON we parsed is actually JSON. Take a look at this function:
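To get a feel for what this pattern accepts, here is a small sketch that tries it on a few strings. The helper name looksLikeUrl and the sample inputs are made up for this example. The pattern is the one from isUrl, written on one line with no internal spaces and without the g flag, since a global regex remembers its position between test() calls.

```typescript
// Pattern taken from isUrl, minus the g flag (see the note above).
const urlPattern: RegExp = /(^|\s)((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?)/i;

// Hypothetical helper: returns true or false instead of throwing.
function looksLikeUrl(value: string): boolean {
    return urlPattern.test(value);
}

console.log(looksLikeUrl("http://api.example.com/data?q=London")); // true
console.log(looksLikeUrl("api.example.com"));                      // true
console.log(looksLikeUrl("example.com"));                          // true
console.log(looksLikeUrl("not a url"));                            // false

// Two limitations worth knowing about:
// a host without a dot is rejected...
console.log(looksLikeUrl("http://localhost:8080"));                // false
// ...and a sentence that merely contains a URL is accepted,
// because the pattern is not anchored at the end of the string.
console.log(looksLikeUrl("see example.com for details"));          // true
```

If stricter checking is needed, anchoring the pattern with ^ and $ would be one option, but that is a change to the regex shown in this post rather than something it already does.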


/**
* Validates provided value is a JSON Object and throws a new TypeError
* with specified message otherwise.
*
* @param value - a value being tested.
* @param message - a short description of the assertion.
*/
static isJSONString(value: string, message: string): void {
    var parsed: any;
    try {
        parsed = JSON.parse(value);
    } catch (e) {
        // JSON.parse throws a SyntaxError on malformed input.
        throw new TypeError(message);
    }
    // Valid JSON can also be a bare number, string, boolean or null; reject those.
    if ((typeof parsed !== 'object') || (parsed === null)) {
        throw new TypeError(message);
    }
}

This function has the same structure as the one we used to validate the URL. It takes a string and, inside a try / catch statement, attempts to parse it with JSON.parse. If parsing fails, JSON.parse throws and the catch block throws a new TypeError with the message from the parameters. For JSON objects and arrays, JSON.parse returns a value of type object, so afterwards we check the type of the parsed value. If it is not an object, or it is null, we also throw a TypeError with the same message, since the value may be valid JSON but it is not a JSON object.
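To see which inputs such a check accepts, here is a small standalone sketch. The helper isJsonObjectText is a made-up name; it applies the same two tests (does it parse, and is the result a non-null object) but returns a boolean instead of throwing.

```typescript
// Hypothetical helper mirroring the checks described above.
function isJsonObjectText(value: string): boolean {
    let parsed: any;
    try {
        parsed = JSON.parse(value);
    } catch (e) {
        return false; // not parseable as JSON at all
    }
    // Arrays also have typeof "object", so they pass this check.
    return typeof parsed === "object" && parsed !== null;
}

console.log(isJsonObjectText('{"a": 1}'));  // true
console.log(isJsonObjectText('[1, 2, 3]')); // true
console.log(isJsonObjectText('42'));        // false, valid JSON but a number
console.log(isJsonObjectText('"hello"'));   // false, valid JSON but a string
console.log(isJsonObjectText('null'));      // false, parses to null
console.log(isJsonObjectText('{bad json')); // false, JSON.parse throws
```

The null case is why the extra comparison is needed: typeof null is "object" in JavaScript, so the type test alone would let it through.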

So we now have validation functions that protect our users from parsing something that is not JSON or requesting an invalid URL. They also give us the freedom to write our own error messages, so someone using the library has an easy time understanding what went wrong.

Next we should give the code the ability to time out after a certain time spent waiting for the response from the URL. XMLHttpRequest has a property called timeout. It is an unsigned long that specifies, in milliseconds, how long the request may take before it times out.

Something to note when implementing timeout functionality with XMLHttpRequest is that the request has to be asynchronous. If you wonder what an asynchronous request is, you could think of it like this:

A synchronous request would be like calling your friend on the phone and asking him to get some information for you. You agree that he will call you back when he has it, and after you hang up you just sit staring at the wall, waiting for the call.

An asynchronous request would be like calling the same friend and asking the same thing: get me some information and call me back. You then get on with your day, and when your friend calls in the middle of it you get the information you wanted.

So when adding the timeout feature, make sure you are using an asynchronous request. Then you could implement some code like this:

xmlHttp.timeout = 2000;

Then add a function that defines what happens if the request times out:

xmlHttp.ontimeout = function () {
    xmlHttp.abort();
    throw new Error("Request Timed Out.");
};

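So our request now times out after 2 seconds, and the handler then aborts the request and throws an Error. Note that an error thrown inside an event handler cannot be caught by the code that started the request; it surfaces as an uncaught error.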
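One possible alternative to throwing is to report the timeout through a callback. The sketch below is only an illustration under that assumption: the interface TimeoutCapable and the helper applyTimeout are made-up names, and a plain object stands in for the request so the code can run outside a browser.

```typescript
// Minimal shape this sketch relies on. XMLHttpRequest has both members;
// the interface name itself is invented for this example.
interface TimeoutCapable {
    timeout: number;
    ontimeout: ((ev?: any) => any) | null;
}

// Hypothetical helper: sets the time limit and reports a timeout
// through a callback instead of throwing inside the event handler.
function applyTimeout(
    request: TimeoutCapable,
    milliseconds: number,
    onTimeout: (error: Error) => void
): void {
    request.timeout = milliseconds;
    request.ontimeout = function () {
        onTimeout(new Error("Request timed out after " + milliseconds + " ms."));
    };
}

// Stand-in object used here in place of a real request.
const fakeRequest: TimeoutCapable = { timeout: 0, ontimeout: null };
let timeoutMessage: string = "";

applyTimeout(fakeRequest, 2000, function (error: Error) {
    timeoutMessage = error.message;
});

// Simulate the browser firing the timeout event.
if (fakeRequest.ontimeout) {
    fakeRequest.ontimeout();
}
console.log(timeoutMessage); // "Request timed out after 2000 ms."
```

In a browser the same helper would be called with the real request object before send(), and the caller decides in the callback how to react.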

Adding this functionality means restructuring the code a bit. With an asynchronous request, as I mentioned, the code keeps running and the JSON arrives while it is in the middle of doing something else. That means we cannot assign the parsed JSON to a variable inside the readystatechange handler and then simply return it. The function would return before the request has finished and the JSON is in our hands. But do not worry, there is a solution for this as well.

In JavaScript the solution is a callback. We have to restructure our function so that it takes a callback function to which the parsed JSON is passed.
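Here is a small sketch of the problem and of the callback fix, reduced to its essentials. It assumes setTimeout as a stand-in for the network, and the function names fetchLater, getJsonBroken and getJson are made up for this example.

```typescript
// Stand-in for an asynchronous request: the callback runs later,
// after the current code has finished. setTimeout replaces the network here.
function fetchLater(done: (text: string) => void): void {
    setTimeout(function () {
        done('{"ok": true}');
    }, 10);
}

// Broken approach: return a value that is only set inside the callback.
function getJsonBroken(): any {
    let result: any; // still undefined when the function returns
    fetchLater(function (text: string) {
        result = JSON.parse(text);
    });
    return result;
}

const early: any = getJsonBroken();
console.log(early); // undefined, because the callback has not run yet

// Working approach: accept a callback and pass the parsed value to it.
function getJson(done: (obj: any) => void): void {
    fetchLater(function (text: string) {
        done(JSON.parse(text));
    });
}

getJson(function (obj: any) {
    console.log(obj.ok); // true, logged once the stand-in request completes
});
```

The caller no longer asks for a return value. Instead it hands over the code that should run when the data is ready.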

Here is how that would look together with our validation functions:

static Parse(url: string, done: (obj: any) => void): void {
    Asserts.isUrl(url, 'URL is invalid.');
    var xmlHttp = new XMLHttpRequest();
    xmlHttp.onreadystatechange = function() {
        // Only act once the request is complete.
        if (xmlHttp.readyState == 4) {
            try {
                if (xmlHttp.status == 200) {
                    var obj = JSON.parse(xmlHttp.responseText);
                    // Check that the parsed value is a JSON object or array.
                    Asserts.isJSONString(JSON.stringify(obj), 'Retrieved JSON is invalid.');
                    // Hand the parsed object to the caller's callback.
                    done(obj);
                }
            } catch (err) {
                throw new Error("Error connecting: " + err);
            }
        }
    };
    xmlHttp.open('GET', url, true);
    // Give up if no response arrives within 2 seconds.
    xmlHttp.timeout = 2000;
    xmlHttp.ontimeout = function () {
        xmlHttp.abort();
        throw new Error("Request Timed Out.");
    };
    xmlHttp.send();
}

Looking at this, we can see that the Parse function takes a function as a parameter. That is how we add a callback. When we call Parse we pass in a function that takes one argument; when the request has finished, done (the callback) is called with our parsed JSON object as its argument.

There is more to do, like adding support for Internet Explorer. Still, our completed function looks almost the same as what I explained here. The difference is that we added a second callback function that is called if the request fails. That means the caller can pass another function that runs a different piece of code when the request is a failure.

Something to mention when implementing cross-browser support is that IE5 and IE6 use an ActiveX object (ActiveXObject) instead of XMLHttpRequest.
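To illustrate the idea of a success callback plus a failure callback, here is a rough sketch. It is not the library's actual code: the helper handleResponse and the callback names done and fail are assumptions made for this example, and the status and response text are passed in directly so it runs without a real request.

```typescript
// Hypothetical helper: decides which callback to invoke once a
// response is complete.
function handleResponse(
    status: number,
    responseText: string,
    done: (obj: any) => void,
    fail: (error: Error) => void
): void {
    if (status !== 200) {
        fail(new Error("Request failed with status " + status + "."));
        return;
    }
    let parsed: any;
    try {
        parsed = JSON.parse(responseText);
    } catch (e) {
        fail(new Error("Retrieved JSON is invalid."));
        return;
    }
    done(parsed);
}

// Record what each call reports so the outcome is easy to inspect.
const outcomes: string[] = [];
const onDone = function (obj: any): void { outcomes.push("done: " + obj.city); };
const onFail = function (error: Error): void { outcomes.push("fail: " + error.message); };

handleResponse(200, '{"city": "London"}', onDone, onFail);
handleResponse(404, "", onDone, onFail);
handleResponse(200, "<html>", onDone, onFail);

console.log(outcomes);
// ["done: London", "fail: Request failed with status 404.", "fail: Retrieved JSON is invalid."]
```

With this shape, errors reach the caller as ordinary function arguments, which works for asynchronous code where a thrown error would not.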
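A common way to handle this is to check which API is available before creating the request. The sketch below is one possible shape, not the code from our library: the factory name createRequest and its scope parameter are made up, and the scope is passed in only so the function can be tried outside a browser.

```typescript
// Hypothetical factory: returns a request object, or null if the
// environment offers neither API. scope defaults to the global object.
function createRequest(scope: any = globalThis): any {
    if (typeof scope.XMLHttpRequest !== "undefined") {
        return new scope.XMLHttpRequest();
    }
    if (typeof scope.ActiveXObject !== "undefined") {
        // "Microsoft.XMLHTTP" is the identifier old Internet Explorer versions use.
        return new scope.ActiveXObject("Microsoft.XMLHTTP");
    }
    return null;
}

// Stand-in classes so the branches can be exercised without a browser.
class FakeXhr {}
class FakeActiveX {
    id: string;
    constructor(id: string) { this.id = id; }
}

const modern: any = createRequest({ XMLHttpRequest: FakeXhr });
const legacy: any = createRequest({ ActiveXObject: FakeActiveX });
const missing: any = createRequest({});

console.log(modern instanceof FakeXhr); // true
console.log(legacy.id);                 // "Microsoft.XMLHTTP"
console.log(missing);                   // null
```

In a real page the function would be called with no argument, so it inspects the browser's own global object.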

Here is a brief introduction to the subject: W3Schools XMLHttpRequest

And here is another resource: www.jibbering.com/2002/4/httprequest

Next up I will cover some algorithms I have recently studied. Also, if I work on something new, like building this parser, I will write about it as I go. I am also going to build a project in Java soon, which I will try to cover here as well.

Until next time, see you later!