JavaScript for Algorithm Challenges, Fun and Troubles

Recently I tried to use JavaScript to solved some algorithm challenges in LeetCode. And I solved pretty much all the challenges in LeetCode that marked as “Hard”. You can checkout my code in this Github repo.. And in my experience, my submissions are usually faster than submissions with Ruby and lots of Python submissions. Considering that my first goal is to express the algorithm nicely, and I am avoiding to do tricky performance tuning, I feel this is a quiet acceptable result.

This blog is going to share some features in JavaScript that I found handy and useful when solving these challenges, and the parts that bother me. But before that, I think I should explain a little bit about why I want to do this. Even it is used widely in back-end nowadays, few people may use JavaScript to implement complex algorithms in production.

The major reason of implementing algorithms in JavaScript is that implementing an algorithm elegantly and clearly requires you to be familiar with features and concepts of the language you are using. And during this process, I actually feel more confident and delightful writing algorithms with JavaScript. I did found some good features that make my life easier! Sadly I don’t utilize them at the very beginning. Those code makes me feel awful now >_<.

And the other reason is that information could be gathered during execution of algorithms. WIth these information, the process of algorithms could be later visualize in web page. With some good presentation, the visualization can be super helpful and fun to watch. Actually, there are a few projects going through this road map.

Now let’s start to talk about the handy tricks and things that bother me.

Array

The first credit I want to give it to Array. Array is very useful in many ways. If you want a stack, you can do it with Array.prototype.pop(), Array.prototype.push() and Array.length. Want a queue? Array.prototype.push() and Araay.prototype.shift() can handle this.

And the fact that Array is actually associated map rather than an actual consecutive buffer sometime helps. You don’t bother to check if the index is out of range before accessing. Array just silently returns undefined. But you also need to be very concise about this otherwise it will be a bug that drags you all the way to very late night debugging.

Plain Object

JavaScript’s plain object has many limitation comparing to Python’s Dict or C++’s map. Its key can only be string. So when you write object[233] = 1 what actually happened is object["233"] = 1. Despite of that, in lots of situation plain object is just what you want. If you want a hash table, plain object could be it in most situation.

When you are doing dynamic programming with memoizating (which is my favorite way to implement a dynamic programming solutions when the memory footprint is not a problem, since it makes the solution looks nice and clean), plain object could be the memo since JavaScript does not have a direct way to allocate a given size tabular memory. It is like,

1
2
3
4
5
6
    var f = {};
    function fib(n) {
	    if (n < 2) return n;
	    if (f[n] !== undefined) return f[n];
	    return f[n] = f(n - 1) + f(n - 2);
    }

The default value of missing key in plain object is undefined, which tells us the result of the current argument n has not been memoizated.

Plain object can memoizate more complicate functions. For example, if the memoizating function has two parameters, plain object could store them like

1
2
3
4
5
6
    var memo = {};
    function f(i, j) {
	    if (memo[[i, j]] !== undefined) return memo[[i,j]];
	    // .....
	    // .....
    }

Note that it should be memo[[i, j]] instead of memo[i, j]. memo[i, j] is a valid express but what it means is memo[j] since the evaluation of i, j is just j according the comma operation. And [i,j] could be convert into a string like "[22,33]" which is an one-to-one map of arguments.

Or it could be like,

1
2
3
4
5
6
7
8
    var memo = {};
    for all potential i do
	    memo[i] = {};
    function f(i, j) {
	    if (m[i][j] !== undefined) return m[i][j];
	    // .....
	    // .....
    }

memo[i][j] approach has better performance in my experience. But I don’t know the implementation details of V8, so this may not be very accurate. Actually, I can not find any very definite documents about how Array or plain object, so this is all of guessing. Maybe I should dig into this topic more deeply in the future.

And ES6’s Map can use any type of value (objects or primitive values) as key. So it also could be a good choice for memoizating. I don/t use it in my solution so far, since I am kind of afraid it is implemented as a BST rather than a hash table. Anyway, I will investigate it and do some updates.

Closure

Another nice thing that JavaScript has is its closure. Closure is very useful when you want to implement your solution without populating any scope outside the function you are implementing. Take Edit Distance as an example. You are implementing minDistance. With closure, you can define another function f inside of minDistance, access and modify variables that is available in minDistance. This is not just about keeping the global namespace clean. It also makes it very easy to ensure minDistance does not have any side-effect since now it will not modify and access any variable outside its scope.

var minDistance = function(word1, word2) {
  var n = word1.length; var m = word2.length;
  var t = {};
  function f(i, j) {
    // you can access t, n, m, word1 and word2 here.
  }
  return f(0, 0);
};

Functional Programming and Arrow Function

Modern JavaScript supports many concepts in functional programming. There are Array.prototype.forEach(), Array.prototype.reduce() and so on. And the introducing of arrow function in ES6 makes them more handy and usable. If I want to iterate an array in order, forEach is always my first choice, since it saves me the time of typing assignments like var element = elements[i] and also keeps the scope clean.

var and let

The problem that bothers me the most is JavaScript’s var, which declares variables in the function scope. I am very uncomfortable with it and keeping writing codes like

for (var i = 0; i < n; i++) {
    // ....
}

for (var i = n; i >= 1; i--) {
    // ....
}

This is awful! I keep doing these only because I am not super sure if all the loops are needed when I wrote them. And I don’t like the feel that I need to modify the code in two places when I am applying one idea simply because they have variable with same name, even though these variables actually do not have to be same. So I keep using var everywhere to avoid this. Shame on me!

Luckily, ES6 introduced let as a way to declare variale in block-level scope. And it also works well with for statement. Sadly I know this too late and don’t use any let in my code. Shame on me! In fact, since all most all the var in my code are actually means let, I think I could write a simple tool to help me to update my code with let.

The Silent NaN

And one thing that may drive you into late late night debugging is the Silent NaN. In JavaScript, when you do undefined + 1, there will be no error raised and JavaScript just silently return a NaN. And you can operate with this NaN long enough. You can store it into a Array or object, or doing number operations with in. Until something serious bad happened, everything just silently runs. And when the error raises to you, it is usually to late to give you direct clue of where the problem is.

I know it is because of JavaScript’s highly dynamic design. But something I also want some guards to ensure I don’t do stupid things. I feel TypeScript and flow may be helpful. I will give them a shot in the following challenge implementation.

Implementation of Data Struct

The other thing that bothers me is that I can not found I definite document of how data structs in JavaScript are implemented in V8. There is detailed documents of semantic of these data structs like MDN. And this is important when you are implementing algorithms since how the data structs in JavaScript are implemented will affect your run-time complexity analysis. I do hope there is a document like this, so that I can refer to. But if it is not, I will look into V8’s code to figure it out in the future.

Conclusion

After finishing 30+ challenges in JavaScript, I feel it is actually quite delightful to solve algorithm challenges with JavaScript. Due to its highly-dynamic nature, it is easier to find a way to express my idea elegantly. In the meanwhile, thanks to the JS community, the performance is still acceptable.

tern: a tool to analyze javascript source code

I use NodeJS and JavaScript a lot for web development. I love it because it is
a lightweight language comparing to Java, C++. It has many handy feature brings
from function programming, like clourse, anonymous function and function as
first-class object.
NodeJS’s “callback-driven” style is also quiet staightforward and simple model to handle
async operation in single thread.

And after you get familiar with a language. You may frequently feel like some
parts of coding should be done automatically. To get these jobs automatically
done, we need to teach computers to get some kinds of “understanding” of the
language and source code, and also need to teach them what the job is.

The first task is hard for all most every popular language. Languages have their
dark part. C has a very clear semantic. But the macro and preprocessor
instructions could be a mess. C++ has its template system. And for dynamic
language with eval, you could never statically know excatly what will happens in
the runtime. Besides of that, features like delay-binding also could obstruct
building a static analyzing tool to understand source code.

Luckily there is a theory for this. It is so called abstract
interpretation. i.e. you run the code line by line, constructing variables from
others. But all the variables does not contain any real data. Instead, they
contains infomation you are interesting like possible integer range.

tern is that kind of library for JavaScript. The author (not me, it is the same
author of acorn and CodeMirror) built up abstract interpretation to keep track
of type infomation for variables and functions. With those infomation, tern
could be able to determine type of variables and functions. In addition, it is
scope-sensitive, so the result could also be used for refactoring.

sgrep idea followup: graspjs

In the last post I mentioned about a syntax-sensitive grep tool for code
searching. Lately, when digger around to find a nodejs refactoring tool, I found
graspjs, very similar tool out there.

Grasp is a command line utility that allows you to search and replace your JavaScript code - but unlike programs such as grep or sed, it searches the structure behind your code (the abstract syntax tree), rather than simply the text you’ve written.

Pretty much the exactly idea of sgrep, isn’t it? Graspjs is
aiming at code refactoring rather that searching. So it designs a little new
language of what you are searching and capturing. The backward of it is user
needs to learn something new. The good part is that with this little
domain-specific language, code structure is much esaier to address rather
regular expression. To me, the DSL is clear and easy to use if you got some
basic concept of Mozilla’s de facto JavaScript AST.

Though graspjs should be a great code searching tool. I don’t think it would be
sufficient as a refactoring tool. Handling code in grammar level (instead of
char sequence) is good. But it is not good enough for refactoring. Imagine you
want to rename a local variable, in grammar level you could say that you want to find
the declaration and usage of it and do the replace. The bad part is in the AST
you don’t have concept of scope. So if you declare another local variable in a
different scope with the same name, graspjs would replace both of them.

So, in short, give it a shot if you are searching javascript code. But don’t use
it as a refactor tool. You desire a better one.

sgrep: grep with syntax-sensitive

Though there are lots of cross-reference tools nowadays, times to times we still
need to using grep to explore some codebases. grep is a great tool, but it is
not quiet well-suite for searching codes. grep is line-base. That makes it quiet
hard to match patterns like

1
2
3
4
5
6
int f()
{
    ....
    ....
    ....
}

via regex like f\([^)]*\) {. Thing is also true for assignment, and some
another syntaxes.

So here is the idea, why not create a grep-like tool but it could take the
syntax of a programming langugue as extra information to transform user’s regex
to a better-fit one for grepping. It try to check what kind of syntaxes the
user’s regex may fall into, and then generate a new regex, which will match any
string that could be reduced to the syntaxes. Since it is a regex to regex
transform tool, it will in fact be buildup on grep and do not require AST
constructing for every sources.

My First Post

Okay. So here is it, my blog. And this blog is all the way same as other
blogs. It will put my thoughts, ideas and some works if there is any:). And I
would like to talk about my self a little bit before everything actually
starts.

Basically, I am an IT geek and interested in system level programming
like compiler, interpreter, os etc. And I also like playing nodejs a lot. It’s
cool when things works. But of course, when some things broken, due to its
“callback nature”, there may be lots of pain to debug. I may want to build some
tools on it.

Hmmm, a very shorty short post. Anyway, hope it is a beginning rather
than an end ^_^.