Why isn't my node.cljs app working in :advanced mode?

The ClojureScript github repo has two node sample applications. One of them is a trivial Hello World app (nodehello.cljs). The other one is a slightly more complicated app that recursively lists all the files in a given directory (nodels.cljs).

The nodels app has this comment on the first source line:

; This one doesn't yet work with :optimizations :advanced

And indeed, when the app is compiled with :optimizations :advanced, it doesn't work:

% node nodels.js .
% # We should see a listing of all the files in the project directory. Instead we see nothing.

Can't we use advanced mode optimizations with node.cljs apps? Does the ClojureScript compiler need more work to support advanced mode compilation on node? Maybe advanced mode optimizations are unreliable on node and the compiled app works only sometimes?

What's going on?

One of the optimizations that the Closure compiler performs in advanced mode is global symbol renaming. In the nodels.cljs example app the Closure compiler renames a number of Node.js standard library symbols, thus breaking the app. More specifically, the compiler renames process.argv, fs.statSync(), and fs.readdirSync() when it shouldn't.

We can fix the app by declaring the symbols that should not be renamed in an externs file. With an appropriate externs file the nodels app works just fine in advanced mode, so the comment in the source code should be updated to reflect this.

To play along...

  • create a new project with lein new nodecljs nodels
  • copy the contents of the nodels.cljs example file over the auto-generated src/nodels/core.cljs
  • and finally change the :optimizations setting from :simple to :advanced in project.clj.

You can compile the app with:

lein cljsbuild once

And you can run the resulting JS with:

node nodels.js .

Where's our file listing?

We're not getting any output from the app, not even an error. It seems that the app is running just fine, but not producing the output we would expect.

Let's print out paths in the -main function to see what command line arguments are passed to our app. Add the following println line to src/nodels/core.cljs:

(defn -main [& paths]
  (println "paths:" paths) ;; add this line
  (dorun (map println (mapcat file-seq paths))))

Compile and run the app again:

% lein cljsbuild once
% node nodels.js .
paths: nil

So paths is nil when -main is being called. No wonder we're not seeing any file names being listed!

What happened to process.argv?

One of the reasons why a CLJS app on node is supposed to set its entry point by calling (set! *main-cli-fn* -main) is so that CLJS can pass the command line arguments to the -main function. Have a look at out/cljs/nodejscli.cljs in the project:

...
; Call the user's main function
(apply cljs.core/*main-cli-fn* (drop 2 (.-argv nodejs/process)))

CLJS ignores the first two CLI arguments, which are node and the script name (ie nodels.js in our test project), and then passes the remaining arguments to our -main.
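
For readers more at home in JavaScript, here is a rough sketch of what that (drop 2 ...) call amounts to. The userArgs name is made up for illustration and is not part of ClojureScript or Node:

```javascript
// Node.js puts the interpreter path and the script path at the front of argv.
// userArgs is a hypothetical helper that keeps only what the user typed.
function userArgs(argv) {
  return argv.slice(2); // same effect as (drop 2 ...) in nodejscli.cljs
}

console.log(userArgs(["node", "nodels.js", "."])); // [ '.' ]
console.log(userArgs(process.argv));
```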

So we should be able to look at the generated JS and find the process symbol in it:

% grep process nodels.js
var id = require, jd = process;

It seems like process has been rebound to jd by the Closure compiler. Let's look for jd:

% grep jd nodels.js
var id = require, jd = process;
}(2, jd.hb));

The second line corresponds to the call to drop in out/cljs/nodejscli.cljs.

The generated code is passing jd.hb to drop instead of jd.argv! No wonder it's not working: jd.hb (ie process.hb) has not been defined, so we're passing undefined to drop, and drop is then returning nil to our -main. The Closure compiler has mangled our symbol and we need to protect it!
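
The failure is silent because reading a property that doesn't exist is not an error in JavaScript. A minimal sketch, using a stand-in object instead of the real process (fakeProcess is made up for illustration; hb is the mangled name from our build and will differ between builds):

```javascript
// Stand-in for Node's process object: like the real one, it has argv but no hb.
const fakeProcess = { argv: ["node", "nodels.js", "."] };

// What the CLJS source asked for: the "argv" property.
const intended = fakeProcess.argv;

// What advanced mode emitted: the property name shortened to "hb".
const renamed = fakeProcess.hb;

console.log(intended); // [ 'node', 'nodels.js', '.' ]
console.log(renamed);  // undefined
```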

Externs to the rescue!

The Closure compiler takes an externs file as an optional input. It is used to specify symbols that should not be mangled. By defining our own externs file we can make sure that the compiler doesn't mangle process.argv.

First, add an :externs key to the :compiler map in project.clj:

                    ...
                    :compiler {:output-to "nodels.js"
                               :output-dir "out"
                               :target :nodejs
                               :externs ["externs.js"]
                               :optimizations :advanced}}]})

externs.js is the name of the file that we use to declare our protected symbols. You can use whatever name you want.

Next, create the externs.js file:

% cat > externs.js <<-EOF
var process = {};
process.argv = [];
EOF

The externs documentation doesn't really go into a lot of detail about what the externs declarations should look like. Basically we want to protect the argv key of the process object, and the above incantations seem to do it.
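
Externs files in the Closure world are usually written with JSDoc annotations. Here is a sketch of the same two declarations in that style. I'm assuming the annotations are optional for our purposes, since it's the names that matter for renaming:

```javascript
/**
 * @fileoverview Externs sketch for the one Node.js global that nodels needs so far.
 * @externs
 */

/** @const */
var process = {};

/** @type {!Array<string>} */
process.argv;
```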

If you recompile the source and look at the generated JS again, you'll see that process is still assigned to jd, but that the argv reference is now present in the JS as jd.argv.

Let's try the app again!

% node .\nodels.js .
(.)

.../nodels/nodels.js:3883
    return kd.tb(a).isDirectory(K);
              ^
TypeError: Object #<Object> has no method 'tb'
    at .../nodels/nodels.js:3883:15
    at .../nodels/nodels.js:2158:38
    at Mb (.../nodels/nodels.js:1544:46)
    at Lb.f.A (.../nodels/nodels.js:1554:3)
    at G (.../nodels/nodels.js:505:14)
    at .../nodels/nodels.js:2112:15
    at Mb (.../nodels/nodels.js:1544:46)
    at Lb.f.A (.../nodels/nodels.js:1560:11)
    at G (.../nodels/nodels.js:505:14)
    at .../nodels/nodels.js:2037:15

The output now includes our command line argument: (.), so that's progress, but we're now bombing out with an unknown method tb().
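
Unlike the earlier paths: nil problem, this failure is loud. Reading a missing property quietly yields undefined, but calling one throws. A small sketch with a stand-in for the fs module (fakeFs is made up for illustration; tb is the mangled name from our build):

```javascript
// Stand-in for Node's fs module: it has statSync but no method called "tb".
const fakeFs = {
  statSync: function (path) {
    return { isDirectory: function () { return false; } };
  }
};

// Calling the mangled name fails, because fakeFs.tb is undefined.
let caught = null;
try {
  fakeFs.tb(".");
} catch (e) {
  caught = e;
}

console.log(caught instanceof TypeError); // true
```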

If you go back to the CLJS source, you'll find the line where we're calling isDirectory():

(fn [f] (.isDirectory (.statSync fs f) ()))

So it seems that statSync() has been mangled to tb() by the compiler. Let's try adding statSync() to our externs file. Edit the externs file to look like this:

var process = {};
process.argv = [];
var fs = {};
fs.statSync = function() {};

Compile and run the app again:

% node .\nodels.js .
(.)

.../nodels/nodels.js:3887
    }, kd.sb(a));
          ^
TypeError: Object #<Object> has no method 'sb'
...

Some other function name has now been mangled. Search for kd in the generated JS and you'll find:

var kd = id.d ? id.d("fs") : id.call(null, "fs"), ld = id.d ? id.d("path") : id.call(null, "path");

So it seems that kd is the node file system module that we pull into the app with (def fs (nodejs/require "fs")). The only other use of fs, apart from the .statSync call, is the call to .readdirSync in the following anonymous function:

(fn [d] (map #(.join path d %) (.readdirSync fs d)))

Let's add that to our externs. Edit the externs file to look like this:

var process = {};
process.argv = [];
var fs = {};
fs.statSync = function() {};
fs.readdirSync = function() {};

And then we can compile and run the code again:

% node .\nodels.js .
(.)
.
.gitignore
.lein-repl-history
externs.js
nodels.js
out
out\cljs
out\cljs\core.cljs
out\cljs\core.js
out\cljs\nodejs.cljs
out\cljs\nodejs.js
out\cljs\nodejscli.cljs
out\constants_table.js
out\nodels.js
project.clj
README.md
src
src\nodels
src\nodels\core.cljs
target
target\classes
target\stale
target\stale\extract-native.dependencies

Success!
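
Adding symbols one crash at a time gets tedious. One way to shorten the loop would be to let Node list a module's exports and print a stub for each of them. The following is only a sketch: externsFor is a hypothetical helper of my own, not something that ships with ClojureScript or lein-cljsbuild, and it protects far more names than nodels actually needs.

```javascript
// Hypothetical helper: print an externs stub for every enumerable export
// of a Node.js module. Assumes the module name is a valid JS identifier
// (fine for "fs" or "path", not for names containing a hyphen).
function externsFor(moduleName) {
  const mod = require(moduleName);
  const lines = ["var " + moduleName + " = {};"];
  for (const key of Object.keys(mod).sort()) {
    // Inspect the descriptor so that getters are not triggered.
    const desc = Object.getOwnPropertyDescriptor(mod, key);
    if (typeof desc.value === "function") {
      lines.push(moduleName + "." + key + " = function() {};");
    } else {
      // Constants and getters: declaring the name is enough.
      lines.push(moduleName + "." + key + ";");
    }
  }
  return lines.join("\n");
}

console.log(externsFor("fs"));
```

Redirecting the output to a file gives you a starting point that can be trimmed down by hand.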

Observations

If you followed along closely, you'll have noticed that there were two symbols that we didn't have to protect in our externs file: .join and .isDirectory.

My theory is that since the built-in JS Array class has a join() method, the Closure compiler will automatically not mangle .join, regardless of what the type of the target object is.

Similarly, I think that .isDirectory is protected, because of the new JS File API.

Conclusions

In some sense CLJS seems to be broken for Node.js development, since you cannot enable advanced mode out of the box without running into problems (unless you're not using any node libraries, which is a bit pointless).

Arguably the easiest way to avoid advanced mode problems is simply to use simple mode! But this might not be appropriate for all use cases. For example, if you want to distribute your code as a library, code size matters. Our example project goes from 530kB in simple mode to 90kB in advanced mode.

In addition to dead code elimination, advanced mode also performs global inlining, including constant folding. I would expect these optimizations to provide performance improvements in certain cases. Benchmark your app to know for sure!

Fortunately, Daniel Wirtz has written an npm package that contains externs for the Node.js standard library! I haven't tried it myself yet, but I'll write a post about my experiences when I get around to it.

Update: I tried out the closurecompiler-externs package and wrote a post about it.

Finally, David Nolen wrote a post about a neat hack of using the actual JS source code as the externs file for a library that does not have an externs file defined.

