Monday, June 03, 2013

compiling typescript with rhino javascript engine

I spent way too long over the last couple of days patching the typescript compiler (tsc) to run with java's rhino javascript engine. I mentioned in an earlier post how my dell laptop died on me. I wound up buying an hp envy m4 laptop on clearance at BestBuy to replace it, but before I did that I was goofing around for a couple of days getting by with my Android phone and the old PowerBook G4 I pulled off the shelf that amazingly booted up for me. Anyway - I thought I'd try to get tsc running from the command-line. The compiler is javascript compiled from typescript code, so I first tried to build node from source, but node's build setup requires newer versions of python, make, and gcc than what the powerbook had, so then I got the idea to try to run tsc with rhino, since the powerbook did have java 1.5 installed (java 1.6+ ships with rhino and a jrunscript command line tool). Of course it didn't "just work", and I let myself get sucked into banging my head on it - even after I got the new laptop. Ugh.

Anyway - I eventually got tsc running with rhino (the patch listing is further below). The easy part of the project was implementing a rhino version of tsc's IIO interface for file IO. The rhino implementation just calls through to java.io classes, and tsc runs a few feature tests to figure out which javascript engine it's running under:

    if (typeof ActiveXObject === "function")
        return getWindowsScriptHostIO();
    else if (typeof require === "function")
        return getNodeIO();
    else if ( typeof java != "undefined" )
        return getRhinoIO();
    else
        return null; // Unsupported host

The IO code was straightforward; the painful part was working around rhino's quirks. The first quirk I ran into was that rhino appears to treat the names of java's primitive types as reserved words (they are on ECMAScript 3's list of future reserved words, so rhino may just be following the older spec), so things like:

    var byte = 0;

or

    option.short = "v";

are illegal. Fortunately - that only popped up in a couple of places in the tsc code, but it's an unfortunate "feature" for rhino to have.

Another problem I ran into was invoking "delete" on an instance of java.io.File. Javascript includes delete in its collection of reserved words, but it should be legal to include a "delete()" method on some class. Rhino's javascript grammar probably just needs some love. The workaround was to access the method via f["delete"]() instead of f.delete().

C:\Users\Reuben\Documents\Code\typescript\src\compiler
> node
> var Foo = function() { return this; }
undefined
> Foo.prototype.delete = function() { return "bla"; }
[Function]
> (new Foo()).delete();
'bla'

...


C:\Users\Reuben\Documents\Code\typescript\src\compiler
> jrunscript
js> var Foo = function() { return this; }
js> Foo.prototype.delete = function() { return "bla"; }
script error: sun.org.mozilla.javascript.internal.EvaluatorException: missing name after . operator (<STN> at line number 1
js>
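
A minimal sketch of the bracket-notation workaround, written so that it also runs on engines that accept the dot form. FileLike, fileLike, and optionTable are hypothetical stand-ins for java.io.File and a tsc option object; they are not part of tsc or rhino. Quoting the name keeps the reserved word out of identifier position, which only helps for property names: a variable such as byte still has to be renamed.

```typescript
// Hypothetical stand-in for a host object (such as java.io.File) that
// exposes a method whose name is a reserved word.
interface FileLike {
    "delete": () => string;
}

const fileLike: FileLike = {
    "delete": () => "bla",
};

// Quoted keys and bracket access never put the reserved word in
// identifier position, so a strict parser has nothing to reject.
const deleteResult: string = fileLike["delete"]();

// The same trick applied to a property named after a java primitive type.
const optionTable: { [key: string]: string } = {};
optionTable["short"] = "v";

console.log(deleteResult, optionTable["short"]);
```

The patch uses the bracket form for delete and simply renames short to shorty, which is probably the less surprising choice for anyone reading the code later.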

Another problematic feature of rhino is that it does not hide the distinction between javascript's string type (which tsc expects) and java's java.lang.String. I discovered that Rhino has ways of converting between java types and javascript types - including the String() function mentioned in the mailing-list thread linked from the patch listing below:

    resolvePath: function (path) {
        return <string> String( (new java.io.File(path)).getCanonicalPath() );
    },
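
To see why the String() call matters, here is a rough simulation that runs without rhino. JavaStringLike is a made-up stand-in for a wrapped java.lang.String (an object that can print itself but is not a javascript string), and toJsString is a hypothetical helper name; the real wrapper differs in its details.

```typescript
// Made-up stand-in for a java.lang.String as script code might see it:
// an object with toString(), but not a javascript primitive string.
class JavaStringLike {
    chars: string;
    constructor(chars: string) {
        this.chars = chars;
    }
    toString(): string {
        return this.chars;
    }
}

// Hypothetical helper: String() calls toString() on objects and always
// returns a primitive javascript string.
function toJsString(value: unknown): string {
    return String(value);
}

const wrappedPath = new JavaStringLike("C:/code/typescript.ts");
const jsPath = toJsString(wrappedPath);

console.log(typeof wrappedPath);       // "object"
console.log(typeof jsPath);            // "string"
console.log(jsPath.split("/").length); // 3
```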

There were one or two other small rhino quirks to work out, but the big one that surprised me was that rhino's regular expression objects apparently don't respect javascript's normal scoping rules. The tsc compiler was failing under rhino with various undefined types that weren't properly pulled in via the file-reference comments (see section 9.1.1 in the typescript spec). I eventually got a test set up, and found that rhino would only load every other referenced file, so a tsc run with rhino had this output:

(1)Reading code from C:/Users/Reuben/Documents/Code/typescript/src/compiler/typescript.ts
Found code at C:/Users/Reuben/Documents/Code/typescript/src/compiler/typescript.ts
 file reference: diagnostics.ts
 file reference: nodeTypes.ts
 file reference: ast.ts
 file reference: astWalkerCallback.ts
 file reference: astLogger.ts
 file reference: base64.ts
 file reference: emitter.ts
 file reference: parser.ts
 file reference: scanner.ts
 file reference: scopeWalk.ts
 file reference: symbols.ts
 file reference: tokens.ts
 file reference: typeCollection.ts
 file reference: types.ts
 file reference: referenceResolution.ts
 file reference: incrementalParser.ts

The output with nodejs was:

   (1)Reading code from C:/Users/Reuben/Documents/Code/typescript/src/compiler/typescript.ts
   Found code at C:/Users/Reuben/Documents/Code/typescript/src/compiler/typescript.ts
    file reference: diagnostics.ts
    file reference: flags.ts
    file reference: nodeTypes.ts
    file reference: hashTable.ts
    file reference: ast.ts
    file reference: astWalker.ts
    file reference: astWalkerCallback.ts
    file reference: astPath.ts
    file reference: astLogger.ts
    file reference: binder.ts
    file reference: base64.ts
    file reference: sourceMapping.ts
    file reference: emitter.ts
    file reference: errorReporter.ts
    file reference: parser.ts
    file reference: printContext.ts
    file reference: scanner.ts
    file reference: scopeAssignment.ts
    file reference: scopeWalk.ts
    file reference: signatures.ts
    file reference: symbols.ts
    file reference: symbolScope.ts
    file reference: tokens.ts
    file reference: typeChecker.ts
    file reference: typeCollection.ts
    file reference: typeFlow.ts
    file reference: types.ts
    file reference: pathUtils.ts
    file reference: referenceResolution.ts
    file reference: precompile.ts
    file reference: incrementalParser.ts
    file reference: declarationEmitter.ts

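Anyway, long story short, it turned out that the reference strings were each processed by a function with a regular expression, and rhino had this crazy behavior where regular expression objects appear to be global: the regex literal inside the function seems to be created once and shared across calls, so with the g flag its lastIndex carries over from one call to the next. (That matches how ECMAScript 3 defines regex literals; ECMAScript 5 changed them to create a new object on each evaluation, which is what node does.)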

with rhino:

js> function doTest( s ) { var rx =  /^\s*(\/\/\/\s*<reference\s+path=)('|")(.+?)\2\s*(static=('|")(.+?)\2\s*)*\/>/gim;
return (rx.exec(s) == null); }

js> doTest(comment);
false
js> doTest(comment);
true
js> doTest(comment);
false
js> doTest(comment);
true

with node:
> function doTest(s) {
... var rx =  /^\s*(\/\/\/\s*<reference\s+path=)('|")(.+?)\2\s*(static=('|")(.+?)\2\s*)*\/>/gim;
... return (rx.exec(s) == null);
... }
undefined
> comment
'///<reference path=\'sourceMapping.ts\' />'
> doTest(comment);
false
> doTest(comment);
false
> doTest(comment);
false
> doTest(comment);
false
> doTest(comment);
false

Unbelievable. Anyway - the workaround is to reset rx.lastIndex before each run, so:

    function getFileReferenceFromReferencePath(comment: string): IFileReference {
        var referencesRegEx = /^(\/\/\/\s*<reference\s+path=)('|")(.+?)\2\s*(static=('|")(.+?)\2\s*)*\/>/gim;
        referencesRegEx.lastIndex = 0;  // work around ridiculous bug in rhino ...
        var match = referencesRegEx.exec(comment);
        ...
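
The behavior is easy to reproduce on any engine by hoisting the regex out of the function, which forces the sharing that rhino seems to do on its own. This is only a sketch: sharedRx, matchesStale, and matchesWithReset are names made up for the demo, and the pattern is the one from precompile.ts above.

```typescript
// One regex object shared by every call, mimicking an engine that creates
// a regex literal once instead of on each evaluation.
const sharedRx = /^(\/\/\/\s*<reference\s+path=)('|")(.+?)\2\s*(static=('|")(.+?)\2\s*)*\/>/gim;

// With the g flag, exec() starts searching at lastIndex and leaves it at
// the end of the match, so the next call starts past the text and fails.
function matchesStale(comment: string): boolean {
    return sharedRx.exec(comment) !== null;
}

// The workaround: rewind lastIndex before every exec().
function matchesWithReset(comment: string): boolean {
    sharedRx.lastIndex = 0;
    return sharedRx.exec(comment) !== null;
}

const referenceComment = "///<reference path='sourceMapping.ts' />";

const staleResults = [1, 2, 3, 4].map(() => matchesStale(referenceComment));
const resetResults = [1, 2, 3, 4].map(() => matchesWithReset(referenceComment));

console.log(staleResults); // [ true, false, true, false ]
console.log(resetResults); // [ true, true, true, true ]
```

The alternating pattern lines up with the "every other referenced file" symptom: a failed exec() resets lastIndex to 0, so the call after a miss succeeds again. Dropping the g flag would presumably work too, since exec() then ignores lastIndex, but the patch leaves the original regex alone.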

Finally - I just tested this stuff by running the compiler on itself. The typescript repo on codeplex includes a bunch of test cases and an nmake-based Makefile, but I was too lazy to download visual studio and get that working. In the end - rhino compiled tsc in about 2 minutes, and node did it in 3 seconds. Ugh!

C:\Users\Reuben\Documents\Code\typescript\src\compiler
> date; jrunscript tsc.js --out tsc2.js tsc.ts; date

Monday, June 3, 2013 12:49:41 PM
Monday, June 3, 2013 12:51:39 PM


C:\Users\Reuben\Documents\Code\typescript\src\compiler
> date; node tsc2.js --out tsc3.js tsc.ts; date

Monday, June 3, 2013 12:58:40 PM
Monday, June 3, 2013 12:58:43 PM

Update 2013/06/29: I tried java 8's new nashorn javascript engine (in a jdk8 pre-release) to see how it did. Nashorn currently runs the tsc compile about 10% faster than rhino - still a lot slower than node. Doh!

> date; & 'C:\Program Files\Java\jdk1.8.0\bin\jrunscript.exe' .\tsc2.js --out tsc2.js tsc.ts; date;

Saturday, June 29, 2013 5:29:11 PM
Saturday, June 29, 2013 5:30:57 PM
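
For the record, the elapsed times work out as follows. This is just arithmetic on the timestamps printed above (with the PM times written in 24-hour form); secondsOfDay and elapsedSeconds are hypothetical helper names.

```typescript
// Convert an "HH:MM:SS" clock reading into seconds since midnight.
function secondsOfDay(clock: string): number {
    return clock.split(":").reduce((total, part) => total * 60 + Number(part), 0);
}

function elapsedSeconds(start: string, end: string): number {
    return secondsOfDay(end) - secondsOfDay(start);
}

const rhinoSeconds = elapsedSeconds("12:49:41", "12:51:39");   // 118
const nodeSeconds = elapsedSeconds("12:58:40", "12:58:43");    // 3
const nashornSeconds = elapsedSeconds("17:29:11", "17:30:57"); // 106

// Fraction of rhino's run time that nashorn saved.
const nashornSaving = (rhinoSeconds - nashornSeconds) / rhinoSeconds;

console.log(rhinoSeconds, nodeSeconds, nashornSeconds);
console.log((nashornSaving * 100).toFixed(1) + "%"); // "10.2%"
```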

Anyway - I'll check to see if the typescript maintainers will accept this patch, but I'll be surprised if they want anything to do with rhino after reading this sad tale ...


diff --git a/src/compiler/base64.ts b/src/compiler/base64.ts
index ee2d3c5..b4fc315 100644
--- a/src/compiler/base64.ts
+++ b/src/compiler/base64.ts
@@ -67,20 +67,21 @@ module TypeScript {
 
             var shift = 0;
             for (var i = 0; i < inString.length; i++) {
-                var byte = Base64Format.decodeChar(inString[i]);
+                // note: "byte" is reserved in java Rhino javascript environment - ugh
+                var bite = Base64Format.decodeChar(inString[i]);
                 if (i === 0) {
                     // Sign bit appears in the LSBit of the first value
-                    if ((byte & 1) === 1) {
+                    if ((bite & 1) === 1) {
                         negative = true;
                     }
-                    result = (byte >> 1) & 15; // 1111x
+                    result = (bite >> 1) & 15; // 1111x
                 } else {
-                    result = result | ((byte & 31) << shift); // 11111
+                    result = result | ((bite & 31) << shift); // 11111
                 }
 
                 shift += (i == 0) ? 4 : 5;
 
-                if ((byte & 32) === 32) {
+                if ((bite & 32) === 32) {
                     // Continue
                 } else {
                     return { value: negative ? -(result) : result, rest: inString.substr(i + 1) };
diff --git a/src/compiler/io.ts b/src/compiler/io.ts
index a5eb1ad..6e75bf2 100644
--- a/src/compiler/io.ts
+++ b/src/compiler/io.ts
@@ -13,6 +13,9 @@
 // limitations under the License.
 //
 
+declare var arguments:any;
+var javaArgs:any = arguments; // Rhino sets global arguments ... bla
+
 interface IResolvedFile {
     content: string;
     path: string;
@@ -105,7 +108,10 @@ declare class Enumerator {
     constructor (o: any);
 }
 declare function setTimeout(callback: () =>void , ms?: number);
+
 declare var require: any;
+declare var java: any;
+
 declare module process {
     export var argv: string[];
     export var platform: string;
@@ -123,7 +129,7 @@ declare module process {
 }
 
 var IO = (function() {
-
+    
     // Create an IO object for use inside WindowsScriptHost hosts
     // Depends on WSCript and FileSystemObject
     function getWindowsScriptHostIO(): IIO {
@@ -533,12 +539,251 @@ var IO = (function() {
             },
             quit: process.exit
         }
-    };
+    }
+    ;
+        
+    
+    function getRhinoIO():IIO {
+        var utf8 = java.nio.charset.Charset.forName( "UTF-8" );
+        var jscriptArgs = [];
+   
+        for( var i=0; i < javaArgs.length; ++i ) {
+            //
+            // convert java string to javascript string (so javascript string methods work - ugh!)
+            // see https://groups.google.com/forum/?fromgroups#!topic/mozilla.dev.tech.js-engine.rhino/FV15_KJVLGM
+            //
+            jscriptArgs.push( String( javaArgs[i] ) );
+        }
+        
+        
+        /**
+         * Byte-order-mark detector - ugh.
+         * @param streamIn java.io.InputStream
+         * @return java.io.Reader
+         * @see http://blog.publicobject.com/2010/08/handling-byte-order-mark-in-java.html
+         */
+       function inputStreamToReader(streamIn) {
+         // buffered stream supports mark and reset
+         var stream = new java.io.BufferedInputStream( streamIn );
+         stream.mark(3);
+         var byte1 = stream.read();
+         var byte2 = stream.read();
+         if (byte1 == 0xFF && byte2 == 0xFE) {
+           return new java.io.InputStreamReader(stream, "UTF-16LE");
+         } else if (byte1 == 0xFE && byte2 == 0xFF) {
+           return new java.io.InputStreamReader(stream, "UTF-16BE");
+         } else {
+           var byte3 = stream.read();
+           if (byte1 == 0xEF && byte2 == 0xBB && byte3 == 0xBF) {
+             return new java.io.InputStreamReader(stream, "UTF-8");
+           } else {
+             stream.reset();
+             return new java.io.InputStreamReader(stream);
+           }
+         }
+       };
+
+          return {
+            readFile: function (file):string {
+                try  {
+                    var f = new java.io.File( file );
+                    if( (! f.exists()) || (! f.isFile()) ) { return ""; }
+                    var buffer = java.lang.reflect.Array.newInstance( java.lang.Character.TYPE, f.length() + 128 ); // 128 fudge
+                    var reader = new java.io.BufferedReader( 
+                       inputStreamToReader(
+                              new java.io.FileInputStream( f ) 
+                         ) 
+                    );
+                    try {
+                     var offset = 0;
+                        for( var step = reader.read( buffer, offset, buffer.length - offset ); 
+                             step >= 0; step = reader.read( buffer, offset, buffer.length - offset ) ) {
+                             offset += step;
+                             //java.lang.System.out.println( "Just read num bytes: " + step );
+                        }
+                        var javaString = new java.lang.String( buffer, 0, offset )
+                        //java.lang.System.out.println( "Read: " + javaString );
+                        // convert java string to javascript string ... ugh
+                        return <string> String( new java.lang.String( buffer, 0, offset ) );
+                    } catch (ex) { 
+                        reader.close();
+                        ex.printStackTrace( java.lang.System.err );
+                        throw ex; 
+                    }
+                } catch (e) {
+                    IOUtils.throwIOError("Error reading file \"" + file + "\" - " + e.toString(), e );
+                }
+            },
+            writeFile: function( path, content ) {
+               var f = new java.io.File( path );
+               if( f.exists() && f.isFile() ) {
+                   var writer = new java.io.OutputStreamWriter(
+                         new java.io.FileOutputStream( f ), utf8
+                    );
+                    writer.write( content );
+                    writer.close();
+               }
+            },
+            deleteFile: function (path) {
+               var f = new java.io.File( path );
+               if( f.exists() && f.isFile() ) {
+                   // delete is reserved in javascript - confused Rhino parser - ugh
+                   f["delete"]();
+                }
+            },
+            fileExists: function (path) {
+                var result:bool = (new java.io.File( path )).exists();
+                return result;
+            },
+            createFile: function (path, useUTF8?) {
+                var f = new java.io.File( path );
+                if ( f.exists() && (! f.isFile()) ) {
+                    IOUtils.throwIOError("Error creating file \"" + path + "\".", null ); 
+                } else if ( ! f.exists() ) {
+                    var dir = f.getParentFile();
+                    dir.mkdirs();
+                }
+                try  {
+                    var writer = new java.io.OutputStreamWriter(
+                             new java.io.FileOutputStream( f ), utf8
+                            );
+                } catch (e) {
+                    IOUtils.throwIOError("Couldn't write to file '" + path + "'.", e);
+                }
+                return new IOUtils.BufferedTextWriter( {
+                    Write: function (str:string ) {
+                        writer.write( str );
+                    },
+                    WriteLine: function (str:string) {
+                        writer.write( str + "\n" );
+                    },
+                    Close: function () {
+                        writer.close();
+                        writer = null;
+                    }
+                } );
+            },
+            dir: function dir(path, spec?, options?):string[] {
+                options = options || {
+                };
+                function filesInFolder(folder:any):string[] {
+                    var paths = [];
+                    var files = folder.listFiles();
+                    for(var i = 0; i < files.length; i++) {
+                        var f = files[i];
+                        if(options.recursive && f.isDirectory()) {
+                            paths = paths.concat(filesInFolder(f));
+                        } else if(f.isFile() && (!spec || f.getName().match(spec))) {
+                            paths.push( String( f.getPath() ) ); // convert to javascript String
+                        }
+                    }
+                    return paths;
+                }
+                return filesInFolder( new java.io.File( path ) );
+            },
+            createDirectory: function (path) {
+                try  {
+                    if(!this.directoryExists(path)) {
+                       (new java.io.File( path )).mkdirs();
+                    }
+                } catch (e) {
+                    IOUtils.throwIOError("Couldn't create directory '" + path + "'.", e);
+                }
+            },
+            directoryExists: function (path) {
+                var f = new java.io.File( path );
+                var result:bool = f.exists() && f.isDirectory();
+                return result;
+            },
+            resolvePath: function (path) {
+                return <string> String( (new java.io.File(path)).getCanonicalPath() );
+            },
+            dirName: function (path) {
+                return <string> String( (new java.io.File( path )).getCanonicalFile().getParent() );
+            },
+            findFile: function (rootPath, partialFilePath) {
+                var scan = new java.io.File( rootPath + "/" + partialFilePath ).getCanonicalFile();
+                while(true) {
+                    if( scan.exists() ) {
+                        try  {
+                            var content = this.readFile( scan.getPath() );
+                            return {
+                                content: <string> content,
+                                path: <string> String( scan.getPath() )
+                            };
+                        } catch (err) {
+                        }
+                    } else {
+                        // climb up one directory and retry from there
+                        var parent = (new java.io.File( rootPath )).getParent();
+                        if( parent == null ) {
+                            return null;
+                        }
+                        rootPath = parent; // without this the loop rechecks the same parent forever
+                        scan = new java.io.File( parent, partialFilePath );
+                    }
+                }
+            },
+            print: function (str) {
+                java.lang.System.out.print( str );
+            },
+            printLine: function (str) {
+                java.lang.System.out.println( str );
+            },
+            arguments: <string[]> jscriptArgs, 
+            stderr: {
+                Write: function (str) {
+                    java.lang.System.err.print(str);
+                },
+                WriteLine: function (str) {
+                    java.lang.System.err.println(str );
+                },
+                Close: function () {
+                }
+            },
+            stdout: {
+                Write: function (str) {
+                    java.lang.System.out.print(str);
+                },
+                WriteLine: function (str) {
+                    java.lang.System.out.println(str );
+                },
+                Close: function () {
+                }
+            },
+            
+            /**
+             * Could implement watchFile() with java-7 nio2 code, but too lazy to bother,
+             * since WindowsScriptHost skips this method too ... :)
+             * @see http://docs.oracle.com/javase/tutorial/essential/io/notification.html 
+             */
+            watchFile: null,
+            run: function(source, filename) {
+                try {
+                    eval(source);
+                } catch (e) {
+                    IOUtils.throwIOError("Error while executing file '" + filename + "'.", e);
+                }
+            },
+            getExecutingFilePath: function () {
+                return this.arguments[0];
+            },
+            quit: function (exitCode? : number = 0) {
+                try {
+                    java.lang.System.exit(exitCode);
+                } catch (e) {
+                }
+            }
+        };
+    }
+    ;
 
     if (typeof ActiveXObject === "function")
         return getWindowsScriptHostIO();
     else if (typeof require === "function")
         return getNodeIO();
+    else if ( typeof java != "undefined" )
+        return getRhinoIO();
     else
         return null; // Unsupported host
 })();
diff --git a/src/compiler/optionsParser.ts b/src/compiler/optionsParser.ts
index a10fb8f..7a7eb73 100644
--- a/src/compiler/optionsParser.ts
+++ b/src/compiler/optionsParser.ts
@@ -18,7 +18,7 @@
 interface IOptions {
     name?: string;
     flag?: bool;
-    short?: string;
+    shorty?: string;  // note: "short" is reserved in java
     usage?: string;
     set?: (s: string) => void;
     type?: string;
@@ -34,7 +34,7 @@ class OptionsParser {
 
         for (var i = 0; i < this.options.length; i++) {
 
-            if (arg === this.options[i].short || arg === this.options[i].name) {
+            if (arg === this.options[i].shorty || arg === this.options[i].name) {
                 return this.options[i];
             }
         }
@@ -89,8 +89,8 @@ class OptionsParser {
             var usageString = "  ";
             var type = option.type ? " " + option.type.toUpperCase() : "";
 
-            if (option.short) {
-                usageString += this.DEFAULT_SHORT_FLAG + option.short + type + ", ";
+            if (option.shorty) {
+                usageString += this.DEFAULT_SHORT_FLAG + option.shorty + type + ", ";
             }
 
             usageString += this.DEFAULT_LONG_FLAG + option.name + type;
@@ -110,27 +110,27 @@ class OptionsParser {
         }
     }
 
-    public option(name: string, config: IOptions, short?: string) {
+    public option(name: string, config: IOptions, shorty?: string) {
         if (!config) {
-            config = <any>short;
-            short = null;
+            config = <any>shorty;
+            shorty = null;
         }
 
         config.name = name;
-        config.short = short;
+        config.shorty = shorty;
         config.flag = false;
 
         this.options.push(config);
     }
 
-    public flag(name: string, config: IOptions, short?: string) {
+    public flag(name: string, config: IOptions, shorty?: string) {
         if (!config) {
-            config = <any>short;
-            short = null;
+            config = <any>shorty;
+            shorty = null;
         }
 
         config.name = name;
-        config.short = short;
+        config.shorty = shorty;
         config.flag = true
 
         this.options.push(config);
diff --git a/src/compiler/precompile.ts b/src/compiler/precompile.ts
index 88adf32..c66f937 100644
--- a/src/compiler/precompile.ts
+++ b/src/compiler/precompile.ts
@@ -131,6 +131,7 @@ module TypeScript {
 
     function getFileReferenceFromReferencePath(comment: string): IFileReference {
         var referencesRegEx = /^(\/\/\/\s*<reference\s+path=)('|")(.+?)\2\s*(static=('|")(.+?)\2\s*)*\/>/gim;
+        referencesRegEx.lastIndex = 0;  // work around ridiculous bug in rhino ...
         var match = referencesRegEx.exec(comment);
 
         if (match) {
@@ -294,6 +295,7 @@ module TypeScript {
             
             if (!comment.isBlock) {
                 var referencedCode = getFileReferenceFromReferencePath(comment.getText());
+                //CompilerDiagnostics.debugPrint( "Considering comment as possible reference (" + (referencedCode ? "ok" : "no") + "): " + comment.getText() );
                 if (referencedCode) {
                     referencedCode.minChar = comment.startPos;
                     referencedCode.limChar = referencedCode.minChar + comment.value.length;
diff --git a/src/compiler/referenceResolution.ts b/src/compiler/referenceResolution.ts
index 442d8ec..52ae47c 100644
--- a/src/compiler/referenceResolution.ts
+++ b/src/compiler/referenceResolution.ts
@@ -102,7 +102,7 @@ module TypeScript {
                 // if the path is relative, or came from a reference tag, we don't perform a search
                 if (isRelativePath || isRootedPath || !performSearch) {
                     try {
-                        CompilerDiagnostics.debugPrint("   Reading code from " + normalizedPath);
+                        CompilerDiagnostics.debugPrint("   (1)Reading code from " + normalizedPath);
                             
                         // Look for the .ts file first - if not present, use the .ts, the .d.str and the .d.ts
                         try {
@@ -116,19 +116,19 @@ module TypeScript {
                                 else if (isTSFile(normalizedPath)) {
                                     normalizedPath = changePathToSTR(normalizedPath);
                                 }
-                                CompilerDiagnostics.debugPrint("   Reading code from " + normalizedPath);
+                                CompilerDiagnostics.debugPrint("   (2)Reading code from " + normalizedPath);
                                 resolvedFile.content = ioHost.readFile(normalizedPath);
                             }
                             catch (err) {
                                 normalizedPath = changePathToDSTR(normalizedPath);
-                                CompilerDiagnostics.debugPrint("   Reading code from " + normalizedPath);
+                                CompilerDiagnostics.debugPrint("   (3)Reading code from " + normalizedPath);
 
                                 try {
                                     resolvedFile.content = ioHost.readFile(normalizedPath);
                                 }
                                 catch (err) {
                                     normalizedPath = changePathToDTS(normalizedPath);
-                                    CompilerDiagnostics.debugPrint("   Reading code from " + normalizedPath);
+                                    CompilerDiagnostics.debugPrint("   (4)Reading code from " + normalizedPath);
                                     resolvedFile.content = ioHost.readFile(normalizedPath);
                                 }
                             }
@@ -148,7 +148,10 @@ module TypeScript {
 
                     // if the path is non-relative, we should attempt to search on the relative path
                     resolvedFile = ioHost.findFile(parentPath, normalizedPath);
-
+                    CompilerDiagnostics.debugPrint("   Attempting to resolve (" + parentPath + ", " + normalizedPath + ") got: " + 
+                         ((resolvedFile == null) ? "null" : resolvedFile.path) 
+                    );
+                    
                     if (!resolvedFile) {
                         if (isSTRFile(normalizedPath)) {
                             normalizedPath = changePathToTS(normalizedPath);
@@ -187,6 +190,10 @@ module TypeScript {
                     var resolvedFilePath = ioHost.resolvePath(resolvedFile.path);
                     sourceUnit.referencedFiles = preProcessedFileInfo.referencedFiles;
 
+                    for (var i = 0; i < preProcessedFileInfo.referencedFiles.length; i++) {
+                        var fileReference = preProcessedFileInfo.referencedFiles[i];
+                        CompilerDiagnostics.debugPrint("    file reference: " + fileReference.path);
+                    }
                     // resolve explicit references
                     for (var i = 0; i < preProcessedFileInfo.referencedFiles.length; i++) {
                         var fileReference = preProcessedFileInfo.referencedFiles[i];
