@svaarala
Owner

@svaarala svaarala commented Dec 5, 2016

Placeholder for now.

The basic idea is to have the compiler store a _Source property when source support is enabled. Function.prototype.toString() then uses that source, when available, directly as its result.

So it's up to the compiler to come up with a useful _Source that matches ES6 requirements.

Later on it'd make sense to use some form of compression on the source code, e.g. bit packing optimized for ASCII and run-length encoding (RLE) of whitespace.
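
As a rough illustration of the whitespace idea, here is a minimal sketch in C of run-length encoding runs of spaces. The format is hypothetical and not anything Duktape defines: a run of 3 to 255 spaces becomes the escape byte 0x01 followed by the run length, and all other bytes are copied through. It assumes the source text never contains a raw 0x01 byte, and the function names are made up for this example.

```c
#include <stddef.h>

/* Hypothetical escape byte marking a run of spaces. Assumption: the
 * input never contains this byte. Not part of Duktape. */
#define RLE_ESC 0x01

/* Replace runs of 3..255 spaces with (RLE_ESC, count); copy other bytes.
 * Returns bytes written, or 0 if 'out' is too small (or input is empty). */
size_t rle_ws_compress(const unsigned char *in, size_t in_len,
                       unsigned char *out, size_t out_cap) {
    size_t i = 0, o = 0;
    while (i < in_len) {
        size_t run = 0;
        while (i + run < in_len && in[i + run] == ' ' && run < 255) {
            run++;
        }
        if (run >= 3) {
            if (o + 2 > out_cap) return 0;
            out[o++] = RLE_ESC;
            out[o++] = (unsigned char) run;
            i += run;
        } else {
            /* Short runs and non-space bytes are copied one at a time. */
            if (o + 1 > out_cap) return 0;
            out[o++] = in[i++];
        }
    }
    return o;
}

/* Inverse of rle_ws_compress(). Returns bytes written, or 0 on overflow. */
size_t rle_ws_decompress(const unsigned char *in, size_t in_len,
                         unsigned char *out, size_t out_cap) {
    size_t i = 0, o = 0;
    while (i < in_len) {
        if (in[i] == RLE_ESC && i + 1 < in_len) {
            size_t run = in[i + 1];
            if (o + run > out_cap) return 0;
            while (run-- > 0) out[o++] = ' ';
            i += 2;
        } else {
            if (o + 1 > out_cap) return 0;
            out[o++] = in[i++];
        }
    }
    return o;
}
```

Indentation-heavy source benefits most from this; a real format would also need to handle tabs and newlines and escape the marker byte.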

Or maybe an actual compression algorithm: for example, a 1kB footprint for a compression/decompression algorithm would be worth it if the ~3kB of strings and built-ins init data were reduced to 2kB. Also, the more strings and built-ins there are (and their number increases over time), the better this trade-off becomes.

@svaarala force-pushed the add-function-source-property branch from 61ab513 to 0abc326 on December 5, 2016 17:22
@svaarala
Owner Author

svaarala commented Dec 5, 2016

So the initial, quite useless, output for now:

duk> ""+Math.cos
= "function cos() { [native code] }"
duk> ""+function foo(a,b,c) {}
= "XXX"

Here "XXX" is stored as an internal _Source property of the function template and copied over to the function instance.
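
The template-to-instance flow described above can be modelled with a small, self-contained C sketch. The struct and function names are hypothetical stand-ins, not Duktape internals, and the "[ecmascript code]" placeholder for a non-native function without source is an assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for a compiled function template. */
typedef struct {
    const char *name;
    const char *source;   /* models the internal _Source; NULL if absent */
} func_template;

/* Hypothetical, simplified stand-in for a function instance (closure). */
typedef struct {
    const char *name;
    const char *source;   /* copied over from the template */
    int is_native;
} func_instance;

/* Create an instance from a template, copying the source reference. */
func_instance instantiate(const func_template *tpl) {
    func_instance f;
    f.name = tpl->name;
    f.source = tpl->source;
    f.is_native = 0;
    return f;
}

/* Models Function.prototype.toString(): use the stored source directly
 * when available, otherwise fall back to a placeholder body. */
int func_to_string(const func_instance *f, char *buf, size_t cap) {
    if (f->source != NULL) {
        return snprintf(buf, cap, "%s", f->source);
    }
    return snprintf(buf, cap, "function %s() { [%s code] }",
                    f->name, f->is_native ? "native" : "ecmascript");
}
```

With this model, an instance created from a template whose source is "function foo(a,b,c) {}" stringifies to exactly that text, while a native function with no stored source still falls back to the "[native code]" form.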

@fatcerberus
Contributor

I experimented with this maybe a year ago. The wall I hit was that I couldn't figure out how to pass the input source code from the lexer/compiler to the point where it was actually needed (i.e. the point of function object creation).

@svaarala
Owner Author

svaarala commented Dec 5, 2016

The approach here should allow that. For now, _Source is a string, but ideally it'd be a bit-packed or compressed plain buffer which Function.prototype.toString() would decompress. But one step at a time.
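
To give a feel for what a bit-packed buffer could look like, here is a minimal C sketch that packs 7-bit ASCII characters into a contiguous bit stream (most significant bit first) and unpacks them again. The layout and the names pack7/unpack7 are assumptions for illustration only; a real format would also need an escape mechanism for non-ASCII input and a stored character count.

```c
#include <stddef.h>

/* Pack n 7-bit ASCII characters into 'out', MSB first.
 * Returns bytes written, or 0 if the input has a non-ASCII byte or
 * 'out' is too small. Hypothetical format, not a Duktape API. */
size_t pack7(const unsigned char *in, size_t n,
             unsigned char *out, size_t cap) {
    size_t need = (n * 7 + 7) / 8;
    unsigned int acc = 0;   /* bit accumulator; only low 'bits' bits matter */
    int bits = 0;
    size_t i, o = 0;

    if (need > cap) return 0;
    for (i = 0; i < n; i++) {
        if (in[i] > 0x7F) return 0;
        acc = (acc << 7) | in[i];
        bits += 7;
        while (bits >= 8) {
            out[o++] = (unsigned char) ((acc >> (bits - 8)) & 0xFF);
            bits -= 8;
        }
    }
    if (bits > 0) {
        /* Flush the remaining bits, zero-padded on the right. */
        out[o++] = (unsigned char) ((acc << (8 - bits)) & 0xFF);
    }
    return o;
}

/* Unpack n_chars characters; 'in' must hold (n_chars * 7 + 7) / 8 bytes
 * and 'out' must have room for n_chars bytes. Returns n_chars. */
size_t unpack7(const unsigned char *in, size_t n_chars, unsigned char *out) {
    unsigned int acc = 0;
    int bits = 0;
    size_t i = 0, o;

    for (o = 0; o < n_chars; o++) {
        while (bits < 7) {
            acc = (acc << 8) | in[i++];
            bits += 8;
        }
        out[o] = (unsigned char) ((acc >> (bits - 7)) & 0x7F);
        bits -= 7;
    }
    return n_chars;
}
```

This saves one bit per character, so 8 characters fit in 7 bytes; the saving is modest on its own and would likely be combined with other techniques.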

@svaarala
Owner Author

svaarala commented Dec 5, 2016

Getting the function source is actually a little bit difficult because it doesn't exist as a full source string at any point during compilation. The input data is decoded on the fly into a lookup window which is used for tokenization. That window contains codepoints rather than string data, and the decoding process does e.g. surrogate creation, so the raw input is not suitable as is for the source property.

So, to do this properly one may need to store the lexer position, reinitialize the lexer to the start of the function body, and decode codepoints until the end of the function. Then, encode the codepoints (on the fly, of course) into CESU-8 source text.
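
The final encoding step can be sketched in C as follows. This is a minimal, standalone illustration of CESU-8 encoding, not Duktape's internal code: codepoints up to U+FFFF (including surrogates the lexer may already have produced) are written as 1 to 3 bytes, and codepoints above U+FFFF are first split into a surrogate pair, with each half written as 3 bytes. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one 16-bit code unit (0x0000..0xFFFF) as 1-3 bytes using the
 * UTF-8 bit layout. Surrogates are encoded as-is, as CESU-8 requires. */
static size_t encode_unit(uint32_t u, unsigned char *out) {
    if (u < 0x80) {
        out[0] = (unsigned char) u;
        return 1;
    }
    if (u < 0x800) {
        out[0] = (unsigned char) (0xC0 | (u >> 6));
        out[1] = (unsigned char) (0x80 | (u & 0x3F));
        return 2;
    }
    out[0] = (unsigned char) (0xE0 | (u >> 12));
    out[1] = (unsigned char) (0x80 | ((u >> 6) & 0x3F));
    out[2] = (unsigned char) (0x80 | (u & 0x3F));
    return 3;
}

/* Encode one codepoint (assumed <= 0x10FFFF) as CESU-8.
 * 'out' must have room for 6 bytes. Returns the number of bytes written. */
size_t cesu8_encode(uint32_t cp, unsigned char *out) {
    if (cp > 0xFFFF) {
        /* Split into a surrogate pair, then encode each half separately. */
        uint32_t v = cp - 0x10000;
        size_t n = encode_unit(0xD800 + (v >> 10), out);
        return n + encode_unit(0xDC00 + (v & 0x3FF), out + n);
    }
    return encode_unit(cp, out);
}
```

Calling cesu8_encode() once per decoded codepoint and appending the result to an output buffer would build the source text incrementally, without ever holding a separate full copy of the decoded input.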

@svaarala
Owner Author

svaarala commented Dec 5, 2016

But I won't work actively on this right now: I just wanted to open this for discussion because it's been scattered across other issues/pulls, and it's good to keep issues and pulls on topic.

@cheako

cheako commented May 6, 2018

humbletim/glm-js#10

For some reason they do this a lot.

@cheako

cheako commented May 12, 2018
