Meaning Of Code. How to Require ('Node.js'). Part 3
Ok, now it is time for some JavaScript, but... we still need to get through some C++ code first. We have learned about the GetInternalBinding function, and it is going to be important soon. But first let's get to the point where the first JavaScript code enters the virtual machine, and see what exactly it does.
As we discussed in the startup section, the thing that prepares the JavaScript environment is the Realm. It sets up a whole bunch of things, all by running predefined JavaScript code. It all starts with Realm::RunBootstrapping (see the CreateEnvironment function in the startup section):
src/node_realm.cc #RunBootstrapping
MaybeLocal<Value> Realm::RunBootstrapping() {
EscapableHandleScope scope(isolate_);
CHECK(!has_run_bootstrapping_code());
Local<Value> result;
if (!ExecuteBootstrapper("internal/bootstrap/realm").ToLocal(&result) ||
!BootstrapRealm().ToLocal(&result)) {
return MaybeLocal<Value>();
}
...
}
The first thing bootstrapping does is execute the realm script, internal/bootstrap/realm, by calling the ExecuteBootstrapper function. If that succeeds, bootstrapping continues by calling the BootstrapRealm function. Let's look at each of these functions separately:
src/node_realm.cc #ExecuteBootstrapper
MaybeLocal<Value> Realm::ExecuteBootstrapper(const char* id) {
EscapableHandleScope scope(isolate());
Local<Context> ctx = context();
MaybeLocal<Value> result =
env()->builtin_loader()->CompileAndCall(ctx, id, this);
...
}
The function just forwards the request to BuiltinLoader::CompileAndCall.
src/node_builtins.cc #CompileAndCall
MaybeLocal<Value> BuiltinLoader::CompileAndCall(Local<Context> context,
const char* id,
Realm* realm) {
Isolate* isolate = context->GetIsolate();
if (strcmp(id, "internal/bootstrap/realm") == 0) {
Local<Value> get_linked_binding;
Local<Value> get_internal_binding;
if (!NewFunctionTemplate(isolate, binding::GetLinkedBinding)
->GetFunction(context)
.ToLocal(&get_linked_binding) ||
!NewFunctionTemplate(isolate, binding::GetInternalBinding)
->GetFunction(context)
.ToLocal(&get_internal_binding)) {
return MaybeLocal<Value>();
}
Local<Value> arguments[] = {realm->process_object(),
get_linked_binding,
get_internal_binding,
realm->primordials()};
return CompileAndCall(
context, id, arraysize(arguments), &arguments[0], realm);
} else if (strncmp(id, "internal/main/", strlen("internal/main/")) == 0 ||
strncmp(id,
"internal/bootstrap/",
strlen("internal/bootstrap/")) == 0) {
Local<Value> arguments[] = {realm->process_object(),
realm->builtin_module_require(),
realm->internal_binding_loader(),
realm->primordials()};
return CompileAndCall(
context, id, arraysize(arguments), &arguments[0], realm);
}
UNREACHABLE();
}
The function is a wrapper around an overloaded version of CompileAndCall. Its sole purpose is to prepare the arguments for builtin scripts depending on the script type. As you most probably know, each executed script is first wrapped in a function, then that function is compiled and called with arguments. For CommonJS scripts these arguments are exports, require, module, __filename and __dirname. Here we see that this is not the case for builtin scripts. All builtin scripts receive the process object as the first argument and primordials as the last one. We all know the process object well, and you can read more about primordials in their own readme (primordials.md).
The two arguments in between differ for the realm script and all other builtin scripts. The realm script receives the get_linked_binding and get_internal_binding functions as its second and third arguments respectively. The get_linked_binding function works with external bindings, and get_internal_binding is bound to the GetInternalBinding function, which we've already seen: it resolves an Internal Binding module by name and returns its export object (see the Internal Bindings section). The arguments for the other case, the non-realm scripts, will become clear after we see what happens in the realm script.
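To make the "wrap, compile, call with arguments" idea concrete, here is a minimal sketch in plain JavaScript. It is not how Node.js compiles builtins (that happens in C++ through V8); compileWithParams and the fake loader functions are hypothetical stand-ins, and the parameter names are only assumed to mirror the argument order described above.

```javascript
// Hypothetical helper: wrap source text in a function with named parameters,
// roughly the way a loader wraps a script before calling it.
function compileWithParams(source, paramNames) {
  // new Function(...paramNames, body) builds a function whose parameters
  // are visible to the script body as ordinary local variables.
  return new Function(...paramNames, source);
}

// Assumed parameter lists, following the argument order described in the text.
const realmParams = ['process', 'getLinkedBinding', 'getInternalBinding', 'primordials'];
const otherParams = ['process', 'require', 'internalBinding', 'primordials'];

// Stand-in loaders, invented for this example.
const fakeInternalBinding = (name) => ({ answer: name === 'demo' ? 40 : 0 });
const fakeRequire = (id) => [id, id];

// A stand-in "builtin script" that only uses what it was handed.
const scriptSource = 'return internalBinding("demo").answer + require("x").length;';
const runScript = compileWithParams(scriptSource, otherParams);
const wrappedResult = runScript({}, fakeRequire, fakeInternalBinding, {});

// A stand-in "realm script" sees a different second and third argument.
const realmScript = compileWithParams('return typeof getInternalBinding;', realmParams);
const realmResult = realmScript({}, () => ({}), () => ({}), {});

console.log(wrappedResult, realmResult); // 42 function
```

The script body never imports anything: everything it can reach arrives as a function argument, which is why the choice of arguments decides what a builtin script is able to do.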
The last step in both cases is a call to the overloaded version of CompileAndCall, which finds the requested script (we will see how later), wraps it in a function, compiles it and executes it with the provided arguments. Now it is time to dive deeper into realm.js:
lib/internal/bootstrap/realm.js
...
let internalBinding;
{
const bindingObj = { __proto__: null };
internalBinding = function internalBinding(module) {
let mod = bindingObj[module];
if (typeof mod !== 'object') {
mod = bindingObj[module] = getInternalBinding(module);
ArrayPrototypePush(moduleLoadList, `Internal Binding ${module}`);
}
return mod;
};
}
First we see the definition of the internalBinding function. This function wraps the getInternalBinding function (the one that resolves Internal Bindings and is passed as an argument to this script) and implements caching on top of it. That's why we didn't see caching in the GetInternalBinding function: caching is implemented on the JavaScript side.
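The caching behaviour can be checked in isolation with a small sketch. fakeGetInternalBinding is a hypothetical stand-in for the native resolver, and the sketch leaves out primordials and moduleLoadList, so treat it as an illustration of the pattern rather than the real loader.

```javascript
// Stand-in for the native resolver; counts how often it is really invoked.
let nativeCalls = 0;
function fakeGetInternalBinding(name) {
  nativeCalls += 1;
  return { name };
}

// Same shape as the wrapper in realm.js: a prototype-less object as the
// cache, and a lookup that falls through to the resolver only on a miss.
const bindingCache = { __proto__: null };
function cachedBinding(name) {
  let mod = bindingCache[name];
  if (typeof mod !== 'object') {
    mod = bindingCache[name] = fakeGetInternalBinding(name);
  }
  return mod;
}

const firstLookup = cachedBinding('fs');
const secondLookup = cachedBinding('fs');
console.log(firstLookup === secondLookup, nativeCalls); // true 1
```

The `__proto__: null` cache matters here: with an ordinary object, a lookup such as cachedBinding('constructor') would find an inherited property instead of a cache miss.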
continue lib/internal/bootstrap/realm.js
const selfId = 'internal/bootstrap/realm';
const {
builtinIds,
compileFunction,
setInternalLoaders,
} = internalBinding('builtins');
Right away internalBinding is used to get the builtins binding, specifically builtinIds, compileFunction and setInternalLoaders. The builtins binding is registered in the same way as the fs module:
src/node_builtins.cc
void SetInternalLoaders(const FunctionCallbackInfo<Value>& args) {
...
Realm* realm = Realm::GetCurrent(args);
realm->set_internal_binding_loader(args[0].As<Function>());
realm->set_builtin_module_require(args[1].As<Function>());
...
}
void BuiltinLoader::CreatePerIsolateProperties(IsolateData* isolate_data,
Local<ObjectTemplate> target) {
...
SetMethod(isolate, target, "getCacheUsage", BuiltinLoader::GetCacheUsage);
SetMethod(isolate, target, "compileFunction",
BuiltinLoader::CompileFunction);
SetMethod(isolate, target, "hasCachedBuiltins", HasCachedBuiltins);
SetMethod(isolate, target, "setInternalLoaders", SetInternalLoaders);
}
...
NODE_BINDING_PER_ISOLATE_INIT(
builtins, node::builtins::BuiltinLoader::CreatePerIsolateProperties)
NODE_BINDING_CONTEXT_AWARE_INTERNAL(
builtins, node::builtins::BuiltinLoader::CreatePerContextProperties)
...
We see the already familiar macros used to register the module and add methods to the module export object. The SetInternalLoaders binding function accepts two functions from the JavaScript world and assigns them to the Realm's internal_binding_loader and builtin_module_require properties. Keep this in mind, as it will be important later. Back to realm.js:
continue lib/internal/bootstrap/realm.js
...
/**
* An internal abstraction for the built-in JavaScript modules of Node.js.
* Be careful not to expose this to user land unless --expose-internals is
* used, in which case there is no compatibility guarantee about this class.
*/
class BuiltinModule {
...
constructor(id) {
...
this.id = id;
...
}
compileForInternalLoader() {
if (this.loaded || this.loading) {
return this.exports;
}
const id = this.id;
this.loading = true;
try {
const requireFn = StringPrototypeStartsWith(this.id, 'internal/deps/') ?
requireWithFallbackInDeps : requireBuiltin;
const fn = compileFunction(id);
fn(this.exports, requireFn, this, process, internalBinding,
primordials);
this.loaded = true;
} finally {
this.loading = false;
}
ArrayPrototypePush(moduleLoadList, `NativeModule ${id}`);
return this.exports;
}
}
Now the surrogate module system is established, represented by the BuiltinModule class. This system is only used by internal scripts and Builtin Modules. It can resolve requested Builtin Modules, compile them, run them, and return the resulting exports. We can see this in the compileForInternalLoader function. If the module is already loaded, or is in the middle of loading, it returns the current exports right away. Otherwise it compiles the requested builtin module by its id, using the compileFunction binding from the builtins Internal Binding module. The compilation process returns a callable function. This function is executed with the arguments expected by all Builtin Modules (exports, require, module, process, internalBinding and primordials). Later we will see its implementation. Finally it returns the resulting export object.
continue lib/internal/bootstrap/realm.js
...
const loaderExports = {
internalBinding,
BuiltinModule,
require: requireBuiltin,
};
function requireBuiltin(id) {
if (id === selfId) {
return loaderExports;
}
const mod = BuiltinModule.map.get(id);
if (!mod) throw new TypeError(`Missing internal module '${id}'`);
return mod.compileForInternalLoader();
}
...
setInternalLoaders(internalBinding, requireBuiltin);
Because the realm script can be required by other Builtin Modules (which makes sense, since the realm script defines the builtin module system), it first defines its own export object. Next, the requireBuiltin function is defined. It serves as a helper that looks up the requested module in BuiltinModule.map and calls its compileForInternalLoader method, which compiles the module on first use and returns the cached exports afterwards. This function also handles the case where one of the Builtin Modules requires the realm script itself. In this case it does not go through the compilation process (why would it, if the script is already compiled and executing), but returns the loaderExports object directly. Finally we see a call to the setInternalLoaders function, which we have seen already.
The last stop we need to make is to take a closer look at the setInternalLoaders binding and see the effect of that call. To do that, let's revisit the SetInternalLoaders binding and the CompileAndCall function:
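The three paths through requireBuiltin (the loader requiring itself, a known module, and an unknown id) can be tried out with a small stand-in. Everything prefixed with demo, as well as makeEntry, is hypothetical and exists only for this sketch. The registry entries imitate compileForInternalLoader by building their exports once.

```javascript
const demoSelfId = 'demo/realm';
const demoLoaderExports = { marker: 'loader' };
let compileCount = 0;

// Each registry entry builds its exports at most once, then serves the cache.
function makeEntry(factory) {
  let loaded = false;
  let exports;
  return {
    compileForInternalLoader() {
      if (!loaded) {
        compileCount += 1;
        exports = factory();
        loaded = true;
      }
      return exports;
    },
  };
}

const demoRegistry = new Map([
  ['demo/util', makeEntry(() => ({ greet: () => 'hi' }))],
]);

function demoRequire(id) {
  // The loader is already running, so it hands out its own exports directly.
  if (id === demoSelfId) return demoLoaderExports;
  const mod = demoRegistry.get(id);
  if (!mod) throw new TypeError(`Missing internal module '${id}'`);
  return mod.compileForInternalLoader();
}
demoLoaderExports.require = demoRequire;

const selfExports = demoRequire(demoSelfId);
const utilOnce = demoRequire('demo/util');
const utilTwice = demoRequire('demo/util');
console.log(selfExports === demoLoaderExports, utilOnce === utilTwice, compileCount);
// true true 1
```

Requiring an id that is not in the registry, for example demoRequire('demo/nope'), throws a TypeError, just like the Missing internal module branch above.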
src/node_builtins.cc
void SetInternalLoaders(const FunctionCallbackInfo<Value>& args) {
...
Realm* realm = Realm::GetCurrent(args);
realm->set_internal_binding_loader(args[0].As<Function>());
realm->set_builtin_module_require(args[1].As<Function>());
...
}
src/node_builtins.cc #CompileAndCall
MaybeLocal<Value> BuiltinLoader::CompileAndCall(Local<Context> context,
const char* id,
Realm* realm) {
...
if (strcmp(id, "internal/bootstrap/realm") == 0) {
...
} else if (strncmp(id, "internal/main/", strlen("internal/main/")) == 0 ||
strncmp(id,
"internal/bootstrap/",
strlen("internal/bootstrap/")) == 0) {
Local<Value> arguments[] = {realm->process_object(),
realm->builtin_module_require(),
realm->internal_binding_loader(),
realm->primordials()};
return CompileAndCall(
context, id, arraysize(arguments), &arguments[0], realm);
}
UNREACHABLE();
}
We know that the setInternalLoaders function is invoked with the internalBinding and requireBuiltin functions as arguments. As a result, the SetInternalLoaders binding stores these functions as properties of the Realm object (internal_binding_loader and builtin_module_require respectively).
The CompileAndCall function then passes exactly these two functions to every internal/bootstrap/ and internal/main/ script it runs after the realm script: requireBuiltin as the second argument and internalBinding as the third. Builtin Modules loaded through requireBuiltin receive the same two functions from compileForInternalLoader. This means that any builtin script or Builtin Module, when compiled and executed, has the internalBinding and requireBuiltin functions available, and therefore has access to any Internal Binding and any Builtin Module.
Now we understand that the realm script implements the module system used by the Builtin Modules. It defines the internalBinding function, which Builtin Modules use to load Internal Bindings, and it provides the requireBuiltin function, which all Builtin Modules use as a simple require function to load other Builtin Modules. Before diving into the User Code module system, let's make a quick detour and see how requireBuiltin resolves Builtin Modules.
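The whole handoff can be simulated in a few lines of JavaScript. This is only a sketch of the control flow: fakeRealm, setLoaders and compileAndCallSketch are hypothetical stand-ins for the C++ Realm, the SetInternalLoaders binding and CompileAndCall, and the "scripts" are plain functions instead of compiled source.

```javascript
// Stand-in for the C++ Realm: two slots that start out empty.
const fakeRealm = {
  internalBindingLoader: undefined,
  builtinModuleRequire: undefined,
};

// Stand-in for the SetInternalLoaders binding.
function setLoaders(internalBinding, requireBuiltin) {
  fakeRealm.internalBindingLoader = internalBinding;
  fakeRealm.builtinModuleRequire = requireBuiltin;
}

// Stand-in for CompileAndCall: picks the arguments based on the script id.
function compileAndCallSketch(id, script) {
  const processObj = {};
  const primordialsObj = {};
  if (id === 'internal/bootstrap/realm') {
    const getLinked = () => ({});
    const getInternal = (name) => ({ name });
    return script(processObj, getLinked, getInternal, primordialsObj);
  }
  if (id.startsWith('internal/main/') || id.startsWith('internal/bootstrap/')) {
    return script(processObj, fakeRealm.builtinModuleRequire,
                  fakeRealm.internalBindingLoader, primordialsObj);
  }
  throw new Error('unreachable');
}

// "Realm script": builds the two loaders and registers them.
compileAndCallSketch('internal/bootstrap/realm',
  (process, getLinkedBinding, getInternalBinding) => {
    const internalBinding = (name) => getInternalBinding(name);
    const requireBuiltin = (id) => ({ id });
    setLoaders(internalBinding, requireBuiltin);
  });

// A later script (the id is just an example) receives them as its
// second and third arguments.
const laterResult = compileAndCallSketch('internal/bootstrap/node',
  (process, require, internalBinding) => {
    return [require('fs').id, internalBinding('builtins').name];
  });
console.log(laterResult.join(', ')); // fs, builtins
```

The ordering dependency is visible here too: if a later script ran before the realm script had called setLoaders, it would receive undefined for both loaders.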