Sandbox for code execution

Our pipeline relies on a Python interpreter to execute code generated by LLMs. This creates a security risk, since we are executing arbitrary code that we do not fully control. To partially address this, we provide a basic sandbox that we use to execute code and validate the correctness of LLM-generated answers.

Local sandbox

The default sandbox option used in our pipeline is a local Docker container. See nemo_skills/code_execution/local_sandbox for implementation details.

Please note that the provided sandbox is not fully secure. You are strongly encouraged to set up a properly configured virtual machine so that generated code executes in an unprivileged environment with no external network access unless necessary.
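To make the isolation goals above a little more concrete, here is a minimal Python sketch of running untrusted code in a separate process with a timeout, an empty environment, and a throwaway working directory. This is not how the provided sandbox is implemented, and it is not a substitute for a virtual machine or container: it does not drop privileges or block network access. The helper name `run_untrusted` and its return format are hypothetical, and the empty environment assumes a POSIX system.

```python
import subprocess
import sys
import tempfile


def run_untrusted(code: str, timeout_s: float = 5.0) -> dict:
    """Run `code` in a separate Python process and capture its output.

    Hypothetical helper for illustration only; it limits runtime and
    inherited state, but does NOT restrict privileges or network access.
    """
    # A temporary working directory keeps relative-path writes out of the project tree.
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                # -I runs Python in isolated mode (ignores PYTHON* variables
                # and the user site-packages directory).
                [sys.executable, "-I", "-c", code],
                cwd=workdir,
                env={},  # assumption: POSIX; do not leak secrets from our environment
                capture_output=True,
                text=True,
                timeout=timeout_s,  # the child is killed if it runs too long
            )
        except subprocess.TimeoutExpired:
            return {"status": "timeout", "stdout": "", "stderr": ""}
    status = "ok" if proc.returncode == 0 else "error"
    return {"status": status, "stdout": proc.stdout, "stderr": proc.stderr}


if __name__ == "__main__":
    print(run_untrusted("print(2 + 2)"))
    print(run_untrusted("while True: pass", timeout_s=1.0)["status"])
```

Process-level measures like these only reduce accidental damage; privilege and network restrictions still have to come from the surrounding virtual machine or container.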

Most of the time, the pipeline scripts will launch the sandbox automatically when requested. If you want to launch it manually, you can use the following command:

docker run --rm --network=host igitman/nemo-skills-sandbox:0.7.0

If Docker is not available, you can still run a sandbox (although a less efficient version) like this:

python -m nemo_skills.code_execution.local_sandbox.local_sandbox_server

Other sandboxes

Our sandbox API makes no assumptions about where or how the code is executed, so it is easy to extend. For example, you can use AWS Lambda functions or similar offerings. Please open an issue if you'd like us to add support for another sandbox in the future.
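As a rough illustration of what a backend-agnostic design can look like, the sketch below defines a small abstract interface and one local implementation. Every name here (`SandboxBackend`, `SubprocessBackend`, `ExecutionResult`, `answer_matches`) is hypothetical and chosen for this example; none of them are the actual nemo_skills classes, so check the source for the real interface before extending it.

```python
import subprocess
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    """Hypothetical result container; not the nemo_skills type."""
    stdout: str
    stderr: str
    ok: bool


class SandboxBackend(ABC):
    """Hypothetical interface: callers only depend on `execute`."""

    @abstractmethod
    def execute(self, code: str, timeout_s: float = 10.0) -> ExecutionResult:
        """Run `code` somewhere and return what it printed."""


class SubprocessBackend(SandboxBackend):
    """Runs code in a local child process (no real isolation)."""

    def execute(self, code: str, timeout_s: float = 10.0) -> ExecutionResult:
        try:
            proc = subprocess.run(
                [sys.executable, "-c", code],
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return ExecutionResult(stdout="", stderr="timeout", ok=False)
        return ExecutionResult(proc.stdout, proc.stderr, proc.returncode == 0)


def answer_matches(backend: SandboxBackend, code: str, expected: str) -> bool:
    """Validate generated code by comparing its printed output to `expected`."""
    result = backend.execute(code)
    return result.ok and result.stdout.strip() == expected.strip()


if __name__ == "__main__":
    backend = SubprocessBackend()
    print(answer_matches(backend, "print(sum(range(5)))", "10"))
```

Because `answer_matches` depends only on the abstract `execute` method, a remote backend (for example, one that forwards the code to a cloud function) could be swapped in without changing the validation logic.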