You'd only incur a performance penalty when the traced process makes a syscall, though. The article talks about sandboxing something like bzip2, which makes a few syscalls on startup and after that only for I/O. Most of the execution time is spent on computation, which isn't affected by ptrace's overhead.
It's quite easy to fork a child, ptrace it, and have the child execve the actual program you'd like to sandbox.
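
For what it's worth, here is a rough sketch of that fork/ptrace/exec pattern in C, assuming Linux with glibc. The name `run_traced` is made up for illustration, and it only counts syscall stops rather than enforcing any policy, so treat it as a starting point and not a working sandbox.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the article): runs argv[0] under ptrace.
 * Returns the number of syscall stops seen (entry and exit each count
 * once), or -1 on error. The child's exit code is stored in *exit_code. */
long run_traced(char *const argv[], int *exit_code)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: ask to be traced, then stop so the parent can attach options. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    int status;
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* Mark syscall stops as SIGTRAP|0x80 so they differ from real SIGTRAPs. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL, (void *)PTRACE_O_TRACESYSGOOD) < 0)
        return -1;

    long stops = 0;
    int sig = 0; /* signal to deliver on resume; 0 suppresses it */
    for (;;) {
        /* Resume the child until its next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)(long)sig) < 0)
            return -1;
        if (waitpid(child, &status, 0) < 0)
            return -1;

        if (WIFEXITED(status)) {
            *exit_code = WEXITSTATUS(status);
            return stops;
        }
        if (WIFSIGNALED(status)) {
            *exit_code = 128 + WTERMSIG(status);
            return stops;
        }

        sig = 0;
        if (WIFSTOPPED(status)) {
            int s = WSTOPSIG(status);
            if (s == (SIGTRAP | 0x80))
                stops++;  /* syscall stop: a sandbox would check registers here */
            else if (s != SIGTRAP)
                sig = s;  /* ordinary signal: pass it through to the child */
            /* a plain SIGTRAP is the post-exec notification; swallow it */
        }
    }
}
```

Calling it with something like `char *argv[] = {"true", NULL}; int code; run_traced(argv, &code);` runs the program to completion under the tracer. The parent only wakes up at syscall boundaries, which is why a compute-bound child pays very little. An actual sandbox would read the registers at each stop (e.g. with `PTRACE_GETREGS`) and kill the child on anything it doesn't allow.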