I Traced Linux Program Execution from Bash Clone to Loader | Brav

Discover how Linux runs a program from the shell to execve and loader. Step-by-step guide, traces with strace and Trace Compass, and practical tips.

Published by Brav

TL;DR

  • I watched a real bash child process go from clone() through execve() to become the deep_linux program, which prints “deep linux”.
  • Bash creates the child with clone(), execve() replaces the child’s memory image, and ld.so maps the shared libraries.
  • Strace and Trace Compass let me watch system calls and scheduler events in a single trace.
  • I learned how signal handlers, file descriptors, and the heap are handled.
  • I can now debug program start-up with simple tools and clear mental models.

Why This Matters

Running a program is more than just typing a command. I used to wonder why a program that prints “deep linux” would suddenly start using a different memory layout, why the loader pops up, and why the scheduler seems to shuffle processes around. Understanding the step-by-step journey from the shell to execve lets me spot bugs, performance hiccups, and security gaps early.

Core Concepts

Step | What Happens | Where to Look
1. Bash spawns a child | Bash's fork() shows up as clone(), which creates a near-copy of the shell. | Linux Clone Syscall — clone(2) (2025)
2. Scheduler picks the child | Kernel queues the new task for CPU time. | Trace Compass Scheduler Analysis — Trace Compass LTTng Kernel Analysis (2025)
3. Child calls execve() | Replaces the child’s memory image with the binary. | Linux Execve Syscall — execve(2) (2025)
4. Loader (ld.so) runs | Finds shared libs via /etc/ld.so.cache. | Linux ld.so Manual — ld.so(8) (2025)
5. Loader opens, reads, mmaps libs | Uses open() (shown as openat() on current glibc), read(), mmap(). | StackExchange Loader — StackExchange Loader (2015)
6. Heap is set up | brk() reads and then extends the program break, which backs the malloc() heap. | Linux brk System Call — brk(2) (2025)
7. Child writes to stdout | Uses write() on fd 1. | Linux write System Call — write(2) (2025)
8. Child exits | Calls exit() (exit_group() in strace output); parent receives SIGCHLD. | Linux exit System Call — exit(2) (2025)
9. Parent waits | wait() (wait4() in strace output) reaps the child. | Linux wait System Call — wait(2) (2025)
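
The lifecycle in the table can be watched from the shell side with nothing but job control. The sketch below is a minimal illustration, assuming a POSIX shell. It uses sh -c 'echo ...' as a stand-in for the deep_linux binary, and the variable names are made up for this example.

```shell
# Stand-in for ./deep_linux: any external command makes the shell fork and exec.
tmp=$(mktemp)
sh -c 'echo "deep linux"' >"$tmp" &   # steps 1-3: the shell forks, the child execs sh
child_pid=$!                          # PID of the child, as seen by the parent
wait "$child_pid"                     # step 9: the parent reaps the child
child_status=$?                       # exit status collected by wait
child_output=$(cat "$tmp")            # step 7: what the child wrote to fd 1
rm -f "$tmp"
echo "pid=$child_pid status=$child_status output=$child_output"
```

Running the same lines under strace -f should show the matching clone, execve, write, and wait4 entries, although the exact syscall names depend on the libc and kernel in use.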

How the Loader Works

The dynamic linker (ld.so) is a small program in its own right. During execve() the kernel maps the binary, reads the interpreter path from its PT_INTERP program header, maps ld.so into the new process, and transfers control to it. ld.so then reads the binary's dynamic section for the libraries it needs (the DT_NEEDED entries) and, for each library:

  1. open() the file
  2. read() the ELF headers
  3. mmap() the code and data segments

The brk() calls in the trace are not made per library: they query and grow the program break that backs the heap. To speed up the lookup, the loader uses /etc/ld.so.cache, a binary cache built by ldconfig.
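
The result of those mmap() calls is visible afterwards in /proc. The following is a small sketch, assuming Linux with /proc mounted; mapped_libs is a hypothetical helper name, not part of any tool.

```shell
# List the shared objects mapped into a process, read from /proc/<pid>/maps (Linux only).
# With no argument it inspects "self", which here is the awk process doing the reading.
mapped_libs() {
    awk '$6 ~ /\.so/ { print $6 }' "/proc/${1:-self}/maps" | sort -u
}

mapped_libs        # typically lists the libc and ld.so paths; empty if awk is static
```

Passing a PID, for example mapped_libs "$$", shows the libraries the loader mapped into the current shell instead.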

Signal Handlers

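Bash installs signal handlers before any child runs. The child inherits these handlers at clone(), but execve() resets every caught signal to its default disposition, because the handler code no longer exists in the new memory image. Only ignored signals (SIG_IGN) and the signal mask carry over into the new program. See Bash Signal Handling — Bash Signals (2025).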
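
The difference between an ignored signal and a caught one across execve() can be checked from the shell. This is a rough sketch, assuming a POSIX shell and that SIGUSR1 is not already ignored in the environment; the variable names are illustrative only.

```shell
# Case 1: the parent IGNORES SIGUSR1. The ignore survives execve(), so the child
# can signal itself without dying and goes on to print "survived".
out_ignored=$(trap '' USR1; sh -c 'kill -USR1 $$; echo survived')

# Case 2: the parent CATCHES SIGUSR1 with a handler. The new program starts with
# the default disposition, so the same signal terminates the child.
status_handled=0
out_handled=$(trap 'echo "parent handler"' USR1; sh -c 'kill -USR1 $$; echo survived') \
    || status_handled=$?

echo "ignored case:  output='$out_ignored'"
echo "handled case:  output='$out_handled' status=$status_handled"
```

In the second case the shell reports a status above 128, its convention for a child killed by a signal, and may also print a notice about the signal on stderr.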

File Descriptors

fd | Name | What it points to
0 | stdin | Keyboard input
1 | stdout | Terminal output
2 | stderr | Error messages
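
These descriptors are simply inherited: the shell decides what each number points to before execve(), and the program writes to the number without knowing the target. A minimal sketch, assuming a POSIX shell, with fd 3 and the temporary files chosen only for illustration:

```shell
tmp_out=$(mktemp)
tmp_extra=$(mktemp)

# The shell opens both files and installs them as fd 1 and fd 3 in the child
# before the exec; the child just writes to the descriptor numbers.
sh -c 'echo "deep linux"; echo "side channel" >&3' >"$tmp_out" 3>"$tmp_extra"

stdout_text=$(cat "$tmp_out")
fd3_text=$(cat "$tmp_extra")
rm -f "$tmp_out" "$tmp_extra"

echo "fd 1 received: $stdout_text"
echo "fd 3 received: $fd3_text"
```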

Tracing Tools

  • strace (strace(1)) prints every system call: execve, open (openat on current glibc), read, mmap, write, exit (shown as exit_group). Use strace -f to follow forks. See Linux strace Manual — strace(1) (2025).
  • Trace Compass (Trace Compass LTTng Kernel Analysis) visualises scheduler switches (sched_switch), wake-ups, and latency.

How to Apply It

  1. Start the trace
    strace -f -o trace.txt ./deep_linux
    
    or, for kernel events (needs root or membership in the tracing group):
    lttng create session
    lttng enable-event --kernel sched_switch
    lttng start
    ./deep_linux
    lttng stop
    lttng destroy
    
  2. Open the trace in Trace Compass – you’ll see a timeline of sched_switch events.
  3. Read the strace output – look for the execve line, the following open (or openat), read, and mmap calls, then write and exit.
  4. Check the loader – in the mmap lines you’ll see the addresses of the mapped libraries.
  5. Verify the heap – brk calls show the heap growth.
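
Before reading trace.txt line by line, a per-syscall tally gives a quick overview. The helper below is a sketch that assumes the file came from strace -f -o, so each line starts with a PID column; syscall_counts is a name invented for this example.

```shell
# Count system calls by name in a file produced by `strace -f -o trace.txt ...`.
# Lines that are not plain syscall entries (signals, exit notices, resumed calls)
# are skipped.
syscall_counts() {
    awk '
        { sub(/^[0-9]+ +/, "") }          # drop the PID column that -f -o adds
        /^[a-z_][a-z0-9_]*\(/ {           # keep lines that start with name(
            name = $0
            sub(/\(.*/, "", name)         # strip the arguments and return value
            count[name]++
        }
        END { for (n in count) print count[n], n }
    ' "$1" | sort -k1,1nr -k2,2
}
```

Calling syscall_counts trace.txt prints one "count name" pair per line, most frequent first, which makes it easy to spot how many openat and mmap calls the loader issued.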

Metrics

  • clone() return value in the parent: 688780 (the child’s PID)
  • Open fd for libc: 3
  • mmap offsets: 0 and 139264 bytes (34 pages of 4 KiB)
  • write() returned: the number of bytes written
  • brk() return: the new program break address (not shown here)
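
The mmap offset in a strace line is given in bytes and must be a multiple of the page size, so converting it to pages is a single division. A tiny sketch, where offset_in_pages is a hypothetical helper and 4096 is the usual x86-64 page size:

```shell
# Convert a byte offset from an mmap() line into whole pages.
# The second argument defaults to the page size reported by getconf.
offset_in_pages() {
    offset=$1
    page_size=${2:-$(getconf PAGESIZE)}
    echo $(( offset / page_size ))
}

offset_in_pages 139264 4096   # prints 34
```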

Pitfalls & Edge Cases

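Situation | What to Watch For
clone with CLONE_VM | Child shares the parent’s address space, so its writes are visible to the parent; fork()’s copy-on-write semantics do not apply.
execve with a statically linked binary | Loader is bypassed; no shared libs to map.
Library missing from every search path | execve() itself succeeds; ld.so prints “cannot open shared object file” to stderr and the process exits with status 127.
Ignored signals and signal mask | Caught signals reset to default at execve(), but SIG_IGN dispositions and blocked signals are inherited and may interfere with the program’s own logic.
High priority threads | Scheduler may starve lower priority tasks; check sched_setscheduler.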

Open Question: What happens if open() fails during ld.so execution? The loader moves on to the next directory in its search path. If no candidate can be opened, ld.so (not the kernel) prints “cannot open shared object file” to stderr and the process exits with status 127.
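
One way to catch this before running the program is to ask ldd which dependencies it cannot resolve. The helper below is a sketch: missing_libs is an invented name, and it assumes a glibc-style ldd that marks unresolved entries with "not found". Because ldd may execute code from the binary, it should only be used on trusted files.

```shell
# Print the names of shared libraries that ldd cannot resolve for a binary.
# Empty output means every dependency was found (or the binary is static).
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}

missing_libs /bin/sh
```

If the list is non-empty, running the binary and checking $? afterwards should show the exit status 127 described above.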

Quick FAQ

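  1. What is the difference between clone and fork? clone() gives fine-grained control over what the child shares (memory, file descriptors, signal handlers, etc.), whereas fork() duplicates the whole process with copy-on-write memory. On Linux, glibc implements fork() on top of clone(), which is why strace shows clone when bash forks.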
  2. Why does clone() return 0 in the child and the child’s PID in the parent? The kernel distinguishes the caller from the new process by returning 0 to the child (like fork()), while the parent receives the new process’s PID.
  3. How does the loader find shared libraries? It first honors any DT_RPATH/DT_RUNPATH entries in the binary and LD_LIBRARY_PATH, then checks /etc/ld.so.cache (which ldconfig builds from the directories in /etc/ld.so.conf), and finally falls back to the default directories such as /lib and /usr/lib.
  4. What if a shared library is missing at start-up? execve() itself succeeds. ld.so then fails to resolve the library, prints an error to stderr, and the process exits with status 127.
  5. How does the scheduler decide which process to run? For ordinary tasks (SCHED_NORMAL) Linux uses a fair scheduler that weights CPU time by niceness; real-time policies such as SCHED_FIFO and SCHED_RR run strictly by priority and preempt ordinary tasks.
  6. How can I view scheduler events with Trace Compass? Enable the sched_switch event in LTTng, then load the trace in Trace Compass and look at the “Scheduler wake up/Scheduler switch” analysis.
  7. How can I trace system calls with strace? Run strace -f -o file ./program. The -f flag makes strace follow forks, and the output file will contain all system calls made by the program and its children.

Conclusion

Now I can walk from the bash prompt to the final write() that prints “deep linux” and see every kernel hop in between. Use strace for a quick line-by-line view, and Trace Compass when you need the big picture of scheduling and latency. If you’re writing a debugger, a profiler, or just trying to understand why your program stalls, this mental model gives you the steps to investigate.

Next Steps

  • Reproduce the trace on your own binary.
  • Experiment with different clone flags (CLONE_VM, CLONE_FS).
  • Use the --filter option of lttng enable-event, or strace -Z -e trace=open,openat, to drill down on open calls that fail.

Who Should Use This?

  • Linux developers needing to debug start-up problems.
  • OS engineers who want a clear mental model of process creation.
  • Students learning about ELF, loaders, and the Linux system call interface.

Who Shouldn’t? If you only run static binaries or use a different OS, the details differ.

Last updated: December 23, 2025