I’m working on a relatively complex project with many dependencies. It is organised as a git superproject that aggregates several dozen git submodules. The author chose to manage the submodule working directories through cmake functions rather than with recursive git clone techniques.
Following current best practice, I am applying continuous integration to this project, but I hit a few problems with gitlab authentication. To figure out where those problems lay, I started instrumenting the cmake functions to capture the standard output and standard error streams of each git command they invoked. Since I wanted to see this clearly for each of the many submodules, I chose to use the OUTPUT_FILE and ERROR_FILE options of the cmake execute_process command, with one pair of files per submodule.
When I wrote the cmake functions, I was far enough removed from the actual calls that would be made that I treated the submodule name passed to each function as a simple token.
When I triggered the CI, the build failed with incomplete working directories for the submodules. The last output I saw from cmake was a bare “No such file or directory”, with too little context to infer which missing file or directory was causing the problem.
Eventually, I reached for my most trusted debug tool for resolving errors from an interacting set of shells. In this case that meant the CI spinning up a docker container, running some kind of bash shell, driven by a .gitlab-ci.yml file, running a git clone followed by a cmake instruction, triggering a cascade of subsidiary CMakeLists.txt scripts calling cmake helper functions, which in turn called git. Throughout all of this, keeping track of the current working directory and the various flavours of text variables, with much substitution, makes following the thread of control difficult.
So I simply ran “strace -f -o /tmp/trace.output cmake ../” in my own instance of the docker container, and that showed me the problem immediately: there was an open call trying to access a filename of the form “git.output.foo/Bar/Baz”, and the directory part of that path clearly did not exist, which was what cmake was unhappy about.
The problem was obvious in hindsight. For top level submodules with a simple name, “git.output.foo” would be a perfectly valid name for OUTPUT_FILE. For a nested submodule whose name includes a path separator character, however, I was asking cmake to create an output file inside a directory that did not exist.
What confused me was that I was focussed on the error from the git command being invoked, and was ignoring the fact that the cmake machinery might have a separate internal error. Trying to reproduce the error by running the git command directly in an interactive shell failed, since the fault lay in how cmake was asked to open its output file rather than in git.
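The failure is easy to reproduce outside cmake. The following is a minimal shell sketch, not the project's actual code: the name “foo/Bar/Baz” is a hypothetical nested submodule, and the temporary directory and log contents are assumptions made purely for illustration. It shows the naive log path failing, then two possible ways around it.

```shell
# Hypothetical names throughout: "foo/Bar/Baz" stands in for a nested submodule.
workdir=$(mktemp -d)
name="foo/Bar/Baz"

# Naive log path: the "/" characters in the name act as directory separators,
# so the parent directory "git.output.foo/Bar" would have to exist already.
naive_log="$workdir/git.output.$name"
if ( echo "clone output" > "$naive_log" ) 2>/dev/null; then
    naive_status="ok"
else
    naive_status="failed"   # the shell reports "No such file or directory"
fi

# Option 1: flatten the name so the log is a single file name.
safe_name=$(printf '%s' "$name" | tr '/' '_')
flat_log="$workdir/git.output.$safe_name"
echo "clone output" > "$flat_log"

# Option 2: keep the nested name, but create the parent directory first.
mkdir -p "$(dirname "$naive_log")"
echo "clone output" > "$naive_log"

echo "naive attempt: $naive_status"
echo "flattened log: $(basename "$flat_log")"
```

In a cmake helper, the corresponding choices would presumably be to flatten the name with string(REPLACE "/" "_" ...) before building the OUTPUT_FILE and ERROR_FILE paths, or to call file(MAKE_DIRECTORY ...) on the parent directory before execute_process runs. Either keeps one pair of log files per submodule.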
When in doubt, strace. The system calls, and the data being passed from user space to the kernel, tell you a lot about many processes and make it very clear where some classes of fault reside.