@JonathonReinhart
Last active October 30, 2024 03:53
mkdir -p implemented in C
/* mkdir_p.c */
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h> /* mkdir(2) */
#include <errno.h>

/* Make a directory; already existing dir okay */
static int maybe_mkdir(const char* path, mode_t mode)
{
    struct stat st;
    errno = 0;

    /* Try to make the directory */
    if (mkdir(path, mode) == 0)
        return 0;

    /* If it fails for any reason but EEXIST, fail */
    if (errno != EEXIST)
        return -1;

    /* Check if the existing path is a directory */
    if (stat(path, &st) != 0)
        return -1;

    /* If not, fail with ENOTDIR */
    if (!S_ISDIR(st.st_mode)) {
        errno = ENOTDIR;
        return -1;
    }

    errno = 0;
    return 0;
}

int mkdir_p(const char *path)
{
    /* Adapted from http://stackoverflow.com/a/2336245/119527 */
    char *_path = NULL;
    char *p;
    int result = -1;
    mode_t mode = 0777;

    errno = 0;

    /* Copy string so it's mutable */
    _path = strdup(path);
    if (_path == NULL)
        goto out;

    /* Iterate the string */
    for (p = _path + 1; *p; p++) {
        if (*p == '/') {
            /* Temporarily truncate */
            *p = '\0';

            if (maybe_mkdir(_path, mode) != 0)
                goto out;

            *p = '/';
        }
    }

    if (maybe_mkdir(_path, mode) != 0)
        goto out;

    result = 0;

out:
    free(_path);
    return result;
}
/* mkdir_p.h */
#ifndef MKDIR_P_H
#define MKDIR_P_H

int mkdir_p(const char *path);

#endif /* MKDIR_P_H */
# SCons build script: builds mkdir_p_test from mkdir_p.c and test.c
env = Environment(
    CCFLAGS = ['-Wall', '-Werror'],
)

env.Program('mkdir_p_test', ['mkdir_p.c', 'test.c'])
/* test.c */
#include <stdio.h>
#include "mkdir_p.h"

int main(int argc, char **argv)
{
    const char *path;
    int rc;

    if (argc < 2) {
        fprintf(stderr, "Missing argument: path\n");
        return 1;
    }
    path = argv[1];

    rc = mkdir_p(path);
    /* %m is a glibc extension that prints strerror(errno) */
    fprintf(stderr, "mkdir_p(\"%s\") returned %d: %m\n", path, rc);

    return (rc == 0) ? 0 : 2;
}
@a32sailor

One note on PATH_MAX: it is not a guide to the longest permissible path. See, e.g., realpath(3).
If all you want to do is mkdir -p, I would take the simple approach and not worry about the path length: just try, and if the filesystem's maximum path length is exceeded, mkdir will fail with errno set to ENAMETOOLONG. You will need to track progress so you can unwind, of course, but that is easily done with a vector of created dirs.
Dealing with max path lengths is tricky.
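
The "track progress so you can unwind" idea could look roughly like the sketch below. It is not part of the gist: the name mkdir_p_unwind, its mode parameter, and the array of offsets standing in for the "vector of created dirs" are all assumptions made for illustration. Removal is best effort and only touches directories this call created.

```c
#define _POSIX_C_SOURCE 200809L  /* strdup() under strict -std=c99/c11 */
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>              /* rmdir(2) */

/*
 * Hypothetical variant of mkdir_p(): create every component of `path`.
 * If a later component fails, remove the directories this call created.
 * Returns 0 on success, -1 on failure with errno set.
 */
int mkdir_p_unwind(const char *path, mode_t mode)
{
    char *copy = NULL;
    size_t *created = NULL;      /* end offsets of components we created */
    size_t ncreated = 0, len, i;
    int saved_errno = 0;
    struct stat st;

    if (path == NULL || *path == '\0') {
        errno = ENOENT;
        return -1;
    }
    len = strlen(path);
    copy = strdup(path);
    /* A path of length len has at most len component boundaries */
    created = malloc((len + 1) * sizeof *created);
    if (copy == NULL || created == NULL) {
        saved_errno = ENOMEM;
        goto fail;
    }

    for (i = 1; i <= len; i++) {
        if (copy[i] != '/' && copy[i] != '\0')
            continue;
        copy[i] = '\0';          /* temporarily truncate at this component */
        if (mkdir(copy, mode) == 0) {
            created[ncreated++] = i;
        } else if (errno != EEXIST) {
            saved_errno = errno;
            goto fail;
        } else if (stat(copy, &st) != 0) {
            saved_errno = errno;
            goto fail;
        } else if (!S_ISDIR(st.st_mode)) {
            saved_errno = ENOTDIR;
            goto fail;
        }
        if (i < len)
            copy[i] = '/';       /* restore the separator */
    }
    free(created);
    free(copy);
    return 0;

fail:
    if (copy != NULL && created != NULL) {
        /* Unwind deepest-first; each truncation yields a shorter prefix */
        while (ncreated > 0) {
            copy[created[--ncreated]] = '\0';
            rmdir(copy);         /* best effort; errors are ignored */
        }
    }
    free(created);
    free(copy);
    errno = saved_errno;
    return -1;
}
```

A caller would use it like mkdir_p(), for example mkdir_p_unwind("a/b/c", 0777). Note that the unwind is not atomic: another process may have placed files in a freshly created directory, in which case rmdir simply fails and the directory stays.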

@db-inf

db-inf commented Jan 30, 2018

Your check that lets the loop continue when the error is only due to an existing path component could also verify that the component is a directory.

@tranqv

tranqv commented Mar 6, 2020

PATH_MAX is provided by <linux/limits.h>, not <limits.h>.

Yes, that's true.

In <limits.h>, we may get something like _POSIX_PATH_MAX (= 256), which comes from <bits/posix1_lim.h> when __USE_POSIX is defined.

In <linux/limits.h>, PATH_MAX = 4096.

@llothar

llothar commented Feb 16, 2022

Never use PATH_MAX.
Never.
It's never the truth even if you try to get it programmatically from sys_config.
Always use malloc/free.
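
As a minimal illustration of the malloc/free approach, here is a sketch of a path-joining helper that sizes its buffer from the inputs instead of relying on a PATH_MAX-sized array. The name path_join is hypothetical and not part of the gist.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated "dir/name".
 * The caller owns the result and must free() it.
 * Returns NULL if allocation fails.
 */
char *path_join(const char *dir, const char *name)
{
    /* One extra byte for the '/' and one for the terminating '\0' */
    size_t n = strlen(dir) + 1 + strlen(name) + 1;
    char *out = malloc(n);

    if (out != NULL)
        snprintf(out, n, "%s/%s", dir, name);
    return out;
}
```

The buffer is exactly as large as the result needs, so the function neither truncates long paths nor wastes a fixed-size array on short ones.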

@tangxinfa

if (mkdir(_path, S_IRWXU) != 0) {
    if (errno != EEXIST)
        return -1;
}

You must check whether _path is really a directory when errno is EEXIST; see my fork:

if (errno == EEXIST) {
  struct stat st;
  ret = stat(path, &st);
  if (ret != 0) {
    return ret;
  } else if (S_ISDIR(st.st_mode)) {
    return 0;
  } else {
    return -1;
  }
}

@JonathonReinhart
Author

@llothar @tangxinfa Thanks everyone for the feedback. I made a few improvements:

  1. Switch from fixed array to dynamic allocation to avoid PATH_MAX problems
  2. Check that existing paths are really directories, and return ENOTDIR if they are not
  3. Don't leak EEXIST on success
  4. Change default mode from 0700 to 0777 to match command line mkdir -p behavior (will be affected by umask)
  5. Add a test app
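
Item 4 above relies on the process umask being applied to the requested mode. The sketch below illustrates that interaction; the helper names effective_mode and mode_after_mkdir are assumptions made for this example and do not appear in the gist. It assumes the parent directory has no default ACL, which would override the umask.

```c
#include <sys/stat.h>   /* mkdir(2), stat(2), umask(2) */
#include <sys/types.h>

/* Hypothetical helper: permission bits expected for a directory
 * created with `requested` while the process umask is `mask`. */
mode_t effective_mode(mode_t requested, mode_t mask)
{
    return requested & ~mask & 0777;
}

/*
 * Hypothetical helper: create `path` with `requested` under a temporary
 * umask of `mask`, then report the permission bits actually stored.
 * Returns 0 on success, -1 if mkdir() or stat() fails.
 */
int mode_after_mkdir(const char *path, mode_t requested, mode_t mask, mode_t *out)
{
    struct stat st;
    mode_t old = umask(mask);   /* umask() returns the previous mask */
    int rc = mkdir(path, requested);

    umask(old);                 /* restore the caller's mask */
    if (rc != 0 || stat(path, &st) != 0)
        return -1;
    *out = st.st_mode & 0777;   /* ignore setgid and file-type bits */
    return 0;
}
```

With a typical umask of 022, a request for 0777 yields a directory with mode 0755, which is what command line mkdir -p produces as well. A request for 0700 would stay 0700 regardless of that mask.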

@gblargg
Copy link

gblargg commented Feb 23, 2023

BTW mkdir_p( "" ) accesses beyond allocated memory since it assumes the path length is non-zero. It seems it should fail with ENOENT in this case, as mkdir does for an empty string.

As for optimization, would it make sense to just try maybe_mkdir in the beginning, for the probably-common case where the parent directories already exist?

Thanks for the useful functions and careful implementation.
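
Both suggestions in this comment could be sketched as follows. This is a standalone illustration, not the gist author's code: mkdir_p_guarded and try_mkdir are hypothetical names, and try_mkdir only mirrors the contract of the gist's maybe_mkdir. The empty path is rejected with ENOENT up front, and a single mkdir attempt is made before falling back to the component-by-component walk.

```c
#define _POSIX_C_SOURCE 200809L  /* strdup() under strict -std=c99/c11 */
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Same contract as the gist's maybe_mkdir(): an existing directory is fine */
static int try_mkdir(const char *path, mode_t mode)
{
    struct stat st;

    if (mkdir(path, mode) == 0)
        return 0;
    if (errno != EEXIST)
        return -1;
    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISDIR(st.st_mode)) {
        errno = ENOTDIR;
        return -1;
    }
    return 0;
}

/* Hypothetical variant of mkdir_p() with an empty-path guard and a fast path */
int mkdir_p_guarded(const char *path)
{
    const mode_t mode = 0777;
    char *copy, *p;
    int result = -1;
    int saved_errno;

    /* Guard: "copy + 1" below would be out of bounds for an empty string */
    if (path == NULL || *path == '\0') {
        errno = ENOENT;
        return -1;
    }

    /* Fast path: succeeds at once when all parents already exist */
    if (try_mkdir(path, mode) == 0)
        return 0;
    if (errno != ENOENT)
        return -1;   /* only a missing parent can be fixed by the slow path */

    copy = strdup(path);
    if (copy == NULL)
        return -1;

    for (p = copy + 1; *p; p++) {
        if (*p == '/') {
            *p = '\0';           /* temporarily truncate */
            if (try_mkdir(copy, mode) != 0)
                goto out;
            *p = '/';
        }
    }
    if (try_mkdir(copy, mode) != 0)
        goto out;
    result = 0;

out:
    saved_errno = errno;         /* keep errno intact across free() */
    free(copy);
    errno = saved_errno;
    return result;
}
```

The fast path costs one extra failed mkdir call when parents are missing, in exchange for skipping the string copy and the per-component calls in the common case where they exist.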
