Talk:Find common directory path

From Rosetta Code

Shorter version for the C language:

<lang C>
#define PATH_MAX 127
#include <string.h>
#include <stdio.h>

static void longestSharedPath(const char *fixed, char *moving) {
    char *t;
    unsigned n = 0, l = strlen(fixed);
    while (moving[n] == fixed[n] && n < l) n++;
    if (strlen(moving) == n && (l == n || (l > n && fixed[n] == '/'))) return;
    moving[n] = '\0';
    t = strrchr(moving, '/');
    if (t && t != moving) *t = '\0';
}

int main() {
    char *dir_list[] = {
        "/home/user1/tmp/coverage/test",
        "/home/user1/tmp/covert/operator",
        "/home/user1/tmp/coven/members",
        NULL
    };
    int i = 0;
    char tmp[PATH_MAX];
    strcpy(tmp, dir_list[0]);
    while (dir_list[++i]) {
        longestSharedPath(dir_list[i], tmp);
    }
    printf("%s\n", tmp);
    return 0;
}
</lang>

I think that the last two lines of longestSharedPath are wrong: <lang c>
t = strrchr(moving, '/');
if (t && t != moving) *t = '\0';
</lang> This should place the null after the slash rather than eliminate the slash; otherwise the slash will not be available for future comparisons. If you change "coverage" to "dovetail" in the example data, I think this problem would raise its head. --Rdm 11:46, 14 April 2011 (UTC)

It is intended to work that way, and it works with "dovetail" also. The first conditional handles cases both with and without terminating slash. Try it. Per 12:07, 14 April 2011 (UTC)
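For anyone who wants to try the "dovetail" case without editing the full program, here is a minimal sketch. The function is the one posted at the top of this page, unchanged apart from layout. The helper fold_paths and its parameters are names made up for this illustration and are not part of the task code.

```c
#include <assert.h>
#include <string.h>

/* Function from the top of this page; behaviour unchanged. */
static void longestSharedPath(const char *fixed, char *moving) {
    char *t;
    unsigned n = 0, l = strlen(fixed);
    while (moving[n] == fixed[n] && n < l) n++;
    if (strlen(moving) == n && (l == n || (l > n && fixed[n] == '/'))) return;
    moving[n] = '\0';
    t = strrchr(moving, '/');
    if (t && t != moving) *t = '\0';
}

/* Hypothetical helper (not from the original post): copies the first
   path into out (cap bytes) and narrows it against every remaining
   entry of the NULL-terminated list. */
static const char *fold_paths(const char **list, char *out, size_t cap) {
    size_t i;
    assert(list[0] != NULL && cap > 0);
    strncpy(out, list[0], cap - 1);
    out[cap - 1] = '\0';
    for (i = 1; list[i]; i++)
        longestSharedPath(list[i], out);
    return out;
}
```

Calling fold_paths on the example list with "coverage" replaced by "dovetail" leaves "/home/user1/tmp" in the buffer: the mismatch is cut at "/home/user1/tmp/", the trailing slash is dropped, and the first conditional then accepts the next path because it continues with a '/'.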

OK, yes, never mind. However, that does not mean that the algorithm is completely valid. Consider, for example, what happens if you change an instance of "home" to "hone". Here's my proposed alternative: <lang c>static void longestSharedPath(const char *fixed, char *moving) {
       char *t;
       unsigned n = 0;
       while (moving[n] == fixed[n] && moving[n]) n++;
       if (!moving[n]) return;
       t = strrchr(moving, '/');
       if (t)
               if (t == moving)
                       moving[1]= '\0';
               else
                       *t = '\0';

}</lang> --Rdm 14:22, 14 April 2011 (UTC)

That function will break on a path that starts the same and then continues differently, e.g. if you add "/home/user1/tmp2/coven/members" to the list. However, there are some good ideas there. Here is my combined version:

<lang C>
static void longestSharedPath(const char *fixed, char *moving) {
    char *t;
    unsigned n = 0;
    while (moving[n] == fixed[n] && moving[n] && fixed[n]) n++;
    if (!moving[n] && (!fixed[n] || fixed[n] == '/')) return;
    moving[n] = '\0';
    t = strrchr(moving, '/');
    if (t && t != moving) *t = '\0'; // drop conflicting remainder
    else moving[1] = '\0'; // keep filesystem root
}
</lang>

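The point of that first if line is this: if all of moving is identical to the matching chars in fixed, and fixed either ends there or continues with a '/' into subdirectories, then moving is already a common path and we're done. Otherwise the shared prefix ends in the middle of a directory name and has to be cut back to the previous '/'.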
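To make the cases above easy to check, here is a small sketch around the combined version. The function body is copied from the post above; the helper fold_paths is a made-up name for this illustration only, and the three path lists in the note below are the ones discussed on this page.

```c
#include <assert.h>
#include <string.h>

/* Combined version from the post above; only the layout differs. */
static void longestSharedPath(const char *fixed, char *moving) {
    char *t;
    unsigned n = 0;
    while (moving[n] == fixed[n] && moving[n] && fixed[n]) n++;
    /* moving is a whole-directory prefix of fixed: nothing to trim */
    if (!moving[n] && (!fixed[n] || fixed[n] == '/')) return;
    moving[n] = '\0';
    t = strrchr(moving, '/');
    if (t && t != moving) *t = '\0';  /* drop conflicting remainder */
    else moving[1] = '\0';            /* keep filesystem root */
}

/* Hypothetical helper (not from the original post): copies the first
   path into out (cap bytes) and narrows it against every remaining
   entry of the NULL-terminated list. */
static const char *fold_paths(const char **list, char *out, size_t cap) {
    size_t i;
    assert(list[0] != NULL && cap > 0);
    strncpy(out, list[0], cap - 1);
    out[cap - 1] = '\0';
    for (i = 1; list[i]; i++)
        longestSharedPath(list[i], out);
    return out;
}
```

Traced by hand, this gives "/home/user1/tmp" for the original three paths, "/" when the second path starts with "/hone", and "/home/user1" when "/home/user1/tmp2/coven/members" is appended. It assumes absolute paths; a list of relative paths is not handled by this sketch.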

I was not able to make my version break. Can you give me the complete definition for dir_list[] which makes it break? Thanks. --Rdm 16:08, 14 April 2011 (UTC)

<lang C>
char *dir_list[] = {
    "/home/user1/tmp/coverage/test",
    "/home/user1/tmp/covert/operator",
    "/home/user1/tmp/coven/members",
    "/home/user1/tmp2/coven/members",
    NULL
};
</lang> should return "/home/user1". Per 16:14, 14 April 2011 (UTC)
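As a sketch of why this list matters, the snippet below runs Rdm's proposed alternative over it. The function body is the one from the 14:22 post with braces added around the nested if (behaviour unchanged); the helper fold_paths is a hypothetical name introduced only for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Rdm's proposed alternative; braces added, behaviour unchanged. */
static void longestSharedPath(const char *fixed, char *moving) {
    char *t;
    unsigned n = 0;
    while (moving[n] == fixed[n] && moving[n]) n++;
    if (!moving[n]) return;      /* moving fully matched: accepted as is */
    t = strrchr(moving, '/');    /* otherwise strip one trailing component */
    if (t) {
        if (t == moving)
            moving[1] = '\0';
        else
            *t = '\0';
    }
}

/* Hypothetical helper (not from the original post): copies the first
   path into out (cap bytes) and narrows it against every remaining
   entry of the NULL-terminated list. */
static const char *fold_paths(const char **list, char *out, size_t cap) {
    size_t i;
    assert(list[0] != NULL && cap > 0);
    strncpy(out, list[0], cap - 1);
    out[cap - 1] = '\0';
    for (i = 1; list[i]; i++)
        longestSharedPath(list[i], out);
    return out;
}
```

Traced by hand on the four-entry list, the buffer ends up as "/home/user1/tmp" rather than "/home/user1". By the time the "tmp2" path is compared, moving is "/home/user1/tmp", every character of it matches, and the function returns without checking that fixed continues with '2' instead of '/'.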

(elided) --Rdm 16:54, 14 April 2011 (UTC)

"home/user1/tmp" is not the longest common shared path any more. How about you drop by on irc to discuss this, instead of us abusing this talk page further? ;-) Per 17:03, 14 April 2011 (UTC)

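A correct Python version that does not reimplement commonprefix:
<lang Python>
import os

paths = ['/home/user1/tmp/coverage/test',
         '/home/user1/tmp/covert/operator',
         '/home/user1/tmp/coven/members']

# Split each path into components so that commonprefix compares whole
# directory names instead of single characters, then join them again.
os.path.sep.join(os.path.commonprefix([p.split(os.path.sep) for p in paths]))
</lang>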