Integrating with Everything to add folder sizes to Explorer

m417z
Posts: 27
Joined: Wed Nov 06, 2024 10:53 pm

Re: Integrating with Everything to add folder sizes to Explorer

Post by m417z »

Kilmatead wrote: Tue Dec 17, 2024 9:56 pm It's a poor-man's GetReparseTarget since it fails for any paths which may be temporarily unavailable (detached drive, etc), and can't provide the reparse buffer contents (the true path) to compensate. Though as that doesn't tend to much matter for non-power-users, it has potential.
I think it's fine, it will just result in a lack of size in this case, which makes sense if the drive is detached. I mean, what would I do with a path to a detached drive anyway?

Previously I thought that all GetReparseTarget/GetFinalPathNameByHandle does is resolve a single path, as long as the full path ends with a symbolic link. In my solution, there's no path traversal at all, and I can't think of a reason it would make things worse (aside from the minor pessimization I mentioned).
Kilmatead
Posts: 24
Joined: Thu Nov 07, 2024 1:22 pm

Re: Integrating with Everything to add folder sizes to Explorer

Post by Kilmatead »

m417z wrote: Tue Dec 17, 2024 10:03 pm In my solution, there's no path traversal at all...
Yeah, that part's the real bonus. Interestingly enough, the WinAPI entry notes that for SMB it walks the paths for you anyway...
The Server Message Block (SMB) Protocol does not support queries for normalized paths. Consequently, when you call this function passing the handle of a file opened using SMB, and with the FILE_NAME_NORMALIZED flag, the function splits the path into its components and tries to query for the normalized name of each of those components in turn. If the user lacks access permission to any one of those components, then the function call fails with ERROR_ACCESS_DENIED.
Not particularly important, but interesting nonetheless.
m417z
Posts: 27
Joined: Wed Nov 06, 2024 10:53 pm

Re: Integrating with Everything to add folder sizes to Explorer

Post by m417z »

Yesterday, I got an interesting bug report related to reparse points.

The report says that for OneDrive folders, non-empty folders show the size correctly, while empty folders show no size at all instead of "0 bytes". From the logs and screenshots, OneDrive folders are marked with FILE_ATTRIBUTE_REPARSE_POINT and the path resolves to itself, which is not something I had considered possible.

This is my implementation (pseudo-code) that manifests the bug:

Code: Select all

unsigned es_get_size(const WCHAR* folder_path, uint64_t* size) {
  // ...
  if (*size == -1 || (*size == 0 && IsReparse(folder_path))) {
    return ES_QUERY_NO_INDEX;
  }
  return ES_QUERY_OK;
}

uint64_t size;  // matches the uint64_t* parameter; -1 then means UINT64_MAX
unsigned result = es_get_size(folder_path, &size);

if (result == ES_QUERY_NO_INDEX) {
  string resolved_folder_path = GetFinalPathNameByHandle(folder_path);
  if (resolved_folder_path.succeeded && resolved_folder_path.str != folder_path) {
    result = es_get_size(resolved_folder_path.str, &size);
  }
}
As a solution, I changed the implementation to the following:

Code: Select all

unsigned es_get_size(const WCHAR* folder_path, uint64_t* size) {
  // ...
  if (*size == -1) {
    return ES_QUERY_NO_INDEX;
  }
  if (*size == 0 && IsReparse(folder_path)) {
    return ES_QUERY_ZERO_SIZE_REPARSE_POINT;
  }
  return ES_QUERY_OK;
}

uint64_t size;  // matches the uint64_t* parameter; -1 then means UINT64_MAX
unsigned result = es_get_size(folder_path, &size);

if (result == ES_QUERY_ZERO_SIZE_REPARSE_POINT) {
  string resolved_folder_path = GetFinalPathNameByHandle(folder_path);
  if (resolved_folder_path.succeeded) {
    if (resolved_folder_path.str == folder_path) {
      size = 0;
      result = ES_QUERY_OK;
    } else {
      result = es_get_size(resolved_folder_path.str, &size);
    }
  }
}
The new implementation considers zero-sized reparse points which resolve to themselves as actual zero-sized folders.

Does it look correct to you? Any gotchas you can think of with this implementation?
Kilmatead
Posts: 24
Joined: Thu Nov 07, 2024 1:22 pm

Re: Integrating with Everything to add folder sizes to Explorer

Post by Kilmatead »

This would seem to boil down to you not reporting any size at all (zero or otherwise) unless you get ES_QUERY_OK. The way I handle it is just to return size (0) on all errors, even once you've gone through the GetFinalPathNameByHandle fallback for any ES_QUERY_NO_INDEX events.

The whole reason the "size == 0 && IsReparse" test is necessary is because ES returns 0 when querying the link itself (the "root", if you will), not -1... however, ES does return -1 for anything behind the link, so the GetFinalPathNameByHandle is needed in both cases. I too disregard any paths which resolve to themselves (no need to waste a query), but, again, size remains 0.

All that said, your solution looks clean, and really not all that different - you just guarantee the ES_QUERY_OK - which solves the issue.

Incidentally, in case you're interested in gaining a few more microseconds per call, I've been experimenting with just leaving the pipe-connection open, and referencing it this way (pseudo-code):

Code: Select all

if (!pClient) pClient = Everything3_ConnectW(PIPENAME);
if (pClient) {
	...GetFolderSizeFromFilenameW...

	if (size == EVERYTHING3_UINT64_MAX && GetLastError() == EVERYTHING3_ERROR_DISCONNECTED) {
		DestroyClient(pClient);
		pClient = nullptr;

		return ES_QUERY_NO_ES_IPC;
	}
}
So you only need to destroy the client once, if disconnected (ES becomes unavailable - crashed, or the user closed it for some reason). Any following queries will just try to reconnect, and this has proven quite stable in my tests (killing/restarting ES over and over...). You might lose a query here and there but that would happen anyway if ES is on holiday. And the extra GetLastError() check costs nothing.
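The keep-the-pipe-open idea can be sketched in portable C++ roughly as follows. This is only an illustration under assumptions: `Backend`, `PersistentClient` and `QueryStatus` are hypothetical names, and the three hooks stand in for whatever connect, query and destroy calls the IPC layer really exposes. The mutex is one simple way to get the atomicity around the shared client pointer that a multithreaded host would need.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical result codes; only the "no IPC" idea comes from the post above.
enum class QueryStatus { Ok, NoIpc };

// Hypothetical hooks standing in for the real IPC calls.
struct Backend {
    std::function<void*()> connect;                               // nullptr if ES is unavailable
    std::function<bool(void*, const wchar_t*, uint64_t*)> query;  // false if the pipe is disconnected
    std::function<void(void*)> destroy;
};

class PersistentClient {
public:
    explicit PersistentClient(Backend backend) : backend_(std::move(backend)) {}
    ~PersistentClient() {
        if (client_) backend_.destroy(client_);
    }

    QueryStatus folder_size(const wchar_t* path, uint64_t* size) {
        std::lock_guard<std::mutex> lock(mutex_);    // guards client_ across threads
        if (!client_) client_ = backend_.connect();  // lazy connect, also used to reconnect
        if (!client_) return QueryStatus::NoIpc;

        if (!backend_.query(client_, path, size)) {
            // Disconnected: destroy once, the next call tries to reconnect.
            backend_.destroy(client_);
            client_ = nullptr;
            return QueryStatus::NoIpc;
        }
        return QueryStatus::Ok;
    }

private:
    Backend backend_;
    std::mutex mutex_;
    void* client_ = nullptr;
};
```

With this shape, a query that hits a dead pipe is lost, but the connection is only torn down at that moment and rebuilt on the next call, which matches the behaviour described above.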

It's (a bit) faster than closing/reopening 16,000 times, like our old friend WinSxS ;)

Code: Select all

00001 | 100 μs :: C:\Windows\WinSxS\amd64_3ware.inf.resources_31...
00002 | 62 μs :: C:\Windows\WinSxS\amd64_1394.inf.resources_31bf3...

...

16497 | 26 μs :: C:\Windows\WinSxS\x86_wwf-system.workflow.runtime...
IPC: Pipe+  Browsed in: 2.61s  Total Time: 553.56ms  Folders: 16497  Average: 33.56 μs
In other words, WinSxS is processed in 2.61 seconds, averaging 34 μs per subfolder, which isn't bad.

I haven't uploaded this approach yet or anything, since I still consider it experimental - but I've yet to have any trouble auto-reconnecting... unlike in my original tests on ES build 1384a (way back when), the current 1391a seems much more stable when intentionally poked and prodded.

Anyway, I no longer consider it "safer" to create/destroy instead. Just a thought. (For the multithreaded milieu you work in, some atomicity for pClient might be needed.)

Just something to toy with if you're back messing around in the code anyway. :)
m417z
Posts: 27
Joined: Wed Nov 06, 2024 10:53 pm

Re: Integrating with Everything to add folder sizes to Explorer

Post by m417z »

Kilmatead wrote: Fri Feb 14, 2025 3:23 pm The way I handle it is just to return size (0) on all errors, even once you've gone through the GetFinalPathNameByHandle fallback for any ES_QUERY_NO_INDEX events.
Previously you said:
Kilmatead wrote: Tue Dec 17, 2024 7:59 pm I don't like half-arsed solutions to things, so I just delegated any folder that was not a legitimately empty folder (0-size) to the file-manager and let it do it the slow way.
Am I missing something? I also don't see any reparse point resolving code in your latest version (3.0.0.0).
Kilmatead wrote: Fri Feb 14, 2025 3:23 pm however, ES does return -1 for anything behind the link, so the GetFinalPathNameByHandle is needed in both cases.
Good catch, that's a bug in my implementation, which doesn't handle subfolders of reparse points. I've now changed if (result == ES_QUERY_ZERO_SIZE_REPARSE_POINT) to if (result == ES_QUERY_ZERO_SIZE_REPARSE_POINT || result == ES_QUERY_NO_INDEX). It was like this before, and I didn't remember why. It just emphasizes the importance of leaving comments for my future self.

BTW I found having both ES_QUERY_NO_INDEX and ES_QUERY_NO_RESULT slightly confusing, especially with the old/new SDK differences. I just changed all usages to ES_QUERY_NO_RESULT.
Kilmatead wrote: Fri Feb 14, 2025 3:23 pm I too disregard any paths which resolve to themselves (no need to waste a query), but, again, size remains 0.
Interestingly, I wasn't able to create such a folder. Running C:\>mklink /j self-link C:\c-self-link results in a folder which fails to be resolved.
Kilmatead wrote: Fri Feb 14, 2025 3:23 pm [...]
In other words, WinSxS is processed in 2.61 seconds, averaging 34 μs per subfolder, which isn't bad.
Nice, down from 4 seconds (from your previous post), right?
Frankly I don't feel adventurous enough, maybe some day...
Kilmatead
Posts: 24
Joined: Thu Nov 07, 2024 1:22 pm

Re: Integrating with Everything to add folder sizes to Explorer

Post by Kilmatead »

m417z wrote: Fri Feb 14, 2025 4:39 pm Am I missing something? I also don't see any reparse point resolving code in your latest version (3.0.0.0).
Ah, yes, sorry, due to "real life" stuff I haven't updated the plugin properly (I just left the old version up because for most users reparse-points are "an amusing oddity", and my old way was sufficient, if slower).

If you're interested, this is the current version source-code (v3.0.0.2), with a bunch of refactoring thrown in just to confuse the issue (there have been no SDK changes relevant to the code we use, so that all remains the same). Since I still support XP, et al, I had to add all the LoadLibrary/GetProcAddress stuff around GetFinalPathNameByHandle and others for proper use. Suffice to say that "now" the code handles reparse-points as gracefully as possible. :)
m417z wrote: Fri Feb 14, 2025 4:39 pm BTW I found having both ES_QUERY_NO_INDEX and ES_QUERY_NO_RESULT slightly confusing, especially with the old/new SDK differences. I just changed all usages to ES_QUERY_NO_RESULT.
Yeah, I did something similar (well, I kept NO_INDEX) - originally they were two separate codes because the old ES SDK2 had those conditions as distinctly separate possibilities and it was useful for debugging, but the new SDK3 simplifies things a bit and doesn't have a NO_RESULT clause regarding GetFolderSize, so that stuff is legacy anyway. I still support the WM_COPYDATA usage model, but will actively discourage anyone from using it if they ask for my opinion. :D
m417z wrote: Fri Feb 14, 2025 4:39 pm
Kilmatead wrote: Fri Feb 14, 2025 3:23 pm I too disregard any paths which resolve to themselves
Interestingly, I wasn't able to create such a folder.
My observed behaviour of GetFinalPathNameByHandle is that any path (doesn't have to be reparse) will simply resolve back to itself unless it actually does have an alternate (such as link-redirection). So I discard duplicates just as a matter of course, and unless a link is broken in some way (or has a SYMLINK_FLAG_RELATIVE inside its REPARSE_DATA_BUFFER structure), most reparse-points behave the same, and worrying about weird edge-cases is not too productive.
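One practical detail when discarding paths that resolve to themselves: GetFinalPathNameByHandle returns DOS paths in the `\\?\` form (and `\\?\UNC\` for network shares), so a plain string comparison against the original path may never match. A minimal sketch of a tolerant comparison, assuming both inputs are absolute paths; the helper names are hypothetical, and the simple per-character lowercasing is only an approximation of how NTFS compares names:

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Removes the "\\?\" or "\\?\UNC\" prefix so the result looks like a regular path.
std::wstring strip_extended_prefix(const std::wstring& path) {
    const std::wstring unc = L"\\\\?\\UNC\\";
    const std::wstring ext = L"\\\\?\\";
    // Check the UNC form first, because "\\?\" is a prefix of it.
    if (path.compare(0, unc.size(), unc) == 0) return L"\\\\" + path.substr(unc.size());
    if (path.compare(0, ext.size(), ext) == 0) return path.substr(ext.size());
    return path;
}

// Drops one trailing backslash, except for drive roots such as "C:\".
std::wstring trim_trailing_slash(std::wstring path) {
    if (path.size() > 3 && path.back() == L'\\') path.pop_back();
    return path;
}

// True if the resolved path names the same location as the original one.
bool resolves_to_itself(const std::wstring& original, const std::wstring& resolved) {
    const std::wstring a = trim_trailing_slash(strip_extended_prefix(original));
    const std::wstring b = trim_trailing_slash(strip_extended_prefix(resolved));
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::towlower(a[i]) != std::towlower(b[i])) return false;
    }
    return true;
}
```

For example, `resolves_to_itself(L"C:\\link", L"\\\\?\\D:\\target")` is false, while a folder whose resolved form differs only by the prefix or by letter case counts as resolving to itself.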
m417z wrote: Fri Feb 14, 2025 4:39 pm Nice, down from 4 seconds (from your previous post), right? Frankly I don't feel adventurous enough, maybe some day...
Yeah, I had let the project rest for awhile then woke up with a "what if?" idea a few weeks ago and decided to mess around a bit. I imagine most humans wouldn't be able to tell the difference between 150 μs and 50 μs, but no doubt there's some autistic kid who really really really likes refreshing WinSxS over and over, so I code for that child, if he exists, simpatico. :D
m417z
Posts: 27
Joined: Wed Nov 06, 2024 10:53 pm

Re: Integrating with Everything to add folder sizes to Explorer

Post by m417z »

Kilmatead wrote: Fri Feb 14, 2025 6:12 pm Yeah, I did something similar (well, I kept NO_INDEX) - originally they were two separate codes because the old ES SDK2 had those conditions as distinctly separate possibilities and it was useful for debugging, but the new SDK3 simplifies things a bit and doesn't have a NO_RESULT clause regarding GetFolderSize, so that stuff is legacy anyway. I still support the WM_COPYDATA usage model, but will actively discourage anyone from using it if they ask for my opinion. :D
You're also detecting whether indexing is disabled, so ES_QUERY_NO_INDEX had that additional meaning. I ignore that and just return whatever Everything returns in this case (-1 in SDK2 and 0 in SDK3). I'll align with your choice :)

And 1.4 is still the latest stable version, so I assume most people use it. I did before I got into this mod and got some awareness about 1.5.
Kilmatead wrote: Fri Feb 14, 2025 6:12 pm Yeah, I had let the project rest for awhile then woke up with a "what if?" idea a few weeks ago and decided to mess around a bit. I imagine most humans wouldn't be able to tell the difference between 150 μs and 50 μs, but no doubt there's some autistic kid who really really really likes refreshing WinSxS over and over, so I code for that child, if he exists, simpatico. :D
Yeah, refresh it while it's sorted by size, no less.

BTW it looks like you still have a semantic difference compared to my implementation: if a zero-sized reparse point resolves to itself, I consider it a success, and you consider it an error. In your case it seems to affect the status column (right?), as well as the FT_DEFERHANDLING result. In the OneDrive case I mentioned it will be slightly weird: folders with a size will have the "Ok" status, but empty folders will have the "No Index" status. Not a big deal, just an observation.

I added ES_QUERY_ZERO_SIZE_REPARSE_POINT as a separate result because I want to report success for a zero-size reparse point that resolves to itself only if Everything actually indexes it as zero-sized, not if Everything doesn't know about it (ES_QUERY_NO_INDEX), which I don't think should happen at all.

So my current pseudo-code is:

Code: Select all

unsigned es_get_size(const WCHAR* folder_path, uint64_t* size) {
  // ...
  if (*size == -1) {
    return ES_QUERY_NO_INDEX;
  }
  if (*size == 0 && IsReparse(folder_path)) {
    return ES_QUERY_ZERO_SIZE_REPARSE_POINT;
  }
  return ES_QUERY_OK;
}

uint64_t size;  // matches the uint64_t* parameter; -1 then means UINT64_MAX
unsigned result = es_get_size(folder_path, &size);

if (result == ES_QUERY_ZERO_SIZE_REPARSE_POINT ||
    result == ES_QUERY_NO_INDEX) {
  string resolved_folder_path = GetFinalPathNameByHandle(folder_path);
  if (resolved_folder_path.succeeded) {
    if (resolved_folder_path.str == folder_path) {
      if (result == ES_QUERY_ZERO_SIZE_REPARSE_POINT) {
        size = 0;
        result = ES_QUERY_OK;
      }
    } else {
      result = es_get_size(resolved_folder_path.str, &size);
    }
  }
}
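For anyone who wants to exercise this logic outside of Explorer, here is a portable sketch of the same flow. It is a test harness under assumptions, not the mod's actual code: the `Env` hooks are hypothetical stand-ins for the Everything query, the reparse attribute check and the GetFinalPathNameByHandle wrapper, and paths are assumed to be already normalized so that a plain string comparison is enough.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

enum QueryResult { ES_QUERY_OK, ES_QUERY_NO_INDEX, ES_QUERY_ZERO_SIZE_REPARSE_POINT };

// Hypothetical hooks, injected so the logic can run without Everything or Win32.
struct Env {
    std::function<bool(const std::wstring&, uint64_t*)> query_size;   // false: no size known
    std::function<bool(const std::wstring&)> is_reparse;              // reparse attribute set?
    std::function<bool(const std::wstring&, std::wstring*)> resolve;  // false: resolution failed
};

QueryResult es_get_size(const Env& env, const std::wstring& path, uint64_t* size) {
    if (!env.query_size(path, size)) return ES_QUERY_NO_INDEX;
    if (*size == 0 && env.is_reparse(path)) return ES_QUERY_ZERO_SIZE_REPARSE_POINT;
    return ES_QUERY_OK;
}

QueryResult get_folder_size(const Env& env, const std::wstring& path, uint64_t* size) {
    QueryResult result = es_get_size(env, path, size);
    if (result != ES_QUERY_ZERO_SIZE_REPARSE_POINT && result != ES_QUERY_NO_INDEX) {
        return result;
    }

    std::wstring resolved;
    if (!env.resolve(path, &resolved)) return result;  // e.g. detached drive: keep the failure

    if (resolved == path) {
        // Resolves to itself: trust an indexed zero, but not a missing index entry.
        if (result == ES_QUERY_ZERO_SIZE_REPARSE_POINT) {
            *size = 0;
            result = ES_QUERY_OK;
        }
        return result;
    }
    return es_get_size(env, resolved, size);  // query the link target instead
}
```

Feeding it a fake index covers the three interesting cases: an empty OneDrive-style folder that resolves to itself, a link whose target is indexed, and a subfolder behind a link that the index reports as unknown.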
Kilmatead
Posts: 24
Joined: Thu Nov 07, 2024 1:22 pm

Re: Integrating with Everything to add folder sizes to Explorer

Post by Kilmatead »

m417z wrote: Fri Feb 14, 2025 6:33 pm You're also detecting whether indexing is disabled, so ES_QUERY_NO_INDEX had that additional meaning. I ignore that and just return whatever Everything returns in this case (-1 in SDK2 and 0 in SDK3).
The extra indexing-enabled check is only for SDK2, as that's the one that doesn't have a clearer way of distinguishing a proper 0-size folder from "any old error", and I expended a lot of effort on variations around that. Like I said, I'd just toss out SDK2 if I could, but like you said, a lot of people probably still use it. Heathens, philistines, and those who like pineapple pizzas. May they see the light, soon.
m417z wrote: Fri Feb 14, 2025 6:33 pm BTW it looks like you still have a semantic difference compared to my implementation: if a zero-sized reparse point resolves to itself, I consider it a success, and you consider it an error. In your case it seems to affect the status column (right?), as well as the FT_DEFERHANDLING result. In the OneDrive case I mentioned it will be slightly weird: folders with a size will have the "Ok" status, but empty folders will have the "No Index" status. Not a big deal, just an observation.
In the worst-case scenario (the initial query fails and the secondary link-resolution also fails) the size is still going to be zero, so that's returned - there are only so many ways to handle it. Whether it's an error or not is... debatable. In my case I always have FT_DEFERHANDLING to fall back on (let the file-manager itself handle it), so that's my get-out-of-jail-free card.

It's not a perfect system - I only added the status column to at least give users a way to see if something is working correctly or not. I had a lot of complaints from (lazy) network-obsessed users who "couldn't be arsed" to configure ES correctly themselves, and complained to me instead. Those are the ones who deserve pineapple pizza.
m417z wrote: Fri Feb 14, 2025 6:33 pm I added ES_QUERY_ZERO_SIZE_REPARSE_POINT as a separate result because I considered having a success result in case of a zero-size reparse point which resolves to itself only if Everything actually indexes it as zero-sized, not if Everything doesn't know about it (ES_QUERY_NO_INDEX) which I don't think should happen at all.
That's the problem - Everything doesn't index it, it just returns 0 because it's a link (which ES doesn't concern itself with), but it's technically still a folder object so it can't be said to not exist in the library. Index-limbo.

Without examining the actual contents of the OneDrive "self-link" reparse-buffer you begin to see the limitations of GetFinalPathNameByHandle - it doesn't really tell you anything when things get squirrelly. Remember when I called it a "poor-man's GetReparseTarget"? I wasn't being flippant. The NTFS reparse buffer tag itself contains multiple possibilities, IO_REPARSE_TAG_SYMLINK which has both absolute and relative path variants, and IO_REPARSE_TAG_MOUNT_POINT which (if GUID) is almost always a drive mount-point, but can also be a path to a same-drive non-root junction-point folder, depending on how the link was created. Interestingly enough, the REPARSE_DATA_BUFFER itself is technically a Union definition, not a static structure, so it's meant to be flexible.
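To make the distinction concrete, here is a small sketch that classifies a reparse tag once it has been read from the buffer. The numeric values mirror the Windows SDK definitions of IO_REPARSE_TAG_MOUNT_POINT, IO_REPARSE_TAG_SYMLINK, SYMLINK_FLAG_RELATIVE and the bit tested by the IsReparseTagNameSurrogate macro; they are repeated as plain constants so the sketch builds anywhere. `LinkKind` and `classify` are hypothetical names, and the guess that OneDrive placeholders carry a cloud-files tag rather than a link tag is an assumption that would need checking against the actual buffer.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the Windows SDK headers so this compiles without them.
constexpr uint32_t kTagMountPoint       = 0xA0000003u;  // IO_REPARSE_TAG_MOUNT_POINT
constexpr uint32_t kTagSymlink          = 0xA000000Cu;  // IO_REPARSE_TAG_SYMLINK
constexpr uint32_t kSymlinkFlagRelative = 0x00000001u;  // SYMLINK_FLAG_RELATIVE
constexpr uint32_t kNameSurrogateBit    = 0x20000000u;  // bit checked by IsReparseTagNameSurrogate

// Hypothetical classification, for illustration only.
enum class LinkKind { MountPoint, AbsoluteSymlink, RelativeSymlink, OtherSurrogate, NotALink };

// tag: the ReparseTag field; symlink_flags: the Flags field, meaningful for symlinks only.
LinkKind classify(uint32_t tag, uint32_t symlink_flags) {
    if (tag == kTagMountPoint) return LinkKind::MountPoint;  // junction or volume mount point
    if (tag == kTagSymlink) {
        return (symlink_flags & kSymlinkFlagRelative) ? LinkKind::RelativeSymlink
                                                      : LinkKind::AbsoluteSymlink;
    }
    // Other tags with the surrogate bit still stand in for another name.
    if (tag & kNameSurrogateBit) return LinkKind::OtherSurrogate;
    // Everything else carries the reparse attribute but does not redirect anywhere.
    return LinkKind::NotALink;
}
```

A tag that lands in `NotALink` would explain a folder that has the reparse attribute yet resolves to itself, which is the behaviour reported for the OneDrive folders.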

They just all happen to share the same superficial FILE_ATTRIBUTE_REPARSE_POINT attribute, but underneath they are not the same things, and I'd wager you found one of them and confused its behaviour for another. Hard to know without digging into the link.
Last edited by Kilmatead on Fri Feb 14, 2025 8:33 pm, edited 1 time in total.
m417z
Posts: 27
Joined: Wed Nov 06, 2024 10:53 pm

Re: Integrating with Everything to add folder sizes to Explorer

Post by m417z »

Kilmatead wrote: Fri Feb 14, 2025 8:16 pm That's the problem - Everything doesn't index it, it just returns 0 because it's a link (which ES doesn't concern itself with), but it's technically still a folder object so it can't be said to not exist. Index-limbo.
Which case are you referring to? The OneDrive folders are being indexed (both their existence and their size), despite the FILE_ATTRIBUTE_REPARSE_POINT attribute; see the user's screenshots here:
https://github.com/ramensoftware/windha ... 2658027944

For "regular" reparse points, what I observe is that Everything indexes the folder's existence, but not its size (zero is returned).

So for me, the "resolves to itself" check is a way to distinguish between a "zero-sized reparse point" and an "unknown-sized reparse point". That gives the same result for zero-sized and other OneDrive folders. I don't know how reliable this way of distinguishing the two cases is, but at least for the OneDrive case it works.
Kilmatead wrote: Fri Feb 14, 2025 8:16 pm They just all happen to share the same superficial FILE_ATTRIBUTE_REPARSE_POINT attribute, but underneath they are not the same things, and I'd wager you found one of them and confused it with another.
Probably the most correct way to handle it is to look at what's underneath, but if I have a way that works in all (or the vast majority of) cases, I'd rather not go there for now.
Kilmatead
Posts: 24
Joined: Thu Nov 07, 2024 1:22 pm

Re: Integrating with Everything to add folder sizes to Explorer

Post by Kilmatead »

m417z wrote: Fri Feb 14, 2025 8:26 pm Which case are you referring to? The OneDrive folders are being indexed (both their existence and their size), despite the FILE_ATTRIBUTE_REPARSE_POINT attribute...

For "regular" reparse points, what I observe is that Everything indexes the folder's existence, but not its size (zero is returned)
Exactly. That the OneDrive (and only the OneDrive) folders are indexed is the crux: it's called a reparse-point, but under the bonnet it's doing its own network thing, which for some reason ES is perfectly happy with.
m417z wrote: Fri Feb 14, 2025 8:26 pm but if I have a way that works in all (or the vast majority of) cases, I'd rather not go there for now.
Now you know how I felt about the SDK2 approach to pseudo-empty folders. :)