MD5 Property indexing for network drives seems to die
Hi,
I'm using Everything 1.5.0.1361a (x64) on Windows 10 22H2. I've several network drives on a Synology NAS attached, with lots of files. I know I've quite some duplicates there, so I defined an MD5 property so that information would be added to the index permanently (MD5 calculation on demand is quite slow). I'd like to use the MD5 property to find duplicates, since file names for files with same content might differ, and file size is not distinctive enough.
The problem I have with this setup is that a large part of the files on those NAS Network drives do not get the MD5 property during indexing, and I cannot figure out why. It looks like the indexing of the property halts somewhere. What would be the best way to find out what is happening here?
Regards, EricB
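For readers who want to see the idea outside Everything, here is a minimal Python sketch of duplicate detection by content hash. It is not how Everything implements its MD5 property. The function names `md5_of_file` and `find_duplicates` are made up for illustration. Files are grouped by size first, because a file with a unique size cannot have a duplicate, and only the remaining candidates are hashed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by (size, MD5); return groups with more than one file."""
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate, so skip hashing
        for path in paths:
            groups[(size, md5_of_file(path))].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

On a network share the hashing step dominates the run time, which is the reason for wanting the values stored in the index rather than computed on demand.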
Re: MD5 Property indexing for network drives seems to die
Progress is shown in the status bar.
Hover over the progress bar to show the current file being indexed.
Errors are reported in the Debug Logs.
Please try enabling debug logging and refreshing files with missing md5 properties:
- In Everything, from the Tools menu, under the Debug submenu, check Start Debug Logging.
- From the Tools menu, under the Debug submenu, check Verbose.
- Type in the following search:
!md5:
- Select all files (Ctrl + A)
- Press Ctrl + F5.
- Wait for Everything to stop indexing md5 values.
- Check your %TEMP%\Everything Debug Log.txt for any errors.
- Look for the following lines:
get property <property-id> <filename>
CreateFileW(): <error-code>: Failed to open file <filename>
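A verbose debug log gets large quickly, so a small script can help with the last step. The sketch below is Python and assumes the failure lines look exactly like the pattern quoted above, with a decimal error code; the real log may differ in detail. The function name `summarize_open_failures` is made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, taken from the pattern quoted in this thread:
#   CreateFileW(): <error-code>: Failed to open file <filename>
FAILED_OPEN = re.compile(r"CreateFileW\(\): (\d+): Failed to open file (.+)")


def summarize_open_failures(lines):
    """Count CreateFileW failures per error code and collect the affected files."""
    counts = Counter()
    files = {}
    for line in lines:
        match = FAILED_OPEN.search(line)
        if match:
            code = int(match.group(1))
            counts[code] += 1
            files.setdefault(code, []).append(match.group(2).strip())
    return counts, files
```

To run it on a real log, pass an open file object, for example `open(os.path.expandvars(r"%TEMP%\Everything Debug Log.txt"), errors="replace")`. The log encoding is not stated in this thread, so `errors="replace"` is a defensive choice.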
Re: MD5 Property indexing for network drives seems to die
Hello void,
Thanks for the prompt response. I've done as you described, and again saw indexing stopping intermittently.
Some observations:
- A lot of files in Windows system seem to render the CreateFileW error. I excluded drive c: from the !md5 query for now. Not sure if that is one of the reasons for halting.
- When not confined to a single network drive, indexing seems to go all over the place, seemingly picking random files for property calc. This is seen when hovering over the index progress bar. Probably due to the position in the index?
- Whenever the machine locks itself, and being left a bit longer, also indexing seems to halt. Still it does not seem to be in sleep mode.
Will give an update later on.
Regards, EricB
Update: even when excluding the c: drive from the !md5 query, I see several CreateFileW errors on c: appearing in the log. Does this mean that the indexing has a "backlog" that it wants to keep up with? Even when it broke off?
Update2: when confining to the single drive, I do not see those c: entries anymore. I excluded c: before by "!md5 !c:" but maybe the ! operator is not working properly in this case?
Re: MD5 Property indexing for network drives seems to die
I also found out on another machine that certain entries on a Google Drive volume are not MD5 property indexed:
extensions are gdoc gslides gsheet gjam gform lnk. I guess these are treated like hardlinks here and therefore skipped?
Re: MD5 Property indexing for network drives seems to die
> I excluded c: before by "!md5 !c:"
If you want to exclude (from the index rather than from display), you do that in:
Tools | Options | Indexes | Properties -> Exclude
> lots of files
How many is lots?
Re: MD5 Property indexing for network drives seems to die
Well, I wouldn't exclude c: from indexing, but exclude it from the list of files not having an md5 yet. Those are selected and triggered into property indexing by ctrl-F5.
About 320K files, which might be reduced quite some by deduplication.
Re: MD5 Property indexing for network drives seems to die
> I wouldn't exclude c: from indexing
Correct.
But as you do want to exclude the Property Indexing (of MD5) for C:, you would do that as above.
Re: MD5 Property indexing for network drives seems to die
Another observation:
After the select all and ctrl-F5 the indexing for the single network drive was busy. From the corner of my eye (this machine is next to me on the desk, working on another one) I saw suddenly the GUI refreshing, getting rid of the selection and the indexing stopped. So could it be the GUI is interrupting here?
I saw in the log that c: indexing kicked in at the very same time, so following TheRube's suggestion I excluded c: now from property indexing. However this triggered a full property indexing of all network drives. 279K files to go....
Re: MD5 Property indexing for network drives seems to die
I recommend using sidecar files to store md5 values.
> A lot of files in Windows system seem to render the CreateFileW error.
This is normal.
Everything will only access files with standard user rights to calculate the md5.
Everything will not be able to access system files.
> When not confined to a single network drive, indexing seems to go all over the place, seemingly picking random files for property calc. This is seen when hovering over the index progress bar. Probably due to the position in the index?
Everything should be gathering properties in path order.
Have you enabled Tools -> Options -> Advanced -> default_multithreaded?
-Please make sure this is set to: (Use default)
Have you enabled multiple threads for your network index under Tools -> Options -> Folders -> Right click Network Drive -> Advanced -> Threads?
-Please make sure this is set to: (Use default)
> Whenever the machine locks itself, and being left a bit longer, also indexing seems to halt. Still it does not seem to be in sleep mode.
I'm checking things my end..
Could you please send your Help -> Troubleshooting information.
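As an illustration of the sidecar suggestion, the Python sketch below writes a `<name>.md5` file next to each file in the usual md5sum style. Both the naming convention and the file layout are assumptions here; whether Everything reads exactly this form is not confirmed in this thread, so check its documentation first. The function name `write_md5_sidecar` is made up for illustration.

```python
import hashlib
from pathlib import Path


def write_md5_sidecar(path):
    """Write '<name>.md5' next to path in md5sum style and return the sidecar path."""
    path = Path(path)
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    sidecar = path.with_name(path.name + ".md5")
    # md5sum convention (assumed): hex digest, a space, '*' for binary mode, file name
    sidecar.write_text(f"{digest.hexdigest()} *{path.name}\n", encoding="utf-8")
    return sidecar
```

A sidecar travels with the file only if the tool doing the copy or move also handles the `.md5` file, which is the trade-off against values kept inside the index.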
Re: MD5 Property indexing for network drives seems to die
Hi void,
I understand that sidecar files would lessen the load on Everything itself, but the beauty of internal checksums, once generated, is that they are maintained with every file operation, be it copy, move or delete. Fewer issues with lingering remains, so to say.
void wrote: ↑Tue Dec 05, 2023 3:30 am
> Everything should be gathering properties in path order.
> Have you enabled Tools -> Options -> Advanced -> default_multithreaded? -Please make sure this is set to: (Use default)
> Have you enabled multiple threads for your network index under Tools -> Options -> Folders -> Right click Network Drive -> Advanced -> Threads -Please make sure this is set to: (Use default)
I've checked the multithreaded settings, and all is set to default. I was thinking that setting a higher priority might be advantageous, but didn't want to tinker.
Gladly. What is the preferred method? The contact address that is listed on voidtools.com?
Regards, EricB
Re: MD5 Property indexing for network drives seems to die
support@voidtools.com
-or-
anonymously with Bug Report (please paste the information in details)
Thank you.
Re: MD5 Property indexing for network drives seems to die
Sent it, might end up in Junk, since I attached the info as a text file.
Re: MD5 Property indexing for network drives seems to die
Got it, thanks.
I am currently looking into the issue.
The first thing to catch my eye was "exclude_recall_on_data_access":1 (Tools -> Options -> Properties -> MD5 -> Exclude recall on data access)
I wonder if these network files have the M attribute set?
Please check if M is shown in the attributes column for these files.
Re: MD5 Property indexing for network drives seems to die
void wrote: ↑Tue Dec 05, 2023 9:29 am
> The first thing to catch my eye was "exclude_recall_on_data_access":1 (Tools -> Options -> Properties -> MD5 -> Exclude recall on data access)
> I wonder if these network files have the M attribute set?
> Please check if M is shown in the attributes column for these files.
Checked that, but no. I just enabled this to make really sure that no cloud file would be downloaded on access. Not that it is really necessary, I always keep the files available locally. So overdoing it a bit here....
file: !md5 distinct:attributes comes up with A, HA, HS, HSA, RA attributes only for files on network drives. And it's mostly A.
Re: MD5 Property indexing for network drives seems to die
> I wonder if these network files have the M attribute set?
So Attribute 'M' is "OneDrive" specific?
Re: MD5 Property indexing for network drives seems to die
At least, this is what I thought. But I moved a bunch of duplicate files to a new folder within the same drive using the Advanced move within Everything, and I see that the MD5 property for those files is empty and recalculated.
@void can you confirm this is as designed? I'd reckon that when a file is moved, the index entry is updated, but that such a property would remain as is instead of being cleared.
Re: MD5 Property indexing for network drives seems to die
> @void can you confirm this is as designed?
Yes, unfortunately.
Everything sees the move as removed + added.
The md5 property value is removed and then re-gathered.
This is just a limitation with ReadDirectoryChanges.
If you rename the parent folder or rename the file then the md5 property is not cleared/regathered.
I have been testing your settings my end and haven't run into the issue yet.
Locking a work station appears fine.
The only way I was able to simulate the issue is when I pulled out the network cable to the PC.
Everything will silently fail when it tries to read the remaining md5 values.
The md5 values appear empty in Everything.
Maybe the network is disconnecting for you?
Could you please check your event viewer and see if there is a system event for a network disconnection.
If md5 values are missing I recommend the following:
Search for:
!md5:
Select all files and press Ctrl + F5.
I have on my TODO list to re-request missing md5 property values when the device comes back online.
I also have on my TODO list to keep retrying for a single property value for up to one minute.
Currently, Everything will abandon all remaining property requests for a volume if it goes offline.
-If your network is down for 1 second, all remaining property requests will fail for that volume.
Hopefully this change will only lead to one md5 value missing instead of all remaining md5 values..
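The difference between "abandon everything when the volume drops" and "retry a single value for up to a minute" can be sketched in Python as follows. This is only an illustration of the retry idea described above, not Everything's code. The names `read_with_retry` and `gather_properties` and the one-second delay are assumptions.

```python
import time


def read_with_retry(read_once, timeout=60.0, delay=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Call read_once() until it succeeds or `timeout` seconds have elapsed.

    Returns the value on success, or None if every attempt raised OSError.
    """
    deadline = clock() + timeout
    while True:
        try:
            return read_once()
        except OSError:
            if clock() >= deadline:
                return None  # give up on this one value only
            sleep(delay)


def gather_properties(paths, read_property, **retry_options):
    """Gather one value per path; a failure leaves a gap instead of aborting the rest."""
    results = {}
    for path in paths:
        results[path] = read_with_retry(lambda: read_property(path), **retry_options)
    return results
```

With this shape, a one-second network drop costs a short pause on the file being read at that moment, and the remaining requests continue once the share answers again.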
Re: MD5 Property indexing for network drives seems to die
Too bad, but I understand the reason.
My machine is a laptop connected by Wi-Fi. Although it is plugged in, it locks after some 15 minutes of idle time.
I checked, and I see Kernel power events for the system entering the Connected standby due to idle time. [Lock screen]
After that some network events "7026 - Dump after return from D3 before/after cmd".
Only when the timeout is longer I see nhi reporting this: "The driver entered RTD3. All the connected devices will be removed from driver's internal state, so it is expected that DeviceDisconnected events will happen."
Not sure if that is a real disconnect, since the network drives are certainly not offline and readily available when unlocking the laptop.
It seems however that the disruption is not only happening during Lock screen idle time. I saw this happening a few times:
EricB wrote: ↑Mon Dec 04, 2023 4:11 pm
> Another observation:
> After the select all and ctrl-F5 the indexing for the single network drive was busy. From the corner of my eye (this machine is next to me on the desk, working on another one) I saw suddenly the GUI refreshing, getting rid of the selection and the indexing stopped. So could it be the GUI is interrupting here?
Current state of affairs is now that I just have to check regularly if the indexing has stopped. If so, I've to Ctrl-A and Ctrl-F5 again. It takes some time, but in the end I'm getting there. And the MD5 property is working very well for finding duplicates, I've already identified multiple gigabytes of duplicates.
void wrote: ↑Wed Dec 06, 2023 4:51 am
> I have on my TODO list to re-request missing md5 property values when the device comes back online.
> I also have on my TODO list to keep retrying for a single property value for up to one minute.
> Currently, Everything will abandon all remaining property requests for a volume if it goes offline.
> -If your network is down for 1 second, all remaining property requests will fail for that volume.
> Hopefully this change will only lead to one md5 value missing instead of all remaining md5 values..
Even if these additions would not help my case, they seem very sensible for other cases.
Regards, EricB
Re: MD5 Property indexing for network drives seems to die
> Current state of affairs is now that I just have to check regularly if the indexing has stopped.
Could you please run Everything in verbose debug mode:
- In Everything, from the Tools menu, under the Debug submenu, check Verbose.
- From the Tools menu, under the Debug submenu, check Start Debug Logging.
---wait for Everything to stop property indexing---
- In Everything, from the Tools menu, under the Debug submenu, click Stop Debug Logging.
---this will open your %TEMP%\Everything Debug Log.txt in Notepad.
- Look for the following lines:
CreateFileW(): <error-code>: Failed to open file <filename>
- There should be a lot of these messages.
- What is the error code?
I'm working on adding a retry when gathering properties and the volume goes offline.
Re: MD5 Property indexing for network drives seems to die
Hello void,
Today I followed the procedure as described. It was happily MD5-ing; I left the machine for 10 minutes, and after unlocking it I saw that the indexing had stopped.
Since the debug file has become quite large (even zipped it is 12 MB), and I don't want to strip it, could I send it to you via Dropbox or WeTransfer?
Regards, EricB
Re: MD5 Property indexing for network drives seems to die
Thank you for the debug log.
2023-12-09 14:08:52.607: GetOverlappedResult M: <share-name> 64
(there's a lot of these, one for each share, same error code)
Error 64 is ERROR_NETNAME_DELETED
The specified network name is no longer available.
All pending property requests are aborted.
I am working on a solution..
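For anyone reading their own log, a small Python sketch like the one below can pull the drive, share and error code out of such lines and attach the Win32 name. The line layout is assumed from the single example quoted above. The table holds only a few well-known Win32 system error codes, and the function name `parse_overlapped_failure` is made up for illustration.

```python
import re

# A few standard Win32 system error codes that are relevant to network shares.
WIN32_ERRORS = {
    5: "ERROR_ACCESS_DENIED",
    32: "ERROR_SHARING_VIOLATION",
    53: "ERROR_BAD_NETPATH",
    59: "ERROR_UNEXP_NET_ERR",
    64: "ERROR_NETNAME_DELETED",
    67: "ERROR_BAD_NET_NAME",
}

# Assumed layout: "<date> <time>: GetOverlappedResult <drive>: <share-name> <code>"
OVERLAPPED = re.compile(r"^(\S+ \S+): GetOverlappedResult (\w:) (.+) (\d+)\s*$")


def parse_overlapped_failure(line):
    """Return a dict describing a GetOverlappedResult failure line, or None."""
    match = OVERLAPPED.match(line)
    if not match:
        return None
    timestamp, drive, share, code = match.groups()
    code = int(code)
    return {
        "timestamp": timestamp,
        "drive": drive,
        "share": share,
        "code": code,
        "name": WIN32_ERRORS.get(code, "unknown"),
    }
```

Counting these records per timestamp shows whether all shares dropped at the same moment, as they did in the log above.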
Re: MD5 Property indexing for network drives seems to die
Everything 1.5.0.1362a will now keep trying to read properties on offline volumes.
Everything should now only miss gathering a few properties when your network drops (instead of all remaining pending property requests on the offline volume)
Re: MD5 Property indexing for network drives seems to die
Thank you, David!
I'm going to test this in the coming days. Is Ctrl-F5 on the !MD5: search selection still the best way to continue indexing the properties in this case? Or should I just re-index the network drive?
Just out of curiosity, did you implement some mechanism to mark a network drive "dirty" if it loses connection, so indexing can be picked up again when the connection is restored?
Regards, EricB
Re: MD5 Property indexing for network drives seems to die
> I'm going to test this in the coming days. Is Ctrl-F5 on the !MD5: search selection still the best way to continue indexing the properties in this case?
Yes.
> Just out of curiosity, did you implement some mechanism to mark a network drive "dirty" if it loses connection, so indexing can be picked up again when the connection is restored?
No, not yet.
It's on my TODO list.
I did experiment doing this last week.
Unfortunately, to determine if a property value was gathered successfully is rather CPU and Disk IO expensive.
Re: MD5 Property indexing for network drives seems to die
Hi void,
I did some experimenting MD5-ing on Network drives with large amounts of files. I see an improvement insofar that the network disconnect does not seem to happen anymore. However, I got a GUI hangup twice, which unfortunately also undid all the MD5 progress, since the database wasn't flushed. Funny thing was that the console was still running stuff, but the GUI did not recover anymore.
I've retrieved the Windows crash reports for both events. Also I've got the debug logs for both runs. Do you want me to send them for analysis?
Regards, EricB