Everything's ability to Process file lists.

Discussion related to "Everything" 1.5 Alpha.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Everything's ability to Process file lists.

Post by ChrisGreaves »

I seek confirmation that I am on The Right Track with Everything 1.5.
I am not looking for a solution here - a simple yes/no response will suffice, but I am seeking re-assurance that my general idea is possible.

I need to locate pairs (and only pairs, not triplets) of files on a hard drive.

I can write the traditional brute-force approach (see below) in WordVBA and run it over the next weekend. So I have "a solution".

But with what I know of Everything I can reduce the execution time by making use of Everything's ability to Produce and to Process file lists.

Code:

t: ext:msf
This search allows me to build an EFU file list of one type of the pair of files I seek.

This post in "Compare 2 file lists with each other?" introduces me to the concept of a process/filter/macro/user-function within Everything. New territory to me in Everything, but I suspect that if I scratch my head enough I might be able to make progress here in learning what appears to be a "programming language" within Everything and give me greater processing power down the road.

(1) Am I correct in thinking that I should work at gaining proficiency in this technique as a means to process EFU lists?
(2) Or is there a different part of Everything that might be more suitable?

I have no paid task riding on this; just a desire to see how else Everything excels at file-name processing.
Thanks, Chris
My Brute-Force algorithm:

Code:

For each folder on the drive
	For each file in that folder
		If the file's extension matches the parameter Then   ' we have one candidate for a pair
			If exactly one other file in the folder has this <name> Then   ' we still have a chance
				If that other file has no extension Then   ' we have found a pair
					Report this pair to the user
				End If
			End If
		End If
	Next file
Next folder
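
For comparison, the brute-force idea above could be sketched in Python rather than WordVBA. This is only an illustration under assumptions: the function name find_pairs and its parameters are made up here, and a "pair" is taken to mean a <name>.msf file plus an extensionless <name> in the same folder, with no third file sharing that name.

```python
import os
from collections import defaultdict


def find_pairs(root, ext=".msf"):
    """Yield (folder, [names]) for each folder that holds exactly two files
    sharing a stem: one with the given extension and one with no extension."""
    ext = ext.lower()
    for folder, _dirs, files in os.walk(root):
        # Group the file names in this folder by stem, ignoring case.
        by_stem = defaultdict(list)
        for name in files:
            stem, _ = os.path.splitext(name)
            by_stem[stem.lower()].append(name)
        for names in by_stem.values():
            exts = sorted(os.path.splitext(n)[1].lower() for n in names)
            # Exactly one extensionless file and one .msf file, nothing else.
            if exts == ["", ext]:
                yield folder, sorted(names)
```

It still walks every folder, so it shares the slowness of the original plan; it only shows the pairing test itself.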
void
Developer
Posts: 16735
Joined: Fri Oct 16, 2009 11:31 pm

Re: Everything's ability to Process file lists.

Post by void »

I need to locate pairs (and only pairs, not triplets) of files on a hard drive.
Currently, not possible with Everything.
Everything will find all duplicates.
I have on my TODO list to add functionality to limit the number of duplicates.

For now, you could export as EFU and process the list externally with a script.
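
A minimal sketch of such an external script, in Python. It assumes the exported EFU file is a CSV with a Filename column holding full paths and an optional Attributes column in which bit 16 marks a folder; check your own export before relying on that. The function name pairs_from_efu is hypothetical, and the list must cover every file in the folders of interest, not only the .msf files, or the "no third file" check cannot work.

```python
import csv
import ntpath
from collections import defaultdict


def pairs_from_efu(efu_file, ext=".msf"):
    """Return sorted (folder, stem) keys, lowercased, for which the file list
    holds exactly an extensionless <stem> and a <stem>.msf in that folder."""
    by_key = defaultdict(set)
    for row in csv.DictReader(efu_file):
        # Assumption: a numeric Attributes value with bit 0x10 set is a folder.
        attrs = (row.get("Attributes") or "").strip()
        if attrs.isdigit() and int(attrs) & 0x10:
            continue
        folder, name = ntpath.split(row["Filename"])
        stem, file_ext = ntpath.splitext(name)
        by_key[(folder.lower(), stem.lower())].add(file_ext.lower())
    wanted = {"", ext.lower()}
    return sorted(key for key, exts in by_key.items() if exts == wanted)
```

Typical use would be something like `with open("list.efu", newline="", encoding="utf-8") as f: print(pairs_from_efu(f))`, where the file name and encoding are placeholders.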
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Everything's ability to Process file lists.

Post by ChrisGreaves »

void wrote: Mon Jul 01, 2024 9:20 am
Currently, not possible with Everything. ... For now, you could export as EFU and process the list externally with a script.
Thank you Void.
My understanding is that I can produce an EFU list with Everything 1.5 and use that list to drive my (WordVBA) external script.
That does eliminate my need to program the two loops in my brute-force algorithm (not a big deal) and hence reduces the processing time (a HUGE deal!).
Cheers, Chris
therube
Posts: 4972
Joined: Thu Sep 03, 2009 6:48 pm

Re: Everything's ability to Process file lists.

Post by therube »

I need to locate pairs
But these are not "pairs", as in, duplicates, these are pairs, where 1 file is associated with another (making it a "pair").

So if your extension is: .zoo
& you have "tiger.zoo", then you want to find stem:tiger.zoo = "tiger" in the current directory,
& if found, then you have a "pair", "tiger.zoo" & "tiger".

And Everything can do that, no.
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Everything's ability to Process file lists.

Post by NotNull »

And I thought it was about files with the same stem, but with different extensions :?

Like:
- file.A, file.B and file.C will not be listed as there are more than 2 files with the same stem ("file")
- name.A and name.B will be reported as there are exactly 2 files with the same stem ("name")
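
That reading (same stem, any extensions, exactly two files) can be sketched in Python for the file names of a single folder. The function name stems_with_exactly_two is made up for this illustration.

```python
from collections import defaultdict
from pathlib import PurePath


def stems_with_exactly_two(names):
    """Group file names by stem and keep only the stems shared by exactly two files."""
    groups = defaultdict(list)
    for name in names:
        groups[PurePath(name).stem].append(name)
    return {stem: files for stem, files in groups.items() if len(files) == 2}


print(stems_with_exactly_two(["file.A", "file.B", "file.C", "name.A", "name.B"]))
# prints {'name': ['name.A', 'name.B']}
```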
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Everything's ability to Process file lists.

Post by NotNull »

Everything has quite a few Search Functions. Most of them concentrate on specific properties of a file or folder.
There are also a couple of what I call "family functions". They concentrate on common properties of ancestor, parent, sibling, child and descendant (did I forget one?). They are also listed under the Search Functions.

All these properties can be shown in extra columns -- beside the already shown Name, Path, Size, Date -- by right-clicking the column header, selecting Add Columns and choosing the one(s) to add.


It is also possible to create your own columns, by combining the data of the columns above with Formulas.
Consider them to be a kind of spreadsheet formulas that work on properties of a file/folder.
With them you can do some crazy and fun stuff. A more extreme example is shown in the link you posted.
More examples can be found with a forum-search for "a-label"

People familiar with spreadsheets should get the hang of it quite quickly, but it is not a "day 1 rabbithole" to dive into.


BTW: I found an answer to (what I think is) your question without using Formulas, so it is not needed in this case.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Everything's ability to Process file lists.

Post by ChrisGreaves »

therube wrote: Tue Jul 02, 2024 3:02 pm
I need to locate pairs
But these are not "pairs", as in, duplicates, these are pairs, where 1 file is associated with another (making it a "pair").
Thank you therube.
Yes, this is what I want. The file pairs are from Mozilla Thunderbird, so as an example:
[Image: Files_01.png]
Everything finds 133 *.MSF on my data partition T:. Almost all of these are older versions of the InBox and other common files from earlier Thunderbird installations. While they contain mainly out-of-date emails, from time to time I remember reading an email about, say, "wild bees in Newfoundland", and so I want Thunderbird to spend five minutes locating that email.
So my mission is to locate each <file>.MSF for which there is a <file>, and if these are the only two <file> in that folder, then generate a unique name for the pair and copy/delete or MOVE them to the Local Folders in the new profile folder.

In the image above I have circled the Inbox MSF that is my current Inbox, and an older Inbox which I moved experimentally from my data partition.
Given my ability with Word/VBA the algorithm to work through 134 filenames will be pretty easy, but I thought to first ask if Everything could do this.


Cheers, Chris
therube
Posts: 4972
Joined: Thu Sep 03, 2009 6:48 pm

Re: Everything's ability to Process file lists.

Post by therube »

(.msf are essentially text files.
So have Everything do a content: search on .msf for wild bees Newfoundland.
[I'm not particularly sure on the syntax for content:...]

Once Everything has found the particular .msf, have TB open that .msf.

Might even set up a separate TB Profile, perhaps called, msf_opener.
And msf_opener is a "mule", if you will, where [with TB closed] you throw any arbitrary .msf into its data directory, then open TB -profile "msf_opener" to load that particular .msf.)
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Everything's ability to Process file lists.

Post by ChrisGreaves »

NotNull wrote: Tue Jul 02, 2024 8:57 pm
And I thought it was about files with the same stem, but with different extensions :?
:clapping:
Spot on!

Code:

ext:msf stem:trash
[Image: Files_02.png]
"ext:msf" yields all the (supposed) Thunderbird MSF files of a (supposed) pair.
I now see/believe that "stem:" extracts what - in another life - I used to call the name part of name.ext
And as usual I have circled the most recent files at the head of the image - today's installation of Thunderbird.

Given my still-shaky knowledge of Thunderbird, I will write a small application to process each MSF file, look for a matching STEM, and build a list of folders with those stem items.
Then as I process each ITEM I can
(1) Check that it is a true pair, that there isn't a third item lurking in the wings
(2) Present a question to the user: Does this look like a pair of mail folders?
(3) And if so, rename each pair with a unique STEM and move the pair INTO Thunderbird's Local Folders folder AND delete the pair from the original folder.
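
Step (3) could look roughly like this in Python. It is a sketch under assumptions: the helper name move_pair and the "_2", "_3" numbering scheme are invented here, and it does not ask the user anything; the check in steps (1) and (2) would have to happen before it is called.

```python
import shutil
from pathlib import Path


def move_pair(folder, stem, dest, ext=".msf"):
    """Move <stem> and <stem><ext> from folder into dest, renaming both to a
    stem that is not yet used in dest. Returns the stem actually used."""
    folder, dest = Path(folder), Path(dest)
    new_stem, n = stem, 1
    # Try stem, stem_2, stem_3, ... until neither file of the pair would clash.
    while (dest / new_stem).exists() or (dest / (new_stem + ext)).exists():
        n += 1
        new_stem = f"{stem}_{n}"
    for suffix in ("", ext):
        # shutil.move removes the file from the original folder.
        shutil.move(str(folder / (stem + suffix)), str(dest / (new_stem + suffix)))
    return new_stem
```

Because the move deletes the originals, trying it on copies first seems prudent.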

Little by little, under this user's eyes, I will be able to move most of the MSF pairs, leaving only genuine puzzles.

To test this out I shall go out and mow hay where the lawn was supposed to be and mull over the idea.
Thanks for the confirmation!
Cheers, Chris
therube
Posts: 4972
Joined: Thu Sep 03, 2009 6:48 pm

Re: Everything's ability to Process file lists.

Post by therube »

Does this look like a pair of mail folders?
No.
The two are not going to be related to one another.


Now, the 7vo...pop.gmail.com & 7fp...pop.gmail.com likely are related, one being current, & the other containing data from 2 years back. (And that said, given the sizes, I'd expect they're essentially empty at that.)
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Everything's ability to Process file lists.

Post by ChrisGreaves »

therube wrote: Fri Jul 05, 2024 3:40 pm
Does this look like a pair of mail folders?
No.
The two are not going to be related to one another. Now, the 7vo...pop.gmail.com & 7fp...pop.gmail.com likely are related, one being current, & the other containing data from 2 years back. (And that said, given the sizes, I'd expect they're essentially empty at that.)
There are two areas of data here:-
(1) In Thunderbird specifically, when we see "InBox" or "Trash" or "2024" or "BernieL" or whatever, what to Thunderbird and its users is essentially "a mail folder" is a pair of files called variously:-
Inbox and Inbox.MSF
Trash and Trash.MSF
2024 and 2024.MSF
and BernieL and BernieL.msf
etc.

(2) Now these file pairs can be in any Windows folder that we choose (or set up in Thunderbird)
[Image: Files_01.png]
So in this image (from above in this topic) we can see EIGHT Windows folders.
The first folder shown is Thunderbird's InBox that we all know and love.
The second folder is an Inbox PAIR that I have dragged into the Local Folders tree within the Thunderbird PROFILE folder for this installation.
The remaining six Windows folders I must examine; for each of those Windows folders I will try to match a <stem>.MSF with a <stem> file, and if I find ONLY two such files, I will (rename and) drag those two files into the Local Folders Windows Folder of my current Thunderbird.

I know that that sounds confusing, and it is only a week since, after ~20 years of using Thunderbird, I learned how to carry forward mail from a previous installation.

I truly hope that, over time, this helps more than confuses.

Back to my original post in this topic: I can use Everything to make a file list that can be the launching-point for a discovery phase that I feel I have to monitor closely.
But at least I don't have to wander through 288,325 files in 29,590 folders; Everything will have done that for me in mere seconds. MUCH faster than Word/VBA.

Cheers, Chris
horst.epp
Posts: 1445
Joined: Fri Apr 04, 2014 3:24 pm

Re: Everything's ability to Process file lists.

Post by horst.epp »

I have used the Thunderbird Maildir format for a long time.
This stores every mail as a single .eml file and not in the strange single mailbox file format.
They can be content indexed and have no other dependencies.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Everything's ability to Process file lists.

Post by ChrisGreaves »

therube wrote: Fri Jul 05, 2024 3:35 pm
(.msf are essentially text files. So have Everything do a content: search on .msf for wild bees Newfoundland. [I'm not particularly sure on the syntax for content:...] Once Everything has found the particular .msf, have TB open that .msf.
Hi therube. If I understand this you are suggesting that the MSF could be scattered all over my data partition T: ??? Because that's my situation right now; ten or so different installations over the years, each in its own folder (because I didn't know any better). And that's not counting the MSF on my cumulative backup drive.
If all the relevant MSF are in the one folder, OldMSFPairs within Local Folders in this morning's current installation, then I can have Thunderbird search that cluster of INBOXes, TRASHes, 2024es etc, right? Which would suggest that I've already collected those "pairs" in the one Windows folder, right?

Might even set up a separate TB Profile, perhaps called, msf_opener. And msf_opener is a "mule", if you will, where [with TB closed] you throw any arbitrary .msf into its data directory, then open TB -profile "msf_opener" to load that particular .msf.)
And if I've got this right, it means that I copy/move an arbitrary MSF file into a folder for each search, have a separate "mule" installation of Thunderbird, and it will find only "wild honey bees" reported BY that MSF IN its related extent-less file?
I think then that Thunderbird would be searching only ONE MSF-pair, rather than going through my vast collection of old Thunderbird mail boxes?

Three possibilities:
(1) You have misunderstood how Thunderbird search works or
(2) I have misunderstood how Thunderbird search works or
(3) We BOTH have misunderstood how Thunderbird search works (grin!)

Cheers, Chris
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Everything's ability to Process file lists.

Post by NotNull »

@void is basically right: not possible with Everything.
But there's a "Hold my Beer" solution (basically doing something risky/stupid, because, well, too much beer ...)
You are not looking for a solution, but I give one anyway :mrgreen:

I was describing the thought process behind it, but that became more than a page long
(what was basically a couple of seconds of thinking; incredible what the mind can accomplish in such a short time. And about 20 failed attempts to make it work in the end; incredible how many errors one can make in such a short time).
Deleted it.
(The core idea here is to capture the name of the file matching the sibling: - pattern. Which is unlikely to use a supported method .. )


You can try this (you can leave the starting folder out when initial tests are successful):

Code:

"C:\starting folder\"   regex:^(.*)\.(msf|.+?)$   regex:sibling:^$1:\.(msf|.*)$   unique:path,regmatch0   dupe:path,stem 
[Image: 2024-07-05 20_48_34-Settings.png]

FWIW: Everything can also be very helpful in copying/moving these pairs.
(Menu => Edit => Advanced => Advanced Copy To Folder)
Might be a little tricky to set up correctly in this case, but will likely save you tons of time.
The people here will help you out (if needed).


My test files: onlydoubles.zip
horst.epp
Posts: 1445
Joined: Fri Apr 04, 2014 3:24 pm

Re: Everything's ability to Process file lists.

Post by horst.epp »

NotNull wrote: Fri Jul 05, 2024 6:35 pm You can try this (you can leave the starting folder out when initial tests are successful):

Code:

"C:\starting folder\"   regex:^(.*)\.(msf|.+?)$   regex:sibling:^$1:\.(msf|.*)$   unique:path,regmatch0   dupe:path,stem 
Thanks a lot for this.
I left out the starting folder and use it as Bookmark within the Folder view.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Everything's ability to Process file lists.

Post by ChrisGreaves »

NotNull wrote: Fri Jul 05, 2024 6:35 pm
You can try this (you can leave the starting folder out when initial tests are successful):

Code:

"C:\starting folder\"   regex:^(.*)\.(msf|.+?)$   regex:sibling:^$1:\.(msf|.*)$   unique:path,regmatch0   dupe:path,stem 
Thanks NotNull. After three minutes I realized that this is NOT a CMD prompt command and pasted it into Everything. I turned it loose on my data partition T: and went back outside to empty and sort through 150 containers that used to hold tree seedlings.
Back inside we found 18,000 items (so I think 9,000 pairs) from T: which holds 288,000 files - so mumble-mumble percent.
[Image: FilePairs_01.png]
The majority of the pairs are doc/htm, which means that my web compiler spits out a lot of web pages and uploads them.

I do not pretend to understand Regex stuff - it's been on my agenda for the past twenty years - so my brain parses this from left to right as:-
(1) only look on T:
(2) look for MSF files
(3) look for siblings of those found MSF files, where a sibling has a null extent
(4) but by the time I get to unique path and dupe path my mind is in a daze.
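
To see what the first regex term captures, the same pattern can be tried in Python. Python's re module is only a stand-in here; Everything's own regex engine may differ in details, and the helper name split_name is invented for this sketch. The other terms (sibling:, unique:, dupe:) are Everything-specific and are not modelled.

```python
import re

# First term of the query: name = <anything> "." <msf or any other extension>
NAME = re.compile(r"^(.*)\.(msf|.+?)$")


def split_name(name):
    """Return (stem, extension) as captured by the pattern, or None if no match."""
    m = NAME.match(name)
    return m.groups() if m else None


for name in ("Inbox.msf", "report.doc", "Inbox"):
    print(name, "->", split_name(name))
# Inbox.msf -> ('Inbox', 'msf')
# report.doc -> ('report', 'doc')
# Inbox -> None
```

In this stand-in the alternative `.+?` accepts any extension, so the term on its own does not restrict matches to MSF files, which would be consistent with a result list dominated by doc/htm pairs.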

Now this task to me is not essential (which is why I said "not looking for a solution" :mrgreen:), but just in case you didn't know: I do appreciate you developing this. I am reporting back here in case I have mis-edited something OR you want to see why (I suspect) I ended up with a list of 18,000 items which were mostly NOT msf pairs.
(signed) "Your beta-tester of Bonavista"
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Everything's ability to Process file lists.

Post by NotNull »

Hmm ... the query finds *all* "pairs" (there is a simpler query for that), not just the msf ones.
Will look at it tomorrow, but at first glance I made a mistake, be it copying or thinking.

Anyway: Thank you for testing!
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Everything's ability to Process file lists.

Post by NotNull »

In the meantime, could you test the following query?

Code:

"C:\starting folder\"   <regex:^(.*)\.(.+?|msf)$   regex:sibling:^$1:\.(.+|msf)$   unique:path,regmatch0   dupe:path,stem> <regmatch1:msf|ext:msf> 
BTW: what file extensions do the files typically have that are paired with the .msf files?
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Everything's ability to Process file lists.

Post by ChrisGreaves »

NotNull wrote: Sat Jul 06, 2024 10:01 pm
In the meantime, could you test the following query?
[Image: FilePairs_02.png]
Ran the test as given
ZERO items.
I know not what I do, so I started truncating the search string from the RHS:-
[Image: FilePairs_03.png]
6,970 items; they match in the STEM portion. So 3485(?) pairs.
[Image: FilePairs_04.png]
7,432 items.

(This would not be runnable if I were not using an external SSD for my data partition)
[Image: FilePairs_05.png]
8,337 items
[Image: FilePairs_06.png]
78,936 items

Items     Increase over previous
0
6970      6970
7432      462
8337      905
78936     70599

The difference table is shown above.
[Image: FilePairs_07.png]
I have 111 files with an extent of "MSF" within the t:\Greaves tree.
[Image: FilePairs_08.png]
Now generally we should expect a pair of files to appear, so that if we see "Input.MSF" we expect to find, in the same folder, the file "Input" as the screenshot above shows.
So why does "bernielynch.msf" appear without its buddy "bernielynch"? Because I didn't know what I was doing when I copied "bernielynch.msf" into the folder. I thought that the MSF files represented the Thunderbird mailbox for "Bernie Lynch"; now I know that I must copy a PAIR of related files for each mailbox in Thunderbird.

So in all the data above, you and I can't make a dead-accurate calculation of how many items we should see at each stage of your expression because in the past I have corrupted the purity of the pairs.
NotNull wrote: Sat Jul 06, 2024 10:01 pm BTW: what file extensions do the files typically have that are paired with the .msf files?
The file associated with the MSF file should be a file with the same STEM but with No extent (or if you prefer, with a NULL extent).

Cheers, Chris
tuska
Posts: 1052
Joined: Thu Jul 13, 2017 9:14 am

Re: Everything's ability to Process file lists.

Post by tuska »

To NotNull:
Out of interest I tested this (with a different file extension) - here is my result:
 
[Image: 2024-07-07_Pairs of files (partner files) - file extension and exactly one partner file with a different file extension.png]
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Everything's ability to Process file lists.

Post by ChrisGreaves »

tuska wrote: Sat Jul 06, 2024 11:36 pm
Out of interest I tested this (with a different file extension) - here is my result:
Hi Tuska. The first part of the image looks like my Goal EXCEPT _DSC8038 has three files, rather than a pair.
The second part of the image looks good; in my case the second part of the pair should have a null extent.

It could be argued that an all-Everything solution might start off by adding a unique extent to those null-extent files, but that complicates things. We can't just add extents to files willy-nilly. There again, once we have identified and moved the obvious pairs out of the way, we could undo that unique extent from remaining files, but oh! ...

Anyway, thank you for providing confirmation that there might yet be a way.
I am out of my depth with regex so apart from field-testing solutions I am of little use here.
Cheers, Chris
tuska
Posts: 1052
Joined: Thu Jul 13, 2017 9:14 am

Re: Everything's ability to Process file lists.

Post by tuska »

ChrisGreaves wrote: Sun Jul 07, 2024 12:22 pm The first part of the image looks like my Goal EXCEPT _DSC8038 has three files, rather than a pair.
_DSC8038 is NOT displayed in the search result, as the search is for pairs.
For me, the search result is therefore OK.

The question for me is whether in this example, after entering the file extension "xmp", the file with extension "raw"
should be found (this is not displayed due to 3 files).

However, I did not delve into this topic thoroughly enough, but only carried out a test out of interest.