Slew of time-consuming searches

Discussion related to "Everything" 1.5 Alpha.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Slew of time-consuming searches

Post by ChrisGreaves »

void wrote: Tue Jan 17, 2023 11:23 pm Please try reducing the number of files you are content indexing under Tools -> Options -> Content
While we are on the joined-at-the-hip topics of "content" and "excessive time", I do understand that Everything is fast because it starts off with an index of file characteristics. A search by file name, path, date-modified etc. is therefore a fast search in RAM.
I further understand that to satisfy "content", Everything must work through the index of found files (file name, path, date-modified etc.) and open each of just those files to examine the content.

I'm new at this, but I suspect that there must be a slew of time-consuming searches. For example, MP3 files are different from TXT files in that MP3 files have tags in the file header, which is to say, in the content of the file, so asking about an Artist or a BitRate ought to consume time just as does Content.

If that is so, then somebody (OK, I'll do it!) might flag all the functions/modifiers/what-have-you to indicate those features that require a file to be opened in order to be queried.
Cont would be flagged, that's for sure.
I am unsure where Dimension is stored. Or Orientation.

And while I am at it, I suspect that Everything is smart enough to determine what, in the Filter, can be satisfied without opening the file, and do all that filtering before opening the file to examine Content and the other time-consuming stuff.
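
The idea of filtering on indexed data first and opening files only for the survivors can be sketched roughly as follows. This is a minimal illustration, not Everything's actual implementation: the `Entry` record, the `staged_search` helper, and the in-memory `content` field (standing in for data that would need a file open) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Entry:
    """Hypothetical index record; `content` stands in for data on disk."""
    name: str
    size: int
    content: str


def staged_search(
    entries: Iterable[Entry],
    cheap: Callable[[Entry], bool],
    expensive: Callable[[Entry], bool],
) -> Tuple[List[Entry], int]:
    """Apply the cheap (indexed) test first; run the expensive test only on survivors.

    Returns the matching entries and the number of simulated file opens.
    """
    hits: List[Entry] = []
    opened = 0
    for entry in entries:
        if not cheap(entry):
            continue  # rejected from indexed data alone, no file open needed
        opened += 1  # this is where a real tool would have to read the file
        if expensive(entry):
            hits.append(entry)
    return hits, opened


entries = [
    Entry("a.txt", 10, "hello world"),
    Entry("b.mp3", 20, "hello"),
    Entry("c.txt", 30, "nothing"),
]
hits, opened = staged_search(
    entries,
    cheap=lambda e: e.name.endswith(".txt"),
    expensive=lambda e: "hello" in e.content,
)
print([h.name for h in hits], opened)
```

In this toy run only the two `.txt` entries reach the expensive test, so the `.mp3` entry is never "opened".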

Thanks, Chris
void
Developer
Posts: 16764
Joined: Fri Oct 16, 2009 11:31 pm

Re: everything indexes the content but is slow to provide results

Post by void »

Moved from everything indexes the content but is slow to provide results.



I have on my TODO list to indicate the performance of each search function.

Search Weights covers this a little.

Internally, every single search function is weighted.
I may publish these weights closer to release.

For now, there's 3 basic weights:
Indexed information = fast
Properties = slow
Content = very slow



I have categorized properties into the following types:
Content (very slow)
Metadata (slow)
File (medium)
Volume (medium)
Search (fast)
Index (fast)

To view the type of a property:
  • Right click the result list column header and click Add Columns...
  • Search for your property.
  • The property type is listed under the Type column.
All properties map to a search function.
Right click a column header and click Search for <property>
This will append the search function to your search box.
For example, the Width property has a width: search function.

The property type should give a good idea how fast the associated search function will perform.
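
As a rough sketch of how the property type could be used as a speed hint, the snippet below orders search terms so that the cheaper types come first. The numeric ranks are made-up placeholders (the real internal weights have not been published), and the type assigned to each example term is an assumption for illustration only.

```python
# Hypothetical ranks: lower means cheaper. Only the relative order
# (fast < medium < slow < very slow) follows the list above.
TYPE_RANK = {
    "Index": 1,     # fast
    "Search": 1,    # fast
    "File": 2,      # medium
    "Volume": 2,    # medium
    "Metadata": 3,  # slow
    "Content": 4,   # very slow
}


def order_terms(terms):
    """Sort (search_term, property_type) pairs from cheapest to most expensive.

    sorted() is stable, so terms of equal rank keep their original order.
    """
    return sorted(terms, key=lambda term: TYPE_RANK[term[1]])


# The type attached to each term here is assumed, not taken from Everything.
terms = [
    ("content:invoice", "Content"),
    ("width:>1920", "Metadata"),
    ("ext:jpg", "Index"),
]
for text, prop_type in order_terms(terms):
    print(f"{prop_type:<9} {text}")
```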



Indexing properties will drastically improve search performance at the cost of RAM usage.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: everything indexes the content but is slow to provide results

Post by ChrisGreaves »

void wrote: Sat Feb 04, 2023 3:13 am Search Weights covers this a little.
David, who put you in charge of rabbit-holes? :lol:

If I understand your weights-and-measures-act (and I but skimmed it) you will have a set of compile-time constants - weights - and at run-time will evaluate items within a filter so as to execute the fastest items first.
That makes sense.
For now, there's 3 basic weights:
Indexed information = fast
Properties = slow
Content = very slow

I have categorized properties into the following types:
Content (very slow)
Metadata (slow)
File (medium)
Volume (medium)
Search (fast)
Index (fast)
For our beginner (i.e. me) the basic trick must be:
(1) Open Tools -> Options, Index and see what is checked ON as indexable
(2) Start searching until you have an anticipated sub-population of objects that meets the criteria.
(3) Only then, introduce new terms to fine-tune list results that will suit your application. These new terms will be relatively time-consuming, but the overall cost will be reduced by the terms acting only on the selected sub-population.
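
A back-of-the-envelope calculation shows why step (3) pays off. The figures below (one million files, 5 ms per content check, 500 files left after narrowing) are invented purely for illustration, not measurements of Everything.

```python
def content_scan_ms(file_count: int, ms_per_file: int) -> int:
    """Total time in milliseconds to run a per-file content check on every file."""
    return file_count * ms_per_file


MS_PER_FILE = 5        # assumed cost of opening and checking one file
ALL_FILES = 1_000_000  # assumed size of the whole index
NARROWED = 500         # assumed sub-population left after the indexed terms

full_ms = content_scan_ms(ALL_FILES, MS_PER_FILE)
narrowed_ms = content_scan_ms(NARROWED, MS_PER_FILE)

print(f"content check on everything: {full_ms / 1000:.1f} s")
print(f"content check after narrowing: {narrowed_ms / 1000:.1f} s")
```

Under these assumptions the slow term costs 5000 seconds against the whole index but only 2.5 seconds against the narrowed sub-population.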
Indexing properties will drastically improve search performance at the cost of RAM usage.
David, who put you in charge of rabbit-holes? :lol:

Discussions of RAM, CPU, HDD speeds and so on come into their own when one gets into a production mode, I think. That is important for folks who are dealing with, say, millions of transaction files at Amazon. And when that happens, often enough Amazon will spring for a dedicated machine.
I can speak only for me, but when I am developing a solution, I tend to allocate whatever hardware it takes to get the solution to the point that it is acceptable to the client, so I am, in general, less worried about hardware usage.

That said, it seems to me that down the road, a series of controlled runs, run in batch mode, to report various aspects of speed, hardware use, is going to be required to validate and/or fine-tune your work on speeds. Correct?

Thanks, Chris
void
Developer
Posts: 16764
Joined: Fri Oct 16, 2009 11:31 pm

Re: Slew of time-consuming searches

Post by void »

That said, it seems to me that down the road, a series of controlled runs, run in batch mode, to report various aspects of speed, hardware use, is going to be required to validate and/or fine-tune your work on speeds. Correct?
Unless hard drives get faster than ram, the weights won't need to be changed.

Indexed information will be in RAM and will be fast.
Unindexed meta data / content will be on disk and will be slow.

Weights are also calculated on how much work has to be done.
Typically hardware improvements will scale evenly for all weights.
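
To illustrate the point about even scaling: if every weight shrinks by the same factor, the order in which terms would be evaluated does not change. The weight names and values below are hypothetical placeholders, not Everything's real weights.

```python
def rank(weights):
    """Return the weight names ordered from cheapest to most expensive."""
    return sorted(weights, key=weights.get)


# Placeholder values; only their relative order matters here.
weights = {"index": 1.0, "properties": 100.0, "content": 1000.0}

# Assume a hardware upgrade halves the cost of everything.
faster = {name: cost / 2 for name, cost in weights.items()}

print(rank(weights))
print(rank(faster))
```

Both calls print the same order, which is why a uniform speed-up would leave the relative weights untouched.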
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Slew of time-consuming searches

Post by ChrisGreaves »

void wrote: Tue Feb 21, 2023 11:13 am Unindexed meta data / content will be on disk and will be slow.
Void, thanks for all of this. My new laptop has an SSD, and I had supposed that SSDs replacing HDDs would have taken a bit of the wind out of your sails. But regardless of the drives' faster operation, users still need powerful search filters, which is, after all, what Everything provides, regardless of the speed of machine components.
Everything does what Microsoft says can't be done!
Chris