Number of hits in Content Search?
Is there a way to display (and sort by) the number of hits of a search term in a given file?
Re: Number of hits in Content Search?
I have put on my TODO list to add a property to show the number of occurrences of a search term.
Thank you for the suggestion.
Re: Number of hits in Content Search?
One convoluted way to do this now is:
Search for:
#define:<t=your-search-term>regex:(#t:).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1) | regex:(#t:).*(\1) | regex:(#t:)
where your-search-term is your search term.
Then sort by the Regular Expression Matches 1-9 property.
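To make the idea behind that workaround easier to follow, here is a minimal Python sketch of the same trick: try the pattern with the most repetitions first and fall back to fewer, so the count is capped at 9. It uses Python's re module rather than Everything's PCRE, and the function name capped_hits and the sample names are hypothetical, chosen only for illustration.

```python
import re

def capped_hits(text: str, term: str, cap: int = 9) -> int:
    """Count occurrences of `term` in `text` (no newlines assumed), capped at `cap`."""
    first = "(" + re.escape(term) + ")"
    for n in range(cap, 0, -1):
        # One capture of the term followed by n-1 backreferences,
        # e.g. n=3 gives (term).*(\1).*(\1)
        pattern = first + r".*(\1)" * (n - 1)
        if re.search(pattern, text):
            return n
    return 0

# Hypothetical file names, ranked by capped hit count (highest first).
names = ["foo.txt", "foo_foo_bar_foo.txt", "bar.txt"]
ranked = sorted(names, key=lambda name: capped_hits(name, "foo"), reverse=True)
```

Anything with more than 9 occurrences still reports 9, which is the same limit the Matches 1-9 property imposes on the search above.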
Re: Number of hits in Content Search?
I can't test right now, but I think that will work fine for filenames and less well for file content, because it only finds multiple occurrences of the search term on a single line. So if there are 5 lines, each containing "searchterm" once, it will report 1.
For a halfway solution (as said: can't test), you need a "not null" character class, [^\x00], which makes the regex look past line boundaries.
With that:
c:\folder ext:txt regex:utf8content:"(searchterm)([^\x00]*\1){9}" | regex:utf8content:"(searchterm)([^\x00]*\1){8}" | (etcetera)
I do hope that showing the number of occurrences does not become the default, as it looks slower to me: reading a small chunk of data, checking whether it contains the search term and, if so, moving on to the next file, versus reading the complete file, counting the occurrences, and then moving on to the next file.
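The line-boundary point can be checked outside Everything. The sketch below uses Python's re module (not Everything's PCRE build, so treat it as an analogy): it compares a ".*" filler with a "[^\x00]*" filler on five lines that each contain the term once. The helper name max_repeats is hypothetical.

```python
import re

# Five lines, each containing the term exactly once.
text = "\n".join(["one searchterm here"] * 5)

def max_repeats(text: str, filler: str, cap: int = 9) -> int:
    """Largest n (up to cap) for which n occurrences can be chained with `filler`."""
    for n in range(cap, 0, -1):
        # The term once, then (filler + backreference) repeated n-1 times.
        pattern = "(searchterm)(" + filler + r"\1){" + str(n - 1) + "}"
        if re.search(pattern, text):
            return n
    return 0

per_line = max_repeats(text, r".*")           # "." stops at each newline
whole_file = max_repeats(text, r"[^\x00]*")   # the class also matches newlines
```

With these inputs the ".*" version finds only one occurrence per match attempt, while the "[^\x00]*" version chains all five.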
Re: Number of hits in Content Search?
Showing the number of hits for content should not be the default.
If one is really interested in such info, one can always start some tool or script from the results that shows word counts and other info.
Re: Number of hits in Content Search?
You can add (?s) to the beginning of your regex pattern to turn on PCRE_DOTALL [1][2][3]. This makes "." match \r and \n characters as well.
fox.txt:
The quick brown fox
jumps over
the lazy dog.
fails: regex:content:"fox.*dog"
works: regex:content:"(?s)fox.*dog"
@void: Can you add some /g counting from PCRE? Any chance of adding support for m//g or (?g) patterns?
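The fox.txt example is easy to reproduce with any engine that supports the inline (?s) flag. Here is a small sketch with Python's re module, used as a stand-in for Everything's PCRE; the variable names are illustrative only.

```python
import re

# Same three-line content as fox.txt above.
fox_txt = "The quick brown fox\njumps over\nthe lazy dog."

# Without DOTALL, "." does not match "\n", so the pattern cannot span lines.
without_dotall = re.search(r"fox.*dog", fox_txt)

# With the inline (?s) flag, "." also matches newlines.
with_dotall = re.search(r"(?s)fox.*dog", fox_txt)
```

Here without_dotall is None, while with_dotall matches from "fox" on the first line through "dog" on the third.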
Re: Number of hits in Content Search?
Heh, I always thought that was PCRE2 syntax .. But it does indeed work in Everything (PCRE1). Good to know!
Re: Number of hits in Content Search?
Regarding "@void: Can you add some /g counting from PCRE? Any chance of adding support for m//g or (?g) patterns?":
I have put this on my TODO list.
Thanks for the suggestion.
Re: Number of hits in Content Search?
Everything 1.5.0.1296a adds support for regex flags.
Regex flags can be enabled with the following search modifiers:
dotall:
. matches newlines
Regex alternative: (?s)
global:
Find all matches (not just the first).
If no capture groups are defined, each whole match is captured.
Regular Expression Match 0 captures from the start of the first match to the end of the last match.
multiline:
^ and $ match a whole line (not the whole text)
Regex alternative: (?m)
ungreedy:
Quantifiers are lazy by default; append ? (as in .*?) to make one greedy again.
case:
Match case.
Regex alternative: (?-i)
Instead of:
#define:<t=your-search-term>regex:(#t:).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1).*(\1).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1).*(\1) | regex:(#t:).*(\1).*(\1) | regex:(#t:).*(\1) | regex:(#t:)
you can now search for:
global:regex:term
The Regular Expression Matches 1-9 property will show all the matches.
Everything doesn't support the /regex/flags syntax.
Everything uses PCRE, and PCRE doesn't support a global flag.
I will consider adding support for my own (?g) flag.
For now, please use global:regex:
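For readers who know "global" matching from other tools, the behaviour described for global: resembles iterating over all matches. The following is only an analogy written with Python's re module, not Everything's implementation; the helper global_matches and the sample text are hypothetical.

```python
import re

def global_matches(pattern: str, text: str):
    """Return every whole match, plus the span from the first match start to the last match end."""
    matches = list(re.finditer(pattern, text))
    if not matches:
        return [], None
    # No capture groups in the pattern: collect each whole match.
    hits = [m.group(0) for m in matches]
    # Analogue of "Match 0": start of the first match to end of the last match.
    span = text[matches[0].start():matches[-1].end()]
    return hits, span

hits, span = global_matches(r"fo+", "foo bar fooo baz fo")
```

The number of hits is then just len(hits), which is the count the original question asked for.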
Re: Number of hits in Content Search?
Thanks for adding global! I think the syntax you chose is more than adequate, and I've never actually seen (?g) implemented in the wild. Your method is more than fine. (In-pattern flags like (?s) can usually be turned on and off by using their (?-s) counterpart, and can be used around a sub-pattern. There's no way to turn /g off, or to wrap /g around only a sub-pattern, so maybe (?g) isn't even appropriate.)
Also, I have wanted to ask for a long time about the (?i) and (?-i) pattern flags. If I'm not mistaken, they conflict with Everything's Match Case setting and only work if Match Case is enabled. In your opinion, should Match Case interfere with regex patterns, or should they be insulated from that option?
Re: Number of hits in Content Search?
Regarding "Also, I wanted to ask a long time ago about (?i) and (?-i) pattern flags.":
They work as expected.
That is, (?i) and (?-i) override case: and nocase: (or Match case from the Search menu).
The regex is initialized (compiled) with the case:/nocase: search modifier (or Match case from the Search menu).
(?i) / (?-i) will override the initial regex state.
nocase:regex:(?-i)ABC matches ABC (case sensitive)
case:regex:(?i)ABC matches abc or ABC or Abc etc... (case insensitive)
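The same "compile-time default, in-pattern override" behaviour can be sketched in Python's re module as a rough analogy. One difference is an assumption worth flagging: Python does not accept a bare (?-i) the way PCRE does, so the scoped form (?-i:...) is used instead.

```python
import re

# Analogue of nocase:regex:(?-i)ABC
# The compile flag makes matching case-insensitive; the scoped flag turns that off.
nocase_override = re.compile(r"(?-i:ABC)", re.IGNORECASE)

# Analogue of case:regex:(?i)ABC
# No compile flag (case-sensitive default); the inline flag turns insensitivity on.
case_override = re.compile(r"(?i)ABC")

only_exact = [s for s in ("ABC", "abc", "Abc") if nocase_override.search(s)]
any_case = [s for s in ("ABC", "abc", "Abc") if case_override.search(s)]
```

In both cases the flag written inside the pattern wins over the state the regex was compiled with.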
Another thing to note:
The capture groups when using global: in Everything are not standard (not that there is a standard, as global doesn't exist in PCRE).
Everything doesn't really have a way to expose capture groups for each match.
I find the current implementation works well enough.
I also skip over the previous match, so
II
III
Re: Number of hits in Content Search?
Everything 1.5 adds support for the following:
Show the number of occurrences of foo in the name:
addcolumn:a a:=STRINGCOUNT($name:,"foo") sort:a-descending
Show the number of occurrences of foo in the content:
addcolumn:a a:=STRINGCOUNT($content:,"foo") sort:a-descending
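As a rough picture of what those two searches compute, here is a Python sketch that counts a substring per item and sorts by the count, highest first. It is not Everything's implementation: the helper string_count, the sample names, and in particular the case-insensitive comparison are assumptions made for illustration.

```python
def string_count(haystack: str, needle: str) -> int:
    """Count non-overlapping occurrences of needle in haystack.

    Case-insensitive comparison is an assumption here, not documented behaviour.
    """
    return haystack.lower().count(needle.lower())

# Hypothetical file names standing in for $name: values.
names = ["foo.txt", "bar.txt", "foo_foo_FOO.log", "food_foo.md"]

# Equivalent in spirit to adding a count column and sorting it descending.
ranked = sorted(names, key=lambda name: string_count(name, "foo"), reverse=True)
```

For content rather than names, the same function could be applied to each file's text instead of its name.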