Regex Character Escapes Documentation

froggie
Posts: 301
Joined: Wed Jun 12, 2013 10:43 pm

Regex Character Escapes Documentation

Post by froggie »

Hi Void -

The regex documentation does not list any backslash escapes, but \. and \+ appear to be implemented. Are there others? Are there plans for more? (\\, \s and \d could be useful.)
void
Developer
Posts: 16743
Joined: Fri Oct 16, 2009 11:31 pm

Re: Regex Character Escapes Documentation

Post by void »

Everything uses the Henry Spencer Regex Library.
This is only a POSIX regular expression implementation.

A backslash (\) only escapes the character that follows it.

You can use [[:space:]] and [[:digit:]] to match spaces and digits respectively.
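
For readers who want to try the POSIX classes outside Everything, here is a minimal Python sketch. Python's re module is not the Henry Spencer library and has no [[:class:]] syntax, so the helper expand_posix_classes and its table are hypothetical: they simply rewrite the two class names mentioned above into explicit character lists.

```python
import re

# Hypothetical helper table: Python's re has no [[:class:]] syntax, so the
# POSIX class names are expanded to explicit character lists before compiling.
POSIX_CLASSES = {
    "[:space:]": r" \t\n\r\f\v",
    "[:digit:]": "0-9",
}

def expand_posix_classes(pattern):
    """Replace the supported POSIX class names with explicit characters."""
    for name, chars in POSIX_CLASSES.items():
        pattern = pattern.replace(name, chars)
    return pattern

digits = re.compile(expand_posix_classes("[[:digit:]]+"))
spaces = re.compile(expand_posix_classes("[[:space:]]"))

print(digits.findall("IMG_2017 (12).jpg"))   # ['2017', '12']
print(bool(spaces.search("my file.txt")))    # True
print(bool(spaces.search("my_file.txt")))    # False
```

In Everything itself no such translation is needed; the classes are typed directly as shown above.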

I hope to add a Perl/Tcl regex implementation in a future release of Everything.
mpag
Posts: 2
Joined: Mon Jun 12, 2017 10:09 pm

Re: Regex Character Escapes Documentation

Post by mpag »

I can't seem to escape the closing square bracket.

For example, I have the following regex:

([^-[_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.])+
It is meant to return non-alphanumeric filenames (ones that also don't contain a few other little-used characters).

However, adding

\]
within my regex doesn't seem to work. Placing it immediately before the closing square bracket shows only files WITH square brackets in the file name. Placing it in the middle of the set returns 0 files (I should still see things like Intel® Software Manager). Placing it at the beginning, escaped or not, returns either all files (when it comes before the second open square bracket) or 0 files (when it comes after). It doesn't seem to matter whether the second open square bracket is escaped. I've tried double, triple and even quad-escaping the close bracket, but nothing seems to work properly.

Is there a way to formulate the regexp to screen out files that are only included now because they include []?
void
Developer
Posts: 16743
Joined: Fri Oct 16, 2009 11:31 pm

Re: Regex Character Escapes Documentation

Post by void »

The ] character can be included in a bracket expression if it is the first (after the ^) character: []abc].
https://en.wikipedia.org/wiki/Regular_expression

For example:
([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.\-])+
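
The placement rule can be tried with a short Python sketch. Python's re module is used only as a stand-in here, on the assumption that it treats a leading ] the same way; it is not the engine Everything uses.

```python
import re

# "]" placed first, or first after "^", is a literal member of the set.
with_bracket = re.compile(r"[]abc]+")
without_bracket = re.compile(r"[^]abc]")

print(with_bracket.fullmatch("a]b") is not None)   # True
print(without_bracket.search("a]b"))               # None
print(without_bracket.search("a]b!").group())      # !
```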

Perl Compatible Regular Expressions (PCRE) have been added to Everything 1.4.
The following will also work with Everything 1.4:
regex:[^\x20-\x7f]+
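
As a rough illustration of what this shorter pattern selects, here is a Python sketch (Python's re stands in for PCRE; the file names are examples, the first taken from the post above). The range runs from space (\x20) to DEL (\x7f), so the negated set matches any character outside it.

```python
import re

# Matches one or more characters outside the range space (0x20) .. DEL (0x7F).
outside_ascii = re.compile(r"[^\x20-\x7f]+")

names = ["Intel® Software Manager", "readme.txt", "03df[1].png"]
matches = [name for name in names if outside_ascii.search(name)]

print(matches)  # ['Intel® Software Manager']
```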
therube
Posts: 4977
Joined: Thu Sep 03, 2009 6:48 pm

Re: Regex Character Escapes Documentation

Post by therube »

regex:[^\x20-\x7f]+

([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.\-])+
Should they be returning the same (or at least similar) results?

With regex:[^\x20-\x7f]+, I get what I'd expect.
With ([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.\-])+, I get nothing?


XP
1.41.877
void
Developer
Posts: 16743
Joined: Fri Oct 16, 2009 11:31 pm

Re: Regex Character Escapes Documentation

Post by void »

If regex is not enabled, please make sure you search for:
regex:"([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.\-])+"

The space after Z will need to be escaped.

I get many results including a ; with the above search. Otherwise they are almost the same search.
therube
Posts: 4977
Joined: Thu Sep 03, 2009 6:48 pm

Re: Regex Character Escapes Documentation

Post by therube »

Well at least I'm getting results by quoting it.

But...

regex:[^\x20-\x7f]+, looks to be giving me "non-ASCII".

Whereas ([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.\-])+ looks to be giving me non-non-ASCII, in other words ASCII, as if the ^ is not negating the rest of the items?

void wrote: "The space after Z will need to be escaped."
And that would be with what, \ ?
void
Developer
Posts: 16743
Joined: Fri Oct 16, 2009 11:31 pm

Re: Regex Character Escapes Documentation

Post by void »

If you use the regex: modifier you will need to escape any spaces with double quotes:
regex:"([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.\-])+"

Otherwise, you will be searching for:
regex:([^][_~0-9a-zA-Z AND '`^=!@%&$#,{}+\(\)\.\-])+


If you enable regex from the search menu, you can search for:
([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.\-])+

(without the regex: modifier and double quotes)
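
The effect of the quotes can be pictured with a small Python model. This is only a sketch of the behaviour described above, not Everything's actual parser; split_terms is a hypothetical function that splits a query on spaces unless they are inside double quotes.

```python
def split_terms(query):
    """Split a query on spaces, keeping double-quoted text together."""
    terms, current, in_quotes = [], [], False
    for ch in query:
        if ch == '"':
            in_quotes = not in_quotes      # quotes group text, they are not kept
        elif ch == " " and not in_quotes:
            if current:                    # a bare space ends the current term
                terms.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        terms.append("".join(current))
    return terms

unquoted = r'''regex:([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.\-])+'''
quoted = r'''regex:"([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.\-])+"'''

print(len(split_terms(unquoted)))  # 2
print(len(split_terms(quoted)))    # 1
```

In this model the unquoted search falls apart into two terms, and only the first one carries the regex: modifier, which matches the AND form shown above.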
mpag
Posts: 2
Joined: Mon Jun 12, 2017 10:09 pm

Re: Regex Character Escapes Documentation

Post by mpag »

thanks

([^][-_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.])+
with regex enabled seems to work...mostly.

Somehow 03df314e-c18c-4d6c-9b89-f5490b1b5c0b_8[1].png and a few other [1] files (that also have hyphens) in one of the following directories
  • C:\Windows7.old\Users\NameHere\AppData\Local\Microsoft\Windows\Temporary Internet Files\Low\Content.IE5\Subdir
  • C:\Users\NameHere\AppData\Local\Microsoft\Windows\INetCache\IE\Subdir
  • C:\Users\NameHere\AppData\Local\Microsoft\Internet Explorer\DOMStore\Subdir
  • C:\Users\NameHere\AppData\Local\Packages\winstore_cw5n1h2txyewy\AC\AppCache\Subdir\1or2
where Subdir is a "random" sequence like 4K9RX1O1
still appear, as does RErth-BT+GS[016]F-C.rar in C:\Users\NameHere\Downloads
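
A possible explanation, not confirmed in the thread: in [-_ the hyphen sits between two characters, so it forms a range from [ to _ instead of standing for itself. The set then no longer contains a hyphen, and every file name with a hyphen matches. The Python sketch below reproduces this with Python's re, which is assumed to parse this set the same way but is not the engine Everything uses.

```python
import re

# The pattern from the post above: "[" followed by "-_" forms a range.
pattern = re.compile(r"([^][-_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.])+")

name = "03df314e-c18c-4d6c-9b89-f5490b1b5c0b_8[1].png"
print(pattern.search(name).group())   # -

# Characters covered by the range from "[" (0x5B) to "_" (0x5F).
covered = [chr(c) for c in range(ord("["), ord("_") + 1)]
print(covered)                        # ['[', '\\', ']', '^', '_']
```

The hyphen is the only character in that file name outside the set, which would fit the observation that the leftover files all contain hyphens.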

notably, neither

([^]-[_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.])+
nor

([^]\-[_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.])+
return any results, so in addition to square brackets being a little funky and finicky, so is the hyphen (-). Escaping the hyphen in the upper "working" sample doesn't impact results. However, double- (or triple- etc) escaping hyphen suddenly returns files with double-square brackets (e.g. [[IMPORT]])
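
One likely reason the first variant returns nothing: ]-[ reads as a range from ] (0x5D) down to [ (0x5B), which runs backwards and is invalid. If the engine treats the backslash as an ordinary character inside brackets, as POSIX bracket expressions do, \-[ in the second variant would be a backwards range as well. The Python sketch below only shows the first case, using Python's re as a stand-in; compiles is a hypothetical helper.

```python
import re

def compiles(pattern):
    """Return True if the pattern is accepted by the regex compiler."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(r"[^]-[_~0-9]"))   # False: "]" to "[" is a backwards range
print(compiles(r"[^][_~0-9-]"))   # True: the hyphen is last, so it is literal
```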

Another thing that is strange about this filter is that many entries (e.g. Name of C: and no indicated Path) are listed twice in the results.
void
Developer
Posts: 16743
Joined: Fri Oct 16, 2009 11:31 pm

Re: Regex Character Escapes Documentation

Post by void »

The - character is treated as a literal character if it is the last or the first (after the ^, if present) character within the brackets: [abc-], [-abc].
https://en.wikipedia.org/wiki/Regular_expression

Please try:
([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.-])+
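
The hyphen rule and the suggested pattern can be checked with a short Python sketch. Python's re is assumed to follow the same first/last rule here; it is not the engine Everything uses, and the file names are the ones mentioned earlier in the thread.

```python
import re

# A hyphen that is first or last in the set stands for itself.
hyphen_last = re.compile(r"[abc-]+")
hyphen_first = re.compile(r"[-abc]+")
print(hyphen_last.fullmatch("a-b") is not None)    # True
print(hyphen_first.fullmatch("a-b") is not None)   # True

# The suggested pattern, with the hyphen moved to the end of the set.
suggested = re.compile(r"([^][_~0-9a-zA-Z '`^=!@%&$#,{}+\(\)\.-])+")
print(suggested.search("03df314e-c18c-4d6c-9b89-f5490b1b5c0b_8[1].png"))  # None
print(suggested.search("Intel® Software Manager").group())                # ®
```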