Content search: Find a word not inside link <a> tag ❓

Post by **wise_mike** » Sat May 18, 2024 6:35 pm

In content search, I am trying to find a word that is not between an opening <a and a closing </a> tag.

This usually works in other regex engines, but in Everything it returns every occurrence of the word, including those inside a link tag:

regex*:content:<a.*?</a>[\s\S](*SKIP)(*F)|searchedword

Any suggestions?

Post by **NotNull** » Sat May 18, 2024 8:02 pm

The content: function defaults to using the system-defined iFilter for that specific file type/extension.
The iFilter basically extracts the text part of the file.
For example, an HTML file iFilter will roughly extract the text you see in a browser or preview pane, without all the tags etc.
(my assumption; didn't test it)
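
To illustrate why a tag-based regex could come up empty on iFilter output, here is a minimal Python sketch. It is not the Windows iFilter; it uses the standard library's `html.parser` as a stand-in for "extract only the visible text". The class name `TextOnly` and the helper `visible_text` are made up for this example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text nodes and drops all tags (hypothetical stand-in for an iFilter)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for the text between tags; the tags themselves are never stored.
        self.parts.append(data)


def visible_text(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


sample = '<p>See <a href="x.html">searchedword</a> here</p>'
text = visible_text(sample)
print(text)
```

Once the tags are gone, a pattern that relies on `<a` and `</a>` has nothing to anchor on, so it cannot tell linked words from unlinked ones.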

To bypass this behaviour, use the utf8content: function instead. This will treat the content as (UTF-8) plain text.

Depending on the encoding of your file, you might need to replace utf8content: with one of the following:
ansi-content:
ascii-content:
text-plain-content:
utf16-content:
utf16le-content:
utf16be-content:
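
If you are unsure which encoding a file uses, checking its byte order mark (BOM) is a quick first test. The sketch below is a rough helper, not part of Everything; the function name `guess_encoding_from_bom` is made up, and files without a BOM (common for UTF-8 and ANSI) will simply return `None`.

```python
import codecs
from typing import Optional


def guess_encoding_from_bom(data: bytes) -> Optional[str]:
    """Return an encoding label based on the leading BOM, or None if there is none.

    UTF-32 is deliberately ignored here; a UTF-32 LE file would be
    misreported as UTF-16 LE because its BOM starts with the same bytes.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16be"
    return None


print(guess_encoding_from_bom(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")))
```

To check a real file, pass its first few bytes, for example `open(path, "rb").read(4)`, and then pick the matching content function from the list above.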

Post by **void** » Sat May 18, 2024 11:46 pm

Please try content*: instead:

regex:content*:<a.*?</a>[\s\S](*SKIP)(*F)|searchedword

Post by **wise_mike** » Sun May 19, 2024 12:02 am

Thanks guys for your help.

Regarding searching tags, Version 1.5.0.1371a (x64) does find them, and I got results from all my files before.

Re regex:content*: and regex*:content:, they both give the same results: every occurrence of the SearchedWord(s), whether or not it is inside <a ... </a>. (Searching for the "SearchedWord(s)" between quotes in the "Content with Count" filter does return more results, though; I don't know why.)

Post by **void** » Sun May 19, 2024 2:07 am

Please try the following search:

regex:content*:<a.*?</a>(*SKIP)(*F)|searchedword

If this is matching an HTML file with searchedword inside <a></a>, could you please send the HTML file and the searchedword?
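
To sanity-check the expected result outside Everything, the same idea can be approximated in Python. Python's standard `re` module does not support the `(*SKIP)(*F)` verbs, so this sketch swaps in a different technique: match whole `<a>...</a>` elements first without capturing, capture the word otherwise, and keep only the hits where the capture group is set. The function name `find_outside_links` and the sample string are made up for illustration.

```python
import re


def find_outside_links(html, word):
    """Return start offsets of `word` that are not inside an <a>...</a> element."""
    pattern = re.compile(
        # Left branch consumes a whole link; right branch captures the word.
        r"<a\b.*?</a>|(" + re.escape(word) + r")",
        re.IGNORECASE | re.DOTALL,
    )
    # Matches from the left branch leave group 1 unset, so they are dropped.
    return [m.start(1) for m in pattern.finditer(html) if m.group(1) is not None]


sample = 'searchedword <a href="x">searchedword</a> and searchedword again'
hits = find_outside_links(sample, "searchedword")
print(hits)
```

In the sample, the occurrence inside the link is swallowed by the left branch, so only the two occurrences outside the link are reported. If a file shows this result in Python but still matches on the linked word in Everything, that file is a good candidate to send.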

The latest version should handle quoted and unquoted parameters the same.
Everything 1.5.0.1372a adds &paramstart: and &paramend:

voidtools forum
