I'm a bit lost - PDF metadata

Discussion related to "Everything" 1.5 Alpha.
Michi
Posts: 69
Joined: Thu Jul 28, 2022 9:23 am

I'm a bit lost - PDF metadata

Post by Michi »

Hi!

I've done some testing with PDF properties and assigned some metadata to a pdf file.

In Directory Opus, the file looks like this:
Image: Directory Opus columns (directory opus.jpg)
Searching in ET, with some columns added, the file looks like this:
Image: ET search (et.jpg)
I'm missing the title, the subject and the tags.

When opening the pdf in Adobe Reader, looking at the file properties, I see:
Image: Adobe Reader (adobe.jpg)
ET is configured to index properties in Options-Properties: Comment, Description, Subject, Tags, Title for *.pdf files, so I wonder why I'm missing at least some of those properties?
void
Developer
Posts: 16755
Joined: Fri Oct 16, 2009 11:31 pm

Re: I'm a bit lost - PDF metadata

Post by void »

Thank you for the bug report Michi,

Could you please send your test PDF to support@voidtools.com?

I'll look into this issue.

I'm wondering if indexing these properties is causing the issue.
Michi
Posts: 69
Joined: Thu Jul 28, 2022 9:23 am

Re: I'm a bit lost - PDF metadata

Post by Michi »

I would not say it's a bug, rather an issue right now :D

I assume that Directory Opus stores the metadata somewhere else. At least the comment is saved to an ADS (alternate data stream), as Leo from the DO support team mentioned. Only the tags and the subject found their way into the PDF.

Nevertheless, ET has so far indexed only the comment, which I had previously edited in Directory Opus.

I just ran an additional test: I modified the subject and the tags again and had a look at ET's Index Journal:
Image: index journal.jpg
I guess ET recognized the file change. However, the search result still shows only the comment.

I've sent you the PDF file I'm currently testing on...

Thanks for looking into it!
Michael
Michi
Posts: 69
Joined: Thu Jul 28, 2022 9:23 am

Re: I'm a bit lost - PDF metadata

Post by Michi »

Just to be clear: my additional modification, which was recorded by the journal, was correctly indexed by ET. However, only the modified comment was indexed again, nothing else.
tuska
Posts: 1052
Joined: Thu Jul 13, 2017 9:14 am

Re: I'm a bit lost - PDF metadata

Post by tuska »

Maybe this picture can help: Picture 3. -> Extended.
The pdf document was opened in Adobe Acrobat 11.0.23 to display the document properties.
 
Image: 2022-10-25_PDF metadata_Properties.png
...
void
Developer
Posts: 16755
Joined: Fri Oct 16, 2009 11:31 pm

Re: I'm a bit lost - PDF metadata

Post by void »

Thank you for sending the test PDF.

The metadata is being stored as XMP instead of PDF metadata.
Everything currently only looks at the PDF metadata.

The PDF is using Cross-reference streams.
Everything doesn't support Cross-reference streams yet.

I am guessing Windows Explorer also does not show any metadata under Properties -> Details for this PDF?
Everything will fall back to the system to gather properties for the PDF.
In Everything, you should see the same as Windows Explorer.

I am looking into adding native XMP support and Cross-reference stream support.
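
For readers who want to check which of these mechanisms a given PDF uses, the distinction above can be sketched in Python. This is only a rough byte scan and not how Everything parses PDFs; the function name `describe_pdf_metadata` is made up for illustration, and the scan assumes the relevant dictionaries and the XMP packet are stored unencrypted and uncompressed, which is common but not guaranteed.

```python
import re


def describe_pdf_metadata(data: bytes) -> dict:
    """Report which metadata and cross-reference mechanisms appear in raw PDF bytes.

    Hypothetical helper: a plain byte scan, not a real PDF parser.
    """
    return {
        # The classic document information dictionary is referenced via /Info.
        "info_dictionary": re.search(rb"/Info\s+\d+\s+\d+\s+R", data) is not None,
        # An XMP packet is XML wrapped in <?xpacket ...?> / <x:xmpmeta ...>.
        "xmp_packet": b"<x:xmpmeta" in data or b"<?xpacket begin" in data,
        # A cross-reference stream is a stream object with /Type /XRef.
        "xref_stream": re.search(rb"/Type\s*/XRef\b", data) is not None,
        # A classic cross-reference table starts with the keyword "xref" on its
        # own line ("startxref" is excluded by requiring a preceding line break).
        "xref_table": re.search(rb"(?:\A|[\r\n])xref\s", data) is not None,
    }


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for feature, present in describe_pdf_metadata(handle.read()).items():
            print(f"{feature}: {present}")
```

A file that reports `xmp_packet` and `xref_stream` but no `info_dictionary` would match the situation described for the test PDF.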
Michi
Posts: 69
Joined: Thu Jul 28, 2022 9:23 am

Re: I'm a bit lost - PDF metadata

Post by Michi »

Hi! Just got your e-mail and your answer! Many thanks for this!

To be fair, this document is the only one I've seen so far with this issue, so it's certainly not a big problem. Anyway, I look forward to your enhancement, which will again put ET one step ahead of other similar tools!
Michi
Posts: 69
Joined: Thu Jul 28, 2022 9:23 am

Re: I'm a bit lost - PDF metadata

Post by Michi »

void wrote: Tue Oct 25, 2022 11:17 pm
"I am guessing Windows Explorer also does not show any metadata under Properties -> Details for this PDF?"
Yes, exactly!

With Adobe Reader installed, Explorer does not show even a single metadata property. However, Microsoft Edge is configured as the primary PDF viewer, if this matters.
Image: pdf property.jpg
Image: pdf details.jpg
David.P
Posts: 200
Joined: Fri May 29, 2020 3:22 pm

Re: I'm a bit lost - PDF metadata

Post by David.P »

Just discovered this interesting thread, which I'd like to join right away.

I am also looking for a way to search metadata in PDF files, e.g. the ones shown below:
Image

Have I set this up correctly below in Everything's options (using "Producer" as an example)?
Image

The file extensions *.mkv;*.mp4;*.avi;*.flv;*.webm were present by default, and I only added *.pdf.

Does Everything need to read the entire file contents of all corresponding files for indexing metadata? I'm asking because I have a few 100 GB of files on a slow remote Windows server over VPN.
void
Developer
Posts: 16755
Joined: Fri Oct 16, 2009 11:31 pm

Re: I'm a bit lost - PDF metadata

Post by void »

David.P wrote: Does Everything need to read the entire file contents of all corresponding files for indexing metadata?
No.
Properties of type "Metadata" are stored in a file header.
For these, Everything reads only the file header, not the entire file content.



For properties of type "Content", Everything reads the entire file content.
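
The difference in I/O cost between the two property types can be sketched in Python. This is an illustration only, not Everything's implementation: the function names are hypothetical, and the 64 KiB limit is an arbitrary assumption, since the size and location of the metadata region vary by file format.

```python
HEADER_BYTES = 64 * 1024  # assumed header size, chosen only for illustration


def read_header(path: str, limit: int = HEADER_BYTES) -> bytes:
    """Read at most `limit` bytes from the start of the file ("Metadata" style)."""
    with open(path, "rb") as handle:
        return handle.read(limit)


def read_content(path: str) -> bytes:
    """Read the whole file ("Content" style); cost grows with file size."""
    with open(path, "rb") as handle:
        return handle.read()
```

Over a slow link, the first function transfers a bounded amount per file regardless of file size, while the second transfers every byte.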
David.P
Posts: 200
Joined: Fri May 29, 2020 3:22 pm

Re: I'm a bit lost - PDF metadata

Post by David.P »

Thanks very much
o75steve
Posts: 3
Joined: Sat Feb 24, 2024 5:29 pm

Re: I'm a bit lost - PDF metadata

Post by o75steve »

Hi,
I found this topic through a forum search, and it is related to my question:
I'm trying to add the property "Created" from the PDF Metadata to the Properties in Everything.
I tried all properties with Created in the list of Properties in Everything, but it's not there.
It's also missing from the Application/pdf properties that are listed.
Would it be possible to add it in a future version?
I need it to get the original dates when the files were created. I got the files through a shared file system, and the created and modified dates shown in Windows Explorer are simply the dates when I copied them, not the dates when the files were originally created.
Many thanks in advance!
void
Developer
Posts: 16755
Joined: Fri Oct 16, 2009 11:31 pm

Re: I'm a bit lost - PDF metadata

Post by void »

Everything does not have a PDF created property.

The closest thing would be Date Created.
(When the file was created on your PC).

Date Created can be indexed under Tools -> Options -> Indexes -> Index Date Created.



Is the Original Date Created property available in your PDF viewer or Windows Explorer?

Please check all property system properties:
In Everything, select your PDF file.
Right click the result list column header and click Add column...
Select Windows Property System on the left.
Right click the property list column header and check Preview.
Go through all property values.
Are any useful dates listed?
o75steve
Posts: 3
Joined: Sat Feb 24, 2024 5:29 pm

Re: I'm a bit lost - PDF metadata

Post by o75steve »

The preview is quite handy, I was looking for something like it yesterday.
One thing that's useful is that you can sort on the preview, so that allows you to go through the available values even faster.
Unfortunately, the property I'm looking for isn't in either Windows Property System or the Everything Properties.

So the Original Date Created property is available in my PDF viewer. When I open it and go to Document Properties, under Description and then Document Info, it shows me the field "Created".

Image: image.png
Image: image.png
Looking further, it appears to be in the XMP Core Properties.
Image: image.png
It would be great if that could be added to the properties in some future version. :-)
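
Until such a property exists, the value can be pulled out with a small script. The following Python sketch assumes that the viewer's "Created" field corresponds to the XMP basic property `xmp:CreateDate`, that the packet uses the conventional `xmp:` prefix, and that the XMP packet is stored uncompressed and unencrypted in the file. The function name `xmp_create_date` is made up for this example.

```python
import re
from typing import Optional

# XMP allows a simple property to be written as an element or as an attribute.
_ELEMENT = re.compile(rb"<xmp:CreateDate>\s*([^<]+?)\s*</xmp:CreateDate>")
_ATTRIBUTE = re.compile(rb"""xmp:CreateDate\s*=\s*["']([^"']+)["']""")


def xmp_create_date(data: bytes) -> Optional[str]:
    """Return the xmp:CreateDate value from the first XMP packet, or None."""
    start = data.find(b"<x:xmpmeta")
    if start == -1:
        return None  # no XMP packet visible in the raw bytes
    end = data.find(b"</x:xmpmeta>", start)
    packet = data[start:end] if end != -1 else data[start:]
    match = _ELEMENT.search(packet) or _ATTRIBUTE.search(packet)
    if match is None:
        return None
    return match.group(1).decode("utf-8", errors="replace")


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        print(xmp_create_date(handle.read()))
```

The returned string is whatever the packet stores, normally an ISO 8601 style date; the sketch does not parse or validate it.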
void
Developer
Posts: 16755
Joined: Fri Oct 16, 2009 11:31 pm

Re: I'm a bit lost - PDF metadata

Post by void »

I have put on my TODO list to add support for PDF xml metadata.

Thank you for the suggestion.