May 18, 2016 at 5:34 am #166371
We have an issue when creating PDF/A’s from existing PDF documents.
We create several documents using MS-Word, Excel, CAD applications etc. We then create PDF’s either by using the built-in PDF export/save as function (e.g. within Word 2013) or by using CutePDF Writer, which acts as a PDF ‘printer’.
The PDF’s are then merged together using CutePDF Professional. The resulting PDF is fully searchable.
We then use Adobe Acrobat XI to convert the PDF to a PDF/A standard file.
After doing this, although the file looks fine, the document is not searchable which defeats the point of creating an archival version of the file. When you copy text from the PDF and paste it into another document you get garbage: $
@ . +++
*:= ;6( )-
If you try to search, nothing is found because the text is not as shown, but instead comprises the characters seen above.
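For anyone who wants to check a batch of converted files for this symptom, here is a minimal Python sketch of one possible heuristic: it flags text as garbled when too few of its visible characters are letters or digits. It assumes the text has already been copied or extracted from the PDF by some other means. The function name `looks_garbled` and the 0.5 threshold are illustrative assumptions, not anything provided by Acrobat or CutePDF.

```python
def looks_garbled(text: str, min_alnum_ratio: float = 0.5) -> bool:
    """Return True if the text looks like the copy/paste garbage described above.

    Heuristic only: counts how many non-whitespace characters are letters
    or digits. The 0.5 threshold is an assumption and may need tuning.
    """
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        # Nothing extractable at all is also treated as a failure.
        return True
    alnum = sum(ch.isalnum() for ch in visible)
    return alnum / len(visible) < min_alnum_ratio


# The garbage pasted in the post above, and an ordinary sentence for contrast.
pasted_garbage = "$\n@ . +++\n*:= ;6( )-"
normal_text = "The resulting PDF is fully searchable."

print(looks_garbled(pasted_garbage))
print(looks_garbled(normal_text))
```

A check like this only detects the symptom. It says nothing about which step in the workflow caused it.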
Does anyone know how to fix this, please?
I’ve already asked on Adobe’s forums but have had no response in the month since the question was originally posted.
I forgot to mention that if we try to use an older version of Acrobat (9), the PDF’s are converted fine – except for one so far!
Anonymous May 22, 2016 at 5:37 am #371929
Silly question, but if you’ve got Adobe Acrobat XI, why use CutePDF at all? It sounds like there may be compatibility issues between the various applications. Have you tried doing everything after the ‘Office – save as’ step with only Acrobat XI? The problem may be related to the initial export from the Office apps, or to anything in between. Try to remove as many variables as you can to see if the end behavior changes. Have you verified that the text is searchable when only the ‘Office – save as’ step is done and the file is then opened in Adobe Reader DC, for example?
May 23, 2016 at 8:43 am #337124
We only have one license for Acrobat.
We have lots of staff creating these reports so they use CutePDF to create the PDF versions. The person who uses the computer with Acrobat on it does not have time to do this. This is why we purchased the cheaper CutePDF Professional program for the other staff. It has worked fine for many years.
It is just during the last few months that this issue has surfaced and it only affects a few PDF’s (apologies – important info I left out. Sorry).
So, the reports are created using CutePDF Professional and all of them are searchable. When using Acrobat to convert them to PDF/A most are searchable but a few are not.
Every PDF/A created using Acrobat V11 is now not searchable. Just a few created using V9 are not searchable.
I have read that this can be caused by the font not being properly referenced and that the PDF/A process cannot convert it because of a lack of metadata. It is this lack that produces the garbage when copying/pasting etc.
Anonymous May 23, 2016 at 12:55 pm #371930
I’ve just had a quick look at the definition of ‘PDF/A’ and it includes this: “PDF/A differs from PDF by prohibiting features ill-suited to long-term archiving, such as font linking (as opposed to font embedding). The ISO requirements for PDF/A file viewers include color management guidelines, support for embedded fonts, and a user interface for reading embedded annotations.” If the expected metadata about the fonts is missing, is it because Word never put it in, or because CutePDF doesn’t support it? How certain are you that all the necessary prefs are set correctly in the Acrobat XI install you’re using? How long ago did you install Acrobat XI vs how long ago did this issue begin to manifest?
When you say that “…reports are created using CutePDF Professional and all of them are searchable”, did you mean searchable by CutePDF, or by Adobe Reader, or by Acrobat XI before the PDF/A creation, or what? I get that at some stage it works but then stops working, the trick is nailing down the where/when it stops. Have you tried an alternate font on any doc which turns out non-searchable the first time? Were any of the V9 non-searchable results using the same fonts as the XI version results?
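The "nail down where it stops" idea can be sketched in Python as below. This is only an illustration under assumptions: it presumes you have already copied or extracted the text from the file at each stage of the workflow, and the stage names, sample strings and the function name `first_broken_stage` are all hypothetical.

```python
from typing import List, Optional, Tuple


def first_broken_stage(stages: List[Tuple[str, str]], phrase: str) -> Optional[str]:
    """Return the name of the first stage whose text no longer contains phrase.

    stages is a list of (stage_name, extracted_text) pairs in workflow order.
    Whitespace and letter case are normalised before comparing, because
    extracted text often has different line breaks from the original.
    Returns None if the phrase is found at every stage.
    """
    needle = " ".join(phrase.split()).casefold()
    for name, text in stages:
        haystack = " ".join(text.split()).casefold()
        if needle not in haystack:
            return name
    return None


# Hypothetical example: text pasted from the same report at three stages.
stages = [
    ("Word export", "Annual site report 2016"),
    ("CutePDF merge", "Annual  site\nreport 2016"),
    ("Acrobat XI PDF/A", "$ @ . +++ *:= ;6( )-"),
]
print(first_broken_stage(stages, "site report"))
```

Picking a phrase you know appears in the document and testing it at every stage avoids guessing which application is at fault.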
Hold the press: Just had a look for this kind of behavior, and found a few that talked about fonts. The security in Adobe was such that it should be searchable, but wasn’t. Try a non-searchable doc in Acrobat XI: go to ‘Tools – Edit document text’ then on each page Ctrl-A (select all) and pick a different font. Then save as the PDF/A and try the search again. Report back.
May 24, 2016 at 9:04 am #337126
Thanks a lot, RicklesP. I will give that a go – am bogged down with other things at the moment but will get back to you within a couple of days.
May 25, 2016 at 7:54 am #337127
Thanks for the suggestion but that has not worked. I’ve been looking into this a bit more and can’t understand what is happening. I’m going to have to check everyone’s settings – I just converted one of the problematic report documents to PDF, converted that to PDF/A and it is fine. Having read much more about it, I’m thinking this problem stems from the initial print job (substitute device font/download softfont etc.). I’ll report back if I solve it.
Anonymous May 25, 2016 at 12:54 pm #371932
We await with bated breath….
vishnusuryawanshi Member August 8, 2018 at 7:59 am #392077
It also depends on which Adobe Acrobat you are using along the way. You need the Pro version in order to make the file OCR compatible with other software. You might also try looking at it through this app: https://edit-pdf.pdffiller.com/ It’s a paid one, but there’s a free trial to test your files. If they are searchable there, they will be searchable in every other tool.
August 8, 2018 at 8:15 am #337406
Nothing like a necropost to jog the memory. Since my last post we replaced our main printer and added OCR functionality. We found the PDF’s were not searchable. When I spoke to their tech support, the tech guy suggested changing the page description language from PCL to KPDL (K for Kyocera). This worked. As it was the default printer for most of our systems, it sorted out the Acrobat issue as well.