Copy & Paste - problem

#1 Copy & Paste - problem

Posted by: maki | Date: 2020-06-23 19:02

What can I do when text copied from a PDF file (copy / paste) comes out with broken Polish or Russian characters (in fact, any non-English characters)?
I want to paste the text into PSPad or any other editor.

Example of copied text (some Polish letters are corrupted):

yczliwe, oparte na Biblii propozycje i rady mog si
znacznie przyczyni ´
c do twych postp ´
ow, nawet gdyby ´
s
uczszczał do tej szkoły od wielu lat (Prz. 1:5).
Czy chciałby ´
s robi ´
c szybsze postpy? Bdzie to mo ˙
zliwe, je ´
sli wy

Example of a wrong character:
=============
[Main Instruction]
´
U+00B4
Central European (Windows): 0xB4

[Content]
ACUTE ACCENT
========
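
A report like the one above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `describe` is hypothetical, and it assumes that "Central European (Windows)" means the Windows-1250 code page (`cp1250` in Python).

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point, Unicode name and cp1250 byte of one character."""
    code_point = f"U+{ord(ch):04X}"
    name = unicodedata.name(ch, "<unnamed>")
    try:
        # Assumption: "Central European (Windows)" is code page 1250.
        cp1250 = f"0x{ch.encode('cp1250')[0]:02X}"
    except UnicodeEncodeError:
        cp1250 = "not in cp1250"
    return f"{code_point} {name} (cp1250: {cp1250})"

print(describe("\u00b4"))  # U+00B4 ACUTE ACCENT (cp1250: 0xB4)
```

Running it on a stray character from the pasted text shows whether it is a real letter or a standalone spacing accent.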

Edited 1 time(s). Last edit at 2020-06-23 19:04 by maki.


#2 Re: Copy & Paste - problem

Posted by: pspad | Date: 2020-06-23 19:10

This is not a PSPad-related question. The problem is in your PDF. A PDF is a final exported document meant for printing, not for further editing. Some characters may be "painted" (drawn as graphics rather than stored as text), etc.

The solution is to find another PDF or to use an OCR tool such as ABBYY FineReader.


#3 Re: Copy & Paste - problem

Posted by: maki | Date: 2020-06-24 07:12

Never in my life have I managed to extract text correctly from any PDF file with any "OCR" program.
The text always, but always, comes out damaged. And I have tested this on thousands of different PDF files.
I have used dozens of programs, including professional ones with Unicode support, and the result is always deplorable!
This is unbelievable! Why does this happen?!


#4 Re: Copy & Paste - problem

Posted by: pspad | Date: 2020-06-24 08:12

Because PDF is a final format for print: it looks the same on all platforms.
PDF isn't meant for further work with the text.
Characters need not be stored as characters at all; they can be stored as glyphs, or as plain base characters with a separately painted accent, or...
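
The "plain character with a painted accent" case matches the sample in post #1, where a standalone accent (U+00B4, or the dot above, U+02D9) appears just before the letter it belongs to. The sketch below is a rough heuristic under that assumption, not a general fix: the function name `reattach_accents` and the two-entry accent table are hypothetical, and letters that were dropped entirely during copying (as seems to have happened in "postp") cannot be recovered this way.

```python
import re
import unicodedata

# Assumed mapping from standalone (spacing) accents to combining marks.
SPACING_TO_COMBINING = {
    "\u00b4": "\u0301",  # ACUTE ACCENT -> COMBINING ACUTE ACCENT
    "\u02d9": "\u0307",  # DOT ABOVE    -> COMBINING DOT ABOVE
}

# Optional whitespace, one spacing accent, optional whitespace, one letter.
_PATTERN = re.compile(r"\s*([\u00b4\u02d9])\s*([^\W\d_])")

def reattach_accents(text: str) -> str:
    """Merge 'accent, line break, letter' sequences into one accented letter."""
    def join(match: re.Match) -> str:
        accent, base = match.groups()
        # Compose base letter + combining mark into a single code point.
        return unicodedata.normalize("NFC", base + SPACING_TO_COMBINING[accent])
    return _PATTERN.sub(join, text)

sample = "mo \u02d9\nzliwe, je \u00b4\nsli"
print(reattach_accents(sample))  # możliwe, jeśli
```

The whitespace around the accent is swallowed on purpose, because in the sample the accent sits at the end of one line and its letter at the start of the next.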


#5 Re: Copy & Paste - problem

Posted by: maki | Date: 2020-06-24 11:37

So much for the praise heaped on well-known "professional" paid software, which in practice is doomed to give bad results on scanned documents, books, etc. (and I have a high-quality scanner).
Let's be serious: this is not only about the PDF format, but also EPUB and other popular formats.
I wrote to the company and gave examples, but received no answer and no help, only the offer of a refund if I had made a purchase.
It proves that OCR does not work. This is simple propaganda.

The OCR output of a scanned document looks as if a tornado had thrown in every special Unicode character the world has ever invented.
Even the best translator of hieroglyphs would not be able to understand the scanned document. :D

www.abbyy.com

Edited 4 time(s). Last edit at 2020-06-24 11:43 by maki.


#6 Re: Copy & Paste - problem

Posted by: pspad | Date: 2020-06-24 12:08

OCR works; I use it personally.
Do as you wish.

I am marking this thread as closed. You got an explanation and you got an offer of a solution.


This thread has been closed.




