Some Xerox photocopiers had a “feature” where the copier would use OCR to scan the document (by default, even) and then replace multiple instances of one letter (which may each have printed slightly differently) with a single instance of that letter, copy-pasted into the different places. If you were lucky, it would even replace a “d”, for example, with a “b”. The resulting documents look tampered with, not to mention that it breaks the perfect paper trail your organization is supposed to have.
I think it wasn’t even an OCR bug per se; it was a bug in the implementation of the image compression algorithm. “Yeah, these squiggles on the paper look basically the same, guess we’ll save space here.”
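To make the “these squiggles look basically the same” failure concrete, here is a toy sketch of symbol-dictionary compression in Python. It is not the copier's actual codec. The glyph bitmaps, the `max_diff` threshold, and the function names are all made up for illustration. The idea is only that each glyph is stored as a reference to a “close enough” glyph seen earlier, so a threshold that is too loose silently swaps one character for another.

```python
from typing import List, Tuple

# A glyph is a tiny bitmap: rows of '#' (ink) and '.' (paper).
Glyph = Tuple[str, ...]

# Hypothetical 4x5 toy bitmaps, invented for this sketch.
SIX: Glyph = (".##.", "#...", "###.", "#..#", ".##.")
EIGHT: Glyph = (".##.", "#..#", ".##.", "#..#", ".##.")
SIX_NOISY: Glyph = (".##.", "#...", "###.", "#..#", ".###")  # a 6 with one stray pixel


def hamming(a: Glyph, b: Glyph) -> int:
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def compress(glyphs: List[Glyph], max_diff: int) -> Tuple[List[Glyph], List[int]]:
    """Replace each glyph with an index into a dictionary of representatives.

    A glyph reuses the first dictionary entry that differs from it by at most
    max_diff pixels; otherwise it becomes a new entry.
    """
    dictionary: List[Glyph] = []
    indices: List[int] = []
    for glyph in glyphs:
        for i, representative in enumerate(dictionary):
            if hamming(glyph, representative) <= max_diff:
                indices.append(i)  # "looks basically the same, save space"
                break
        else:
            dictionary.append(glyph)
            indices.append(len(dictionary) - 1)
    return dictionary, indices


def decompress(dictionary: List[Glyph], indices: List[int]) -> List[Glyph]:
    """Rebuild the page by pasting the representative glyph at each position."""
    return [dictionary[i] for i in indices]


page = [SIX, EIGHT, SIX_NOISY]

# Tight threshold: the noisy 6 is merged into the clean 6, the 8 survives.
tight = decompress(*compress(page, max_diff=1))

# Loose threshold: the 8 is only 2 pixels away from the 6, so it gets merged too.
loose = decompress(*compress(page, max_diff=2))
```

With `max_diff=1` the output is a clean 6, an 8, and a clean 6, which is the intended space saving. With `max_diff=2` the 8 comes back as a pixel-perfect 6, and nothing in the output image hints that a substitution happened.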
An OCR error would at least have been mitigated by the fact that you could see when the text didn’t match the image. And since OCR isn’t perfect anyway, you could even anticipate that. But if the image itself is screwed up, well, what do you do then?
…it was a legitimate issue with scanned technical drawings unpredictably changing dimensions…