This is just a short reminder that when searching scanned books, it isn't enough to spell your search terms correctly. You also need to be aware of which letters get confused with other letters, and not only the obvious ones like dalet and resh. There are many such pairs, such as ayin and mem.
For example, the search / site:hebrewbooks.org ימקב / returns no fewer than 11,400 results. To be clear, the word we would like to see in the results is "יעקב," not "ימקב." But presumably many, if not all, of these 11,400 results would simply not show up if you searched for the properly spelled word. A search for ימקב on Otzar HaHochma returns 674 books (and Otzar HaHochma actually has some pretty sophisticated advanced search features which take some of this into account).
Chet and mem are commonly confused as well. Hebrewbooks.org returns 167 results for "תלמיד מכם" and Google Books returns 74. I was pleasantly surprised to see that Google "asked" if I meant to search for "תלמיד חכם."
Most of the time these errors don't matter, but it would matter if the one result you need doesn't turn up, wouldn't it? The same thing happens in English and other languages. Google Books seems to confuse u and n, for example.
It would actually be a good idea to compile some kind of list of the letters that search engines commonly confuse, for use in OCR-dependent searching.
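As a starting point for such a list, here is a minimal Python sketch that takes a correctly spelled term and generates the misspellings worth searching for as well. The table holds only the pairs mentioned in this post, recorded in the direction they were observed (for instance, ayin showing up as mem); it is an illustration, not a complete or measured list, and the names `CONFUSIONS` and `ocr_variants` are made up for this example.

```python
from itertools import product

# Illustrative confusion table: correct letter -> letters it may be misread as.
# Only the pairs mentioned in this post; an assumption, not a measured list.
CONFUSIONS = {
    "ד": ["ר"],  # dalet read as resh
    "ר": ["ד"],  # resh read as dalet
    "ע": ["מ"],  # ayin read as mem
    "ח": ["מ"],  # chet read as mem
    "u": ["n"],
    "n": ["u"],
}


def ocr_variants(term, max_swaps=1):
    """Return the sorted misspellings of `term` made by swapping
    between 1 and `max_swaps` letters for a look-alike from the table."""
    # For each position, the original letter followed by its look-alikes.
    options = [[ch] + CONFUSIONS.get(ch, []) for ch in term]
    variants = set()
    for combo in product(*options):
        swaps = sum(1 for a, b in zip(term, combo) if a != b)
        if 1 <= swaps <= max_swaps:
            variants.add("".join(combo))
    return sorted(variants)
```

Calling `ocr_variants("יעקב")` gives the single variant ימקב, and the variants for "תלמיד חכם" include "תלמיד מכם". You could then run each variant as its own search alongside the correct spelling. Since the function tries every combination of letters, it is only practical for short terms like a name or a two-word phrase.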