Thursday, June 27, 2013

Problems with Stop Words

     What will you find when you search for Khaled Hosseini’s new book, And the Mountains Echoed, in a library catalog?  Chances are you will retrieve quite a number of hits about ‘mountains’ and ‘echo’ but not the novel itself. Why is that? The book exists. The terms you entered are valid. Why can’t you find it? The problem is related to the words ‘and’ and ‘the’, which unfortunately appear at the beginning of the title. Those words are called ‘stop words’. Stop words are very common words that normally add little meaning to the subject content of the document being indexed. Most stop words are there to make a sentence grammatically correct. If you ignore them, a sentence will still make sense, somewhat. The problem is that search engines ignore them too.

     Most search engines do not index stop words in order to save disk space, to make searching more efficient, and to reduce result pollution. Some search engines might replace them with what is called a marker.

Consider this sentence:
                It is an unforgettable novel about finding a lost piece of yourself in someone else.

There are 7 stop words in this sentence: it, is, an, about, a, of, in. The sentence would be stored like this:
                *** unforgettable novel * finding * lost piece * yourself * someone else.
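
For readers who like to see the mechanics, here is a minimal Python sketch of the marker idea. It is an illustration only, not how any particular catalog works: the stop list holds just the seven words named above, and the function name mark_stop_words is made up for this example.

```python
# Assumed stop list: only the seven words named in the example sentence.
STOP_WORDS = {"it", "is", "an", "about", "a", "of", "in"}

def mark_stop_words(sentence, marker="*"):
    """Replace each stop word with a marker; adjacent markers are joined."""
    stored = []
    for token in sentence.split():
        # Compare without surrounding punctuation and without regard to case.
        word = token.strip(".,;:!?\"'").lower()
        if word in STOP_WORDS:
            if stored and set(stored[-1]) == {marker}:
                stored[-1] += marker   # extend the run, e.g. "**" becomes "***"
            else:
                stored.append(marker)
        else:
            stored.append(token)       # keep ordinary words unchanged
    return " ".join(stored)

sentence = ("It is an unforgettable novel about finding a lost piece "
            "of yourself in someone else.")
print(mark_stop_words(sentence))
# *** unforgettable novel * finding * lost piece * yourself * someone else.
```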

     To save time, search engines also skip certain terms when they run a search.  Consider the title And the Mountains Echoed.  A search engine will look for ‘mountains’ and ‘echoed’, but it will most likely exclude terms that it considers too common, such as ‘and’ and ‘the’.
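
The filtering described above can be sketched in a few lines of Python. This is a simplified assumption about how such a search might behave, not the code of any real catalog: the stop list, the function names search_terms and matches, and the second title are all hypothetical.

```python
# Assumed stop list for illustration; real systems use longer, system-specific lists.
STOP_WORDS = {"and", "the", "a", "an", "of", "in"}

def search_terms(query):
    """Return the lowercased query words that survive stop word filtering."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]

def matches(query, title):
    """True if every surviving query term appears among the title's words."""
    title_words = set(title.lower().split())
    return all(term in title_words for term in search_terms(query))

print(search_terms("And the Mountains Echoed"))
# ['mountains', 'echoed']

# A hypothetical title matches just as well as the novel itself,
# because 'and' and 'the' play no part in the comparison.
print(matches("And the Mountains Echoed", "The Mountains Echoed Back"))
# True
```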

     So what should a researcher do, short of asking every author in the world not to use stop words in the titles of their books? One way to avoid this problem is to enter search phrases in a form that search engines are programmed to understand.  You can accomplish this by using markers where stop words appear.  Instead of And the Mountains Echoed, enter ** Mountains Echoed. Or skip certain words entirely: Dark and Deadly Pool instead of The Dark and Deadly Pool. You can also put the phrase in quotation marks so that the search engine searches for the exact phrase.

     Fortunately, database publishers understand this problem, and they are doing something about it. Some add extra scripts to make search engines aware of certain terms. Some create a stop list, which is a list of stop words, and apply it to the content indexed in the database. However, the problem still looms and will not go away entirely. The other day Mel, our cataloger, could not find a record for And the Mountains Echoed in the library catalog.  His search turned into a definite stop word dilemma!

Thursday, June 13, 2013

Old School Punctuation Rule

What does the MLA Style Manual say about the number of spaces to insert after each punctuation mark?  When I first saw this question come across the ALA listserv, I quickly dismissed it.  How much can librarians write about a topic such as this?  I must admit that I soon realized my knowledge and understanding of this rule were far from accurate.

I am a two-spaces woman.  As I write, I punctuate each sentence and insert two spaces before beginning my next sentence.  In my eyes, it is clean and clearly delineates the beginning and ending of each sentence.  When I took typing in high school, we were told about this rule and it stuck with me for, I guess I will admit it… forever.  Why question such a functional, noninvasive rule?

As I read the posts on today’s listserv, I was amazed to find out the current MLA Style calls for simply one space after each punctuation mark.  One space…  How can that be?  This bit of information from the Grammar Girl website further clarifies some of the reasoning behind this punctuation shift:

“Here's the deal: Most typewriter fonts are called monospaced fonts. That means every character takes up the same amount of space. An "i" takes up as much space as an "m," for example. When using a monospaced font, where everything is the same width, it makes sense to type two spaces after a period at the end of a sentence to create a visual break. For that reason, people who learned to type on a typewriter were taught to put two spaces after a period at the end of a sentence. 

But when you're typing on a computer, most fonts are proportional fonts, which means that characters are different widths.  An "i" is more  narrow than an "m," for example, and putting extra space between sentences doesn't do anything to improve readability.”

So, the rule is quite clear.  If you are writing a paper using MLA Style, insert only one space after each punctuation mark!

Just out of curiosity… did anyone check the number of spaces I inserted between sentences in this blog entry?  If you counted 2, you are correct!  By the way, APA Style still calls for the standard two spaces between sentences!

Michele Gregg

Wednesday, June 5, 2013

Library welcomes Kultida Dunagin

Kultida Dunagin is our new electronic resources librarian.  Previously, she worked for the University Center of Greenville, where she was a librarian for 15 years, and for Spartanburg Methodist College from 1993 to 1998.

Kultida began her academic career by earning a bachelor's degree in English from Chulalongkorn University in Bangkok, Thailand.  From there she earned her first master's degree, in Theatre Arts, from California State University in Los Angeles.  Following that, she moved to the University of North Texas for a PhD in College Teaching.  When she moved to South Carolina, she decided that the library profession was where her passion lay and earned a second master's degree, in Library and Information Science, from the University of South Carolina.

Kultida is fluent in English, Thai, and French.  She translates novels, such as books by Ken Follett, from English into Thai for a publisher in Thailand.  She has authored a number of books in Thai, including one on President Obama.  Her PhD dissertation is on "Cultural Identity in Thai Movies and its Implications for the Study of Film in Thailand."

Kultida and her daughter Niki live in Spartanburg.