Saturday, 23 January 2010

The 90% Rule - Keeping Keywords Relevant

How do you decide which keywords are appropriate?  Should you include every conceivable word which might be relevant to an image or video, or should you try to keep the number of keywords to a bare minimum?  The answer is the 90% rule.

When undertaking keywording, there is a sure way to unwanted cost and a little madness: try to create the perfect set of keywords.  Apart from the impossibility of attaining perfection itself, the process of trying to reach that goal is normally extremely time-consuming as well as pointless.

Instead, attempt to reach a 90% confidence level for each keyword and for the set as a whole.  This is not a scientifically measurable percentage, more an acknowledgement that stepping back a little from perfection gives an acceptable and attainable result.
 


So the aim of the exercise is to save time, but also to stop the number of keywords becoming so enormous, and the keywords so tenuous, that you can find virtually every image by using virtually any keyword.  The more focused on the essence of the image/video the keywords are, the more effective research will be.

When an archive is small, this problem may seem non-existent, but for companies such as Getty Images, with enormous archives, neglecting this problem is extremely commercially damaging as research will drag in thousands of irrelevant images.

Consider this application of the 90% rule:  In a picture of a couple kissing, you can justifiably argue that a keyword such as "romance" would be expected by 90% of researchers.  Now take a picture of a couple holding hands and walking down the street with shopping bags in their hands, looking a bit glum.  Some people might think that because the couple are holding hands the word "romance" is still appropriate.  But it can be argued that at least 90% of researchers would be disappointed to find such an image with that keyword.  By the same token, at least 90% of researchers would be happy to find that image with the keyword "shopping".

It is not that this confidence level will find the right image every time, more that it will stop irrelevant, padded-out search results.

Likewise, if the number of keywords is too small you are short-changing researchers.  In that case, consider whether there is another word you can add which 90% of researchers would be happy to use.  So if the word "shopping" was not used in our example, that would be poor keywording, as more than 90% of researchers would be likely to expect it.
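
As a rough illustration only, the 90% rule can be sketched as a simple threshold check.  The scores below are invented judgment calls for the glum shopping couple described above, not measured data, and the names `apply_90_rule` and `THRESHOLD` are hypothetical.

```python
# Hypothetical sketch of the 90% rule as a threshold filter.
# The scores are invented estimates of the share of researchers who would
# expect (or be happy to find) the image under each keyword.

THRESHOLD = 0.9  # the "90%" in the rule; a judgment call, not a measurement


def apply_90_rule(estimates, threshold=THRESHOLD):
    """Return the sorted keywords whose estimated share meets the threshold."""
    return sorted(word for word, share in estimates.items() if share >= threshold)


# Couple holding hands, carrying shopping bags, looking a bit glum.
glum_shoppers = {
    "shopping": 0.95,
    "couple": 0.95,
    "holding hands": 0.90,
    "romance": 0.30,  # most researchers would be disappointed by this match
}

kept = apply_90_rule(glum_shoppers)
print(kept)  # "romance" is dropped, "shopping" is kept
```

In practice nobody assigns numbers like these; the point is only that each keyword is tested against the same bar, both when trimming a padded list and when checking for an expected word that is missing.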

As mentioned, this is not a matter of science, more of judgment, but it can help you focus your efforts on getting an appropriate set of keywords.

One caveat, though, is that keywords destined for third parties such as Getty Images, Corbis or Alamy should always be 100% in compliance with their standards and conventions.

Saturday, 16 January 2010

Clinging to Bad Keywording

Good keywording can transform the searchability of an archive, yet many photo and video libraries and on-line sellers continue to cling to poor keywording as if anything will do.

The level of self-delusion can be dumbfounding.  A colleague told me of a recent conversation with an on-line seller who was using product descriptions as keywords.  As you might expect, the descriptions often contained misleading and superfluous words, with little consistency from one product to another because the descriptions were written by countless different buyers and merchandisers.

Search results were all over the place, with customers frequently not seeing all the relevant products or having to wade through numerous irrelevant results to find what they were looking for.

Superficially, this was acknowledged, but the person in charge of the web site was more concerned with the look of the site, the search engine software, and many other "sexier" projects.  Good keywording would have to wait at least six months.  If it happens even within a year, it will be a surprise.



Ignoring keywording in favour of look and feel is like deciding you can change your luck when going fishing by getting a new boat, fishing rod and reel, but continuing to use the wrong bait.

This example illustrates one reason why people stick with bad keywording: they're too busy doing other things which are more interesting.

Whilst publicly-available statistics and studies are hard to come by, there is a simple test for people running such sites to see the value of putting some energy into keywording rather than changing the look of that banner, or changing ultra-expensive search software.  All they need to do is some mystery shopping on their own site.  They can try various product searches, and look for things such as gifts, products for kids, products by colour and so on, and see how they get on.  It normally takes only a few minutes to work out there's a problem that needs addressing.
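
For anyone who wants to put rough numbers on a mystery shopping session, here is a minimal sketch.  The three-product catalogue, the `search` function and the list of "relevant" products are all invented for illustration and do not reflect any real site or search engine.

```python
# Hypothetical sketch of a "mystery shopping" check on a tiny invented catalogue.

def search(catalogue, field, term):
    """Return ids of products whose chosen field contains the term as a whole word."""
    term = term.lower()
    return {pid for pid, product in catalogue.items()
            if term in product[field].lower().split()}


def precision_recall(found, relevant):
    """Share of results that are relevant, and share of relevant items found."""
    hits = len(found & relevant)
    precision = hits / len(found) if found else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


catalogue = {
    1: {"description": "Bright red mug, a great gift for kids and adults",
        "keywords": "mug red gift"},
    2: {"description": "Crimson scarf in soft wool",
        "keywords": "scarf red wool"},
    3: {"description": "Blue kettle, not available in red",
        "keywords": "kettle blue"},
}

# What a shopper searching for "red" would actually want to see.
relevant_red = {1, 2}

by_description = precision_recall(search(catalogue, "description", "red"), relevant_red)
by_keywords = precision_recall(search(catalogue, "keywords", "red"), relevant_red)
print(by_description, by_keywords)
```

In this made-up case the description search misses the crimson scarf and drags in the blue kettle, while the keyword search returns exactly the two red products.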


The second reason for sticking with bad keywording is that some people see metadata as a problem too big to even begin to tackle.  This is especially true for editorial libraries where the quantity of images is a constant difficulty.

When faced with an avalanche of pictures and videos coming in every week, it is understandable that attention is given to just keeping up.  In that case, keywording by photographers, or very sparse keywording, is the order of the day.  Often captions alone are relied upon.  All of which means that the archive being built will become increasingly hard to search, with little depth of search terms, misleading keywords and little consistency.  Ironically, good keywording, the very thing libraries feel they don't have the time and resources to do because they're too busy getting the images out to the world, could fix those problems.

One solution to this dilemma is to simply cut down on the images or video by putting restrictions on the numbers being submitted, or by editing submissions more strictly.  With so many images in particular being similar, reducing supply can improve the variety of images that come up in searches and make a more attractive lightbox for customers to view.  Indeed, all functions of a library can improve if there are fewer images and videos to work with.

Meanwhile, timing problems can be addressed by careful outsourcing to a 24/7 keywording company attuned to fast workflow.

The other main reason for clinging to bad keywording is that people fear change - including potentially changing staffing.  Changes in systems and keywording can also be an admission that previous policies were a mistake.

At this point, CEOs and managers of libraries need to have an objective look at what is going on.  Keywording managers and other staff may be too invested in the past to see that there can be a better way of doing things.  Again, outsourcing can help address this problem, even if that means simply employing outside consultants to help reorganise the in-house keywording.

Saturday, 9 January 2010

Vocabulary - The Never Ending Story

The cornerstone of any consistent keywording system is an excellent vocabulary.  Investing time and energy into creating one is vital.

It is possible - in fact much amateur keywording is carried out this way - to simply look at an image or video and decide what seem like worthwhile words to include.  A template of keyword topics such as appearance, concepts and so on can help this.

But that is a long way short of an organised vocabulary with structure.  And it makes it very difficult to sustain the sort of consistency which will allow researchers to know that when they type in the word "business", only business images will be returned, and all business images will be returned.

An organised vocabulary also means that synonyms can be included very easily by linking them together in word strings.  Other organisation patterns, such as word hierarchies going from the specific to the general - eg hibiscus, flowering plant, plant, flora - can also be employed.  In creating such complex and rigid patterns, though, it is important to realise that the speed of amending and improving a vocabulary will be compromised, as will fast inputting.
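
To make the idea of word strings and hierarchies concrete, here is a minimal sketch.  It assumes each term has a single broader term, which is a simplification; the `BROADER` and `SYNONYMS` tables and the `expand` function are hypothetical, and the synonym "angiosperm" is added purely for illustration.

```python
# Hypothetical sketch of a structured vocabulary: each term has one broader
# term (the hierarchy) and an optional string of synonyms.

BROADER = {
    "hibiscus": "flowering plant",
    "flowering plant": "plant",
    "plant": "flora",
}

SYNONYMS = {
    "flowering plant": ["angiosperm"],  # added for illustration only
}


def expand(term):
    """Return the term, then each broader term in turn, with any synonyms."""
    result = []
    seen = set()  # guards against an accidental loop in the hierarchy
    current = term
    while current is not None and current not in seen:
        seen.add(current)
        result.append(current)
        result.extend(SYNONYMS.get(current, []))
        current = BROADER.get(current)
    return result


keywords = expand("hibiscus")
print(keywords)
```

Entering the single word "hibiscus" then produces the whole string of broader terms and synonyms, which is where the consistency comes from.  The cost is the one noted above: every new term has to be slotted into the tables before it can be used.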


 
One thing is for sure - any keywording vocabulary will fall well short of all the words necessary to keyword every image or video in existence.  The Oxford Dictionary contains more than 600,000 words, with about 2,500 being added every three months.  Just keeping up with those words is difficult enough, but there are literally millions of proper nouns, names of species, people's names, brands, slang words and so on.

More complex vocabularies also contain compound terms and phrases such as "beauty in nature".  Once the combinations of synonyms and so on are in the mix, possible keywords and keyword strings which could find their way into a vocabulary are essentially endless.

So the idea that a vocabulary is set in stone, or that the job of creating one is ever finished, is simply an illusion.

If you are creating your own vocabulary, then, there are four important factors to bear in mind:

1. Structure - Will you use a hierarchy, a simple word list, or word strings?

2. Depth - How far will you go in terms of levels of synonyms, numbers of words and so on?  Every vocabulary is a compromise, so it's important to decide what that compromise will be.  In the case of specialist archives these decisions may be easier as there are fewer words to deal with, but a general vocabulary is a harder nut to crack.

3. Ease of Updating - Every good vocabulary is altered, improved and added to on a regular basis.  If it is tricky to make changes, this maintenance can become an intolerable burden.

4. Ease of Use - All the words in the world are of little use if it is difficult to find them and add them to the keywords field.  So think carefully about how your vocabulary fits with your input software and vice versa.  It may be the case that it is more efficient for you to change your inputting methods than your vocabulary.

Finally, when constructing a vocabulary, try to keep the end use in mind.  Highly-structured vocabularies that a librarian might use are unlikely to work for databases being searched by members of the public who need a more intuitive system.

Wednesday, 23 December 2009

Dreaming of a Beach Christmas!!

Merry Christmas and Happy Holidays from Keedup in New Zealand



Our next post will be in the New Year