Searching data and getting the answers you need

Searching is one of the most common tasks we ask computers to perform.  Yet search tools may be more powerful than you know.

This article examines the basics of searching and the advanced features you can make the most of.

Tools you use daily have some surprising search capabilities that you may never have noticed.  Here’s how to make the most of them.

  • Common tools with advanced search capabilities
  • A bit on the engines
  • Adding weights
  • Wrap up

 

[read more=”Read more” less=”Read less”]

Common tools with advanced searching capabilities

Google

Google is renowned for its search capabilities.

The technological advancements they have implemented behind the scenes to make their searching effective are a closely guarded secret.

Yet their interface provides some really easy and powerful capabilities.

 

Getting specific

Say you want to find something interesting.  Searching for the words something and interesting returns results that contain those words anywhere on the page, in any order.

Put the two words in double quotes and you’ll only get articles with that specific phrase.

"something interesting"

 

Exclude results

First, we look at the minus character.  Simply put a minus before a word and results containing it are excluded.

Search for the word book.  Among your results you will get “booking”, as in booking flights, hotels, etc.  Now try searching for

book -booking

You are telling the engine, “I don’t want these words in the results.”  You can add as many excluded words as you like.

 

Specific sites only

Next, say you want to find all the “technology” articles on Wikipedia.  Just searching for “technology” in Google brings back a wide variety of results.

To limit the results to a single site, try

site:wikipedia.org technology
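The three operators above (quoted phrases, the minus sign and site:) can be combined in a single query. The following is only a minimal sketch of how such a query string could be assembled and URL-encoded in Python. The helper name build_search_url and its parameters are made up for illustration; only the operators themselves come from the article.

```python
from urllib.parse import urlencode


def build_search_url(terms, phrase=None, exclude=(), site=None):
    """Assemble a query from plain terms, an exact phrase, excluded words and a site."""
    parts = []
    if site:
        parts.append(f"site:{site}")      # limit results to one site
    if phrase:
        parts.append(f'"{phrase}"')       # exact phrase in double quotes
    parts.extend(terms)                   # ordinary search words
    parts.extend(f"-{word}" for word in exclude)  # minus excludes a word
    query = " ".join(parts)
    # urlencode escapes spaces, quotes and colons so the query is safe in a URL
    return query, "https://www.google.com/search?" + urlencode({"q": query})


query, url = build_search_url(["book"], exclude=["booking"])
print(query)  # book -booking
print(url)

wiki_query, wiki_url = build_search_url(["technology"], site="wikipedia.org")
print(wiki_query)  # site:wikipedia.org technology
```

Typing the same query straight into the search box does exactly the same job; the sketch simply shows that the operators are plain text inside the query.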

 

Power at your fingertips

There are many other operators you can avail of, and they’re handily listed here http://bit.ly/2qyHMtm

 

Outlook

Image from http://bit.ly/2ARyNct

Microsoft have made more modern versions of Outlook work very well with the searching capabilities of Windows.

 

Yet just being able to scan through emails isn’t enough, especially if you have a mailbox of thousands.

The task quickly becomes laborious and frustrating.

Outlook 2010 is still a popular version of Outlook according to research from 2017.

So knowing how to work with the search options of older Outlook versions is useful.

The key to these capabilities is CAPITALISATION.

 

Logic operators

Logic is very important to computers.  Just as Google uses a minus before a word to exclude results, Outlook uses NOT.

joe NOT bloggs

This gives you all the emails with joe in them, unless bloggs also appears in the mail.

 

Double quotes around words mean “only this exact phrase”.

"joe bloggs"

The keyword AND is also very useful.  It states explicitly that both words must appear somewhere in the email, in any order, whereas the quoted form above only matches the exact phrase.

joe AND bloggs

The key is using capitals, as that tells Outlook “this is a rule, not a keyword”.
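To make the logic concrete, here is a toy model in Python of what NOT, AND and double quotes ask for. It is not how Outlook is implemented; the sample emails and the helper words are invented purely for illustration.

```python
# Made-up mailbox for illustration only
emails = [
    "Lunch with joe on Friday",
    "Invoice from joe bloggs attached",
    "bloggs family reunion, ask joe for details",
    "Quarterly report",
]


def words(text):
    """Split an email into a set of lower-case words, ignoring commas."""
    return set(text.lower().replace(",", " ").split())


# joe NOT bloggs: joe must appear, bloggs must not
joe_not_bloggs = [e for e in emails if "joe" in words(e) and "bloggs" not in words(e)]

# joe AND bloggs: both words somewhere in the email, in any order
joe_and_bloggs = [e for e in emails if {"joe", "bloggs"} <= words(e)]

# "joe bloggs": the exact phrase, words side by side
exact_phrase = [e for e in emails if "joe bloggs" in e.lower()]

print(joe_not_bloggs)       # ['Lunch with joe on Friday']
print(len(joe_and_bloggs))  # 2
print(exact_phrase)         # ['Invoice from joe bloggs attached']
```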

Here is a comprehensive list from Microsoft themselves.

 

A bit on the engines

How does it work

If you’ve ever had any interaction with a database you’ll have heard about the Structured Query Language (SQL).

Now let’s imagine finding a book in a library.  A needle in a haystack of needles, right?

Well librarians solve this problem by having a system for finding information.

Similarly if we consider emails as data in a database, it’s the same as looking for a page in a book in a library.

SQL is the language to chat to the librarian and is our friend.

 

WHERE is it?

SQL has a very rigid core structure which means computer programmers learn the basics of SQL and can use most databases.

Select * From Library

Translates to “give me everything” from the library.

This isn’t always efficient.  We often want to limit what we get back, so we have the WHERE clause.

Select * From Library Where book_title = 'My Little Pony'

This will find all books whose title exactly matches “My Little Pony”.
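If you would like to try these two queries yourself, Python ships with a small database engine, SQLite. The sketch below is just one way to do it: it builds a tiny in-memory Library table and runs both the “give me everything” query and the WHERE query. The table contents are sample data made up for illustration.

```python
import sqlite3

# An in-memory database: nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Library (book_title TEXT)")
conn.executemany(
    "INSERT INTO Library VALUES (?)",
    [("My Little Pony",), ("Little Women",), ("The Red Pony",)],
)

# "Give me everything" from the library
all_books = conn.execute("SELECT * FROM Library").fetchall()

# Only the rows whose title matches exactly; the ? placeholder
# passes the value safely instead of pasting it into the SQL text
exact = conn.execute(
    "SELECT book_title FROM Library WHERE book_title = ?",
    ("My Little Pony",),
).fetchall()

print(len(all_books))  # 3
print(exact)           # [('My Little Pony',)]
```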

 

I LIKE this

As time progressed, searches needed to be smarter and filter more flexibly, so SQL added the LIKE operator.  In a LIKE pattern the % sign stands for “any text, including none”.

Select * From Library Where book_title LIKE '%Little Pony'

This will find all books which have “Little Pony” at the end but could start with anything.  I can find all books with Pony anywhere in the title by using

Select * From Library Where book_title LIKE '%Pony%'

These queries can have many parameters using AND, NOT and OR clauses.
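The sketch below runs LIKE searches against a small made-up Library table using Python’s built-in SQLite, and combines two conditions with AND and NOT. It uses the patterns %Little Pony and %Pony%, where % matches any run of characters, including none. Note that SQLite happens to ignore upper and lower case for LIKE by default; other databases may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Library (book_title TEXT)")
conn.executemany(
    "INSERT INTO Library VALUES (?)",
    # Sample titles for illustration only
    [("My Little Pony",), ("Little Women",), ("The Red Pony",), ("Pony Tales",)],
)


def titles(sql):
    """Run a query and return the titles as a plain list."""
    return [row[0] for row in conn.execute(sql)]


ends_with = titles(
    "SELECT book_title FROM Library WHERE book_title LIKE '%Little Pony'"
)
contains = titles(
    "SELECT book_title FROM Library WHERE book_title LIKE '%Pony%' ORDER BY book_title"
)
combined = titles(
    "SELECT book_title FROM Library "
    "WHERE book_title LIKE '%Pony%' AND NOT book_title LIKE '%Little%' "
    "ORDER BY book_title"
)

print(ends_with)  # ['My Little Pony']
print(contains)   # ['My Little Pony', 'Pony Tales', 'The Red Pony']
print(combined)   # ['Pony Tales', 'The Red Pony']
```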

 

Gimme an index

Now with millions of books in the library, if the librarian had to go read the title of every book on the shelf the look up could take a long time.

So a way of speeding things up was applied.

In a library, a box of index cards lets the librarians keep “lists”.

These lists might be sorted alphabetically, by author, by ISBN or by any number of other criteria.

If the index is kept up to date and a query comes in, then the librarian can find the information far more quickly.

Databases can do massive searches very quickly if they have an index on the data.

The librarian looks at the cards instead of checking every book.

Similarly the database uses the index to find the data instead of reading all the information.
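You can watch a database change its approach once an index exists. The sketch below, again using Python’s built-in SQLite with a made-up Library table, asks the database to explain its plan for the same query before and after an index is created. The index name idx_title is arbitrary, and the exact wording of the plan varies between SQLite versions, so treat the printed text as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Library (book_title TEXT, shelf INTEGER)")
conn.executemany(
    "INSERT INTO Library VALUES (?, ?)",
    ((f"Book {n}", n % 50) for n in range(10_000)),  # invented sample rows
)

query = "SELECT shelf FROM Library WHERE book_title = 'Book 4242'"


def plan(sql):
    """Return SQLite's description of how it intends to run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the text


plan_without_index = plan(query)  # typically a full SCAN of the table

# The box of index cards: a sorted list of titles pointing at the rows
conn.execute("CREATE INDEX idx_title ON Library (book_title)")

plan_with_index = plan(query)     # typically a SEARCH using idx_title

print(plan_without_index)
print(plan_with_index)
```

Without the index the database reads every row, like the librarian walking the shelves; with it, the database goes to the cards first.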

 

Adding weights

Keeping it handy

Where do you store the umbrella in your office?  Most commonly it’s by the door.

Why by the door?  Because it’s handy and when I need it that’s the spot for it.

If a book is very popular and lots of people are reading it, despite the system saying “store that in the back”, you keep it handy.

 

This “bucket by the door” approach of possibly duplicating data but making things faster is called caching (pronounced “cash-ing”).

To cache is defined as to “store away in hiding or for future use.”

 

The database hides the cache from users, but you experience the speed-up.

Furthermore, the people who manage the databases (Database Administrators or DBAs) can configure their caches to maximise the speed of response.
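The same idea is available to programmers in a few lines. As a minimal sketch, Python’s functools.lru_cache keeps recent answers handy so that repeated requests skip the slow trip. The function fetch_book and the counter trips_to_back_room are invented for illustration; real database caches are far more elaborate.

```python
from functools import lru_cache

trips_to_back_room = 0  # counts how often we do the slow lookup


@lru_cache(maxsize=128)  # keep up to 128 recent answers by the door
def fetch_book(title):
    """Pretend this is the slow walk to the back of the library."""
    global trips_to_back_room
    trips_to_back_room += 1
    return f"copy of {title}"


fetch_book("My Little Pony")  # slow: goes to the back room
fetch_book("My Little Pony")  # fast: answered from the cache
fetch_book("My Little Pony")  # fast again

print(trips_to_back_room)            # 1
print(fetch_book.cache_info().hits)  # 2
```

The maxsize setting is the kind of knob a DBA tunes: a bigger cache answers more requests quickly but takes more memory.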

 

Reading your mind

When you start searching in Google it will often try to finish your sentence for you.

It will suggest commonly asked expressions… how did it do that?

Simply put… the cache.  Someone before you typed this, so the engine kept a copy handy and can now answer you quickly.
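Google’s real mechanism is not public, so the following is only a toy illustration of the principle: remember what people searched for before, and offer the most common earlier searches that start with what has been typed so far. The names record and suggest and the sample searches are all invented.

```python
from collections import Counter

past_searches = Counter()  # remembers how often each query was typed


def record(query):
    """Store a finished search so it can be suggested later."""
    past_searches[query.lower()] += 1


def suggest(prefix, limit=3):
    """Most frequent earlier searches that start with the typed prefix."""
    prefix = prefix.lower()
    matches = [(q, n) for q, n in past_searches.items() if q.startswith(prefix)]
    # Most popular first; alphabetical order breaks ties
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:limit]]


for q in ["my little pony", "my little pony", "my library card",
          "mystery novels", "my little pony", "my library card"]:
    record(q)

print(suggest("my l"))  # ['my little pony', 'my library card']
```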

 

Weighing on your mind

Firstly let’s imagine you are the librarian.

Someone visits your library and wants a book, so you suggest 20 books.  They choose “My Little Pony”.

The next person comes in and you suggest the same 20 books.  They choose “My Little Pony” as well.

When the third person comes in… which book will save you time if you suggest it first?

 

Using this kind of usage statistic is called weighting or ranking.  The higher your book ranks as a match for the query, the more sensible it is to put it first.

Over time, people’s tastes change, evolve and mature, so the rankings change.
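The librarian’s habit can be sketched in a few lines of Python: count how often each book is chosen and suggest the most-chosen books first. This is a deliberately simple model with invented names (catalogue, picks, suggestions); real search engines weigh many more signals.

```python
from collections import Counter

catalogue = ["Little Women", "My Little Pony", "The Red Pony"]  # sample titles
picks = Counter()  # how many times each book has been chosen


def suggestions():
    """Catalogue ordered by popularity; ties keep their catalogue order."""
    return sorted(catalogue, key=lambda title: -picks[title])


first_visit = suggestions()
picks["My Little Pony"] += 1   # first visitor chooses it
picks["My Little Pony"] += 1   # second visitor chooses it too
third_visit = suggestions()

print(first_visit)  # ['Little Women', 'My Little Pony', 'The Red Pony']
print(third_visit)  # ['My Little Pony', 'Little Women', 'The Red Pony']
```

Because the counts keep changing as visitors choose different books, the order of suggestions shifts over time, just as rankings do.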

The exact rules for how you get to the top of the list in Google are a trade secret but millions try daily to get ranked top in their searches.  This area is called Search Engine Optimisation or SEO.

 

Make my searching easier

The tools at hand

All tools today will have a search feature of some kind and the trick to make them work for you is to learn how they search.

The common choice is “like” or “precision” searching.

Select * From Library Where book_title LIKE '%Pony%'

or

Select * From Library Where book_title = 'My Little Pony'

The first example returns every book with pony in it.  The second returns only a book called My Little Pony.

Furthermore, some systems will let you choose how you search and let you combine several conditions.

Great systems will let you put indexes on your commonly searched fields to vastly speed up your searches.

 

What lies within

Take any book and you can index its author, genre, year of publication, ISBN, Dewey Decimal code and a host of other stats.

Finding the expression “a needle in a haystack” within all the books is a much harder challenge.

Someone would have to have opened all the books and indexed all the words in the books.

This is the same challenge that faces your computer.  It’s easy to search against file names, but searching IN documents and IN emails is a different challenge.
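Indexing “all the words in all the books” is usually done with what is called an inverted index: a list, for every word, of the documents that contain it. The sketch below builds one in Python for three made-up documents. It illustrates the general idea only and is not how any particular product does it.

```python
from collections import defaultdict

# Made-up documents and emails for illustration
documents = {
    "report.txt": "Finding a needle in a haystack takes time",
    "notes.txt": "Buy a needle and thread",
    "email-1": "The haystack fire was put out quickly",
}

# Build the index once: word -> names of documents containing it
index = defaultdict(set)
for name, text in documents.items():
    for word in text.lower().split():
        index[word].add(name)


def search(*words):
    """Names of documents that contain every one of the given words."""
    sets = [index.get(w.lower(), set()) for w in words]
    return sorted(set.intersection(*sets)) if sets else []


print(search("needle"))              # ['notes.txt', 'report.txt']
print(search("needle", "haystack"))  # ['report.txt']
```

Building the index means reading every document once, which is the slow part; after that each search only looks at the index.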

 

Much like building a cache, modern versions of Windows index the contents of your emails and documents during quiet times.

This means that when you search, it can search contents as well.

Indexing contents can massively slow down your computer, so it’s an option you have to turn on, in Windows 7 for example.

You’re setting the rules for your own private librarian.

 

Google Search Appliance

Image from http://bit.ly/2DuRZ2i

For some businesses with mountains of documents to search this is a major issue.

Google to the rescue… sorta.  You can’t put the documents in the public space because that would be a security risk.

 

Instead, Google used to come to you.  A Google Search Appliance was a very yellow server you could add to your business.

For around $15,000 you got your own Google server, which provided the power of Google but just for your business.

As a result you had your own private Google which only your company could access.

 

As the power of cloud computing has increased, Google has moved to a cloud version.

Called Cloud Search, it offers the power of Google for your business.

 

 

Wrap up

Searching is the new way of letting technology do the heavy lifting for you.

“Search appliance please find this for me.”

Knowing how to improve your ability to communicate with your search tools means getting more precise, relevant answers back faster.

People, like computers, can help you find the answers you’re looking for faster, especially because chatting to a human is sometimes quicker than learning to work with the search engines.

Furthermore, understanding a little SQL goes a long way, and it is always worth asking your search providers what they can do to help.

If there’s anything in this article you’d like me to find for you, or to chat to me about, you can contact me here or on social media.

[/read]
