Wednesday, October 05, 2011

Highlighting hits in context with dtSearch

I Programmer - Hit Highlighting with dtSearch

"What do you do with your search results after you have obtained them? We explore hit highlighting with dtSearch and C#.

In the first part of my exploration of the search and indexing system dtSearch, I covered the basic principles of operation. Now we consider what to do next once you have some search results.

What do you do with your search results after you have obtained them?

It is a good question. In many cases it is enough to simply list the files that contain the hits, but what if your users want to look inside the files and see where the hits have occurred? This is a nightmare of a job if you have to start from scratch. There are all those file formats to deal with, and then there is the bother of finding out how to highlight the hits in each format.

No - it probably isn't worth the effort.

Converting file formats

The good news is that if you are using dtSearch, which you can try for yourself by downloading the 30-day evaluation, you can use a range of file and container parsers and tap into IFilter, the standard system that Microsoft has implemented for reading different file formats.

As long as there is an IFilter for the document format you want to work with, the procedure that I'm about to describe will work without modification. If not, you can write a file parser or have one written for you. However, the standard range of file parsers supports most of the applications you are likely to encounter. The file parsers are used by the indexing engine to look inside each document, and they are also used by the FileConverter object to convert documents into a standard format so that the results of searches can be presented to users.

Getting Started


Saw this, and depending on your industry, showing search hits in context can be an important feature and one that's not the easiest to implement (especially if you're talking about "native" files like Docs, etc.).


Related Past Post XRef:
dtSearch. You heard about it, you've seen it advertised, now see how to get started developing with it...
dtSearch: Not Dead. Not Yet. [Written in 2005 and it's still around and viable so I guess it's really not]

SQL Server 2008 Full Text Search Best Practices from the SQL CAT Team
SQL Full Text Search: IFilters or Indexing Filters used with SQL FTS...

Office 2007 IFilter Pack

That’s a hOOt! A from scratch, C# based full text indexer and search engine
Lucene.Net & C# Indexing and Searching WinForm Example
"Converting PDF to Text in C#" with PDFBox/IVKM.Net
Desktop Search Application: Part 1 [Search Office Doc's With DotLucene]
