5-10 Minutes to read.
Introduction
With the J1 Search module, it is no longer necessary to integrate complex external search engines like Bing or Google into your website. Searching a website using QuickSearch is a little different from using Internet search engines.
| The document content is collected from the HTML body element |
Search platforms use complex algorithms to provide a simple interface, but they need many artificial intelligence methods to make sense of the handful of words given for a search, and to inject advertising elements for their customers.
Nevertheless, the J1 implementation of Lunr is as simple to use as searching with Google or Bing, but offers additional features for more specific searches. QuickSearch provides an easy-to-use query language for better results, with no advertising included.
Core concepts
Understanding some of the concepts and terminology used by the search engine of J1 Template allows users to write powerful search expressions and get more relevant search results.
Indexing documents
QuickSearch searches all documents of the website generated by J1. The advantage: no internet access is needed for searches. Searches are based on a pre-built local full-text index, covering all pages, that is loaded by the browser. The index for a site is generated by the Jekyll index plugin lunr_index.rb, located in the plugins folder _plugins.
| The full-text index for the search engine is always generated by Jekyll at build-time. |
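The idea of a pre-built index can be illustrated with a minimal sketch. It is not the code of lunr_index.rb or of Lunr itself: the page data, the function names, and the simple word-to-page mapping are assumptions made for illustration only. The sketch builds an index once, serializes it to JSON (the step done at build-time), and answers lookups from the parsed copy without any network access (the step done in the browser).

```javascript
// Hypothetical page data; a real J1 index is produced by lunr_index.rb.
const pages = [
  { id: 0, title: "Getting started", body: "Install Jekyll and build the site" },
  { id: 1, title: "Search", body: "QuickSearch runs in the browser" },
];

// Split text into lowercase word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build time: map every token to the ids of the pages containing it.
function buildIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const token of new Set(tokenize(doc.title + " " + doc.body))) {
      if (!index.has(token)) index.set(token, []);
      index.get(token).push(doc.id);
    }
  }
  return index;
}

// The serialized index is what would be shipped with the site as a static file.
const serialized = JSON.stringify(Object.fromEntries(buildIndex(pages)));

// Browser side: parse the shipped index once, then answer lookups locally.
const loaded = new Map(Object.entries(JSON.parse(serialized)));
function lookup(term) {
  return loaded.get(term.toLowerCase()) || [];
}

console.log(lookup("jekyll"));  // ids of pages mentioning "jekyll"
console.log(lookup("browser")); // ids of pages mentioning "browser"
```

Because all the work of reading the pages happens before the site is published, a lookup in the browser is only a dictionary access on data that is already loaded.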
Documents
The searchable data in an index is organized as documents containing the text and the words you want to search on. A document is a JSON-based data set with fields that are processed to create the result list for a search.
In such a document, several fields, like title, tagline, or description, can be used for full-text searches. Additional fields, like tags or categories, are available for more specific searches.
For simple full-text searches as well as more specific searches, the core search engine Lunr offers a query language, a DSL (domain-specific language). Find out more about QuickSearch|Lunr DSL queries in the section [Searching].
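As a rough illustration of how fields shape a search, the sketch below models one document with the field names mentioned above and checks a term against a chosen set of fields. The field values, the helper name matches, and the plain substring comparison are assumptions for this example; the real engine works on an index of tokens, not on substring checks.

```javascript
// Hypothetical document; the field names follow the ones listed above.
const doc = {
  title: "QuickSearch",
  tagline: "Full-text search for J1",
  description: "How to search a J1 site with QuickSearch",
  tags: ["Search", "Lunr"],
  categories: ["Manuals"],
};

// Return true if `term` occurs in any of the given fields of `document`.
function matches(document, term, fields) {
  const needle = term.toLowerCase();
  return fields.some((field) => {
    const value = document[field];
    // List fields (tags, categories) are joined into one searchable string.
    const text = Array.isArray(value) ? value.join(" ") : String(value ?? "");
    return text.toLowerCase().includes(needle);
  });
}

const fullText = ["title", "tagline", "description"];
console.log(matches(doc, "search", fullText)); // full-text style lookup
console.log(matches(doc, "lunr", ["tags"]));   // lookup restricted to one field
console.log(matches(doc, "lunr", fullText));   // same term, text fields only
```

The last two calls show why field-specific searches are useful: the same term can be found or missed depending on which fields are searched.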
Scoring
The relevance, or score, is calculated using an algorithm called BM25, along with other factors. You don’t need to worry too much about the details of how this technique works. To summarize: the more often a search term occurs in a document, the more that term increases the document’s score; the more often a search term occurs in the overall collection of documents, the less that term increases a document’s score. In other words: rare words count more and increase the score.
Scoring information generated by the BM25 algorithm is added to the search index and allows a very fast calculation of the relevance of documents for queries.
Imagine your website contains documents about Jekyll. The term Jekyll may occur very frequently throughout the entire website, because it is used quite often in the content. Finding a document that mentions the term Jekyll is therefore not very significant for a search.
However, if you’re searching for Jekyll Generator, only some documents of the website have the word Generator in them. That raises the score, the relevance, of documents containing both words and moves them higher up in the search results.
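The Jekyll Generator example can be reproduced with a small, self-contained sketch of BM25-style scoring. This is a textbook form of the formula, not the code used by Lunr or QuickSearch; the three-page corpus, the function names, and the constants k1 = 1.2 and b = 0.75 (commonly used defaults) are assumptions for illustration.

```javascript
// Hypothetical mini-corpus: every page mentions "jekyll", only one "generator".
const corpus = [
  "jekyll themes for jekyll sites",
  "jekyll generator plugin basics",
  "deploying a jekyll site",
];

const tokenize = (text) => text.toLowerCase().match(/[a-z0-9]+/g) || [];
const docs = corpus.map(tokenize);
const avgLength = docs.reduce((sum, d) => sum + d.length, 0) / docs.length;

// Inverse document frequency: terms found in few documents get a higher weight.
function idf(term) {
  const n = docs.filter((d) => d.includes(term)).length;
  return Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
}

// BM25 score of one tokenized document for a query.
// k1 damps repeated occurrences; b controls document-length normalization.
function bm25(doc, query, k1 = 1.2, b = 0.75) {
  let score = 0;
  for (const term of tokenize(query)) {
    const tf = doc.filter((t) => t === term).length;
    const norm = tf + k1 * (1 - b + b * (doc.length / avgLength));
    score += idf(term) * ((tf * (k1 + 1)) / norm);
  }
  return score;
}

// Rank document positions in the corpus by descending score.
function rank(query) {
  return docs
    .map((doc, i) => ({ i, score: bm25(doc, query) }))
    .sort((x, y) => y.score - x.score)
    .map((r) => r.i);
}

console.log(idf("jekyll") < idf("generator")); // the rare word weighs more
console.log(rank("jekyll generator")[0]);      // position of the top-ranked page
```

In this corpus the page containing both words ranks first, even though another page mentions Jekyll twice: the frequent word contributes little, while the rare word Generator contributes most of the score.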
Matching and scoring are used by all search engines, and J1 QuickSearch is no exception. QuickSearch sorts search results in a way you already know from commercial internet search engines like Google: the top results are the most relevant ones.