This search engine software is designed to organize searches across various platforms and to handle small amounts of data. It comes with a built-in database, which eliminates the need for additional software, but it does not support SQL servers.
One of the most significant changes in MnoGoSearch Lite is the move to the new-generation 3.2 engine, which includes a better storage engine. The new engine uses SQLite, so features previously available only in the SQL and Pro versions now work in the Lite version. These features include incremental indexing, smart reindexing, and subsection control.
MnoGoSearch Lite also features better internationalization, with support for all modern and widely used character sets. This includes multibyte character sets for languages such as Chinese, Japanese and Korean, as well as the Unicode UTF-8 character set.
The search.htm template language has also been significantly improved and now includes conditional operators, external includes, and simple or extended search modes. The cached copy feature lets users view a snapshot of each page as it looked when the indexer processed it, with search words highlighted. Smart excerpts fetch relevant passages containing the query words at search time.
Results can now be ordered by relevancy or by date, with a reverse-order option available. Content encoding support now covers gzip, compress and deflate HTTP compression, which significantly reduces traffic for HTTP servers that support compression.
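As a rough illustration of what handling compressed HTTP responses involves on the receiving side, the Python sketch below decodes a body according to its Content-Encoding value. It is not mnoGoSearch code: the function name `decode_body` is hypothetical, and the legacy `compress` (LZW) encoding is left out because the Python standard library has no decoder for it.

```python
import gzip
import zlib


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Return the decoded payload for a given Content-Encoding value."""
    encoding = content_encoding.strip().lower()
    if encoding in ("", "identity"):
        return body
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            # Standard form: a zlib-wrapped deflate stream.
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib header.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    raise ValueError(f"unsupported Content-Encoding: {content_encoding!r}")


page = b"<html>" + b"mnoGoSearch " * 200 + b"</html>"
compressed = gzip.compress(page)
assert decode_body(compressed, "gzip") == page
print(len(page), "bytes sent as", len(compressed), "bytes")
```

The size comparison printed at the end shows why compression reduces traffic for repetitive HTML.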
Finally, custom META tags can now be defined for indexing, whereas previous versions indexed only the Description and Keywords META tags. Overall, MnoGoSearch Lite is a reliable and efficient search engine, especially for smaller sites with modest volumes of data.
Version 3.2.42.1:
* Fetching the list of indexer targets is now much faster with big MySQL databases. The SQL query for target lookup was rewritten to create a temporary table, which is then joined with the "url" table. This eliminates a filesort over a huge amount of data. Also, a key on three columns (next_index_time,seed,rec_id) was added to the "url" table, to allow an index read instead of a full table scan.
* MinCoordFactor was improved to distinguish better between small word positions, i.e. at the very beginning of the document.
* Fixed that User.Date with YYYY-MM-DD and DD.MM.YYYY formats gave a wrong month value.
* Fixed a crash with long words in the synonym lookup code.
* Fixed that after processing a "mailto:" or a disallowed link, indexer wouldn't store crosswords from any further links on the same document.
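The temporary-table approach to target lookup can be sketched in Python as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for MySQL, the "url" table is reduced to the three key columns named above plus a hypothetical `url` text column, and `batch_size` and the index name `url_targets` are invented for the example. The actual queries used by indexer are not shown in the changelog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Simplified stand-in for the "url" table (hypothetical schema).
cur.execute(
    "CREATE TABLE url ("
    " rec_id INTEGER PRIMARY KEY,"
    " url TEXT NOT NULL,"
    " next_index_time INTEGER NOT NULL,"
    " seed INTEGER NOT NULL)"
)
# Composite key so that due targets can be found by an index read.
cur.execute("CREATE INDEX url_targets ON url (next_index_time, seed, rec_id)")

rows = [(i, f"http://site/page{i}.html", 1000 + i * 10, i % 4) for i in range(1, 21)]
cur.executemany("INSERT INTO url VALUES (?, ?, ?, ?)", rows)

now = 1100        # pretend "current time"
batch_size = 5    # hypothetical number of targets fetched per round

# Step 1: collect only the ids of the due targets in a temporary table.
cur.execute("CREATE TEMP TABLE targets (rec_id INTEGER PRIMARY KEY)")
cur.execute(
    "INSERT INTO targets"
    " SELECT rec_id FROM url WHERE next_index_time <= ?"
    " ORDER BY next_index_time, seed, rec_id LIMIT ?",
    (now, batch_size),
)

# Step 2: join back to "url" to load the full rows for that small set.
cur.execute(
    "SELECT url.rec_id, url.url FROM url"
    " JOIN targets ON targets.rec_id = url.rec_id"
    " ORDER BY url.rec_id"
)
targets = cur.fetchall()
print(targets)
```

The point of the two steps is that sorting happens only over narrow key columns, and the wide rows are read only for the few ids that were selected.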
Version 3.2.39.1:
* Fetching the list of indexer targets is now much faster with big MySQL databases. The SQL query for target lookup was rewritten to create a temporary table, which is then joined with the "url" table. This eliminates a filesort over a huge amount of data. Also, a key on three columns (next_index_time,seed,rec_id) was added to the "url" table, to allow an index read instead of a full table scan.
* MinCoordFactor was improved to distinguish better between small word positions, i.e. at the very beginning of the document.
* Fixed that User.Date with YYYY-MM-DD and DD.MM.YYYY formats gave a wrong month value.
* Fixed a crash with long words in the synonym lookup code.
* Fixed that after processing a "mailto:" or a disallowed link, indexer wouldn't store crosswords from any further links on the same document.
Version 3.2.38.1:
* Fetching the list of indexer targets is now much faster with big MySQL databases. The SQL query for target lookup was rewritten to create a temporary table, which is then joined with the "url" table. This eliminates a filesort over a huge amount of data. Also, a key on three columns (next_index_time,seed,rec_id) was added to the "url" table, to allow an index read instead of a full table scan.
* MinCoordFactor was improved to distinguish better between small word positions, i.e. at the very beginning of the document.
* Fixed that User.Date with YYYY-MM-DD and DD.MM.YYYY formats gave a wrong month value.
* Fixed a crash with long words in the synonym lookup code.
* Fixed that after processing a "mailto:" or a disallowed link, indexer wouldn't store crosswords from any further links on the same document.
Version 3.2.36.1:
* indexer now supports DBMode=blob, which is now the fastest DBMode for both indexing and searching (Pro only).
* It's now possible to use variables in an external parser command line. This example passes the URL and TAG values in the parser command line: Mime "text/pdf" "text/plain" "/path/to/parser -u ${URL} -t ${TAG}". See the list of all available variables in the "indexer -v6" output, in the lines beginning with the "Response." prefix.
* "SQLWordForms sql" search.htm command was added. It introduces a new fuzzy search method that loads synonyms or word forms from the SQL database. It can be used as a faster replacement for the Synonym and Ispell fuzzy search methods.
* Synonym files now understand the "Mode: reverse" and "Mode: oneway" commands, which switch word expansion behaviour between "all words expand to all words on the same line" and "only the leftmost word expands to other words on the same line".
* "NumWordFactor num" search.htm command was added, where num is between 0 and 255. It specifies how much the number of found words in a document affects its final score. 255 means maximum effect, 0 means ignore the count of found words.
* "MinCoordFactor num" search.htm command was added. Use this command to give a higher score to documents whose first found word is closer to the beginning of the document. Use with a number between 0 and 255. The default value is 0, which means no effect.
* "URLDataThreshold num" search.htm command was added. It can improve search performance with DBMode=blob for queries returning a small number of results (no more than several hundred). If a search returns fewer than "num" documents, full URL information is not loaded from the "bdict" table and the "url" table is used instead. The default value is 0, which means always read URL data from the "bdict" table. Find the value that works best for your installation experimentally.
* "UseNumericOperators yes/no" search.htm command was added.
* New character set name aliases were added: "armscii8", "koi8r", "koi8u" and "ujis", for compatibility with MySQL names.
* Fixed that XML character set declaration was not processed,
* Fixed that query tracking didn't work with Firebird, Mimer, Sybase (Bug#742).
* Fixed that "crossdict" table wasn't created for Mimer and Interbase/Firebird (Bug#748).
* Fixed that $(PerSite) value was calculated incorrectly with several DBAddr search.htm commands.
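The variable substitution in an external parser command line, introduced in this release, can be pictured with the Python sketch below. It only illustrates the idea of expanding `${URL}` and `${TAG}` placeholders. How indexer itself expands and quotes the values is not described in the changelog, so the quoting step and the function name `build_parser_command` are assumptions of this example.

```python
import shlex
from string import Template


def build_parser_command(command_template: str, variables: dict) -> list:
    """Expand ${NAME} placeholders and split the result into argv."""
    # Quote each value so that spaces or shell metacharacters in a URL
    # cannot break the command line apart (an assumption of this sketch).
    quoted = {name: shlex.quote(value) for name, value in variables.items()}
    expanded = Template(command_template).substitute(quoted)
    return shlex.split(expanded)


argv = build_parser_command(
    "/path/to/parser -u ${URL} -t ${TAG}",
    {"URL": "http://site/doc one.pdf", "TAG": "manuals"},
)
print(argv)
```

Returning an argument list rather than a single string means the URL stays one argument even when it contains a space.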
Version 3.2.35.1:
* msvcrtd.dll is not required anymore (Bug#927).
* A new "wtime" column was added to the "qtrack" table to store the time spent on a search, in milliseconds. Everyone who uses the "trackquery" feature needs to add this column (e.g. using ALTER TABLE) or recreate the "qtrack" table.
* IndexIf/NoIndexIf now understand variables, e.g. the following command means not to index documents having content type "text/plain" from the site 'site': NoIndexIf "${URL}#${Content-Type}" "http://site/#text/plain"
* indexer and search.cgi now load the my.cnf file by default. Use "DBAddr mysql://user:passwd@host/dbname/?MyCnfGroup=group" to read options from the named group. If MyCnfGroup=no is specified, then the option file is not loaded (Bug#771).
* "DateFactor number" search.htm command was added. Use with a number in the range 0..255 to change the effect of a document's Last-Modified value on its score. The default value is 0, which means don't take Last-Modified into account. If DateFactor is set to a non-zero value, then a fresher document gets a better score than an older document with the same content.
* Indexer now treats documents having "xml" or "rss" substrings in the Content-Type header as XML documents. E.g. "application/xml" and "application/rss" are now understood as XML as well. Previously only the exact "text/xml" string worked.
* DBMode=blob now works with PostgreSQL.
* "Deflate" DBAddr parameter was added into indexer.conf. With "Deflate=yes" specified, indexer compresses data when converting with "indexer -Eblob", which makes the database smaller and search faster.
* It is possible to rewrite only URL data for DBMode=blob: "indexer -Erewriteurl". It's useful for a very quick rewrite of URL data after adding "Deflate=yes", without touching word information.
* CustomLog indexer.conf command was added to log to stdout using a user-defined format, e.g.: "CustomLog '[${PID}] ${CurrentTime} ${Status} ${URL} ${Content-Type}'".
* Several minor search performance improvements were made.
* Several bugs in "AlwaysFoundWord" were fixed.
* Fixed that loading URL data in "DBMode=blob" didn't work on big endian platforms (e.g. MacOS X). As a result search loaded data from the "url" table, which was slow.
* Fixed that "Section url.file" and "Section url.path" didn't work well when indexing FTP sites having national letters in directory and file names (Bug#658). Directory and file names (after %XX URL-unescaping) are considered to use the same character set as the one specified in RemoteCharset (or iso-8859-1 by default). A new indexer.conf command "RemoteFileNameCharset" was added for the case when the URL character set is different from RemoteCharset.
* Fixed that MySQL-4.1 running in utf8 failed to create "qinfo" table with "Specified key was too long
Version 3.2.34.1:
* Fixed that changes to the "Length" parameter on the "Sections" tab disappeared after reopening indexer.conf and the length was always set to the default value (256).
* Per-session cookie support was added; use the new "UseCookie yes/no" indexer.conf command to switch it on or off.
* "sybase" database type was added, e.g. sybase://sa@localhost/db/. Tested with ASE-12.5 with the native ctlib as well as unixODBC interfaces.
* Relevancy improvements: "WordDistanceWeight number" search.htm command was added. Use with a number in the range 0..255 to change the effect of the distance between the searched words on the resulting score. The default value is 255, which means maximum effect of word distance.
* Relevancy improvements: "DocSizeWeight number" search.htm command was added. Use with a number in the range 0..255 to give a lower score to a longer document and a higher score to a shorter document if both documents contain the same number of found words. The default value is 255, which means maximum effect of document size.
* New "nfw" search.cgi parameter. It uses the same format as "fw". If all found words appear in only one section, the resulting score becomes lower. It can be used, for example, to ignore spam in the KEYWORDS meta tag: if you use high "fw" and "nfw" values for the section corresponding to KEYWORDS, the score will be high only if a word appears in KEYWORDS and also in title/section, but not only in KEYWORDS.
* New "StrictModeThreshold number" search.htm command. If a search returns fewer results than the given number, the search automatically switches from m=all mode (all words) to the less strict m=any mode (any word). The default value is 0, which means don't switch automatically to the less strict mode.
* A new special "User.Date" section was added. It makes it possible to use a user-defined meta tag (or even any other part) of an HTML document as an alternative "Last-Modified" value: e.g.
* "Cached Copy" now looks better for "text/vnd.wap.wml" (WAP documents).
* Language guesser now understands "cn" as a synonym for "zh" to detect Chinese.
* "DefaultContentType" search.htm command was added. It helps when the "Content-Type" header is not stored in the database and the automatic guesser fails to detect a document type. Previously "text/plain" was assumed.
* search.cgi can now do Cyrillic->Latin and Latin->Cyrillic transliteration. A new "tl=yes" search.cgi parameter was added to activate transliteration.
* Self-links (i.e. when a page has a link to itself) do not affect popularity rank anymore.
* It is now possible to use a phrase as a synonym.
* Added "AlwaysFoundWord" search template command. It specifies a dummy word that is always considered found.
* PgSQL driver has been slightly optimized.
* Several improvements were made to the search template to make it compatible with XHTML.
* Fixed that "" entries didn't work well in search.htm.
* Fixed search.cgi crash, which showed up on Debian and Suse in some cases (Bug#1004).
* Fixed that phrase search didn't work properly after indexing with MinWordLength in indexer.conf.
* Fixed that search could split words into parts because the Chinese/Thai segmenter was invoked in wrong cases.
* Fixed that search query and word statistics were displayed in LocalCharset instead of BrowserCharset when no documents were found.
* Fixed that search.cgi crashed if NumSections was smaller than the actual number of sections stored in the database.
* Fixed a minor bug in the synonyms code: the synonyms feature could not be used if fewer than three synonyms were defined.
* Several stability and performance improvements were made.
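The automatic fallback controlled by StrictModeThreshold in this release can be sketched in Python with a toy in-memory index. This is a minimal illustration of the described behaviour, not the search.cgi implementation; the function names and the index layout (word mapped to a set of document ids) are hypothetical.

```python
def search(index, words, mode):
    """Return sorted ids of documents matching all or any of the words."""
    postings = [index.get(word, set()) for word in words]
    if not postings:
        return []
    if mode == "all":
        hits = set.intersection(*postings)
    else:  # mode == "any"
        hits = set.union(*postings)
    return sorted(hits)


def search_with_threshold(index, words, strict_mode_threshold=0):
    """Try m=all first; fall back to m=any if too few results."""
    results = search(index, words, "all")
    if strict_mode_threshold > 0 and len(results) < strict_mode_threshold:
        return search(index, words, "any"), "any"
    return results, "all"


index = {"search": {1, 2, 3}, "engine": {2, 3, 4}, "lite": {3}}
print(search_with_threshold(index, ["search", "lite"], 0))
print(search_with_threshold(index, ["search", "lite"], 2))
```

With a threshold of 0 the strict result is returned unchanged; with a threshold of 2 the single strict hit is judged too few and the query is repeated in the less strict mode.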
Version 3.2.33.1:
* Japanese stoplist was added. Thanks to Alexander Sharapov.
* "Realm site *" now follows only links from the same site as the current URL.
* $(CurrentTimestamp) and $(Last-Modified-Timestamp) search.htm variables were added, representing the current date and a document's modification date in numeric (Unix timestamp) format.
* New "dstmp" search parameter was added. It can be used instead of dy/dd/dm.
* New "ExcerptStopword yes/no" search.htm command was added, to choose whether stopwords should be highlighted in excerpts.
* MaxDocPerSite server setting was added.
* Relevancy improvements were made (better word distance calculation; word count is now taken into account).
* Excerpt generation performance improvements were made.
* Fixed that indexer exited with a "Duplicate error" message with PostgreSQL 8.0.
* Fixed that indexer could crash when processing a malformed BASE HREF tag.
* Fixed that search results were wrongly displayed if search limits returned no documents in some cases.
* Fixed that a page was not removed from the search index in some cases even if it was already removed from the site.
* Fixed that "Alias regex" didn't work in search.htm.
* Fixed that the "Pro" version silently didn't start indexing if some record in the list on the "Server" tab contained a bad URL.
* Several stability improvements were made.
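For readers unfamiliar with the numeric format behind the $(CurrentTimestamp) and $(Last-Modified-Timestamp) variables added in this release, the Python sketch below converts an HTTP Last-Modified value into a Unix timestamp and compares it with the current time. It shows the generic conversion only; the helper name `last_modified_timestamp` is hypothetical and nothing here reflects how search.cgi computes the values internally.

```python
import time
from email.utils import parsedate_to_datetime


def last_modified_timestamp(header_value: str) -> int:
    """Convert an HTTP Last-Modified value to a Unix timestamp."""
    return int(parsedate_to_datetime(header_value).timestamp())


current_timestamp = int(time.time())
modified = last_modified_timestamp("Wed, 21 Oct 2015 07:28:00 GMT")
age_days = (current_timestamp - modified) // 86400
print(modified, age_days)
```

Because both values are plain integers counted in seconds, a document's age is a simple subtraction.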
Version 3.2.24.1: N/A