Extension:CirrusSearch

MediaWiki extensions manual
CirrusSearch
Release status: stable
Implementation Search, API, Hook
Description Implements searching for MediaWiki using Elasticsearch
Author(s) Nik Everett, Chad Horohoe, Erik Bernhardson
Latest version continuous updates
Compatibility policy Snapshots releases along with MediaWiki. Master is not backwards compatible.
License GNU General Public License 2.0 or later
Download
README
  • $wgCirrusSearchLanguageWeight
  • $wgCirrusSearchUseIcuFolding
  • $wgCirrusSearchStemmedWeight
  • $wgCirrusSearchQueryStringMaxDeterminizedStates
  • $wgCirrusSearchCrossClusterSearch
  • $wgCirrusSearchExtraIndexSettings
  • $wgCirrusSearchTalkNamespaceWeight
  • $wgCirrusSearchPrefixWeights
  • $wgCirrusSearchPrefixSearchRescoreProfile
  • $wgCirrusSearchDisableUpdate
  • $wgCirrusSearchCompletionSuggesterUseDefaultSort
  • $wgCirrusSearchMoreLikeThisMaxQueryTermsLimit
  • $wgCirrusSearchUseIcuTokenizer
  • $wgCirrusSearchCompletionBannedPageIds
  • $wgCirrusSearchOptimizeIndexForExperimentalHighlighter
  • $wgCirrusSearchRescoreProfiles
  • $wgCirrusSearchPhraseRescoreBoost
  • $wgCirrusSearchInterwikiProv
  • $wgCirrusSearchDefaultCluster
  • $wgCirrusSearchElasticQuirks
  • $wgCirrusSearchMaxFileTextLength
  • $wgCirrusSearchFallbackProfiles
  • $wgCirrusSearchMoreLikeThisTTL
  • $wgCirrusSearchAllowLeadingWildcard
  • $wgCirrusSearchInterwikiPrefixOverrides
  • $wgCirrusSearchMaintenanceTimeout
  • $wgCirrusSearchReplicas
  • $wgCirrusSearchPhraseSlop
  • $wgCirrusSearchBoostOpening
  • $wgCirrusSearchWriteBackoffExponent
  • $wgCirrusSearchUserTesting
  • $wgCirrusSearchShardCount
  • $wgCirrusSearchUseCompletionSuggester
  • $wgCirrusSearchPhraseSuggestReverseField
  • $wgCirrusSearchFallbackProfile
  • $wgCirrusSearchFragmentSize
  • $wgCirrusSearchUnlinkedArticlesToUpdate
  • $wgCirrusSearchClientSideUpdateTimeout
  • $wgCirrusSearchIgnoreOnWikiBoostTemplates
  • $wgCirrusSearchRegexMaxDeterminizedStates
  • $wgCirrusSearchInterwikiHTTPConnectTimeout
  • $wgCirrusSearchExtraIndexes
  • $wgCirrusSearchCategoryDepth
  • $wgCirrusSearchMergeSettings
  • $wgCirrusSearchClusters
  • $wgCirrusSearchAllFields
  • $wgCirrusSearchBannedPlugins
  • $wgCirrusSearchMoreLikeThisConfig
  • $wgCirrusSearchClusterOverrides
  • $wgCirrusSearchCrossProjectBlockScorerProfiles
  • $wgCirrusSearchNearMatchWeight
  • $wgCirrusSearchReplicaGroup
  • $wgCirrusSearchIndexedRedirects
  • $wgCirrusSearchIndexAllocation
  • $wgCirrusSearchNumCrossProjectSearchResults
  • $wgCirrusSearchLanguageDetectors
  • $wgCirrusSearchUpdateShardTimeout
  • $wgCirrusSearchEnableCrossProjectSearch
  • $wgCirrusSearchFullTextQueryBuilderProfiles
  • $wgCirrusSearchCompletionDefaultScore
  • $wgCirrusSearchWriteClusters
  • $wgCirrusSearchCompletionSuggesterHardLimit
  • $wgCirrusSearchRecycleCompletionSuggesterIndex
  • $wgCirrusSearchLogElasticRequests
  • $wgCirrusSearchConnectionAttempts
  • $wgCirrusSearchWikiToNameMap
  • $wgCirrusSearchMaxFullTextQueryLength
  • $wgCirrusSearchLogElasticRequestsSecret
  • $wgCirrusSearchEnableRegex
  • $wgCirrusSearchClientSideSearchTimeout
  • $wgCirrusSearchExtraBackendLatency
  • $wgCirrusSearchNamespaceMappings
  • $wgCirrusSearchPreferRecentUnspecifiedDecayPortion
  • $wgCirrusSearchWMFExtraFeatures
  • $wgCirrusSearchSearchShardTimeout
  • $wgCirrusSearchNamespaceResolutionMethod
  • $wgCirrusSearchPrivateClusters
  • $wgCirrusSearchSimilarityProfiles
  • $wgCirrusSearchCategoryMax
  • $wgCirrusSearchCategoryEndpoint
  • $wgCirrusSearchPoolCounterKey
  • $wgCirrusSearchCompletionProfiles
  • $wgCirrusSearchMaxShardsPerNode
  • $wgCirrusSearchRescoreProfile
  • $wgCirrusSearchRefreshInterval
  • $wgCirrusSearchSimilarityProfile
  • $wgCirrusExploreSimilarResults
  • $wgCirrusSearchEnableArchive
  • $wgCirrusSearchIndexDeletes
  • $wgCirrusSearchFiletypeAliases
  • $wgCirrusSearchDevelOptions
  • $wgCirrusSearchPrefixSearchStartsWithAnyWord
  • $wgCirrusSearchUpdateConflictRetryCount
  • $wgCirrusSearchInterwikiHTTPTimeout
  • $wgCirrusSearchFetchConfigFromApi
  • $wgCirrusSearchPhraseSuggestUseOpeningText
  • $wgCirrusSearchExtraIndexBoostTemplates
  • $wgCirrusSearchPrefixIds
  • $wgCirrusSearchFullTextQueryBuilderProfile
  • $wgCirrusSearchStripQuestionMarks
  • $wgCirrusSearchMoreLikeThisFields
  • $wgCirrusSearchIndexBaseName
  • $wgCirrusSearchMasterTimeout
  • $wgCirrusSearchSanityCheck
  • $wgCirrusSearchTextcatConfig
  • $wgCirrusSearchNamespaceWeights
  • $wgCirrusSearchCrossProjectOrder
  • $wgCirrusSearchTextcatModel
  • $wgCirrusSearchInterwikiThreshold
  • $wgCirrusSearchMoreAccurateScoringMode
  • $wgCirrusSearchMaxPhraseTokens
  • $wgCirrusSearchCrossProjectSearchBlockList
  • $wgCirrusSearchRescoreFunctionChains
  • $wgCirrusSearchCrossProjectShowMultimedia
  • $wgCirrusSearchMaxIncategoryOptions
  • $wgCirrusSearchCrossProjectProfiles
  • $wgCirrusSearchWikimediaExtraPlugin
  • $wgCirrusSearchLanguageToWikiMap
  • $wgCirrusSearchLinkedArticlesToUpdate
  • $wgCirrusSearchEnableAltLanguage
  • $wgCirrusSearchPreferRecentDefaultHalfLife
  • $wgCirrusSearchCompletionSuggesterSubphrases
  • $wgCirrusSearchFunctionRescoreWindowSize
  • $wgCirrusSearchEnablePhraseSuggest
  • $wgCirrusSearchCompletionSettings
  • $wgCirrusSearchUseExperimentalHighlighter
  • $wgCirrusSearchDropDelayedJobsAfter
  • $wgCirrusSearchFeedbackLink
  • $wgCirrusSearchUpdateDelay
  • $wgCirrusSearchInterleaveConfig
  • $wgCirrusSearchPhraseRescoreWindowSize
  • $wgCirrusSearchDefaultNamespaceWeight
  • $wgCirrusSearchMoreLikeThisAllowedFields
  • $wgCirrusSearchClientSideConnectTimeout
  • $wgCirrusSearchPhraseSuggestUseText
  • $wgCirrusSearchPhraseSuggestProfiles
  • $wgCirrusSearchPreferRecentDefaultDecayPortion
  • $wgCirrusSearchInterwikiSources
  • $wgCirrusSearchWeights
  • $wgCirrusSearchBoostTemplates
  • $wgCirrusSearchICUFoldingUnicodeSetFilter
  • $wgCirrusSearchSlowSearch
Vagrant role cirrussearch

The CirrusSearch extension implements searching for MediaWiki using Elasticsearch.

This page is for installation. After the install is working, see Help:CirrusSearch for usage.

Goals

  • No native dependencies that would make this difficult to install
    • The only dependencies are pure-PHP MediaWiki extensions and Elasticsearch itself
  • Provide a near-real-time search index for wiki pages that's extendable by other MediaWiki extensions
  • Provide all of the query options MWSearch has given users, and more

Dependencies

PHP and cURL
Elasticsearch

  • MediaWiki 1.29.x and 1.30.x require Elasticsearch 5.3.x or 5.4.x.
  • MediaWiki 1.31.x and 1.32.x require Elasticsearch 5.5.x or 5.6.x.
  • MediaWiki 1.33.x, 1.34.x and 1.35.x require Elasticsearch 6.5.x (6.5.4 recommended).

Note that Elasticsearch also requires a Java installation such as OpenJDK.

Elastica
  • Elastica is a PHP library to talk to Elasticsearch. Install Elastica per the instructions below.

Other
  • Because of how the CirrusSearch extension handles jobs, it is advisable to store the job queue in Redis. This prevents notices such as Notice: unserialize(): Error at offset 64870 of 65535 bytes in JobQueueDB.php and subsequent errors such as Unsupported operand types. See task T157759.
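
A Redis-backed job queue is normally configured through $wgJobTypeConf in LocalSettings.php. The fragment below is a minimal sketch, not a tested production setup: the server address localhost:6379 and the claimTTL value are placeholder assumptions, and JobQueueRedis expects the separate MediaWiki job runner service to be running to execute the jobs.

```php
// LocalSettings.php (sketch)
// Assumption: a Redis server is reachable at localhost:6379.
$wgJobTypeConf['default'] = [
	'class' => 'JobQueueRedis',
	'redisServer' => 'localhost:6379', // placeholder address
	'redisConfig' => [],
	'claimTTL' => 3600, // seconds before a claimed job may be retried (assumed value)
	'daemonized' => true, // jobs are run by the external job runner service
];
```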

Installation


Elastica

The instructions below say to run Composer only when installing from git, but you may need to run it regardless in order to install all PHP dependencies.

  • Download and place the file(s) in a directory called Elastica in your extensions/ folder.
  • If you are installing from git, run Composer to install the PHP dependencies by issuing composer install --no-dev in the extension directory. (See task T173141 for potential complications.)
  • Add the following code at the bottom of your LocalSettings.php:
    wfLoadExtension( 'Elastica' );
    
  •  Done – Navigate to Special:Version on your wiki to verify that the extension is successfully installed.

CirrusSearch

  • Download and place the file(s) in a directory called CirrusSearch in your extensions/ folder.
  • Add the following code at the bottom of your LocalSettings.php:
    wfLoadExtension( 'CirrusSearch' );
    
  • Now follow the setup instructions in the CirrusSearch README delivered with your extension, i.e. $IP/extensions/CirrusSearch/README. Note that not everything in it may apply to your version of the extension, especially the supported version of Elasticsearch.
  • Configure as required.
  •  Done – Navigate to Special:Version on your wiki to verify that the extension is successfully installed.
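
Taken together, the steps above leave LocalSettings.php looking roughly like the sketch below. It assumes Elasticsearch runs on the same host with default settings, and it assumes the usual bootstrap order of disabling search updates while the indexes are built and switching the search backend afterwards. The README shipped with your version remains authoritative for the exact sequence and the maintenance scripts to run.

```php
// LocalSettings.php (sketch; follow the README for your version)
wfLoadExtension( 'Elastica' );
wfLoadExtension( 'CirrusSearch' );

// Assumption: keep search index updates off while the indexes are
// being created and populated as described in the README.
$wgDisableSearchUpdate = true;

// Once indexing has finished, remove the line above and make
// CirrusSearch the active search backend:
// $wgSearchType = 'CirrusSearch';
```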

Upgrading


Please follow the upgrade instructions in the CirrusSearch UPGRADE file.

Configuration


The configuration parameters of CirrusSearch are documented in the "settings.txt" file. See also the documentation on CirrusSearch configuration profiles.

Elasticsearch will fail to index for CirrusSearch if the MySQL database name contains a capital letter, e.g. "MyWikiDatabaseName". To work around this, set the $wgCirrusSearchIndexBaseName configuration parameter to an all-lowercase name, e.g. $wgCirrusSearchIndexBaseName = 'mywikidatabasename';.
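
Instead of hard-coding the name, one option is to derive it from the database name. This is only a sketch: it assumes $wgDBname has already been set earlier in LocalSettings.php and that a lowercased copy of it is acceptable as the index base name on your installation.

```php
// LocalSettings.php (sketch)
// Assumption: $wgDBname is defined above this line, e.g. 'MyWikiDatabaseName'.
// Lowercasing it gives an index base name without capital letters.
$wgCirrusSearchIndexBaseName = strtolower( $wgDBname );
```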

Hooks


The CirrusSearch extension defines a number of hooks that other extensions can use to extend the core schema and modify documents. The following hooks are available:

API


CirrusSearch features can be used in API queries. Searching happens via the normal search API, action=query&list=search, and CirrusSearch-specific features are available in the search string. For example, the morelike: special prefix finds pages related to Marie Curie and radium:

    api.php?action=query&list=search&srsearch=morelike:Marie_Curie%7Cradium&srlimit=10&srprop=size&formatversion=2

Custom APIs and parameters are also provided for querying CirrusSearch configuration and debug information.
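
The morelike: query above can also be assembled programmatically. The following is a minimal sketch using only PHP built-ins; the endpoint https://wiki.example.org/w/api.php is a hypothetical placeholder, and format=json is added here on the assumption that a machine-readable response is wanted.

```php
<?php
// Hypothetical endpoint; replace with the api.php URL of your own wiki.
$endpoint = 'https://wiki.example.org/w/api.php';

// Same parameters as the example query, plus format=json (an addition).
$params = [
	'action' => 'query',
	'list' => 'search',
	'srsearch' => 'morelike:Marie_Curie|radium', // page titles separated by "|"
	'srlimit' => 10,
	'srprop' => 'size',
	'format' => 'json',
	'formatversion' => 2,
];

// http_build_query() percent-encodes the values, including ":" and "|".
$url = $endpoint . '?' . http_build_query( $params, '', '&' );

echo $url, "\n";
```

Letting http_build_query() do the encoding avoids hand-escaping the pipe character that separates the page titles.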

See also

General links
Debugging

Local development


The Elasticsearch service can be run with the MediaWiki-Vagrant role cirrussearch.

For Docker, you can use a command like the following:

    docker run -d --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:6.8.2

Then follow the installation and configuration directions. If your web host also runs in a container, make sure it is on the same Docker network as the Elasticsearch container, and reference elasticsearch as the host name in LocalSettings.php. This setup does not include the WMF plugins but can be sufficient for basic testing.
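
For the containerized case, the host name can be supplied through $wgCirrusSearchServers, the setting the README uses for Elasticsearch instances that are not on localhost. The fragment below is a sketch that assumes the container is named elasticsearch, as in the docker run command, and is reachable from the web container by that name.

```php
// LocalSettings.php (sketch)
// Assumption: the Elasticsearch container is named "elasticsearch" and
// shares a Docker network with the container running MediaWiki.
$wgCirrusSearchServers = [ 'elasticsearch' ];
```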