The distributed search space is seeing more competition as developers build applications for their website search and enterprise search needs. The Apache Lucene search libraries sit behind both Apache Solr and Elasticsearch. These offerings are best suited for those looking to build their own applications. A second option is to use a managed search offering such as Google Custom Search or Amazon CloudSearch, among others. It is important for developers to evaluate which offering suits their needs.
Any comprehensive review requires building a test environment, which is normally done on one server in Solr standalone mode. As search volumes increase, it becomes important to review distributed search offerings, which in Apache Solr means SolrCloud mode.
Tests in Solr require building data sets and configurations for managing the search application. Here, with the solr create_collection command, you can create a collection in a test environment in SolrCloud mode.
The solr create_collection command, used in distributed SolrCloud mode, has functionality for setting up shards and replicas. It is also used to set up links to Apache ZooKeeper, which coordinates the cluster nodes and is helpful in failover situations.
The solr create_collection command adds complexity. If your test case is for a standalone mode application, the solr create or solr create_core commands are better suited for you.
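For the standalone case, a minimal sketch might look like the following. It assumes the solr script sits at bin/solr relative to the current directory and uses a hypothetical core name, mycore. The command is printed rather than executed so it can be reviewed first.

```shell
# Sketch only: SOLR and the core name are assumptions, adjust for your setup.
SOLR="bin/solr"
CMD="$SOLR create_core -c mycore"
echo "$CMD"   # dry run: print the command for review
# $CMD        # uncomment to run against a standalone Solr (relies on word splitting)
```

In the stock script, solr create with the same -c option is expected to pick the core or collection variant based on the mode of the running Solr instance.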
The solr create_collection command is one of 12 commands within the main solr script. The solr create_collection command itself has 6 options, also known as parameters.
The syntax for running solr create_collection is as follows.
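As a hedged sketch of that general form, the snippet below prints a usage line built from the option spellings of the stock Solr 7 script. The spellings, and the <configname> placeholder, are assumptions for any other version, so confirm them with -help on your installation.

```shell
# Assumed general form; square brackets mark optional parts.
# Option names follow the Solr 7 script and may differ in other releases.
USAGE='solr create_collection -c <name> [-d <confdir>] [-n <configname>] [-shards <#>] [-replicationFactor <#>] [-p <port>]'
echo "$USAGE"
```

The options appear in the same order as the name, configuration directory, configuration name, shards, replicas and port descriptions in the table below.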
The list of 6 options, including default values, is explained fully in the table below.
This syntax assumes your current working directory is the $SOLR_HOME directory for a local installation of Solr version 7 in standalone mode. When running in a production environment, the directory locations may differ.
So the path to the location of the solr script is:
The solr script can be run using ./solr from within
When using a Solr Windows installation, the solr script is called using
For the solr create_collection command, the -c option is required. The other options (parameters) are optional.
||Create a Solr collection with default options named <name>.||
||Select a name for the collection in SolrCloud mode.||-c <name> is required|
||Select the configuration directory contents to copy to the collection in SolrCloud mode. Currently two options exist: _default and the techproducts sample configuration.||_default|
||You can select a name for the configuration directory set up by the -d <confdir> option. By default the name selected will match that selected with -c <name> and contents will be copied to it from the _default directory. This option is provided if a desired configuration directory already exists.||<name> from the -c option|
||Set the number of shards to split the collection into.||1|
||Set the number of copies of each document in the collection. The option can be specified using the long form -replicationFactor <#> as well.||1 (not replicated)|
||Select the port on the server where Solr should create the collection.||Solr selects the first running server|
The following command will create a Solr collection with the name mycollection on the default port 8983.
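A minimal sketch of such a command, assuming the solr script is at bin/solr relative to the current directory and that a SolrCloud node is already running on port 8983. The command is printed as a dry run.

```shell
# Sketch only: the script path is an assumption, adjust for your installation.
SOLR="bin/solr"
CMD="$SOLR create_collection -c mycollection"
echo "$CMD"   # dry run: print the command for review
# $CMD        # uncomment to create the collection on the running node
```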
The following command will create a Solr collection called mycollection on port number 8984 instead of the default 8983. This also creates a configuration directory called mycollection with a copy of the contents from /server/solr/configsets/_default.
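One way this might be written, assuming the Solr 7 -p option and a node listening on port 8984. The collection name mycollection is chosen to match the configuration directory name in the description.

```shell
# Sketch only: script path and port are assumptions for a local test setup.
SOLR="bin/solr"
PORT=8984
CMD="$SOLR create_collection -c mycollection -p $PORT"
echo "$CMD"   # dry run: print the command for review
# $CMD        # uncomment to run against the node on port 8984
```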
The following command will create a Solr collection named mycollection copying the techproducts configuration to a directory called myconfigs.
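A hedged sketch of this case follows. It assumes the techproducts configuration ships under the directory name sample_techproducts_configs, as in the stock Solr 7 configsets folder, and that -d and -n are the option spellings on your version.

```shell
# Sketch only: the configset directory name is an assumption, check
# server/solr/configsets on your installation.
SOLR="bin/solr"
CONFDIR="sample_techproducts_configs"
CMD="$SOLR create_collection -c mycollection -d $CONFDIR -n myconfigs"
echo "$CMD"   # dry run: print the command for review
# $CMD        # uncomment to run against a live SolrCloud node
```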
The following command will create a Solr collection named mycollection with 2 shards and 2 replicas, using a copy of the _default configuration named mycollection.
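The sketch below shows how this might look, assuming the Solr 7 option spellings -shards and -replicationFactor. It also prints the number of cores the request implies, since each shard is stored once per replica.

```shell
# Sketch only: script path and option spellings are assumptions (Solr 7).
SOLR="bin/solr"
SHARDS=2
REPLICAS=2
CMD="$SOLR create_collection -c mycollection -shards $SHARDS -replicationFactor $REPLICAS"
TOTAL_CORES=$((SHARDS * REPLICAS))   # shards multiplied by copies of each shard
echo "$CMD"                          # dry run: print the command for review
echo "cores to be created: $TOTAL_CORES"
# $CMD                               # uncomment to run against a live SolrCloud node
```

With 2 shards and 2 replicas the cluster ends up hosting 4 cores for the collection.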
From the command line you can access additional help on the solr create_collection command by adding -help after the command.
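As a short illustration, again assuming the script is at bin/solr relative to the current directory:

```shell
# Sketch only: prints the help invocation; the script path is an assumption.
SOLR="bin/solr"
CMD="$SOLR create_collection -help"
echo "$CMD"   # dry run: print the command for review
# $CMD        # uncomment to display the built-in usage text
```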
FactorPad offers Apache Solr Search content in both tutorials and reference.