Install Solr - The 5 Steps to an Easy Apache Solr Installation

The 5 Steps to Installing Apache Solr Search in a Test Environment

Beginner

At the outset, I called this installation easy. Not many aspects of implementing indexing and search are easy, especially when you move to a production environment. What is easy, relatively, is the Solr installation in a local test environment, as you will see, but we have a few steps to cover first.

For Those Just Starting Out

Step 1 - Find Help With Online Apache Solr and Apache Lucene Resources

First off, let me point you to the Apache Solr website at https://lucene.apache.org/solr/ with the caveat that this may look different in the future. Take a quick look around without getting bogged down. There is a lot of information here and it is easy to get lost.

News, Resources and Community

Here is a quick summary of helpful resources.

First, Apache publishes updates to Solr quite often, so check the News section for the latest. With new releases arriving every few months, try at the beginning to focus less on the version number and more on the process of installation. At this stage we are focused on testing Solr, and when you are ready to go to production you will likely want the latest version anyway.

It is worth reviewing the Apache resources, including the Solr Quick Start tutorial. To me it is a little difficult for beginners to navigate, which is why I started this tutorial series.

You will also find HTML-based documentation for the latest release. For each version after 4.4 there is also a Reference Guide in PDF and web formats. The PDF of the Apache Solr Reference Guide for the latest version, 7.0, is 1,035 pages long, which should indicate how complicated indexing and search can be. Also included are links to wiki-type documentation, books, presentations and videos.

Under Community you will find links to support, mailing lists, known issues and IRC channels. Again, Solr updates frequently and versions move from 'Stable' to 'End of Life' quickly, and those details are noted here.

Step 2 - System Requirements - Java Runtime Environment (JRE)

As for system requirements, for Apache Solr 6 or 7 to run on your system you will need the Java Runtime Environment version 8 or greater, and where you see version 1.8, it is the same thing as version 8.

Use the java -version command from the command line to see your version number on a macOS or Linux machine.

$ java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-1~bpo8+1-b11)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)
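If you would rather have a script make the comparison, here is a minimal sketch. The helper name java_major is my own, not part of Java or Solr, and it assumes the version string sits between double quotes as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: turn a Java version string into its major number.
# The legacy "1.8.0_131" scheme maps to 8; a newer "11.0.2" maps to 11.
java_major() {
    case "$1" in
        1.*) v=${1#1.}; echo "${v%%.*}" ;;   # strip the leading "1." then keep the next field
        *)   echo "${1%%[.+-]*}" ;;          # keep everything before the first ., + or -
    esac
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so redirect it before parsing.
    ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    major=$(java_major "$ver")
    if [ "${major:-0}" -ge 8 ]; then
        echo "Java $ver is new enough for Solr 6 or 7"
    else
        echo "Java $ver is too old for Solr 6 or 7"
    fi
else
    echo "java was not found on the PATH"
fi
```

The case statement is there only because 1.8 and 8 name the same release, as noted above.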

If you do not have version 1.8 or greater, our previous tutorial walks you through how to install Java JRE for two versions of Debian, which can apply to other Debian-based distributions found on cloud hosting providers.

Step 3 - How to Download Apache Solr

Next is how to download Solr.

Currently, when you click on Download, you will be transferred directly to a mirror, which is basically a repository. Apache claims that the Apache Lucene and Solr combination of programs is downloaded 6,000 times per day, which is astonishing to me, but explains why they set up mirrors to spread the traffic around the globe.

A note about security

A quick note about security. You may have noticed that each of the sites is prefixed with http instead of https. A conversation about security, man-in-the-middle attacks and why Apache and other software providers opt not to use SSL-secured websites is way beyond our scope here, so I will just spend a minute showing you how I verified the validity of the download on my Debian Linux system. There are of course multiple ways to do this on Linux, macOS and Windows, as discussed at the bottom of the Download page.

That said, many people just take the risk and download the file.

Select the appropriate version

So which version should you select? Well, most software providers want all newcomers to use the latest version and Apache is no different. Version 7 was released just a few days ago.

Let me show you the files. Click on the suggested mirror for your location and you will see a directory with three compressed files. First is the source code, solr-7.0.0-src.tgz. Second is solr-7.0.0.tgz, a tar archive compressed with gzip, a format commonly used on Linux and macOS systems. Finally, solr-7.0.0.zip was compressed using zip, which is common on Windows but can be unzipped on Linux as well.

You can of course download a different version if you move up one directory or access the full archive of all historical versions on the Apache Solr website.

Why downloading Solr from the Linux repository is not a good idea

So why is downloading Solr from the Linux repository not a good idea? As a general preference on Linux it is a good practice to see which version of any program is available in the repository for your Linux distribution. That implies it is fully tested out by say Debian, Ubuntu or Red Hat, giving you more assurances that it will work properly.

Recall that we are downloading the Solr files from Apache instead. This is because, at least in my case with Debian, the most recent fully-tested version is 4 years old! So while using sudo apt-get install to get the default version is what we may prefer for its simplicity, using a 4-year-old version of Solr is not wise.

Let me show you this on Debian. After doing a sudo apt-get update you can search your local cache file of 54,000 packages with the apt-cache command like this.

$ apt-cache -n search solr
dovecot-solr - secure POP3/IMAP server - Solr support
libwebservice-solr-perl - Perl interface for the Solr (Lucene) web service
libsolr-java - Enterprise search server based on Lucene - Java libraries
solr-common - Enterprise search server based on Lucene3 - common files
solr-jetty - Enterprise search server based on Lucene3 - Jetty integration
solr-tomcat - Enterprise search server based on Lucene3 - Tomcat integration
php5-solr - solr module for PHP 5
python-pysolr - lightweight Python wrapper for querying Apache Solr
r-cran-solrium - general purpose R interface to 'Solr'

The package we are interested in is called solr-common, which can be examined using apt-cache showpkg.

$ apt-cache showpkg solr-common
Package: solr-common
Versions:
3.6.2+dfsg-5 (/var/lib/apt/lists/ftp.us.debian.org_debian_dists_jessie_main_binary-amd64_Packages)
(about 20 lines trimmed)

So downloading from the Debian repository will give you version 3.6.2, which was released in 2013 and is well beyond the date where Apache provides patches and updates, meaning it is considered EOL (End of Life).

Keep in mind that I am using Debian 8 Jessie. If you are on the later version, 9 Stretch, you can search https://www.debian.org/distrib/packages for that same solr-common package and you will find the same version, 3.6.2.
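For a shorter answer than showpkg gives, apt-cache policy reports the version that apt-get install would pick. The sketch below pulls out just that line. The helper name candidate_version is my own, and the parsing assumes the usual "Candidate:" line in the policy output.

```shell
#!/bin/sh
# Hypothetical helper: read `apt-cache policy` output on stdin and print
# the candidate version, which is the one apt-get install would select.
candidate_version() {
    awk '$1 == "Candidate:" {print $2; exit}'
}

if command -v apt-cache >/dev/null 2>&1; then
    # Prints nothing if the package is unknown to your distribution.
    apt-cache policy solr-common | candidate_version
else
    echo "apt-cache is not available on this system"
fi
```

On a Debian release that still packages solr-common, I would expect this to print the same version string that showpkg reported.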

Download the files

Heading back to the files. In a browser you can download these files straight from the mirror page with your mouse, using Right-Click on Windows or Ctrl-Click on macOS.

A common question is: where should you put the files? Apache Solr documentation suggests you put them in your working directory. And I will show you later why creating a subdirectory at this stage is not necessary.

For those using the Linux command line, like I am on my end, I will go to my working directory and download the files using the wget command.

I downloaded three files because I want to verify the integrity of the download, just to be extra safe and to show you how it works. I will be using the PGP signature so I will download the .asc file, and the file called KEYS along with the compressed file for version 7, which is 143 Megabytes in size.

If you would like to verify the files using a different method, with the .md5 or .sha1 files, then I suggest reading the bottom of the mirror page you reached when you clicked on Download.

$ cd ~/
$ wget http://www-us.apache.org/dist/lucene/solr/7.0.0/solr-7.0.0.tgz
$ wget http://www-us.apache.org/dist/lucene/solr/7.0.0/solr-7.0.0.tgz.asc
$ wget http://www-us.apache.org/dist/lucene/solr/7.0.0/KEYS
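Since the version number appears several times in those commands, a small sketch like the following keeps it in one place. The helper name solr_urls is my own, and the base URL is simply the mirror used above, which may not serve these files in the future.

```shell
#!/bin/sh
# Base URL copied from the wget commands above; treat it as an assumption,
# since mirrors and their directory layout change over time.
BASE="http://www-us.apache.org/dist/lucene/solr"

# Hypothetical helper: print the three URLs needed for one Solr version.
solr_urls() {
    printf '%s\n' \
        "$BASE/$1/solr-$1.tgz" \
        "$BASE/$1/solr-$1.tgz.asc" \
        "$BASE/$1/KEYS"
}

# Print the wget commands instead of running them, so nothing is
# downloaded until you have looked over the URLs.
solr_urls 7.0.0 | while read -r url; do
    echo "wget $url"
done
```

To try a different release, change the argument passed to solr_urls and check that the directory exists on the mirror first.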

In the last tutorial, I mentioned preparing to install Solr version 6.6.1, but, instead of doing that, I will work with Solr version 7. Solr 7 was just released in the last few days. This should reinforce the earlier point about not being too particular about which version you use in a test environment, because the point is to test things out. That said, you will need Java Runtime Environment 1.8 or greater.

Verify the validity of the files

As mentioned earlier, for those concerned with security, I will follow the advice Apache gives regarding verifying the integrity of downloaded files.

On my version of Debian 8 Jessie at the command line, I used the program gpg and entered two commands.

$ gpg --import KEYS
$ gpg --verify solr-7.0.0.tgz.asc solr-7.0.0.tgz
gpg: Signature made Fri 08 Sep 2017 01:23:19 PM PDT using RSA key ID ...
gpg: Good signature from ...

The first command imports the public keys that Apache publishes in the KEYS file. The second verifies that the signature matches the downloaded file.
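For those who prefer the .sha1 route mentioned earlier, here is a minimal sketch of a checksum comparison. The helper name verify_sha1 and the file name solr-7.0.0.tgz.sha1 are my assumptions, as is the layout of the .sha1 file, which I assume begins with the hex digest. It relies on sha1sum, which ships with Debian; on macOS you would likely substitute shasum.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE's SHA-1 digest equals the
# first field of SUMFILE. Usage: verify_sha1 FILE SUMFILE
verify_sha1() {
    expected=$(awk '{print $1; exit}' "$2")      # digest recorded in the .sha1 file
    actual=$(sha1sum "$1" | awk '{print $1}')    # digest of the file we downloaded
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Assumed file names; only runs when both files are present.
if [ -f solr-7.0.0.tgz ] && [ -f solr-7.0.0.tgz.sha1 ]; then
    if verify_sha1 solr-7.0.0.tgz solr-7.0.0.tgz.sha1; then
        echo "SHA-1 digest matches"
    else
        echo "SHA-1 digest does NOT match"
    fi
fi
```

Keep in mind that a checksum fetched from the same mirror only guards against a corrupted download, while the PGP signature also ties the file to a key.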

Step 4 - Install Apache Solr by Unzipping the File

Next, installing Apache Solr is accomplished simply by unzipping the compressed file downloaded earlier. Personally, before that I generally try to understand what is inside by counting the number of files and viewing the structure.

Inspect the download

We can look into the compressed file using tar -tf, where t lists the contents and f names the archive file. Before viewing the listing, I pipe it into wc -l, a word count for lines, to count the total number of entries.

$ tar -tf solr-7.0.0.tgz | wc -l
1604
$ tar -tf solr-7.0.0.tgz
solr-7.0.0/LUCENE_CHANGES.txt
solr-7.0.0/contrib/analysis-extras/lib/
solr-7.0.0/contrib/clustering/lib/
solr-7.0.0/contrib/dataimporthandler-extras/lib/
solr-7.0.0/contrib/extraction/lib/
(1599 lines trimmed)

So this output shows 1,604 entries, both files and directories, along with the directory structure: everything will be put into a directory called solr-7.0.0. I mention this because some people wonder if they should create a subdirectory for the Solr download, but as you can see it will create one for you on its own.
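If you want to confirm both points without scrolling through the listing, here is a small sketch. The helper names top_level_dirs and count_files are my own, and it assumes your tar detects gzip compression on its own when reading, as GNU tar and bsdtar do.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct top-level names inside an archive.
top_level_dirs() {
    tar -tf "$1" | cut -d/ -f1 | sort -u
}

# Hypothetical helper: count entries that are not directories.
# Directory entries end with a slash in the tar listing.
count_files() {
    tar -tf "$1" | grep -vc '/$'
}

archive="solr-7.0.0.tgz"
if [ -f "$archive" ]; then
    top_level_dirs "$archive"   # a single name means one directory is created on extraction
    count_files "$archive"
else
    echo "$archive was not found in the current directory"
fi
```

A single line from top_level_dirs tells you the archive unpacks into one directory, so there is no need to create one yourself.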

Unzip the Solr download

Next, we unzip the compressed file that sits in our working directory. Apache documentation recommends using the tar command with the options zxf, which is z for the gzip format, x to extract and f to name the archive file, on the filename solr-7.0.0.tgz. This is equivalent to using options -zxf if you prefer.

$ tar zxf solr-7.0.0.tgz

Step 5 - Get Started With a Quick Look Around

In the next tutorial we will spend time looking around but at this stage we want to see if Solr was installed properly.

Solr at the command line

First, navigate to the directory containing Solr, assuming you installed it in your working directory as suggested. Next, start Solr by pointing to the solr script with bin/solr start.

$ cd ~/solr-7.0.0
$ bin/solr start
Waiting up to 180 seconds to see Solr running on port 8983 [\]
Started Solr server on port 8983 (pid=2493). Happy searching!

Ok, great! Solr is up and running. Happy searching!

Now, we can get the status including the number of nodes, the port number, and a few other tidbits with the bin/solr status command.

$ bin/solr status
Found 1 Solr nodes:
Solr process 2493 running on port 8983
{
  "solr_home":"/home/paul/solr-7.0.0/server/solr",
  "version":"7.0.0 ... - 2017-09-08 13:21:08",
  "startTime":"2017-10-01T18:55:42.497Z",
  "uptime":"0 days, 0 hours, 1 minutes, 55 seconds",
  "memory":"29.6 MB (%6) of 490.7 MB"}

Then, to stop Solr use the bin/solr stop command.

$ bin/solr stop
Sending stop command to Solr running on port 8983 ... waiting up to 180 seconds to allow Jetty process 2493 to stop gracefully.
$ _
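If you only want to know which ports have a Solr node on them, the status output can be trimmed down with a sketch like this. The helper name solr_ports is my own, and the parsing assumes the "Solr process ... running on port ..." wording shown in the status output, which could differ in other versions.

```shell
#!/bin/sh
# Hypothetical helper: read `bin/solr status` output on stdin and print
# the port of each running node, one per line.
solr_ports() {
    sed -n 's/.*Solr process [0-9][0-9]* running on port \([0-9][0-9]*\).*/\1/p'
}

# Assumes Solr was unpacked into your home directory as in this tutorial.
SOLR_DIR="$HOME/solr-7.0.0"
if [ -x "$SOLR_DIR/bin/solr" ]; then
    # Prints nothing when no node is running.
    "$SOLR_DIR/bin/solr" status | solr_ports
else
    echo "bin/solr was not found under $SOLR_DIR"
fi
```

Run it once after bin/solr start and again after bin/solr stop to see the port appear and disappear.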

Very good. So at this point I am reluctant to go any deeper. In the next tutorial we will open the Admin Console, which is a browser-based tool to help you administer the Solr instance. We will also look around a bit and get comfortable with a few new terms.

Working with Apache Solr can be tricky, so if you need help please feel free to reach out on social media, including at our FactorPad YouTube Channel.

Questions and Answers

Q: Why are there two types of setups: a test environment and a production environment?
A: Apache Solr is a large and complicated piece of software with many use cases. Given that, it is best to take time to understand creating indexes, setting schema, configuring searches and evaluating the SolrCloud offering before moving on to setting up a production environment.