FactorPad
Faster Learning Tutorials

Solr Field Properties : syntax, options and examples

This list of 19 properties applies to Fields and dictates how data is stored and retrieved by Apache Solr.
  1. About - Understand the purpose of Field properties and how their defaults are set.
  2. Syntax - See how Solr Field properties are coded in the schema.
  3. Options - View the list of 19 Field properties including defaults.
  4. Examples - Review examples of usage.
by Paul Alan Davis, CFA, November 17, 2017
Updated: July 16, 2018
Here we focus on the specification in the XML-formatted schema.xml or managed-schema files.


Solr Field Properties and Their Defaults

Beginner

The following reference is intended for developers evaluating Apache Solr for enterprise search or website search applications. Both Apache Solr and Elasticsearch use Lucene libraries for custom search.

Apache Solr Reference

1. About Solr Field Properties

Solr schema refers to a configuration file that instructs Solr how to index documents, plus which Fields to display in search results. Documents may contain structured data as you might find in a database like an online store, or unstructured data as used in full text search applications like search engines.

The Solr schema is stored in a file named managed-schema when the user elects to make modifications through the Solr Schema API, or in schema.xml for more advanced users who edit the schema by hand.

Fields in Solr are related to the documents themselves and the information being searched for. Each Field is assigned a Field Type which provides rules for how Fields of that type should be processed during indexing and search.

The version="1.6" attribute in the schema dictates default values for each Field Type class, with 1.6 being the schema version for Solr version 7. These may be overridden at the Field level.

The easiest way to think about defaults is that each Field Type class dictates the default values. These defaults are listed in the tables below, but they can be overridden at the Field Type level or the Field level.
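The layering of defaults can be sketched as follows. This is a minimal illustration rather than a recommended configuration: the names text_notes, body and summary are hypothetical, and the single-tokenizer analyzer is only there to make the Field Type complete.

```xml
<!-- Hypothetical Field Type: sets stored="false" for every Field of this type -->
<fieldType name="text_notes" class="solr.TextField" stored="false" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- Inherits stored="false" from the Field Type -->
<field name="body" type="text_notes" indexed="true"/>

<!-- Overrides the Field Type at the Field level: this Field is stored -->
<field name="summary" type="text_notes" indexed="true" stored="true"/>
```

Here the class supplies the base defaults, the fieldType tag changes one of them, and the summary Field changes it back for itself only.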

2. Syntax for Solr Fields and Field Types

An example for both Field Types and Fields might look like this (including the XML and schema tags).

<?xml version="1.0" encoding="UTF-8"?>
<schema name="default-config" version="1.6">
  ...
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true"/>
  ...
  <field name="title" type="text_general" indexed="true" stored="true"/>
  ...
</schema>

The schema file is typically hundreds of lines long. The snippet above shows first a Field Type named text_general, which processes every Field declared with type="text_general", and then a Field named title, pulled from the indexed document, that is assigned this Field Type.

Where you see indexed="true" and stored="true" in the Field tag, these are examples of Field properties. They dictate whether the Field can be searched (indexed) and whether its original value can be returned in search results (stored).

The table below provides a list and description of 19 properties that can be included in the Field tag and will override defaults for that Field Type.

3. Options for the Different Field Properties in Apache Solr

Below are 19 Field properties provided by Solr, with their defaults. The first table covers the properties beginners use most. All properties take either true or false, except name, type and default.

Most common Field properties

The 8 common Field properties determine whether Fields are indexed, stored and retrievable during search and, as in a database, whether they are required and can hold multiple values.

Remember, part of the goal is to minimize the size of an index, and these settings allow you to customize your index and turn on the features you need.

Field Property | Description | Default
name (required) | The name of the Field. | --
type (required) | Points to a Field Type within the same schema that controls behavior for all Fields of that type. | --
default | The value (default="value") used to populate the Field if no data is supplied at index time. | none
indexed | Only when true can the Field be searched or sorted in queries to retrieve matching documents. | true
stored | Only when true can the Field's original value be retrieved in queries. | true
required | When true, Solr will not add a document to the index if it is missing a value in this Field. This is common for id Fields and structured data. | false
multiValued | When true, a document may have multiple values for this Field or Field Type. Similar to a one-to-many relationship in a database. | false
docValues | When true, the value in the Field is also added to a column-oriented structure called DocValues, which is helpful for sorting, faceting (grouping) and function queries. A standard inverted index is not ideally suited to these operations. DocValues add to the size and complexity of the index, so if you are not sorting or faceting on the Field, leave the setting false. docValues are only available for some Field Types. | false

A Field name should start with a letter. Names with both leading and trailing underscores are reserved, such as _version_, _text_ and _root_, which together with id are the four pre-declared Fields in the _default configset.
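As a rough illustration of the common properties, the following sketch declares four Fields. The Field names are hypothetical, and the string, boolean and pfloat types are assumed to be defined elsewhere in the schema, as they are in the _default configset.

```xml
<!-- Hypothetical Fields; the types are assumed to exist in your schema -->

<!-- Documents without a sku are not added to the index -->
<field name="sku" type="string" indexed="true" stored="true" required="true"/>

<!-- Populated with "true" when a document supplies no value -->
<field name="in_stock" type="boolean" indexed="true" stored="true" default="true"/>

<!-- A document may carry several tags -->
<field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>

<!-- docValues adds the column-oriented structure used for sorting and faceting -->
<field name="price" type="pfloat" indexed="true" stored="true" docValues="true"/>
```

Properties left out of a Field tag fall back to the defaults of its Field Type.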

Field properties for more advanced implementations

The following table of 11 properties relates to finer points of index construction and will impact the size of the index and its ability to find and rank documents during search.

Field Property | Description | Default
sortMissingFirst | When true, documents with no value in the sort Field appear first in sorted results. This works for string, boolean, date and numeric data types only. | false
sortMissingLast | When true, documents with no value in the sort Field appear last in sorted results. This works for string, boolean, date and numeric data types only. | false
omitNorms | When true, disables length normalization for the Field. Defaults to true for non-analyzed Field Types such as BinaryField, BoolField, IntPointField and StrField, and false for text Fields. | varies by Field Type
omitTermFreqAndPositions | When true, omits the term frequency, position and payload information that tokenized text Fields otherwise record for use in document ranking. Defaults to true for non-text Fields and false for text Fields. | varies by Field Type
omitPositions | When true, omits position information from tokens but keeps term frequency. | varies by Field Type
termVectors | Maintains term vectors for each document, helpful for MoreLikeThis where document similarity is required. | false
termPositions | Maintains position information for tokens in the term vectors. | false
termOffsets | Maintains offset information for tokens in the term vectors, for advanced Field parsing. | false
termPayloads | Maintains payload information for tokens in the term vectors, for use in document scoring. | false
useDocValuesAsStored | When the Field has docValues enabled and this property is true, the Field is returned when "*" is used in the fl search parameter, even if stored="false". | true
large | When true, large Field values are loaded lazily and kept out of the document cache unless they are small, which can improve performance. Requires stored="true" and multiValued="false". | false
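Two of these properties might be combined in a schema as sketched below. The Field names author_sort and body are hypothetical, and the string and text_general types are assumed to be defined in the same schema.

```xml
<!-- Hypothetical Fields; the types are assumed to exist in your schema -->

<!-- Documents with no author_sort value appear after those that have one -->
<field name="author_sort" type="string" indexed="true" stored="false" sortMissingLast="true"/>

<!-- Term vectors with positions and offsets, at the cost of a larger index -->
<field name="body" type="text_general" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```

Since each term vector option increases index size, it is worth enabling them only on Fields that need them.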

4. Examples of Common Solr Field Properties

Example 1 - Set up a Field as a unique key

In this case, a Field is given two required properties that make it suitable as a unique key.

<field name="id" type="string" indexed="true" multiValued="false"/>
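
Declaring the Field is only part of the setup, because the schema also names its unique key Field in a uniqueKey element. A minimal sketch, assuming the key Field is called id as above; the stored and required properties are additions for illustration and are not part of the example above.

```xml
<!-- The Field that will hold the unique key; stored and required are illustrative additions -->
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>

<!-- Tells Solr which Field uniquely identifies a document -->
<uniqueKey>id</uniqueKey>
```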

Example 2 - Create an indexed and searchable Field

In this case, a Field is included in the index, and searches can be performed within the Field.

<field name="description" type="text_general" indexed="true"/>

Example 3 - A Field that can be retrieved in queries

In this case, we set up a Field that can be returned in queries.

<field name="description" type="text_general" stored="true"/>

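Alternatively, for a Field Type that supports docValues, a Field with docValues="true" can be returned in queries even when stored="false" (see useDocValuesAsStored in the table above).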
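A sketch of the docValues alternative, using a hypothetical category Field and assuming string is a StrField type that supports docValues:

```xml
<!-- Hypothetical Field: not stored, but retrievable from its docValues -->
<field name="category" type="string" indexed="true" stored="false"
       docValues="true" useDocValuesAsStored="true"/>
```

Because useDocValuesAsStored defaults to true, writing it out here only makes the intent explicit.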

Example 4 - A Field used to perform sorting

In the following case a Field can be used to sort documents.

<field name="rank" type="string" indexed="true" multiValued="false"/>

For integer and floating point Field Types, it is advised to use docValues="true" and omitNorms="true".
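For a numeric sort Field, that advice might look like the sketch below. The price Field is hypothetical, and pfloat is assumed to be a floating point Field Type defined in the schema, as it is in the _default configset.

```xml
<!-- Hypothetical numeric Field set up for sorting -->
<field name="price" type="pfloat" indexed="true" stored="true"
       docValues="true" omitNorms="true" multiValued="false"/>
```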

Example 5 - To perform highlighting on a Field

In the following case Fields can be returned with highlighting.

<field name="_text_" type="text_general" indexed="true" stored="true" termVectors="true" termPositions="true"/>

Here the Field must use a Field Type with a tokenizer. Also, termVectors is not required for highlighting, but it must be set to true for termPositions to be used.

Example 6 - To perform Field faceting

In the following case a Field can be used for Field faceting.

<field name="state" type="string" indexed="true"/>

Using docValues="true" is advised for faceting, but it is not required.





 
 