Apache SOLR Search – Find Everything

Recently, I had a chance to build a search application for one of our clients. The client had an existing Google Search Appliance integrated with multiple Plone CMS sites. The Google Search Appliance has been discontinued, and they were looking for a replacement for their search application. UDig recommended Apache SOLR. We were able to leverage a single search schema for multiple sites and consolidate their search into a single-page application that uses AJAX Solr to provide a comprehensive search for end users. We found that the open source community has embraced SOLR as a search solution, and as a result there is a Plone plugin that we used to provide near real-time indexing of new documents as they are added to the Plone CMS. Also, using the rich set of APIs available in SOLR, we indexed the existing content, no matter how old, and made it searchable. The client also wanted the ability to define custom rank orders for documents that they considered highly relevant. With SOLR, we easily changed the ranking order based on the criteria supplied by the client.

What is Apache SOLR?

Apache SOLR is an enterprise search platform built on Apache Lucene. Lucene is a search engine library packaged as a set of JAR files. SOLR takes the Lucene API and builds features on top of it to make that API available through a web server, which makes building a search application much easier. SOLR is described as “highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. SOLR powers the search and navigation features of many of the world’s largest internet sites” (http://lucene.apache.org/solr/). This standalone search server provides multiple mechanisms to get your data into SOLR and a query language to retrieve your documents from it.

So, how do you begin?

Well, first download and install SOLR using Apache’s easy-to-follow quickstart. Once you have a working SOLR demo site, you can begin to figure out how to use it in your environment. Let’s say you have a web site and you want to index the web pages for a custom search. You can use Apache Nutch, a mature web crawler that integrates directly with SOLR. How about a .NET Forms application? Well, there are APIs for that too. SolrNet is a .NET client for SOLR. Maybe you have a Java application – SolrJ to the rescue! How about that file system that has hundreds of documents? Are you constantly trying to find a Word document from two years ago? You can index that file system into SOLR and then search the index for that document. In our case, we used the Plone CMS SOLR plugin to index documents. The plugin supported both HTML documents and attachments such as Excel, Word and PDF files. This met our needs for indexing, and we ended up with an index that we could use to build our search application.

Building a Search Application

We chose AJAX Solr to build out the search application. AJAX Solr is a JavaScript library that can be extended to provide custom search results. This choice gives users a single search application that covers all of the different locations where data is stored. The result is a cohesive application that the users will come to rely on. We built out the search application to include some of SOLR’s wonderful features such as Faceted Search, Filtering, Query Suggestions, Spell Check and Auto-complete. We also ranked the results so that relevant information appears higher up in the search results. Let’s break down some of the search features of SOLR.

To send a query to the SOLR server, you construct a URL and send it to the server.

/solr/query?q=*:*
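As a minimal sketch, the match-all query above could be assembled in Python with only the standard library. The host and port (localhost:8983) and the helper name build_query_url are assumptions for illustration; only the /solr/query path and the q parameter come from the URL above.

```python
from urllib.parse import urlencode

# Assumed location of a local SOLR instance; adjust host and port to your setup.
SOLR_QUERY_URL = "http://localhost:8983/solr/query"


def build_query_url(query, base_url=SOLR_QUERY_URL):
    """Return a query URL with the q parameter safely URL-encoded."""
    return base_url + "?" + urlencode({"q": query})


# *:* matches every document in the index.
match_all_url = build_query_url("*:*")
print(match_all_url)
```

Note that urlencode percent-encodes characters such as * and :, so the printed URL looks different from the hand-written one, but the server decodes it back to the same query.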

Basic Search

To search for a term in a field of your index called searchableText, put the field name, a colon, and then the term in the q parameter of the search URL.
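A small sketch of what such a field search could look like, again in Python. The search term solr and the helper name field_query are hypothetical; the field name searchableText and the /solr/query path are taken from the text.

```python
from urllib.parse import urlencode


def field_query(field, term):
    """Build a field:term clause for a single-word search."""
    return f"{field}:{term}"


query = field_query("searchableText", "solr")
# Relative URL in the same shape as the /solr/query example; the host is omitted.
url = "/solr/query?" + urlencode({"q": query})
print(query)
print(url)
```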

Phrases

To search for a phrase, enclose the query in double quotes.
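For illustration, a phrase query against the searchableText field might be built like this. The helper name phrase_query and the sample phrase are assumptions; the escaping of embedded quotes is a precaution added for this sketch, not something described above.

```python
def phrase_query(field, phrase):
    """Build a field:"some phrase" clause, escaping backslashes and embedded quotes."""
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# The double quotes tell SOLR to match the words together as a phrase.
print(phrase_query("searchableText", "fast search"))
```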

Sloppy Phrase Search

A proximity query, also called a sloppy phrase query, searches for the words of a phrase within a given distance of each other. A tilde (~) followed by a number after the quoted phrase tells SOLR how far apart the words may be. The query "fast search"~1 will match both “fast search” and “fast solr search” in the searchableText field, because ~1 tells SOLR to search within 1 word of our search phrase.
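A hedged sketch of building that proximity query in Python follows. The helper name proximity_query and its validation are assumptions for illustration; the phrase, the field name, and the slop value of 1 come from the paragraph above.

```python
def proximity_query(field, phrase, slop):
    """Build a field:"phrase"~slop clause; slop is the allowed distance in words."""
    if not isinstance(slop, int) or slop < 0:
        raise ValueError("slop must be a non-negative integer")
    return f'{field}:"{phrase}"~{slop}'


# Allows one extra word between "fast" and "search".
print(proximity_query("searchableText", "fast search", 1))
```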

Boost Queries

Any query clause can be boosted with the ^ operator. The boost is multiplied into the normal score for the clause and will affect its importance relative to other clauses. For example, boosting the term “UDig” by 10 (UDig^10) in the searchableText field multiplies the score of that clause by 10, so documents containing “UDig” rank higher in the results than documents whose searchableText field contains only the word “blog”.
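The boost described above could be expressed as a query string along these lines. The helper names boost and any_of are hypothetical, and joining the two clauses with an explicit OR is an assumption about how the original query was combined.

```python
def boost(clause, factor):
    """Append a ^factor boost to a query clause."""
    return f"{clause}^{factor}"


def any_of(*clauses):
    """Join clauses with OR so a document may match any one of them."""
    return " OR ".join(clauses)


# Documents matching "UDig" get that clause's score multiplied by 10.
query = any_of(boost("searchableText:UDig", 10), "searchableText:blog")
print(query)
```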

Range Queries

A range query selects documents with values between a specified lower and upper bound. Range queries work on numeric fields, date fields, and even string and text fields.

  • Square brackets [ ] denote an inclusive range query that matches values including the upper and lower bound.
  • Curly brackets { } denote an exclusive range query that matches values between the upper and lower bounds, but excluding the upper and lower bounds themselves.
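To make the two bracket styles concrete, here is a minimal sketch that builds both forms. The field name pageCount, the bounds, and the helper name range_query are hypothetical and chosen only for illustration.

```python
def range_query(field, lower, upper, inclusive=True):
    """Build a range clause: [lower TO upper] includes the bounds, {lower TO upper} excludes them."""
    open_bracket, close_bracket = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{open_bracket}{lower} TO {upper}{close_bracket}"


# Matches values from 10 through 50, including 10 and 50.
print(range_query("pageCount", 10, 50))
# Matches values strictly between 10 and 50.
print(range_query("pageCount", 10, 50, inclusive=False))
```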

There are many, many ways to slice and dice your search index. SOLR has a very rich API that can be used to give users the best search results possible. From internal sites and databases to externally facing websites, search helps users find everything. In fact, our recent project actually helped users save lives.
