Most companies know the value of a smooth user experience on their website. But what about their onsite search? Simply shoving Ye Olde Search Box in the upper right corner doesn’t cut it anymore, and bad search can mean bad news for your online presence:
- 79% of people who don’t like what they find will jump ship and search for another site (Google).
- Only 15% of brands dedicate resources to optimizing their site search experience (Econsultancy).
- 30% of visitors want to use a website’s search function – and when they do, they are twice as likely to convert (Moz).
This extends even further to the search applications inside an organization, like enterprise search, research portals, and knowledge management systems. Many teams pour resources into getting the user experience right: the user interactions and the color palette. But what about the quality of the search results themselves?
Automate Iterations With Machine Learning
Smart search teams iterate on their algorithms so that relevance and ranking are continuously refined and improved. But what if you could automate this process with machine learning? Developers turn to many methods and techniques in their continuous pursuit of the best relevance and ranking, and one popular approach is called Learning-to-Rank, or LTR.

LTR is a powerful technique that uses supervised machine learning to train a model to find “relative order.” “Supervised” in this case means having humans manually grade the results for each query in the training data set, then using that data sample to teach the system to reorder a new set of results.
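To make that concrete, here’s a minimal sketch of the “pairwise” flavor of LTR (one of several variants; see the resources at the end of this post). It assumes scikit-learn, and the feature vectors and relevance grades are invented for illustration:

```python
# Minimal pairwise LTR sketch: learn from "doc A should outrank doc B" pairs.
# The feature values and relevance grades below are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

# One row of features per document for a given query,
# e.g. [BM25 score, title match, freshness].
X = np.array([
    [2.1, 1.0, 0.3],  # doc a, graded 3 (very relevant) by a human
    [1.7, 0.0, 0.9],  # doc b, graded 1
    [0.4, 0.0, 0.1],  # doc c, graded 0
])
grades = np.array([3, 1, 0])

# Turn graded docs into pairwise examples: the difference of two feature
# vectors, labeled 1 if the first doc should rank above the second.
pairs, labels = [], []
for i in range(len(X)):
    for j in range(len(X)):
        if grades[i] != grades[j]:
            pairs.append(X[i] - X[j])
            labels.append(1 if grades[i] > grades[j] else 0)

model = LogisticRegression().fit(np.array(pairs), np.array(labels))

# At query time, score unseen documents with the learned weights and sort;
# only the resulting order matters, not the absolute scores.
def rank(docs):
    scores = docs @ model.coef_[0]
    return np.argsort(-scores)  # doc indices, best first
```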
Popular search engines have started bringing this functionality into their feature sets so developers can put this powerful technique to work in their search and discovery applications.
With this year’s Activate putting an increased focus on search, AI, and related machine learning technologies, two sessions are devoted specifically to using LTR with Apache Solr deployments. To help you get the most out of these sessions, we’ve put together a primer on LTR so you and your colleagues show up in Montreal ready to learn.
But first, some background.
How LTR Differs From Other ML Techniques
Traditional ML solutions focus on predicting or detecting a specific instance or event, producing a binary yes/no flag or a numeric score to drive a decision. Think of use cases like fraud detection, email spam filtering, or anomaly identification: something is either flagged or it isn’t.
LTR goes beyond scoring one item at a time to examining and ranking a whole set of items for optimal relevance. The items in the result set still receive scores, but the final ordering and ranking matter more than the numerical score of any individual item.
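One way to see why ordering trumps raw scores: the metric most often used to judge an LTR model, NDCG (Normalized Discounted Cumulative Gain), depends only on the order in which graded items appear. A quick sketch:

```python
# Sketch of NDCG, a standard ranking metric: it rewards putting highly
# relevant items near the top, and rescaling every score leaves it
# unchanged because only the resulting order matters.
import math

def dcg(grades):
    return sum((2**g - 1) / math.log2(i + 2) for i, g in enumerate(grades))

def ndcg(grades):
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0

# Relevance grades of results in the order the engine returned them:
print(ndcg([3, 2, 0, 1]))  # ~0.99: near-ideal ordering
print(ndcg([0, 1, 2, 3]))  # ~0.55: same items, worst-first ordering
```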
How LTR Knows How to Rank Things
The LTR approach requires an example of how items should ideally be ranked. This is often a set of results that has been manually curated by subject matter experts (again, supervised learning). It relies on well-labeled training data and, of course, human experts.
The ideal set of ranked data is called “ground truth” and becomes the data set the system “trains” on to learn how best to rank automatically. This method is ideal for domains that demand precision, such as academic or scientific data.
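In practice, the ground truth usually boils down to a judgment list of (query, document, grade) triples. A hypothetical example of what the experts might produce:

```python
# Hypothetical "ground truth" judgment list: subject matter experts assign
# each candidate document a relevance grade (here 0-3) for each query.
judgments = [
    # (query,          doc_id,     grade)
    ("mortgage rates", "doc_1042", 3),  # perfect answer
    ("mortgage rates", "doc_0007", 1),  # marginally relevant
    ("mortgage rates", "doc_3311", 0),  # not relevant
    ("bond yields",    "doc_0881", 2),
    ("bond yields",    "doc_1042", 0),  # relevance is always per-query
]
```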
A second way to create an ideal set of training data is to aggregate user behavior such as likes, clicks, views, and other signals. This is a far more scalable and efficient approach.
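As a rough sketch of that idea (the log records and thresholds here are invented), you can aggregate raw click logs into per-query, per-document click-through rates and bucket those into the same kind of grades an expert would assign:

```python
# Derive training labels from behavior instead of manual judgments:
# aggregate click logs into click-through rates, then bucket into grades.
# The log entries and CTR thresholds are illustrative assumptions.
from collections import defaultdict

impressions = defaultdict(int)
clicks = defaultdict(int)

# Each log entry: (query, doc_id, was_clicked)
log = [
    ("mortgage rates", "doc_1042", True),
    ("mortgage rates", "doc_1042", True),
    ("mortgage rates", "doc_0007", False),
    ("mortgage rates", "doc_1042", False),
]

for query, doc, clicked in log:
    impressions[(query, doc)] += 1
    if clicked:
        clicks[(query, doc)] += 1

def grade(key):
    ctr = clicks[key] / impressions[key]
    return 3 if ctr > 0.5 else 2 if ctr > 0.2 else 1 if ctr > 0.05 else 0

labels = {key: grade(key) for key in impressions}
# {('mortgage rates', 'doc_1042'): 3, ('mortgage rates', 'doc_0007'): 0}
```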
LTR With Apache Solr
With version 6.4, Apache Solr introduced LTR as part of its libraries and API-level building blocks. But the reference documentation might only make sense to a seasoned search engineer.
Solr’s LTR component does not actually do the training on any models; it is left to your team to build a model training pipeline from scratch. And figuring out how all these bits and pieces come together to form an end-to-end LTR solution isn’t straightforward if you haven’t done it before.
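For orientation, here’s a rough sketch of how the pieces fit, loosely following the examples in the Solr reference documentation: you upload feature definitions and an externally trained model to Solr’s feature and model stores, then rerank at query time with the ltr query parser. The collection name, feature definitions, and weights below are hypothetical; the Java class names come from the Solr LTR module.

```python
# Sketch of wiring up Solr's LTR plugin (Solr 6.4+). Solr does no training
# itself: the LinearModel weights here would come from your own offline
# training pipeline. Collection, features, and weights are hypothetical.
import json
import urllib.parse
import urllib.request

SOLR = "http://localhost:8983/solr/mycollection"

features = [
    {"name": "originalScore",
     "class": "org.apache.solr.ltr.feature.OriginalScoreFeature",
     "params": {}},
    {"name": "titleMatch",
     "class": "org.apache.solr.ltr.feature.SolrFeature",
     "params": {"q": "{!field f=title}${text}"}},
]

model = {
    "name": "myModel",
    "class": "org.apache.solr.ltr.model.LinearModel",
    "features": [{"name": "originalScore"}, {"name": "titleMatch"}],
    "params": {"weights": {"originalScore": 1.0, "titleMatch": 2.0}},
}

# Upload the feature definitions, then the trained model.
for path, payload in [("/schema/feature-store", features),
                      ("/schema/model-store", model)]:
    req = urllib.request.Request(
        SOLR + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)

# Query time: retrieve as usual, then rerank the top 100 hits with the model.
params = urllib.parse.urlencode({
    "q": "mortgage rates",
    "rq": "{!ltr model=myModel reRankDocs=100 efi.text='mortgage rates'}",
})
print(urllib.request.urlopen(f"{SOLR}/select?{params}").read())
```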
So let’s turn to the experts.
Live Case Study: Bloomberg
Financial information services giant Bloomberg runs one of the largest Solr deployments on the planet and is always looking for ways to increase and optimize relevancy while maintaining split-second query response times for millions of financial professionals and investors.
In their quest to continuously improve result ranking and the user experience, Bloomberg turned to LTR and literally developed, built, tested, and committed the LTR component that sits inside the Solr codebase.
Those engineers from Bloomberg will be onstage at the Activate conference in Montreal this October to talk about LTR. They’ll discuss their architecture and challenges in scaling and how they developed a plugin that made Apache Solr the first open source search engine that can perform LTR operations out of the box.
You’ll hear the full war story of how Bloomberg trained its real-time, low-latency news search engine with LTR and how your team can do it, too – along with the many ways not to do it.
Full details on this session at Activate 2018: Learning to Rank: From Theory to Production
Live Demo: Practical End-to-End Learning to Rank Using Fusion
Also at Activate 2018, Lucidworks Senior Data Engineer Andy Liu will present a three-part demonstration of how to set up, configure, and train a simple LTR model using both Fusion and Solr.
Liu will demonstrate how to include more complex features and show improvements in model accuracy through the kind of iterative workflow that is typical of data science. Particular emphasis will be given to best practices around utilizing time-sensitive, user-generated signals.
The session will also explore some of the tradeoffs between engineering and data science, as well as Solr querying/indexing strategies (sidecar indexes, payloads) to effectively deploy a model that is both production-grade and accurate.
Full details on this session at Activate 2018: Practical End-to-End Learning to Rank Using Fusion
So that’s a brief overview of LTR in the abstract, plus where to see it in action: a real-world case study and a practical demo of implementing it yourself. Here’s even more reading to make sure you show up in Montreal ready to get the most out of these sessions:
More LTR Resources
Bloomberg’s behind the scenes look at how they developed the LTR plugin and brought it into the Apache Solr codebase
Our ebook Learning to Rank with Lucidworks Fusion on the basics of the LTR approach and how to access its power with our Fusion platform. Accompanying webinar.
An intuitive explanation of Learning to Rank by Google Engineer Nikhil Dandekar that details several popular LTR approaches including RankNet, LambdaRank, and LambdaMART
Pointwise vs. Pairwise vs. Listwise Learning to Rank also by Dandekar
A real-world example of Learning to Rank for flight itinerary search by Skyscanner engineer Neal Lathia
Learning to Rank 101 by Pere Urbon-Bayes, another intro/overview of LTR including how to implement the approach in Elasticsearch
= = =
Want to Learn More? Join us at Activate, the search and AI conference, where you can hear from these experts and more than 75 others, Oct 15-18, in Montreal.