Get smarter search results with the Amazon Kendra Intelligent Ranking and OpenSearch plugin
If you’ve had the opportunity to build a search application for unstructured data (e.g., wikis, informational websites, self-service help pages, internal documentation) using open source or commercial off-the-shelf search engines, then you’re probably familiar with the inherent accuracy challenges involved in getting relevant search results. The intended meaning of both query and document can be lost because the search is reduced to matching component keywords and terms. Consequently, while you get results that may contain the right words, they aren’t always relevant to the user. You need your search engine to be smarter, so it can rank documents based on matching the meaning of the content to the intent of the user’s query.
Amazon Kendra provides a fully managed intelligent search service that automates document ingestion and provides highly accurate search and FAQ results based on content across many data sources. If you haven’t migrated to Amazon Kendra and would like to improve the quality of search results, you can use Amazon Kendra Intelligent Ranking for self-managed OpenSearch on your existing search solution.
We’re delighted to introduce the new Amazon Kendra Intelligent Ranking for self-managed OpenSearch, and its companion plugin for the OpenSearch search engine! Now you can easily add intelligent ranking to your OpenSearch document queries, with no need to migrate, duplicate your OpenSearch indexes, or rewrite your applications. The difference between Amazon Kendra Intelligent Ranking for self-managed OpenSearch and the fully managed Amazon Kendra service is that while the former provides powerful semantic re-ranking of search results, the latter provides additional search accuracy improvements and functionality such as incremental learning, question answering, FAQ matching, and built-in connectors. For more information about the fully managed service, visit the Amazon Kendra service page.
With Amazon Kendra Intelligent Ranking for self-managed OpenSearch, results that previously looked like this:
Query: What is the address of the White House?
Hit1 (best): The president delivered an address to the nation from the White House today.
Hit2: The White House is located at: 1600 Pennsylvania Avenue NW, Washington, DC 20500
now look like this:
Query: What is the address of the White House?
Hit1 (best): The White House is located at: 1600 Pennsylvania Avenue NW, Washington, DC 20500
Hit2: The president delivered an address to the nation from the White House today.
In this post, we show you how to get started with Amazon Kendra Intelligent Ranking for self-managed OpenSearch, and we provide a few examples that demonstrate the power and value of this feature.
The solution includes the following components:
Amazon Kendra Intelligent Ranking plugin for self-managed OpenSearch – This is installed along with your OpenSearch deployment and uses the Rescore function of the Amazon Kendra Intelligent Ranking API to semantically re-rank the search results.
OpenSearch Dashboards Compare Search Results Plugin – This lets you compare search results from two queries side by side, for example, where one query is a keyword search, while the other query uses Amazon Kendra Intelligent Ranking for self-managed OpenSearch.
For this tutorial, you’ll need a bash terminal on Linux, macOS, or Windows Subsystem for Linux, and an AWS account. Hint: consider using an AWS Cloud9 instance or an Amazon Elastic Compute Cloud (Amazon EC2) instance.
You will:
Install Docker, if it’s not already installed on your system.
If you don’t already have the latest version of the AWS CLI installed, then install and configure it now (see AWS CLI Getting Started). Your default AWS user credentials must have administrator access, or ask your AWS administrator to add the following policy to your user permissions:
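The policy itself isn’t reproduced here; a minimal sketch that grants all Amazon Kendra Intelligent Ranking actions (broader than necessary for production, so scope it down as needed) might look like this:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kendra-ranking:*"
      ],
      "Resource": "*"
    }
  ]
}
```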
That’s it! The OpenSearch and OpenSearch Dashboards containers are now up and running.
Read the output message from the quickstart script, and make a note of the directory where you can run the handy docker-compose commands, and the cleanup_resources.sh script.
Try a test query to validate you can connect to your OpenSearch container:
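A minimal check, assuming the quickstart defaults (self-signed TLS certificate, credentials admin:admin):

```shell
# Query the root endpoint; -k accepts the quickstart's self-signed certificate.
# A healthy node answers with a JSON banner (cluster name, version, tagline).
curl -s -k -u admin:admin https://localhost:9200 || echo "OpenSearch is still starting up"
```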
Note that if you get the error curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to localhost:9200, it means that OpenSearch is still coming up. Wait a couple of minutes for OpenSearch to be ready, then try again.
{ "create" : { "_index" : "tinydocs", "_id" : "tdoc1" } }
{"title": "Whitehouse1", "body": "The White House is located at: 1600 Pennsylvania Avenue NW, Washington, DC 20500"}
{ "create" : { "_index" : "tinydocs", "_id" : "tdoc2" } }
{"title": "Whitehouse2", "body": "The president delivered an address to the nation from the White House today."}
{ "create" : { "_index" : "dstinfo", "_id" : "dst1" } }
{"title": "Daylight Saving Time", "body": "Daylight saving time begins on the second Sunday in March at 2 a.m., and clocks are set an hour ahead, according to the Farmers’ Almanac. It lasts for eight months and ends on the first Sunday in November, when clocks are set back an hour at 2 a.m."}
{ "create" : { "_index" : "dstinfo", "_id" : "dst2" } }
{"title":"History of daylight saving time", "body": "Founding Father Benjamin Franklin is often deemed the brain behind daylight saving time after a letter he wrote in 1784 to a Parisian newspaper, according to the Farmers’ Almanac. But Franklin’s letter suggested people simply change their routines and schedules — not the clocks — to the sun’s cycles. Perhaps surprisingly, daylight saving time had a soft rollout in the United States in 1883 to solve issues with railroad accidents, according to the U.S. Bureau of Transportation Services. It was instituted across the United States in 1918, according to the Congressional Research Service. In 2005, Congress changed it to span from March to November instead of its original timeframe of April to October."}
{ "create" : { "_index" : "dstinfo", "_id" : "dst3" } }
{"title": "Daylight saving time participants", "body":"The United States is one of more than 70 countries that follow some form of daylight saving time, according to World Data. States can individually decide whether or not to follow it, according to the Farmers’ Almanac. Arizona and Hawaii do not, nor do parts of northeastern British Columbia in Canada. Puerto Rico and the Virgin Islands, both U.S. territories, also don’t follow daylight saving time, according to the Congressional Research Service."}
{ "create" : { "_index" : "dstinfo", "_id" : "dst4" } }
{"title":"Benefits of daylight saving time", "body":"Those in favor of daylight saving time, whether eight months long or permanent, also vouch that it increases tourism in places such as parks or other public attractions, according to National Geographic. The longer days can keep more people outdoors later in the day."}
Make the script executable:
chmod +x ./bulk_post.sh
Now use the bulk_post.sh script to create indexes and load the data by running the two commands below:
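For example, assuming the JSON lines above were saved as tinydocs.jsonl and dstinfo.jsonl (hypothetical file names; use whatever names you saved them under):

```shell
./bulk_post.sh tinydocs.jsonl
./bulk_post.sh dstinfo.jsonl
```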
OpenSearch queries are defined in JSON using the OpenSearch query domain-specific language (DSL). For this post, we use the Linux curl command to send queries to our local OpenSearch server over HTTPS.
To make this easy, we’ve defined two small scripts to construct our query DSL and send it to OpenSearch.
The first script creates a regular OpenSearch text match query on two document fields – title and body. See OpenSearch documentation for more on the multi-match query syntax. We’ve kept the query very simple, but you can experiment later with defining alternate types of queries.
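For illustration, the DSL the first script sends looks something like the sketch below, with the query text filled in (the script substitutes its second command-line argument):

```json
{
  "size": 20,
  "query": {
    "multi_match": {
      "query": "what is the address of White House",
      "fields": [ "title", "body" ]
    }
  }
}
```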
The second script is similar to the first one, but this time we add a query extension to instruct OpenSearch to invoke the Amazon Kendra Intelligent Ranking plugin as a post-processing step to re-rank the original results using the Amazon Kendra Intelligent Ranking service.
The size property determines how many OpenSearch result documents are sent to Kendra for re-ranking. Here, we specify a maximum of 20 results for re-ranking. Two properties, title_field (optional) and body_field (required), specify the document fields used for intelligent ranking.
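Putting the pieces together, the second script’s DSL looks roughly like the sketch below. The ext block here follows the plugin’s search_configuration/result_transformer convention; verify the exact syntax against your installed plugin version:

```json
{
  "size": 20,
  "query": {
    "multi_match": {
      "query": "what is the address of White House",
      "fields": [ "title", "body" ]
    }
  },
  "ext": {
    "search_configuration": {
      "result_transformer": {
        "kendra_intelligent_ranking": {
          "order": 1,
          "properties": {
            "title_field": "title",
            "body_field": "body"
          }
        }
      }
    }
  }
}
```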
Start with a simple query on the tinydocs index, to reproduce the example used in the post introduction.
Use the query_nokendra.sh script to search for the address of the White House:
./query_nokendra.sh tinydocs "what is the address of White House"
You see the results shown below. Observe the order of the two results, which are ranked by the score assigned by the OpenSearch text match query. Although the top scoring result does contain the keywords address and White House, it’s clear the meaning doesn’t match the intent of the question. The keywords match, but the semantics do not.
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.1619741,
"hits" : [
{
"_index" : "tinydocs",
"_id" : "tdoc2",
"_score" : 1.1619741,
"_source" : {
"title" : "Whitehouse2",
"body" : "The president delivered an address to the nation from the White House today."
}
},
{
"_index" : "tinydocs",
"_id" : "tdoc1",
"_score" : 1.0577903,
"_source" : {
"title" : "Whitehouse1",
"body" : "The White House is located at: 1600 Pennsylvania Avenue NW, Washington, DC 20500"
}
}
]
}
}
Now let’s run the query with Amazon Kendra Intelligent Ranking, using the query_kendra.sh script:
./query_kendra.sh tinydocs "what is the address of White House"
This time, you see the results in a different order as shown below. The Amazon Kendra Intelligent Ranking service has re-assigned the score values, and assigned a higher score to the document that more closely matches the intention of the query. From a keyword perspective, this is a poorer match because it doesn’t contain the word address; however, from a semantic perspective it’s the better response. Now you see the benefit of using the Amazon Kendra Intelligent Ranking plugin!
{
"took" : 522,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.3798389,
"hits" : [
{
"_index" : "tinydocs",
"_id" : "tdoc1",
"_score" : 0.3798389,
"_source" : {
"title" : "Whitehouse1",
"body" : "The White House is located at: 1600 Pennsylvania Avenue NW, Washington, DC 20500"
}
},
{
"_index" : "tinydocs",
"_id" : "tdoc2",
"_score" : 0.25906953,
"_source" : {
"title" : "Whitehouse2",
"body" : "The president delivered an address to the nation from the White House today."
}
}
]
}
}
Try the dstinfo index now to see how the same concept works with different data and queries. While you can use the query_nokendra.sh and query_kendra.sh scripts to make queries from the command line, let’s instead use the OpenSearch Dashboards Compare Search Results plugin to run queries and compare search results.
Paste the local Dashboards URL http://localhost:5601/app/searchRelevance into your browser to access the dashboard comparison tool. Use the default credentials: username admin, password admin.
In the search bar, enter: what is daylight saving time?
For the Query 1 and Query 2 index, select dstinfo.
Copy the DSL query below and paste it in the Query panel under Query 1. This is a keyword search query.
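A keyword match query for this tool looks like the following sketch; %SearchText% is the placeholder the Compare Search Results tool replaces with the text in the search bar:

```json
{
  "query": {
    "multi_match": {
      "query": "%SearchText%",
      "fields": [ "title", "body" ]
    }
  }
}
```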
Now copy the DSL query below and paste it in the Query panel under Query 2. This query invokes the Amazon Kendra Intelligent Ranking plugin for self-managed OpenSearch to perform semantic re-ranking of the search results.
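A sketch of the corresponding query with the intelligent ranking extension added (as with the earlier scripts, verify the ext syntax against your plugin version):

```json
{
  "query": {
    "multi_match": {
      "query": "%SearchText%",
      "fields": [ "title", "body" ]
    }
  },
  "ext": {
    "search_configuration": {
      "result_transformer": {
        "kendra_intelligent_ranking": {
          "order": 1,
          "properties": {
            "title_field": "title",
            "body_field": "body"
          }
        }
      }
    }
  }
}
```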
Choose the Search button to run the queries and observe the search results. In Result 1, the hit ranked last is actually the most relevant response to this query. In Result 2, the output from Amazon Kendra Intelligent Ranking has the most relevant answer correctly ranked first.
Now that you have experienced Amazon Kendra Intelligent Ranking for self-managed OpenSearch, experiment with a few queries of your own. Use the data we have already loaded or use the bulk_post.sh script to load your own data.
As you’ve seen from this post, the Amazon Kendra Intelligent Ranking plugin for OpenSearch can be conveniently used for semantic re-ranking of your search results. However, if you use a search service that doesn’t support the Amazon Kendra Intelligent Ranking plugin for self-managed OpenSearch, then you can use the Rescore function from the Amazon Kendra Intelligent Ranking API directly.
Try this API using the search results from the example query we used above: what is the address of the White House?
First, find your Execution Plan Id by running:
aws kendra-ranking list-rescore-execution-plans
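If you have only one plan, a sketch for pulling the Id directly (assuming the response lists plans under SummaryItems, and that your AWS credentials are configured):

```shell
aws kendra-ranking list-rescore-execution-plans \
  --query 'SummaryItems[0].Id' --output text
```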
The JSON below contains the search query, and the two results that were returned by the original OpenSearch match query, with their original OpenSearch scores. Replace {kendra-execution-plan_id} with your Execution Plan Id (from above) and save it as rescore_input.json:
{
"RescoreExecutionPlanId": "{kendra-execution-plan_id}",
"SearchQuery": "what is the address of White House",
"Documents": [
{ "Id": "tdoc2", "Title": "Whitehouse2", "Body": "The president delivered an address to the nation from the White House today.", "OriginalScore": 1.1619741 },
{ "Id": "tdoc1", "Title": "Whitehouse1", "Body": "The White House is located at: 1600 Pennsylvania Avenue NW, Washington, DC 20500", "OriginalScore": 1.0577903 }
]
}
Run the CLI command below to re-score this list of documents using the Amazon Kendra Intelligent Ranking service:
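A sketch of that command, assuming the keys in rescore_input.json match the Rescore API’s parameter names so the file can be passed with --cli-input-json:

```shell
aws kendra-ranking rescore --cli-input-json file://rescore_input.json
```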
As expected, the document containing the text body “The White House is located at: 1600 Pennsylvania Avenue NW, Washington, DC 20500” now has the higher ranking, as it’s the semantically more relevant response for the query. The ResultItems list in the output contains each input DocumentId with its new Score, ranked in descending order of Score.
When you’re done experimenting, shut down and remove your Docker containers and Rescore Execution Plan by running the cleanup_resources.sh script created by the quickstart script.
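From the directory the quickstart script reported earlier, the cleanup is one command:

```shell
./cleanup_resources.sh
```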
In this post, we showed you how to use the Amazon Kendra Intelligent Ranking plugin for self-managed OpenSearch to add intelligent ranking to your OpenSearch document queries, dramatically improving the relevance of search results while keeping your existing OpenSearch deployment.
Read the Amazon Kendra Intelligent Ranking for self-managed OpenSearch documentation to learn more about this feature, and start planning to apply it in your production applications.
Abhinav Jawadekar is a Principal Solutions Architect focused on Amazon Kendra in the AI/ML language services team at AWS. Abhinav works with AWS customers and partners to help them build intelligent search solutions on AWS.
Bob Strahan is a Principal Solutions Architect in the AWS Language AI Services team.