FullTextScore() in ORDER BY clause does not take effect in Cosmos DB NoSQL full-text queries with partition key #5332

@Wizmann

Describe the bug

When using Cosmos DB for NoSQL with Full-Text Search enabled (via fullTextPolicy and fullTextIndexes), the ORDER BY RANK FullTextScore(c.text, 'keyword') clause appears to have no effect on query results when the query is scoped to a single partition key. The order of the results does not change even when the relevance of the documents clearly differs, and changing the keyword produces the same order.

(Python SDK has the same issue, ref)

To Reproduce

Steps to reproduce the behavior:

  1. Cosmos DB account with Full Text Search enabled (on CentralUS)
  2. Container created with:
full_text_policy = {
    "defaultLanguage": "en-US",
    "fullTextPaths": [{"path": "/text", "language": "en-US"}]
}
indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [{"path": "/\"_etag\"/?"}],
    "fullTextIndexes": [{"path": "/text"}]
}

and Partition key: /userid

  3. Query used:
var query = $"SELECT * FROM c WHERE c.userid='{userid}' ORDER BY RANK FullTextScore(c.text, '{keyword}')";
var iterator = container.GetItemQueryIterator<dynamic>(
    query,
    requestOptions: new QueryRequestOptions { PartitionKey = new PartitionKey(userid) }
);
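
Since the Python SDK is reported to show the same behavior, the query above can be sketched in Python as well. This is a minimal sketch, not a verified repro: it assumes `container` is an `azure.cosmos` `ContainerProxy` and relies on its `query_items` method with the `partition_key` keyword. The helper names `build_query` and `full_text_search` are hypothetical and only used for illustration.

```python
def build_query(userid, keyword):
    # Mirrors the C# string interpolation in the repro: both values are
    # inlined into the query text as string literals.
    return (
        f"SELECT * FROM c WHERE c.userid='{userid}' "
        f"ORDER BY RANK FullTextScore(c.text, '{keyword}')"
    )


def full_text_search(container, userid, keyword):
    # Assumption: `container` is an azure.cosmos ContainerProxy (or any
    # object exposing a compatible query_items method).
    items = container.query_items(
        query=build_query(userid, keyword),
        partition_key=userid,  # scope the query to one logical partition
    )
    # Return only the text field, in the order the service returned it.
    return [item["text"] for item in items]
```

Calling `full_text_search(container, "cat", "fence")` against the container described above would be expected to list "The cat jumped over the fence." first if ranking were applied.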

Expected behavior
The query:

SELECT TOP 5 * FROM c 
ORDER BY RANK FullTextScore(c.text, 'fence')

should return the documents where c.text is most relevant to 'fence' first. In this example, the sentence:

"The cat jumped over the fence."

should be ranked highest.
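
To make the expected ordering concrete, here is a small local sketch that scores the five "cat" sentences from the sample data with a simplified single-term BM25 formula. It is only an illustration of relevance ranking: the tokenizer, the parameters `k1` and `b`, and the function names are assumptions, and the scores will not match whatever the service computes internally.

```python
import math
import re

# The five "cat" sentences from the sample data in this issue.
DOCS = [
    "The cat is sleeping on the couch.",
    "A black cat crossed the road.",
    "Cats are curious animals.",
    "I have a cat named Whiskers.",
    "The cat jumped over the fence.",
]


def tokenize(text):
    # Assumption: lowercase alphanumeric word split; the service's
    # analyzer may tokenize differently.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(docs, keyword, k1=1.2, b=0.75):
    # Simplified BM25 for a single query term.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    term = keyword.lower()
    df = sum(1 for t in tokenized if term in t)  # document frequency
    idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
    scores = []
    for tokens in tokenized:
        tf = tokens.count(term)  # term frequency in this document
        denom = tf + k1 * (1 - b + b * len(tokens) / avg_len)
        scores.append(idf * tf * (k1 + 1) / denom)
    return scores


def rank(docs, keyword):
    # Highest score first; documents with equal scores keep input order.
    scored = zip(docs, bm25_scores(docs, keyword))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank(DOCS, "fence")
for text, score in ranked:
    print(f"{score:.3f}  {text}")
```

Only one sentence contains "fence", so it is the only one with a non-zero score and sorts to the top; the other four tie at zero.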

Actual behavior

The returned documents are not ranked by relevance as expected.

Environment summary
SDK Version: Microsoft.Azure.Cosmos 3.52.1
OS Version: Linux

Additional context

Sample code to reproduce the problem:

using System;
using System.Threading.Tasks;
using System.Collections.Generic;
using Azure.Identity;
using Microsoft.Azure.Cosmos;

class Program
{
    private static readonly string endpoint = "https://<ENDPOINT>.documents.azure.com:443/";
    private static readonly string databaseId = "<DATABASE>";
    private static readonly string containerId = "dogcat";
    private static CosmosClient cosmosClient;
    private static Container container;
    private static Database database;

    static async Task Main(string[] args)
    {
        cosmosClient = new CosmosClient(endpoint, new DefaultAzureCredential());

        database = await cosmosClient.CreateDatabaseIfNotExistsAsync(databaseId);

        var containerProperties = new ContainerProperties(containerId, "/userid")
        {
            IndexingPolicy = new IndexingPolicy
            {
                IndexingMode = IndexingMode.Consistent,
                Automatic = true,
                IncludedPaths = { new IncludedPath { Path = "/*" } },
                ExcludedPaths = { new ExcludedPath { Path = "/\"_etag\"/?" } }
            }
        };

        container = await database.CreateContainerIfNotExistsAsync(containerProperties);

        await ClearContainer();
        await InsertTestData();

        await Task.Delay(2000); // Wait for indexing

        await FullTextSearch("cat", "fence");
        await FullTextSearch("dog", "ball");
    }

    private static async Task ClearContainer()
    {
        var query = "SELECT c.id, c.userid FROM c";
        var iterator = container.GetItemQueryIterator<dynamic>(query);
        while (iterator.HasMoreResults)
        {
            foreach (var item in await iterator.ReadNextAsync())
            {
                await container.DeleteItemAsync<dynamic>(item.id.ToString(), new PartitionKey(item.userid.ToString()));
            }
        }
    }

    private static async Task InsertTestData()
    {
        var catSentences = new List<string>
        {
            "The cat is sleeping on the couch.",
            "A black cat crossed the road.",
            "Cats are curious animals.",
            "I have a cat named Whiskers.",
            "The cat jumped over the fence."
        };

        var dogSentences = new List<string>
        {
            "The dog barked loudly last night.",
            "Dogs love going for walks.",
            "A golden retriever is a friendly dog.",
            "My dog plays fetch every day.",
            "The dog chased the ball into the yard."
        };

        foreach (var text in catSentences)
        {
            await container.CreateItemAsync(new { id = Guid.NewGuid().ToString(), userid = "cat", text });
        }

        foreach (var text in dogSentences)
        {
            await container.CreateItemAsync(new { id = Guid.NewGuid().ToString(), userid = "dog", text });
        }
    }

    private static async Task FullTextSearch(string userid, string keyword)
    {
        Console.WriteLine($"\nTop results for userid='{userid}' with keyword='{keyword}':");

        var query = $"SELECT * FROM c WHERE c.userid='{userid}' ORDER BY RANK FullTextScore(c.text, '{keyword}')";
        var iterator = container.GetItemQueryIterator<dynamic>(
            query,
            requestOptions: new QueryRequestOptions { PartitionKey = new PartitionKey(userid) }
        );

        while (iterator.HasMoreResults)
        {
            foreach (var item in await iterator.ReadNextAsync())
            {
                Console.WriteLine($"- [{item.userid}] {item.text}");
            }
        }
    }
}
