
[MEVD] Replace VectorStoreGenericDataModel with Dictionary<string, object?> #10802

@roji

  • VectorStoreGenericDataModel is dynamic in terms of the data and vector properties, but not in terms of the key, since it's generic (in the .NET sense) over TKey. In other words, for dynamic scenarios where e.g. a layer like kernel memory (or whatever) is layered on top, Type.MakeGenericType() would have to be used to create a closed VectorStoreGenericDataModel over a specific key type; if nothing else, that's incompatible with NativeAOT.
    • Note that this problem goes beyond just VectorStoreGenericDataModel - it affects IVectorStoreRecordCollection as well, which is also generic over the key type. We'd likely need to make it generic over object for such dynamic cases.
  • VectorStoreGenericDataModel contains a key, a dictionary of data properties and a dictionary of vector properties. The separation between these three things at the .NET type level seems unnecessary: it seems like we could allow users to simply map a Dictionary<string, object?> directly for dynamic use, without having to go through a special type.
    • IVectorStoreRecordCollection already knows all the modeling information internally (i.e. which key/data/vector properties exist and metadata about them), so the user-facing type can simply be a flat bag of properties.
    • After all, when the user maps a static .NET type, they don't make the distinction between key/data/vector properties - they just have flat properties on their custom type. I think the same approach can work for Dictionary<string, object?>.
  • Naming-wise, "generic data model" for this feature is a bit problematic, since generic has a very well-defined meaning in .NET (generic types, generic methods). I'd like to propose "dynamic" as the name for this feature - it's basically about working with dynamic data models where there's no strongly-typed CLR type being mapped.
    • If we really wanted to, we could even look into supporting the C# dynamic language feature (link); that's generally not recommended these days (and IIRC isn't supported in expression trees, so LINQ filtering wouldn't be possible), but conceptually we'd be using Dictionary<string, object?> in the exact same way.
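
To make the points above concrete, here is a minimal sketch, not the actual MEVD API. `List<>` is only a stand-in for `VectorStoreGenericDataModel<>`, and the property names (`Key`, `Title`, `Year`, `Embedding`) and the `roles` lookup are hypothetical. The first part shows the reflection step that a type generic over TKey forces on a dynamic layer. The second part shows the proposed flat property bag, with the key/data/vector distinction kept in schema information rather than in the record's shape.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// 1) A dynamic layer only learns the key type at run time, so with a model that is
//    generic over TKey it has to close the generic type via reflection.
//    List<> is a stand-in here for VectorStoreGenericDataModel<>.
Type keyType = typeof(Guid);
Type closed = typeof(List<>).MakeGenericType(keyType); // the call that is problematic under NativeAOT
Console.WriteLine(closed);

// 2) The proposed alternative: one flat bag of properties, with the key stored as object.
var record = new Dictionary<string, object?>
{
    ["Key"] = Guid.NewGuid(),
    ["Title"] = "Dune",
    ["Year"] = 1965,
    ["Embedding"] = new ReadOnlyMemory<float>(new[] { 0.1f, 0.2f, 0.3f }),
};

// Hypothetical schema information that the collection would already hold internally:
// which property plays which role. The user-facing record does not need to encode it.
var roles = new Dictionary<string, string>
{
    ["Key"] = "key",
    ["Title"] = "data",
    ["Year"] = "data",
    ["Embedding"] = "vector",
};

foreach (var (name, value) in record)
{
    Console.WriteLine($"{roles[name]} property '{name}' has type {value?.GetType().Name ?? "null"}");
}
```

No generic instantiation depends on the key type in the second part, so nothing needs to be closed via reflection.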

Once this is done, make sure to implement LINQ filtering for this (#10468).
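
As a rough illustration of what LINQ filtering over a dynamic record could look like, the sketch below builds a filter as an expression tree over `Dictionary<string, object?>`. This is an assumption about the eventual shape, not the design from #10468. The property names are hypothetical, and the filter is compiled and run in memory only to show that it is well-formed; a real connector would translate the tree into its own query language instead.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// The filter is an ordinary expression tree; property access goes through the
// dictionary indexer plus a cast, since there is no CLR property to bind to.
Expression<Func<Dictionary<string, object?>, bool>> filter =
    r => (int)r["Year"]! > 1960 && (string?)r["Genre"] == "SciFi";

// A connector would walk this tree; here we only inspect the root node.
Console.WriteLine(filter.Body.NodeType);

// Hypothetical in-memory records used to exercise the filter.
var records = new List<Dictionary<string, object?>>
{
    new() { ["Title"] = "Dune", ["Year"] = 1965, ["Genre"] = "SciFi" },
    new() { ["Title"] = "Emma", ["Year"] = 1815, ["Genre"] = "Romance" },
};

var matches = records.Where(filter.Compile()).ToList();
foreach (var match in matches)
{
    Console.WriteLine(match["Title"]);
}
```

The indexer-plus-cast form works in expression trees, which the C# dynamic feature mentioned above would not.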

/cc @westey-m @dmytrostruk @adamsitnik
