
[Feature] Log-structured grain storage #9450


Merged
ReubenBond merged 15 commits into dotnet:main from feature/journaling/1 on Apr 23, 2025

Conversation

@ReubenBond (Member) commented Apr 19, 2025

Summary:

This pull request introduces an experimental framework for log-structured grain state persistence in Orleans, based on the replicated state machine approach. This approach focuses on appending state changes to a log rather than overwriting full state, offering potential benefits for state types with high write frequency or complex structures. A key advantage is the ability to perform atomic updates across all durable state managed by a single grain activation. The PR includes the core journaling infrastructure, several powerful built-in durable collection types and state wrappers (including IDurableValue<T>), and an initial Azure Append Blob storage provider. A volatile in-memory provider is also included for development and testing.

Background:

This directly implements the proposal outlined in issue #7691. Traditional grain persistence often involves reading and writing the entire grain state for one or more IPersistentState<T> instances. This can become inefficient for large or frequently updated state objects, and crucially, updates to multiple IPersistentState<T> instances within a single grain are not atomic with respect to each other.

The log-structured approach provides an alternative model:

  • Grain state is modeled as a composition of one or more Durable State Machines (DSMs).
  • State modifications made by the grain are captured as a sequence of log entries (commands or events) and buffered in memory.
  • Calling WriteStateAsync triggers the persistence of all pending log entries across all managed DSMs as a single, atomic unit to durable storage.
  • The grain's in-memory state is reconstructed upon activation by replaying the ordered sequence of these log entries from the log.
  • Periodically, a snapshot of the complete state across all DSMs is saved, allowing the log to be truncated for faster future recoveries.
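
Conceptually, recovery is a fold of the logged entries over the most recent snapshot. The sketch below is illustrative pseudocode only; the types and member names are invented for this example and are not the Orleans.Journaling API.

using System;
using System.Collections.Generic;

// Illustrative only: hypothetical types showing the recovery model,
// not the actual Orleans.Journaling implementation.
public interface IReplayable<TEntry>
{
    void ResetTo(ReadOnlyMemory<byte> snapshot); // load the most recent snapshot, if any
    void Apply(TEntry entry);                    // apply one logged command/event
}

public static class RecoverySketch
{
    // Recovery is a fold: start from the latest snapshot, then apply every
    // entry that was appended to the log after it, in order.
    public static void Recover<TEntry>(
        IReplayable<TEntry> state,
        ReadOnlyMemory<byte>? latestSnapshot,
        IEnumerable<TEntry> entriesAfterSnapshot)
    {
        if (latestSnapshot is { } snapshot)
        {
            state.ResetTo(snapshot);
        }

        foreach (var entry in entriesAfterSnapshot)
        {
            state.Apply(entry);
        }
    }
}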

Beyond atomic updates and potential performance benefits for certain workloads, this pattern also offers significant extensibility:

  • Library developers can define custom IDurableStateMachine implementations for specialized state requirements (a sketch follows this list).
  • The ordered, append-only log structure provides a foundation for building higher-level features such as transactions, indexing, and custom state types for Orleans features like reminders or stream checkpointing, which can then leverage this durable, atomic log.
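
To give a flavor of the first extensibility point, here is a hypothetical durable counter. The ILogWriterSketch and IEntryWriterSketch abstractions below are placeholders invented for illustration; the actual IDurableStateMachine and IStateMachineLogWriter shapes in this PR may differ.

using System;

// Hypothetical sketch only: the real interface members are not shown in this
// description, so placeholder abstractions keep the example self-contained.
public interface IEntryWriterSketch { void WriteInt64(long value); }

public interface ILogWriterSketch
{
    void AppendEntry(Action<IEntryWriterSketch> write);
}

public sealed class DurableCounterSketch(ILogWriterSketch log)
{
    private long _value;

    public long Value => _value;

    public void Increment(long amount = 1)
    {
        _value += amount;
        // The mutation is recorded as a log entry; it is buffered in memory and
        // becomes durable only when the grain calls WriteStateAsync().
        log.AppendEntry(writer => writer.WriteInt64(amount));
    }

    // Invoked during recovery for each previously logged entry, in order.
    public void Apply(long amount) => _value += amount;
}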

Implementation Details:

This PR delivers the core components required for this pattern:

  1. Orleans.Journaling Project: This is the central library defining the pattern and providing the core state management logic and built-in types.

    • Core Interfaces: Defines IDurableStateMachine, IStateMachineManager (manages multiple state machines within a grain activation), and IStateMachineStorage (the interface for interacting with durable storage).
    • StateMachineManager: Implements the core logic for log recovery (replaying entries/snapshots), applying log entries and snapshots, buffering pending writes in memory (LogExtentBuilder), and coordinating atomic persistence operations (appending new log segments or writing full snapshots) via the IStateMachineStorage. It operates asynchronously in a dedicated work loop.
    • Built-in Durable Types: Provides ready-to-use implementations of common state patterns and collections as durable state machines. These types automatically generate log entries for their modifications via the IStateMachineLogWriter interface (provided by the StateMachineManager):
      • IDurableDictionary<K, V>
      • IDurableList<T>
      • IDurableQueue<T>
      • IDurableSet<T>
      • IDurableValue<T> (simple scalar value wrapper)
      • IPersistentState<T> (Implemented by DurableState<T>, providing compatibility with the existing grain persistence programming model for single values, backed by journaling).
      • IDurableTaskCompletionSource<T> (for persisting the outcome of asynchronous operations whose state needs durability).
      • IDurableNothing (a placeholder state machine that performs no operations, useful for marking state machines as retired).
    • DurableGrain Base Class: A convenient base class that automatically handles dependency injection and lifecycle management for the IStateMachineManager, simplifying grain implementation. It provides the WriteStateAsync() method.
    • VolatileStateMachineStorageProvider: An in-memory provider that implements IStateMachineStorage but does not provide cross-process durability. Ideal for testing or scenarios where persistence isn't required.
    • Hosting Extensions: Facilitates easy registration of the core journaling services (.AddStateMachineStorage()) and makes the built-in durable types available via keyed DI.
  2. Orleans.Journaling.AzureStorage Project: This library provides a concrete durable storage implementation for the journaling framework.

    • AzureAppendBlobStateMachineStorageProvider: Implements IStateMachineStorageProvider using Azure Append Blobs. Append blobs are a natural fit for storing the sequential log segments efficiently.
    • Append Blob Logic: Handles reading the full log (or the latest snapshot plus subsequent entries) during recovery, appending new log segments, replacing the log with snapshots, and deleting the blob. It uses Azure's built-in mechanism for optimistic concurrency (ETags); see the sketch after this list.
    • Configuration: Includes AzureAppendBlobStateMachineStorageOptions for connection details and container configuration, and hosting extensions (.AddAzureAppendBlobStateMachineStorage()) for easy setup.
  3. Serialization Layer Improvements:

    • Introduces ArcBufferWriter, an efficient pooled buffer implementation optimized for building the log segments (LogExtentBuilder), enabling fast in-memory buffering and efficient writing to storage backends. It is based on atomic reference counting, with 'slices' taking leases on the underlying pages.
    • Includes necessary adjustments to core serialization codecs and the deep copying mechanism to support the new buffer types and durable patterns.
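
To make the ETag-based optimistic concurrency mentioned in the Append Blob Logic bullet concrete, here is a minimal, hedged sketch using the Azure.Storage.Blobs SDK directly. It is not the provider's actual code; blob naming, snapshot handling, and retry policy are omitted, and the helper shown is illustrative only.

using System.IO;
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

// Illustrative only: shows the ETag-guarded append pattern, not the provider's code.
public static class ConditionalAppendSketch
{
    public static async Task AppendSegmentAsync(AppendBlobClient blob, byte[] segment, ETag expectedETag)
    {
        var conditions = new AppendBlobRequestConditions
        {
            // Fail with HTTP 412 if another writer modified the blob since we last
            // observed it, rather than silently interleaving log segments.
            IfMatch = expectedETag,
        };

        using var content = new MemoryStream(segment, writable: false);
        try
        {
            await blob.AppendBlockAsync(content, conditions: conditions);
        }
        catch (RequestFailedException ex) when (ex.Status == 412)
        {
            // Another writer won the race; the caller must re-read the log
            // (or the latest snapshot) before deciding how to proceed.
            throw;
        }
    }
}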

Usage Examples:

Using the new journaling persistence involves configuring a storage provider and then injecting and using the durable types within a grain inheriting from DurableGrain.

1. Silo Configuration (Example with Azure Append Blob Storage):

Add necessary NuGet packages: Microsoft.Orleans.Journaling, Microsoft.Orleans.Journaling.AzureStorage.

siloBuilder.AddAzureAppendBlobStateMachineStorage(options =>
{
    // Configure Azure connection - using connection string, managed identity, etc
    options.ConfigureBlobServiceClient("YOUR_AZURE_STORAGE_CONNECTION_STRING");
    options.ContainerName = "orleans-grain-journal";
    // Other options like BlobClientOptions can be configured here
});
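
For managed identity instead of a connection string, the options presumably expose the same ConfigureBlobServiceClient overloads as the existing Orleans Azure blob storage options; the sketch below assumes a (Uri, TokenCredential) overload exists, and the account URL is a placeholder.

using Azure.Identity;

siloBuilder.AddAzureAppendBlobStateMachineStorage(options =>
{
    // Assumption: a (Uri, TokenCredential) overload mirroring the existing Orleans
    // Azure blob storage options; verify against the shipped API surface.
    options.ConfigureBlobServiceClient(
        new Uri("https://YOUR_STORAGE_ACCOUNT.blob.core.windows.net"),
        new DefaultAzureCredential());
    options.ContainerName = "orleans-grain-journal";
});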

2. Grain Definition and State Injection:

Define your state using the built-in durable types or custom IDurableStateMachine implementations. Inherit from DurableGrain.

[GenerateSerializer]
public record class MySimpleState(string Name, int Count);

public interface IMyDurableGrain : IGrainWithStringKey
{
    Task SetInitialState(string name, int initialCount);
    Task UpdateCountAndDictionary(string key, int value);
    Task<(string Name, int Count)> GetSimpleState();
    Task<int> GetDictionaryValue(string key);
    Task<int> GetDictionaryCount();
}

public class MyDurableGrain(
    [FromKeyedServices("simple-state")] IDurableValue<MySimpleState> simpleState,
    [FromKeyedServices("my-dictionary")] IDurableDictionary<string, int> myDictionary)
    : DurableGrain, IMyDurableGrain
{
    public async Task SetInitialState(string name, int initialCount)
    {
        simpleState.Value = new MySimpleState(name, initialCount);
        myDictionary["initial"] = 1;
        await WriteStateAsync();
    }

    public async Task UpdateCountAndDictionary(string key, int value)
    {
        simpleState.Value = simpleState.Value with { Count = simpleState.Value.Count + 1 };
        myDictionary[key] = value;

        // WriteStateAsync persists changes to BOTH simpleState and myDictionary atomically
        await WriteStateAsync();
    }

    public Task<(string Name, int Count)> GetSimpleState()
        => Task.FromResult((simpleState.Value.Name, simpleState.Value.Count));

    public Task<int> GetDictionaryValue(string key)
        => Task.FromResult(myDictionary[key]);

    public Task<int> GetDictionaryCount()
        => Task.FromResult(myDictionary.Count);
}
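
Calling the grain is no different from any other Orleans grain; for completeness, a small client-side sketch (the grain key and host wiring are illustrative):

// Minimal client-side usage; assumes an IGrainFactory (or IClusterClient) obtained
// from a configured Orleans client or silo host.
public static async Task DemoAsync(IGrainFactory grains)
{
    var grain = grains.GetGrain<IMyDurableGrain>("user-42");

    await grain.SetInitialState("Alice", 0);
    await grain.UpdateCountAndDictionary("visits", 1);

    var (name, count) = await grain.GetSimpleState();
    Console.WriteLine($"{name}: {count}");               // Alice: 1
    Console.WriteLine(await grain.GetDictionaryCount()); // 2 ("initial" + "visits")
}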

Testing:

Comprehensive unit and integration tests are included in the Orleans.Journaling.Tests project. These tests cover:

  • The core StateMachineManager logic, including log recovery and write coordination.
  • Full test coverage for each of the built-in durable types (IDurableDictionary, IDurableList, IDurableQueue, IDurableSet, IDurableValue, DurableState (via IPersistentState), etc.) to ensure their operations are correctly logged and replayed.
  • Validation of state persistence and recovery against both the VolatileStateMachineStorageProvider and the AzureAppendBlobStateMachineStorageProvider, including scenarios with multiple operations, large state, and atomic updates across multiple state machines within a grain.

Note: As mentioned, this feature is introduced as experimental (ORLEANSEXP005). The APIs, implementation details, and performance characteristics may evolve based on feedback and further development.

Open Points / Future Work:

  • Develop additional storage providers for other backends (e.g., SQL, other existing Grain Storage providers via adapters).
  • Create detailed documentation and examples demonstrating various usage patterns, custom DSMs, and the benefits for specific workloads.
  • Refine metrics and monitoring.
  • More testing.
  • TTLs for abandoned state machines (delete the associated state for a missing state machine within a grain after some configurable period of time, e.g., 7 days).
  • Implement stream checkpointing, reminders "v2", inbox, and outbox using DSMs.

@ReubenBond (Member Author)

We have been using this library internally for a few proof-of-concept projects, and I want it to be available via NuGet in an experimental, alpha form so that we and others can experiment with it further, provide feedback, and help mature the library.

@ReubenBond force-pushed the feature/journaling/1 branch from 0cdc6eb to e8fe1ac on April 19, 2025 02:07
@alrz (Member) commented Apr 19, 2025

Does this mean that if we have a collection as the state, this can be used to append to the collection without resending the entire collection over? e.g., it would be represented as a proper list in the Redis persistence provider. Although for this case it would also be useful to 'remove' a single item from the collection (random, dequeue, pop, based on the collection type).

@ReubenBond (Member Author)

Does this mean that if we have a collection as the state, this can be used to append to the collection without resending the entire collection over?

Yes

e.g., it would be represented as a proper list in the Redis persistence provider. Although for this case it would also be useful to 'remove' a single item from the collection (random, dequeue, pop, based on the collection type)

No - this represents state as opaque log entries (like a database write-ahead log), not as DB-native data structures. It's possible that state could be represented as a single Redis hash or list, but the contents would be opaque blobs.

One alternative would be to replace the implementations of IDurableX in the container, but you'd need a fallback for unknown state machine types and types which don't have good representations in the target database. Then, during storage operations you would need to use a transaction to load all state or store all changes atomically. Conceptually doable, but not something I've been considering for this PR and probably not something which could be supported well. Why not use Redis APIs from your grain instead?

@alrz (Member) commented Apr 19, 2025

Why not use Redis APIs from your grain instead?

I wonder what I'd miss by doing that instead of Orleans persistence for any kind of state, actually.

@ReubenBond (Member Author) commented Apr 19, 2025

Why not use Redis APIs from your grain instead?

I wonder what I'd miss by doing that instead of Orleans persistence for any kind of state, actually.

Mostly just convenience - Orleans loads the state for you during activation and manages concurrency using ETags to prevent lost writes, etc. We often encourage people to go directly to their database when they have needs not satisfied by that simple persistence model.

Edit: and it keeps the state in memory while the grain is active, allowing you to serve reads without needing to perform IO. When you manage storage in application code, you need to handle all of this yourself. It may not be a burden for some, but it adds complexity to the application as well as room for error.

@ReubenBond changed the title from "[Feature] Log-structured grain storage framework and Azure Append Blob provider" to "[Feature] Log-structured grain storage" on Apr 19, 2025
@ReubenBond requested a review from Copilot on April 19, 2025 20:06
@Copilot (Copilot AI) left a comment

Pull Request Overview

This pull request adds an experimental log-structured journaling framework for Orleans grain state persistence, along with a suite of durable state machine types and an Azure Append Blob storage provider for durable logging.

  • Introduces a new Orleans.Journaling project that defines durable state machine interfaces and implementations (e.g. DurableValue, DurableList, DurableDictionary, etc.).
  • Implements an Azure Append Blob storage provider for atomic persistence and recovery, together with hosting extensions and serialization improvements.

Reviewed Changes

Copilot reviewed 57 out of 59 changed files in this pull request and generated 2 comments.

Summary per file:

  • HostingExtensions.cs: Registers scoped services for state machine storage and the various durable types.
  • DurableValue.cs, DurableTaskCompletionSource.cs, DurableState.cs: Implement durable state machine patterns with version checking and logging of state changes.
  • DurableSet.cs, DurableQueue.cs, DurableList.cs, DurableDictionary.cs: Provide durable collection implementations using new C# collection initialization syntax.
  • DurableGrain.cs: Provides a base class to support dependency injection and lifecycle management for durable state machines.
  • Azure/...: Implements the Azure Append Blob storage provider and related hosting extensions for journaling persistence.
  • MigrationContext.cs: Updates disposal behavior to reset and clear buffer state.
Files not reviewed (2)
  • Orleans.sln: Language not supported
  • src/Azure/Orleans.Journaling.AzureStorage/Orleans.Journaling.AzureStorage.csproj: Language not supported

/// Container name where state machine state is stored.
/// </summary>
public string ContainerName { get; set; } = DEFAULT_CONTAINER_NAME;
public const string DEFAULT_CONTAINER_NAME = "state";
Member:

Does this container name have to be globally unique for the storage account? Should we prepend something here to avoid collisions?

@ReubenBond (Member Author):

It's configurable, so developers can configure it however they like (e.g., prefixing it with a unique ID, possibly based on the ServiceId). They can also provide a factory via the BuildContainerFactory property below to customize it on a per-grain basis.

They have options already, but we could set a different default value or potentially prefix it with the ServiceId automatically. What do you prefer?

Member:

If a collision error will be obvious, I guess it's fine to leave as is.

{
throw new InvalidOperationException("Registering a state machine after activation is invalid");
/*
// Re-enqueue the work item without completing it, while waiting for the state machine manager to be initialized.
Member:

Should we re-enqueue instead of throwing? With some mitigation for infinite looping...

@ReubenBond (Member Author):

I'll delete the comment in code. It's an enhancement we should implement in future (maybe before marking this as stable), but IMO we shouldn't block for it.

/// <summary>
/// Resets this instance, returning all memory.
/// </summary>
public void Reset()
Member:

Is this safe if there's an ArcBuffer that has a reference to this memory? We unpin the page and return it to the page pool but the ArcBuffer referencing it doesn't AFAICT know that the page may be being reused. The version check is only used during pin/unpin but not on access.

@ReubenBond (Member Author):

Unpin decrements the ref count, but if there is an ArcBufferSlice that references a page, it will be keeping the ref count above zero, so this is safe (sans implementation bugs).
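
To make the lifetime argument concrete, here is a generic illustration of the lease pattern being described; this is not the ArcBufferWriter or ArcBufferSlice code. The page is returned to the pool only when the final reference, whether the writer's own or a slice's, is released.

using System.Threading;

// Generic illustration of atomic reference counting; not the Orleans implementation.
public sealed class LeasedPageSketch
{
    private int _refCount = 1; // the writer's own reference

    // Taken by each slice that references this page.
    public void AddLease() => Interlocked.Increment(ref _refCount);

    // Called by the writer (on reset/dispose) and by each slice when disposed.
    public void ReleaseLease()
    {
        if (Interlocked.Decrement(ref _refCount) == 0)
        {
            // Only the final release returns the page to the pool, so an
            // outstanding slice can never observe reused memory.
            ReturnToPool();
        }
    }

    private static void ReturnToPool() { /* return the backing memory to a shared pool */ }
}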

}

/// <inheritdoc/>
public void Dispose()
Member:

Are we concerned that this doesn't actually release pinned pages if there's outstanding references? I guess the class of problem here is similar to the other comment; the writer has no insight into how many ArcBuffers were issued or whether they were released yet.

@ReubenBond (Member Author):

That's the expected behavior. You write, someone peeks/consumes slices, you dispose, eventually (maybe a long time after, and maybe the buffer is sliced repeatedly) all slices are disposed and the pages are returned. The reason we use atomic ref counting is to separate the slice lifetime from the buffer writer lifetime.

@ReubenBond merged commit 9948a33 into dotnet:main on Apr 23, 2025
25 checks passed
@ReubenBond deleted the feature/journaling/1 branch on April 23, 2025 02:18
@github-actions bot locked and limited the conversation to collaborators on May 23, 2025