Skip to content

Conversation

@j-berman
Copy link
Collaborator

The daemon pops blocks to handle a reorg. Upon popping a block, the daemon places txs from the block back into the pool. Some txs placed back in the pool may have ring signatures that reference outputs that are no longer in the chain.

Without this change, these invalid txs may linger in the pool (potentially up to 1 week), preventing users from re-submitting new valid txs spending inputs that should be spendable.

This change kicks these invalid txs from the pool, allowing users to re-submit new valid txs.

The daemon pops blocks to handle a reorg. Upon popping a block,
the daemon places txs from the block back into the pool. Some txs
placed back in the pool may have ring signatures that reference
outputs that are no longer in the chain.

Without this change, these invalid txs may linger in the pool,
preventing users from re-submitting new valid txs spending inputs
that should be spendable.

This change kicks these invalid txs from the pool, allowing users
to re-submit new valid txs.
@plowsof
Copy link
Contributor

plowsof commented Sep 17, 2025

test failure for convenience:

[ RUN      ] select_outputs.exact_unlock_block
/__w/monero/monero/tests/unit_tests/output_selection.cpp:259: Failure
Value of: picked_exact_unlock_block
  Actual: false
Expected: true

@nahuhh
Copy link
Contributor

nahuhh commented Sep 17, 2025

the major UX issue isnt just that outputs are unspendable, but that the wallet tries to send those same outputs again and the node rejects them. This makes the wallet unusable. The wallet should not be selecting outputs that are in the pool when constructing a new transaction.. but it does, and then just barks about double spending until you change to a flushed nodes.

re solution:

I think id prefer some sort of "rbf" type of replacement, over kicking.

Under "rbf", it would look like:
"tx invalidated = user given option to resubmit the tx (same recipient, same amount), else those outputs remain unspendable for N days". Its a more complicated fix, but if its possible, i think is much better.
In contrsst, kicking allows unintentional double spends of (previously) confirmed txs. User might not even realize that the tx was unconfirmed / kicked, and, if kicked w/o delay, has no incentive to not screw the recipient.

In a short term, id probably suggest fixing the wallet trying to reselect outputs that are clearly in the txpool. Invalidated txs to be kicked when they are replaced or after 7 days, whichever comes sooner

  1. Fix wallet attempting to respend (undetected?) spent key images that are in pool
  2. allow user to submit a replacement for an invalid tx, for the same amount and recipient(s). similar to rbf.
  3. kick invalid txs after 7days or when replaced

@j-berman
Copy link
Collaborator Author

j-berman commented Sep 17, 2025

Test failure is unrelated to this PR.


@nahuhh repeating from past discussions: this is a simpler change that is an improvement over the current behavior. Allowing a user to submit a tx while a tx spending same key image(s) is still in the pool, is a larger and more complex change. That would take time to implement and review -- I estimate we wouldn't be able to get such a change in for weeks/months.

Repeating why this PR is an improvement over current behavior: assume a user sweeps their wallet (or spends their wallet's only enote), then there is a reorg of >= 10 blocks placing that tx back in all pools (EDIT: and the tx references a ring member that was reorged out). With the current behavior, that user would not have a usable wallet for up to 7 days (unless miners flush their pools).

Past discussions on this:
https://libera.monerologs.net/no-wallet-left-behind/20250908#c581477-c581507
seraphis-migration#81 (comment)

Fix wallet attempting to respend (undetected?) spent key images that are in pool

#10081 and #10083 should address this.

@nahuhh
Copy link
Contributor

nahuhh commented Sep 17, 2025

I don't think its an improved ux for the recipient, who has now lost track* of funds (with or without this pr, they lose funds). the sender is also unaware that the tx is reversed. Nodes also dont notice the double spends.

10081/10083 improve the UX, and in the short term i'd advocate for reducing the time they are held in the txpool to something more reasonable. Kicking them immediately enables unintentional double spends - with no record of it.

if rbf-type txs were implemented at some point, there would be no reason to kick them immediately. The reason to do so right now is 1. w/o 10081/83 it breaks wallets for a week and 2. w/ 10081/83, it breaks swept wallets for a week

With 10081/83, wallets with more inputs would function as normal, and the users would see the unconfirmed tx (are they detected as failed tx?) tx persisting in the wallet, which they could manually resend or (if recipient) request be re-sent

edit:

trying not to make my comment too long, heres tldr:

fwiw, my biggest qualms with kicking are:

  1. the loss of data for recipient "now you see me, now you dont". A tx that had 12 confs 1 minute ago, has now disappeared into the abyss. Receiver accountability is far more important than accommodating senders ability to (double) spend
  2. the unintentional double spends (senders not realizing the funds were reversed)
  3. the untracked double spends by nodes

in addition to 10081/83, in the interim, id advocate for reducing the time from 7 days to something more reasonable. W/ rbf-type resolution, id probably advocate for going back to 7days unless tx was rbf-type resolved.

@plowsof
Copy link
Contributor

plowsof commented Sep 17, 2025

running on node3 .monerodevs.org (main/test/stagenet)

print_pool_stats on testnet is the same as release, i assume these 2 transactions are invalid for other reasons?

2025-09-17 16:43:03.639	I Monero 'Fluorine Fermi' (v0.18.1.0-6ce477fbf)
2 tx(es), 3724 bytes total (min 1525, max 2199, avg 1862, median 1862)
fees 0.000297920000 (avg 0.000148960000 per tx, 0.000000080000 per byte)
0 double spends, 0 not relayed, 2 failing, 2 older than 10 minutes (oldest 4 days ago), no backlog
   Age      Txes       Bytes
00:00:00       0           0
48:05:29       2        3724

@j-berman
Copy link
Collaborator Author

With current behavior, in the example, the recipient has no way to receive Monero from the sender for 7 days, which I would argue is unambiguously worse UX for recipient.

and in the short term i'd advocate for reducing the time they are held in the txpool to something more reasonable

What would you propose?


i assume these 2 transactions are invalid for other reasons?

A couple thoughts on this:

  1. The code in this PR only executes after blocks are popped from the chain. So it's possible this code hasn't executed on the daemon you're using.

  2. This PR checks to see if any ring members reference outputs that can no longer possibly be in the chain. It doesn't re-validate every tx beyond that specific rule. So technically it would be possible for invalid txs to linger in the node for some other reason even with this PR.


I will add, I just realized there is another issue that needs to be dealt with in the wallet: the wallet saves rings used in prior constructed txs, so that it will attempt to reuse the same ring if spending the same output (thus avoiding revealing which output is the real spend in the ring). That ring saves references to ring members by global output ID.

A reorg can modify global output ID's tied to members of a ring, including the output spent. The logic inside get_outs that reads saved rings does not account for the potential of the spent ring member's output id to be modified by the reorg here, and thus the wallet may be unable to spend an output in that case unless the user explicitly calls unset_ring, which could reveal to the network which output they are spending.

The simplest solution to that is probably to remove this throw, and log a warning in that case instead.

@nahuhh
Copy link
Contributor

nahuhh commented Sep 17, 2025

With current behavior, in the example, the recipient has no way to receive Monero from the sender for 7 days, which I would argue is unambiguously worse UX for recipient.

currently, yes, because the wallet is unusable due to it trying to re-spend the same key images. With 10081/83 that is limited to users who spent most/all of their balance in an invalid tx. And these users know that something went wrong. They dont have to pay attention to see if a tx disappeared. They, and recipient, would see the tx move from confirmed to pending and notice that there is a problem (instead of hiding a problem that costs one side $).

and in the short term i'd advocate for reducing the time they are held in the txpool to something more reasonable

What would you propose?

probably 24 or 72hrs. Long enough for recipient to notice, potentially contact the sender, for the daemon to log double spends, and for analysts to know what happened incl many txs were invalidated.

example: cfb is claiming that 0txs were invalidated and that there were no double spends. With this pr, that would appear to be correct. The fact is that there are 115 invalid txs and 55 double spends as a result of the 18 block reorg.


i assume these 2 transactions are invalid for other reasons?

i previously thought there were 117 txs reorged, but i think 2 of those may have been for a different reason.

I will add, I just realized there is another issue that needs to be dealt with in the wallet: the wallet saves rings used in prior constructed txs, so that it will attempt to reuse the same ring if spending the same output (thus avoiding revealing which output is the real spend in the ring). That ring saves references to ring members by global output ID.

A reorg can modify global output ID's tied to members of a ring, including the output spent. The logic inside get_outs that reads saved rings does not account for the potential of the spent ring member's output id to be modified by the reorg here, and thus the wallet may be unable to spend an output in that case unless the user explicitly calls unset_ring, which could reveal to the network which output they are spending.

The simplest solution to that is probably to remove this throw, and log a warning in that case instead.

Rucknium and i both experienced broken shared-ring-dbs after selfish mining/reorg tests on testnet

@j-berman
Copy link
Collaborator Author

I disagree with your argument that this PR is worse than current. Users sweeping their wallet / spending their only enotes being rendered unable to spend (and their recipients unable to receive from them) for any period of time is worse than any of the scenarios you've raised.

There is an additional segment of users affected: wallets that don't have more Monero beyond the Monero spent to be able to cover the amount they spent / want to spend over the period of time the tx is held in the pool.

Plus, we observed the live solution pushed across the network was for mining pools to flush their pools. If some significant % of those txs were sweeps (or wallets don't have more Monero to be able to cover the amount they spent in those txs), then that would be the solution pushed again. This PR is strictly better than requesting live miners flush their pools.

I don't really have more to argue in favor on this. I think it's an obvious improvement over current.

Rucknium and i both experienced broken shared-ring-dbs after selfish mining/reorg tests on testnet

Will work on this.

@nahuhh
Copy link
Contributor

nahuhh commented Sep 17, 2025

Plus, we observed the live solution pushed across the network was for mining pools to flush their pools.

because 10081/83 issues broke wallets, and further, because the wallets are unusable for an (unreasonable) 7 days.

If some significant % of those txs were sweeps (or wallets don't have more Monero to be able to cover the amount they spent in those txs), then that would be the solution pushed again.

they can wait 24hrs while the recipient realizes that the funds they had previously confirmed had been invalidated, instead of EOD comparing balance to invoice totals and realizing they are short 18xmr, then having to rely on external software to figure out which invoices have txids that were rm -rf'd. Meanwhile the sender has immediate access to more xmr than they should, and accidentally or intentionally (double) spends it.

@j-berman
Copy link
Collaborator Author

I disagree, and don't have more to argue.

@tevador
Copy link
Contributor

tevador commented Sep 18, 2025

I don't think removing the "invalid" transactions immediately is the best solution. The reorged alt-chain may still come to life. This could realistically happen with rolling DNS checkpoints.

Instead, I'd suggest to reduce CRYPTONOTE_MEMPOOL_TX_FROM_ALT_BLOCK_LIVETIME to a couple hours. One week is clearly excessive.

@j-berman
Copy link
Collaborator Author

It's worth noting that in the event the reorged alt-chain comes back to life, these txs would be re-validated, not lost. Albeit there may be a slice of pool txs that never made it into the chain in the first place, and also weren't relayed to whoever brings the alt-chain back to life.

I'm fine with reducing CRYPTONOTE_MEMPOOL_TX_FROM_ALT_BLOCK_LIVETIME to a couple hours as a replacement for this PR. Though the latter txs would linger for 3 days thanks to CRYPTONOTE_MEMPOOL_TX_LIVETIME. I figure it would probably make sense to reduce that constant too.

@Rucknium
Copy link

Rucknium commented Sep 19, 2025

I'll provide some empirical analysis of what happened to invalidated transactions after the September 14, 2025 18-block reorg.

As of five days later, at least one of the key images in a majority of invalidated transactions later appeared in transactions confirmed on the blockchain. AFAIK, given stealth addresses and confidential transactions, there is no way to know if the transaction sent the exact same amount of XMR to the same recipient (unless the invalidated tx contained multiple inputs and the later valid transaction did not spent the exact same set of key images), so I will call them "respends".

The timing of the respends is plotted below:

sep-14-respends

I also analyzed the rings for privacy issues. There were three categories. I will count them by number of key images instead of number of transactions.

  1. The first category shared zero ring member output indices in common with the original ring in the invalidated transaction. So far, there have been 29 of these key images. What probably occurred is the original input's real spend was in the orphaned chain. The real spend was later confirmed on the mainchain blockchain because it was still a valid tx. Then, the wallet that created the invalidated tx created a new tx that spent the output associated with the key image. The real spend could be deduced deterministically by checking which ring member was referenced in the orphaned chain by the invalidated transaction.

  2. The second category shared one ring member output index in common with the original ring in the invalidated transaction. So far, there have been 381 of these key images. The shared ring member must be the real spend of the ring.

  3. The third category share all (sixteen) ring member output indices in common with the original ring in the invalidated transaction. So far, there have been 6 of these key images. User privacy was successfully preserved in these cases.

Probably, these transactions were able to be spent because remote nodes and/or mining pools flushed their txpools. The different behavior between the 2nd and 3rd category of key images is interesting. Maybe another wallet implementation allowed preservation of privacy for the 3rd category of key images, but most users (in the 2nd category) needed to delete their shared ringDB to actually perform the respends.

Analysis code is available in https://github.com/Rucknium/misc-research/tree/main/Monero-Blockchain-Reorgs . To reproduce the analysis, you need a node that was running at the time of the 18-block reorg and it must not have been restarted since then (or it was restarted, but initialized with the --keep-alt-blocks startup flag).

Personally, I'm leaning toward reducing instead of eliminating CRYPTONOTE_MEMPOOL_TX_FROM_ALT_BLOCK_LIVETIME .

UPDATE 2025-09-24

Just after the 7-day CRYPTONOTE_MEMPOOL_TX_FROM_ALT_BLOCK_LIVETIME expired, several more transactions that spent outputs from invalidated transactions have been confirmed on the blockchain. The 7-day txpool livetime had probably prevented those users from respending outputs.

Updated numbers on the three categories of key images: Category one: 38. Category two: 397. Category three: 6.

sep-14-respends-UPDATE-sep-24

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants