-
Notifications
You must be signed in to change notification settings - Fork 237
fix: period index calculation #1641
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to bed so if you manage to make all the required changes and want a re-review, please ping @litt3. Otherwise I can take a look over the weekend potentially.
@@ -116,8 +116,8 @@ func (a *Accountant) BlobPaymentInfo( | |||
return a.cumulativePayment, nil | |||
} | |||
return big.NewInt(0), fmt.Errorf( | |||
"no bandwidth reservation found for account %s, and current cumulativePayment balance insufficient "+ | |||
"to make an on-demand dispersal. Consider depositing more eth to the PaymentVault contract.", a.accountID.Hex()) | |||
"invalid payments: no available bandwidth reservation found for account %s, and current cumulativePayment balance insufficient "+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really feel like this function should be broken into two: a function that only checks (and updates) reservation, and IF that fails (returns nil), then its caller's job to get a cumulativePayment). This function is very hard to review because its doing too many things, and BlobPaymentInfo
doesn't reflect the fact that it updates state, and possibly returns a cumulativePayment. Can it update reservation state when returning a nonzero cumulativePayment?
These are all questions (and many more) that are running through my brain as I'm reviewing this. We could write more documentation on the function to describe all these possible scenarios, but I feel like it would be much simpler (and self-documenting) to split into 2 functions. Can prob be done in a separate PR at a future date, but still wanted to leave this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a good refactor, only called in one spot func (a *Accountant) AccountBlob
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with the reasoning. This is something I have planned in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is just a refactor, could you make it a separate in-between PR, to not bloat the PR with actual new features?
In integration team, we try (but tbf its hard) to follow the RII pattern: new feature introductions should be broken into 3 separate PRs:
- refactor (any code refactors that are needed to prepare for the feature addition)
- interface
- implementation
You already broke your PR into interface and implementation, so that's awesome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I can break it up. Given this is meant to be a fix for the current accounting, I would do refactor/interface/implementation for the upcoming quorum specific accounting
@@ -116,8 +116,8 @@ func (a *Accountant) BlobPaymentInfo( | |||
return a.cumulativePayment, nil | |||
} | |||
return big.NewInt(0), fmt.Errorf( | |||
"no bandwidth reservation found for account %s, and current cumulativePayment balance insufficient "+ | |||
"to make an on-demand dispersal. Consider depositing more eth to the PaymentVault contract.", a.accountID.Hex()) | |||
"invalid payments: no available bandwidth reservation found for account %s, and current cumulativePayment balance insufficient "+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a good refactor, only called in one spot func (a *Accountant) AccountBlob
.
api/clients/v2/accountant.go
Outdated
@@ -157,9 +157,24 @@ func (a *Accountant) SymbolsCharged(numSymbols uint64) uint64 { | |||
return core.RoundUpDivide(numSymbols, a.minNumSymbols) * a.minNumSymbols | |||
} | |||
|
|||
// GetRelativePeriodRecord returns the period record for the given index | |||
// return empty record if there is no record for the relative index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment about the edge case where relativeIndex >= uint32(a.numBins)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. this should never happen, but just in case, it will return an empty one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer true with the panic; did you remove the comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed in commit
api/clients/v2/accountant.go
Outdated
// relativeIndex := uint32((index / a.reservationWindow) % uint64(a.numBins)) | ||
relativeIndex := uint32((index / a.reservationWindow) % uint64(a.numBins)) | ||
// Return empty record if the index is greater than the number of bins (should never happen by accountant initialization) | ||
if relativeIndex >= uint32(a.numBins) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it ever possible that a.numBins can change at runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no we don't have mut on it, it will be fixed to the configured value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wdym by "have mut"? A new developer could certainly add some code that changes it (by mistake). Would that be a problem? If yes, then it should be documented in the accountant struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking of which.. how should one choose numBins? There is no comment as to this effect, neither on the struct nor on the constructor.
- can it be anything?
- does it have to align with the disperser's meterer? (I see on our docs that "The disperser will track at least 4 periods per reservation, starting from the previous period to the period after next period. "). are these period per reservation the same thing?
- what is a good choice for this number? why? does it even matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added some documentation.
- Anything. the accountant would use that configured number or 3, whichever is bigger
- the disperser tracks the same thing
- 3 is the minimum and the default to account for overflow usage. In reality client could even just have one bin and doesn't provide overflow functionality within the accountant. Using a bigger number doesn't provide functional difference to the accountant, but the dispersal client could track more reservation history
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- how do you make sure the client is using the same numBins as the disperser? EDIT: I see you overwrite PeriodRecords in SetPaymentState... could we do this in constructor instead? SetPaymentState is mandatory to be called to function properly right? This should prob be documented, and also AccountBlob should just panic if SetPaymentState has not been called I think?
- whats the point of allowing >3 then if there's no functional difference? whats the point of tracking reservation history?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raised a similar question here #1599 (comment). Let's move the discussion to that PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
numBins doesn't have to use the same
numBins
as the disperser.SetPaymentState
are only called by the first dispersal request if the accountant isn't filled in. The idea was to allow clients to implement accountants that manage state differently such as persisting them instead of relying on the disperser to provide the payment state, or call eth client directly for the onchain states. These are thing we won't implement, but allowing the possibility for modifications. Perhaps we should make it more of a requirement. Can keep this thought for later -
I figured some clients might want to track more history for analyzing and adjusting their usage. The circular wrapping invariants stay the same with >3, and not a requirement to configure, so I figured it won't hurt. We can iterate on whether clients should have this flexibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second round of comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big improvements already, thanks for the changes. Mostly LGTM at this point, still have 2 questions to be addressed #1641 (comment)
thanks! I answered the questions, but can keep the discussions open for later changes. will go ahead and merge this one 🤞 |
* Revert "refactor: account ledger interface incorporates debit slip (#1732)" This reverts commit b4cf3d9. * Revert "fix: period record circular wrapping refresh (#1731)" This reverts commit 1091f46. * Revert "fix: disperser_client v2 nil ptr bug (#1710)" This reverts commit a5acd93. * Revert "fix: disperser_client v2 should use default onDemandQuorums (#1700)" This reverts commit 005e0a6. * Revert "refactor: unify accountant debit slip (#1695)" This reverts commit a9c0868. * Revert "feat: AccountantLedger interface for next step unifications (#1694)" This reverts commit 354e42f. * Revert "feat: extract generic payment logic (#1693)" This reverts commit 1003b98. * Revert "fix: make disperser client backwards compatible (#1686)" This reverts commit 26c612f. * Revert "fix: payment state api doesn't fail at zero valued reservations/ondemand (#1682)" This reverts commit 2031c1d. * Revert "refactor: quorum specific metrics for metered bytes (#1668)" This reverts commit 7ac688d. * Revert "feat: GetPaymentStateForAllQuorums api impl (#1664)" This reverts commit ed89588. * Revert "refactor: core meterer consolidation (#1663)" This reverts commit 319a865. * Revert "refactor: onchain state interface (#1662)" This reverts commit d795dfa. * Revert "refactor: core meterer period record module (#1661)" This reverts commit afbef90. * Revert "refactor: payment common functions (#1653)" This reverts commit 36a3d33. * Revert "fix: period index calculation (#1641)" This reverts commit c229748. * Revert "refactor: use generic column name (#1626)" This reverts commit 4d1994e. * Revert "feat: payment onchain state interface (#1625)" This reverts commit 61e46ac. * Revert "feat: offchain quorum period record getter (#1620)" This reverts commit a309139. * Revert "feat: protobuf payment state api quorum specific variant (#1613)" This reverts commit 58acdfd. * Revert "fix: nil assignment to new account (#1612)" This reverts commit bcbf9f5. * Revert "feat: offchain batch writes (#1580)" This reverts commit ce4d2f5. * Revert "feat: validator authenticate on-demand request by disperser key (#1539)" This reverts commit f545b16. * fix: issues after series of reverts (scary) * chore: add back ErrZeroSymbols error This error was likely removed by mistake during a revert conflict fix from one of the previous set of reverted PRs. * style: fix lint * test: fix accountant_test hardcoded error string * Revert "Revert "fix: period index calculation (#1641)"" This reverts commit b8767ae.
Why are these changes needed?
[context on circular wrapping]
the calculation for relative period index versus absolute period index was off by a factor of
reservationWindow
.At the time of accountant metering implementation,
reservationPeriodIndex
wastimestamp/reservationWindow
. We received an audit issue concerning that the records can become corrupted whenreservationWindow
changes (configurable by Payment Vault owner on-chain). Our solution was to updatereservationPeriodIndex
to the nearest lower multiple ofreservationWindow
(timestamp/reservationWindow * reservationWindow
) so that we can keepreservationWindow
configurable without storingreservationWindow
along each reservation period records or storingtimestamp
.This introduced a bug into the accountant, which kept using the original calculation
reservationPeriodIndex = timestamp/reservationWindow
,relativeBinIndex = reservationPeriodIndex % numBins
, andoverflowBinIndex = (reservationPeriodIndex + 2) % numBins
. This means that ifreservationWindow % numBins = 0
, the bin index is consistently 0 for current period, and 2 for overflow period. The unit tests were failing because all the cases hadreservationWindow % numBins != 0
This PR updates relative bin index calculation:
relativeBinIndex = (reservationPeriodIndex / reservationWindow) % numBins
, andoverflowBinIndex = (reservationPeriodIndex/reservationWindow + 2) % numBins
; added test to cover this edge caseChecks