
Conversation

@tsachiherman (Contributor) commented Dec 8, 2020

Summary

The existing transaction cache was always tied to the transaction entries that we had in our transaction pool. That has been working well in scenarios where the transaction pool is not congested. However, once the transaction pool becomes congested, it creates issues in the following two scenarios:

  1. A transaction is received while the transaction pool is full. After we verify its signature, we find that we can't insert it into the transaction pool, and we drop it. A subsequent block that we attempt to verify could include this transaction, and we would need to re-validate its signature.
  2. A node receives a proposal for verification. After verifying the proposal, the node receives a second proposal with a lower hash value (and a similar set of transactions). At that point, the node would attempt to re-verify all the repeated transactions (assuming they aren't present in the transaction pool).

To address both issues, I've extracted the verified transaction cache out of the transaction pool into a separate object that is held by the ledger. This object is always consulted when verifying a transaction, and every verified transaction is recorded ("set") in it.
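A minimal, self-contained sketch of that flow, using toy types rather than the PR's actual ones (verifiedTxnCache, seen, record and the string transaction IDs below are all illustrative): the cache sits outside the transaction pool, so a transaction that was verified and then dropped by a full pool, or one that reappears in a second proposal, is still found in the cache and skips signature re-verification.

```go
package main

import (
	"fmt"
	"sync"
)

// verifiedTxnCache is a toy stand-in for the node-wide cache described above.
type verifiedTxnCache struct {
	mu       sync.Mutex
	verified map[string]bool
}

func (c *verifiedTxnCache) seen(txid string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.verified[txid]
}

func (c *verifiedTxnCache) record(txid string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.verified[txid] = true
}

// verify stands in for the expensive signature check. It consults the cache
// first and records the result afterwards, regardless of whether the
// transaction pool later accepts or drops the transaction.
func verify(c *verifiedTxnCache, txid string) {
	if c.seen(txid) {
		fmt.Println(txid, "already verified, skipping the signature check")
		return
	}
	// ... signature verification would happen here ...
	c.record(txid)
	fmt.Println(txid, "verified and cached")
}

func main() {
	cache := &verifiedTxnCache{verified: make(map[string]bool)}
	verify(cache, "txn-1") // gossiped while the pool is full: verified, cached, then dropped
	verify(cache, "txn-1") // same txn seen again in a block or second proposal: cache hit
}
```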

Test Plan

Unit tests were added and updated.

Performance Testing

The changes were tested using scenario1 and scenario2 networks; no regression was noted.

@ian-algorand added this to the Sprint 15 milestone Dec 11, 2020
r := rand.Intn(numAccs)
a := rand.Intn(1000)
f := config.Consensus[protocol.ConsensusCurrentVersion].MinTxnFee + uint64(rand.Intn(10))
f := config.Consensus[protocol.ConsensusCurrentVersion].MinTxnFee + uint64(rand.Intn(10)) + u
Contributor (Author):

This was done to ensure that we don't end up with identical transactions when generating a large number of txns.

@tsachiherman changed the title from "Implement node-wide transaction verification cache" to "Create a unified transaction verification cache" Dec 16, 2020
@tsachiherman marked this pull request as ready for review December 16, 2020 16:37
@tsachiherman self-assigned this Dec 16, 2020
@tsachiherman requested a review from a user December 16, 2020 19:28
@algorandskiy (Contributor) left a comment:

Some initial minor remarks. I need to take another look later.

}
groupCtxs := make([]*GroupContext, len(txnGroups))
for i, signTxnsGrp := range txnGroups {
	groupCtxs[i], grpErr = TxnGroup(signTxnsGrp, blkHeader, nil)
Contributor:

Why is the cache param nil here? AddPayset is used only here, so... maybe let TxnGroup add a group?
I do not think a few additional locks would make a difference there.

Contributor (Author):

Many transaction groups will be of size 1, and I think that we shouldn't take the lock if we don't have to.
After all, taking the lock takes 3000-5000 ns; multiply this by 10000 and you'll end up with a notable delay.
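To illustrate the trade-off being defended here, a self-contained sketch with stub types (signedTxn, groupContext, verifiedCache, verifyGroup and addPayset below are illustrative, not the real TxnGroup/AddPayset signatures): each group is verified without touching the cache, and the whole payset is then added under a single lock acquisition rather than one lock per group, which is what the 3000-5000 ns × 10000 arithmetic above is about.

```go
package main

import (
	"fmt"
	"sync"
)

type signedTxn struct{ id string }
type groupContext struct{ size int }

type verifiedCache struct {
	mu      sync.Mutex
	entries map[string]*groupContext
}

// addPayset records every verified group under one lock acquisition.
func (c *verifiedCache) addPayset(groups [][]signedTxn, ctxs []*groupContext) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for i, grp := range groups {
		for _, txn := range grp {
			c.entries[txn.id] = ctxs[i]
		}
	}
}

// verifyGroup stands in for per-group signature verification; it deliberately
// does not touch the cache, so no lock is taken on this hot path.
func verifyGroup(grp []signedTxn) (*groupContext, error) {
	return &groupContext{size: len(grp)}, nil
}

func main() {
	payset := [][]signedTxn{{{id: "a"}}, {{id: "b"}, {id: "c"}}}
	ctxs := make([]*groupContext, len(payset))
	for i, grp := range payset {
		ctx, err := verifyGroup(grp)
		if err != nil {
			panic(err)
		}
		ctxs[i] = ctx
	}

	cache := &verifiedCache{entries: make(map[string]*groupContext)}
	cache.addPayset(payset, ctxs) // one lock for the entire payset
	fmt.Println("cached entries:", len(cache.entries))
}
```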

@algorandskiy (Contributor) left a comment:

As Gerrit says, "+1, Looks good to me, but someone else must approve".

if len(v.buckets[v.base])+len(txgroup) > entriesPerBucket {
	// move to the next bucket while deleting the content of the next bucket.
	v.base = (v.base + 1) % len(v.buckets)
	v.buckets[v.base] = make(map[transactions.Txid]*GroupContext, entriesPerBucket)
Contributor:

Probably it would be better to pre-allocate to max(entriesPerBucket, len(txgroup)).

Contributor (Author):

The maximum number of transactions in a group is 16, while entriesPerBucket is on the order of several thousand.
When allocating a new bucket, we want to have large buckets, and have each bucket contain all the transactions of a single txn group.
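A self-contained sketch of that sizing argument (toy types; entriesPerBucket is tiny here so the rotation is visible, whereas the real value is several thousand): a whole group always lands in a single bucket, and every bucket is pre-allocated at entriesPerBucket, so max(entriesPerBucket, len(txgroup)) would never differ from entriesPerBucket given the 16-transaction group limit.

```go
package main

import "fmt"

const entriesPerBucket = 4 // illustrative; the real value is several thousand

type cache struct {
	buckets []map[string]int // txid -> group index, a stand-in for *GroupContext
	base    int
}

func (c *cache) add(txgroup []string, groupIdx int) {
	// If the group would overflow the current bucket, rotate to the next one
	// and reset it, as in the quoted snippet; the whole group then lands in
	// that single, freshly pre-allocated bucket.
	if len(c.buckets[c.base])+len(txgroup) > entriesPerBucket {
		c.base = (c.base + 1) % len(c.buckets)
		c.buckets[c.base] = make(map[string]int, entriesPerBucket)
	}
	for _, txid := range txgroup {
		c.buckets[c.base][txid] = groupIdx
	}
}

func main() {
	c := &cache{buckets: make([]map[string]int, 3)}
	for i := range c.buckets {
		c.buckets[i] = make(map[string]int, entriesPerBucket)
	}
	c.add([]string{"a", "b", "c"}, 0)
	c.add([]string{"d", "e"}, 1) // 3+2 > 4: rotates to the next bucket
	for i, b := range c.buckets {
		fmt.Printf("bucket %d: %d entries (base=%d)\n", i, len(b), c.base)
	}
}
```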

	}
}
if !found {
	transcationMissing = true
Contributor:

break? since we're going to error anyway

Contributor (Author):

Ahh yes.. Failing to pin a transaction within a group (or part of it) isn't a good thing, but it shouldn't prevent us from pinning the rest of the entries (i.e., in the worst-case scenario, we would need to verify the signature again for that particular transaction).
The caller should log this, but there is nothing that can really be done at that point (and it's not really harmful either).
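A self-contained sketch of that behavior, with illustrative names (the real cache keys on transactions.Txid and moves entries into a pinned map): a missing transaction is noted and skipped rather than aborting the loop, so the rest of the group still gets pinned, and the caller receives an error it can log at the end.

```go
package main

import (
	"errors"
	"fmt"
)

var errMissingPinnedEntry = errors.New("missing pinned entry")

type cache struct {
	verified map[string]bool // stand-in for the verification buckets
	pinned   map[string]bool
}

func (c *cache) pin(txgroup []string) error {
	missing := false
	for _, txid := range txgroup {
		if !c.verified[txid] {
			// Don't break: in the worst case this one transaction gets its
			// signature re-verified later; the rest can still be pinned.
			missing = true
			continue
		}
		c.pinned[txid] = true
	}
	if missing {
		return errMissingPinnedEntry
	}
	return nil
}

func main() {
	c := &cache{verified: map[string]bool{"a": true, "c": true}, pinned: map[string]bool{}}
	if err := c.pin([]string{"a", "b", "c"}); err != nil {
		fmt.Println("pin:", err) // "b" was missing, but "a" and "c" are still pinned
	}
	fmt.Println("pinned entries:", len(c.pinned))
}
```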

// we use the (base + W) % W trick here so we can go backward and wrap around the zero.
for offsetBucketIdx := baseBucket + len(v.buckets); offsetBucketIdx > baseBucket; offsetBucketIdx-- {
	bucketIdx := offsetBucketIdx % len(v.buckets)
	if ctx, has := v.buckets[bucketIdx][txID]; has {
Contributor:

nit: we might stop earlier if we track how many buckets are in use. Maybe not a big deal; it will only help on a non-full cache.

Contributor (Author):

I think that after the first cycle, all the buckets will be in use (although they might contain "old" entries).
My intent here was to try and avoid deleting the old map entries.
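For completeness, a self-contained version of the quoted loop (toy map values instead of *GroupContext): the scan starts at the newest bucket (base) and walks backwards to the oldest, wrapping around zero via the (base + W) % W trick, stopping at the first hit; once the cache has gone through one full cycle every bucket is populated, so tracking the number of buckets in use would rarely help.

```go
package main

import "fmt"

// lookup scans the buckets from newest (base) to oldest, wrapping around zero.
func lookup(buckets []map[string]int, base int, txid string) (int, bool) {
	for offsetBucketIdx := base + len(buckets); offsetBucketIdx > base; offsetBucketIdx-- {
		bucketIdx := offsetBucketIdx % len(buckets)
		if v, has := buckets[bucketIdx][txid]; has {
			return v, true
		}
	}
	return 0, false
}

func main() {
	// base = 2, so bucket 2 is the newest and gets scanned first.
	buckets := []map[string]int{
		{"oldest": 1},
		{"older": 2},
		{"recent": 3},
	}
	if v, ok := lookup(buckets, 2, "recent"); ok {
		fmt.Println("found in the newest bucket:", v)
	}
	_, ok := lookup(buckets, 2, "missing")
	fmt.Println("found a missing txid:", ok)
}
```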

@algonautshant (Contributor) left a comment:

Looks great.
I have some clarification questions.

// LogicSigSanityCheck checks that the signature is valid and that the program is basically well formed.
// It does not evaluate the logic.
func LogicSigSanityCheck(txn *transactions.SignedTxn, ctx *Context) error {
func LogicSigSanityCheck(txn *transactions.SignedTxn, groupIndex int, groupCtx *GroupContext) error {
Contributor:

Do we have a test for this function?

// errMissingPinnedEntry is being generated when we're trying to pin a transaction that does not appear in the cache
var errMissingPinnedEntry = &VerifiedTxnCacheError{errors.New("Missing pinned entry")}

// VerifiedTransactionCache provides a cached store of recently verified transactions. The cache is desiged two have two separate "levels". On the
Contributor:

typo: designed two have -> designed to have

// entry isn't in pinned; maybe we have it in one of the buckets ?
found := false
// we use the (base + W) % W trick here so we can go backward and wrap around the zero.
for offsetBucketIdx := v.base + len(v.buckets); offsetBucketIdx > v.base; offsetBucketIdx-- {
Contributor:

Most of the buckets are expected to be non-empty most of the time, right?

Contributor (Author):

It depends on the usage. Proposal validation would cause full buckets, while transactions gossiped into the txpool would first go into the buckets and then be moved into the pinned map.
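Putting the two usages together, a sketch of the two "levels" as a struct (field names follow the quoted snippets; the element type is simplified from *GroupContext and the package name is illustrative):

```go
package verifysketch

// verifiedTransactionCache sketches the cache's two levels as described above.
type verifiedTransactionCache struct {
	// buckets hold recently verified transactions (e.g. those coming from
	// proposal validation); the bucket at index base is the one currently
	// being filled, and older buckets get recycled as new ones are needed.
	buckets []map[string]int
	base    int

	// pinned holds transactions that made it into the transaction pool; they
	// are moved here from the buckets and survive bucket recycling.
	pinned map[string]int
}
```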

@tsachiherman merged commit 3af8232 into algorand:master Dec 21, 2020
@tsachiherman deleted the tsachi/txn_cache branch December 21, 2020 19:10
tsachiherman added a commit to tsachiherman/go-algorand that referenced this pull request Jul 7, 2021
Create a unified transaction verification cache