-
Notifications
You must be signed in to change notification settings - Fork 2.2k
backport: graph: fix various races #10433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v0.20.x-branch
Are you sure you want to change the base?
backport: graph: fix various races #10433
Conversation
The PubKey method had a race condition where concurrent calls could all pass the nil check and race to write to the cached pubKey field. This is a classic check-then-act race. Remove the caching entirely to fix the race. The overhead of parsing a public key is minimal and doesn't justify the added complexity and race risk of caching.
Summary of ChangesHello @ellemouton, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request backports critical fixes for several race conditions identified within the channel graph database. The changes primarily focus on enhancing thread safety by eliminating problematic caching patterns in public key parsing methods and introducing proper mutex protection for shared cache resources during block disconnection events. This ensures the integrity and consistency of the graph data under concurrent operations. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a backport that fixes several race conditions in the channel graph database. The changes remove racy check-then-act caching logic for public keys in Node and ChannelEdgeInfo models by parsing them from bytes on each call. Additionally, it adds proper mutex protection for cache access in DisconnectBlockAtHeight to prevent data races. The fixes are correct and improve the concurrency safety of the graph database. The release note update is also accurate.
Both NodeKey1 and NodeKey2 methods had the same race condition as the Node.PubKey method, where concurrent calls could race to write to the cached fields. Remove the caching for the same reasons: parsing overhead is minimal and doesn't justify the complexity and race risk.
The DisconnectBlockAtHeight method was modifying the rejectCache and chanCache without holding the cacheMu lock. This caused races with other operations that properly held the lock, such as AddChannelEdge which modifies the caches in its OnCommit callback while the batch scheduler holds cacheMu. Fix by acquiring cacheMu before removing channels from the caches.
0ddd1d1 to
7f09fd8
Compare
backport of #10420