Add Annotation to Stored Triples #10410

Open
wants to merge 22 commits into
base: latest-txt2kg
from

Conversation

@nv-rliu nv-rliu commented Aug 11, 2025

Closes https://github.com/rapidsai/graph_dl/issues/820

This PR adds support for storing the model used for generating triplets in the txt2kg_rag workflow.
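For illustration, one way to attach the generating model to the saved triples is to wrap the list in a small dictionary. This is a minimal sketch, not the code from this PR: the key names `model` and `triples` and both helper functions are hypothetical, and the fallback branch assumes that older files hold a bare list of triples.

```python
from typing import Any, Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


def annotate_triples(triples: List[Triple], model_name: str) -> Dict[str, Any]:
    """Wrap the triples with the name of the model that generated them.

    The "model" and "triples" keys are assumptions made for this sketch.
    """
    return {"model": model_name, "triples": list(triples)}


def unpack_saved(saved: Any) -> Tuple[List[Triple], Optional[str]]:
    """Return (triples, model_name) for both annotated and bare-list data."""
    if isinstance(saved, dict):
        return saved["triples"], saved.get("model")
    # Assumed legacy layout: a bare list of triples with no annotation.
    return list(saved), None


annotated = annotate_triples([("Paris", "capital_of", "France")], "example-llm")
triples, model_name = unpack_saved(annotated)
```

The resulting dictionary could then be passed to `torch.save` in place of the bare list, so that a later load can tell which model produced the triples.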

@nv-rliu nv-rliu changed the base branch from master to latest-txt2kg August 11, 2025 18:48
Contributor

@Kh4L Kh4L left a comment

LGTM overall, left a small comment

@@ -322,8 +322,15 @@ def index_kg(args, context_docs):
     checkpoint_path = os.path.join(args.dataset, "checkpoint_kg.pt")
     if os.path.exists(checkpoint_path):
         print("Restoring KG from checkpoint...")
-        saved_relevant_triples = torch.load(checkpoint_path,
-                                            weights_only=False)
Contributor

Why did you remove weights_only=False?

I think you should leave that arg in.

Author

I thought weights_only was only useful if we want to extract model weights and biases, i.e., tensors. Otherwise, the argument isn't really useful in this situation, is it? I can add it back if that's incorrect.

Contributor

AFAIK that's the case when you set it to True. Here we want to load items that are not tensors (weights and biases), so we need to set it to False; otherwise PyTorch might complain.

Author

Ah okay. I thought it was okay to not use it since this Data object is never storing weights and biases. I'll add that back in.
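As background for the thread above: per the PyTorch documentation, `weights_only=True` restricts the unpickler to tensors, primitive types, dictionaries and explicitly allowlisted types. The sketch below shows the same idea using only the standard library's `pickle` module. It is an analogy, not PyTorch's implementation, and `ALLOWED_GLOBALS`, `RestrictedUnpickler` and `restricted_loads` are hypothetical names.

```python
import io
import pickle
from datetime import date

# Hypothetical allowlist: only these (module, name) globals may be resolved.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve classes outside the allowlist."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")


def restricted_loads(payload: bytes):
    """Deserialize `payload` with the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(payload)).load()


# Plain containers need no class lookup, so the restricted loader accepts them.
plain = pickle.dumps({"triples": [("a", "rel", "b")]})
print(restricted_loads(plain))

# An arbitrary object (here a datetime.date) requires a class lookup.
custom = pickle.dumps({"created": date(2025, 8, 11)})
try:
    restricted_loads(custom)
except pickle.UnpicklingError as err:
    print("blocked:", err)

# An unrestricted load accepts it, which is only appropriate for trusted files.
print(pickle.loads(custom))
```

By analogy, a checkpoint that stores a custom object rather than plain tensors needs either the unrestricted path (`weights_only=False`) or an explicit allowlist entry, and the unrestricted path should be reserved for files from a trusted source.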

@@ -409,12 +418,17 @@ def make_dataset(args):
     triples = []
     raw_triples_path = os.path.join(args.dataset, "raw_triples.pt")
     if os.path.exists(raw_triples_path):
-        triples = torch.load(raw_triples_path, weights_only=False)
+        saved_data = torch.load(raw_triples_path)
Contributor

Please save/load with the format {llm-name}{datetime}raw_triples.pt

Author

changed format
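A file name in the requested shape could be built roughly as follows. This is a sketch under stated assumptions, not the code that was merged: the underscore separators, the `%Y%m%d_%H%M%S` timestamp layout, the sanitizing of the model name and the helper name `raw_triples_path` are all hypothetical choices.

```python
import os
import re
from datetime import datetime
from typing import Optional


def raw_triples_path(dataset_dir: str, llm_name: str,
                     when: Optional[datetime] = None) -> str:
    """Build a path of the form <llm-name>_<datetime>_raw_triples.pt."""
    when = when or datetime.now()
    # Model names often contain "/" or ":", which are unsafe in file names.
    safe_name = re.sub(r"[^A-Za-z0-9._-]+", "-", llm_name)
    # Assumed timestamp layout; it sorts chronologically as plain text.
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return os.path.join(dataset_dir, f"{safe_name}_{stamp}_raw_triples.pt")


example_path = raw_triples_path("data", "org/model-v1",
                                datetime(2025, 8, 12, 9, 30, 0))
```

Because the timestamp is zero-padded and ordered from year to second, sorting the file names for one model also sorts them by creation time, which makes picking the most recent file straightforward.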

@puririshi98
Contributor

Also, please make CI green: check the failures and fix them.

codecov bot commented Aug 12, 2025

Codecov Report

❌ Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (latest-txt2kg@c5f7c9b). Learn more about missing BASE report.

Files with missing lines            Patch %   Lines
torch_geometric/nn/nlp/txt2kg.py    50.00%    2 Missing ⚠️

❌ Your patch check has failed because the patch coverage (50.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@               Coverage Diff                @@
##             latest-txt2kg   #10410   +/-   ##
================================================
  Coverage                 ?   85.08%           
================================================
  Files                    ?      507           
  Lines                    ?    35456           
  Branches                 ?        0           
================================================
  Hits                     ?    30166           
  Misses                   ?     5290           
  Partials                 ?        0           

@nv-rliu
Author

nv-rliu commented Aug 12, 2025

It looks like CI is failing here but I'm not sure why. The files I touched shouldn't affect the tests, right?

@Kh4L
Contributor

Kh4L commented Aug 12, 2025

> It looks like CI is failing here but I'm not sure why. The files I touched shouldn't affect the tests, right?

Yeah, I think we're safe skipping it.
