Conversation

Collaborator

@elbruno elbruno commented Jun 4, 2025

This pull request adds new projects for video generation and PDF file processing, updates the solution file to include them, and implements a program that extracts structured data from PDFs using Semantic Kernel with an OpenAI model. The most important changes are summarized below:

Solution Updates

  • Added new projects to the solution file, including 11 Video Generation and 12 Files, along with their respective project files (VideoGeneration-AzureSora-01.csproj and OpenAI-FileProcessing-Pdf-01.csproj).
  • Updated the solution configuration to include build configurations (Debug and Release) for the newly added projects.
  • Added project dependencies and GUID mappings for the new projects in the solution file.

New PDF File Processing Project

  • Created a new project OpenAI-FileProcessing-Pdf-01 targeting .NET 9.0, with dependencies on Microsoft.Extensions.Configuration.UserSecrets and Microsoft.SemanticKernel.
  • Implemented a program in Program.cs to process real estate contracts in PDF format. The program uses OpenAI's GPT model to extract structured data (e.g., seller, buyer, property details) and outputs it in JSON format.

github-actions bot commented Jun 4, 2025

👋 Thanks for contributing @elbruno! We will review the pull request and get back to you soon.

@elbruno elbruno requested a review from Copilot June 4, 2025 13:31
@elbruno elbruno merged commit 0bb3f4e into main Jun 4, 2025
1 check passed
@elbruno elbruno deleted the bruno-openai-pdfsample branch June 4, 2025 13:32
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds a new PDF file processing sample project that uses Semantic Kernel with an OpenAI model, and updates the solution to include it.

  • Introduces OpenAI-FileProcessing-Pdf-01 project to extract structured contract data from PDFs.
  • Updates the solution file to include the new PDF project.
  • Provides a sample Program.cs that uses top-level await and a data model for deserializing contract details.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

File | Description
03-CoreGenerativeAITechniques/src/OpenAI-FileProcessing-Pdf-01/Program.cs | Implements PDF ingestion, chat history setup, and JSON deserialization into Contract model
03-CoreGenerativeAITechniques/src/OpenAI-FileProcessing-Pdf-01/OpenAI-FileProcessing-Pdf-01.csproj | Defines the new .NET 9.0 console project and dependencies
03-CoreGenerativeAITechniques/src/CoreGenerativeAITechniques.sln | Adds solution entries for the new PDF project

Comments suppressed due to low confidence (1)

03-CoreGenerativeAITechniques/src/OpenAI-FileProcessing-Pdf-01/Program.cs:53

  • Consider adding unit or integration tests around this PDF→chat pipeline to validate JSON output (e.g., mock the GetChatMessageContentAsync call and verify Contract deserialization).
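
One way such a test could look, as a minimal sketch rather than the project's actual test code: the chat call is replaced by a delegate that returns canned JSON, so the deserialization step can be checked without calling a model. The `ExtractContractAsync` helper, the `fakeChat` delegate, and the `Contract` fields shown here are hypothetical; the real `Contract` model in Program.cs may have different properties.

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical stand-in for the chat call: returns canned JSON instead of calling a model.
Func<string, Task<string?>> fakeChat = _ => Task.FromResult<string?>(
    """{"Seller":"Alice Doe","Buyer":"Bob Roe","PropertyAddress":"1 Main St"}""");

var contract = await ExtractContractAsync(fakeChat, "Extract the contract fields as JSON.");

// Plain checks so the sketch runs without a test framework.
if (contract.Seller != "Alice Doe") throw new Exception("Seller was not deserialized.");
if (contract.Buyer != "Bob Roe") throw new Exception("Buyer was not deserialized.");
if (contract.PropertyAddress != "1 Main St") throw new Exception("PropertyAddress was not deserialized.");
Console.WriteLine($"{contract.Seller} -> {contract.Buyer}");

// Hypothetical helper: sends the prompt through the supplied chat delegate
// and deserializes the returned JSON into a Contract.
static async Task<Contract> ExtractContractAsync(Func<string, Task<string?>> chat, string prompt)
{
    var json = await chat(prompt);
    if (string.IsNullOrWhiteSpace(json))
        throw new InvalidOperationException("The chat call returned no content.");
    return JsonSerializer.Deserialize<Contract>(json)
        ?? throw new JsonException("The JSON payload deserialized to null.");
}

// Hypothetical model; field names are assumptions for illustration only.
record Contract(string Seller, string Buyer, string PropertyAddress);
```

Passing the chat call in as a delegate keeps the sketch independent of any mocking library; in a real test project the same idea would apply to a mocked IChatCompletionService.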
var response = await chatService.GetChatMessageContentAsync(history, executionSettings);

Console.WriteLine(response.Content);
Console.WriteLine("---");

var contract = JsonSerializer.Deserialize<Contract>(response.ToString());
Copilot AI Jun 4, 2025

You’re passing response.ToString() into the deserializer, which may not be the raw JSON. Use response.Content instead to deserialize the actual payload.

Suggested change
- var contract = JsonSerializer.Deserialize<Contract>(response.ToString());
+ var contract = JsonSerializer.Deserialize<Contract>(response.Content);
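
Whichever property is used, the model's reply may be empty or not valid JSON, so a guarded deserialization can help. The following is a minimal sketch under stated assumptions, not the sample's actual code: `content` stands in for `response.Content` (assumed here to be a nullable string), and the `TryParseContract` helper and `ContractSummary` record are hypothetical names introduced for illustration.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;
using System.Text.Json;

// Stand-in for response.Content; assumed to be a nullable string.
string? content = """{"seller":"Alice Doe","buyer":"Bob Roe"}""";

// Models do not always match the C# property casing, so compare names case-insensitively.
var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

if (TryParseContract(content, options, out var parsed))
    Console.WriteLine($"Seller: {parsed.Seller}, Buyer: {parsed.Buyer}");
else
    Console.WriteLine("Model output was not valid contract JSON.");

// Hypothetical helper: returns false for null, blank, or malformed JSON instead of throwing.
static bool TryParseContract(
    string? json,
    JsonSerializerOptions options,
    [NotNullWhen(true)] out ContractSummary? result)
{
    result = null;
    if (string.IsNullOrWhiteSpace(json)) return false;
    try
    {
        result = JsonSerializer.Deserialize<ContractSummary>(json, options);
        return result is not null;
    }
    catch (JsonException)
    {
        return false;
    }
}

// Hypothetical model; the real Contract type in Program.cs may differ.
record ContractSummary(string Seller, string Buyer);
```

The try-pattern keeps the calling code free of exception handling; a sample that prefers to fail fast could let the JsonException propagate instead.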

Copilot uses AI. Check for mistakes.
var kernel = builder.Build();
var chatService = kernel.GetRequiredService<IChatCompletionService>();

var filePath = Path.Combine(Directory.GetCurrentDirectory(), "docs", "real-state-contract-1.pdf");
Copilot AI Jun 4, 2025

[nitpick] The filename uses "real-state" but the domain term is "real-estate"; consider renaming the file and references to real-estate-contract-1.pdf for clarity.

Suggested change
- var filePath = Path.Combine(Directory.GetCurrentDirectory(), "docs", "real-state-contract-1.pdf");
+ var filePath = Path.Combine(Directory.GetCurrentDirectory(), "docs", "real-estate-contract-1.pdf");

Comment on lines +88 to +89
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "12 Files", "12 Files", "{E224737A-CFFB-4292-8F73-A543A0387938}"
EndProject
Copilot AI Jun 4, 2025

This solution entry references a "12 Files" folder without a project file; it won’t build. Remove or update this stub entry to avoid confusion.

Suggested change
- Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "12 Files", "12 Files", "{E224737A-CFFB-4292-8F73-A543A0387938}"
- EndProject
