Add blob_storage_total_files and blob_storage_file_index metadata to the Azure Blob Storage input #89
This PR adds two new metadata fields to the Azure Blob Storage input:
blob_storage_total_files: The total number of files in the Azure Blob Storage container.
blob_storage_file_index: The current file index being processed.
These new metadata fields provide users with additional context about the progress of file processing in their Azure Blob Storage input.
Changes:
Added totalFiles and currentIndex fields to the azureBlobStorage struct.
Modified the Connect method to count the total number of files.
Updated the blobStorageMetaToBatch function to include the new metadata fields.
Incremented the currentIndex after processing each file in the ReadBatch method.
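The counting logic described in the list above can be sketched roughly as follows. This is a simplified illustration, not the code from the PR: the `blobTracker` type and its `connect` and `next` methods are hypothetical stand-ins for the azureBlobStorage struct, Connect, and ReadBatch, and the index is assumed to start at zero.

```go
package main

import "fmt"

// blobTracker is a hypothetical stand-in for the two counters
// added to the azureBlobStorage struct.
type blobTracker struct {
	totalFiles   int // set once when connecting
	currentIndex int // advanced after each file is processed
}

// connect mimics the counting step in Connect: it records how many
// files were listed and resets the index (assumed zero-based here).
func (b *blobTracker) connect(names []string) {
	b.totalFiles = len(names)
	b.currentIndex = 0
}

// next returns the metadata for the file being processed and then
// increments the index, as ReadBatch is described as doing.
func (b *blobTracker) next() map[string]int {
	meta := map[string]int{
		"blob_storage_total_files": b.totalFiles,
		"blob_storage_file_index":  b.currentIndex,
	}
	b.currentIndex++
	return meta
}

func main() {
	files := []string{"a.json", "b.json", "c.json"}
	tracker := &blobTracker{}
	tracker.connect(files)
	for range files {
		meta := tracker.next()
		fmt.Println(meta["blob_storage_file_index"], "of", meta["blob_storage_total_files"])
	}
}
```

Because the total is counted once at connect time, it stays constant even if blobs are added to the container while processing is under way.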
These changes will help users track the progress of their Azure Blob Storage input processing, especially when dealing with large numbers of files. The new metadata can be used for logging, monitoring, or implementing custom logic based on the processing progress.
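As one example of such custom logic, the two values could be combined into a progress percentage for logging or monitoring. The `progressPercent` helper below is hypothetical and not part of this PR, and it assumes blob_storage_file_index is zero-based.

```go
package main

import "fmt"

// progressPercent is a hypothetical helper that turns the two metadata
// values into a completion percentage. It assumes fileIndex is zero-based,
// so the file at index 0 counts as the first of totalFiles.
func progressPercent(fileIndex, totalFiles int) float64 {
	if totalFiles <= 0 {
		return 0
	}
	done := fileIndex + 1
	if done < 0 {
		done = 0
	}
	if done > totalFiles {
		done = totalFiles
	}
	return float64(done) / float64(totalFiles) * 100
}

func main() {
	// Fifth file (index 4) out of 20.
	fmt.Printf("%.1f%%\n", progressPercent(4, 20)) // prints "25.0%"
}
```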
Testing:
Tested the new metadata fields with various file counts in Azure Blob Storage containers.
Verified that blob_storage_total_files remains constant throughout processing.
Confirmed that blob_storage_file_index increments correctly for each processed file.
Please review and let me know if any further changes or clarifications are needed.