In this lab, you will create a pipeline with the Business Process Automation (BPA) Accelerator and use it to generate JSON output in Azure Cosmos DB. You will then create a search indexer over this output and use the Sample Search Application in the BPA accelerator to search on specific aspects of the document.
- Process a sample invoice document with the Form Recognizer Pretrained Invoice Model through the BPA accelerator
- Add an element that converts the invoice output to a simpler format
- Run the pipeline with a sample document and create a Search indexer over the simplified output
- Use the Sample Search Application in the accelerator to search on specific areas of the invoice
- Use a pipeline to create a table index and use that index in the Sample Search Application
- The accelerator is deployed and ready in the resource group
- You have access to the sample invoices folder with the invoices to upload
-
Launch the accelerator from the resource group in the Static Web App
- To do this, go to portal.azure.com (Azure Portal) in a web browser and click on the resource group created for this lab. You should see the resources deployed as part of the Business Process Automation accelerator deployment.
Note: The resource and resource group names in your lab will differ from those shown in the screenshots.
- Look for the resource whose type is Static Web App. This is what we will use in lab 1. Click on the Web App.
-
This is the home page of the Accelerator. Click on Configure a new pipeline
-
Select Convert the Invoice Output to a Simpler Format in the Pipeline Preview page
-
Scroll down the page if needed and click Done. This step creates the pipeline.
-
You should see the newly created pipeline on the page that loads next.
-
The next step is to ingest documents into this pipeline. Click Home and select Ingest Documents.
-
First select the pipeline you just created from the dropdown, then drop documents from the Sample Invoices folder. Use the Sample 7 folder and drop a few documents from there. You may see a prompt that some active documents are being processed by the pipeline.
-
The results can be viewed in Azure Cosmos DB Data Explorer. To view the results, go to portal.azure.com (Azure Portal) again in your browser and get to the resource group like we did earlier in Step 1. There, in the resource group, click on the resource that is of type Azure Cosmos DB account
Click on one of the items. It represents the pipeline output for one of the uploaded documents. Because we added the Convert the Invoice Output to a Simpler Format element to the pipeline, the output is simplified so that we can create an indexer with the Search Service.
Scroll through the results and check the output and compare with the invoice uploaded.
-
Next, open the Search Service. Go to portal.azure.com (Azure Portal) again in your browser and navigate to the resource group as in Step 1. In the resource group, click on the resource of type Search Service.
-
Provide a name for the data source and click Choose an existing connection for Connection String. The Azure Cosmos DB resource that was set up as part of the BPA accelerator will be one of the sources you can choose from.
-
Keep the default for Managed identity Authentication, which is None. For Databases and Collection use the dropdown to select the same name as the Cosmos DB you selected at step 15.
-
Under Query, enter the following query. The pipeline value should match the pipeline name you used in step 3.
SELECT * from c WHERE c.id != 'pipelines' AND c.id != 'cogsearch' AND c.pipeline = 'lab1pipeline' AND c._ts >= @HighWaterMark ORDER by c._ts
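If your pipeline has a different name, only the `c.pipeline` value in the query changes. The following is a minimal Python sketch of that substitution. The helper name `build_indexer_query` and the restriction of pipeline names to letters, digits, `_`, and `-` are assumptions made for illustration, not part of the accelerator.

```python
import re

# Query text copied from the step above, with the pipeline name as a placeholder.
QUERY_TEMPLATE = (
    "SELECT * from c WHERE c.id != 'pipelines' AND c.id != 'cogsearch' "
    "AND c.pipeline = '{pipeline}' AND c._ts >= @HighWaterMark ORDER by c._ts"
)


def build_indexer_query(pipeline_name: str) -> str:
    """Return the indexer query for the given pipeline name.

    The allowed character set is an assumption; it simply keeps quotes
    and spaces out of the generated query text.
    """
    if not re.fullmatch(r"[A-Za-z0-9_-]+", pipeline_name):
        raise ValueError(f"unexpected characters in pipeline name: {pipeline_name!r}")
    return QUERY_TEMPLATE.format(pipeline=pipeline_name)


if __name__ == "__main__":
    # Paste the printed text into the Query box of the import wizard.
    print(build_indexer_query("lab1pipeline"))
```

The `@HighWaterMark` placeholder is left untouched because the indexer fills it in at run time.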
-
Click Next: Add cognitive skills (Optional). This validates and creates the index schema.
-
On the next screen (Add cognitive skills (Optional)), click Skip to: Customize Target Index.
-
On the next screen, under aggregatedResults, click the ... next to invoice and click Delete. You can delete resultindexes in the same way.
-
Under aggregatedResults -> simplifyInvoice, mark customerName, invoiceId, invoicedate and dueDate as filterable and sortable.
-
Similarly, under aggregatedResults-> items, select all fields to be filterable and sortable.
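For orientation, the field selection for `simplifyInvoice` corresponds roughly to an index definition like the one sketched below. This is only an illustration in the Azure Cognitive Search REST field format: the `Edm.String` type for every leaf field is an assumption (the wizard may infer other types, for example for dates), and the helper names are hypothetical. The `items` fields are not covered here.

```python
import json

# Field names taken from the step above; their types are assumed.
INVOICE_FIELDS = ["customerName", "invoiceId", "invoicedate", "dueDate"]


def leaf_field(name, filterable=False, sortable=False):
    """Describe one simple field. Edm.String is an assumed type."""
    return {
        "name": name,
        "type": "Edm.String",
        "searchable": True,
        "retrievable": True,
        "filterable": filterable,
        "sortable": sortable,
    }


def build_simplify_invoice_fields():
    """Nest the invoice fields under aggregatedResults -> simplifyInvoice."""
    simplify_invoice = {
        "name": "simplifyInvoice",
        "type": "Edm.ComplexType",
        "fields": [
            leaf_field(name, filterable=True, sortable=True)
            for name in INVOICE_FIELDS
        ],
    }
    return {
        "name": "aggregatedResults",
        "type": "Edm.ComplexType",
        "fields": [simplify_invoice],
    }


if __name__ == "__main__":
    print(json.dumps(build_simplify_invoice_fields(), indent=2))
```

Comparing this shape with the index JSON shown in the portal after the import can help confirm that the intended fields were marked.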
-
Provide a name for the Index and click on Next: Create an indexer
-
Provide a name for the indexer and click Submit
-
You will get a notification that the import is successfully configured
-
Now, go back to the accelerator URL that you retrieved in Step 1 and click on Sample Search Application.
-
You can now filter and search on items and other fields configured.
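Filtering and sorting on these fields is expressed in Azure Cognitive Search with OData expressions, where nested fields are addressed with `/`. The sketch below builds a search request body of that kind. Whether the Sample Search Application sends exactly this request is not something this lab states, so treat it as an assumption; the customer name `Contoso` and the helper names are hypothetical.

```python
def odata_string(value: str) -> str:
    """Quote a string literal for OData; embedded single quotes are doubled."""
    return "'" + value.replace("'", "''") + "'"


def build_search_body(search_text, customer_name=None, order_by_due_date=False):
    """Build a search request body using the fields marked filterable/sortable."""
    body = {"search": search_text}
    if customer_name is not None:
        # Works only because customerName was marked filterable in the index.
        body["filter"] = (
            "aggregatedResults/simplifyInvoice/customerName eq "
            + odata_string(customer_name)
        )
    if order_by_due_date:
        # Works only because dueDate was marked sortable in the index.
        body["orderby"] = "aggregatedResults/simplifyInvoice/dueDate asc"
    return body


if __name__ == "__main__":
    print(build_search_body("*", customer_name="Contoso", order_by_due_date=True))
```

A field that was not marked filterable or sortable in the index cannot be used in `filter` or `orderby`, which is why those attributes were selected earlier.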
We can extend this lab further by using the Form Recognizer Layout Service to see how information can be retrieved in the form of tables with Azure Cognitive Search.
-
Create a new pipeline using the layout service and extract information for table search. The steps are similar to Steps 1-8 in Part 1. Before you click Done at Step 7, the pipeline page should look like the screenshot below:
-
The next step is to ingest documents into the pipeline, as in steps 9-11 in Part 1, but using the pipeline created in this exercise.
-
Now configure the Search Service for table search, following the same procedure as steps 12-17. The query differs slightly from the one used in step 17: it also filters on the table type. Make sure the pipeline value is the name of the pipeline created for this part.
SELECT * from c WHERE c.id != 'pipelines' AND c.id != 'cogsearch' AND c.pipeline = 'lab1table' AND c.type = 'table' AND c._ts >= @HighWaterMark ORDER by c._ts
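To see what this query selects, the sketch below applies the same conditions to a few in-memory documents in plain Python. The sample documents, the `text` type value, and the function name are invented for illustration; real items in Cosmos DB carry many more properties.

```python
def select_for_indexer(docs, pipeline, high_water_mark, doc_type=None):
    """Mimic the WHERE and ORDER BY clauses of the indexer query."""
    excluded_ids = {"pipelines", "cogsearch"}
    selected = [
        d for d in docs
        if d["id"] not in excluded_ids
        and d.get("pipeline") == pipeline
        and (doc_type is None or d.get("type") == doc_type)
        and d["_ts"] >= high_water_mark
    ]
    # ORDER by c._ts: oldest changes first.
    return sorted(selected, key=lambda d: d["_ts"])


# Hypothetical documents, reduced to the properties the query looks at.
SAMPLE_DOCS = [
    {"id": "pipelines", "_ts": 100},
    {"id": "a", "pipeline": "lab1table", "type": "table", "_ts": 300},
    {"id": "b", "pipeline": "lab1table", "type": "text", "_ts": 200},
    {"id": "c", "pipeline": "lab1pipeline", "type": "table", "_ts": 250},
    {"id": "d", "pipeline": "lab1table", "type": "table", "_ts": 150},
]

if __name__ == "__main__":
    first_run = select_for_indexer(SAMPLE_DOCS, "lab1table", 0, "table")
    print([d["id"] for d in first_run])   # ['d', 'a']
    later_run = select_for_indexer(SAMPLE_DOCS, "lab1table", 200, "table")
    print([d["id"] for d in later_run])   # ['a']
```

The second call illustrates the role of `@HighWaterMark`: on later runs only documents whose `_ts` is at or beyond the last processed timestamp are picked up again.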
-
Follow steps 18-19 as before. When you get to the Customize target index section, give the index a name that identifies it as a table index, then make all fields Searchable and Retrievable, and make the table data and id fields Filterable and Facetable.
-
Follow steps 24-25 and once you get a notification after clicking Submit, you can follow step 27 to open the Sample Search Application
-
Here, select the index created as a part of this exercise and also enable Table Search
-
Explore this UI, e.g., the table search configuration, and filter and search on specific items to get more insights.