Although not required, it is helpful to be familiar with the medallion architecture used in a lakehouse. The layers in the architecture are referred to as bronze, silver, and gold, although your organization may use different terminology. You can learn more about the medallion architecture at https://docs.databricks.com/en/lakehouse/medallion.html.
This accelerator is self-contained.
- No third-party services are used. Everything, including your data, stays within your Databricks workspace.
- Although your workspace must be attached to a Unity Catalog metastore, this accelerator generates the catalog, schema, tables, and volumes for you (see the sketch after this list).
- This accelerator leverages a managed volume in Unity Catalog, so no external storage location needs to be defined.
- No cluster init scripts are used.
- All code is displayed in the notebooks.
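For reference, the object setup the accelerator performs can be approximated with a few Unity Catalog statements. This is a minimal sketch, assuming it runs in a Databricks notebook (where `spark` is predefined); the catalog, schema, and volume names are placeholders, and the accelerator's notebooks define their own.

```python
# Minimal sketch of the Unity Catalog objects the accelerator creates for you.
# Assumes a Databricks notebook, where `spark` is available. The names below
# (accelerator_catalog, accelerator_schema, raw_files) are placeholders.
catalog = "accelerator_catalog"
schema = "accelerator_schema"
volume = "raw_files"

spark.sql(f"CREATE CATALOG IF NOT EXISTS {catalog}")
spark.sql(f"CREATE SCHEMA IF NOT EXISTS {catalog}.{schema}")

# A managed volume keeps files under the metastore's managed storage location,
# so no external storage credential or external location is required.
spark.sql(f"CREATE VOLUME IF NOT EXISTS {catalog}.{schema}.{volume}")

# Files in the volume are accessed under the /Volumes path, e.g.:
print(f"/Volumes/{catalog}/{schema}/{volume}")
```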
- The workspace must be attached to a Unity Catalog metastore. For help on this, see https://docs.databricks.com/en/data-governance/unity-catalog/get-started.html.
- Serverless compute must be enabled.
- Personal access tokens must be enabled. For more information on this, see https://docs.databricks.com/en/administration-guide/access-control/tokens.html.
- The workspace must have access to Databricks Foundation Model APIs. For more information on this, see https://docs.databricks.com/en/machine-learning/foundation-models/index.html. A quick connectivity check is sketched after this list.
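As a quick sanity check for Foundation Model API access, you can send a small request to a pay-per-token serving endpoint from a notebook. This is a minimal sketch, assuming `mlflow` is installed (it ships with the ML runtime); the endpoint name is an assumption, so substitute any pay-per-token endpoint listed under Serving in your workspace.

```python
# Minimal sketch: confirm the workspace can reach a Foundation Model API endpoint.
# Assumes a Databricks notebook with mlflow available (included in the ML runtime).
# The endpoint name below is a placeholder; use one that exists in your workspace.
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
response = client.predict(
    endpoint="databricks-meta-llama-3-1-8b-instruct",  # placeholder endpoint name
    inputs={
        "messages": [{"role": "user", "content": "Reply with the word 'ok'."}],
        "max_tokens": 5,
    },
)
print(response["choices"][0]["message"]["content"])
```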
The Databricks workspace used to test this accelerator is in the West US 2 region.
- Compute policy: unrestricted
- Databricks Runtime: 14.3 LTS ML
- Worker type: Standard_DS3_v2
- Min workers: 1
- Max workers: 2
- Autoscaling: yes
- Photon acceleration: no
- Termination period: 10 minutes
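If you prefer to create this cluster programmatically rather than through the UI, the configuration above maps roughly onto the Databricks SDK for Python as sketched below. The `spark_version` string and the cluster name are assumptions, and omitting `policy_id` reflects the unrestricted policy used in our setup; adjust these for your workspace.

```python
# Minimal sketch of the test cluster configuration using the Databricks SDK for
# Python (pip install databricks-sdk). The spark_version string for 14.3 LTS ML
# is an assumption; confirm it with w.clusters.spark_versions() in your workspace.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import AutoScale

w = WorkspaceClient()  # picks up notebook or environment authentication

cluster = w.clusters.create(
    cluster_name="accelerator-test-cluster",   # placeholder name
    spark_version="14.3.x-cpu-ml-scala2.12",   # 14.3 LTS ML, no Photon
    node_type_id="Standard_DS3_v2",
    autoscale=AutoScale(min_workers=1, max_workers=2),
    autotermination_minutes=10,
    # No policy_id set, matching the unrestricted policy used for testing.
).result()

print(cluster.cluster_id)
```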
During the course of building out this accelerator, we made heavy use of the Lakehouse Optimizer (LHO) to analyze our compute performance and orchestration. Early on, based on back-of-the-napkin estimates, we had configured our compute for a minimum of 4 workers and a maximum of 8. However, after evaluating the CPU Process Load and Process Memory Load KPIs in LHO, we were able to scale our compute down without a noticeable impact on performance.