refactor: move more queries from tb to ch #2655

Merged 1 commit on Nov 12, 2024

Conversation

@chronark (Collaborator) commented Nov 12, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Added an outline field and updated the recommendations field in the evals table.
    • Added a new schema for managing billing rate limits and verifications in the ClickHouse database.
    • New function to retrieve billable rate limits for specified workspaces.
  • Improvements

    • Enhanced error handling and logging mechanisms in various tasks and functions.
    • Improved clarity and robustness in code structure and organization across multiple files.
  • Refactor

    • Reorganized import statements for better clarity and maintainability.
    • Streamlined existing functions and their logic without altering core functionalities.
  • Bug Fixes

    • Resolved issues related to data retrieval and logging in several components.

changeset-bot commented Nov 12, 2024

⚠️ No Changeset found

Latest commit: db1519c

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


vercel bot commented Nov 12, 2024

The latest updates on your projects:

Name         Status    Updated (UTC)
dashboard    ✅ Ready   Nov 12, 2024 6:15pm
engineering  ✅ Ready   Nov 12, 2024 6:15pm
play         ✅ Ready   Nov 12, 2024 6:15pm
workflows    ✅ Ready   Nov 12, 2024 6:15pm
www          ✅ Ready   Nov 12, 2024 6:15pm

github-actions bot commented Nov 12, 2024

Thank you for following the naming conventions for pull request titles! 🙏

coderabbitai bot commented Nov 12, 2024

📝 Walkthrough

This pull request spans multiple files across the billing app, the dashboard, and the internal ClickHouse package. The changes primarily reorder import statements for clarity, with several structural updates to database schemas, particularly in the evals, sections, and billing components. New fields have been added to the evals table, and a new ClickHouse schema for billable verifications and rate limits has been introduced. Several unused components, including the dashboard's Tinybird integration, have been removed as part of the cleanup.

Changes

  • apps/billing/src/lib/db-marketing/schemas/entries.ts: Reordered import statements; no functional changes.
  • apps/billing/src/lib/db-marketing/schemas/evals.ts: Reordered imports; added outline field and updated recommendations field in the evals table schema.
  • apps/billing/src/lib/db-marketing/schemas/firecrawl.ts: Reordered imports; no changes to the table structure or functionality.
  • apps/billing/src/lib/db-marketing/schemas/keywords.ts: Reordered imports; no changes to the table structure or functionality.
  • apps/billing/src/lib/db-marketing/schemas/searchQuery.ts: Reordered import statements; no functional changes.
  • apps/billing/src/lib/db-marketing/schemas/sections.ts: Added imports for int, mysqlEnum, and primaryKey; defined new sectionsToKeywords and sectionContentTypes tables.
  • apps/billing/src/lib/db-marketing/schemas/serper.ts: Reordered import statements; no changes to the table structure or functionality.
  • apps/billing/src/lib/firecrawl.ts: Streamlined imports; enhanced error handling in getOrCreateFirecrawlResponse and getOrCreateSummary functions.
  • apps/billing/src/lib/keywords.ts: Integrated zod for schema validation; updated functions to use validation schema.
  • apps/billing/src/lib/search-query.ts: Reordered import statements; no functional changes.
  • apps/billing/src/trigger/glossary/_generate-glossary-entry.ts: Reordered import statements; no functional changes.
  • apps/billing/src/trigger/glossary/create-pr.ts: Reordered import statements; no functional changes.
  • apps/billing/src/trigger/glossary/draft-sections.ts: Updated type imports and parameter types for clarity.
  • apps/billing/src/trigger/glossary/evals.ts: Reordered imports; no changes to core functionality.
  • apps/billing/src/trigger/glossary/generate-outline.ts: Reordered imports; enhanced error handling in generateOutlineTask.
  • apps/billing/src/trigger/glossary/keyword-research.ts: Enhanced keyword research capabilities by integrating additional data sources.
  • apps/billing/src/trigger/glossary/seo-meta-tags.ts: Reordered import statements; no functional changes.
  • apps/dashboard/app/(app)/apis/[apiId]/page.tsx: Renamed variable emptyOverTime to _emptyOverTime; removed console logs.
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/analytics/page.tsx: Removed interval parameter; simplified component functionality.
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/layout.tsx: Removed "Logs" navigation item; no other changes.
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/logs/page.tsx: Deleted component for displaying semantic cache logs.
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/logs/table.tsx: Deleted LogsTable component; removed associated logic.
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/logs/util.ts: Deleted utility function for calculating log intervals.
  • apps/dashboard/app/(app)/settings/billing/page.tsx: Updated method for fetching rate limits; no changes to overall logic.
  • apps/dashboard/lib/audit.ts: Introduced new schema for validating audit log data; updated insertAuditLogs function.
  • apps/dashboard/lib/tinybird.ts: Deleted file containing Tinybird API interactions.
  • apps/dashboard/lib/trpc/routers/key/createRootKey.ts: Updated import for UnkeyAuditLog type; enhanced error handling in mutation logic.
  • apps/dashboard/lib/trpc/routers/rbac.ts: Restructured imports; added upsertPermissions function with enhanced error handling.
  • internal/clickhouse/schema/025_create_billable_verifications_v2.sql: Introduced new table and materialized view for billable verifications.
  • internal/clickhouse/schema/026_create_billable_ratelimits_v1.sql: Created table and materialized view for billing rate limits.
  • internal/clickhouse/src/billing.ts: Added getBillableRatelimits function; updated getBillableVerifications to point to a new source table.
  • internal/clickhouse/src/index.ts: Added getBillableRatelimits to the billing property of the ClickHouse class.

Suggested reviewers

  • mcstepp
  • perkinsjr

@chronark (Collaborator, Author) commented:

Going to self-approve this, as the changes are mostly formatting :/

coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (42)
internal/clickhouse/schema/026_create_billable_ratelimits_v1.sql (2)

3-12: Consider adding constraints and optimizing table structure

While the basic table structure is correct, consider these improvements for better data integrity and performance:

  1. Add NOT NULL constraints to prevent null values
  2. Add CHECK constraints for year/month validity
  3. Consider partitioning by year for better query performance
  4. Consider adding a TTL policy if historical data isn't needed indefinitely
 CREATE TABLE billing.billable_ratelimits_per_month_v1
 (
-  year          Int,
-  month         Int,
-  workspace_id  String,
-  count         Int64
+  year          Int NOT NULL,
+  month         Int NOT NULL,
+  workspace_id  String NOT NULL,
+  count         Int64 NOT NULL,
+  CONSTRAINT valid_year CHECK (year >= 2020),
+  CONSTRAINT valid_month CHECK (month BETWEEN 1 AND 12)
 )
 ENGINE = SummingMergeTree()
+PARTITION BY year
 ORDER BY (workspace_id, year, month)
+TTL makeDate(year, month, 1) + INTERVAL 3 YEAR
 ;

1-30: Consider data migration and consistency strategy

Since this is part of moving from Tinybird to ClickHouse:

  1. Ensure there's a data migration plan for existing rate limit data
  2. Consider running both systems in parallel initially to validate data consistency
  3. Plan for a gradual cutover strategy to minimize risk

Would you like help creating a data migration and validation plan?
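
For illustration, a one-off backfill could seed the new table before cutover. A minimal sketch, assuming the source table the materialized view reads from (ratelimits.ratelimits_per_month_v1) exposes time and workspace_id columns; the names are taken from the review notes and may differ:

import { createClient } from "@clickhouse/client";

// Hypothetical one-off backfill script; table and column names are assumptions.
const ch = createClient({ url: process.env.CLICKHOUSE_URL });

await ch.command({
  query: `
    INSERT INTO billing.billable_ratelimits_per_month_v1 (year, month, workspace_id, count)
    SELECT toYear(time) AS year, toMonth(time) AS month, workspace_id, count(*) AS count
    FROM ratelimits.ratelimits_per_month_v1
    GROUP BY workspace_id, year, month
  `,
});

Running the backfill before the materialized view starts receiving inserts (or de-duplicating afterwards) avoids double counting in the SummingMergeTree.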

internal/clickhouse/schema/025_create_billable_verifications_v2.sql (1)

2-11: Consider optimizing table definition for better performance

While the basic structure is correct, consider these improvements for better performance and data management:

  1. Add explicit partition by year for efficient data management
  2. Define TTL policy if historical data cleanup is needed
  3. Consider using more specific integer types (Int16) for year/month
 CREATE TABLE billing.billable_verifications_per_month_v2
 (
-  year          Int,
-  month         Int,
+  year          Int16,
+  month         Int16,
   workspace_id  String,
   count         Int64
 )
 ENGINE = SummingMergeTree()
+PARTITION BY year
 ORDER BY (workspace_id, year, month)
internal/clickhouse/src/billing.ts (1)

4-38: Implementation looks good with minor suggestions

The function is well-structured with proper:

  • Input validation using Zod
  • Error handling for null/undefined cases
  • Default values
  • Clear SQL query with proper parameter binding

Consider adding JSDoc comments to document:

  • The return type and its meaning
  • Whether the count represents unique rate limits or total occurrences

Apply this diff to improve documentation:

+/**
+ * Get the total number of billable rate limits for a workspace in a specific month
+ * @param ch - ClickHouse querier instance
+ * @returns Promise<number> - The total count of rate limits (0 if none found)
+ */
 export function getBillableRatelimits(ch: Querier) {
apps/billing/src/lib/db-marketing/schemas/evals.ts (3)

Line range hint 18-19: Consider using JSON column type and proper serialization.

The recommendations and outline fields are storing JSON data as TEXT with string defaults. This approach has several potential issues:

  1. No JSON validation at the database level
  2. Less efficient storage and querying compared to native JSON columns
  3. String literal defaults instead of proper JSON serialization

Consider these improvements:

- recommendations: text("recommendations").notNull().default("[]"),
- outline: text("outline").default("[]"),
+ recommendations: json("recommendations").notNull().$default(() => [] as const),
+ outline: json("outline").$default(() => [] as const),

This change would:

  • Ensure JSON validity at the database level
  • Enable better query performance for JSON operations
  • Properly serialize default values

Line range hint 39-65: Add validation schema for the outline field.

The new outline field lacks a corresponding Zod schema for validation. This could lead to type safety issues and inconsistent data.

Consider adding a validation schema:

export const outlineSchema = z.object({
  outline: z.array(
    z.object({
      // Define the expected structure of your outline items
      // Example:
      title: z.string(),
      content: z.string(),
      // Add more fields as needed
    })
  ),
});

Then extend the existing schemas:

 export const insertEvalSchema = createInsertSchema(evals)
-  .extend({})
+  .extend({
+    outline: outlineSchema.optional(),
+  })
   .omit({ id: true });

Missing Migration for evals Schema Changes.
The current migration strategy does not include the necessary changes for the evals schema in apps/billing/src/lib/db-marketing/schemas/evals.ts.

🔗 Analysis chain

Line range hint 1-65: Verify database migration strategy.

As part of moving queries from TB to CH (as per PR title), ensure these schema changes are properly handled in the migration process.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for related migration files
fd -e sql -e ts migration apps/billing/src/lib/db-marketing/migrations

# Look for any existing data transformation logic
rg -l "outline|recommendations" apps/billing/src/lib/db-marketing

Length of output: 323


Script:

#!/bin/bash
# Search for any migration files in the entire repository
fd -e sql -e ts migration

# Search for data transformation logic related to evals
rg -l "evals|evalTypes" .

Length of output: 760


Script:

#!/bin/bash
# Search for references to 'evals' within migration files
rg 'evals' $(fd -e ts migration)

Length of output: 478

apps/billing/src/lib/search-query.ts (3)

Line range hint 29-42: Consider improving the prompt engineering

The NOTE comment indicates known issues with the prompt quality. The current prompt might benefit from:

  1. More specific examples of good/bad queries
  2. Explicit constraints on query length
  3. Domain-specific context about API development

Would you like me to help generate an improved prompt structure?
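
As one illustration (the contents here are invented for the example, not the project's actual prompt), a tighter structure might look like:

// Illustrative only: bakes in examples, a length limit, and domain context.
export const SEARCH_QUERY_SYSTEM_PROMPT = `
You generate exactly one Google search query for researching an API-development term.

Rules:
- At most 10 words; no quotes or boolean operators.
- Prefer vendor-neutral, technical phrasing.

Examples:
- Good: idempotency keys REST API best practices
- Bad: what is an API (too broad to surface authoritative sources)
`;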


Line range hint 45-71: Add retry logic and explicit error handling for database operations

The transaction could fail due to temporary database issues. Consider:

  1. Adding retry logic for transient failures
  2. Explicit error handling for transaction rollbacks
  3. Logging of failure scenarios

Here's a suggested improvement:

+ import { retry } from '@/lib/utils/retry';  // Add retry utility

- await db.transaction(async (tx) => {
+ await retry(async () => {
+   return db.transaction(async (tx) => {
      // Insert search query
      await tx
        .insert(searchQueries)
        .values({
          ...generatedQuery.object,
        })
        .onDuplicateKeyUpdate({
          set: {
            updatedAt: sql`now()`,
          },
        });

      // Insert entry
      await tx
        .insert(entries)
        .values({
          inputTerm: term,
        })
        .onDuplicateKeyUpdate({
          set: {
            updatedAt: sql`now()`,
          },
        });
+   });
+ }, {
+   retries: 3,
+   minTimeout: 1000,
+   factor: 2,
+ });
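
The @/lib/utils/retry helper imported above does not exist in the repo yet; a minimal sketch of the assumed interface, with exponential backoff:

// Sketch of the assumed retry utility: delay grows as minTimeout * factor^attempt.
type RetryOptions = { retries: number; minTimeout: number; factor: number };

export async function retry<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= opts.retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === opts.retries) break;
      await new Promise((resolve) => setTimeout(resolve, opts.minTimeout * opts.factor ** attempt));
    }
  }
  throw lastError;
}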

Unusual OpenAI model name "gpt-4o-mini" detected in multiple files:

  • apps/billing/src/trigger/glossary/generate-outline.ts
  • apps/billing/src/trigger/glossary/evals.ts
  • apps/billing/src/trigger/glossary/draft-sections.ts
  • apps/billing/src/lib/search-query.ts
  • apps/billing/src/lib/firecrawl.ts
  • apps/billing/src/lib/keywords.ts
🔗 Analysis chain

Line range hint 31-31: Verify the OpenAI model name

The model name "gpt-4o-mini" appears unusual and might be a typo. Standard OpenAI models typically follow patterns like "gpt-4", "gpt-3.5-turbo", etc.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other OpenAI model usages in the codebase
rg -g '!{*.lock,package-lock.json}' 'openai\(["\']gpt-' --type ts

Length of output: 130


Script:

#!/bin/bash
# Improved search for OpenAI model usages in the codebase, accounting for potential typos
rg -g '!{*.lock,package-lock.json}' 'openai\(["\']gpt-4[o0]-mini["\']' --type ts

Length of output: 93


Script:

#!/bin/bash
# Improved search for OpenAI model usages in the codebase, correcting quotation marks
rg -g '!{*.lock,package-lock.json}' 'openai\(["'\'']gpt-4[o0]-mini["'\'']\)' --type ts

Length of output: 867

apps/billing/src/lib/db-marketing/schemas/sections.ts (3)

Line range hint 42-51: Document content types and consider exporting the array

The content types array defines valid section types but lacks documentation explaining each type's purpose. Additionally, consider exporting this constant for reuse in other parts of the application.

Consider applying these improvements:

+/**
+ * Valid content types for sections:
+ * - listicle: Ordered or unordered lists of items
+ * - table: Tabular data presentation
+ * - image: Image content
+ * - code: Code snippets or examples
+ * - infographic: Visual information graphics
+ * - timeline: Time-based content presentation
+ * - other: Miscellaneous content types
+ * - text: Plain text content
+ * - video: Video content
+ */
-const contentTypes = [
+export const contentTypes = [
   "listicle",
   "table",
   // ... rest of the types
] as const;

Also applies to: 53-59


Line range hint 82-85: Standardize timestamp precision across tables

The timestamps in sectionsToKeywords use millisecond precision (fsp: 3), while other tables use the default precision. Consider standardizing this across all tables for consistency.
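
For example, shared column helpers keep the precision uniform (a sketch against drizzle's mysql-core API; the helper name is illustrative):

import { timestamp } from "drizzle-orm/mysql-core";

// Define once and spread into each mysqlTable so all timestamps share fsp: 3.
export const trackedColumns = {
  createdAt: timestamp("created_at", { fsp: 3 }).notNull().defaultNow(),
  updatedAt: timestamp("updated_at", { fsp: 3 }).onUpdateNow(),
};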


Database migration strategy not fully implemented

The table sectionsToKeywords is still using MySQL schema definitions. Consider updating it to ClickHouse-specific schema definitions to align with the migration objective.

🔗 Analysis chain

Line range hint 78-91: Verify database migration strategy

Given that this PR's objective is to move queries from TB to CH (Tinybird to ClickHouse), please verify if this table should be using ClickHouse-specific schema definitions instead of MySQL.

Run this script to check for other ClickHouse schema definitions in the codebase:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Look for ClickHouse schema definitions and verify consistency

# Search for ClickHouse table definitions
rg -l "clickhouse.*Table|createTable.*clickhouse" --type ts

# Search for similar junction tables in ClickHouse
rg "many-to-many|junction.*table|.*ToKeywords" --type ts

Length of output: 2613

apps/billing/src/trigger/glossary/keyword-research.ts (6)

1-11: Consider organizing imports for better maintainability

Consider grouping imports into sections:

  1. External dependencies (trigger.dev, drizzle-orm)
  2. Internal shared utilities
  3. Local module imports
+ // External dependencies
import { AbortTaskRunError, task } from "@trigger.dev/sdk/v3";
import { sql, inArray, and, eq } from "drizzle-orm";

+ // Internal shared utilities
import { db } from "@/lib/db-marketing/client";
import { keywords } from "@/lib/db-marketing/schemas";
import { getOrCreateFirecrawlResponse } from "@/lib/firecrawl";

+ // Local imports
import { getOrCreateKeywordsFromHeaders, getOrCreateKeywordsFromTitles } from "../../lib/keywords";
import { getOrCreateSearchQuery } from "../../lib/search-query";
import { getOrCreateSearchResponse } from "../../lib/serper";
import type { CacheStrategy } from "./_generate-glossary-entry";

Line range hint 13-19: Consider a more descriptive constant name

The constant THREE could be more descriptively named to indicate its purpose, such as TOP_RESULTS_LIMIT or MAX_SEARCH_RESULTS.

-export const THREE = 3;
+export const TOP_RESULTS_LIMIT = 3;

Line range hint 24-35: Consider enhancing cache invalidation logic

The current caching strategy only checks for existence but doesn't consider the age of the cached data. Consider adding a timestamp-based invalidation strategy.

 if (existing.length > 0 && onCacheHit === "stale") {
+  const oldestAllowedTimestamp = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000); // 7 days
+  const isStale = existing.some(entry => entry.updatedAt < oldestAllowedTimestamp);
+  if (!isStale) {
     return {
       message: `Found existing keywords for ${term}`,
       term,
       keywords: existing,
     };
+  }
 }

Line range hint 47-57: Add timeout handling for external API calls

The external API calls to fetch search results and content could benefit from timeout handling to prevent hanging operations.

 const topThree = searchResponse.serperOrganicResults
   .sort((a, b) => a.position - b.position)
   .slice(0, THREE);

+const FETCH_TIMEOUT = 30000; // 30 seconds
+
+const timeoutPromise = (promise: Promise<any>, ms: number) => {
+  return Promise.race([
+    promise,
+    new Promise((_, reject) => 
+      setTimeout(() => reject(new Error('Operation timed out')), ms)
+    )
+  ]);
+};
+
 // Get content for top 3 results
 const firecrawlResults = await Promise.all(
   topThree.map((result) =>
-    getOrCreateFirecrawlResponse({ url: result.link, connectTo: { term: term } }),
+    timeoutPromise(
+      getOrCreateFirecrawlResponse({ url: result.link, connectTo: { term: term } }),
+      FETCH_TIMEOUT
+    ),
   ),
 );

Line range hint 74-94: Wrap database operations in a transaction

The related search keywords insertion should be wrapped in a transaction to ensure data consistency.

+const insertRelatedSearches = async (searchQuery, searches) => {
+  return await db.transaction(async (tx) => {
     await tx
       .insert(keywords)
       .values(
         searches.map((search) => ({
           inputTerm: searchQuery.inputTerm,
           keyword: search.query.toLowerCase(),
           source: "related_searches",
           updatedAt: sql`now()`,
         })),
       )
       .onDuplicateKeyUpdate({
         set: {
           updatedAt: sql`now()`,
         },
       });
     
     return await tx.query.keywords.findMany({
       where: and(
         eq(keywords.inputTerm, searchQuery.inputTerm),
         eq(keywords.source, "related_searches"),
         inArray(
           keywords.keyword,
           searches.map((search) => search.query.toLowerCase()),
         ),
       ),
     });
+  });
+};

-await db.insert(keywords)...
-const insertedRelatedSearches = await db.query.keywords...
+const insertedRelatedSearches = await insertRelatedSearches(
+  searchQuery,
+  searchResponse.serperRelatedSearches
+);

Line range hint 96-106: Enhance error handling and result validation

Consider adding validation for the final results before returning and improve error handling for better debugging.

+import { z } from "zod";
+
+const KeywordResultSchema = z.object({
+  message: z.string(),
+  term: z.string(),
+  keywords: z.array(z.object({
+    keyword: z.string(),
+    source: z.string(),
+    inputTerm: z.string(),
+    updatedAt: z.date()
+  }))
+});
+
+try {
   const result = {
     message: `Keyword Research for ${term} completed`,
     term: searchQuery.inputTerm,
     keywords: [...keywordsFromTitles, ...keywordsFromHeaders, ...insertedRelatedSearches],
   };
+  return KeywordResultSchema.parse(result);
+} catch (error) {
+  console.error('Failed to validate keyword research results:', error);
+  throw new Error('Invalid keyword research results');
+}
-  return result;
apps/dashboard/lib/audit.ts (2)

55-92: Add JSDoc comments to improve type documentation

The UnkeyAuditLog type would benefit from documentation explaining its purpose and usage.

+/**
+ * Represents an audit log entry in the Unkey system.
+ * @property workspaceId - The ID of the workspace where the action occurred
+ * @property event - The type of event that occurred
+ * @property resources - Array of resources affected by the action
+ */
 export type UnkeyAuditLog = {

Line range hint 93-183: Add error handling and consider batch operations

Two suggestions to improve the implementation:

  1. Add error handling for database operations
  2. Consider using batch inserts for better performance when handling multiple logs
 export async function insertAuditLogs(
   db: Transaction | Database,
   logOrLogs: MaybeArray<UnkeyAuditLog>,
 ) {
   const logs = Array.isArray(logOrLogs) ? logOrLogs : [logOrLogs];
 
   if (logs.length === 0) {
     return Promise.resolve();
   }
 
+  try {
     for (const log of logs) {
       // ... bucket handling ...
     }
 
-    // 2. Insert the log
-    const auditLogId = newId("auditLog");
-    await db.insert(schema.auditLog).values({
+    // 2. Batch insert logs
+    const auditLogValues = logs.map(log => ({
+      id: newId("auditLog"),
       workspaceId: log.workspaceId,
       // ... other fields ...
     }));
+    await db.insert(schema.auditLog).values(auditLogValues);
 
     // ... resource handling ...
+  } catch (error) {
+    console.error("Failed to insert audit logs:", error);
+    throw new Error("Failed to record audit logs");
+  }
 }
apps/billing/src/trigger/glossary/create-pr.ts (3)

Line range hint 82-83: Consider moving repository details to configuration

The hardcoded owner and repo values should be moved to environment variables or configuration for better maintainability and reusability.

-    const owner = "p6l-richard";
-    const repo = "unkey";
+    const owner = process.env.GITHUB_REPO_OWNER;
+    const repo = process.env.GITHUB_REPO_NAME;

Line range hint 87-117: Add error handling for GitHub API operations

The branch creation and deletion operations lack proper error handling. Consider wrapping these operations in try/catch blocks to handle potential API failures gracefully.

Example implementation:

-    const mainRef = await octokit.git.getRef({
-      owner,
-      repo,
-      ref: "heads/main",
-    });
+    let mainRef;
+    try {
+      mainRef = await octokit.git.getRef({
+        owner,
+        repo,
+        ref: "heads/main",
+      });
+    } catch (error) {
+      throw new AbortTaskRunError(
+        `Failed to get main branch reference: ${(error as Error).message}`
+      );
+    }

Line range hint 19-33: Optimize database queries

The code makes multiple separate queries to fetch the same entry. Consider consolidating these queries to improve performance.

-    const existing = await db.query.entries.findFirst({
-      where: eq(entries.inputTerm, input),
-      columns: {
-        id: true,
-        inputTerm: true,
-        githubPrUrl: true,
-      },
-      orderBy: (entries, { desc }) => [desc(entries.createdAt)],
-    });
-
-    if (existing?.githubPrUrl && onCacheHit === "stale") {
-      return {
-        entry: existing,
-        message: `Found existing PR for ${input}.mdx`,
-      };
-    }
-
-    // ==== 1. Prepare MDX file ====
-    const entry = await db.query.entries.findFirst({
+    const entry = await db.query.entries.findFirst({
       where: eq(entries.inputTerm, input),
+      columns: {
+        id: true,
+        inputTerm: true,
+        githubPrUrl: true,
+        dynamicSectionsContent: true,
+        metaTitle: true,
+        metaDescription: true,
+      },
       orderBy: (entries, { desc }) => [desc(entries.createdAt)],
     });

+    if (entry?.githubPrUrl && onCacheHit === "stale") {
+      return {
+        entry: {
+          id: entry.id,
+          inputTerm: entry.inputTerm,
+          githubPrUrl: entry.githubPrUrl,
+        },
+        message: `Found existing PR for ${input}.mdx`,
+      };
+    }
apps/billing/src/trigger/glossary/_generate-glossary-entry.ts (1)

Line range hint 35-37: Resolve inconsistency in retry configuration.

The task's maxAttempts is set to 0, but the documentation comments mention "maxAttempts for a subtask is 5". This seems inconsistent and might prevent the intended retry behavior for handling non-deterministic LLM executions.

Consider updating the retry configuration:

  retry: {
-   maxAttempts: 0,
+   maxAttempts: 5,
  },
apps/billing/src/lib/firecrawl.ts (2)

Line range hint 9-11: Add environment variable validation

The Firecrawl API key is accessed with a non-null assertion (!), which could cause runtime errors if the environment variable is missing.

Consider adding validation:

+if (!process.env.FIRECRAWL_API_KEY) {
+  throw new Error("FIRECRAWL_API_KEY environment variable is required");
+}
 const firecrawl = new FirecrawlApp({
-  apiKey: process.env.FIRECRAWL_API_KEY!,
+  apiKey: process.env.FIRECRAWL_API_KEY,
 });

Issue Found: Potential Typo in OpenAI Model Name

  • The model name "gpt-4o-mini" appears multiple times and may be a typo (e.g., "gpt-4-turbo").
  • Inconsistent model names like "gpt-4-turbo" and "gpt-4o" are also present.
🔗 Analysis chain

Line range hint 171-175: Verify the model name and consider extracting the system prompt

  1. The model name "gpt-4o-mini" seems unusual and might be a typo.
  2. The long system prompt could be moved to a constant for better maintainability.

Let's verify the model name usage:

Consider moving the system prompt to a constant:

const SUMMARY_SYSTEM_PROMPT = `You are the **Chief Technology Officer (CTO)** ...`;
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for OpenAI model names in the codebase
rg "openai\(['\"]gpt-" --type typescript

Length of output: 79


Script:

#!/bin/bash
# Search for OpenAI model names in the codebase with TypeScript extensions
rg "openai\(['\"]gpt-" --type ts --type tsx

Length of output: 75


Script:

#!/bin/bash
# Search for OpenAI model names in the codebase within .ts and .tsx files
rg "openai\(['\"]gpt-" --glob "*.ts" --glob "*.tsx"

Length of output: 1002

apps/billing/src/lib/keywords.ts (4)

Line range hint 83-95: Consider enhancing error handling for the OpenAI API call.

The generateObject call could fail due to API errors, rate limits, or invalid responses. Consider adding specific error handling:

   const keywordsFromTitles = await generateObject({
     model: openai("gpt-4o-mini"),
     system: keywordResearchSystemPrompt,
     prompt: promptTitles,
     schema: z.object({
       keywords: z.array(z.object({ keyword: z.string(), sourceUrl: z.string().url() })),
       keywordsWithBrandNames: z.array(
         z.object({ keyword: z.string(), sourceUrl: z.string().url() }),
       ),
     }),
-  });
+  }).catch((error) => {
+    if (error.code === 'rate_limit_exceeded') {
+      throw new Error(`OpenAI rate limit exceeded: ${error.message}`);
+    }
+    throw new Error(`Failed to generate keywords: ${error.message}`);
+  });

Line range hint 98-111: Consider wrapping database operations in a transaction.

The database operations (insert and subsequent query) should be wrapped in a transaction to ensure data consistency.

+  return await db.transaction(async (tx) => {
-    await db
+    await tx
      .insert(keywords)
       .values(
         keywordsFromTitles.object.keywords.map((keyword) => ({
           inputTerm: term,
           keyword: keyword.keyword.toLowerCase(),
           sourceUrl: keyword.sourceUrl,
           source: "titles",
         })),
       )
       .onDuplicateKeyUpdate({
         set: {
           updatedAt: sql`now()`,
         },
       });
 
-    return db.query.keywords.findMany({
+    return tx.query.keywords.findMany({
       where: and(
         eq(keywords.inputTerm, term),
         eq(keywords.source, "titles"),
         inArray(
           keywords.keyword,
           keywordsFromTitles.object.keywords.map((k) => k.keyword.toLowerCase()),
         ),
       ),
     });
+  });

Line range hint 134-137: Consider improving the markdown header parsing regex.

The current regex /^##\s+(.*)$/gm might miss some edge cases. Consider using a more robust pattern:

-      ${firecrawlResponse.markdown?.match(/^##\s+(.*)$/gm)?.join("\n")}
+      ${firecrawlResponse.markdown?.match(/^#{1,6}\s+(.+?)(?:\s*{[^}]*})?$/gm)?.join("\n")}

This pattern will:

  • Match headers of all levels (h1-h6)
  • Handle optional trailing attributes in curly braces
  • Ensure non-empty header content
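
For instance, against a small sample:

// Quick check of the proposed pattern on sample markdown.
const markdown = "# Title\n## Setup {#setup}\nBody text\n###### Notes";
const headers = markdown.match(/^#{1,6}\s+(.+?)(?:\s*\{[^}]*\})?$/gm);
console.log(headers); // ["# Title", "## Setup {#setup}", "###### Notes"]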

Line range hint 115-190: Consider reducing code duplication with getOrCreateKeywordsFromTitles.

Both functions share similar database operations and OpenAI call patterns. Consider extracting common logic into shared utilities:

interface KeywordExtractionParams {
  term: string;
  source: 'titles' | 'headers';
  prompt: string;
}

async function extractAndStoreKeywords({ term, source, prompt }: KeywordExtractionParams) {
  const keywords = await generateObject({
    model: openai("gpt-4o-mini"),
    system: keywordResearchSystemPrompt,
    prompt,
    schema: z.object({
      keywords: z.array(z.object({ keyword: z.string(), sourceUrl: z.string().url() })),
      keywordsWithBrandNames: z.array(
        z.object({ keyword: z.string(), sourceUrl: z.string().url() }),
      ),
    }),
  });

  return db.transaction(async (tx) => {
    // ... shared database operations
  });
}
apps/billing/src/trigger/glossary/draft-sections.ts (3)

Line range hint 15-106: Consider implementing rate limiting and error handling for AI operations.

The task uses multiple OpenAI API calls in sequence without explicit rate limiting or comprehensive error handling. This could lead to reliability issues or unexpected costs.

Consider the following (points 1 and 2 are sketched after this list):

  1. Implementing a rate limiter for OpenAI API calls
  2. Adding retry logic with exponential backoff
  3. Validating AI response length and structure
  4. Monitoring and logging AI operation costs
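
A minimal sketch of points 1 and 2, assuming the SDK's errors expose an HTTP status field:

// Small concurrency gate plus exponential backoff on HTTP 429 responses.
const MAX_CONCURRENT = 2;
let active = 0;
const waiters: Array<() => void> = [];

export async function withAiLimits<T>(call: () => Promise<T>, maxRetries = 3): Promise<T> {
  if (active >= MAX_CONCURRENT) {
    await new Promise<void>((resolve) => waiters.push(resolve));
  }
  active++;
  try {
    for (let attempt = 0; ; attempt++) {
      try {
        return await call();
      } catch (err) {
        const status = (err as { status?: number }).status;
        if (attempt >= maxRetries || status !== 429) throw err;
        await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** attempt));
      }
    }
  } finally {
    active--;
    waiters.shift()?.();
  }
}

Call sites would then wrap each model invocation, e.g. await withAiLimits(() => generateText({ model, system, prompt })).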

Line range hint 107-190: Refactor AI model configuration for better maintainability.

The code has hardcoded model names (gpt-4-turbo, gpt-4o-mini) scattered across different functions. This could make model updates or A/B testing difficult.

Consider extracting model configuration to a central configuration:

const AI_CONFIG = {
  models: {
    draft: "gpt-4-turbo",
    review: "gpt-4o-mini",
    seo: "gpt-4o-mini"
  },
  maxTokens: {
    draft: 1000,
    review: 500,
    seo: 800
  }
} as const;

// Usage
const completion = await generateText({
  model: openai(AI_CONFIG.models.draft),
  maxTokens: AI_CONFIG.maxTokens.draft,
  system,
  prompt,
});

Line range hint 191-236: Consider extracting complex types for better maintainability.

The nested type definition for the entry parameter is complex and could be hard to maintain. Consider extracting it to a separate type definition.

type DynamicSection = typeof sections.$inferSelect & {
  contentTypes: Array<typeof sectionContentTypes.$inferSelect>;
  sectionsToKeywords: Array<
    typeof sectionsToKeywords.$inferSelect & {
      keyword: typeof keywords.$inferSelect;
    }
  >;
};

type EntryWithDynamicSections = typeof entries.$inferSelect & {
  dynamicSections: Array<DynamicSection>;
};

// Usage
function draftSections({
  term,
  entry,
}: {
  term: string;
  entry: EntryWithDynamicSections;
})
apps/dashboard/app/(app)/apis/[apiId]/page.tsx (3)

Line range hint 71-108: Consider adding error handling for unknown verification outcomes

The switch statement for processing verification outcomes has a default case that silently ignores unknown outcomes. This could hide potential issues if new outcome types are added in the future.

      case "FORBIDDEN":
        forbiddenOverTime.push({ x, y: d.count });
        break;
      default:
+       console.warn(`Unknown verification outcome encountered: ${d.outcome}`);
+       break;

Line range hint 28-41: Add error handling for missing keyAuthId

The code assumes api.keyAuthId exists (using ! operator) but should handle the case where it might be null more gracefully.

   const api = await db.query.apis.findFirst({
     where: (table, { eq, and, isNull }) =>
       and(eq(table.id, props.params.apiId), isNull(table.deletedAt)),
     with: {
       workspace: true,
     },
   });
-  if (!api || api.workspace.tenantId !== tenantId) {
+  if (!api || api.workspace.tenantId !== tenantId || !api.keyAuthId) {
     return redirect("/new");
   }

Line range hint 42-108: Consider memoizing verification data processing

The processing of verification data into different categories is performed on every render. Consider memoizing this computation using useMemo if this component is re-rendered frequently.

Here's how you could refactor this:

const processVerifications = useMemo(() => {
  const result = {
    successOverTime: [],
    ratelimitedOverTime: [],
    // ... other arrays
  };
  
  for (const d of verifications.sort((a, b) => a.time - b.time)) {
    const x = new Date(d.time).toISOString();
    switch (d.outcome) {
      case "":
      case "VALID":
        result.successOverTime.push({ x, y: d.count });
        break;
      // ... other cases
    }
  }
  
  return result;
}, [verifications]);
apps/billing/src/trigger/glossary/evals.ts (1)

Line range hint 158-321: Standardize logging format across tasks

The editorial evaluation task uses a different logging format ([workflow=glossary] [task=editorial_eval]) compared to other tasks. Consider standardizing the logging format across all tasks for consistency.
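
One lightweight option is a shared tag helper mirroring the editorial task's existing format (a sketch):

// Shared prefix so every glossary task logs in the same shape.
const logTag = (task: string) => `[workflow=glossary] [task=${task}]`;

const term = "idempotency"; // illustrative input
console.info(`${logTag("editorial_eval")} starting evaluation for term: ${term}`);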

Consider abstracting common task patterns

The three evaluation tasks share significant code structure. Consider creating a base evaluation task factory to reduce code duplication.

Here's a suggested approach:

type EvalType = 'technical' | 'seo' | 'editorial';

const createEvalTask = (type: EvalType) => {
  return task({
    id: `perform_${type}_eval`,
    run: async ({ input, onCacheHit = "stale", ...options }: TaskInput & EvalOptions) => {
      console.info(`Starting ${type} evaluation for term: ${input}`);

      // ... common implementation ...

      const result = {
        ratings: ratingsResult.output,
        recommendations: recommendationsResult.output.recommendations,
      };

      // Add outline for editorial type
      if (type === 'editorial') {
        return { ...result, outline: options.content };
      }

      return result;
    }
  });
};

export const performTechnicalEvalTask = createEvalTask('technical');
export const performSEOEvalTask = createEvalTask('seo');
export const performEditorialEvalTask = createEvalTask('editorial');
apps/billing/src/trigger/glossary/generate-outline.ts (3)

Line range hint 26-29: Address TODO comment with concrete improvements

The TODO comment indicates known issues but doesn't have a clear timeline or ownership. Consider breaking this into specific GitHub issues for:

  1. Task decomposition
  2. Database caching implementation
  3. Prompt improvements

Would you like me to help create GitHub issues for tracking these improvements?


Line range hint 26-196: Consider refactoring the task for better maintainability

The generateOutlineTask is quite large and handles multiple concerns. Consider:

  1. Breaking it into smaller, focused tasks
  2. Wrapping database operations in transactions
  3. Implementing proper rollback mechanisms for partial failures

Here's a suggested approach for breaking down the task:

// 1. Create separate tasks for each major step
export const generateInitialOutlineTask = task({/*...*/})
export const performEvaluationsTask = task({/*...*/})
export const persistOutlineTask = task({/*...*/})

// 2. Wrap database operations in transactions
async function persistOutline(entry, outline) {
  return await db.transaction(async (tx) => {
    const sectionIds = await tx.insert(sections)/*...*/
    await tx.insert(sectionsToKeywords)/*...*/
    await tx.insert(sectionContentTypes)/*...*/
    return sectionIds
  })
}

// 3. Main task becomes an orchestrator
export const generateOutlineTask = task({
  id: "generate_outline",
  run: async ({ term, onCacheHit }) => {
    const initialOutline = await generateInitialOutlineTask.triggerAndWait(/*...*/)
    const evaluations = await performEvaluationsTask.triggerAndWait(/*...*/)
    return await persistOutlineTask.triggerAndWait(/*...*/)
  }
})

Custom model gpt-4o-mini is used without proper configuration.

  • Ensure that gpt-4o-mini is defined and configured correctly in the production environment.
  • Review the configuration files to include settings for gpt-4o-mini.
🔗 Analysis chain

Line range hint 213-350: Verify the availability and capabilities of the custom OpenAI model

The code uses a custom model gpt-4o-mini. Ensure that:

  1. This model is properly configured and available in production
  2. It has sufficient context window for processing the input content
  3. It's cost-effective for the use case
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other usages of this custom model
rg -l "gpt-4o-mini"

# Check for any configuration files that might define the model
fd -e json -e yaml -e env . | xargs rg "gpt-4o"

Length of output: 371


Script:

#!/bin/bash
# Check for any configuration files that define the 'gpt-4o-mini' model
fd -e json -e yaml -e env . | xargs rg "gpt-4o-mini"

Length of output: 52

apps/dashboard/lib/trpc/routers/rbac.ts (1)

Line range hint 270-313: Consider renaming upsertPermissions to better reflect its functionality

The function upsertPermissions currently inserts new permissions if they do not exist but does not update existing permissions. Since it doesn't perform an update on existing records, the term upsert might be misleading. To improve clarity, consider renaming the function to something like createPermissionsIfNotExist or ensurePermissionsExist.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ba369f9 and db1519c.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (32)
  • apps/billing/src/lib/db-marketing/schemas/entries.ts (1 hunks)
  • apps/billing/src/lib/db-marketing/schemas/evals.ts (1 hunks)
  • apps/billing/src/lib/db-marketing/schemas/firecrawl.ts (2 hunks)
  • apps/billing/src/lib/db-marketing/schemas/keywords.ts (1 hunks)
  • apps/billing/src/lib/db-marketing/schemas/searchQuery.ts (1 hunks)
  • apps/billing/src/lib/db-marketing/schemas/sections.ts (1 hunks)
  • apps/billing/src/lib/db-marketing/schemas/serper.ts (1 hunks)
  • apps/billing/src/lib/firecrawl.ts (2 hunks)
  • apps/billing/src/lib/keywords.ts (1 hunks)
  • apps/billing/src/lib/search-query.ts (1 hunks)
  • apps/billing/src/trigger/glossary/_generate-glossary-entry.ts (1 hunks)
  • apps/billing/src/trigger/glossary/create-pr.ts (1 hunks)
  • apps/billing/src/trigger/glossary/draft-sections.ts (1 hunks)
  • apps/billing/src/trigger/glossary/evals.ts (1 hunks)
  • apps/billing/src/trigger/glossary/generate-outline.ts (2 hunks)
  • apps/billing/src/trigger/glossary/keyword-research.ts (1 hunks)
  • apps/billing/src/trigger/glossary/seo-meta-tags.ts (1 hunks)
  • apps/dashboard/app/(app)/apis/[apiId]/page.tsx (1 hunks)
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/analytics/page.tsx (2 hunks)
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/layout.tsx (0 hunks)
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/logs/page.tsx (0 hunks)
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/logs/table.tsx (0 hunks)
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/logs/util.ts (0 hunks)
  • apps/dashboard/app/(app)/settings/billing/page.tsx (1 hunks)
  • apps/dashboard/lib/audit.ts (1 hunks)
  • apps/dashboard/lib/tinybird.ts (0 hunks)
  • apps/dashboard/lib/trpc/routers/key/createRootKey.ts (1 hunks)
  • apps/dashboard/lib/trpc/routers/rbac.ts (1 hunks)
  • internal/clickhouse/schema/025_create_billable_verifications_v2.sql (1 hunks)
  • internal/clickhouse/schema/026_create_billable_ratelimits_v1.sql (1 hunks)
  • internal/clickhouse/src/billing.ts (2 hunks)
  • internal/clickhouse/src/index.ts (2 hunks)
💤 Files with no reviewable changes (5)
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/layout.tsx
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/logs/page.tsx
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/logs/table.tsx
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/logs/util.ts
  • apps/dashboard/lib/tinybird.ts
✅ Files skipped from review due to trivial changes (6)
  • apps/billing/src/lib/db-marketing/schemas/entries.ts
  • apps/billing/src/lib/db-marketing/schemas/firecrawl.ts
  • apps/billing/src/lib/db-marketing/schemas/keywords.ts
  • apps/billing/src/lib/db-marketing/schemas/searchQuery.ts
  • apps/billing/src/lib/db-marketing/schemas/serper.ts
  • apps/dashboard/app/(app)/semantic-cache/[gatewayId]/analytics/page.tsx
🔇 Additional comments (35)
internal/clickhouse/schema/026_create_billable_ratelimits_v1.sql (2)

29-30: LGTM! Down migration is correct

The down migration properly handles the cleanup in the correct order.


14-27: Verify source table schema and time handling

The materialized view looks functionally correct, but please verify:

  1. The schema of ratelimits.ratelimits_per_month_v1 matches expectations
  2. The time zone handling for toYear(time) and toMonth(time) is appropriate
internal/clickhouse/schema/025_create_billable_verifications_v2.sql (3)

29-30: LGTM! Down migration is correct

The down migration properly handles cleanup in the correct order (view before table).


13-26: Verify source table schema and consider performance optimizations

The materialized view looks functionally correct, but let's verify a few things:

Additionally, consider adding an index on the outcome column in the source table if not already present, as it's used in the WHERE clause.


1-30: Verify migration sequence and differences from v1

Let's ensure this migration is properly sequenced and understand the changes from v1.

✅ Verification successful

Migration sequence verified and differences from v1 are consistent

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check migration ordering and compare with v1

# Check for previous versions
fd "billable_verifications.*\.sql" internal/clickhouse/schema/

# Look for related tables/views
rg "billable_verifications" internal/clickhouse/schema/

Length of output: 1831


Script:

#!/bin/bash
# Description: Corrected search for 'billable_verifications' within the clickhouse schema

# Look for related tables/views with the correct directory path
rg "billable_verifications" internal/clickhouse/schema/

Length of output: 1414


Script:

#!/bin/bash
# Description: Retrieve contents of v1 and v2 billable_verifications migration scripts

# Get contents of v1 table creation script
cat internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql

# Get contents of v1 materialized view creation script
cat internal/clickhouse/schema/013_create_billing_billable_verifications_per_month_mv_v1.sql

# Get contents of v2 migration script
cat internal/clickhouse/schema/025_create_billable_verifications_v2.sql

Length of output: 1691

internal/clickhouse/src/billing.ts (2)

51-51: Verify migration strategy for v2 table

The change from billable_verifications_per_month_v1 to v2 might require data migration. Please ensure:

  1. Data from v1 is migrated to v2 if needed
  2. No data loss during the transition
  3. Backward compatibility if both versions need to coexist

Let's check for migration scripts and table definitions:


16-16: Consider using v2 table for consistency

The function queries billing.billable_ratelimits_per_month_v1 while getBillableVerifications uses a v2 table. Should this also be using a v2 table for consistency?

Let's check if a v2 table exists:

✅ Verification successful

Confirmed: Using v1 table is correct

The billing.billable_ratelimits_per_month_v1 table is the current version. No v2 table exists.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for billable_ratelimits table definitions
rg -A 5 "CREATE TABLE.*billable_ratelimits.*v2"

Length of output: 49


Script:

#!/bin/bash
# List all billable_ratelimits table definitions
rg "CREATE TABLE.*billable_ratelimits"

# Search for migrations introducing v2 tables
rg "migration.*v2.*billable_ratelimits"

Length of output: 201

internal/clickhouse/src/index.ts (2)

2-2: LGTM: Clean import addition

The import statement correctly adds the new getBillableRatelimits function alongside the existing getBillableVerifications.


68-68: Verify API consumers are updated

The implementation cleanly adds rate limit billing capabilities following the established pattern. However, we should ensure all consumers of the billing API are updated to handle this new property.

apps/billing/src/lib/db-marketing/schemas/evals.ts (1)

1-2: LGTM! Import organization looks good.

The separation of imports improves readability and git diff clarity.

apps/billing/src/lib/search-query.ts (2)

6-6: LGTM: Import organization looks good

The reordering of imports maintains clarity and follows standard practices.


Line range hint 1-1: Verify database client usage

Given that the PR's objective is to move queries from "tb" to "ch" (ClickHouse), verify if this file should also be updated to use the ClickHouse client instead of the marketing database client.

#!/bin/bash
# Search for ClickHouse client usage patterns
rg -g '*.ts' 'from ["'\'']@/lib/(?:ch|clickhouse)' -A 5

# Search for similar files that might have been migrated
fd -e ts -e tsx -x grep -l "db-marketing" {} \; | xargs rg 'clickhouse'
apps/billing/src/lib/db-marketing/schemas/sections.ts (1)

Line range hint 16-29: Verify the nullable status of the markdown field

The markdown field is nullable while description is required. Please confirm if this is intentional, as typically markdown content would be the primary content source.

Consider making both fields required or document why markdown is optional while description is mandatory.

apps/dashboard/lib/audit.ts (2)

8-53: LGTM! Well-structured schema definition

The schema effectively validates and transforms audit log data with proper typing. The transformation to a more structured object format improves type safety and usability.


Line range hint 1-183: Verify database migration compatibility

The PR title mentions moving queries from Tinybird to ClickHouse, but this file doesn't explicitly show the database transition. Ensure that:

  1. The schema definitions are compatible with ClickHouse
  2. The data types are properly mapped
  3. The existing data is migrated correctly
✅ Verification successful

Database migration handled correctly

The presence of ClickHouse-specific configurations and migration scripts indicates that the migration from Tinybird to ClickHouse is being managed appropriately.

  • Migration scripts are present in the repository.
  • Schema definitions are compatible with ClickHouse.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for ClickHouse-specific types or configurations
rg -l 'clickhouse|ClickHouse' .

# Look for migration scripts
fd -g '*migrate*.{ts,js,sql}'

# Check for other files using similar schema
ast-grep --pattern 'schema.auditLog'

Length of output: 14050

apps/billing/src/trigger/glossary/create-pr.ts (2)

3-4: LGTM! Import statements are well organized.

The reordering of imports maintains a logical grouping while keeping the code functionality intact.


Line range hint 92-117: Handle potential race condition in branch management

The check-then-act pattern when handling existing branches could lead to race conditions. Consider using a more atomic approach or adding retry logic.
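
One way to narrow the window is to make creation itself the atomic step and handle the 422 GitHub returns when the ref already exists (a sketch; owner, repo, branch, and mainSha are illustrative):

// Create the branch optimistically; on 422 (ref exists), force-update it instead.
try {
  await octokit.git.createRef({ owner, repo, ref: `refs/heads/${branch}`, sha: mainSha });
} catch (error) {
  if ((error as { status?: number }).status !== 422) throw error;
  await octokit.git.updateRef({ owner, repo, ref: `heads/${branch}`, sha: mainSha, force: true });
}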

apps/billing/src/trigger/glossary/_generate-glossary-entry.ts (1)

1-2: LGTM! Import statements are well organized.

The reorganization of imports improves code readability by grouping related imports together (database-related and task-related imports).

Also applies to: 6-10

apps/billing/src/trigger/glossary/seo-meta-tags.ts (3)

3-7: LGTM! Import statements are well-organized.

The imports are logically grouped and provide clear dependencies for the functionality.


Line range hint 1-100: Consider migrating database queries to ClickHouse

Given the PR objective to move queries from Tinybird to ClickHouse, this file contains several database operations that could be candidates for migration:

  1. Querying existing entries
  2. Fetching related keywords
  3. Fetching top-ranking pages
  4. Updating entries with new meta tags

Let's verify the database schema migrations and existing query patterns:

#!/bin/bash
# Description: Check for ClickHouse schema definitions and query patterns

# Look for ClickHouse schema definitions
echo "Searching for ClickHouse schema definitions..."
rg -l "createTable.*ClickHouse" 

# Look for similar query patterns that have been migrated
echo "Searching for similar query patterns that have been migrated..."
rg "db\.query" --type ts

# Look for migration patterns
echo "Searching for migration patterns..."
fd -e sql -e ts "migrate|schema" 

Line range hint 22-33: Consider implementing rate limiting for OpenAI calls

The task uses GPT-4 which can be expensive. Consider implementing rate limiting to control costs.

Let's check for existing rate limiting implementations:

apps/billing/src/lib/firecrawl.ts (1)

Line range hint 19-116: Consider refactoring for better error handling and DRY principles

The function has several areas that could be improved:

  1. Error handling could be more specific about different types of errors (network, parsing, etc.).
  2. Database operations are duplicated across success/failure paths.
  3. The success/failure paths could be more clearly separated.

Consider extracting the database operations into a helper function:

// Note: taking the full $inferInsert (not a Partial) guarantees sourceUrl is
// present for the lookup below.
async function upsertFirecrawlResponse(data: typeof firecrawlResponses.$inferInsert) {
  await db
    .insert(firecrawlResponses)
    .values(data)
    .onDuplicateKeyUpdate({
      set: {
        ...data,
        updatedAt: new Date(),
      },
    });

  return await db.query.firecrawlResponses.findFirst({
    where: eq(firecrawlResponses.sourceUrl, data.sourceUrl),
  });
}

This would reduce duplication and make the main function cleaner.
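Both paths would then collapse to a single call each, along these lines (the field names here are assumed for illustration, not taken from the file):

// success path
return await upsertFirecrawlResponse({ sourceUrl, markdown: scrapeResult.markdown, success: true });

// failure path
return await upsertFirecrawlResponse({ sourceUrl, success: false, error: errorMessage });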

Let's verify the error handling patterns in similar functions:

apps/billing/src/lib/keywords.ts (2)

1-7: LGTM! Import statements are well organized.

The imports are logically grouped with AI-related imports at the top, followed by database and schema-related imports.


Line range hint 83-89: Verify the OpenAI model id is available.

The code uses gpt-4o-mini. Ensure this model id is enabled for your account and recognized by the OpenAI client version in use.

apps/billing/src/trigger/glossary/draft-sections.ts (1)

3-7: LGTM! Import changes improve type safety.

The reordering of imports and addition of proper type imports from the schema enhance type safety and maintainability.

apps/dashboard/app/(app)/apis/[apiId]/page.tsx (1)

Line range hint 42-58: Verify complete migration of queries to ClickHouse

The PR's objective is to move queries from Tinybird ("tb") to ClickHouse ("ch"). While this component already uses ClickHouse for verification and active key metrics, let's verify no queries remain to be migrated.

✅ Verification successful

Complete migration of queries to ClickHouse confirmed.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining database queries that might need migration
# Look for patterns indicating traditional database usage

# Check for direct database queries
rg -l 'from\s+schema\.' --type typescript

# Check for specific table access patterns
rg -l 'schema\.(keys|verifications|active_keys)' --type typescript

# Look for potential migration TODOs
rg -l 'TODO.*migrate.*to.*ch' --type typescript

Length of output: 267


Script:

#!/bin/bash
# Search for any remaining database queries that might need migration
# Look for patterns indicating traditional database usage

# Check for direct database queries
rg -l 'from\s+schema\.' --type ts

# Check for specific table access patterns
rg -l 'schema\.(keys|verifications|active_keys)' --type ts

# Look for potential migration TODOs
rg -l 'TODO.*migrate.*to.*ch' --type ts

Length of output: 3742


Script:

#!/bin/bash
# Search for any remaining "tb" database queries that might need migration to "ch"

# Check for Tinybird-specific schema references
rg -l 'schema_tb\.' --type ts

# Check for any connection strings or configurations referencing "tb"
rg -i -l 'connect.*tb' --type ts

# Check for comments or code indicating use of "tb"
rg -i -l 'use\s+tb' --type ts

Length of output: 98

apps/billing/src/trigger/glossary/evals.ts (4)

1-11: LGTM: Import section is well-organized

The import statements are properly organized and all imports are being used in the code.


Line range hint 14-27: LGTM: Type definitions are clear and well-structured

The type definitions provide good type safety and are used consistently throughout the tasks.


Line range hint 1-321: LGTM: Error handling and database operations are robust

The implementation demonstrates:

  • Proper error handling with AbortTaskRunError
  • Secure database operations with parameterized queries
  • Consistent JSON handling for stored data

Line range hint 29-156: Consider extracting system prompts to constants

The system prompts are currently embedded in the task implementations. Consider extracting them to constants for better maintainability and reusability.

+const RATING_SYSTEM_PROMPT = `You are a Senior Technical Content Evaluator...`;
+const RECOMMENDATION_SYSTEM_PROMPT = `You are a Senior Technical Content Strategist...`;

 export const getOrCreateRatingsTask = task({
   // ...
-    const systemPrompt = `You are a Senior Technical Content Evaluator...`;
+    const systemPrompt = RATING_SYSTEM_PROMPT;

Verify the OpenAI model configuration

The code uses the model id "gpt-4o-mini". Ensure it is correctly configured in your OpenAI client setup.

apps/billing/src/trigger/glossary/generate-outline.ts (2)

Line range hint 3-15: LGTM! Import organization improved

The consolidation of imports from db-marketing/schemas and the addition of type imports improves code organization and type safety.


Line range hint 197-211: LGTM! Well-structured schema definitions

The schema definitions are clear, properly typed with zod, and maintain good separation between initial and final outline structures.

apps/dashboard/app/(app)/settings/billing/page.tsx (1)

192-196: LGTM! Verify error handling for the new ClickHouse query.

The migration from ratelimits to clickhouse.billing.billableRatelimits looks good and maintains the same interface. The results are properly integrated into the billing calculations.
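If the query can reject, a defensive call site keeps the page rendering when ClickHouse is unavailable. A sketch only; the actual argument shape of billableRatelimits may differ:

const billableRatelimits = await clickhouse.billing
  .billableRatelimits({ workspaceId: workspace.id, year, month })
  .catch((err) => {
    console.error("failed to load billable ratelimits", err);
    return 0; // fall back to zero so the billing math still runs
  });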

Let's verify the error handling in the new ClickHouse implementation:

apps/dashboard/lib/trpc/routers/key/createRootKey.ts (1)

1-1: Verify the correctness of the updated import path

The import path for UnkeyAuditLog has been changed from "@/lib/tinybird" to "@/lib/audit". Please ensure that UnkeyAuditLog is correctly exported from the new module and that all references are updated accordingly.

Run the following script to confirm that UnkeyAuditLog is exported from "@/lib/audit":

✅ Verification successful

Import path for UnkeyAuditLog is correct and properly exported

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that UnkeyAuditLog is exported from "@/lib/audit"

# Find the 'audit.ts' file in the 'lib' directory and search for the export of UnkeyAuditLog
fd 'audit.ts' lib --exec rg 'export.*UnkeyAuditLog' {} \;

Length of output: 151


Script:

#!/bin/bash
# Description: Locate 'audit.ts' in the repository and verify that UnkeyAuditLog is exported

# Search for 'audit.ts' anywhere in the repository
audit_files=$(fd 'audit.ts' .)

if [ -z "$audit_files" ]; then
  echo "No audit.ts file found in the repository."
  exit 1
fi

# Check each found 'audit.ts' for the export of UnkeyAuditLog
for file in $audit_files; do
  rg 'export\s+.*UnkeyAuditLog' "$file"
done

Length of output: 412

apps/dashboard/lib/trpc/routers/rbac.ts (1)

3-3: Import statement updated appropriately

The addition of UnkeyAuditLog to the import ensures the necessary type is available for audit logging within the file.

Comment on lines +20 to +30
resources: z.string().transform((rs) =>
z
.array(
z.object({
type: z.string(),
id: z.string(),
meta: z.record(z.union([z.string(), z.number(), z.boolean(), z.null()])).optional(),
}),
)
.parse(JSON.parse(rs)),
),

⚠️ Potential issue

Add input validation before JSON.parse

The direct use of JSON.parse without try-catch could throw on invalid JSON input. Consider adding validation or error handling.

-    resources: z.string().transform((rs) =>
-      z
-        .array(
+    resources: z.string().transform((rs) => {
+      let parsed: unknown = [];
+      try {
+        parsed = JSON.parse(rs);
+      } catch {
+        // fall back to an empty array on malformed JSON
+      }
+      return z
+        .array(
           z.object({
             type: z.string(),
             id: z.string(),
             meta: z.record(z.union([z.string(), z.number(), z.boolean(), z.null()])).optional(),
           }),
         )
-        .parse(JSON.parse(rs)),
-    ),
+        .parse(parsed);
+    }),

Committable suggestion skipped: line range outside the PR's diff.

@@ -1,13 +1,13 @@
import { db } from "@/lib/db-marketing/client";

💡 Codebase verification

Update the import to use the ClickHouse client.
The file apps/billing/src/trigger/glossary/_generate-glossary-entry.ts is still importing the old db-marketing client. Please migrate this to use the ClickHouse client to align with the PR objectives.

🔗 Analysis chain

Verify database client migration to ClickHouse.

Based on the PR objectives to "move more queries from tb to ch", should this import still be using the marketing database client or should it be migrated to use the ClickHouse client?

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any new ClickHouse database client implementations
rg -g '!node_modules' -g '!dist' -A 5 'clickhouse|ch-client'

# Search for similar database operations that might have been migrated
ast-grep --pattern 'db.query.entries.findFirst'

Length of output: 61314

Comment on lines +163 to +165
const prompt = `Summarize the following content for the term "${
connectTo.term
}" that's ranking #${serperResult?.position ?? "unknown"}:

⚠️ Potential issue

Sanitize user input in prompt construction

The template literal directly interpolates user input (connectTo.term) into the prompt, which could potentially lead to prompt injection vulnerabilities.

Consider sanitizing the input:

-  const prompt = `Summarize the following content for the term "${
-    connectTo.term
-  }" that's ranking #${serperResult?.position ?? "unknown"}:
+  const prompt = `Summarize the following content for the term "${
+    sanitizeInput(connectTo.term)
+  }" that's ranking #${serperResult?.position ?? "unknown"}:

Committable suggestion skipped: line range outside the PR's diff.
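Note that sanitizeInput is not defined in this codebase. A minimal sketch of such a helper, with illustrative rules:

// Hypothetical helper: strips quoting characters that could close the quoted
// term in the prompt, collapses newlines, and caps the length.
function sanitizeInput(input: string, maxLength = 200): string {
  return input
    .replace(/["'`\\]/g, "")
    .replace(/[\r\n]+/g, " ")
    .slice(0, maxLength)
    .trim();
}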

const successOverTime: { x: string; y: number }[] = [];
const ratelimitedOverTime: { x: string; y: number }[] = [];
const usageExceededOverTime: { x: string; y: number }[] = [];
const disabledOverTime: { x: string; y: number }[] = [];
const insufficientPermissionsOverTime: { x: string; y: number }[] = [];
const expiredOverTime: { x: string; y: number }[] = [];
const forbiddenOverTime: { x: string; y: number }[] = [];
const emptyOverTime: { x: string; y: number }[] = [];
const _emptyOverTime: { x: string; y: number }[] = [];

🛠️ Refactor suggestion

Remove unused variable _emptyOverTime

The variable _emptyOverTime is declared but never used throughout the component. While prefixing with underscore indicates intent to mark it as unused, it would be cleaner to remove it entirely.

-  const _emptyOverTime: { x: string; y: number }[] = [];
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const _emptyOverTime: { x: string; y: number }[] = [];
