feat(parser): parallel file parsing (#6382)
This PR enables parallel parsing of files, utilizing multiple parser
workers.
Until now we had only one `ParserPrinterWorker`, which had to handle
parsing all of the files serially. In our sample store this could be
~10-15s spent only on parsing (depending on the machine).

In this PR we're introducing the ability to spin up multiple workers and
divide the file load between them, allowing files to be parsed in parallel.

**Note to reviewers:** Main logic changes are in `worker-types.ts` and
`workers.ts`

Some important points:
1. Since parsing the files happens on the load critical path, we can
easily hit a hardware limitation when multiple workers and the main thread
take up all of the available processors. This limits the number of new
workers we can spin up, and also causes N workers to work for more than
1/N of the time in parallel. Even so, there are still meaningful
gains.
2. There is an overhead to spinning up workers, since our parsing code
itself is heavy and takes some time to evaluate. A proper threadpool
solution might be able to solve that (spin up workers when needed and keep
them alive as long as the parse load is large). However, this requires
changing our entire worker mechanism, and can even hurt performance,
since spinning up new workers in the "middle" of the parsing flow
creates an overhead. It can still be a future optimization: start
with 3-4 workers in advance and, if needed, spin up more using a
threadpool library (`workerpool` or `threads`, for example).
3. Our current chunking algorithm splits the parsable files (not all
files are parsable) into N chunks that are more-or-less equal in terms
of the sum of their file lengths, and sends them in parallel to N
workers. This makes sure that all of the workers work for roughly
the same amount of time. A proper queue-based threadpool solution could
be more efficient here (since it would always send the next file to an
available worker), but the difference from the current "naive" chunking
implementation wouldn't be that large, and it would also add a small
messaging overhead (sending each file in turn).
4. The current concurrency is 3; it will be configurable from the
settings/RYO menu once #6381 is merged.
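
The chunking described in point 3 can be illustrated with a minimal sketch. It mirrors the shape of the `chunkArrayEqually` helper added in this commit, but it is not the PR's code: the `ParsableFile` type and the `chunkByLength` name are hypothetical, and visiting files largest-first is an assumption.

```typescript
// Hypothetical file shape, for illustration only.
interface ParsableFile {
  filename: string
  length: number
}

// Greedy balancing: visit files from largest to smallest (an assumption)
// and always add the next file to the chunk with the smallest running total.
function chunkByLength(files: ParsableFile[], chunkCount: number): ParsableFile[][] {
  const sorted = [...files].sort((a, b) => b.length - a.length)
  const chunks: ParsableFile[][] = Array.from({ length: chunkCount }, () => [])
  const sums: number[] = new Array(chunkCount).fill(0)
  for (const file of sorted) {
    let lightest = 0
    for (let i = 1; i < chunkCount; i++) {
      if (sums[i] < sums[lightest]) {
        lightest = i
      }
    }
    chunks[lightest].push(file)
    sums[lightest] += file.length
  }
  // Drop empty chunks, which appear when there are fewer files than chunks.
  return chunks.filter((chunk) => chunk.length > 0)
}

const sampleFiles: ParsableFile[] = [
  { filename: 'a.tsx', length: 900 },
  { filename: 'b.tsx', length: 500 },
  { filename: 'c.tsx', length: 400 },
  { filename: 'd.tsx', length: 300 },
  { filename: 'e.tsx', length: 100 },
]
const sampleChunks = chunkByLength(sampleFiles, 3)
// Total length per chunk; for this input the totals are 900, 600 and 700.
const chunkTotals = sampleChunks.map((chunk) => chunk.reduce((sum, f) => sum + f.length, 0))
```

The totals are only roughly equal, which matches the "more-or-less equal" wording above: one very large file can still dominate a chunk.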

**Commit Details:**

- Added support for multiple parse workers in `UtopiaTsWorkers`
- Changed `getParseResult` to chunk the files and call
`getParseResultSerial` on each chunk, waiting for all of them to return before
returning the result
- `getParseResultSerial` now looks for the next available
`ParserPrinterWorker` to send the next chunk to, for parsing
- Worker code itself remained unchanged
- The feature can be toggled on/off from the settings menu
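
The "chunk, parse in parallel, wait for all" flow from the commit details can be sketched as follows. This is a simplified illustration, not the code in `worker-types.ts`: `FileToParse`, `ParsedFile`, `ParseChunk`, `parseChunksInParallel` and the fake worker are all hypothetical names, and carrying an explicit `index` to restore the caller's order is an assumption about one possible approach.

```typescript
// Hypothetical shapes, for illustration only.
interface FileToParse {
  index: number // position in the caller's original list
  filename: string
}
interface ParsedFile {
  index: number
  filename: string
  parsed: string
}

// Stand-in for sending one chunk to a parser worker and awaiting its reply.
type ParseChunk = (chunk: FileToParse[]) => Promise<ParsedFile[]>

async function parseChunksInParallel(
  chunks: FileToParse[][],
  parseChunk: ParseChunk,
): Promise<ParsedFile[]> {
  // Start every chunk at once; Promise.all resolves once all have returned.
  const perChunk = await Promise.all(chunks.map((chunk) => parseChunk(chunk)))
  const merged = ([] as ParsedFile[]).concat(...perChunk)
  // Chunking reorders files, so restore the caller's original order.
  return merged.sort((a, b) => a.index - b.index)
}

// A fake worker that "parses" by upper-casing the filename.
const fakeParseChunk: ParseChunk = async (chunk) =>
  chunk.map((file) => ({ ...file, parsed: file.filename.toUpperCase() }))

const parallelResult: Promise<ParsedFile[]> = parseChunksInParallel(
  [
    [{ index: 2, filename: 'c.tsx' }],
    [
      { index: 0, filename: 'a.tsx' },
      { index: 1, filename: 'b.tsx' },
    ],
  ],
  fakeParseChunk,
)
```

Because the caller only sees the merged, re-ordered result, the worker code itself does not need to know that chunking happened, which is consistent with the note that the worker code remained unchanged.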

**Manual Tests:**
I hereby swear that:

- [X] I opened a hydrogen project and it loaded
- [X] I could navigate to various routes in Preview mode

Fixes #5655
liady authored Oct 5, 2024
1 parent 328b3ed commit d482650
Showing 17 changed files with 299 additions and 44 deletions.
4 changes: 2 additions & 2 deletions editor/src/components/canvas/ui-jsx.test-utils.tsx
@@ -569,7 +569,7 @@ export async function renderTestEditorWithModel(
   }

   const workers = new UtopiaTsWorkersImplementation(
-    new FakeParserPrinterWorker(),
+    [new FakeParserPrinterWorker()],
     new FakeLinterWorker(),
     new FakeWatchdogWorker(),
   )
@@ -1042,7 +1042,7 @@ export function createBuiltinDependenciesWithTestWorkers(
   extraBuiltinDependencies: BuiltInDependencies,
 ): BuiltInDependencies {
   const workers = new UtopiaTsWorkersImplementation(
-    new FakeParserPrinterWorker(),
+    [new FakeParserPrinterWorker()],
     new FakeLinterWorker(),
     new FakeWatchdogWorker(),
   )
Original file line number Diff line number Diff line change
@@ -105,7 +105,7 @@ function createEditorStore(
       },
     },
     workers: new UtopiaTsWorkersImplementation(
-      new FakeParserPrinterWorker(),
+      [new FakeParserPrinterWorker()],
       new FakeLinterWorker(),
       new FakeWatchdogWorker(),
     ),
8 changes: 2 additions & 6 deletions editor/src/components/editor/store/dispatch-strategies.tsx
@@ -51,13 +51,12 @@ import type {
   StrategyApplicationResult,
 } from '../../canvas/canvas-strategies/canvas-strategy-types'
 import { strategyApplicationResult } from '../../canvas/canvas-strategies/canvas-strategy-types'
-import { isFeatureEnabled } from '../../../utils/feature-switches'
-import { PERFORMANCE_MARKS_ALLOWED } from '../../../common/env-vars'
 import { last } from '../../../core/shared/array-utils'
 import type { BuiltInDependencies } from '../../../core/es-modules/package-manager/built-in-dependencies-list'
 import { isInsertMode } from '../editor-modes'
 import { patchedCreateRemixDerivedDataMemo } from './remix-derived-data'
 import { allowedToEditProject } from './collaborative-editing'
+import { canMeasurePerformance } from '../../../core/performance/performance-utils'

 interface HandleStrategiesResult {
   unpatchedEditorState: EditorState
@@ -672,10 +671,7 @@ export function handleStrategies(
   result: EditorStoreUnpatched,
   oldDerivedState: DerivedState,
 ): HandleStrategiesResult & { patchedDerivedState: DerivedState } {
-  const MeasureDispatchTime =
-    (isFeatureEnabled('Debug – Performance Marks (Fast)') ||
-      isFeatureEnabled('Debug – Performance Marks (Slow)')) &&
-    PERFORMANCE_MARKS_ALLOWED
+  const MeasureDispatchTime = canMeasurePerformance()

   if (MeasureDispatchTime) {
     window.performance.mark('strategies_begin')
28 changes: 22 additions & 6 deletions editor/src/components/editor/store/dispatch.tsx
@@ -1,4 +1,3 @@
-import { PERFORMANCE_MARKS_ALLOWED } from '../../../common/env-vars'
 import type { UtopiaTsWorkers } from '../../../core/workers/common/worker-types'
 import { getParseResult } from '../../../core/workers/common/worker-types'
 import { runLocalCanvasAction } from '../../../templates/editor-canvas'
@@ -59,7 +58,6 @@ import {
   getProjectChanges,
   sendVSCodeChanges,
 } from './vscode-changes'
-import { isFeatureEnabled } from '../../../utils/feature-switches'
 import { handleStrategies, updatePostActionState } from './dispatch-strategies'

 import type { MetaCanvasStrategy } from '../../canvas/canvas-strategies/canvas-strategies'
@@ -89,6 +87,15 @@ import {
 import type { PropertyControlsInfo } from '../../custom-code/code-file'
 import { getFilePathMappings } from '../../../core/model/project-file-utils'
 import type { ElementInstanceMetadataMap } from '../../../core/shared/element-template'
+import {
+  getParserChunkCount,
+  getParserWorkerCount,
+  isConcurrencyLoggingEnabled,
+} from '../../../core/workers/common/concurrency-utils'
+import {
+  canMeasurePerformance,
+  startPerformanceMeasure,
+} from '../../../core/performance/performance-utils'
 import { getParseCacheOptions } from '../../../core/shared/parse-cache-utils'

 type DispatchResultFields = {
@@ -327,15 +334,27 @@ function maybeRequestModelUpdate(

   // Should anything need to be sent across, do so here.
   if (filesToUpdate.length > 0) {
+    const { endMeasure } = startPerformanceMeasure('file-parse', { uniqueId: true })
     const parseFinished = getParseResult(
       workers,
       filesToUpdate,
       getFilePathMappings(projectContents),
       existingUIDs,
       isSteganographyEnabled(),
+      getParserChunkCount(),
       getParseCacheOptions(),
     )
       .then((parseResult) => {
+        const duration = endMeasure()
+        if (isConcurrencyLoggingEnabled() && filesToUpdate.length > 1) {
+          console.info(
+            `parse finished for ${
+              filesToUpdate.length
+            } files, using ${getParserChunkCount()} chunks and ${getParserWorkerCount()} workers, in ${duration.toFixed(
+              2,
+            )}ms`,
+          )
+        }
         const updates = parseResult.map((fileResult) => {
           return parseResultToWorkerUpdates(fileResult)
         })
@@ -810,10 +829,7 @@ function editorDispatchInner(
 ): DispatchResult {
   // console.log('DISPATCH', simpleStringifyActions(dispatchedActions), dispatchedActions)

-  const MeasureDispatchTime =
-    (isFeatureEnabled('Debug – Performance Marks (Fast)') ||
-      isFeatureEnabled('Debug – Performance Marks (Slow)')) &&
-    PERFORMANCE_MARKS_ALLOWED
+  const MeasureDispatchTime = canMeasurePerformance()

   if (MeasureDispatchTime) {
     window.performance.mark('dispatch_begin')
Original file line number Diff line number Diff line change
@@ -1,14 +1,11 @@
-import { PERFORMANCE_MARKS_ALLOWED } from '../../../common/env-vars'
+import { canMeasurePerformance } from '../../../core/performance/performance-utils'
 import { isFeatureEnabled } from '../../../utils/feature-switches'
 import type { EditorAction } from '../action-types'
 import { simpleStringifyActions } from '../actions/action-utils'

 export function createPerformanceMeasure() {
   const MeasureSelectorsEnabled = isFeatureEnabled('Debug – Measure Selectors')
-  const PerformanceMarks =
-    (isFeatureEnabled('Debug – Performance Marks (Slow)') ||
-      isFeatureEnabled('Debug – Performance Marks (Fast)')) &&
-    PERFORMANCE_MARKS_ALLOWED
+  const PerformanceMarks = canMeasurePerformance()

   let stringifiedActions = ''

Original file line number Diff line number Diff line change
@@ -27,6 +27,7 @@ jest.mock('../../../core/workers/common/worker-types', () => ({
     filePathMappings: FilePathMappings,
     alreadyExistingUIDs: Set<string>,
     applySteganography: SteganographyMode,
+    parserChunkCount: number,
     parsingCacheOptions: ParseCacheOptions,
   ): Promise<Array<ParseOrPrintResult>> {
     mockParseStartedCount++
@@ -38,6 +39,7 @@ jest.mock('../../../core/workers/common/worker-types', () => ({
       filePathMappings,
       alreadyExistingUIDs,
       applySteganography,
+      parserChunkCount,
       parsingCacheOptions,
     )
     mockLock2.resolve()
17 changes: 14 additions & 3 deletions editor/src/components/navigator/left-pane/roll-your-own-pane.tsx
@@ -30,8 +30,10 @@ type GridFeatures = {

 type PerformanceFeatures = {
   parseCache: boolean
-  verboseLogCache: boolean
+  parallelParsing: boolean
   cacheArbitraryCode: boolean
+  verboseLogCache: boolean
+  logParseTimings: boolean
 }

 type RollYourOwnFeaturesTypes = {
@@ -47,6 +49,8 @@ const featureToFeatureFlagMap: Record<keyof Partial<PerformanceFeatures>, Featur
   parseCache: 'Use Parsing Cache',
   verboseLogCache: 'Verbose Log Cache',
   cacheArbitraryCode: 'Arbitrary Code Cache',
+  parallelParsing: 'Parallel Parsing',
+  logParseTimings: 'Log Parse Timings',
 }

 const defaultRollYourOwnFeatures: () => RollYourOwnFeatures = () => ({
@@ -64,8 +68,10 @@ const defaultRollYourOwnFeatures: () => RollYourOwnFeatures = () => ({
   },
   Performance: {
     parseCache: getFeatureFlagValue('parseCache'),
-    verboseLogCache: getFeatureFlagValue('verboseLogCache'),
+    parallelParsing: getFeatureFlagValue('parallelParsing'),
     cacheArbitraryCode: getFeatureFlagValue('cacheArbitraryCode'),
+    logParseTimings: getFeatureFlagValue('logParseTimings'),
+    verboseLogCache: getFeatureFlagValue('verboseLogCache'),
   },
 })

@@ -256,10 +262,15 @@ const SimpleFeatureControls = React.memo(({ subsection }: { subsection: Section
     <React.Fragment>
       {Object.keys(defaultFeatures[subsection]).map((key) => {
         const feat = key as keyof RollYourOwnFeaturesTypes[Section]
+        const featName = Object.keys(featureToFeatureFlagMap).includes(
+          feat as keyof typeof featureToFeatureFlagMap,
+        )
+          ? featureToFeatureFlagMap[feat as keyof typeof featureToFeatureFlagMap]
+          : feat
         const value = features[subsection][feat] ?? defaultFeatures[subsection][feat]
         return (
           <UIGridRow padded variant='<----------1fr---------><-auto->' key={`feat-${feat}`}>
-            <Ellipsis title={feat}>{feat}</Ellipsis>
+            <Ellipsis title={featName}>{featName}</Ellipsis>
             {typeof value === 'boolean' ? (
               <input type='checkbox' checked={value} onChange={onChange(feat)} />
             ) : typeof value === 'string' ? (
38 changes: 38 additions & 0 deletions editor/src/core/performance/performance-utils.ts
@@ -1,3 +1,6 @@
+import { PERFORMANCE_MARKS_ALLOWED } from '../../common/env-vars'
+import { isFeatureEnabled } from '../../utils/feature-switches'
+
 export function timeFunction(fnName: string, fn: () => any, iterations: number = 100) {
   const start = Date.now()
   for (var i = 0; i < iterations; i++) {
@@ -8,3 +11,38 @@ export function timeFunction(fnName: string, fn: () => any, iterations: number =
   // eslint-disable-next-line no-console
   console.log(`${fnName} took ${timeTaken}ms`)
 }
+
+export function canMeasurePerformance(): boolean {
+  return (
+    (isFeatureEnabled('Debug – Performance Marks (Fast)') ||
+      isFeatureEnabled('Debug – Performance Marks (Slow)')) &&
+    PERFORMANCE_MARKS_ALLOWED
+  )
+}
+
+export function startPerformanceMeasure(
+  measureName: string,
+  { uniqueId }: { uniqueId?: boolean } = {},
+): { id: string; endMeasure: () => number } {
+  const id = uniqueId ? `${measureName}-${Math.random()}` : measureName
+  if (PERFORMANCE_MARKS_ALLOWED) {
+    performance.mark(`${id}-start`)
+  }
+  return {
+    id: id,
+    endMeasure: () => endPerformanceMeasure(id),
+  }
+}
+
+export function endPerformanceMeasure(id: string): number {
+  if (PERFORMANCE_MARKS_ALLOWED) {
+    performance.mark(`${id}-end`)
+    performance.measure(`${id}-duration`, `${id}-start`, `${id}-end`)
+    const measurements = performance.getEntriesByName(`${id}-duration`)
+    const latestMeasurement = measurements[measurements.length - 1]
+    if (latestMeasurement != null) {
+      return latestMeasurement.duration
+    }
+  }
+  return 0
+}
Original file line number Diff line number Diff line change
@@ -7,6 +7,7 @@ import type { Imports } from '../shared/project-file-types'
 import { isParseFailure, isParseSuccess } from '../shared/project-file-types'
 import { emptySet } from '../shared/set-utils'
 import { isSteganographyEnabled } from '../shared/stegano-text'
+import { getParserChunkCount } from '../workers/common/concurrency-utils'
 import type { UtopiaTsWorkers } from '../workers/common/worker-types'
 import { createParseFile, getParseResult } from '../workers/common/worker-types'
 import { ARBITRARY_CODE_FILE_NAME } from '../workers/common/worker-types'
@@ -51,6 +52,7 @@ async function getParseResultForUserStrings(
     [],
     emptySet(),
     isSteganographyEnabled(),
+    getParserChunkCount(),
     getParseCacheOptions(),
   )

20 changes: 20 additions & 0 deletions editor/src/core/shared/array-utils.spec.ts
@@ -3,6 +3,8 @@ import {
   intersection,
   mapAndFilter,
   possiblyUniqueInArray,
+  sortArrayByAndReturnPermutation,
+  revertArrayOrder,
   strictEvery,
 } from './array-utils'

@@ -53,6 +55,24 @@ describe('aperture', () => {
   })
 })

+describe('sortArrayByAndReturnPermutation', () => {
+  it('should sort the array', () => {
+    const { sortedArray } = sortArrayByAndReturnPermutation([3, 1, 2], (a) => a)
+    expect(sortedArray).toEqual([1, 2, 3])
+  })
+  it('should respect the sort order', () => {
+    const { sortedArray } = sortArrayByAndReturnPermutation([3, 1, 2], (a) => a, false)
+    expect(sortedArray).toEqual([3, 2, 1])
+  })
+  it('should return the permutation that reverses the sort', () => {
+    const originalArray = [10, 5, 6, 32, 102, 7, 91]
+    const { sortedArray, permutation } = sortArrayByAndReturnPermutation(originalArray, (a) => a)
+    expect(sortedArray).toEqual([5, 6, 7, 10, 32, 91, 102])
+    const reversedToOriginal = revertArrayOrder(sortedArray, permutation)
+    expect(reversedToOriginal).toEqual(originalArray)
+  })
+})
+
 describe('mapAndFilter', () => {
   const input = [1, 2, 3, 4, 5]
   const mapFn = (n: number) => n + 10
38 changes: 38 additions & 0 deletions editor/src/core/shared/array-utils.ts
@@ -538,3 +538,41 @@ export function matrixGetter<T>(array: T[], width: number): (row: number, column
     return array[row * width + column]
   }
 }
+
+export function chunkArrayEqually<T>(
+  sortedArray: T[],
+  numberOfChunks: number,
+  valueFn: (t: T) => number,
+): T[][] {
+  const chunks: T[][] = Array.from({ length: numberOfChunks }, () => [])
+  const chunkSums: number[] = Array(numberOfChunks).fill(0)
+  for (const data of sortedArray) {
+    let minIndex = 0
+    for (let i = 1; i < numberOfChunks; i++) {
+      if (chunkSums[i] < chunkSums[minIndex]) {
+        minIndex = i
+      }
+    }
+    chunks[minIndex].push(data)
+    chunkSums[minIndex] += valueFn(data)
+  }
+  return chunks.filter((chunk) => chunk.length > 0)
+}
+
+export function sortArrayByAndReturnPermutation<T>(
+  array: T[],
+  sortFn: (t: T) => number,
+  ascending: boolean = true,
+): { sortedArray: T[]; permutation: number[] } {
+  const permutation = array.map((_, index) => index)
+  permutation.sort((a, b) => {
+    const sortResult = sortFn(array[a]) - sortFn(array[b])
+    return ascending ? sortResult : -sortResult
+  })
+  const sortedArray = permutation.map((index) => array[index])
+  return { sortedArray, permutation }
+}
+
+export function revertArrayOrder<T>(array: T[], permutation: number[]): T[] {
+  return array.map((_, index) => array[permutation.indexOf(index)])
+}
2 changes: 2 additions & 0 deletions editor/src/core/shared/parser-projectcontents-utils.ts
@@ -35,6 +35,7 @@ import { fastForEach } from '../../core/shared/utils'
 import { codeNeedsPrinting, codeNeedsParsing } from '../../core/workers/common/project-file-utils'
 import { isFeatureEnabled } from '../../utils/feature-switches'
 import { isSteganographyEnabled } from './stegano-text'
+import { getParserChunkCount } from '../workers/common/concurrency-utils'
 import { getParseCacheOptions } from './parse-cache-utils'

 export function parseResultToWorkerUpdates(fileResult: ParseOrPrintResult): WorkerUpdate {
@@ -204,6 +205,7 @@ export async function updateProjectContentsWithParseResults(
     getFilePathMappings(projectContents),
     existingUIDs,
     isSteganographyEnabled(),
+    getParserChunkCount(),
     getParseCacheOptions(),
   )

23 changes: 23 additions & 0 deletions editor/src/core/workers/common/concurrency-utils.ts
@@ -0,0 +1,23 @@
+import { isFeatureEnabled } from '../../../utils/feature-switches'
+
+// TODO: this will be configurable from the RYO menu
+export const PARSE_CONCURRENCY = 3
+const PARSER_CONCURRENCY_FEATURE = 'Parallel Parsing'
+const LOG_CONCURRENCY_TIMINGS_FEATURE = 'Log Parse Timings'
+
+export function isConcurrencyEnabled() {
+  return isFeatureEnabled(PARSER_CONCURRENCY_FEATURE)
+}
+
+export function getParserWorkerCount() {
+  return isConcurrencyEnabled() ? PARSE_CONCURRENCY : 1
+}
+
+export function getParserChunkCount() {
+  return isConcurrencyEnabled() ? PARSE_CONCURRENCY : 1
+}
+
+// TODO: this will be configurable from the RYO menu
+export function isConcurrencyLoggingEnabled() {
+  return isFeatureEnabled(LOG_CONCURRENCY_TIMINGS_FEATURE)
+}