forked from mlcommons/inference
Commit
Update processorca for new rules and patching potential TTFT exploit. Enable GPU for Offline scenario in Llama-v2 Reference Implementation. (mlcommons#1544)

* Add environment files for GPU
* Update flags to enable running on device=gpu
* Enable BS>1 on GPU, update processorca for new rules and first token workaround
* Dump outputs and add script to consolidate into single pickle file for analysis
* Add option to continue run if prior session was killed
* Small comment fix, fix default flags to match original reference implementation
* Generalize launch scripts for outside users, update README
* Minor fix: comments and default arguments
* Add accuracy target to README
* Add calibration dataset generation to processorca.py
* Update language/llama2-70b/README.md

Co-authored-by: Zhihan Jiang <[email protected]>

* Make calibration rng seed a kwarg

---------

Co-authored-by: Zhihan Jiang <[email protected]>
1 parent 678ed4f, commit 94b0cc4
Showing 11 changed files with 514 additions and 43 deletions.
@@ -0,0 +1,48 @@
# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04
SHELL ["/bin/bash", "-c"]

ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8

ENV TZ=US/Pacific
ENV DEBIAN_FRONTEND=noninteractive

RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
RUN rm -rf /var/lib/apt/lists/* && rm /etc/apt/sources.list.d/* \
&& apt update \
&& apt install -y --no-install-recommends build-essential autoconf \
libtool git ccache curl wget pkg-config sudo ca-certificates \
automake libssl-dev bc python3-dev python3-pip google-perftools \
gdb libglib2.0-dev clang sshfs libre2-dev libboost-dev \
libnuma-dev numactl sysstat sshpass ntpdate less iputils-ping \
&& apt -y autoremove \
&& apt remove -y cmake \
&& apt install -y --no-install-recommends pkg-config zip g++ zlib1g-dev \
unzip libarchive-dev
RUN apt install -y --no-install-recommends rsync

# Install setuptools
RUN python3 -m pip install --upgrade pip \
&& python3 -m pip install --upgrade setuptools wheel virtualenv

# Install conda
WORKDIR /tmp
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py310_23.5.2-0-Linux-x86_64.sh \
&& bash Miniconda3-* -b -p /opt/miniconda3
ENV PATH="$PATH:/opt/miniconda3/bin"
RUN conda create -n llama2-70b python=3.10
RUN chmod -R 777 /opt/miniconda3
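The Dockerfile above creates the `llama2-70b` conda environment but does not activate it for later build steps. What follows is a minimal sketch, not part of the commit, of how a derived image might run commands inside that environment. It assumes the image above was built and tagged `llama2-70b-gpu:dev`, a hypothetical tag chosen only for illustration. It also relies on `conda` being on `PATH`, which the `ENV PATH` line above provides to derived images.

```dockerfile
# Hypothetical derived image; the tag below is an assumption, not from the commit.
FROM llama2-70b-gpu:dev

# Option 1: run a single command inside the llama2-70b environment.
RUN conda run -n llama2-70b python --version

# Option 2: make every following RUN instruction execute inside the environment.
# Docker appends each RUN command string after "-c", so bash is launched by `conda run`.
SHELL ["conda", "run", "--no-capture-output", "-n", "llama2-70b", "/bin/bash", "-c"]
RUN python -c "import sys; print(sys.version)"
```

Option 1 keeps the default shell and makes the environment explicit per command. Option 2 is shorter when many consecutive steps need the environment, at the cost of changing the shell for everything after it.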