Note: Bashi is still at a very early stage and under active development. If you are looking for a polished product you will be disappointed, but if you are looking to spend some time building your own personal assistant AI then you're in the right place!
Bashi is an extensible platform that bridges LLMs to tasks and actions.
It comes with an OSX personal assistant app, so you can try it out quickly, and mould the app to your own needs.
This repo has two components:
- The Bashi API server.
- The OSX app 'assist', which serves as an example client implementation as well as a usable product.
Bashi uses a novel approach: the LLM is asked to write JavaScript, and the server effectively provides a REPL for the model. Below are some example prompt + completions. All of these are real examples that you can try in the OSX app 👍
```
Request: help me write a commit message please
Thought: I need to generate a commit message based on a diff
Action: returnText(writeCommitMessage(getInputText("diff")));
```
```
Request: there is a function I don't understand, can you help me summarize it?
Thought: I need to extract the information from the given string
Action: returnText(extractInformation("summarize the function", getInputText("what is the function?")))
```
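To make the REPL idea concrete, below is a minimal sketch of how a model-written action expression could be evaluated against a set of command functions. The names and the direct use of the `Function` constructor are illustrative assumptions; the actual server parses and interprets the action expression rather than evaluating model output blindly.

```typescript
// Minimal sketch: expose each command as a local name inside the evaluated
// expression. Illustrative only -- the real server does not eval model output.
type CommandFn = (...args: any[]) => any;

function runAction(action: string, commands: Record<string, CommandFn>): any {
  const names = Object.keys(commands);
  const fns = names.map((name) => commands[name]);
  // Build a function whose parameters are the command names, then call it
  // with the command implementations so the expression can reference them.
  return new Function(...names, `"use strict"; return (${action});`)(...fns);
}

// Hypothetical stand-ins for the commands used in the first example above:
const result = runAction(
  'returnText(writeCommitMessage(getInputText("diff")))',
  {
    getInputText: (prompt) => "diff --git a/main.ts b/main.ts ...",
    writeCommitMessage: (diff) => `a commit message describing: ${diff}`,
    returnText: (text) => text,
  },
);
```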
There seem to be some advantages to this approach:
- Reduced completion sizes: code is a compact way to express intent, and packing multiple steps into a single action saves on model round trips.
- GPT-3.5 was probably trained on a lot of JavaScript, so asking it to write JavaScript may play to whatever emergent reasoning/logic capabilities it has.
Clients can extend the capabilities of the agent by providing their own commands/functions. For example, the OSX client provides the server with information about the `createCalendarEvent` command:
```swift
AnonymousCommand(
    name: "createCalendarEvent",
    cost: .Low,
    description: "make calendar event for the given name, datetime and duration",
    args: [
        .init(type: .string, name: "name"),
        .init(type: .string, name: "iso8601Date"),
        .init(type: .number, name: "event duration in hours")
    ],
    returnType: .void,
    triggerTokens: ["calendar", "event", "appointment", "meeting"],
    runFn: { (api, ctx, args) async throws -> BashiValue in
        // ... redacted guard code (binds `name`, `date`, `hours` from `args`,
        // and looks up `defaultCalendar`)
        let event = EKEvent.init(eventStore: self.eventStore)
        event.startDate = date
        event.title = name
        event.endDate = date.addingTimeInterval(60 * 60 * hours.doubleValue)
        event.calendar = defaultCalendar
        try self.eventStore.save(event, span: .thisEvent, commit: true)
        await api.indicateCommandResult(message: "Calendar event created")
        return .init(.void)
    }),
```
Commands can be resolved on either the client or the server. The example above is a command resolved on the client; in contrast, commands like `math` are resolved on the server.
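As a rough illustration of the server side, a server-resolved command might be registered with a shape similar to the Swift definition above. The interface below is hypothetical and only mirrors the fields visible in the client example; see the server source for the real definitions.

```typescript
// Hypothetical interface mirroring the fields in the Swift example above;
// the server's actual command types live in the server source.
interface CommandDefinition {
  description: string;
  args: { type: "string" | "number"; name: string }[];
  returnType: "string" | "number" | "void";
  triggerTokens?: string[];
  run: (args: (string | number)[]) => Promise<string | number | void>;
}

// A server-resolved command in the spirit of the built-in `math` command:
const math: CommandDefinition = {
  description: "evaluate a math expression",
  args: [{ type: "string", name: "expression" }],
  returnType: "string",
  triggerTokens: ["math", "calculate"],
  run: async ([expression]) => {
    // Sketch only: a real implementation should use a proper expression
    // parser rather than evaluating arbitrary strings.
    return String(new Function(`"use strict"; return (${expression});`)());
  },
};
```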
Clients just need to interface with the API defined in `openapi.json`. There are plenty of OpenAPI definition -> client library generation tools out there to help get things started if you wish to write a client in a new language.
After cloning the repo, set up your API keys. At minimum you'll need an OpenAI API key.
```sh
cp server/.env.template server/.env
# edit server/.env
```
Run the entire server stack using Docker:

```sh
make build
make up-all
open http://localhost:8003
```
The index page has some examples, and you can play around with text or audio prompts. Note that any commands that must be resolved on the client side are fixtures/dummies (the OSX app includes 'real' client-side commands).
If you are working on the server, you should use the live-reloading dev server. Not only does it pick up code changes, it also regenerates the OpenAPI spec when the API interface changes.
```sh
make dev
open http://localhost:8080
```
Currently, pre-built binaries are not available, so you'll need to build the OSX app yourself.
Open the xcworkspace in Xcode:

```sh
open assist/assist.xcworkspace
```
Build the 'assist' scheme. Note that by default the app points to `http://localhost:8003/api`, which corresponds to the server running in Docker via `make up-all`. If you are running the server with `make dev`, you'll want to update the API base URL to `http://localhost:8080/api` in the app settings.
Any changes to the API surface will require a new Swift client to be generated using `make clients`.
I still need to work on some more comprehensive documentation for the codebase 🙇
The API is described in `openapi.json`, which can be plugged into https://editor.swagger.io/ for viewing.
Let's build JARVIS together :)
There is no contribution guide for now, but you are welcome to make contributions to the OSX client (or introduce new clients if you are okay with an unstable API).
Issues and feature requests are accepted for `server/`, but not code changes at this moment.
For the OSX client, run tests via Xcode as usual.
For the server, use `make test` to run tests and `make test-update` to update test snapshots. Snapshot testing is used liberally, for better or for worse 🙈
Running the server with `make dev` results in more verbose logging; please copy and paste the error output into any bug reports (after redacting sensitive information, if any).
(Need to migrate these to GH issues)
- If the model does not have the commands to complete the request, it will tend to just repeat the question back to the user.
  - A general knowledge lookup command would be useful here; the existing search command is not sufficient as it does not provide a knowledge graph. Perhaps this command could have multiple layers: ask an LLM first, and if there is no answer fall back to some search API (see the first sketch after this list).
- The model gets confused when there are overlapping commands relevant to a request. For example, if you ask it to *write example code to create a reminder using swift*, it ends up calling the `createReminder()` command instead.
  - I'm exploring an approach to alleviate this by having a pre-processing stage where the command set is filtered down; this could also result in token savings (see the second sketch after this list).
- The model will sometimes output an un-parseable expression, leading to an error. A quick fix is usually to rephrase your request; in the long term the following approaches are being considered:
  - Progressively loosen the definition of the action language to accommodate common model errors.
  - Support configurable automatic retries.
  - Prompt engineering and fine-tuning.
- Apple APIs for audio recording and transcription are hard :( Push-to-talk is very flaky with AirPods on.
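For the general knowledge lookup idea above, here is a minimal sketch of the layered approach. `askLLM` and `searchAPI` are hypothetical stand-ins, not commands that exist in this repo:

```typescript
// Layered lookup sketch: try the LLM first, fall back to a search API.
// Both helpers are hypothetical placeholders.
declare function askLLM(prompt: string): Promise<string>;
declare function searchAPI(query: string): Promise<string>;

async function knowledgeLookup(question: string): Promise<string> {
  const answer = await askLLM(
    `Answer concisely, or reply exactly "UNKNOWN" if you are not sure:\n${question}`,
  );
  return answer.trim() === "UNKNOWN" ? searchAPI(question) : answer;
}
```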
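And a sketch of the command pre-filtering idea, reusing the `triggerTokens` field shown in the Swift example earlier (the surrounding types are assumptions):

```typescript
// Keep only commands whose trigger tokens appear in the request, so the
// command list shown to the model stays small. Commands without trigger
// tokens are always kept.
function filterCommands<T extends { triggerTokens?: string[] }>(
  request: string,
  commands: Record<string, T>,
): Record<string, T> {
  const words = new Set(request.toLowerCase().split(/\W+/));
  return Object.fromEntries(
    Object.entries(commands).filter(
      ([, command]) =>
        command.triggerTokens == null ||
        command.triggerTokens.some((token) => words.has(token.toLowerCase())),
    ),
  );
}
```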