Releases: princeton-nlp/SWE-agent

SWE-agent EnIGMA (0.7.0)

25 Sep 14:45
dc18a74

SWE-agent is SOTA on offensive cybersecurity

SWE-agent EnIGMA (Enhanced Interactive Generative Model Agent) achieves state-of-the-art results on offensive cybersecurity challenges, with a 3.3x improvement over previous agents on the NYU CTF challenge dataset. The EnIGMA project introduces several new features that are available to all SWE-agent use cases, such as Interactive Agent Tools and a Summarizer to handle long outputs.

Major additions

Smaller additions

Fixes

  • Compatibility with SWE-bench 2.0 by @klieret in #671
  • ensure variables work in special command docstring by @forresty in #628
  • Important fix: Catch CostLimitExceeded in retry because of format/block by @klieret in #682
  • Fix: Handle empty traj in should_skip by @klieret in #616
  • Fix for end-marker communicate: Exit status always 0/invalid by @klieret in #644
  • Fix: Insufficient quoting of git commit message by @klieret in #646
  • Fix nonsensical trajectory formatting for PRs by @klieret in #647
  • Fix: sweunexpected keyword 'python_version' by @klieret in #692
  • Fix: Use LONG_TIMEOUT for pre_install commands by @klieret in #695
  • Fix: UnboundLocalError when catching decoding issue by @klieret in #709
  • Also create empty patch files for completeness by @klieret in #725
  • Fix: Raise ContextWindowExceeded instead of exit_cost by @klieret in #727
  • Fix: Deal with non-utf8 encoded bytes in comm by @klieret in #731
  • Fix: Handle spaces in repo names by @klieret in #734
  • Fix: Ensure utils is part of package by @klieret in #742
  • Fix: Submitting ' ' in human mode crashes container by @klieret in #749
  • Fix: Block su as command by @klieret in #752
  • Fix: SWE_AGENT_MODEL_MAX_RETRIES needs casting by @klieret in #757
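
Several of the fixes above correspond to common pitfalls when a Python program drives a shell. The sketch below is illustrative only and is not taken from the SWE-agent codebase: the helper names (build_commit_command, decode_output, max_retries_from_env) and the default of 3 retries are assumptions. It shows quoting a commit message with shlex.quote (cf. #646), decoding bytes that may not be valid UTF-8 (cf. #731), and casting an environment variable to an integer (cf. #757).

```python
import os
import shlex


def build_commit_command(message: str) -> str:
    """Return a shell command that commits with an arbitrary message."""
    # shlex.quote escapes quotes, spaces and shell metacharacters.
    return f"git commit -m {shlex.quote(message)}"


def decode_output(raw: bytes) -> str:
    """Decode process output without raising on invalid UTF-8."""
    # Invalid bytes are kept as visible escape sequences instead of raising.
    return raw.decode("utf-8", errors="backslashreplace")


def max_retries_from_env(default: int = 3) -> int:
    """Read SWE_AGENT_MODEL_MAX_RETRIES as an int.

    Environment variables are always strings, so the value must be cast
    before it is used in numeric comparisons. The default is hypothetical.
    """
    return int(os.environ.get("SWE_AGENT_MODEL_MAX_RETRIES", default))
```

Whether SWE-agent uses these exact calls internally is not stated in the release notes; the point is only that each fix guards against input that a naive string-based implementation would mishandle.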

New Contributors

🎉 @talorabr, @udiboy1209, @haoranxi, @NickNameInvalid, @rollingcoconut joined the team to build EnIGMA 🎉

v0.6.1

20 Jun 15:21

This is (mostly) a patch release that fixes several issues introduced by the speed improvements of v0.6.0.
We also fix a bug where existing linter errors in a file left SWE-agent unable to edit it (because of our lint-retry loop).

Breaking changes

  • Change: sparse clone method is now correctly called "shallow" by @klieret in #591
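
For context, a shallow clone truncates the commit history, which is a different thing from a sparse checkout (which limits the paths that are checked out). A minimal sketch of what "shallow" means in git terms follows; the helper name clone_command and its method argument are hypothetical and do not reflect SWE-agent's actual clone code or option names.

```python
from typing import List


def clone_command(url: str, dest: str, method: str = "full") -> List[str]:
    """Build a git clone argument list for the given (hypothetical) method."""
    if method == "full":
        return ["git", "clone", url, dest]
    if method == "shallow":
        # --depth 1 fetches only the most recent commit, not the full history.
        return ["git", "clone", "--depth", "1", url, dest]
    raise ValueError(f"unknown clone method: {method!r}")
```

A shallow clone cannot check out commits that fall outside the fetched history, which is consistent with the "Revert to full clone method when needed" fix (#589) in this release.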

Improved

  • Enh: Show commands when encountering timeout error by @klieret in #582
  • Enh: Configuration option to show time in log by @klieret in #583
  • Enh: Allow to configure LONG_TIMEOUT for SWEEnv by @klieret in #584
  • Enh: Always write log to traj directory by @klieret in #588

Fixed

  • fix docker.errors.NotFound by @klieret in #587
  • Fix: Revert to full clone method when needed by @klieret in #589
  • Fix: Refresh container_obj before querying status by @klieret in #590
  • Fixed #571 - show message that model arg is ignored in case of using Azure OpenAI by @jank in #592
  • Fix: Linting blocks for existing lint errors by @klieret in #593
  • Fix: Process done marker not found in read with timeout by @klieret in #596

v0.6.0

05 Jun 13:16
14a5189

What's Changed

We sped up SWE-agent by 2x (timed with GPT4o). This is mostly due to faster communication with the running processes inside of the Docker container and other container setup & installation related improvements. Here are a few relevant PRs:

  • Switch to fast communicate and shallow clone by default by @klieret in #530
  • Change: Only wait 1s for docker to start by @klieret in #541
  • Feat: experimental shallow cloning by @klieret in #498
  • Enh: Start from clone of python conda environment for speedup by @klieret in #548
  • Enh: Use uv for editable install by default by @klieret in #547

Fixed

  • Web UI: Remove -n option to wait by @klieret in #487
  • Web UI: Kill the Flask server on exit. by @kwight in #479
  • Web UI: Avoid proxy errors on MacOS by @klieret in #506
  • Ensure container_name is reset for non-persistent containers by @klieret in #463
  • Fix: Do not allow persistent container with cache task imgs by @klieret in #551

Improved

  • Improve scrolling behavior in web UI by @anishfish2 in #420
  • Web UI: Render Markdown in agent feed messages. by @kwight in #486
  • Enh: Remove redundant 'saved traj to X' messages by @klieret in #528
  • Allow to disable config dump to log by @klieret in #537
  • Resolve relative paths to demonstrations and commands by @klieret in #444

New Contributors

Full Changelog: v0.5.0...v0.6.0

v0.5.0

28 May 17:14
c8e8ba6

What's Changed

✨ The big news is our brand new documentation

Secondly, @ollmer added a new flag --cache_task_images that will significantly speed up SWE-agent when running on the same environment/repository multiple times (no more waiting for cloning and installation!)

Breaking changes

  • We have reformatted our codebase. If you create a PR based on a previous commit, make sure you install our pre-commit hook to avoid merge conflicts caused by formatting. See our docs for more information.
  • Remove direct imports in __init__.py (you can no longer write from sweagent import Agent) by @klieret in #436

Added

  • Running the web UI is now supported when running swe-agent completely in docker
  • Speed up evaluation by caching task environments as docker images by @ollmer in #317

Improved

  • Add gpt-4o model by @raymyers in #344
  • Web: Allow to specify commit hash by @klieret in #358
  • Add default environment_setup config by @klieret in #351
  • Enh: Suppress openai logging; improve formatting of stats by @klieret in #416
  • Remove signal dependency by @klieret in #428
  • Do not use select if running on Windows by @klieret in #429
  • Use custom Config class to support env and keys.cfg (this allows passing keys as environment variables) by @klieret in #430
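
The last item means a key can come either from the process environment or from keys.cfg. A minimal sketch of such a lookup follows, assuming that keys.cfg holds simple KEY=value lines and that the environment takes precedence. Both are assumptions, and the class name KeyConfig and its methods are hypothetical; this does not mirror SWE-agent's actual Config class.

```python
import os
from pathlib import Path
from typing import Dict, Optional


class KeyConfig:
    """Look up settings in the environment first, then in a keys file."""

    def __init__(self, keys_file: Optional[Path] = None) -> None:
        self._file_values: Dict[str, str] = {}
        if keys_file is not None and keys_file.is_file():
            for line in keys_file.read_text().splitlines():
                line = line.strip()
                # Skip blank lines, comments and lines without a separator.
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                # Drop surrounding whitespace and optional quotes.
                self._file_values[key.strip()] = value.strip().strip("'\"")

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        """Return the environment value if set, else the file value, else default."""
        if key in os.environ:
            return os.environ[key]
        return self._file_values.get(key, default)
```

With this ordering, a key exported in the shell overrides the file entry, which is convenient in containers and CI where writing a keys file is awkward.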

Fixes

  • Web: Fix script_path input by @klieret in #334
  • Fix: Don't print patch msg for exit_cost patch by @klieret in #343
  • Fix: Do not request job control in bash by @klieret in #345
  • Fix: --base_commit not used for gh urls by @klieret in #346
  • Fix: Separate data path/traj dir cause exception by @klieret in #348
  • Add docker-py lower bound by @klieret in #406
  • Fix: IndexError when replaying incomplete trajectories by @klieret in #410

New Contributors

Full Changelog: v0.4.0...v0.5.0

0.4.0 Web UI

09 May 14:58
1e065f8

What's Changed

We’re excited to launch the SWE-agent web UI! Specify a bug, press start and watch SWE-agent do the magic ✨

New Contributors

Full Changelog: v0.3.0...v0.4.0

0.3.0

02 May 15:47
43b8de5

What's Changed

✨ Features

  • Run SWE-agent in the cloud using GitHub Codespaces
  • Add GPT4-turbo model by @zgrannan in #252
  • feat: Amazon Bedrock support (Claude models) by @JGalego in #207

🐛 Fixes

❤️ New Contributors

Full Changelog: v0.2.0...v0.3.0

v0.2.0

15 Apr 19:01
58aa046

What's Changed

Added

  • Allow to run on local repos (new flag: --repo_path) by @klieret in #193
  • Patch files are now saved separately to a patch directory by @klieret in #126
  • Allow to supply custom installation commands when running on gh issues or locally (--environment_setup) by @klieret in #153
  • Allow to specify openapi base url in keys.cfg by @bvandorf in #118

Improved

Fixed

New Contributors

Full Changelog: v0.1.2...v0.2.0