PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World

Demo

Check out our demo of PC Agent autonomously controlling a computer to complete complex tasks involving dozens of steps!

[Demo video: Attention-.mp4]

Introduction

PC Agent introduces a novel framework to empower autonomous digital agents through human cognition transfer. This transfer is implemented through three key components:

PC Tracker, the first lightweight infrastructure for large-scale human-computer interaction data collection;
A Cognition Completion post-processing pipeline that transforms raw interaction data into cognitive trajectories;
A multi-agent system combining a planning agent for decision-making with a grounding agent for robust visual grounding.
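
The division of labor between the two agents can be pictured with a toy loop. This is only an illustrative sketch and not the implementation in agent/: the names `Action`, `run_episode`, `stub_planner`, and `stub_grounder`, along with the stubbed return values, are hypothetical and stand in for real models.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Action:
    """A high-level decision from the planner (hypothetical structure)."""
    kind: str          # e.g. "click" or "done"
    target: str = ""   # natural-language description of the UI element


def run_episode(
    plan_next_action: Callable[[str, List[str]], Action],
    ground_target: Callable[[str], Tuple[int, int]],
    task: str,
    max_steps: int = 10,
) -> List[str]:
    """Alternate planning and grounding until the planner reports it is done."""
    history: List[str] = []
    for _ in range(max_steps):
        action = plan_next_action(task, history)  # decision-making
        if action.kind == "done":
            break
        x, y = ground_target(action.target)       # description -> screen coordinates
        history.append(f"{action.kind} '{action.target}' at ({x}, {y})")
    return history


# Stub agents used only for illustration (assumptions, not real models).
def stub_planner(task: str, history: List[str]) -> Action:
    return Action("click", "Save button") if not history else Action("done")


def stub_grounder(description: str) -> Tuple[int, int]:
    return (100, 200)
```

Keeping the planner unaware of pixel coordinates means either agent could, in principle, be swapped for a different model without touching the other.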

Quick Start

Setup

To get started with PC Agent, we recommend setting up your Python environment using conda:

# Clone the repository and navigate to the folder
git clone https://github.com/GAIR-NLP/PC-Agent.git
cd PC-Agent
# Create and activate conda environment
conda env create -f environment.yml
conda activate pcagent
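
Before running any of the scripts, it can help to confirm that the `pcagent` environment is actually active. A minimal sketch, relying on the `CONDA_DEFAULT_ENV` variable that conda sets on activation; the helper names `active_conda_env` and `is_pcagent_active` are hypothetical and not part of the repository.

```python
import os
from typing import Mapping, Optional

EXPECTED_ENV = "pcagent"  # environment name from the activation command above


def active_conda_env(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the active conda environment name, or None if none is active."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def is_pcagent_active(environ: Optional[Mapping[str, str]] = None) -> bool:
    """True when the currently active conda environment is the expected one."""
    return active_conda_env(environ) == EXPECTED_ENV
```

Calling `is_pcagent_active()` with no arguments inspects the current process environment.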

PC Tracker

PC Tracker is an infrastructure for human-computer interaction data collection. The source code in the tracker/ directory can be modified to fit your specific data collection requirements.

To deploy:

Build the executable (Windows):

cd tracker
.\package.ps1

Customize tasks.json according to your annotation needs
Distribute to annotators
Collect the annotated data from annotators. It is saved in the hidden events/ folder under the working directory.
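
Because the events/ folder is hidden, it is easy to overlook when gathering results. The sketch below lists what a given working directory contains under events/. It assumes nothing about the internal layout of that folder and only reports top-level entry names; `list_event_entries` is a hypothetical helper, not part of PC Tracker.

```python
from pathlib import Path
from typing import List


def list_event_entries(working_dir: str) -> List[str]:
    """Return the sorted names of everything directly inside <working_dir>/events."""
    events_dir = Path(working_dir) / "events"
    if not events_dir.is_dir():
        return []  # nothing has been recorded in this working directory
    return sorted(entry.name for entry in events_dir.iterdir())
```

An empty result for an annotator's working directory suggests that no data was recorded there, or that the wrong directory was collected.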

For user instructions, please refer to our PC Tracker User Manual.

Post Processing

To convert raw interaction data into cognitive trajectories, follow these steps:

Place your data in the postprocess/data/ directory. Example data is available in this directory for reference.
Run the post-processing pipeline:

python postprocess/refinement.py    # Data refinement
python postprocess/completion.py    # Cognition completion

Note: You need to prepare your OpenAI API key in advance to perform cognition completion.
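
The two stages can also be chained from a small wrapper so that completion never starts if refinement fails or the key is missing. This is a sketch under assumptions: it supposes the key is supplied through the `OPENAI_API_KEY` environment variable, which the repository may or may not use, and the helper names `build_commands`, `check_api_key`, and `run_pipeline` are hypothetical.

```python
import os
import subprocess
import sys
from typing import List, Mapping

# Script order taken from the commands above: refinement first, then completion.
PIPELINE = ["postprocess/refinement.py", "postprocess/completion.py"]


def build_commands(python: str = sys.executable) -> List[List[str]]:
    """Build one interpreter command per pipeline stage, in execution order."""
    return [[python, script] for script in PIPELINE]


def check_api_key(environ: Mapping[str, str]) -> None:
    """Fail early if no key is present (assumes the OPENAI_API_KEY variable)."""
    if not environ.get("OPENAI_API_KEY"):
        raise RuntimeError("Set OPENAI_API_KEY before running cognition completion.")


def run_pipeline() -> None:
    """Run both stages from the repository root, stopping at the first failure."""
    check_api_key(os.environ)
    for command in build_commands():
        subprocess.run(command, check=True)  # raises CalledProcessError on failure
```

Call `run_pipeline()` from the repository root so that the relative script paths resolve.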

Agent

We provide a reference implementation of our multi-agent system in the agent/ directory, combining planning and grounding agents. To run:

python agent/main.py

Reference scripts for model deployment can be found in the agent/server/ directory.

Citation

If you find this work helpful, please consider citing:

@article{he2024pcagent,
      title={PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World},
      author={Yanheng He and Jiahe Jin and Shijie Xia and Jiadi Su and Runze Fan and Haoyang Zou and Xiangkun Hu and Pengfei Liu},
      year={2024},
      journal={arXiv preprint arXiv:2412.17589},
      url={https://arxiv.org/abs/2412.17589}
}

