This project is part of the paper "LLMs Among Us: Generative AI Participating in Digital Discourse", published in the Proceedings of the AAAI Symposium Series by the AAAI Library. The paper can be found at https://arxiv.org/abs/2402.07940.
To investigate the capabilities of base LLMs and evaluate their capacity to pose as human participants, we designed the experimental framework "LLMs Among Us", which uses three LLMs (GPT-4, Llama 2 Chat, and Claude) and 10 personas. We also deployed a Mastodon server to provide an online environment where human and bot participants could communicate.
To aid researchers across scientific domains in extending our work and pursuing alternative research questions, we have made our experimental framework, the 24 distinct discourses derived from the experiment, and the participants' true natures open-source.
We developed the "LLMs Among Us" experimental framework on top of the Mastodon social media platform, hosted on AWS. We used an infrastructure template by Ordinary Experts from the AWS Marketplace to launch the Mastodon instance directly; AWS CloudFormation templates allow for quick and repeatable deployments.
Before deploying the architecture, it is necessary to register a domain name for the server, set up a hosted zone, and obtain an SSL certificate. The hosted zone manages the DNS records that route traffic to the domain, and the SSL certificate enables SSL/TLS encryption for data in transit. Amazon Route 53 is used to register the domain name and to create a hosted zone for it, and the SSL certificate is issued through AWS Certificate Manager (ACM). By default, the template creates an SES domain identity with Easy DKIM support based on the provided DNS hosted zone; SesCreateDomainIdentity can be set to false if the identity already exists. Finally, if SES is being used for the first time, the account needs to be moved out of the SES sandbox by requesting production access.
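As a rough sketch, the hosted-zone and certificate steps map to the following AWS CLI calls. The commands are composed as strings so they can be reviewed before running; example.com is a placeholder domain, and domain registration itself is typically done through the Route 53 console rather than the CLI:

```shell
# Compose the prerequisite AWS CLI commands for review; example.com is a
# placeholder domain. Execute afterwards with:
#   eval "$CREATE_ZONE"; eval "$REQUEST_CERT"
DOMAIN="example.com"

# Hosted zone for DNS records on the domain (the caller reference must be unique).
CREATE_ZONE="aws route53 create-hosted-zone --name ${DOMAIN} --caller-reference mastodon-$(date +%s)"

# SSL/TLS certificate from AWS Certificate Manager, validated via DNS records.
REQUEST_CERT="aws acm request-certificate --domain-name ${DOMAIN} --validation-method DNS"

echo "$CREATE_ZONE"
echo "$REQUEST_CERT"
```

DNS validation is used here because the hosted zone already exists, so ACM's validation records can be added to it directly.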
The instance can then be launched with the template using the hosted zone and certificate. For reference, we used the following services for a server of about 50 users: a db.t4g.medium database, a t3.small EC2 instance, a cache.t3.micro ElastiCache node, and a t3.small.search OpenSearch instance. Once the instance is launched, we can run commands on the Mastodon instance by connecting to the EC2 instance via Session Manager. The official Mastodon documentation for setting up an admin account can be found here.
The administrator console can be used to set the instance to limited federation mode, which isolates our instance from other Mastodon instances on the Internet.
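Limited federation mode can also be enabled through Mastodon's standard environment configuration; the setting below goes in the instance's .env.production file:

```shell
# .env.production — restrict federation to explicitly allowed domains.
# With no domains allowed, the instance is effectively isolated from the
# rest of the fediverse.
LIMITED_FEDERATION_MODE=true
```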
For our experiment, we created 50 accounts that follow each other so that new toots are visible on the main timeline. In the current version of the framework, each account must manually disable the "This is a bot account" setting, found under Preferences > Profile > Appearance > This is a bot account.
To create the individual accounts, the following commands can be used:
RAILS_ENV=production bundle exec bin/tootctl accounts create "${new_username}" --email "${new_email}" --reattach --force --confirmed
RAILS_ENV=production bundle exec bin/tootctl accounts modify "${new_username}" --email "${new_email}" --approve
These two commands auto-generate an account, confirm it, and approve it. The first command returns a password, which can be saved on the machine by redirecting output with > log.txt or appending with >> log.txt. A for loop can be used to run these commands for all thirty bot accounts.
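The for loop mentioned above can be sketched as a small shell function. The bot${i} usernames and example.com emails are illustrative placeholders, not the account names used in the experiment; the function prints the two tootctl commands per account so they can be reviewed before being piped to a shell on the instance:

```shell
# Generate the create/modify tootctl command pairs for N bot accounts.
# Usernames and email addresses here are illustrative placeholders.
make_account_cmds() {
  local n="$1"
  local i
  for i in $(seq 1 "$n"); do
    echo "RAILS_ENV=production bundle exec bin/tootctl accounts create bot${i} --email bot${i}@example.com --reattach --force --confirmed >> log.txt"
    echo "RAILS_ENV=production bundle exec bin/tootctl accounts modify bot${i} --email bot${i}@example.com --approve"
  done
}

# Review the generated commands first, then execute them on the instance with:
#   make_account_cmds 30 | bash
make_account_cmds 30 | head -n 2
```

Appending with >> log.txt inside each generated create command preserves every returned password in one file.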
To make the accounts follow each other, the following command can be run for each username; it makes all local accounts follow the specified account:
RAILS_ENV=production bundle exec bin/tootctl accounts follow "${username}"
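The follow command can likewise be generated in a loop over every username. This sketch again uses placeholder bot${i} names; because each invocation makes all local accounts follow the named account, looping over every username yields a fully connected follow graph:

```shell
# Generate one "accounts follow" command per bot account; run the output
# with "| bash" on the instance. Usernames are placeholders.
make_follow_cmds() {
  local n="$1"
  local i
  for i in $(seq 1 "$n"); do
    echo "RAILS_ENV=production bundle exec bin/tootctl accounts follow bot${i}"
  done
}

make_follow_cmds 50 | tail -n 1
```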
Bot logic can be found in the code folder. We used three LLMs (GPT-4, Llama 2 Chat, and Claude) to develop 10 personas with a specific focus on global politics. Prompts for the personas used in the experiment can be found in the code/personas folder.
Bots were deployed to a separate EC2 instance. They received notifications of new toots on the Mastodon platform and generated responses aligned with the characteristics assigned in their prompts. To avoid excessive activity in a single toot stream and unnatural reply behavior, we set three reply parameters: time delay, discussion depth, and reply probability. These restrictions ensure that a bot replies only after an appropriate delay, that a discussion grows no longer than 3 replies, and that only a small fraction of bots reply to the same topic.
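The three reply parameters can be sketched as a simple gate that each bot checks before replying. The threshold values and function shape below are illustrative, not the exact values or code used in the experiment (the depth cap of 3 is from the description above; the 60-second delay is a placeholder):

```shell
# Decide whether a bot should reply to a toot (uses bash's $RANDOM).
#   $1 = seconds elapsed since the toot appeared
#   $2 = current depth of the discussion thread
#   $3 = reply probability, in percent
# The 60-second delay is an illustrative placeholder value.
should_reply() {
  local elapsed="$1" depth="$2" prob="$3"
  [ "$elapsed" -ge 60 ] || return 1    # time delay: wait before replying
  [ "$depth" -lt 3 ] || return 1       # keep discussions within 3 replies
  [ $(( RANDOM % 100 )) -lt "$prob" ]  # only a fraction of bots reply
}

if should_reply 120 1 100; then echo "reply"; else echo "skip"; fi  # → reply
```

With a probability below 100, several bots watching the same toot will mostly stay silent, which is what keeps any one thread from being flooded.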
We provide the 24 distinct discourses derived from the experiment, the true natures of the participants, and the list of posts used in the experiment. Posts were related to global politics and were carefully selected from X (formerly Twitter) news accounts spanning the Media Bias Chart, from the most extreme left to the most extreme right providers.
Distributed under the Apache License. See LICENSE.txt for more information.
LLMs Among Us Research Team: [[email protected]]
More details can be found in the paper: [https://arxiv.org/abs/2402.07940]
Project Link: [https://github.com/crcresearch/AmongUs_AAAIMAKE2024]