Component for loading osu! scores into Elasticsearch.
- .NET 6
- Elasticsearch 7
- Redis 6
If using Elasticsearch 8, a minimum version of Elasticsearch 8.2 is required.
The following environment variable must be set on the indexer:
ELASTIC_CLIENT_APIVERSIONING=true
and the following must be set in the Elasticsearch server configuration (elasticsearch.yml):
xpack.security.enabled: false
or docker environment, e.g. in docker compose:
environment:
xpack.security.enabled: false
Together, these settings allow plain HTTP connections to Elasticsearch by disabling the HTTPS and authentication requirements, and make the server return a response compatible with the client.
A string value is used to indicate the current schema version to be used.
When the queue processor is running, it stores the version it is processing in a Redis set at osu-queue:score-index:${prefix}active-schemas.
If a queue processor stops automatically due to a schema version change, it removes the version it was processing from the set of versions. The version is not removed if the processor is stopped manually or fails; this allows other services to continue pushing to those queues.
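The bookkeeping described above can be modelled with a plain Python set. This is a minimal sketch, not the indexer's actual code: `ActiveSchemas` and `StopReason` are hypothetical names, and an in-memory set stands in for the Redis set.

```python
from enum import Enum


class StopReason(Enum):
    SCHEMA_CHANGED = "schema_changed"  # stopped automatically by a schema switch
    MANUAL = "manual"                  # stopped by an operator
    FAILURE = "failure"                # processor crashed


def active_schemas_key(prefix=""):
    # Key layout as given in the text above.
    return f"osu-queue:score-index:{prefix}active-schemas"


class ActiveSchemas:
    """In-memory stand-in for the Redis set (hypothetical helper)."""

    def __init__(self):
        self.versions = set()

    def processor_started(self, version):
        self.versions.add(version)

    def processor_stopped(self, version, reason):
        # Only an automatic stop caused by a schema change removes the version;
        # manual stops and failures leave it in place so that other services
        # can keep pushing to that queue.
        if reason is StopReason.SCHEMA_CHANGED:
            self.versions.discard(version)
```

After a manual stop the version is still listed, so it has to be removed by hand if it is no longer wanted.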
Scores with preserve=true belonging to a user with user_warnings=0 will be added to the index; scores for which either condition is false will be removed from the index.
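The rule above amounts to a single predicate. The sketch below is illustrative only: the `Score` class and the dictionary standing in for the index are hypothetical, and only the two conditions come from the text.

```python
from dataclasses import dataclass


@dataclass
class Score:
    id: int
    preserve: bool
    user_warnings: int  # value of user_warnings for the owning user


def should_index(score):
    # Both conditions from the text must hold.
    return score.preserve and score.user_warnings == 0


def apply_score(index, score):
    """Add the score to the index, or remove it if it no longer qualifies."""
    if should_index(score):
        index[score.id] = score
    else:
        index.pop(score.id, None)
```

Processing the same score twice is harmless here: it ends up either present or absent according to its latest state.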
Push items to osu-queue:score-index-${schema}
Run dotnet run schema set ${schema} or set osu-queue:score-index:${prefix}schema directly in Redis.
If there is an already running indexer watching the queue for the new schema version, it will automatically update the alias to point to the new index. When the alias is updated, any index previously used by the alias will be closed.
The alias will not be updated if:
- the schema value does not change
- the indexer processing the queue for that version was not running before the change.
When the schema version changes, all indexers processing the queues for any other version will automatically stop.
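The switching rules above can be summarised as a small decision function. This is a sketch of the stated behaviour under the assumption that the set of running processors is known up front; `switch_schema` is a hypothetical name, not part of the indexer.

```python
def switch_schema(current, new, running):
    """Return (alias_updated, versions_that_stop) for a schema change.

    current: schema version before the change
    new:     schema version being set
    running: set of versions that have a processor watching their queue
    """
    changed = new != current
    # The alias only moves if the value changed and a processor for the new
    # version was already running before the change.
    alias_updated = changed and new in running
    # On a change, processors for every other version stop automatically.
    stopped = {v for v in running if v != new} if changed else set()
    return alias_updated, stopped
```

For example, `switch_schema("1", "2", {"1", "2"})` returns `(True, {"1"})`, while switching to a version with no running processor stops the others without moving the alias.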
Configuration is loaded from environment variables. No environment files are automatically loaded.
To read environment variables from an env file, prefix the command with env $(cat {envfile}), replacing {envfile} with your env file. Note that this method of passing environment variables does not support values containing spaces. For example:
env $(cat .env) dotnet run
Additional envs can be set:
env $(cat .env) SCHEMA=1 dotnet run
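If values containing spaces are needed, one alternative is a small wrapper that parses the file itself. The following is a sketch and not part of the project; it assumes a file of plain KEY=VALUE lines with optional # comments and no quoting, and `parse_env_file` and `run_with_env` are hypothetical helpers.

```python
import os
import subprocess


def parse_env_file(text):
    """Parse KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" and spaces.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def run_with_env(env_text, command):
    # Values from the file override the inherited environment.
    env = {**os.environ, **parse_env_file(env_text)}
    return subprocess.run(command, env=env).returncode
```

A call such as `run_with_env(open(".env").read(), ["dotnet", "run"])` would then start the indexer with the parsed variables.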
Maximum number of items to handle/dequeue per batch. This affects the size of the _bulk request sent to Elasticsearch.
Defaults to 10000.
Multiplier for the number of items allowed in flight during queue processing: at most BATCH_SIZE * BUFFER_SIZE items.
Defaults to 5 (50000 items with the default batch size).
Host for MySQL.
Defaults to localhost.
Database name.
Defaults to osu.
Database username.
Defaults to root.
Database password.
Host to submit DataDog/StatsD metrics to.
Defaults to localhost.
Enables DataDog origin detection when running in a container. See DataDog documentation.
Optional prefix for the index names in elasticsearch.
Url to the Elasticsearch host.
Defaults to http://localhost:9200
Redis connection string; see here for configuration options.
Defaults to localhost
Schema version for the queue; see Schema.
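To illustrate how the defaults combine, here is a minimal sketch of reading a few of these settings from the environment. It is not the indexer's own loader. BATCH_SIZE and BUFFER_SIZE are named in the text; pairing ES_HOST, ES_INDEX_PREFIX and REDIS_HOST with the descriptions above is an assumption based on the Docker example in this document.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    batch_size: int
    buffer_size: int
    es_host: str
    es_index_prefix: str
    redis_host: str

    @property
    def max_inflight_items(self):
        # Upper bound on items in flight during queue processing.
        return self.batch_size * self.buffer_size


def load_settings(env: Mapping[str, str]) -> Settings:
    # Defaults are the ones documented above; variable names other than
    # BATCH_SIZE and BUFFER_SIZE are assumed, not confirmed.
    return Settings(
        batch_size=int(env.get("BATCH_SIZE", "10000")),
        buffer_size=int(env.get("BUFFER_SIZE", "5")),
        es_host=env.get("ES_HOST", "http://localhost:9200"),
        es_index_prefix=env.get("ES_INDEX_PREFIX", ""),
        redis_host=env.get("REDIS_HOST", "localhost"),
    )
```

Calling `load_settings(os.environ)` with nothing set gives `max_inflight_items` of 50000, matching the documented default.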
This documentation assumes dotnet run can be used; where dotnet run is not available, run the assembly directly, e.g. dotnet osu.ElasticIndexer.dll
Running queue will automatically create an index if an open index matching the requested schema does not exist.
If a matching open index exists, it will be reused.
SCHEMA=${schema} dotnet run queue watch
e.g.
SCHEMA=1 dotnet run queue watch
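The create-or-reuse behaviour can be sketched as a lookup over known indices. This is an illustration only: the `Index` class, the in-memory list and the caller-supplied `new_name` are hypothetical, and how the indexer actually names new indices is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Index:
    name: str
    schema: str
    is_open: bool


def find_or_create_index(indices, schema, new_name):
    """Reuse an open index for the schema, or create one if none exists."""
    for index in indices:
        # A closed index with the same schema does not count as a match.
        if index.schema == schema and index.is_open:
            return index
    created = Index(name=new_name, schema=schema, is_open=True)
    indices.append(created)
    return created
```

Note that a closed index for the same schema is ignored, so a new index is created alongside it.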
dotnet run schema get
dotnet run schema set ${schema}
This is used to unset the schema version for testing purposes.
dotnet run schema clear
The index the alias points to can be changed manually:
dotnet run schema alias ${schema}
will update the index alias to point to the latest index tagged with schema ${schema}.
To list all indices and their corresponding states (schema, aliased, open or closed):
dotnet run index list
This will close all score indices except the active one, unloading them from Elasticsearch's memory pool.
dotnet run index close
A specific index can be closed by passing the index's name as an argument; e.g. the following will close index_1:
dotnet run index close index_1
This will delete all closed indices and free up the storage space used by those indices.
The command will only delete an index if it is in the closed state.
dotnet run index delete
Passing arguments to the command will delete the matching index:
dotnet run index delete index_1
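The close and delete rules can be modelled on a dictionary of index states. This is a sketch with hypothetical helper names; it assumes the closed-state check also applies when an index is named explicitly, and that closing a named index has no further guard.

```python
def close_indices(states, active, name=None):
    """states maps index name -> "open" or "closed"; modified in place."""
    # With no name, close everything except the active index.
    targets = [name] if name is not None else [n for n in states if n != active]
    for n in targets:
        if n in states:
            states[n] = "closed"


def delete_indices(states, name=None):
    """Delete closed indices (all, or only the named one); return their names."""
    candidates = [name] if name is not None else list(states)
    # Open indices are never deleted, even when named explicitly (assumption).
    deleted = [n for n in candidates if states.get(n) == "closed"]
    for n in deleted:
        del states[n]
    return deleted
```

Under these assumptions, an index has to be closed first before it can be deleted, which keeps the active index safe from an accidental delete.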
For testing purposes, we can add fake items to the queue:
SCHEMA=1 dotnet run queue pump-fake
Note that these fake items do not correspond to any scores in the database.
SCHEMA=${schema} dotnet run queue pump-score ${id}
will queue the score with ${id} for indexing; the score will be added or deleted as necessary, according to the value of SoloScore.ShouldIndex.
See Queuing items for processing from another client
SCHEMA=1 dotnet run queue pump-all
will read existing solo_scores in chunks and add them to the queue for indexing. Only scores with a corresponding phpbb_users entry will be queued.
Extra options:
- --from {id}: solo_scores.id to start reading from
- --switch: sets the schema version after the last item is queued; it does not wait for the item to be indexed. This option is provided as a convenience for testing.
dotnet run active-schemas list
will list the versions known to have queue processors listening on the queue.
For debugging purposes, or to perform manual maintenance or cleanup, the list of versions can be updated manually:
dotnet run active-schemas add ${schema}
dotnet run active-schemas remove ${schema}
Populating an index is done by pushing score items to a queue.
docker build -t ${tagname} -f osu.ElasticIndexer/Dockerfile osu.ElasticIndexer
docker run -e SCHEMA=1 -e "ES_HOST=http://host.docker.internal:9200" -e "ES_INDEX_PREFIX=docker." -e "REDIS_HOST=host.docker.internal" -e "DB_CONNECTION_STRING=Server=host.docker.internal;Database=osu;Uid=osuweb;SslMode=None;" ${tagname} ${cmd}
where ${cmd} is the command to run, e.g. dotnet osu.ElasticIndexer.dll queue
Push items into the Redis queue "osu-queue:score-index-${schema}"
e.g.
ListLeftPush("osu-queue:score-index-1", "{ \"ScoreId\": 1 }");
or from redis-cli:
LPUSH "osu-queue:score-index-1" "{\"ScoreId\":1}"
{ "ScoreId": 1 }
{
"Score": {Solo.Score}
}
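For clients written in other languages, building the queue key and payload might look like the sketch below. The helper names are hypothetical, the fields of a full score object are not specified here, and the redis-py call in the comment is an assumption about the client library, not something this project provides.

```python
import json


def queue_key(schema):
    # Queue name as documented above.
    return f"osu-queue:score-index-{schema}"


def score_id_item(score_id):
    """Payload that refers to a score by id."""
    return json.dumps({"ScoreId": score_id})


def score_item(score):
    """Payload that embeds a full score object (fields not specified here)."""
    return json.dumps({"Score": score})


# Pushing with redis-py (assumed to be installed and connected) could be:
#   import redis
#   redis.Redis(host="localhost").lpush(queue_key("1"), score_id_item(1))
```

Serialising with `json.dumps` avoids the manual quote escaping needed in the redis-cli example.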