diff --git a/service/README.md b/service/README.md
index d589f76..4943b05 100644
--- a/service/README.md
+++ b/service/README.md
@@ -7,7 +7,7 @@ Implementations of the URL Frontier Service. There are currently 2 implementatio
 - a simple memory-based which was used primarily for testing
 - the default one which is scalable, persistent and is based on [RocksDB](https://rocksdb.org/)
 
-Web crawlers can connect to it using the gRPC code generated from the API. There is also a simple client available
+Web crawlers can connect to it using the gRPC code generated from the API. There is also a simple client available
 which can do basic interactions with a Frontier.
 
 ## Compilation
@@ -20,17 +20,17 @@ To build and run the service from source, compile with `mvn clean package`
 
 You can specify the implementation to use for the service and its configuration by passing a configuration file with '-c'.
 
-The configuration file below will set RocksDBService as the implementation to use and configure the path where its data should be stored.
+The configuration file below will set RocksDBService as the implementation to use and configure the path where its data should be stored.
 
 ```
 implementation = crawlercommons.urlfrontier.service.rocksdb.RocksDBService
 rocksdb.path = /pathToCrawlDir/rocksdb
 ```
 
-The key values from the configuration file can also be passed on the command line. Since the RocksDBService is the default implementation,
+The key values from the configuration file can also be passed on the command line. Since the RocksDBService is the default implementation,
 the call above can have the following equivalent without the config file:
 
-`java -Xmx2G -cp target/urlfrontier-service-*.jar crawlercommons.urlfrontier.service.URLFrontierServer rocksdb.path=/pathToCrawlDir/rocksdb`
+`java -Xmx2G -cp target/urlfrontier-service-*.jar crawlercommons.urlfrontier.service.URLFrontierServer rocksdb.path=/pathToCrawlDir/rocksdb`
 
 If no path is set explicitly for RocksDB, the default value _./rocksdb_ will be used.
 
@@ -39,14 +39,14 @@ on which it is running so that it can report its location with the heartbeat.
 
 ## Logging configuration
 
-The logging is done with Logback. A default configuration is loaded and will dump logs on the console at INFO level and above but the configuration
-file can be overriden with
+The logging is done with Logback. A default configuration is loaded and will dump logs on the console at INFO level and above but the configuration
+file can be overriden with
 
 `java -Xmx2G -Dlogback.configurationFile=test.xml ...`
 
 Alternatively, the Frontier service has a _SetLogLevel_ endpoint and the CLI allows to set the level for a given package from the console.
 
-## Metrics with Prometheus
+## Metrics with Prometheus
 
 The service implementation takes a parameter *-s*, the value of which is used as port number to expose metrics for [Prometheus](https://prometheus.io/).
 A [dashboard](https://github.com/crawler-commons/url-frontier/blob/2.x/service/monitoring/provisioning/dashboards/URLFrontier-Prometheus.json) for Grafana is provided.
@@ -62,4 +62,12 @@ The easiest way to run the Frontier is to use Docker
 
 The service will run on the default port (7071). Additional parameters can simply be added to the command, for instance, to persist RocksDB between runs
 
-`docker run --rm --name frontier -v /pathOnDisk:/crawldir -p 7071:7071 crawlercommons/url-frontier rocksdb.path=/crawldir/rocksdb`
+```
+docker run --rm --name frontier -v /pathOnDisk:/crawldir -p 7071:7071 crawlercommons/url-frontier rocksdb.path=/crawldir/rocksdb
+```
+
+Specify a config file with a volume and the `-c` flag:
+
+```
+docker run --rm --name frontier -p 7071:7071 -p 9100:9100 -v /path/to/config.ini:/config/config.ini ufrontier -s 9100 -c /config/config.ini
+```
\ No newline at end of file
diff --git a/service/config.ini b/service/config.ini
index 0e402a0..dcf0063 100644
--- a/service/config.ini
+++ b/service/config.ini
@@ -21,5 +21,8 @@ rocksdb.max_background_jobs = 4
 rocksdb.max_subcompactions = 2
 rocksdb.max_bytes_for_level_base = 536870912
 
+# Set to true to enable gRPC server reflection.
+server.enable_reflection = false
+
 
 
diff --git a/service/src/main/java/crawlercommons/urlfrontier/service/URLFrontierServer.java b/service/src/main/java/crawlercommons/urlfrontier/service/URLFrontierServer.java
index 5634fd2..1136e36 100644
--- a/service/src/main/java/crawlercommons/urlfrontier/service/URLFrontierServer.java
+++ b/service/src/main/java/crawlercommons/urlfrontier/service/URLFrontierServer.java
@@ -4,9 +4,9 @@
 package crawlercommons.urlfrontier.service;
 
 import crawlercommons.urlfrontier.service.rocksdb.RocksDBService;
-import io.grpc.protobuf.services.ProtoReflectionService;
 import io.grpc.Server;
 import io.grpc.ServerBuilder;
+import io.grpc.protobuf.services.ProtoReflectionService;
 import io.prometheus.client.exporter.HTTPServer;
 import io.prometheus.client.hotspot.DefaultExports;
 import java.io.Closeable;
@@ -171,11 +171,18 @@ public void start() throws Exception {
             }
         }
 
-        this.server = ServerBuilder.forPort(port)
-                .addService(service)
-                .addService(ProtoReflectionService.newInstance())
-                .build();
-        server.start();
+        Boolean enableReflection =
+                Boolean.parseBoolean(
+                        configuration.getOrDefault("server.enable_reflection", "false"));
+
+        ServerBuilder builder = ServerBuilder.forPort(port).addService(service);
+
+        if (enableReflection) {
+            builder.addService(ProtoReflectionService.newInstance());
+        }
+
+        this.server = builder.build();
+        this.server.start();
         LOG.info(
                 "Started URLFrontierServer [{}] on port {} as {}",
                 service.getClass().getSimpleName(),