Precise platform configuration: confidential data groups configuration¶
If you use the privacy API methods to manage confidential data, configure access to the confidential data in the privacy section of the node configuration file. The example below uses the PostgreSQL database:
privacy {
  replier {
    parallelism = 10
    stream-timeout = 1 minute
    stream-chunk-size = 1MiB
  }
  synchronizer {
    request-timeout = 2 minute
    init-retry-delay = 5 seconds
    inventory-stream-timeout = 15 seconds
    inventory-request-delay = 3 seconds
    inventory-timestamp-threshold = 10 minutes
    crawling-parallelism = 100
    max-attempt-count = 24
    lost-data-processing-delay = 10 minutes
    network-stream-buffer-size = 10
  }
  inventory-handler {
    max-buffer-time = 500ms
    max-buffer-size = 100
    max-cache-size = 100000
    expiration-time = 5m
    replier-parallelism = 10
  }
  cache {
    max-size = 100
    expire-after = 10m
  }
  storage {
    vendor = postgres
    schema = "public"
    migration-dir = "db/migration"
    profile = "slick.jdbc.PostgresProfile$"
    upload-chunk-size = 1MiB
    jdbc-config {
      url = "jdbc:postgresql://postgres:5432/node-1"
      driver = "org.postgresql.Driver"
      user = postgres
      password = wenterprise
      connectionPool = HikariCP
      connectionTimeout = 5000
      connectionTestQuery = "SELECT 1"
      queueSize = 10000
      numThreads = 20
    }
  }
  service {
    request-buffer-size = 10MiB
    meta-data-accumulation-timeout = 3s
  }
}
Choosing the database¶
Before changing the node configuration file, decide on the database that you plan to use to store confidential data. The Waves Enterprise blockchain platform supports interaction with a PostgreSQL database or with Amazon S3.
PostgreSQL¶
During the installation of a database running under PostgreSQL, you create an account to access the database. The username and password you set for this account must then be specified in the user and password fields of the storage block of the privacy section in the node configuration file (see the vendor = postgres section for details).
To use the PostgreSQL DBMS, you also need to install the JDBC (Java DataBase Connectivity) interface. When installing JDBC, set the profile name. This name must then be specified in the profile field of the storage block of the privacy section in the node configuration file (see the vendor = postgres section for details).
For optimization purposes, the connection to PostgreSQL can be made through the pgBouncer tool. In this case, pgBouncer requires special configuration, which is described in the pgBouncer section below.
Amazon S3¶
When using Amazon S3, the data must be stored on a Minio server. During the Minio server installation, you set a login and password to access the data. These must then be specified in the access-key-id and secret-access-key fields (see the vendor = s3 section for details).
After installing the DBMS appropriate for your project, adjust the storage block of the privacy section in the node configuration file as described below.
storage block¶
Specify the DBMS you are using in the vendor parameter of the storage block in the privacy section:
postgres – for PostgreSQL;
s3 – for Amazon S3.
Important
If you do not use the privacy API methods, specify none in the vendor parameter and comment out or delete the rest of the parameters in the privacy section.
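For example, a minimal privacy section for a node that does not use the privacy API methods could look like this (a sketch; all other privacy parameters are removed):

privacy {
  storage {
    vendor = none
  }
}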
vendor = postgres¶
When using the PostgreSQL DBMS, the storage block of the privacy section looks like this:
storage {
  vendor = postgres
  schema = "public"
  migration-dir = "db/migration"
  profile = "slick.jdbc.PostgresProfile$"
  upload-chunk-size = 1MiB
  jdbc-config {
    url = "jdbc:postgresql://postgres:5432/node-1"
    driver = "org.postgresql.Driver"
    user = postgres
    password = wenterprise
    connectionPool = HikariCP
    connectionTimeout = 5000
    connectionTestQuery = "SELECT 1"
    queueSize = 10000
    numThreads = 20
  }
}
The block must contain the following parameters:
schema – the scheme of interaction between elements within the database. The public scheme is used by default; if your database uses another scheme, specify its name;
migration-dir – the directory for data migration;
profile – the profile name for JDBC access, set during the JDBC installation (see the PostgreSQL section);
upload-chunk-size – the size of a data fragment uploaded using the POST /privacy/sendLargeData REST API method or the SendLargeData gRPC API method;
url – the PostgreSQL database address (see the url field section for details);
driver – the name of the JDBC driver that allows Java applications to interact with the database;
user – the username to access the database; specify the login of the account you created for access to the database under PostgreSQL;
password – the password to access the database; specify the password of that account;
connectionPool – the connection pool name, HikariCP by default;
connectionTimeout – the time of connection inactivity before the connection is closed (in milliseconds);
connectionTestQuery – a test query to check the connection to the database; for PostgreSQL, sending SELECT 1 is recommended;
queueSize – the size of the query queue;
numThreads – the number of simultaneous connections to the database.
url field¶
In the url field, specify the address of the database you are using in the following format:
jdbc:postgresql://<POSTGRES_ADDRESS>:<POSTGRES_PORT>/<POSTGRES_DB>
where
POSTGRES_ADDRESS – the PostgreSQL host address;
POSTGRES_PORT – the PostgreSQL host port number;
POSTGRES_DB – the PostgreSQL database name.
You can specify the database address along with the account data using the user and password parameters:
privacy {
  storage {
    ...
    url = "jdbc:postgresql://yourpostgres.com:5432/privacy_node_0?user=user_privacy_node_0@company&password=7nZL7Jr41qOWUHz5qKdypA&sslmode=require"
    ...
  }
}
In this example, user_privacy_node_0@company is the username and 7nZL7Jr41qOWUHz5qKdypA is its password. You can also use the sslmode=require parameter to require SSL usage for authorization.
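Equivalently, the credentials can be kept in the dedicated user and password fields of jdbc-config instead of the URL query string (a sketch reusing the hypothetical address and account from the example above):

privacy {
  storage {
    ...
    jdbc-config {
      url = "jdbc:postgresql://yourpostgres.com:5432/privacy_node_0?sslmode=require"
      user = "user_privacy_node_0@company"
      password = "7nZL7Jr41qOWUHz5qKdypA"
      ...
    }
  }
}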
pgBouncer¶
To optimize work with the PostgreSQL database, you can use pgBouncer, a connection pooling tool for PostgreSQL. pgBouncer is configured in a separate configuration file, pgbouncer.ini. Because the pool_mode = transaction mode in the pgBouncer configuration does not support server-side prepared statements, we recommend setting pool_mode to session in the pgbouncer.ini settings file to prevent data loss. When using the session mode, set the server_reset_query parameter to DISCARD ALL.
[pgbouncer]
pool_mode = session
server_reset_query = DISCARD ALL
More information about how the session mode works with prepared statements can be found in the official pgBouncer documentation.
vendor = s3¶
When using Amazon S3 storage, the storage block of the privacy section looks like this:
storage {
  vendor = s3
  url = "http://localhost:9000/"
  bucket = "privacy"
  region = "aws-global"
  access-key-id = "minio"
  secret-access-key = "minio123"
  path-style-access-enabled = true
  connection-timeout = 30s
  connection-acquisition-timeout = 10s
  max-concurrency = 200
  read-timeout = 0s
  upload-chunk-size = 5MiB
}
The block must contain the following parameters:
url – the address of the Minio server that stores the data; by default, Minio uses port 9000;
bucket – the name of the S3 bucket used to store the data;
region – the name of the S3 region; the parameter value is aws-global;
access-key-id – the identifier of the data access key; specify the data access login that you set during the Minio server installation (see Amazon S3);
secret-access-key – the data access key in the S3 repository; specify the data access password that you set during the Minio server installation (see Amazon S3);
path-style-access-enabled = true – the path style of access to the S3 bucket; do not change this parameter;
connection-timeout – the period of inactivity before the connection is closed (in seconds);
connection-acquisition-timeout – the period of inactivity when establishing a connection (in seconds);
max-concurrency – the maximum number of concurrent accesses to the storage;
read-timeout – the period of inactivity when reading data (in seconds);
upload-chunk-size – the size of a data fragment uploaded using the POST /privacy/sendLargeData REST API method or the SendLargeData gRPC API method.
replier block¶
Use the replier block in the privacy section to specify confidential data streaming parameters:
replier {
  parallelism = 10
  stream-timeout = 1 minute
  stream-chunk-size = 1MiB
}
The block must contain the following parameters:
parallelism – the maximum number of parallel tasks for processing privacy data requests;
stream-timeout – the maximum time a read operation on the stream may take;
stream-chunk-size – the size of one partition when transferring data as a stream.
inventory-handler block¶
Use the inventory-handler block in the privacy section to specify the policy inventory data aggregation parameters:
inventory-handler {
  max-buffer-time = 500ms
  max-buffer-size = 100
  max-cache-size = 100000
  expiration-time = 5m
  replier-parallelism = 10
}
The block must contain the following parameters:
max-buffer-time – the maximum buffering time; when the specified time elapses, the node processes all inventories in a batch;
max-buffer-size – the maximum number of inventories in the buffer; when the limit is reached, the node processes all inventories in a batch;
max-cache-size – the maximum size of the inventory cache; using this cache, the node selects only new inventories;
expiration-time – the expiration time for cache items (inventories);
replier-parallelism – the maximum number of parallel tasks for processing inventory requests.
cache block¶
Use the cache block in the privacy section to specify the policy data response cache parameters:
cache {
  max-size = 100
  expire-after = 10m
}
Note
Large files (files uploaded using the POST /privacy/sendLargeData REST API method or the SendLargeData gRPC API method) are not cached.
The block must contain the following cache parameters:
max-size – the maximum number of elements;
expire-after – the time after which an element expires if it has not been accessed during that time.
synchronizer block¶
Use the synchronizer block in the privacy section to specify private data synchronization parameters:
synchronizer {
  request-timeout = 2 minute
  init-retry-delay = 5 seconds
  inventory-stream-timeout = 15 seconds
  inventory-request-delay = 3 seconds
  inventory-timestamp-threshold = 10 minutes
  crawling-parallelism = 100
  max-attempt-count = 24
  lost-data-processing-delay = 10 minutes
  network-stream-buffer-size = 10
}
The block must contain the following parameters:
request-timeout
– maximum response waiting time after a data request; the default value is2 minute
;init-retry-delay
– first delay after an unsuccessful attempt; with each attempt, the delay increases by 4/3; the default value is5 seconds
;inventory-stream-timeout
– the maximum time the node waits for a network message with the inventory information, i.e. confirmation from a particular node that it has certain data and can provide it for downloading. When this timeout expires, the node sends inventory-request to all the peers to see if they have the necessary data for downloading; the default value is15 seconds
;inventory-request-delay
– delay after requesting peers data inventory (inventory-request); the default value is –3 seconds
;inventory-timestamp-threshold
– time threshold for inventory broadcast; inventory broadcast is used for new transactions to speed up the privacy subsystem; the parameter is used to decide whether to send PrivacyInventory message when the data is synchronized (downloaded) successfully; the default value is 10 minutes`;crawling-parallelism
– the maximum parallel crawling tasks count; the default value is100
;max-attempt-count
– the number of attempts that the crawler will take before the data is marked as lost; the default value is24
;lost-data-processing-delay
– the delay between the attempts to process the lost items queue; the default value is10 minutes
;network-stream-buffer-size
– the maximum count of the data chunks in the buffer; when the limit is reached, back pressure is activated; the default value is10
.
inventory-timestamp-threshold field¶
A node sends a PrivacyInventory message to its peers after it has inserted data into its private storage by a certain data hash. A cache, limited by the number of objects and their time in the cache, is used to store the PrivacyInventory messages. Depending on the value of the inventory-timestamp-threshold parameter, the data insertion event handler decides whether a PrivacyInventory message should be sent when the data is inserted. The handler compares the timestamp of the transaction that corresponds to the given data hash with the current time on the node. If the difference exceeds the value of the inventory-timestamp-threshold parameter, the PrivacyInventory messages are not sent. For example, with the default threshold of 10 minutes, a node that inserts data for a transaction whose timestamp is two hours old (e.g. while catching up after downtime) does not broadcast a PrivacyInventory message for it. By adjusting the value of this parameter, you can avoid the situation where a node synchronizing its state with the network clogs the network with unnecessary PrivacyInventory messages.
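For instance, a node that regularly replays a long backlog of historical transactions could lower the threshold so that only recent transactions trigger a broadcast (a sketch; the 5-minute value is an illustrative assumption, not a recommendation):

privacy {
  synchronizer {
    inventory-timestamp-threshold = 5 minutes
  }
}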
service block¶
In the service block of the privacy section, specify the parameters of the SendLargeData gRPC API method and the POST /privacy/sendLargeData REST API method, which send confidential data as a stream:
service {
  request-buffer-size = 10MiB
  meta-data-accumulation-timeout = 3s
}
The block must contain the following parameters:
request-buffer-size – the maximum request buffer size; when the specified size is reached, back pressure is activated;
meta-data-accumulation-timeout – the maximum time of metadata entity accumulation when sending data via the POST /privacy/sendLargeData REST API method.