Managing configuration of a distributed system with Apache ZooKeeper: Loading initial configuration

This post is the second in a series about using Apache ZooKeeper to build configuration management solutions for distributed systems. It focuses on implementing a tool that loads initial configuration data into a fresh ZooKeeper ensemble.

The previous publication, Managing configuration of a distributed system with Apache ZooKeeper, covers assumptions about the possible structure of configuration data in a distributed system, an assessment of a ZooKeeper ensemble as centralized configuration storage, and basic examples of managing the configuration data of a simple Scala-based HTTP service.

Using ZooKeeper command line utility

The most obvious way to import the initial configuration into a ZooKeeper ensemble is to use the bundled command line utility. The CLI shell scripts are located in the /bin subdirectory of a ZooKeeper distribution: zkCli.sh is intended for UNIX systems and zkCli.cmd for Windows systems.

The previous publication contains a complete list of the available shell commands, as well as instructions for importing initial configuration entries into a fresh ZooKeeper ensemble with the command line utility.

Although the ZooKeeper CLI shell supports file-system-like operations and is convenient for a first acquaintance with ZooKeeper or for the simplest setups, it quickly becomes impractical for more advanced use cases. For building complex distributed applications, ZooKeeper offers client bindings in two languages: C and Java. Detailed information on the client libraries is available in the "Bindings" section of the ZooKeeper Programmer's Guide.

Importing initial configuration data from a file

Consider a script that imports the initial configuration into a ZooKeeper ensemble from a file. Compared with the CLI shell, this removes the need to manually create a ZNode for each configuration item and to type the values into the console every time they have to be imported. All required data can simply be placed into a structured file (or a group of files) and then imported with a single command.

I prefer Groovy for implementing the import script: it is great for writing concise and maintainable code and, being a JVM-based language, gives access to the huge Java codebase through Grape, its embedded JAR dependency manager.

In general, the process of importing configuration data from the file can be divided into two steps:

  • obtaining configuration data from the file;
  • uploading configuration data into the ZooKeeper ensemble.

Let's go through them one-by-one.

Parsing the configuration from a file

I find the Typesafe Config library an excellent choice for reading and parsing configuration data from a file. It is feature-rich and well documented, which makes it a pleasure to use for application configuration. Currently, it supports the following formats: HOCON (*.conf), JSON (*.json) and Java properties (*.properties). This means you are free to store your initial configuration data in any of these three formats. The snippets below show how the same data looks in each of them:

HOCON
key1 = value1
prefix {
  key2 = value2
}
JSON
{
  "key1": "value1",
  "prefix": {
    "key2": "value2"
  }
}
Properties
key1=value1
prefix.key2=value2
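
To check that the three snippets above really describe the same data, the following minimal sketch parses each of them from a string with an explicitly selected syntax and compares the results. The `parse` closure and the variable names are illustrative only and are not part of the import script; the sketch assumes that Grape can download the Typesafe Config dependency.

```groovy
@Grab('com.typesafe:config:1.3.0')
import com.typesafe.config.ConfigFactory
import com.typesafe.config.ConfigParseOptions
import com.typesafe.config.ConfigSyntax

// Illustrative helper: parses the text with the given syntax
// and unwraps the result into plain nested maps.
def parse = { String text, ConfigSyntax syntax ->
    ConfigFactory.parseString(text, ConfigParseOptions.defaults().setSyntax(syntax))
            .root().unwrapped()
}

def hocon = parse('key1 = value1\nprefix {\n  key2 = value2\n}', ConfigSyntax.CONF)
def json = parse('{"key1": "value1", "prefix": {"key2": "value2"}}', ConfigSyntax.JSON)
def props = parse('key1=value1\nprefix.key2=value2', ConfigSyntax.PROPERTIES)

// All three formats should produce the same nested structure
println hocon
println "Same data in all formats: ${hocon == json && json == props}"
```

Note that the properties format only carries string values, so the comparison holds here because every value in the example is a string.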

The following Grape annotation adds the Typesafe Config library to the script's classpath:

@Grab('com.typesafe:config:1.3.0')

The initial configuration from the previous publication in the HOCON format might look like the following:

settings.conf
# Default settings to import

# Development environment settings
system.dev {

  # Common DB settings
  db {
    host = "jdbc:mysql://10.10.10.1:3306/"
    maxConnections = 10
  }

  # Example service settings
  example {
    host = "localhost"
    port = 8081
    db.name = "example"
    db.user = "dev"
    db.password = "password"
  }
}

# Test environment settings
system.test {

  # Common DB settings
  db {
    host = "jdbc:mysql://10.10.10.2:3306/"
    maxConnections = 100
  }

  # Example service settings
  example {
    host = "localhost"
    port = 8082
    db.name = "example"
    db.user = "test"
    db.password = "password"
  }
}

Finally, here is a code example for loading and parsing a configuration file:

// Tries to parse the configuration from the file
final Config config = ConfigFactory.parseFileAnySyntax(configFile)
// Flattens the configuration entries and sorts them by the key (asc)
final Map<String, ConfigValue> entries = config.entrySet().collectEntries().sort()
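
To make the result of the flattening step more tangible, here is a small self-contained sketch that applies the same calls to a trimmed-down fragment of the configuration above. It parses the fragment from a string instead of a file (an assumption made only to keep the example standalone) and prints the flattened, sorted entries.

```groovy
@Grab('com.typesafe:config:1.3.0')
import com.typesafe.config.ConfigFactory

// A shortened fragment of settings.conf, inlined for the sake of the example
def hocon = '''
system.dev {
  db {
    host = "jdbc:mysql://10.10.10.1:3306/"
    maxConnections = 10
  }
  example.port = 8081
}
'''

def config = ConfigFactory.parseString(hocon)

// Same flattening and sorting as in the import script:
// every leaf value ends up under its full dot-separated path
def entries = config.entrySet().collectEntries().sort()

entries.each { k, v -> println "${k} = ${v.unwrapped()}" }
```

Only leaf values appear in the result; intermediate objects such as system.dev.db are represented solely by the prefixes of their children's keys.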

Uploading the configuration to a server

The second step, uploading the initial configuration data into the ZooKeeper ensemble, relies on the Apache Curator Framework. It provides a clean and simple high-level API and automates connection management, which simplifies working with ZooKeeper.

In the example from the first publication, the Curator Framework was used to obtain specific configuration entries from the ZooKeeper server. Here, it is responsible for creating the ZNode structure and filling it with the initial configuration data.

Below is the client initialization code:

/**
 * The ZooKeeper client's connection retry policy.
 */
final BoundedExponentialBackoffRetry retryPolicy = 
        new BoundedExponentialBackoffRetry(MIN_INTERVAL_BETWEEN_RETRIES, 
            MAX_INTERVAL_BETWEEN_RETRIES, NUMBER_OF_CONNECTION_RETRIES)

/**
 * Configures ZooKeeper client.
 */
final CuratorFramework client = CuratorFrameworkFactory.builder()
        .connectString(connectString)
        .retryPolicy(retryPolicy)
        .namespace(namespace)
        .build()

The connect string parameter contains a comma-separated list of "host:port" pairs, one for each ZooKeeper server in the ensemble.
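
Since a malformed connect string only shows up later as a connection failure, it can be handy to validate it up front. The sketch below is a hypothetical helper, not part of the import script: it splits the string into host and port pairs and fails fast on malformed entries. It deliberately ignores the optional chroot suffix and IPv6 addresses that ZooKeeper connect strings may also contain.

```groovy
// Hypothetical helper (not part of the import script): splits a connect
// string into host/port pairs and rejects entries that are not "host:port".
List<Map> parseConnectString(String connectString) {
    connectString.split(',').collect { String pair ->
        def parts = pair.trim().split(':')
        if (parts.length != 2 || !parts[1].isInteger()) {
            throw new IllegalArgumentException("Malformed host:port pair: '${pair}'")
        }
        [host: parts[0], port: parts[1].toInteger()]
    }
}

def servers = parseConnectString('10.10.10.1:2181,10.10.10.2:2181,10.10.10.3:2181')
servers.each { println "${it.host} -> ${it.port}" }
```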

The retry policy parameter defines how the client recovers from connection errors. The current example uses a BoundedExponentialBackoffRetry policy, which means that the client retries up to the specified number of times, with the sleep time between retries increasing up to a maximum bound.
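
The general idea of a bounded exponential backoff can be illustrated with a few lines of plain Groovy. This is a simplified, deterministic sketch and not Curator's actual implementation, which also randomizes each delay; the `backoffDelays` method is a hypothetical name, and the arguments mirror the default values used in the script.

```groovy
// Generic bounded exponential backoff: the delay doubles on every attempt
// but never exceeds the maximum bound. Simplified illustration only;
// Curator's BoundedExponentialBackoffRetry additionally randomizes delays.
List<Long> backoffDelays(long minMs, long maxMs, int retries) {
    (0..<retries).collect { int attempt ->
        Math.min(maxMs, minMs * (1L << attempt))
    }
}

// Script defaults: 250 ms minimum, 25000 ms maximum, 5 retries
println backoffDelays(250, 25000, 5)

// With more retries the upper bound starts to apply
println backoffDelays(250, 25000, 8)
```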

The namespace parameter should be set if all ZNode paths are expected to be prefixed with a specific namespace.

Default values for all of these variables are set in the script code. In addition, the user can override the "connectString" and "namespace" variables via script parameters, as described in the "Running the script" section below.

It is easy to start a client:

client.start()

Also, remember to close the client when it is no longer needed, to avoid leaking system resources. In the example script, the call is placed in the finally block of the script's common error handling wrapper:

} finally {
    // Terminates the ZooKeeper client
    client.close()
}
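
As a possible alternative to an explicit finally block, Groovy's withCloseable method (available, as far as I know, since Groovy 2.4) closes any Closeable once the closure finishes, even when it throws. The sketch below uses a hypothetical FakeClient class as a stand-in for the Curator client, so that it can run without a ZooKeeper server.

```groovy
// Stand-in for CuratorFramework (hypothetical class, for illustration only),
// so the sketch runs without a ZooKeeper server.
class FakeClient implements Closeable {
    boolean closed = false

    void start() { println 'client started' }

    @Override
    void close() {
        closed = true
        println 'client closed'
    }
}

def client = new FakeClient()
try {
    // withCloseable closes the client after the closure, even on failure
    client.withCloseable { c ->
        c.start()
        throw new IllegalStateException('simulated import failure')
    }
} catch (IllegalStateException e) {
    println "import failed: ${e.message}"
}

println "closed = ${client.closed}"
```

The explicit try/finally used in the script achieves the same effect and also works on older Groovy versions.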

Once the configuration file has been loaded and parsed and the client has been initialized, the script starts creating the ZNode structure and uploading the data entries. As noted in the previous section, the configuration entries are sorted by path in ascending order. This way, ZNodes are created in a natural order, starting from the root path and moving deeper through the config structure, while creatingParentsIfNeeded() takes care of any missing intermediate nodes.

Note that the Config object separates path tokens with a dot, while ZooKeeper requires a slash, which is why the path tokens are additionally processed here.

entries.each { k, v ->
    // Builds the path to the ZooKeeper node
    final String path = "/" + k.replaceAll('\\.', '/')
    try {
        // Creates the appropriate ZNode and assigns configuration value to it
        client.create()
            .creatingParentsIfNeeded()
            .forPath(path, v.unwrapped().toString().getBytes())
        logger.info("Node '${path}' is created")
    } catch (KeeperException e) {
        logger.warn("Unable to create node for path '${path}': ${e.code()}")
    }
}
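
The key-to-path conversion can be tried out in isolation. In the sketch below, both helper methods are hypothetical and exist only for illustration: toZNodePath mirrors the replaceAll call from the script, and withNamespace shows where a node ends up when a Curator namespace is set (Curator applies this prefix transparently, so the script never does it by hand).

```groovy
// Hypothetical helper mirroring the replaceAll call in the import script
String toZNodePath(String key) {
    '/' + key.replaceAll('\\.', '/')
}

// Illustration only: Curator prepends the namespace transparently
String withNamespace(String namespace, String path) {
    namespace ? "/${namespace}${path}" : path
}

def keys = [
        'system.dev.db.host',
        'system.dev.example.db.name',
        'system.test.db.maxConnections'
]

keys.each { key ->
    def path = toZNodePath(key)
    println "${key} -> ${path} (with namespace 'app': ${withNamespace('app', path)})"
}
```

One caveat: Typesafe Config returns keys as path expressions, so a key that itself contains a dot or another special character comes back quoted, and the plain replacement would not handle it correctly. The library's ConfigUtil.splitPath method is meant for such cases.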

Running the script

The command line interface of the script is the following:

usage: groovy import_settings.groovy -[fcn]
 -c,--connect-string <connect-string>   The list of Zookeeper servers to
                                        connect to. Default is
                                        "localhost:2181"
 -f,--file <filename>                   Path to the settings file. Default
                                        is "settings.conf"
 -help                                  Show usage information
 -n,--namespace <namespace>             Optional ZooKeeper namespace to
                                        use when importing configuration
                                        data

This output is displayed when the script is executed with the -help option:

$ groovy import_settings.groovy -help

A call of the script against a single local ZooKeeper instance, using the "application.conf" configuration file located in the same directory as the script, looks like this:

$ groovy import_settings.groovy -c localhost:2181 -f application.conf

A call of the script against an ensemble of three ZooKeeper instances, with a custom path to the configuration file, looks like this:

$ groovy import_settings.groovy -c 10.10.10.1:2181,10.10.10.2:2181,10.10.10.3:2181 \
    -f /opt/configuration/app/settings.conf

Sources

The complete code of the import script is available in the code snippet below or in the GitHub repo.

/scripts/import_settings.groovy
import com.typesafe.config.Config
import com.typesafe.config.ConfigFactory
import com.typesafe.config.ConfigValue
import org.apache.curator.framework.CuratorFramework
import org.apache.curator.framework.CuratorFrameworkFactory
import org.apache.curator.retry.BoundedExponentialBackoffRetry
import org.apache.zookeeper.KeeperException
import org.slf4j.Logger
import org.slf4j.LoggerFactory

@Grapes([
        @Grab(group = 'org.slf4j', module = 'slf4j-simple', version = '1.7.13'),
        @Grab('org.apache.curator:curator-framework:2.9.1'),
        @Grab('com.typesafe:config:1.3.0'),
        @Grab('commons-cli:commons-cli:1.3.1')
])

/**
 * The default path to the file with the HOCON configuration data to be imported.
 */
final String DEFAULT_CONFIG_PATH = "settings.conf"

/**
 * The default list of ZooKeeper servers in the ZooKeeper ensemble.
 */
final String DEFAULT_ZK_CONNECTION_STRING = "localhost:2181"

/**
 * The default number of connection retries in case if 
 * ZooKeeper communication failure occurs.
 */
final int NUMBER_OF_CONNECTION_RETRIES = 5

/**
 * The minimum interval between connection retry attempts (in milliseconds).
 */
final int MIN_INTERVAL_BETWEEN_RETRIES = 250

/**
 * The maximum interval between connection retry attempts (in milliseconds).
 */
final int MAX_INTERVAL_BETWEEN_RETRIES = 25000

/**
 * The script logger instance.
 */
final Logger logger = LoggerFactory.getLogger(this.getClass())

/**
 * The CLI Builder.
 */
final CliBuilder cli = new CliBuilder(usage: 'groovy import_settings.groovy -[fcn]')
cli.with {
    f(longOpt: 'file', args: 1, argName: 'filename', 
        'Path to the settings file. Default is "settings.conf"')
    c(longOpt: 'connect-string', args: 1, argName: 'connect-string', 
        'The list of Zookeeper servers to connect to. Default is "localhost:2181"')
    n(longOpt: 'namespace', args: 1, argName: 'namespace', 
        'Optional ZooKeeper namespace to use when importing configuration data')
    help 'Show usage information'
}

/**
 * The Options accessor; contains parsed CLI options.
 */
final OptionAccessor options = cli.parse(args)

/**
 * The descriptor of the file with the configuration to import.
 */
File configFile

/**
 * The connection string with a list of server addresses in the ZooKeeper ensemble.
 */
String connectString

/**
 * The ZooKeeper namespace for the specified configuration data.
 */
String namespace

// Terminates execution if invalid options are provided.
if (!options) {
    cli.usage()
    return
}

// Displays script usage information and terminates the script execution 
// if help option is specified.
if (options.help) {
    cli.usage()
    return
}

// Sets a path to the file with the configuration to import (or uses the default one).
if (options.f) {
    configFile = new File(options.f.toString())
} else {
    configFile = new File(DEFAULT_CONFIG_PATH)
}

// Sets a ZooKeeper ensemble connection string (or uses the default one).
if (options.c) {
    connectString = options.c.toString()
} else {
    connectString = DEFAULT_ZK_CONNECTION_STRING
}

// Sets a ZooKeeper namespace for the imported configuration (or uses empty).
if (options.n) {
    namespace = options.n.toString()
} else {
    namespace = ""
}

/**
 * The ZooKeeper client's connection retry policy.
 */
final BoundedExponentialBackoffRetry retryPolicy = 
        new BoundedExponentialBackoffRetry(MIN_INTERVAL_BETWEEN_RETRIES, 
            MAX_INTERVAL_BETWEEN_RETRIES, NUMBER_OF_CONNECTION_RETRIES)

/**
 * Configures ZooKeeper client.
 */
final CuratorFramework client = CuratorFrameworkFactory.builder()
        .connectString(connectString)
        .retryPolicy(retryPolicy)
        .namespace(namespace)
        .build()

try {
    // Checks if a file with the configuration exists and whether it is not a directory
    if (configFile.exists() && !configFile.isDirectory()) {
        // Tries to parse the configuration from the file
        final Config config = ConfigFactory.parseFileAnySyntax(configFile)
        // Flattens the configuration entries and sorts them by the key (asc)
        final Map<String, ConfigValue> entries = config
                .entrySet().collectEntries().sort()

        // Starts the ZooKeeper client
        client.start()

        entries.each { k, v ->
            // Builds the path to the ZooKeeper node
            final String path = "/" + k.replaceAll('\\.', '/')
            try {
                // Creates the appropriate node on ZooKeeper and assigns 
                // configuration value to it
                client.create()
                    .creatingParentsIfNeeded()
                    .forPath(path, v.unwrapped().toString().getBytes())
                logger.info("Node '${path}' is created")
            } catch (KeeperException e) {
                logger.warn("Unable to create node for path '${path}': ${e.code()}")
            }
        }
    } else {
        logger.warn("File with configuration at path ${configFile.getPath()} " + 
                "does not exist")
    }
} catch (Throwable t) {
    logger.error("Unable to import settings from file ${configFile.getPath()}: " +
            "${t.getMessage()}", t)
} finally {
    // Terminates the ZooKeeper client
    client.close()
}

The source code of the whole example Scala project is available on GitHub.

Hope you find this helpful.
