<chain template="cluster-s3"/>
The "cluster-s3" chain stands for the following configuration:
<chain> <!--template="cluster-s3"-->
    <provider id="cache-fs-eventual-s3" type="cache-fs">
        <provider id="sharding-cluster-eventual-s3" type="sharding-cluster">
            <sub-provider id="eventual-cluster-s3" type="eventual-cluster">
                <provider id="retry-s3" type="retry">
                    <provider id="s3" type="s3"/>
                </provider>
            </sub-provider>
            <dynamic-provider id="remote-s3" type="remote"/>
        </provider>
    </provider>
</chain>
The main difference between the “cluster” chain configuration and the non-cluster one is that in “cluster”, each Artifactory node uses its own local storage for the “eventual” provider, so a shared mount is not needed.
Each Artifactory node manages its own queue and dispatches its own events against the cloud storage.
This chain configures the following provider layers:
- Cache-FS - A local filesystem cache that serves frequently accessed binaries directly from disk.
- Sharding-Cluster - Clusters the “Eventual” layer so that Artifactory can recognize files that are persisted on other nodes.
- Eventual-Cluster - A clustered version of the Eventual provider. Located at $ARTIFACTORY_HOME/data/eventual and contains two subfolders: “_pre” and “_queue”.
- Retry - Responsible for retrying operations against the cloud storage in case of failures.
- S3 - The cloud storage provider.
- Remote - Responsible for communication with other HA member nodes.
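Each of these layers can be tuned with its own <provider> block placed after the chain definition (the full binarystore.xml example at the end of this page shows all of them). As a minimal sketch, sizing the Cache-FS layer to 100 GB, matching the value used in the full example:

<provider id="cache-fs-eventual-s3" type="cache-fs">
    <!-- Maximum cache size in bytes (100 GB here, as in the full example below) -->
    <maxCacheSize>100000000000</maxCacheSize>
</provider>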
Since in this configuration binaries are stored locally rather than on a shared mount, they are accessible only while Artifactory is up and running (it is not enough for the host to be up).
To ensure data is always available, it is important to familiarize yourself with the following two parameters of the “sharding-cluster” provider:
redundancy - Default: 2. The number of copies that should be stored for each binary. Although the “eventual” is only a transient directory, you would usually want at least 2 copies of each file in the “eventual” to ensure all files are always available. For example, a rolling restart may take down a node that still has files in its “eventual”; if the same files exist on another node, they will be served from there.
lenientLimit - Default: 1. The minimum number of copies that need to be stored for an upload to be considered successful. If set to 0, the full configured redundancy must be met.
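Both parameters are set on the “sharding-cluster” provider block outside the chain definition. The sketch below mirrors the values used in the full example at the end of this page; treat it as a starting point to adapt rather than a prescription:

<provider id="sharding-cluster-eventual-s3" type="sharding-cluster">
    <!-- Keep at least 2 copies of each binary in the "eventual" folders across the cluster -->
    <redundancy>2</redundancy>
    <!-- An upload succeeds once at least 1 copy has been written -->
    <lenientLimit>1</lenientLimit>
</provider>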
Upon Downloading
- The node that received the download request first checks whether the file is available in the Cache-FS layer.
- If not available, it checks whether the file exists in its local Eventual folder.
- If not, it uses the “Remote” provider to check whether the file is available on one of the other member nodes.
- Otherwise, it downloads the file from the cloud provider and then serves it to the client.
Direct Cloud Storage Download lets you skip the step in which Artifactory first downloads the file from the cloud storage and then serves it to the client; instead, the client downloads the file directly from the cloud provider.
Upon Uploading
- The node that received the upload request streams the file to the eventual “_pre” folder.
- Once the node has fully received the file, it moves the file from “_pre” to “_queue”, and only then closes the upload request. If “lenientLimit” and “redundancy” are set to more than 1, it first ensures that the binary exists in the “eventual” folders of at least “n” nodes of the HA cluster (“n” being the configured “lenientLimit”).
- Once the file is in the “_queue” folder, it is available for download by all other HA members.
- The node checks every second (configurable; 1000 ms by default) whether there are new files to be handled in the “eventual” folder, and uploads them to S3.
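The dispatch interval is controlled on the “eventual-cluster” provider. A minimal sketch, using the same values as the full example below; dispatcherInterval is in milliseconds, and maxWorkers is assumed here to be the number of threads that process the queue:

<provider id="eventual-cluster-s3" type="eventual-cluster">
    <!-- How often (in ms) the dispatcher scans the "eventual" queue for new files to upload -->
    <dispatcherInterval>1000</dispatcherInterval>
    <!-- Assumed: number of worker threads uploading queued files to the cloud storage -->
    <maxWorkers>10</maxWorkers>
</provider>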
Best Practices:
- A large local Cache-FS partition is important to ensure that frequently requested artifacts are served as quickly as possible.
- Monitoring the number of files in the “eventual” subfolders is important, so that in case of a failure you, as an administrator, are notified in time.
- If “lenientLimit” is set to 1, take an Artifactory node out of the load balancer for at least a minute before restarting it. This ensures that the redundancy of recently deployed artifacts is maintained.
binarystore.xml example:
<config version="2">
    <chain> <!--template="cluster-s3"-->
        <provider id="cache-fs-eventual-s3" type="cache-fs">
            <provider id="sharding-cluster-eventual-s3" type="sharding-cluster">
                <sub-provider id="eventual-cluster-s3" type="eventual-cluster">
                    <provider id="retry-s3" type="retry">
                        <provider id="s3" type="s3"/>
                    </provider>
                </sub-provider>
                <dynamic-provider id="remote-s3" type="remote"/>
            </provider>
        </provider>
    </chain>
    <provider id="cache-fs-eventual-s3" type="cache-fs">
        <maxCacheSize>100000000000</maxCacheSize>
    </provider>
    <provider id="sharding-cluster-eventual-s3" type="sharding-cluster">
        <readBehavior>crossNetworkStrategy</readBehavior>
        <writeBehavior>crossNetworkStrategy</writeBehavior>
        <redundancy>2</redundancy>
        <lenientLimit>1</lenientLimit>
        <property name="zones" value="local,remote"/>
    </provider>
    <provider id="eventual-cluster-s3" type="eventual-cluster">
        <maxWorkers>10</maxWorkers>
        <dispatcherInterval>1000</dispatcherInterval>
        <checkPeriod>15000</checkPeriod>
        <addStalePeriod>300000</addStalePeriod>
        <zone>local</zone>
    </provider>
    <provider id="remote-s3" type="remote">
        <checkPeriod>15000</checkPeriod>
        <connectionTimeout>5000</connectionTimeout>
        <socketTimeout>15000</socketTimeout>
        <maxConnections>200</maxConnections>
        <connectionRetry>2</connectionRetry>
        <zone>remote</zone>
    </provider>
    <provider id="s3" type="s3">
        <endpoint>http://s3.amazonaws.com</endpoint>
        <identity>[ENTER IDENTITY HERE]</identity>
        <credential>[ENTER CREDENTIALS HERE]</credential>
        <path>[ENTER PATH HERE]</path>
        <bucketName>[ENTER BUCKET NAME HERE]</bucketName>
    </provider>
</config>