ARTIFACTORY: Understanding Artifact UI Upload Mechanism with Local Filestore and S3

Author: Haritha Arumugam
Article number: 000006309
Source: Salesforce
First published: 2024-12-31T12:04:50Z
Last modified: 2024-12-31
Version: 3
In JFrog Artifactory, the filestore is the physical storage location for all uploaded binaries and artifacts. Artifactory supports multiple storage configurations for binaries, including the local file system and cloud storage options such as Amazon S3, Microsoft Azure, and Google Cloud Storage. This article covers the artifact upload mechanism for two common filestore configurations: the Local Filestore and AWS S3 storage. For each configuration, it walks through what happens when an artifact is uploaded through the Artifactory UI, along with the corresponding binarystore.xml configuration and folder structure.

Filestore Configuration Options

Artifactory provides multiple storage options for managing binaries:
1. Local File System – Stores binaries locally or on a mounted file system.
2. Cloud Storage (Amazon S3, Azure, Google Cloud Storage) – Stores binaries in a cloud object store.

This article will focus on two storage configurations:
  • Local File System Filestore
  • Amazon S3 Filestore (with two upload mechanisms: Direct and Eventual)

 

Local Filestore Configuration


The Local Filestore is the simplest configuration where binaries are stored locally or on a mounted file system. It is typically used in smaller or self-contained installations of Artifactory.

Sample Local Filestore Configuration Template

The binarystore.xml file configures Artifactory's filestore. Below is an example of the configuration for using the local file system:

<config version="1">
    <chain template="file-system"/>
</config>


This template uses the local file system for storing binaries.

Local Filestore Folder Structure

The Artifactory data directory, which contains the filestore folder, has the following structure when using the local file system:

/opt/jfrog/artifactory/var/data/artifactory
├── derby
│   ├── log
│   ├── seg0
│   └── tmp
├── federated-queues
│   ├── _local
│   └── lazy_temp
├── filestore
│   └── _pre
├── git
├── import
├── testProvider-startup
├── tmp
│   ├── artifactory-uploads
│   └── work
│       └── fullSync
└── usage
    └── archived


When an artifact is uploaded via the Artifactory UI, it follows a sequence of steps before it reaches the final storage location.

File Upload Process

  1. Initial Upload: The uploaded file is placed in the artifactory-uploads folder (e.g., /opt/jfrog/artifactory/var/data/artifactory/tmp/artifactory-uploads/).
  2. Checksum Calculation: The file is then moved to the _pre folder for checksum calculation.
  3. Final Storage: The file is finally stored in a directory under the filestore folder. The folder name is the first two characters of the file's hexadecimal SHA-1 checksum.

Example of the file movement for an uploaded artifact:

  • Artifact Upload:
    /opt/jfrog/artifactory/var/data/artifactory/tmp/artifactory-uploads/caca9011-1570-46b2-ab36-5fd5e739b740_JFUI-5650.doc
  • Checksum Calculation:
    /opt/jfrog/artifactory/var/data/artifactory/filestore/_pre/dbRecord5106005714129505647-4113fa333a6180d9-example-repo-local.bin
  • Final Binary Storage:
    /opt/jfrog/artifactory/var/data/artifactory/filestore/70
    The 70 folder under filestore contains the actual binary and is named after the first two characters of the artifact's hexadecimal SHA-1 checksum.
    /opt/jfrog/artifactory/var/data/artifactory
    ├── derby
    │   ├── log
    │   ├── seg0
    │   └── tmp
    ├── federated-queues
    │   ├── _local
    │   └── lazy_temp
    ├── filestore
    │   ├── 70
    │   └── _pre
    ├── git
    ├── import
    ├── testProvider-startup
    ├── tmp
    │   ├── artifactory-uploads
    │   └── work
    │       └── fullSync
    └── usage
        └── archived
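
The three steps above can be modeled with a short Python sketch. This is a toy simulation under stated assumptions, not Artifactory's actual code: the function names are hypothetical, the _pre file name is simplified (the real one uses the dbRecord... pattern shown above), and naming the final file after its full SHA-1 value is an assumption, since this article only describes how the folder is named.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Return the hexadecimal SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def simulate_local_upload(data_dir: Path, name: str, content: bytes) -> Path:
    """Toy model of the flow: artifactory-uploads -> _pre -> filestore/<xx>."""
    uploads = data_dir / "tmp" / "artifactory-uploads"
    pre = data_dir / "filestore" / "_pre"
    uploads.mkdir(parents=True, exist_ok=True)
    pre.mkdir(parents=True, exist_ok=True)

    # 1. Initial upload lands in tmp/artifactory-uploads.
    uploaded = uploads / name
    uploaded.write_bytes(content)

    # 2. The file moves to filestore/_pre, where the checksum is calculated.
    #    (Simplified file name; hypothetical, not the real dbRecord pattern.)
    staged = Path(shutil.move(str(uploaded), str(pre / (name + ".bin"))))
    checksum = sha1_of(staged)

    # 3. Final location: a folder named after the first two hex characters.
    #    Assumption: the binary itself is named after the full SHA-1 value.
    shard = data_dir / "filestore" / checksum[:2]
    shard.mkdir(exist_ok=True)
    return Path(shutil.move(str(staged), str(shard / checksum)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        final = simulate_local_upload(Path(tmp), "example.doc", b"hello")
        print(final.relative_to(tmp))
```

Running the script prints the final path relative to a temporary data directory, which makes it easy to see that the two-character folder matches the start of the checksum.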
    

     


AWS S3 Filestore Configuration


JFrog Artifactory also supports AWS S3 for storing binaries. This setup requires a valid Artifactory license, such as JFrog Pro, Enterprise X, or Enterprise+. AWS S3 provides scalable cloud storage for managing Artifactory's filestore, enabling easy integration for distributed systems.

Two S3 Upload Mechanisms
Artifactory provides two mechanisms for uploading artifacts to S3:

  1. Direct Upload Mechanism (Recommended)
    In this mechanism, artifacts are directly uploaded to S3, without the need for intermediate storage on the local file system. This method is faster and more efficient, especially when Artifactory is hosted on AWS.
  2. Eventual Upload Mechanism
    The Eventual Upload mechanism is useful when uploads to S3 are slow or the network is unreliable. In this case, artifacts are first written to local storage (the cache and eventual directories) and later moved to S3. This mechanism allows uploads to proceed even if S3 is temporarily unavailable.
     

Direct Upload Mechanism Configuration (S3 Direct)
In the Direct Upload mechanism, the configuration file (binarystore.xml) uses the s3-storage-v3-direct template for storing binaries in S3:

<config version="2">
    <chain template="s3-storage-v3-direct"/>
    <provider type="cache-fs" id="cache-fs">
        <maxCacheSize>5000000000</maxCacheSize>
    </provider>
    <provider type="s3-storage-v3" id="s3-storage-v3">
        <path>1733896608721</path>
        <bucketName>artifactory-mill-bucket</bucketName>
        <endpoint>http://s3.amazonaws.com</endpoint>
        <credential>...</credential>
        <identity>...</identity>
    </provider>
</config>


S3 Direct Upload Folder Structure
In the Direct Upload mechanism, the files are first uploaded to the artifactory-uploads directory, and then moved to the cache and ultimately to the S3 filestore:

/opt/jfrog/artifactory/var/data/artifactory
├── cache
│   └── _pre
├── derby
│   ├── log
│   ├── seg0
│   └── tmp
├── federated-queues
│   ├── _local
│   └── lazy_temp
├── filestore
│   └── _pre
├── git
├── import
├── testProvider-startup
├── tmp
│   ├── artifactory-uploads
│   └── work
│       └── fullSync
└── usage
    └── archived


File Upload Process (S3 Direct)

When an artifact is uploaded using the Direct Upload mechanism:

  1. Initial Upload: The file is placed in the artifactory-uploads folder.
  2. Checksum Calculation: The file moves to the cache/_pre folder for checksum calculation.
  3. Final S3 Storage: After checksum calculation, the file is uploaded to S3.

Example of file movement for Direct Upload:

  • Artifact Upload:
    /opt/jfrog/artifactory/var/data/artifactory/tmp/artifactory-uploads/9d8c7c0e-90e2-4d7a-8ade-240fd25e418e_JFCON-467.doc
  • Checksum Calculation:
    /opt/jfrog/artifactory/var/data/artifactory/cache/_pre/dbRecord2002472206079155842-6e15b0653645fb42-example-repo-local.bin
  • Final S3 Binary Storage:
    The file is then stored in a folder under S3, such as /artifactory/4f/.

The folder name is the first two characters of the artifact's hexadecimal SHA-1 checksum. The same folder also appears in the local cache directory:

/opt/jfrog/artifactory/var/data/artifactory
├── cache
│   ├── 4f
│   └── _pre
├── derby
│   ├── log
│   ├── seg0
│   └── tmp
├── federated-queues
│   ├── _local
│   └── lazy_temp
├── filestore
│   └── _pre
├── git
├── import
├── testProvider-startup
├── tmp
│   ├── artifactory-uploads
│   └── work
│       └── fullSync
└── usage
    └── archived
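
As a rough illustration of the naming rule, the sketch below builds the S3 object key one might expect for a given checksum. The function name is hypothetical, and the full key layout (prefix, then two-character folder, then the full SHA-1 as the object name) is an assumption extrapolated from the folder naming described above, so verify it against your own bucket before relying on it.

```python
import hashlib


def expected_s3_key(path_prefix: str, sha1_hex: str) -> str:
    """Build '<prefix>/<first two hex chars>/<sha1>' (assumed layout)."""
    sha1_hex = sha1_hex.lower()
    # A SHA-1 checksum is 40 hexadecimal characters.
    if len(sha1_hex) != 40 or any(c not in "0123456789abcdef" for c in sha1_hex):
        raise ValueError("expected a 40-character hexadecimal SHA-1 value")
    return f"{path_prefix.strip('/')}/{sha1_hex[:2]}/{sha1_hex}"


if __name__ == "__main__":
    # Hypothetical content and prefix, used only for demonstration.
    checksum = hashlib.sha1(b"example artifact content").hexdigest()
    print(expected_s3_key("/artifactory/", checksum))
```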

 


Eventual Upload Mechanism Configuration (S3 Eventual)

In the Eventual Upload mechanism, artifacts are first stored locally in the cache and eventual directories before being moved to S3, allowing for retries if there are network or service disruptions. The configuration file (binarystore.xml) uses the s3-storage-v3 template:

<config version="2">
    <chain template="s3-storage-v3"/>
    <provider type="cache-fs" id="cache-fs">
        <maxCacheSize>5000000000</maxCacheSize>
    </provider>
    <provider type="s3-storage-v3" id="s3-storage-v3">
        <path>1733896608721</path>
        <bucketName>artifactory-mill-bucket</bucketName>
        <endpoint>http://s3.amazonaws.com</endpoint>
        <credential>...</credential>
        <identity>...</identity>
    </provider>
</config>
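
To check which of the mechanisms described in this article a given binarystore.xml selects, the chain template can be read programmatically. The following is a minimal sketch using Python's standard library. The function name and the label strings are hypothetical; the mapping covers only the three templates shown in this article, and Artifactory supports others.

```python
import xml.etree.ElementTree as ET

# Mapping limited to the three chain templates shown in this article.
MECHANISMS = {
    "file-system": "Local filestore",
    "s3-storage-v3-direct": "S3 direct upload",
    "s3-storage-v3": "S3 eventual upload",
}


def describe_binarystore(xml_text: str) -> str:
    """Return a label for the chain template declared in binarystore.xml."""
    root = ET.fromstring(xml_text)
    chain = root.find("chain")
    if chain is None:
        return "No <chain> element found"
    template = chain.get("template", "")
    return MECHANISMS.get(template, f"Unrecognized template: {template!r}")


if __name__ == "__main__":
    sample = '<config version="2"><chain template="s3-storage-v3-direct"/></config>'
    print(describe_binarystore(sample))
```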


Eventual Upload Folder Structure

The Eventual Upload mechanism includes additional temporary directories for staging before the final upload to S3:

/opt/jfrog/artifactory/var/data/artifactory
├── cache
│   └── _pre
├── derby
│   ├── log
│   ├── seg0
│   └── tmp
├── eventual
│   ├── _add
│   ├── _delete
│   └── _pre
├── federated-queues
│   ├── _local
│   └── lazy_temp
├── filestore
│   └── _pre
├── git
├── import
├── testProvider-startup
├── tmp
│   ├── artifactory-uploads
│   └── work
│       └── fullSync
└── usage
    └── archived


Artifacts uploaded using this mechanism follow a similar path:

  1. Initial Upload: The file is placed in artifactory-uploads.
  2. Checksum Calculation: The file moves to the cache/_pre and eventual/_pre directories for checksum calculation.
  3. Staging in Eventual Directory: The file is then staged in the eventual/_add directory.
  4. Final S3 Storage: Finally, the file is uploaded to S3.

Example of file movement for Eventual Upload:

  • Artifact Upload:
    /opt/jfrog/artifactory/var/data/artifactory/tmp/artifactory-uploads/9d8c7c0e-90e2-4d7a-8ade-240fd25e418e_JFCON-467.doc
  • Checksum Calculation:
    /opt/jfrog/artifactory/var/data/artifactory/cache/_pre/dbRecord7276279592561853539-3c02935ceaa6a28f-example-repo-local.bin
    /opt/jfrog/artifactory/var/data/artifactory/eventual/_pre/dbRecord7276279592561853539-3c02935ceaa6a28f-example-repo-local.bin
  • Staging for S3 Upload:
    /opt/jfrog/artifactory/var/data/artifactory/eventual/_add/9c/
    The folder name is the first two characters of the artifact's hexadecimal SHA-1 checksum.
/opt/jfrog/artifactory/var/data/artifactory
├── cache
│   ├── 9c
│   └── _pre
├── derby
│   ├── log
│   ├── seg0
│   └── tmp
├── eventual
│   ├── _add
│   │   └── 9c
│   ├── _delete
│   └── _pre
├── federated-queues
│   ├── _local
│   └── lazy_temp
├── filestore
│   └── _pre
├── git
├── import
├── testProvider-startup
├── tmp
│   ├── artifactory-uploads
│   └── work
│       └── fullSync
└── usage
    └── archived
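
Because files wait in eventual/_add until they reach S3, counting what is staged there gives a rough view of the upload backlog. The sketch below is one way to do that, under stated assumptions: the function name is hypothetical, and it assumes staged binaries sit as regular files directly inside the two-character folders, which this article does not confirm.

```python
import tempfile
from pathlib import Path


def pending_eventual_uploads(data_dir: Path) -> dict[str, int]:
    """Count files staged under eventual/_add, grouped by two-character folder."""
    add_dir = data_dir / "eventual" / "_add"
    counts: dict[str, int] = {}
    if not add_dir.is_dir():
        return counts
    for shard in sorted(p for p in add_dir.iterdir() if p.is_dir()):
        # Assumption: staged binaries are regular files directly in the folder.
        staged = sum(1 for f in shard.iterdir() if f.is_file())
        if staged:
            counts[shard.name] = staged
    return counts


if __name__ == "__main__":
    # Demonstration against a temporary directory, not a real installation.
    with tempfile.TemporaryDirectory() as tmp:
        shard = Path(tmp) / "eventual" / "_add" / "9c"
        shard.mkdir(parents=True)
        (shard / "staged-binary").write_bytes(b"demo")
        print(pending_eventual_uploads(Path(tmp)))
```

On a real installation, data_dir would be the Artifactory data directory shown in the trees above, for example /opt/jfrog/artifactory/var/data/artifactory.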


 

Conclusion


Artifactory provides flexible options for storing binaries on local file systems and on cloud storage platforms such as AWS S3. By understanding the different filestore configurations, namely the Local Filestore and the S3 Filestore (with its Direct and Eventual upload mechanisms), users can choose the storage solution that best suits their needs. In every case, the upload follows the same structured flow: the file lands in artifactory-uploads, moves to a _pre folder for checksum calculation, and is then stored under a folder named after the first two characters of its SHA-1 checksum.