As mentioned in this JIRA ticket, we reported the behavior of the storage info REST API in the past. It's worth mentioning that the size reported in this part of the response does count duplicate layers, and this is intended.
This call uses the same information that is displayed on the Storage page of the Artifactory UI.
As you may already know, there are two different sizes: binaries size (orange) and artifacts size (green). The binaries size counts each file once, and it's the value you want for the repository size, which is represented by the blue line. As you can see in that table, the sizes of all the repositories add up to the total artifacts size, which is why the call counts duplicates in the response.
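To make the difference between the two sizes concrete, here is a minimal sketch on made-up data. The `items` list, the repo names, and the checksums are hypothetical; the field names `repo`, `actual_sha1`, and `size` are assumed only for illustration.

```python
def artifacts_size(items):
    """Sum the size of every artifact, duplicates included."""
    return sum(item["size"] for item in items)


def binaries_size(items):
    """Sum the size of each distinct checksum exactly once."""
    unique = {item["actual_sha1"]: item["size"] for item in items}
    return sum(unique.values())


# Hypothetical sample: the same 100-byte binary is referenced from two repos.
items = [
    {"repo": "repo-a", "actual_sha1": "aaa", "size": 100},
    {"repo": "repo-b", "actual_sha1": "aaa", "size": 100},
    {"repo": "repo-b", "actual_sha1": "bbb", "size": 50},
]

print(artifacts_size(items))  # 250: the duplicate is counted twice
print(binaries_size(items))   # 150: each checksum is counted once
```

In this toy data the per-repo artifact sizes (100 and 150) add up to the artifacts size, not to the binaries size.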
We don't also provide a binaries size per repo because, technically, that value doesn't exist. You could delete an entire repository that contains 1TB of artifacts, and if enough other repos reference the same binaries, you might not free a single megabyte of space. However, we understand that a deduplicated size of the artifacts can be useful, and we can calculate it using AQL.
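The point about deleting a repository can be illustrated with a small sketch. It is a simplified model on hypothetical data, not how Artifactory itself accounts for storage: the function name `reclaimable_size`, the repo names, and the checksums are all made up, and a binary is treated as freed only when no other repo references its checksum.

```python
from collections import defaultdict


def reclaimable_size(items, repo):
    """Bytes held only by `repo`, i.e. checksums no other repo references."""
    owners = defaultdict(set)  # checksum -> set of repos referencing it
    sizes = {}                 # checksum -> size in bytes
    for item in items:
        owners[item["actual_sha1"]].add(item["repo"])
        sizes[item["actual_sha1"]] = item["size"]
    return sum(size for sha1, size in sizes.items() if owners[sha1] == {repo})


# Hypothetical sample: checksum "aaa" is shared, "bbb" lives only in repo-b.
items = [
    {"repo": "repo-a", "actual_sha1": "aaa", "size": 100},
    {"repo": "repo-b", "actual_sha1": "aaa", "size": 100},
    {"repo": "repo-b", "actual_sha1": "bbb", "size": 50},
]

print(reclaimable_size(items, "repo-a"))  # 0: its only binary is shared
print(reclaimable_size(items, "repo-b"))  # 50: only "bbb" is exclusive
```

Here repo-a contains 100 bytes of artifacts, yet deleting it would free nothing under this model, because repo-b still references the same binary.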
For example, this Python 3 script computes the size of each repo with each checksum counted once and prints the results:
    from collections import defaultdict  # convenient data structure for counting

    import requests  # library for the API requests


    def audit():
        base_url = 'http://localhost:8081/artifactory/'  # your Artifactory instance
        headers = {'content-type': 'text/plain'}  # headers for the query
        # Query to find all artifacts
        data = 'items.find({"name":{"$match":"*"}}).include("actual_sha1", "repo", "size")'
        # Execute the query
        resp = requests.post(base_url + 'api/search/aql', auth=('admin', 'password'),
                             headers=headers, data=data)
        resp.raise_for_status()  # stop on authentication or query errors
        results = resp.json()["results"]  # parse the JSON body rather than eval() it

        repos = defaultdict(dict)  # repo -> {checksum: size}
        totals = defaultdict(int)  # repo -> deduplicated size in bytes
        for item in results:
            repo, sha1, size = item["repo"], item["actual_sha1"], int(item["size"])
            if sha1 not in repos[repo]:  # count each checksum once per repo
                repos[repo][sha1] = size
                totals[repo] += size

        for repo, artifacts in repos.items():
            print("Storage per Artifact for Repo {}".format(repo))
            for checksum, size in artifacts.items():
                print("[{}] -- Checksum:{} -- Size:{}".format(repo, checksum, size))
            print("==================")
        print("==================")
        print("==================")
        for repo, total in totals.items():
            print("Repo {} uses a total of {} bytes.".format(repo, total))


    if __name__ == '__main__':
        audit()

Here is an example of the output:

    Storage per Artifact for Repo test-generic
    [test-generic] -- Checksum:3ae3f83349b04656faa27ae59b2287c06bdc428b -- Size:423232
    [test-generic] -- Checksum:591d8d38b865ab1ef4218120779f78ec950d97b0 -- Size:3045251
    [test-generic] -- Checksum:a113a4b034a150990514c3f0c6f1c0f2b72384a5 -- Size:1410580
    ==================
    Storage per Artifact for Repo nuget-local
    [nuget-local] -- Checksum:3b71f43ff30f4b15b5cd85dd9e95ebc7e84eb5a3 -- Size:1048576
    [...more data...]
    Storage per Artifact for Repo debian-remote-cache
    [debian-remote-cache] -- Checksum:a113a4b034a150990514c3f0c6f1c0f2b72384a5 -- Size:1410580
    ==================
    Storage per Artifact for Repo maven-remote
    [maven-remote] -- Checksum:b899da20a0f408d00cfe32a268458fb401d5d698 -- Size:1546674
    ==================
    Storage per Artifact for Repo test-generic-2
    [test-generic-2] -- Checksum:e6bbc45386305b92f08f894deb1b47c66bd3d815 -- Size:788
    ==================
    ==================
    ==================
    Repo npm-remote-cache uses a total of 2699 bytes.
    Repo docker-remote-cache uses a total of 553 bytes.
    Repo pypi-local uses a total of 163 bytes.
    Repo libs-release-local uses a total of 34 bytes.
    Repo conan-local uses a total of 791405 bytes.
    Repo pypi-remote-cache uses a total of 17592 bytes.
    Repo debian-local uses a total of 713 bytes.
    Repo test-generic uses a total of 423232 bytes.
    Repo nuget-local uses a total of 1048576 bytes.
    Repo test-cache uses a total of 5836 bytes.
    Repo debian-remote-cache uses a total of 1410580 bytes.
    Repo maven-remote uses a total of 1546674 bytes.
    Repo test-generic-2 uses a total of 788 bytes.