There are several reasons to find out which repositories in Artifactory are the largest, such as monitoring team usage or confirming that cleanup policies are working as intended.
To find the largest repositories in Artifactory, we can combine the Get Storage Summary Info endpoint with some basic JSON manipulation using the jq command-line tool.
First, let’s save the result of the Get Storage Summary Info endpoint to a file (repos.json):
> curl -u user -XGET ART-URL/artifactory/api/storageinfo > repos.json
The response contains a list of all of the Artifactory repositories (called repositoriesSummaryList) with the following fields:
{"repoKey":"npm-fed", "repoType":"FEDERATED", "foldersCount":6, "filesCount":11, "usedSpace":"34.46 MB", "usedSpaceInBytes":36138261, "itemsCount":17, "packageType":"npm", "projectKey":"default", "percentage":"3.95%"}
We can then sort the array by the usedSpaceInBytes field to order the repositories by size (smallest to largest):
> jq '[.repositoriesSummaryList[]] | sort_by(.usedSpaceInBytes)' repos.json
To return only the 10 largest repositories, use the following:
> jq '[.repositoriesSummaryList[]] | sort_by(.usedSpaceInBytes) | .[-11:-1]' repos.json
The last item is omitted (the slice ends at -1) because it is not a regular repository but an entry holding the total storage used by Artifactory.
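If the same report is needed inside a script, the logic can also be sketched in Python. This is a minimal sketch rather than an official JFrog method. The function names load_summary and largest_repositories are hypothetical. It also assumes that the aggregate entry can be recognized by a repoKey of "TOTAL", which should be checked against your own repos.json. Filtering by key instead of by position means the result does not depend on the total being sorted last.

```python
import json


def load_summary(path):
    """Read a saved Get Storage Summary Info response from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def largest_repositories(summary, count=10):
    """Return up to `count` repository entries, largest first."""
    repos = [
        repo
        for repo in summary["repositoriesSummaryList"]
        # Assumption: the aggregate row is identified by repoKey "TOTAL".
        if repo.get("repoKey") != "TOTAL"
    ]
    # Sort by raw byte count, not by the human-readable usedSpace string.
    repos.sort(key=lambda repo: repo.get("usedSpaceInBytes", 0), reverse=True)
    return repos[:count]
```

Calling largest_repositories(load_summary("repos.json")) returns the entries in descending order of size, so the repoKey and usedSpace fields of each entry can be printed directly.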