Set up Ivysettings.xml

ARTIFACTORY: How to use the Artifactory Maven repository to download dependencies for a Spark cluster with authentication

Author: Shisiya Sebastian
Article number: 000005980
Source: Salesforce
First published: 2024-01-14T15:39:01Z
Last modified: 2024-01-12
Version: 1

Once a user application is bundled, it can be launched with the bin/spark-submit script, which sets up the classpath with Spark and its dependencies. To resolve dependencies from an Artifactory repository that requires authentication, point spark-submit at an ivysettings.xml file (through the spark.jars.ivySettings property) that names the repository and its credentials. An example ivysettings.xml:

<?xml version="1.0" encoding="UTF-8"?>
<ivy-settings>
  <settings defaultResolver="main" />
  <!-- Credentials used to authenticate against Artifactory. 'Artifactory Realm' is the realm used by Artifactory, so do not change it. -->
  <credentials host="artifactory_host" realm="Artifactory Realm" username="admin" passwd="mypassword" />
  <resolvers>
    <chain name="main">
      <filesystem name="local" checkmodified="true" validate="true">
        <ivy pattern="${ivy.settings.dir}/../repository/[module]-ivy-[revision].xml"/>
        <artifact pattern="${ivy.settings.dir}/../repository/[module]-[revision].[ext]"/>
      </filesystem>
      <ibiblio name="public" m2compatible="true" root="http://artifactory_host/artifactory/maven-local/" />
      <url name="artifactory" m2compatible="false">
        <ivy pattern="http://artifactory_host/artifactory/maven-local/[organization]/[module]/[revision]/[type]s/ivy-[revision].xml" />
        <artifact pattern="http://artifactory_host/artifactory/maven-local/[organization]/[module]/[revision]/[type]s/[module](-[classifier])-[revision].[ext]" />
      </url>
    </chain>
  </resolvers>
</ivy-settings>

Note: Here, maven-local is the Maven repository in Artifactory, artifactory_host stands for the Artifactory host name, and admin/mypassword are the Artifactory credentials (username/password).
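
Because the settings file above stores the password in plain text, one option is to generate it at deploy time from environment variables rather than keeping it alongside the application. The following is a minimal sketch, assuming a POSIX shell. The function name write_ivysettings and the ARTIFACTORY_HOST, ARTIFACTORY_USER, and ARTIFACTORY_PASSWORD variables are illustrative names, not Spark or Artifactory conventions, and only the "public" resolver is kept for brevity.

```shell
#!/bin/sh
# Sketch: render an ivysettings.xml from environment variables so that the
# Artifactory password is not stored with the application sources.
# write_ivysettings and the ARTIFACTORY_* names are hypothetical.
write_ivysettings() {
  out="$1"

  # Abort with a message if a required variable is unset or empty.
  : "${ARTIFACTORY_HOST:?set ARTIFACTORY_HOST}"
  : "${ARTIFACTORY_USER:?set ARTIFACTORY_USER}"
  : "${ARTIFACTORY_PASSWORD:?set ARTIFACTORY_PASSWORD}"

  mkdir -p "$(dirname "$out")"

  # Values are inserted verbatim; a password containing &, < or " would
  # need XML escaping before it is written here.
  cat > "$out" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<ivy-settings>
  <settings defaultResolver="main" />
  <credentials host="${ARTIFACTORY_HOST}" realm="Artifactory Realm" username="${ARTIFACTORY_USER}" passwd="${ARTIFACTORY_PASSWORD}" />
  <resolvers>
    <chain name="main">
      <ibiblio name="public" m2compatible="true" root="http://${ARTIFACTORY_HOST}/artifactory/maven-local/" />
    </chain>
  </resolvers>
</ivy-settings>
EOF

  # Restrict the file to its owner, since it holds a password.
  chmod 600 "$out"
}
```

After exporting the three variables, calling write_ivysettings conf/ivysettings.xml would produce a file that can be passed to spark-submit through spark.jars.ivySettings in the same way as the hand-written one.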

Sample command:
spark-submit \
  --master local \
  --conf spark.dynamicAllocation.enabled=false \
  --conf spark.jars.ivySettings=conf/ivysettings.xml \
  --conf spark.task.maxFailures=8 \
  --executor-cores 1 \
  --executor-memory 3g \
  --driver-memory 2g \
  --name airflow-test-submit \
  --class org.apache.spark.examples.SparkPi \
  --packages com.cognite:cognite-sdk-scala_2.11:1.5.16 \
  --verbose \
  /opt/bitnami/spark/examples/jars/cdf-spark-datasource_2.11-1.4.43.jar

Note: Here, the application's dependency is specified with the --packages option. The output below shows that it was downloaded from Artifactory successfully.
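
The --packages value is a Maven coordinate of the form groupId:artifactId:version. With a Maven-layout resolver, that coordinate maps to a path under the repository root, which helps when checking that the artifact really exists in maven-local. The following is a small sketch of that mapping, assuming a plain jar with no classifier; coordinate_to_path is an illustrative helper name, not part of Spark or Ivy.

```shell
#!/bin/sh
# Sketch: convert "groupId:artifactId:version" into the relative path of the
# jar in a Maven-layout repository. coordinate_to_path is a hypothetical name.
coordinate_to_path() {
  group="${1%%:*}"        # text before the first colon
  rest="${1#*:}"          # text after the first colon
  artifact="${rest%%:*}"  # text between the first and second colon
  version="${rest#*:}"    # text after the second colon

  # Maven layout replaces the dots in the group ID with slashes.
  group_path="$(printf '%s' "$group" | tr '.' '/')"

  printf '%s/%s/%s/%s-%s.jar\n' \
    "$group_path" "$artifact" "$version" "$artifact" "$version"
}
```

For the coordinate used in the sample command, coordinate_to_path com.cognite:cognite-sdk-scala_2.11:1.5.16 yields com/cognite/cognite-sdk-scala_2.11/1.5.16/cognite-sdk-scala_2.11-1.5.16.jar, which is the path that follows the repository root in the download URL of the result snippet.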

Result snippet:
:: loading settings :: file = conf/ivysettings.xml
:: loading settings :: url = jar:file:/opt/bitnami/spark/jars/ivy-2.5.1.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: /opt/bitnami/spark/.ivy2/cache
The jars for the packages stored in: /opt/bitnami/spark/.ivy2/jars
com.cognite#cognite-sdk-scala_2.11 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-5ca4ae6e-a7f0-4b71-8071-b3a70a4eec4d;1.0
    confs: [default]
    found com.cognite#cognite-sdk-scala_2.11;1.5.16 in public
downloading http://artifactory_host/artifactory/maven-local/com/cognite/cognite-sdk-scala_2.11/1.5.16/cognite-sdk-scala_2.11-1.5.16.jar ...
    [SUCCESSFUL ] com.cognite#cognite-sdk-scala_2.11;1.5.16!cognite-sdk-scala_2.11.jar (2751ms)
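
The "found ... in public" line above names the resolver in the chain that served the dependency. When the spark-submit output has been saved to a file, a check like the one below can confirm which resolver was used. This is a sketch only: resolver_for is an illustrative helper name, and it assumes the log keeps the "found <module> in <resolver>" format shown in the snippet.

```shell
#!/bin/sh
# Sketch: print the name of the resolver that Ivy reported for a module.
# resolver_for is a hypothetical name.
#   $1 = file containing the captured spark-submit output
#   $2 = module ID as printed in the log, e.g. org#name;version
resolver_for() {
  # Match lines of the form "found <module> in <resolver>" and print the
  # resolver name from the first match.
  awk -v id="$2" \
    '$1 == "found" && $2 == id && $3 == "in" { print $4; exit }' "$1"
}
```

For the output shown here, resolver_for submit.log 'com.cognite#cognite-sdk-scala_2.11;1.5.16' (with submit.log standing in for wherever the output was saved) would print public. An empty result means no matching "found" line was present in the file.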