Load to BigQuery via Spark Job Fails with the Exception "Multiple sources found for parquet"


I have a Spark job that loads data into BigQuery. The job runs on a Dataproc cluster.
This is the snippet:

df.write
      .format("bigquery")
      .mode(writeMode)
      .option("table", tabName)
      .save()

I have specified the Spark BigQuery dependency jar (spark-bigquery-with-dependencies_2.12-0.19.1.jar) in the --jars argument of the spark-submit command.

When I run the code, I get the following exception: java.lang.RuntimeException: Failed to write to BigQuery

Detailed error:

Caused by: org.apache.spark.sql.AnalysisException: Multiple sources found for parquet (org.apache.spark.sql.execution.datasources.v2.parquet.ParquetDataSourceV2, org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat), please specify the fully qualified class name.
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:717)

These are the dependencies in my project:

<dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.12.14</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.12</artifactId>
            <version>2.4.8</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>google-cloud-bigquery</artifactId>
            <version>1.133.1</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud.spark</groupId>
            <artifactId>spark-bigquery_2.12</artifactId>
            <version>0.21.1</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>google-cloud-storage</artifactId>
            <version>1.116.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.1.1</version>
        </dependency>
        <dependency>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>3.3.3</version>
        </dependency>
    </dependencies>

I am building an uber jar to run the Spark job.
If I remove the --jars param, the job fails while reading a BigQuery table:

java.lang.ClassNotFoundException: Failed to find data source: bigquery. Please find packages at http://spark.apache.org/third-party-projects.html
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:689)

Solution:

It seems you are running on Spark 3.x (the Dataproc runtime) with an uber jar that was compiled against, and bundles, Spark 2.4.8 artifacts. Spark 3 already registers its own parquet source (ParquetDataSourceV2), so the Spark 2.4.8 ParquetFileFormat shaded into your jar registers a second one; that is exactly the "Multiple sources found for parquet" conflict in the stack trace. The solution is simple: mark scala-library and spark-sql with the provided scope so they are not packaged into the uber jar. Also, as you bring the spark-bigquery-connector externally via --jars, you don't need to declare it in your POM (the same goes for the google-cloud-* dependencies, unless you use them directly).
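
Here is a minimal sketch of the corrected dependencies section. The provided scopes are the actual fix; the spark-sql version 3.1.2 is my assumption for a Dataproc 2.0 image, so align it with whatever Spark version your cluster actually runs:

<dependencies>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.12.14</version>
        <!-- supplied by the Dataproc runtime, so keep it out of the uber jar -->
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.12</artifactId>
        <!-- assumption: 3.1.2 matches a Dataproc 2.0 cluster; use your runtime's version -->
        <version>3.1.2</version>
        <scope>provided</scope>
    </dependency>
    <!-- spark-bigquery, google-cloud-bigquery and google-cloud-storage removed:
         the connector jar passed via --jars already provides everything needed -->
</dependencies>

Note that maven-shade-plugin and scala-maven-plugin are build plugins, not dependencies; they belong under <build><plugins>. Keep passing the connector through --jars at submit time (the connector jars are also published in the public gs://spark-lib/bigquery/ bucket, if you prefer referencing them from there).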