Conversation

@Urvashi2303 commented on Nov 23, 2025

Summary
This PR upgrades MLeap to Scala 2.13.16, Apache Spark 4.0.1, Java 17, and XGBoost 2.0.3, bringing the project up to date with the latest stable versions across the entire stack.

Version Upgrades
Core Dependencies:

  • Scala upgraded from 2.12.18 to 2.13.16
  • Apache Spark upgraded from 3.4.0 to 4.0.1
  • Java upgraded from 11 to 17
  • XGBoost upgraded from 1.7.6 to 2.0.3

Supporting Libraries:

  • Spring Boot upgraded to 3.2.0 for Java 17 support
  • Logback upgraded to 1.4.14 for Java 17/21 compatibility
  • Hadoop Client upgraded to 3.4.0 to match Spark 4.0 requirements
  • Scalameter upgraded to 0.21 for Scala 2.13 compatibility

Build and Test Summary
Build Results:

  • All modules compiled successfully
  • No compilation errors
  • Total build time: ~14 seconds

Test Results:

  • mleap-benchmark
    Pass

  • mleap-xgboost-runtime
    Tests: 22 | Pass: 22 | Fail: 0

  • mleap-executor-tests
    Tests: 31 | Pass: 31 | Fail: 0

  • root tests: mleap-base, mleap-tensor, bundle-ml, mleap-core, mleap-runtime, mleap-avro, mleap-spark
    Tests: 8262 | Pass: 8261 | Fail: 1 (SparkTransformBuilderSpec fails with RemoteClassLoaderError in local mode)

  • Aborted (all due to local test environment limitations):

    • TensorFlow tests (3) - No ARM64 artifacts available
    • XGBoost Spark (2) - Distributed training fails in local[4] mode only
    • Spark Parity (2) - Related to local mode classloader

@austinzh (Contributor) commented:

Thanks for the PR.
Have you tested the Python package as well? I think the Python package is still on PySpark 3.4.0 and running on Python 3.7 and 3.8.

Also:
We ran out of credits for Travis, so we might have to move to GitHub Actions for CI.

@austinzh (Contributor) commented:

Hey @Urvashi2303, I have created a PR to add GitHub Actions and fix the Python issues:
Urvashi2303#1

austinzh requested review from austinzh and Copilot on December 16, 2025 at 19:37

Copilot AI left a comment


Pull request overview

This PR upgrades MLeap to modern versions across the technology stack: Scala 2.13.16, Apache Spark 4.0.1, Java 17, and XGBoost 2.0.3. The changes address compatibility requirements and API changes introduced by these major version upgrades.

Key Changes:

  • Core dependency upgrades in build configuration (Scala, Spark, Java, XGBoost, Spring Boot, Logback, Hadoop)
  • Scala 2.13 collection API compatibility fixes throughout the codebase (.toSeq, .toMap conversions; see the first sketch below)
  • Java namespace migration from javax.* to jakarta.* for Spring Boot 3.x (illustrated after the file summary)
  • XGBoost 2.0 API adjustments (file format parameters, ArrayRow constructor changes; see the second sketch below)
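
A minimal sketch of the two Scala 2.13 collection changes behind most of the .toSeq/.toMap edits (the object and field names here are invented for illustration, not taken from the MLeap diff):

import scala.collection.mutable

object Scala213CollectionsSketch {
  // In 2.13, mapValues returns a lazy MapView, so an explicit .toMap is
  // needed wherever a strict Map (for example a case-class field) is expected.
  val weights: Map[String, Double] = Map("a" -> 1.0, "b" -> 2.0)
  val doubled: Map[String, Double] = weights.view.mapValues(_ * 2).toMap

  // In 2.13 the unqualified Seq is immutable (in 2.12 it meant collection.Seq),
  // so mutable or array-backed collections now need an explicit .toSeq at
  // call sites that expect a Seq.
  def describe(values: Seq[Double]): String = values.mkString(",")
  val fromBuffer: String = describe(mutable.ArrayBuffer(1.0, 2.0).toSeq)
}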
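
A similarly minimal sketch of the XGBoost 2.0 text-input change, assuming xgboost4j's Scala DMatrix and a placeholder path (the ArrayRow constructor fix is internal to MLeap and not shown here):

import ml.dmlc.xgboost4j.scala.DMatrix

object XgbFormatSketch {
  // XGBoost 2.x no longer auto-detects text input formats, so the format is
  // carried as a URI parameter; the path below is only a placeholder.
  val train: DMatrix = new DMatrix("/tmp/agaricus.txt.train?format=libsvm")
}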

Reviewed changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 4 comments.

Summary per file:

  • project/Dependencies.scala: Updated version numbers for core dependencies and supporting libraries
  • project/Common.scala: Changed the Scala version to 2.13.16 and added Java 17 compiler options (see the build-settings sketch below)
  • mleap-xgboost-spark test files: Refactored implicit imports to avoid Spark 4.0 compilation issues
  • mleap-xgboost-runtime/CachedDatasetUtils.scala: Added the XGBoost 2.0 format parameter and fixed the ArrayRow constructor
  • mleap-tensorflow files: Added ClassTag imports and .toSeq conversions for Scala 2.13
  • mleap-spring-boot files: Migrated from javax to jakarta namespaces for Spring Boot 3.x (see the import sketch below)
  • mleap-spark-testkit/SparkParityBase.scala: Adjusted Spark configuration and added a .toSet conversion
  • mleap-spark-extension/OneVsRest.scala: Removed parallel processing from Range.map
  • mleap-spark-base files: Updated DataFrame API usage for Spark 4.0
  • mleap-runtime files: Extensive collection API updates for Scala 2.13 compatibility
  • mleap-core files: Added .toMap conversions and type annotations for Scala 2.13
  • mleap-benchmark/SparkTransformBenchmark.scala: Removed a deprecated config parameter
  • mleap-avro files: Collection API conversions throughout value and schema converters
  • bundle-ml files: Collection API fixes and empty parameter list syntax updates
  • bundle-hdfs/HadoopBundleFileSystem.scala: Added a .toSeq conversion for the config list
  • README.md: Updated artifact IDs and added a version compatibility row
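
A sketch of the kind of sbt settings the project/Common.scala change involves (illustrative only; the exact option spellings in the PR may differ):

import sbt._
import sbt.Keys._

object CommonSketch {
  // Pin the Scala version and target the Java 17 class-file/API level.
  lazy val java17Settings: Seq[Setting[_]] = Seq(
    scalaVersion := "2.13.16",
    scalacOptions ++= Seq("-release", "17"),
    javacOptions ++= Seq("--release", "17")
  )
}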
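
And the shape of the javax to jakarta migration in mleap-spring-boot, sketched against a servlet type (the helper below is hypothetical; the actual files may touch other jakarta packages):

// Before (Spring Boot 2.x, Java EE namespace):
//   import javax.servlet.http.HttpServletRequest
// After (Spring Boot 3.x, Jakarta EE namespace):
import jakarta.servlet.http.HttpServletRequest

object JakartaMigrationSketch {
  // Hypothetical helper, shown only to exercise the new import.
  def clientAddress(request: HttpServletRequest): String = request.getRemoteAddr
}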


/** Object for serializing/deserializing model definitions with JSON.
*/
- case class JsonFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer {
+ case class JsonFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer {

Copilot AI commented on Dec 16, 2025:

Empty parameter lists () on case classes are unnecessary in Scala. The empty parentheses can be removed: case class JsonFormatModelSerializer(implicit hr: HasBundleRegistry)

/** Object for serializing/deserializing model definitions with Protobuf.
*/
- case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer {
+ case class ProtoFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer {

Copilot AI commented on Dec 16, 2025:

Empty parameter lists () on case classes are unnecessary in Scala. The empty parentheses can be removed: case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry)

Suggested change:
- case class ProtoFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer {
+ case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer {

.collect(Collectors.toList())
.asScala
- .map(removeElement)
+ .map(removeElement(_))

Copilot AI commented on Dec 16, 2025:

The explicit underscore in .map(removeElement(_)) is redundant. Scala can infer this automatically, so it can be simplified to .map(removeElement)

Suggested change:
- .map(removeElement(_))
+ .map(removeElement)

namedIndices: Array[(String, Int)] = Array(),
inputSize: Int) extends Model {
- val allIndices: Array[Int] = indices.union(namedIndices.map(_._2))
+ val allIndices: Array[Int] = (indices.union(namedIndices.map(_._2))).toArray

Copilot AI commented on Dec 16, 2025:

The parentheses around indices.union(namedIndices.map(_._2)) are unnecessary. The .toArray call can be applied directly: val allIndices: Array[Int] = indices.union(namedIndices.map(_._2)).toArray

Suggested change:
- val allIndices: Array[Int] = (indices.union(namedIndices.map(_._2))).toArray
+ val allIndices: Array[Int] = indices.union(namedIndices.map(_._2)).toArray
