Conversation

@Urvashi2303 commented on Nov 23, 2025

Summary
This PR upgrades MLeap to Scala 2.13.16, Apache Spark 4.0.1, Java 17, and XGBoost 2.0.3, bringing the project up to date with the latest stable versions across the entire stack.

Version Upgrades
Core Dependencies:

  • Scala upgraded from 2.12.18 to 2.13.16
  • Apache Spark upgraded from 3.4.0 to 4.0.1
  • Java upgraded from 11 to 17
  • XGBoost upgraded from 1.7.6 to 2.0.3

Supporting Libraries:

  • Spring Boot upgraded to 3.2.0 for Java 17 support
  • Logback upgraded to 1.4.14 for Java 17/21 compatibility
  • Hadoop Client upgraded to 3.4.0 to match Spark 4.0 requirements
  • Scalameter upgraded to 0.21 for Scala 2.13 compatibility

Build and Test Summary
Build Results:

  • All modules compiled successfully
  • No compilation errors
  • Total build time: ~14 seconds

Test Results:

  • mleap-benchmark
    Pass

  • mleap-xgboost-runtime
    Tests: 22 | Pass: 22 | Fail: 0

  • mleap-executor-tests
    Tests: 31 | Pass: 31 | Fail: 0

  • root tests: mleap-base, mleap-tensor, bundle-ml, mleap-core, mleap-runtime, mleap-avro, mleap-spark
    Tests: 8262 | Pass: 8261 | Fail: 1 (SparkTransformBuilderSpec fails with RemoteClassLoaderError in local mode)

  • Aborted (all due to local test environment limitations):

    • TensorFlow tests (3) - No ARM64 artifacts available
    • XGBoost Spark (2) - Distributed training fails in local[4] mode only
    • Spark Parity (2) - Related to local mode classloader

@austinzh (Contributor) commented:

Thanks for the PR.
Have you tested the Python package as well? I think the Python package is still on PySpark 3.4.0 and running on Python 3.7 and 3.8.

Also:
We ran out of credits for Travis, so we might have to move to GitHub Actions for CI.

@austinzh (Contributor) commented:

Hey @Urvashi2303, I have created a PR to add GitHub Actions and fix the Python issues:
Urvashi2303#1

austinzh requested review from austinzh and Copilot on December 16, 2025 at 19:37

Copilot AI left a comment


Pull request overview

This PR upgrades MLeap to modern versions across the technology stack: Scala 2.13.16, Apache Spark 4.0.1, Java 17, and XGBoost 2.0.3. The changes address compatibility requirements and API changes introduced by these major version upgrades.

Key Changes:

  • Core dependency upgrades in build configuration (Scala, Spark, Java, XGBoost, Spring Boot, Logback, Hadoop)
  • Scala 2.13 collection API compatibility fixes throughout the codebase (.toSeq, .toMap conversions; see the first sketch below)
  • Java namespace migration from javax.* to jakarta.* for Spring Boot 3.x (illustrated after the file summary)
  • XGBoost 2.0 API adjustments (file format parameters, ArrayRow constructor changes; see the second sketch below)
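
A minimal sketch of the two Scala 2.13 collection changes behind most of the .toSeq/.toMap edits (the object and field names here are invented for illustration, not taken from the MLeap diff):

import scala.collection.mutable

object Scala213CollectionsSketch {
  // In 2.13, mapValues returns a lazy MapView, so an explicit .toMap is
  // needed wherever a strict Map (for example a case-class field) is expected.
  val weights: Map[String, Double] = Map("a" -> 1.0, "b" -> 2.0)
  val doubled: Map[String, Double] = weights.view.mapValues(_ * 2).toMap

  // In 2.13 the unqualified Seq is immutable (in 2.12 it meant collection.Seq),
  // so mutable or array-backed collections now need an explicit .toSeq at
  // call sites that expect a Seq.
  def describe(values: Seq[Double]): String = values.mkString(",")
  val fromBuffer: String = describe(mutable.ArrayBuffer(1.0, 2.0).toSeq)
}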
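
A similarly minimal sketch of the XGBoost 2.0 text-input change, assuming xgboost4j's Scala DMatrix and a placeholder path (the ArrayRow constructor fix is internal to MLeap and not shown here):

import ml.dmlc.xgboost4j.scala.DMatrix

object XgbFormatSketch {
  // XGBoost 2.x no longer auto-detects text input formats, so the format is
  // carried as a URI parameter; the path below is only a placeholder.
  val train: DMatrix = new DMatrix("/tmp/agaricus.txt.train?format=libsvm")
}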

Reviewed changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 4 comments.

Summary per file:

  • project/Dependencies.scala: Updated version numbers for core dependencies and supporting libraries
  • project/Common.scala: Changed the Scala version to 2.13.16 and added Java 17 compiler options (see the build-settings sketch below)
  • mleap-xgboost-spark test files: Refactored implicit imports to avoid Spark 4.0 compilation issues
  • mleap-xgboost-runtime/CachedDatasetUtils.scala: Added the XGBoost 2.0 format parameter and fixed the ArrayRow constructor
  • mleap-tensorflow files: Added ClassTag imports and .toSeq conversions for Scala 2.13
  • mleap-spring-boot files: Migrated from javax to jakarta namespaces for Spring Boot 3.x (see the import sketch below)
  • mleap-spark-testkit/SparkParityBase.scala: Adjusted Spark configuration and added a .toSet conversion
  • mleap-spark-extension/OneVsRest.scala: Removed parallel processing from Range.map
  • mleap-spark-base files: Updated DataFrame API usage for Spark 4.0
  • mleap-runtime files: Extensive collection API updates for Scala 2.13 compatibility
  • mleap-core files: Added .toMap conversions and type annotations for Scala 2.13
  • mleap-benchmark/SparkTransformBenchmark.scala: Removed a deprecated config parameter
  • mleap-avro files: Collection API conversions throughout value and schema converters
  • bundle-ml files: Collection API fixes and empty parameter list syntax updates
  • bundle-hdfs/HadoopBundleFileSystem.scala: Added a .toSeq conversion for the config list
  • README.md: Updated artifact IDs and added a version compatibility row
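
A sketch of the kind of sbt settings the project/Common.scala change involves (illustrative only; the exact option spellings in the PR may differ):

import sbt._
import sbt.Keys._

object CommonSketch {
  // Pin the Scala version and target the Java 17 class-file/API level.
  lazy val java17Settings: Seq[Setting[_]] = Seq(
    scalaVersion := "2.13.16",
    scalacOptions ++= Seq("-release", "17"),
    javacOptions ++= Seq("--release", "17")
  )
}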
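
And the shape of the javax to jakarta migration in mleap-spring-boot, sketched against a servlet type (the helper below is hypothetical; the actual files may touch other jakarta packages):

// Before (Spring Boot 2.x, Java EE namespace):
//   import javax.servlet.http.HttpServletRequest
// After (Spring Boot 3.x, Jakarta EE namespace):
import jakarta.servlet.http.HttpServletRequest

object JakartaMigrationSketch {
  // Hypothetical helper, shown only to exercise the new import.
  def clientAddress(request: HttpServletRequest): String = request.getRemoteAddr
}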


/** Object for serializing/deserializing model definitions with JSON.
*/
- case class JsonFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer {
+ case class JsonFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer {

Copilot AI commented on Dec 16, 2025:

Empty parameter lists () on case classes are unnecessary in Scala. The empty parentheses can be removed: case class JsonFormatModelSerializer(implicit hr: HasBundleRegistry)

/** Object for serializing/deserializing model definitions with Protobuf.
*/
- case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer {
+ case class ProtoFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer {

Copilot AI commented on Dec 16, 2025:

Empty parameter lists () on case classes are unnecessary in Scala. The empty parentheses can be removed: case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry)

Suggested change:
- case class ProtoFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer {
+ case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer {

.collect(Collectors.toList())
.asScala
- .map(removeElement)
+ .map(removeElement(_))

Copilot AI commented on Dec 16, 2025:

The explicit underscore in .map(removeElement(_)) is redundant. Scala can infer this automatically, so it can be simplified to .map(removeElement)

Suggested change:
- .map(removeElement(_))
+ .map(removeElement)

namedIndices: Array[(String, Int)] = Array(),
inputSize: Int) extends Model {
- val allIndices: Array[Int] = indices.union(namedIndices.map(_._2))
+ val allIndices: Array[Int] = (indices.union(namedIndices.map(_._2))).toArray

Copilot AI commented on Dec 16, 2025:

The parentheses around indices.union(namedIndices.map(_._2)) are unnecessary. The .toArray call can be applied directly: val allIndices: Array[Int] = indices.union(namedIndices.map(_._2)).toArray

Suggested change:
- val allIndices: Array[Int] = (indices.union(namedIndices.map(_._2))).toArray
+ val allIndices: Array[Int] = indices.union(namedIndices.map(_._2)).toArray
