-
Notifications
You must be signed in to change notification settings - Fork 315
Upgrade to Scala 2.13.16, Spark 4.0.1, Java 17, and XGBoost 2.0.3 #887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Thanks for the PR. Also: |
|
Hey @Urvashi2303 I have created a PR to added github action and fix the python issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR upgrades MLeap to modern versions across the technology stack: Scala 2.13.16, Apache Spark 4.0.1, Java 17, and XGBoost 2.0.3. The changes address compatibility requirements and API changes introduced by these major version upgrades.
Key Changes:
- Core dependency upgrades in build configuration (Scala, Spark, Java, XGBoost, Spring Boot, Logback, Hadoop)
- Scala 2.13 collection API compatibility fixes throughout the codebase (
.toSeq,.toMapconversions) - Java namespace migration from
javax.*tojakarta.*for Spring Boot 3.x - XGBoost 2.0 API adjustments (file format parameters, ArrayRow constructor changes)
Reviewed changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| project/Dependencies.scala | Updated version numbers for core dependencies and supporting libraries |
| project/Common.scala | Changed Scala version to 2.13.16 and added Java 17 compiler options |
| mleap-xgboost-spark test files | Refactored implicit imports to avoid Spark 4.0 compilation issues |
| mleap-xgboost-runtime/CachedDatasetUtils.scala | Added XGBoost 2.0 format parameter and fixed ArrayRow constructor |
| mleap-tensorflow files | Added ClassTag imports and .toSeq conversions for Scala 2.13 |
| mleap-spring-boot files | Migrated from javax to jakarta namespaces for Spring Boot 3.x |
| mleap-spark-testkit/SparkParityBase.scala | Adjusted Spark configuration and added .toSet conversion |
| mleap-spark-extension/OneVsRest.scala | Removed parallel processing from Range.map |
| mleap-spark-base files | Updated DataFrame API usage for Spark 4.0 |
| mleap-runtime files | Extensive collection API updates for Scala 2.13 compatibility |
| mleap-core files | Added .toMap conversions and type annotations for Scala 2.13 |
| mleap-benchmark/SparkTransformBenchmark.scala | Removed deprecated config parameter |
| mleap-avro files | Collection API conversions throughout value and schema converters |
| bundle-ml files | Collection API fixes and empty parameter list syntax updates |
| bundle-hdfs/HadoopBundleFileSystem.scala | Added .toSeq conversion for config list |
| README.md | Updated artifact IDs and added version compatibility row |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /** Object for serializing/deserializing model definitions with JSON. | ||
| */ | ||
| case class JsonFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer { | ||
| case class JsonFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer { |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty parameter lists () on case classes are unnecessary in Scala. The empty parentheses can be removed: case class JsonFormatModelSerializer(implicit hr: HasBundleRegistry)
| /** Object for serializing/deserializing model definitions with Protobuf. | ||
| */ | ||
| case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer { | ||
| case class ProtoFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer { |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty parameter lists () on case classes are unnecessary in Scala. The empty parentheses can be removed: case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry)
| case class ProtoFormatModelSerializer()(implicit hr: HasBundleRegistry) extends FormatModelSerializer { | |
| case class ProtoFormatModelSerializer(implicit hr: HasBundleRegistry) extends FormatModelSerializer { |
| .collect(Collectors.toList()) | ||
| .asScala | ||
| .map(removeElement) | ||
| .map(removeElement(_)) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The explicit underscore in .map(removeElement(_)) is redundant. Scala can infer this automatically, so it can be simplified to .map(removeElement)
| .map(removeElement(_)) | |
| .map(removeElement) |
| namedIndices: Array[(String, Int)] = Array(), | ||
| inputSize: Int) extends Model { | ||
| val allIndices: Array[Int] = indices.union(namedIndices.map(_._2)) | ||
| val allIndices: Array[Int] = (indices.union(namedIndices.map(_._2))).toArray |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parentheses around indices.union(namedIndices.map(_._2)) are unnecessary. The .toArray call can be applied directly: val allIndices: Array[Int] = indices.union(namedIndices.map(_._2)).toArray
| val allIndices: Array[Int] = (indices.union(namedIndices.map(_._2))).toArray | |
| val allIndices: Array[Int] = indices.union(namedIndices.map(_._2)).toArray |
Summary
This PR upgrades MLeap to Scala 2.13.16, Apache Spark 4.0.1, Java 17 and XGBoost 2.0.3. This brings the project up to date with the latest and stable versions across the entire stack.
Version Upgrades
Core Dependencies:
Supporting Libraries:
Build and Test Summary
Build Results:
Test Results:
mleap-benchmark
Pass
mleap-xgboost-runtime
Tests: 22 | Pass: 22 | Fail: 0
mleap-executor-tests
Tests: 31 | Pass: 31 | Fail: 0
root tests : mleap-base, mleap-tensor, bundle-ml, mleap-core, mleap-runtime, mleap-avro, mleap-spark
Tests: 8262 | Pass: 8261 | Fail: 1 (SparkTransformBuilderSpec fails with RemoteClassLoaderError in local mode)
Aborted(All these are local test environment limitations):