Conversation

@baranylcn
Member

@baranylcn baranylcn commented Nov 2, 2025

  • Implemented automatic target role detection from resume text.
  • Added competency gap analysis comparing resume content against role-specific requirements.
  • Displayed missing skills, mismatched experience, and overall suitability insights in the Streamlit UI.
  • Updated prompt structure and UI layout for clearer, structured feedback.

Summary by CodeRabbit

  • New Features

    • Tabbed interface for resume analysis (Summary, Fit & Gaps, Competencies, Domains, Insights, Recommendations).
    • End-to-end PDF resume analysis flow with language selection and Analyze action.
    • Expanded analysis output including missing skills, experience mismatches, and role suitability details.
  • Improvements

    • Enhanced role suitability scoring and tabular summary presentation.
    • More robust PDF text extraction, JSON parsing, error messages, and safer handling of missing data.

@coderabbitai

coderabbitai bot commented Nov 2, 2025

Walkthrough

Refactors levelup/app.py to add JSON extraction from LLM output, a new analyzecv_pdf_withllm pipeline, and modular tabbed UI renderers; enhances PDF per-page text extraction, safe nested-access helpers, and explicit Dict[str, Any] typing. Updates levelup/prompts.py to a concise prompt builder and expanded JSON schema (missing_skills, mismatched_experience, role_suitability).

Changes

Cohort / File(s) Summary
Core app refactor
levelup/app.py
Added _extract_json_block and _safe_dict; introduced analyzecv_pdf_withllm(text, report_language) that calls the LLM, extracts/parses JSON, and returns Dict[str, Any]
Prompt structure update
levelup/prompts.py
Rewrote get_resume_analysis_prompt docstring and prompt body to be a concise prompt builder. Expanded the JSON output schema to include missing_skills (with priority), mismatched_experience, and role_suitability (roles + scores). Reorganized sections (language detection, domain matching, competencies, insights, development recommendations, COMPARATIVE BENCHMARKING, overall summary) and enforced strict JSON-only output in the prompt.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant UI as Streamlit UI
    participant App as app.py
    participant LLM as LLM Service
    participant Parser as JSON Extractor

    User->>UI: Upload PDF + Select Language
    UI->>App: Call analyzecv_pdf_withllm(text, language)
    App->>LLM: Send crafted prompt with CV text
    LLM-->>App: Return raw LLM output
    App->>Parser: _extract_json_block(raw_output)
    Parser-->>App: Extracted JSON string
    App->>App: Parse & validate JSON -> Dict[str, Any]
    alt Valid JSON
        App-->>UI: Return parsed result
        UI->>App: Call display_analysis_tabs(result)
        App->>UI: Render display_summary_block()
        App->>UI: Create tabs and render:
        App->>UI: - Fit & Gaps
        App->>UI: - Competencies
        App->>UI: - Domains
        App->>UI: - Insights
        App->>UI: - Recommendations
    else Invalid JSON
        App-->>UI: Return None + error message
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • Files needing focused review:
    • levelup/app.py: JSON extraction/parsing correctness, LLM invocation, error handling, and tab orchestration.
    • levelup/prompts.py: Ensure prompt schema matches parser expectations (field names/types) and that strict JSON instructions are consistent.
  • Also verify DataFrame rendering in the summary block, safe-dict helper behavior, and PDF per-page text accumulation edge cases.
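
One way to check that the prompt schema and the parser agree is a small field check run on the parsed result. The sketch below is illustrative only: the three field names come from the walkthrough above, but the assumption that each one is a JSON list, and the helper name schema_problems, are hypothetical and not taken from levelup/app.py.

```python
from typing import Any

# Field names as listed in the walkthrough; treating each as a list is an assumption.
EXPECTED_LIST_FIELDS = ("missing_skills", "mismatched_experience", "role_suitability")


def schema_problems(result: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the shape looks right."""
    problems: list[str] = []
    for field in EXPECTED_LIST_FIELDS:
        if field not in result:
            problems.append(f"missing field: {field}")
        elif not isinstance(result[field], list):
            kind = type(result[field]).__name__
            problems.append(f"{field} should be a list, got {kind}")
    return problems
```

Running this against a few recorded LLM responses would surface field-name drift between levelup/prompts.py and the UI renderers before it reaches users.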

Poem

🐰 I hopped through pages, parsed each line,
JSON carrots stacked in tidy design,
Tabs unroll where insights glow,
Gaps and strengths now clearly show,
A little hop — your resume grows! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main changes: role detection, gap analysis, and UI optimization are all prominently implemented in the changeset.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b906b5 and 257609f.

📒 Files selected for processing (2)
  • levelup/app.py (4 hunks)
  • levelup/prompts.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
levelup/app.py (1)
levelup/prompts.py (1)
  • get_resume_analysis_prompt (1-107)

Comment on lines +185 to +191
df = pd.DataFrame(
[
{"Role": r.get("role", ""), "Score": r.get("score", "")}
for r in role_suit
]
)
st.dataframe(df, width="stretch")

⚠️ Potential issue | 🔴 Critical

Fix invalid st.dataframe width usage

Passing width="stretch" raises StreamlitAPIException on Streamlit versions where the width parameter of st.dataframe accepts only an integer or None, so on those versions the app will crash whenever these DataFrames render. Switch to use_container_width=True (or drop the argument) everywhere this pattern appears.

-        st.dataframe(df, width="stretch")
+        st.dataframe(df, use_container_width=True)
@@
-            st.dataframe(pd.DataFrame(rows), width="stretch")
+            st.dataframe(pd.DataFrame(rows), use_container_width=True)
@@
-        st.dataframe(df, width="stretch")
+        st.dataframe(df, use_container_width=True)
@@
-        st.dataframe(df, width="stretch")
+        st.dataframe(df, use_container_width=True)

Also applies to: 200-208, 232-244, 258-268

🤖 Prompt for AI Agents
In levelup/app.py around lines 185-191 (and also update the other affected
occurrences at 200-208, 232-244, 258-268), the st.dataframe calls pass
width="stretch" which raises StreamlitAPIException; remove the invalid width
argument and replace it with use_container_width=True (or omit the width
argument entirely) for each DataFrame rendering so the calls become
st.dataframe(df, use_container_width=True).

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (4)
levelup/app.py (4)

175-181: Fix the width="stretch" parameter (duplicate issue).

This width="stretch" parameter on st.dataframe was already flagged in the previous review and will cause a StreamlitAPIException at runtime. This specific instance at line 181 was noted in the prior comment covering lines 185-191.

Apply this diff:

-        st.dataframe(df, width="stretch")
+        st.dataframe(df, use_container_width=True)

190-197: Fix the width="stretch" parameter (duplicate issue).

Another instance of the width="stretch" parameter that will crash at runtime. This was flagged in previous reviews covering lines 200-208.

Apply this diff:

-            st.dataframe(pd.DataFrame(rows), width="stretch")
+            st.dataframe(pd.DataFrame(rows), use_container_width=True)

222-233: Fix the width="stretch" parameter (duplicate issue).

Yet another width="stretch" that needs correction. This was flagged in the prior review covering lines 232-244.

Apply this diff:

-        st.dataframe(df, width="stretch")
+        st.dataframe(df, use_container_width=True)

248-258: Fix the width="stretch" parameter (duplicate issue).

Final instance of the width="stretch" error. This specific line 258 was in the range 258-268 flagged previously.

Apply this diff:

-        st.dataframe(df, width="stretch")
+        st.dataframe(df, use_container_width=True)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 257609f and 386a015.

📒 Files selected for processing (1)
  • levelup/app.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
levelup/app.py (1)
levelup/prompts.py (1)
  • get_resume_analysis_prompt (1-107)
🪛 GitHub Actions: CI
levelup/app.py

[error] 59-59: mypy: Return value expected. Found 1 error in 1 file (checked 13 source files).

🔇 Additional comments (8)
levelup/app.py (8)

24-33: LGTM: Improved text extraction with newline separators.

The refactored PDF extraction now joins page text with newlines, which better preserves document structure for downstream analysis.


36-43: LGTM: Robust JSON extraction strategy.

The two-stage fallback (fenced block → raw braces) handles common LLM output variations effectively.


69-71: LGTM: Useful defensive helper.

The _safe_dict utility provides safe nested dictionary access with appropriate fallback.


74-149: LGTM: Enhanced type safety and defensive data access.

The updated display functions now have explicit type signatures and robust defaults for missing or malformed data, improving reliability when handling LLM responses.


263-271: LGTM: Well-structured insights layout.

The 3:2 column ratio and string validation for benchmarking data provide a clean presentation.


274-281: LGTM: Clear recommendations display.

Simple and effective presentation with appropriate fallback messaging.


284-304: LGTM: Clean tabbed UI orchestration.

The modular tab structure makes the UI easy to extend and maintain, with clear separation between the summary block and tabbed sections.


307-331: LGTM: Intuitive main UI flow.

The upload → configure → analyze → display flow is clear and user-friendly, with appropriate loading feedback.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (5)
levelup/app.py (5)

57-61: Fix the bare return statement (critical mypy error).

Line 59 still has a bare return statement, but the function signature requires dict[str, Any] | None. This was flagged in a previous review but remains unfixed, causing the mypy pipeline failure.

Apply this diff:

     if not isinstance(data := json.loads(json_str), dict):
         st.error("Invalid JSON object.")
-        return
+        return None

Additionally, the cast on line 61 is redundant since you've already verified data is a dict via isinstance on line 57:

-        return cast(dict[str, Any], data)
+        return data
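
For reference, a minimal standalone sketch of a parse step in which every branch returns either a dict or an explicit None. The function name parse_json_object and the report_error callback are hypothetical stand-ins (the callback takes the place of st.error so the sketch runs without Streamlit); this is not the code in levelup/app.py.

```python
import json
from typing import Any, Callable, Optional


def parse_json_object(
    json_str: str,
    report_error: Callable[[str], None] = print,  # stand-in for st.error
) -> Optional[dict[str, Any]]:
    """Parse json_str and return it only if it decodes to a JSON object."""
    try:
        data = json.loads(json_str)
    except json.JSONDecodeError:
        report_error("Could not parse JSON.")
        return None
    if not isinstance(data, dict):
        report_error("Invalid JSON object.")
        return None  # explicit None matches the Optional return annotation
    return data  # isinstance has already narrowed the type, so no cast is needed
```

With the explicit return None in place, mypy should no longer report "Return value expected" for a function shaped like this.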

172-181: Fix invalid width="stretch" parameter (critical Streamlit error).

Line 181 uses width="stretch", which causes a StreamlitAPIException because st.dataframe only accepts integers or None for the width parameter. This was flagged in a previous review but remains unfixed and will crash the app when role suitability data is displayed.

Apply this diff:

-        st.dataframe(df, width="stretch")
+        st.dataframe(df, use_container_width=True)

197-197: Fix invalid width="stretch" parameter (critical Streamlit error).

Line 197 uses width="stretch", causing a StreamlitAPIException. This was flagged in a previous review (lines 200-208 reference) but remains unfixed.

Apply this diff:

-            st.dataframe(pd.DataFrame(rows), width="stretch")
+            st.dataframe(pd.DataFrame(rows), use_container_width=True)

233-233: Fix invalid width="stretch" parameter (critical Streamlit error).

Line 233 uses width="stretch", causing a StreamlitAPIException. This was flagged in a previous review (lines 232-244 reference) but remains unfixed.

Apply this diff:

-        st.dataframe(df, width="stretch")
+        st.dataframe(df, use_container_width=True)

258-258: Fix invalid width="stretch" parameter (critical Streamlit error).

Line 258 uses width="stretch", causing a StreamlitAPIException. This was flagged in a previous review (lines 258-268 reference) but remains unfixed.

Apply this diff:

-        st.dataframe(df, width="stretch")
+        st.dataframe(df, use_container_width=True)
🧹 Nitpick comments (1)
levelup/app.py (1)

69-72: LGTM with optional enhancement suggestion.

The helper correctly guards against missing or non-dict values. If you need flexibility in the default return value, consider adding a parameter:

-def _safe_dict(obj: dict[str, Any], key: str) -> dict[str, Any]:
+def _safe_dict(obj: dict[str, Any], key: str, default: dict[str, Any] | None = None) -> dict[str, Any]:
+    if default is None:
+        default = {}
     val = obj.get(key, {})
-    return val if isinstance(val, dict) else {}
+    return val if isinstance(val, dict) else default
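
A standalone sketch of how the suggested variant would behave, assuming the signature proposed in the diff above. The sample keys (overall_summary, insights, level) are invented for illustration and may not match the real schema.

```python
from typing import Any, Optional


def safe_dict(
    obj: dict[str, Any], key: str, default: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Return obj[key] if it is a dict, otherwise the default (or an empty dict)."""
    val = obj.get(key)
    if isinstance(val, dict):
        return val
    return {} if default is None else default


# Hypothetical LLM result: one well-formed section, one malformed section.
result = {"overall_summary": {"level": "Senior"}, "insights": "n/a"}

summary = safe_dict(result, "overall_summary")        # the nested dict
insights = safe_dict(result, "insights")              # wrong type, falls back to {}
fallback = safe_dict(result, "missing", {"level": "?"})  # absent key, uses default
```

Using None as the sentinel avoids sharing one mutable default dict across calls.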
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 386a015 and 0aca3d1.

📒 Files selected for processing (1)
  • levelup/app.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
levelup/app.py (1)
levelup/prompts.py (1)
  • get_resume_analysis_prompt (1-107)
🔇 Additional comments (12)
levelup/app.py (12)

4-4: LGTM!

The new imports are appropriate for the added functionality: cast for type narrowing and pandas for DataFrame rendering. The type ignore comment on pandas is correct since it lacks inline type stubs.

Also applies to: 7-7


27-30: LGTM!

The refactored per-page text extraction with list accumulation and newline joining is more efficient and preserves page structure better than string concatenation.
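
The accumulate-then-join pattern can be sketched without a PDF library. Here join_page_texts is a hypothetical helper that takes the per-page strings a PDF reader would produce; treating None or empty strings as "no text on this page" is an assumption about the reader's behavior.

```python
from typing import Iterable, Optional


def join_page_texts(page_texts: Iterable[Optional[str]]) -> str:
    """Collect non-empty page texts in a list and join them with newlines."""
    parts: list[str] = []
    for text in page_texts:
        if text:  # skip pages that yielded None or an empty string
            parts.append(text.strip())
    return "\n".join(parts)
```

Appending to a list and joining once avoids repeated string concatenation, and the newline keeps a visible boundary between pages.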


36-43: LGTM!

The JSON extraction logic correctly handles both fenced code blocks (with optional json tag) and bare JSON objects. The two-tier matching strategy (fenced first, then bare braces) provides good robustness for various LLM output formats.
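
A minimal sketch of the two-tier strategy described here, assuming a regex for the fenced case and a first-brace-to-last-brace slice for the bare case. The function name and the exact pattern are assumptions, not the implementation in levelup/app.py.

```python
import re
from typing import Optional

FENCE = "`" * 3  # the three-backtick code-fence marker, built indirectly

# Tier 1: a fenced block with an optional "json" tag around a JSON object.
_FENCED = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)


def extract_json_block(raw: str) -> Optional[str]:
    """Return the JSON object text found in raw LLM output, or None."""
    fenced = _FENCED.search(raw)
    if fenced:
        return fenced.group(1)
    # Tier 2: slice from the first "{" to the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        return raw[start : end + 1]
    return None
```

The returned string still has to go through json.loads; the slice in the second tier can capture non-JSON text if the model wraps braces in prose, so a parse failure should be treated as an expected outcome.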


79-92: LGTM!

The display functions demonstrate good defensive programming:

  • Consistent use of .get() with sensible defaults
  • The or [] pattern safely handles None values from JSON
  • Line 126's type check prevents displaying non-string benchmarking data

Also applies to: 95-109, 117-120, 123-126, 129-139
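
The reason the or [] pattern matters: dict.get(key, []) only applies its default when the key is absent, so a JSON null still comes back as None. A small sketch, with list_field as a hypothetical helper name:

```python
from typing import Any


def list_field(data: dict[str, Any], key: str) -> list[Any]:
    """Return data[key] as a list, treating missing, null, or non-list values as empty."""
    value = data.get(key) or []  # covers both a missing key and an explicit null
    return value if isinstance(value, list) else []
```

The extra isinstance check goes slightly beyond the pattern noted above; it guards against the LLM returning a string where a list was requested.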


152-171: LGTM!

The summary block layout is well-structured:

  • Clean three-column metrics display
  • Good use of _safe_dict for safe access
  • Conditional markdown rendering for strengths/areas avoids empty lists

184-196: LGTM with good defensive programming.

The two-column layout clearly separates missing skills from mismatched experience. Line 193's str() cast is a good defensive measure to handle non-string priority values.

Also applies to: 198-208


210-232: LGTM with excellent defensive programming.

The competencies tab demonstrates robust error handling:

  • Lines 215-219: Safe float conversion with try-except fallback
  • Line 221: Sorting by score improves readability
  • Line 235: Helpful info message when no data is available

Also applies to: 234-236
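
A sketch of the safe conversion plus sort, under the assumption that each competency row is a dict with a score field that may arrive as a number, a numeric string, or null. The names to_score, rows, and the sample data are hypothetical.

```python
from typing import Any


def to_score(value: Any, default: float = 0.0) -> float:
    """Convert value to float, falling back to default when it cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


# Illustrative rows; the real field names may differ.
rows = [
    {"name": "SQL", "score": "7.5"},
    {"name": "Python", "score": 9},
    {"name": "Go", "score": None},
]

# Highest score first; unparseable scores sink to the bottom via the default.
ranked = sorted(rows, key=lambda r: to_score(r.get("score")), reverse=True)
```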


238-257: LGTM with robust error handling.

The domains tab handles sorting failures gracefully (lines 243-247), falling back to the original order if score extraction fails. The info message on line 260 provides clear feedback when data is absent.

Also applies to: 259-261
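
The fallback-to-original-order behavior can be sketched as an all-or-nothing sort: if any score fails to convert, the input order is kept. This is one assumed reading of the behavior; sort_by_score and the score key are hypothetical names.

```python
from typing import Any


def sort_by_score(items: list[Any]) -> list[Any]:
    """Sort dict items by numeric score, or return them unchanged if any score is unusable."""
    try:
        return sorted(items, key=lambda d: float(d["score"]), reverse=True)
    except (KeyError, TypeError, ValueError):
        # A missing key, a non-dict item, or a non-numeric score: keep original order.
        return list(items)
```

Unlike a per-item default, this approach never shows a partially trusted ranking, at the cost of losing the sort entirely when one entry is malformed.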


263-272: LGTM!

The insights tab has a clean two-column layout with appropriate proportions. Line 271's defensive check ensures only valid, non-empty strings are displayed.


274-282: LGTM!

The recommendations tab has a clean, straightforward implementation with appropriate fallback messaging.


284-304: LGTM!

The tab orchestration is well-structured with clear separation of concerns. Displaying the summary block before the tabs provides users with a high-level overview before diving into details.


309-331: LGTM!

The main application flow is clean and intuitive:

  • Sequential steps guide the user through the process
  • Spinner provides feedback during analysis
  • Conditional result display ensures results are only shown when available

@baranylcn baranylcn merged commit 8115e53 into main Nov 4, 2025
6 checks passed
@baranylcn baranylcn deleted the feat/role-gap-analysis-ui branch November 23, 2025 09:27
@coderabbitai coderabbitai bot mentioned this pull request Dec 7, 2025

3 participants