feat: Implement Sample Paper Generator with format detection and question generation by shubhamxdd · Pull Request #8 · shubhamxdd/pyq-appl

shubhamxdd · 2026-05-18T20:01:25Z

Added new papers router for handling paper-related endpoints.
Introduced detect-format endpoint to analyze past papers and extract question formats.
Created generate_paper_task for background processing of paper generation using LLM.
Developed frontend components for paper generation, including resource selection and format detection.
Integrated new API calls in the frontend for managing papers and outputs.
Enhanced UI with loading states and user feedback for paper generation process.
Added Pydantic schemas for paper creation and output management.

Summary by CodeRabbit

New Features
- Sample paper generator with background generation, job status, rename/delete, and recent generations history
- AI-driven format detection for past papers; quick auto-populate when creating papers
- PDF exports: full study guide and questions-only downloads
- Generator page (protected) with paper creation, polling, and paper output viewing
UX
- Toggle controls to show/hide answers and explanations; new checkbox component
- Persistent resource selections and URL deep-linking for solver sessions
Limits
- Free-plan monthly quotas enforced with clear UI behavior and messages

…tion generation - Added new `papers` router for handling paper-related endpoints. - Introduced `detect-format` endpoint to analyze past papers and extract question formats. - Created `generate_paper_task` for background processing of paper generation using LLM. - Developed frontend components for paper generation, including resource selection and format detection. - Integrated new API calls in the frontend for managing papers and outputs. - Enhanced UI with loading states and user feedback for paper generation process. - Added Pydantic schemas for paper creation and output management.

coderabbitai · 2026-05-18T20:01:35Z

📝 Walkthrough

Walkthrough

This PR delivers Phase 5 Sample Paper Generator: adds LLM prompts and schemas, new /papers REST API and format-detection endpoint, background Arq worker to generate and persist paper outputs, DB migration and model changes for selected resources, PDF generation/upload, resources quota/enqueue updates, and a new React Generator page plus frontend client and routing.

Changes

Paper Generation Feature

Layer / File(s)	Summary
LLM prompts & paper schemas `backend/app/llm/client.py`, `backend/app/llm/prompts.py`, `backend/app/schemas/paper.py`	Adds `DETECT_FORMAT_PROMPT` and `GENERATE_PAPER_PROMPT` and Pydantic models for paper creation, resource linking, format detection, outputs, and question structure.
Papers API endpoints `backend/app/main.py`, `backend/app/routers/papers.py`	Registers `papers` router and implements endpoints: POST `/papers/detect-format`, POST `/papers` (create + enqueue + Job), GET `/papers`, GET `/papers/{id}`, PATCH `/papers/{id}`, DELETE `/papers/{id}`, GET `/papers/{id}/output`, PATCH `/papers/{id}/output`, and GET `/papers/{id}/pdf` with ownership checks and error handling.
Background worker & jobs `backend/app/workers/arq_worker.py`, `backend/app/workers/tasks.py`	Registers `generate_paper_task` and extends `extraction_task`/`generate_paper_task` to accept `job_id`, update Job lifecycle (queued→running→done/failed), stream LLM output, parse JSON, persist `PaperOutput`, and handle cancellation/errors.
PDF service & template `backend/app/services/pdf.py`, `backend/app/templates/paper.html`	Adds Jinja2 + WeasyPrint-based `generate_paper_pdf` and `paper.html` template; includes option-letter filter and returns PDF bytes for upload.
DB migration, model & solver sync `backend/alembic/versions/983ad2983d5c_add_selected_resource_ids_to_chat_.py`, `backend/app/models/chat_session.py`, `backend/app/routers/solver.py`, `backend/app/schemas/chat_session.py`	Adds `selected_resource_ids` JSON column, model field, and persists/exposes selection in solver endpoints/schemas.
Resources enqueue & quota `backend/app/routers/resources.py`	Enforces per-user resource quota for free plan on upload, creates ingestion Job records, enqueues `extraction_task` with `job_id`, and ensures Redis pool cleanup; retry path updated similarly.
Frontend Generator & client `frontend/src/App.tsx`, `frontend/src/api/papers.ts`, `frontend/src/components/ui/checkbox.tsx`, `frontend/src/pages/Generator.tsx`	Adds `Generator` page and route, `papersApi` client, a `Checkbox` component, Generator UI with detect-format, create dialog, polling, recent generations sidebar, output toggles, and PDF export/regenerate flows.
Frontend Solver updates `frontend/src/api/solver.ts`, `frontend/src/pages/Solver.tsx`	Extends `ChatSession` type and `updateSession` to include `selected_resource_ids`, derives active session from URL, navigates on create/delete, syncs/persists selected resources.
Config, docs & UI utility `backend/app/config.py`, `context/*`, `frontend/src/index.css`, `todo.md`	Adds `PAPERS_LIMIT`/`RESOURCES_LIMIT`, updates architecture/progress/docs, adds `.no-scrollbar` CSS utility, and updates todo and progress tracker content.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

shubhamxdd/pyq-appl#3: Related resource extraction task and pipeline that this PR consumes for paper generation.
shubhamxdd/pyq-appl#4: Related OpenRouter streaming integration referenced by LLM client usage in this PR.

Poem

🐰 I nibble prompts and stitch the paper tight,
From context chunks I bloom questions in the night,
Background hops and queues spin out the draft,
Toggle answers on — the classroom laughs,
Phase Five hops forward; soon we ship with light.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 21.88% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main feature added: a Sample Paper Generator with format detection and question generation capabilities, which aligns with the substantial new papers router, LLM-based generation task, and related frontend components.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/paper-gen

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 7

🧹 Nitpick comments (1)

backend/app/schemas/paper.py (1)

8-8: ⚡ Quick win

Constrain contract strings to explicit enums/literals.

role, delivery_mode, status, and question type are open strings right now. Invalid values can pass validation and break worker/API assumptions later. Use Literal[...]/Enums at schema level to fail fast on bad input.

Proposed contract tightening

+from typing import Literal
+
 class PaperResourceLink(BaseModel):
     resource_id: UUID
-    role: str # syllabus | past_paper | notes
+    role: Literal["syllabus", "past_paper", "notes"]

 class PaperCreate(BaseModel):
@@
-    delivery_mode: str = "background"
+    delivery_mode: Literal["background"] = "background"

 class PaperUpdate(BaseModel):
     title: Optional[str] = None
-    status: Optional[str] = None
+    status: Optional[Literal["pending", "generating", "done", "failed"]] = None
@@
 class PaperQuestion(BaseModel):
-    type: str # mcq | short | long
+    type: Literal["mcq", "short", "long"]

Also applies to: 14-14, 18-18, 37-37

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/schemas/paper.py` at line 8, The schema currently uses free-form
strings for role, delivery_mode, status and question type (e.g., the field
"role: str" in the Paper model) which lets invalid values pass; change those
fields to constrained types by replacing str with either typing.Literal[...]
(e.g., Literal["syllabus","past_paper","notes"]) or a small Enum class and
update the corresponding fields in the Paper and Question schemas (look for
symbols like Paper, Question, delivery_mode, status, type) to use the new
Literal/Enum types; also add the necessary imports (from typing import Literal
or from enum import Enum) and run quick validation tests to ensure invalid
values now raise schema errors.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/app/routers/papers.py`:
- Around line 67-68: Replace the current except block that does raise
HTTPException(status_code=500, detail=f"Failed to detect format: {str(e)}") with
logic that logs the full exception server-side (e.g., logger.exception(...) or
use the app logger) and raises an HTTPException with a generic client-facing
message like "Failed to detect file format" or "Internal server error" so
internal exception text is not returned; update the except handling in the
papers.py handler where the exception variable e is caught to call
logger.exception(e) and then raise HTTPException(status_code=500, detail="Failed
to detect file format").
- Around line 88-95: The loop currently silently skips invalid resources
(res_result.scalar_one_or_none()) which lets papers be created without required
context; change this to fail fast by raising an HTTPException(status_code=400)
with a clear message when a requested Resource (Resource.id ==
res_link.resource_id, Resource.user_id == current_user.id) is not found/owned,
and after processing data.resources ensure at least one valid resource
remains—if none, raise HTTPException(400) as well; make these checks inside the
paper creation handler where db.execute/select and res_link, res_result are used
so invalid input is rejected immediately.
- Around line 109-111: Wrap the ephemeral Redis pool usage (create_pool and
RedisSettings.from_dsn/ settings.REDIS_URL) around a try/except/finally: call
redis.enqueue_job("generate_paper_task", str(new_paper.id)) inside try, catch
enqueue exceptions to update the paper's status from "pending" to a failure
state (e.g., "failed" or "error") and persist that change, then re-raise or log
as appropriate; in finally ensure the Redis pool is properly closed/shut down
(close + await wait_closed or the equivalent on the pool object) so no
per-request pool leak occurs.

In `@backend/app/workers/tasks.py`:
- Around line 251-266: The parsed questions JSON (questions) must be
structurally validated before creating PaperOutput to avoid persisting malformed
data; before the PaperOutput(...) call, add an explicit validation step that
ensures questions is a list and that each question object contains the required
keys and types (e.g., common fields like id/stem/type, and type-specific fields
such as choices for "multiple_choice" or answer/explanation formats), and fail
fast (raise/log and do not persist) if validation fails; implement this using a
JSON Schema check with jsonschema.validate or a small Pydantic model for
Question and QuestionList and run it in the same scope as tasks.py parsing logic
so only validated data is passed into PaperOutput.
- Around line 288-289: Replace the bare "except: pass" used when marking papers
as failed in backend/app/workers/tasks.py with a proper exception handler: catch
"Exception as err" instead of a bare except and log the error (using the
module's logger or logger.exception/logger.error) including the exception
message and stack trace so failures aren't silently swallowed; do not suppress
the exception silently and only swallow it if there's a clear, logged reason.

In `@frontend/src/pages/Generator.tsx`:
- Around line 103-107: The invalidation in toggleSettingsMutation's onSuccess
uses the closed-over activePaperId which can be stale; change onSuccess to use
the mutation's variables (the second onSuccess arg) instead and call
queryClient.invalidateQueries with ['paper-output', variables.id] (or
variables?.id) so the cache for the actually-updated paper is invalidated
reliably; locate the useMutation call (toggleSettingsMutation, mutationFn) and
update its onSuccess signature to accept (_, variables) and use variables.id for
the queryKey.
- Around line 485-493: The generating placeholder is shown whenever activeOutput
is falsy, which wrongly includes failed papers; update the render condition in
Generator.tsx (the block using outputLoading, activeOutput) to exclude failed
outputs by checking the failure flag/status (e.g., add && !failed or check
activeOutput?.status !== 'failed') so the generating message only appears when
there is no output and the job hasn't failed, and ensure failed cases render the
existing failure UI instead.

---

Nitpick comments:
In `@backend/app/schemas/paper.py`:
- Line 8: The schema currently uses free-form strings for role, delivery_mode,
status and question type (e.g., the field "role: str" in the Paper model) which
lets invalid values pass; change those fields to constrained types by replacing
str with either typing.Literal[...] (e.g.,
Literal["syllabus","past_paper","notes"]) or a small Enum class and update the
corresponding fields in the Paper and Question schemas (look for symbols like
Paper, Question, delivery_mode, status, type) to use the new Literal/Enum types;
also add the necessary imports (from typing import Literal or from enum import
Enum) and run quick validation tests to ensure invalid values now raise schema
errors.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5795654d-275d-4aca-a761-2eb5de6e1bdf

📥 Commits

Reviewing files that changed from the base of the PR and between 342801b and 3c34532.

📒 Files selected for processing (12)

backend/app/llm/client.py
backend/app/llm/prompts.py
backend/app/main.py
backend/app/routers/papers.py
backend/app/schemas/paper.py
backend/app/workers/arq_worker.py
backend/app/workers/tasks.py
context/progress-tracker.md
frontend/src/App.tsx
frontend/src/api/papers.ts
frontend/src/components/ui/checkbox.tsx
frontend/src/pages/Generator.tsx

coderabbitai · 2026-05-18T20:06:14Z

+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Failed to detect format: {str(e)}")


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid leaking internal exception text in API error responses.

At Line 68, returning str(e) can expose provider/internal details. Return a generic message to clients and log the detailed error server-side.

🧰 Tools

🪛 Ruff (0.15.12)

[warning] 67-67: Do not catch blind exception: Exception

(BLE001)

[warning] 68-68: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

[warning] 68-68: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routers/papers.py` around lines 67 - 68, Replace the current except block that does raise HTTPException(status_code=500, detail=f"Failed to detect format: {str(e)}") with logic that logs the full exception server-side (e.g., logger.exception(...) or use the app logger) and raises an HTTPException with a generic client-facing message like "Failed to detect file format" or "Internal server error" so internal exception text is not returned; update the except handling in the papers.py handler where the exception variable e is caught to call logger.exception(e) and then raise HTTPException(status_code=500, detail="Failed to detect file format").

coderabbitai · 2026-05-18T20:06:14Z

+            try:
+                questions = json.loads(clean_json)
+                print(f"✨ [DEBUG] Successfully parsed {len(questions)} questions.")
+            except json.JSONDecodeError as je:
+                print(f"❌ [PARSE ERROR] Failed to parse LLM response as JSON.")
+                print(f"🔍 [DEBUG] Raw response snippet: {clean_json[:500]}...")
+                raise je
+
+            # 5. Save Output
+            print(f"💾 [DEBUG] Saving paper output to database...")
+            new_output = PaperOutput(
+                paper_id=paper.id,
+                questions=questions,
+                include_answers=True,
+                include_explanations=True
+            )


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate generated JSON schema before persisting.

At Line 252 onward, parsed JSON is saved directly. If the model returns malformed shape, bad data is persisted and UI contracts break downstream. Add explicit structural validation (array, required fields per question type) before PaperOutput insert.

🧰 Tools

🪛 Ruff (0.15.12)

[error] 255-255: f-string without any placeholders

Remove extraneous f prefix

(F541)

[error] 260-260: f-string without any placeholders

Remove extraneous f prefix

(F541)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/workers/tasks.py` around lines 251 - 266, The parsed questions JSON (questions) must be structurally validated before creating PaperOutput to avoid persisting malformed data; before the PaperOutput(...) call, add an explicit validation step that ensures questions is a list and that each question object contains the required keys and types (e.g., common fields like id/stem/type, and type-specific fields such as choices for "multiple_choice" or answer/explanation formats), and fail fast (raise/log and do not persist) if validation fails; implement this using a JSON Schema check with jsonschema.validate or a small Pydantic model for Question and QuestionList and run it in the same scope as tasks.py parsing logic so only validated data is passed into PaperOutput.

coderabbitai · 2026-05-18T20:06:14Z

+            except:
+                pass


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove bare except: pass in failure handling.

At Line 288–289, this suppresses errors when marking papers as failed, making stuck states hard to diagnose. Catch Exception as err and log it.

Minimal fix

- except: - pass + except Exception as err: + logger.exception("Failed to mark paper %s as failed: %s", paper_id, err)

🧰 Tools

🪛 Ruff (0.15.12)

[error] 288-288: Do not use bare except

(E722)

[error] 288-289: try-except-pass detected, consider logging the exception

(S110)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/workers/tasks.py` around lines 288 - 289, Replace the bare "except: pass" used when marking papers as failed in backend/app/workers/tasks.py with a proper exception handler: catch "Exception as err" instead of a bare except and log the error (using the module's logger or logger.exception/logger.error) including the exception message and stack trace so failures aren't silently swallowed; do not suppress the exception silently and only swallow it if there's a clear, logged reason.

…unctionality

coderabbitai

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/app/routers/solver.py`:
- Around line 176-185: Persist the validated resource IDs instead of the raw
request list: replace uses of data.resource_ids when assigning
session.selected_resource_ids and when constructing new ChatSession (new_sess)
so they store the filtered/validated IDs that were actually used in the query
(e.g., the variable holding the authorized/exists IDs from the earlier filter),
converted to strings; update the assignment in the existing-session branch
(session.selected_resource_ids = ...) and the ChatSession constructor
(selected_resource_ids=...) to use that validated_id_list rather than
data.resource_ids to avoid saving deleted or unauthorized IDs.

In `@frontend/src/pages/Generator.tsx`:
- Around line 377-405: The output toggle buttons (the Button components that
reference activeOutput and call toggleSettingsMutation.mutate) should be gated
so they cannot fire when no output record exists; check the paper status and
output presence (use condition activePaper?.status === 'done' && !!activeOutput)
and apply it to both the onClick behavior and the disabled prop (or
early-return) so the PATCH is not called for pending/failed papers; update the
Button instances that reference activeOutput, activePaperId, and
toggleSettingsMutation to respect this guard and reflect disabled state
visually.
- Around line 113-123: When the user toggles a resource or changes its role, the
component must clear any previously detected document format so Step 2 won't
remain READY for a stale format; update handleResourceToggle and
handleRoleChange to call the state reset for the detected format (e.g.,
setDetectedFormat(null) or the equivalent state updater you use for the learned
format) after updating selectedResources, ensuring the component also clears any
related "Step 2 ready" status if you have a separate flag (e.g., setStep2Status
or similar). This ensures toggling or role changes invalidate prior detection
before createPaper can submit.
- Around line 217-239: The Detect Pattern button is currently gated by the
original resource type (res.type) so it doesn't appear when the user changes the
role to "past_paper" via the select; update the conditional that renders the
button to check the selected role in selectedResources instead (e.g.,
selectedResources.find(r => r.id === res.id)?.role === 'past_paper') so the
button appears whenever the selected role is past_paper, and ensure
handleDetectFormat(res.id) and isDetecting behavior remain the same.

In `@frontend/src/pages/Solver.tsx`:
- Around line 138-148: The effect that syncs selected resources leaves stale
selections when activeSessionId is set but sessions doesn't yet contain that
session; update the useEffect body (the effect that references activeSessionId,
sessions and calls setSelectedResources) so that when activeSessionId and
sessions are present but sessions.find(...) (activeSession) returns undefined
you explicitly call setSelectedResources([]) — i.e., change the branch for
activeSession being falsy to clear selections instead of doing nothing.
- Around line 68-71: The session switch helper handleSelectSession should be
used as the single source of truth for session navigation instead of directly
calling setActiveSessionId; update both places that render session list buttons
(where they currently call setActiveSessionId(sess.id)) to call
handleSelectSession(sess.id) (which performs navigate(`/solver/${id}`)) so the
URL is updated on click and becomes the canonical session state used by the
component.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1d1444b3-40d5-49b4-88b0-63e607158631

📥 Commits

Reviewing files that changed from the base of the PR and between 3c34532 and 334514d.

📒 Files selected for processing (9)

backend/alembic/versions/983ad2983d5c_add_selected_resource_ids_to_chat_.py
backend/app/models/chat_session.py
backend/app/routers/papers.py
backend/app/routers/solver.py
backend/app/schemas/chat_session.py
frontend/src/App.tsx
frontend/src/api/solver.ts
frontend/src/pages/Generator.tsx
frontend/src/pages/Solver.tsx

✅ Files skipped from review due to trivial changes (1)

backend/alembic/versions/983ad2983d5c_add_selected_resource_ids_to_chat_.py

🚧 Files skipped from review as they are similar to previous changes (1)

frontend/src/App.tsx

coderabbitai · 2026-05-18T21:14:06Z

+        # Update persistent resource selection if changed
+        new_resource_ids = [str(rid) for rid in data.resource_ids]
+        if session.selected_resource_ids != new_resource_ids:
+            session.selected_resource_ids = new_resource_ids
    else:
        # Auto-create session if none provided
-        new_sess = ChatSession(user_id=current_user.id, title=data.content[:30] + "...")
+        new_sess = ChatSession(
+            user_id=current_user.id, 
+            title=data.content[:30] + "...",
+            selected_resource_ids=[str(rid) for rid in data.resource_ids]


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Persist only the validated resource IDs.

data.resource_ids can still contain deleted or unauthorized IDs that were filtered out by the query above. Saving the raw request back into session.selected_resource_ids keeps stale IDs in the session and desyncs the solver UI from the actual context used for the answer.

Proposed fix

- new_resource_ids = [str(rid) for rid in data.resource_ids] + new_resource_ids = [str(res.id) for res in resources] if session.selected_resource_ids != new_resource_ids: session.selected_resource_ids = new_resource_ids @@ new_sess = ChatSession( user_id=current_user.id, title=data.content[:30] + "...", - selected_resource_ids=[str(rid) for rid in data.resource_ids] + selected_resource_ids=[str(res.id) for res in resources] )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routers/solver.py` around lines 176 - 185, Persist the validated resource IDs instead of the raw request list: replace uses of data.resource_ids when assigning session.selected_resource_ids and when constructing new ChatSession (new_sess) so they store the filtered/validated IDs that were actually used in the query (e.g., the variable holding the authorized/exists IDs from the earlier filter), converted to strings; update the assignment in the existing-session branch (session.selected_resource_ids = ...) and the ChatSession constructor (selected_resource_ids=...) to use that validated_id_list rather than data.resource_ids to avoid saving deleted or unauthorized IDs.

coderabbitai · 2026-05-18T21:14:06Z

+  const handleResourceToggle = (id: string) => {
+    setSelectedResources(prev => {
+      const exists = prev.find(r => r.id === id);
+      if (exists) return prev.filter(r => r.id !== id);
+      return [...prev, { id, role: 'notes' }];
+    });
+  };
+
+  const handleRoleChange = (id: string, role: string) => {
+    setSelectedResources(prev => prev.map(r => r.id === id ? { ...r, role } : r));
+  };


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reset the detected format when the selected source changes.

Once pattern detection succeeds, deselecting that resource or changing its role leaves Step 2 in a READY state and createPaper can submit a format learned from a different document.

Proposed fix

const handleResourceToggle = (id: string) => { + setFormatConfig(null); setSelectedResources(prev => { const exists = prev.find(r => r.id === id); if (exists) return prev.filter(r => r.id !== id); return [...prev, { id, role: 'notes' }]; }); }; const handleRoleChange = (id: string, role: string) => { + setFormatConfig(null); setSelectedResources(prev => prev.map(r => r.id === id ? { ...r, role } : r)); };

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const handleResourceToggle = (id: string) => {

setSelectedResources(prev => {

const exists = prev.find(r => r.id === id);

if (exists) return prev.filter(r => r.id !== id);

return [...prev, { id, role: 'notes' }];

});

};

const handleRoleChange = (id: string, role: string) => {

setSelectedResources(prev => prev.map(r => r.id === id ? { ...r, role } : r));

};

const handleResourceToggle = (id: string) => {

setFormatConfig(null);

setSelectedResources(prev => {

const exists = prev.find(r => r.id === id);

if (exists) return prev.filter(r => r.id !== id);

return [...prev, { id, role: 'notes' }];

});

};

const handleRoleChange = (id: string, role: string) => {

setFormatConfig(null);

setSelectedResources(prev => prev.map(r => r.id === id ? { ...r, role } : r));

};

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/pages/Generator.tsx` around lines 113 - 123, When the user toggles a resource or changes its role, the component must clear any previously detected document format so Step 2 won't remain READY for a stale format; update handleResourceToggle and handleRoleChange to call the state reset for the detected format (e.g., setDetectedFormat(null) or the equivalent state updater you use for the learned format) after updating selectedResources, ensuring the component also clears any related "Step 2 ready" status if you have a separate flag (e.g., setStep2Status or similar). This ensures toggling or role changes invalidate prior detection before createPaper can submit.

coderabbitai · 2026-05-18T21:14:06Z

+  // Sync selected resources from active session
+  useEffect(() => {
+    if (activeSessionId && sessions) {
+      const activeSession = sessions.find((s: any) => s.id === activeSessionId);
+      if (activeSession) {
+        setSelectedResources(activeSession.selected_resource_ids || []);
+      }
+    } else {
+      setSelectedResources([]);
+    }
+  }, [activeSessionId, sessions]);


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clear stale selections when the active session is not in the cached list yet.

Right after navigating to a newly-created session, activeSessionId is set before sessions contains that record. In that window this effect preserves the previous session's selectedResources, so the next question can be sent with the wrong documents.

Proposed fix

useEffect(() => { if (activeSessionId && sessions) { const activeSession = sessions.find((s: any) => s.id === activeSessionId); if (activeSession) { setSelectedResources(activeSession.selected_resource_ids || []); + } else { + setSelectedResources([]); } } else { setSelectedResources([]); } }, [activeSessionId, sessions]);

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Sync selected resources from active session

useEffect(() => {

if (activeSessionId && sessions) {

const activeSession = sessions.find((s: any) => s.id === activeSessionId);

if (activeSession) {

setSelectedResources(activeSession.selected_resource_ids || []);

}

} else {

setSelectedResources([]);

}

}, [activeSessionId, sessions]);

// Sync selected resources from active session

useEffect(() => {

if (activeSessionId && sessions) {

const activeSession = sessions.find((s: any) => s.id === activeSessionId);

if (activeSession) {

setSelectedResources(activeSession.selected_resource_ids || []);

} else {

setSelectedResources([]);

}

} else {

setSelectedResources([]);

}

}, [activeSessionId, sessions]);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/pages/Solver.tsx` around lines 138 - 148, The effect that syncs selected resources leaves stale selections when activeSessionId is set but sessions doesn't yet contain that session; update the useEffect body (the effect that references activeSessionId, sessions and calls setSelectedResources) so that when activeSessionId and sessions are present but sessions.find(...) (activeSession) returns undefined you explicitly call setSelectedResources([]) — i.e., change the branch for activeSession being falsy to clear selections instead of doing nothing.

…amic badges

… logic for different modes in this commit i added feature in which it only adds questions to pdf not answer !

…e with URL like in chat we have different url (id) for each chat i did similar thing here for each sample paper

… status updates

…e generation

…bar styling in the frontend

…ting, dual PDF export (Full/Questions), abort/delete functionality, paper renaming, and dashboard metrics.

… pattern validation

… users

…counts

…n papers

…nd background job tracking

shubhamxdd · 2026-05-18T23:54:07Z

@CodeRabbit review

coderabbitai · 2026-05-18T23:54:12Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

backend/app/routers/resources.py (1)
102-130: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Commit Job records before handing their IDs to ARQ.

Line 116 and Line 226 enqueue new_job.id before that transaction commits. A fast worker can start, fail to load the row, and then blow up when extraction_task later touches job.status; the enqueue-error path can also commit a flushed queued job that never actually ran. Persist the job first, then enqueue, and move the resource/job into a terminal failure state in a follow-up transaction if the queue write fails.

Also applies to: 213-239
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/routers/resources.py` around lines 102 - 130, The code enqueues
new_job.id into ARQ before the DB transaction is committed; commit and persist
the Job first, then enqueue. Change the sequence so that after creating new_job
and awaiting db.flush() you await db.commit() and
db.refresh(new_job)/db.refresh(new_resource) to ensure new_job.id is durable,
then create the Redis pool and call redis.enqueue_job('extraction_task',
str(new_resource.id), str(new_job.id)). If enqueue fails, start a new DB
transaction to set new_job.status = 'failed' (and new_resource.status or cleanup
as appropriate) and commit that follow-up transaction, then raise the
HTTPException; always close the redis pool in finally.

🧹 Nitpick comments (2)

frontend/src/api/papers.ts (1)

19-21: ⚡ Quick win

Replace any types with proper interfaces for type safety.

The create and toggleOutput methods use any for their data parameters, which bypasses TypeScript's type checking. This makes the API less self-documenting and more error-prone.

♻️ Proposed type-safe refactor

// Add interface definitions at the top of the file
interface CreatePaperRequest {
  title: string;
  resource_ids: string[];
  format_config: {
    duration_minutes: number;
    total_marks: number;
    sections: Array<{
      name: string;
      question_count: number;
      marks_per_question: number;
      type: 'mcq' | 'short' | 'long';
    }>;
  };
}

interface ToggleOutputRequest {
  include_answers?: boolean;
  include_explanations?: boolean;
}

// Update the method signatures
export const papersApi = {
  // ... other methods ...
  
  create: async (data: CreatePaperRequest) => {
    const response = await api.post('/papers', data);
    return response.data;
  },
  
  // ... other methods ...
  
  toggleOutput: async (id: string, data: ToggleOutputRequest) => {
    const response = await api.patch(`/papers/${id}/output`, data);
    return response.data;
  },
  
  // ... other methods ...
};

Also applies to: 39-41

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/api/papers.ts` around lines 19 - 21, Replace the loose any types
used in the create and toggleOutput methods by adding explicit request
interfaces (e.g., CreatePaperRequest and ToggleOutputRequest) at the top of the
file and updating signatures: change create: async (data: any) => ... to create:
async (data: CreatePaperRequest) => ... and toggleOutput: async (id: string,
data: any) => ... to toggleOutput: async (id: string, data: ToggleOutputRequest)
=> ...; ensure CreatePaperRequest models title, resource_ids, format_config
(with duration_minutes, total_marks, sections array and section fields name,
question_count, marks_per_question, type) and ToggleOutputRequest includes
optional include_answers and include_explanations so callers get proper type
checking and auto-complete.

backend/app/services/pdf.py (1)

28-30: ⚡ Quick win

Consider defensive validation for question structure.

The code assumes all questions have a type field. If a malformed question is passed without this field, a KeyError will be raised. Using .get() with a default would make the code more resilient.

♻️ Proposed defensive refactor

-    mcqs = [q for q in questions if q.get('type') == 'mcq']
-    shorts = [q for q in questions if q.get('type') == 'short']
-    longs = [q for q in questions if q.get('type') == 'long']
+    # Use .get() for safety, but questions without 'type' will be silently excluded
+    mcqs = [q for q in questions if q.get('type') == 'mcq']
+    shorts = [q for q in questions if q.get('type') == 'short']
+    longs = [q for q in questions if q.get('type') == 'long']
+    
+    # Optionally, log a warning if any questions were excluded
+    total_categorized = len(mcqs) + len(shorts) + len(longs)
+    if total_categorized < len(questions):
+        print(f"Warning: {len(questions) - total_categorized} questions had invalid or missing 'type' field")

Note: The code already uses .get('type') so it won't crash, but questions with missing or invalid types will be silently excluded from all sections. Consider adding logging to catch data quality issues.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/services/pdf.py` around lines 28 - 30, The three list
comprehensions (mcqs, shorts, longs) filter the questions list by q.get('type')
but silently drop entries missing or with invalid types—add defensive validation
and logging: iterate over questions, read q_type = q.get('type', None), if
q_type not in {'mcq','short','long'} log a warning (using the module's logger or
create one) including the offending question identifier/summary, and append
valid items to mcqs, shorts, or longs accordingly; this preserves current
behavior for valid types while surfacing malformed question data for
investigation.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/app/routers/papers.py`:
- Around line 250-260: The current deletion path removes the Paper (await
db.delete(paper); await db.commit()) even if storage deletions via
storage_service.delete_file for output.pdf_url or output.question_pdf_url fail,
risking orphaned billable files; modify the logic in the papers deletion flow to
check the result/raise on failure of storage_service.delete_file (or catch and
log+raise) for both output.pdf_url and output.question_pdf_url (use the same
pattern as the resource delete flow), and only proceed to call db.delete(paper)
and db.commit() when both deletions succeed (or surface the cleanup failure to
the caller), referencing output.pdf_url, output.question_pdf_url,
storage_service.delete_file, db.delete(paper), and db.commit() to locate the
code to change.
- Around line 141-167: The Job row is committed only after enqueue, so the
worker may see a missing Job id; move the DB commit for the newly created Job
(the new_job created and currently flushed via await db.flush()) to occur before
calling redis.enqueue_job for "generate_paper_task" so the worker always
observes the row; additionally, update the exception path around
redis.enqueue_job to remove or mark and commit the new_job consistently (e.g.,
delete the phantom new_job or set its status to "failed" and commit) to avoid
leaving a queued Job that never ran; reference new_job, await db.flush(), await
db.commit(), redis.enqueue_job("generate_paper_task") and generate_paper_task
when making the change.
- Around line 117-136: The loop in routers/papers.py currently only checks
ownership of each res_link; update the existence query to also require the
resource is ready and has non-empty extracted text before inserting into
paper_resources. Specifically, change the select(...) used for res_result to
include Resource.status == "ready" and Resource.extracted_text is not null/empty
(or the equivalent check your Resource model uses), then if scalar_one_or_none()
returns None raise HTTPException with a message like "Resource {id} not ready or
has no extracted text." Use the same symbols (data.resources,
res_link.resource_id, current_user.id, Resource, res_result,
paper_resources.insert(), new_paper.id) to locate and modify the code.

In `@backend/app/services/pdf.py`:
- Line 9: The Jinja2 Environment is created without autoescaping which can cause
XSS when rendering user content; update the Environment instantiation (the env
variable created with Environment(loader=FileSystemLoader(template_dir))) to
enable autoescape for HTML templates (e.g., pass autoescape=True or use
select_autoescape for HTML/XML templates) so user-provided question content is
escaped by default when rendering PDFs.
- Around line 11-13: The map_option_letter function needs input validation: in
map_option_letter convert the incoming index to an int inside a try/except and
raise a clear ValueError for non-numeric input, then check that the int is
within the allowed range (0 through 3) and raise a ValueError if out of bounds;
only after those checks return chr(65 + idx). Use the function name
map_option_letter in your changes so the reviewer can find the updated logic.

In `@frontend/src/App.tsx`:
- Around line 229-275: The Card components used for navigation (the Card
elements with onClick={() => navigate('/resources')},
onClick={()=>navigate('/solver')} and onClick={()=>navigate('/generator')}
inside App.tsx) are not keyboard-accessible; replace these clickable-only
containers with semantic interactive elements by rendering the Card as a real
Link or button (or use the Card's asChild prop to wrap a react-router <Link
to="..."> or a <button>) so they receive focus, support keyboard activation, and
preserve the existing styling and CardHeader/CardTitle/CardDescription layout;
repeat the same change for the other Card blocks noted around lines 289-333.

In `@frontend/src/pages/Generator.tsx`:
- Around line 238-246: The client-side quota check in Generator.tsx currently
hardcodes the monthly limit as 3 (see monthlyPapers, papers, user) which can
diverge from backend settings; replace this by either fetching the canonical
limit from the backend (e.g., call the existing settings or config API to get
PAPERS_LIMIT before comparing monthlyPapers.length) and use that value instead
of 3, or remove the preflight check entirely and handle the server-side
403/error from the generation API (showing toast.error based on the API
response). Ensure you use the same identifiers (monthlyPapers, papers, user,
toast) so the change integrates with the existing logic.

---

Outside diff comments:
In `@backend/app/routers/resources.py`:
- Around line 102-130: The code enqueues new_job.id into ARQ before the DB
transaction is committed; commit and persist the Job first, then enqueue. Change
the sequence so that after creating new_job and awaiting db.flush() you await
db.commit() and db.refresh(new_job)/db.refresh(new_resource) to ensure
new_job.id is durable, then create the Redis pool and call
redis.enqueue_job('extraction_task', str(new_resource.id), str(new_job.id)). If
enqueue fails, start a new DB transaction to set new_job.status = 'failed' (and
new_resource.status or cleanup as appropriate) and commit that follow-up
transaction, then raise the HTTPException; always close the redis pool in
finally.

---

Nitpick comments:
In `@backend/app/services/pdf.py`:
- Around line 28-30: The three list comprehensions (mcqs, shorts, longs) filter
the questions list by q.get('type') but silently drop entries missing or with
invalid types—add defensive validation and logging: iterate over questions, read
q_type = q.get('type', None), if q_type not in {'mcq','short','long'} log a
warning (using the module's logger or create one) including the offending
question identifier/summary, and append valid items to mcqs, shorts, or longs
accordingly; this preserves current behavior for valid types while surfacing
malformed question data for investigation.

In `@frontend/src/api/papers.ts`:
- Around line 19-21: Replace the loose any types used in the create and
toggleOutput methods by adding explicit request interfaces (e.g.,
CreatePaperRequest and ToggleOutputRequest) at the top of the file and updating
signatures: change create: async (data: any) => ... to create: async (data:
CreatePaperRequest) => ... and toggleOutput: async (id: string, data: any) =>
... to toggleOutput: async (id: string, data: ToggleOutputRequest) => ...;
ensure CreatePaperRequest models title, resource_ids, format_config (with
duration_minutes, total_marks, sections array and section fields name,
question_count, marks_per_question, type) and ToggleOutputRequest includes
optional include_answers and include_explanations so callers get proper type
checking and auto-complete.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5013e668-2445-4be8-ae98-bf096bf74db4

📥 Commits

Reviewing files that changed from the base of the PR and between 334514d and cfc15d2.

⛔ Files ignored due to path filters (1)

frontend/i7.png is excluded by !**/*.png

📒 Files selected for processing (21)

backend/alembic/versions/ab47c0990b86_add_question_pdf_url_to_paper_outputs.py
backend/app/config.py
backend/app/models/paper_output.py
backend/app/routers/papers.py
backend/app/routers/resources.py
backend/app/schemas/paper.py
backend/app/services/pdf.py
backend/app/templates/paper.html
backend/app/workers/arq_worker.py
backend/app/workers/tasks.py
context/architecture.md
context/code-standards.md
context/progress-tracker.md
context/project-overview.md
context/ui-context.md
frontend/src/App.tsx
frontend/src/api/papers.ts
frontend/src/index.css
frontend/src/pages/Generator.tsx
frontend/src/pages/Solver.tsx
todo.md

✅ Files skipped from review due to trivial changes (5)

backend/alembic/versions/ab47c0990b86_add_question_pdf_url_to_paper_outputs.py
todo.md
context/code-standards.md
context/project-overview.md
context/progress-tracker.md

🚧 Files skipped from review as they are similar to previous changes (1)

backend/app/schemas/paper.py

coderabbitai · 2026-05-18T23:59:39Z

+    # 3. Link Resources
+    for res_link in data.resources:
+        # Verify resource exists and belongs to user
+        res_result = await db.execute(
+            select(Resource).where(Resource.id == res_link.resource_id, Resource.user_id == current_user.id)
+        )
+        if not res_result.scalar_one_or_none():
+            raise HTTPException(
+                status_code=status.HTTP_400_BAD_REQUEST,
+                detail=f"Resource {res_link.resource_id} not found or unauthorized."
+            )
+
+        # Insert into association table
+        await db.execute(
+            paper_resources.insert().values(
+                paper_id=new_paper.id,
+                resource_id=res_link.resource_id,
+                resource_role=res_link.role
+            )
+        )


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject resources that aren't ready yet.

This loop only checks ownership. A direct API caller can attach processing or failed resources, and the worker then burns a queue slot before dying with "No context found in selected resources." Require status == "ready" and non-empty extracted text here.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routers/papers.py` around lines 117 - 136, The loop in routers/papers.py currently only checks ownership of each res_link; update the existence query to also require the resource is ready and has non-empty extracted text before inserting into paper_resources. Specifically, change the select(...) used for res_result to include Resource.status == "ready" and Resource.extracted_text is not null/empty (or the equivalent check your Resource model uses), then if scalar_one_or_none() returns None raise HTTPException with a message like "Resource {id} not ready or has no extracted text." Use the same symbols (data.resources, res_link.resource_id, current_user.id, Resource, res_result, paper_resources.insert(), new_paper.id) to locate and modify the code.

coderabbitai · 2026-05-18T23:59:39Z

+    from ..models.job import Job
+    new_job = Job(
+        user_id=current_user.id,
+        job_type="generate_paper",
+        status="queued",
+        ref_id=new_paper.id
+    )
+    db.add(new_job)
+    await db.flush()
+
+    # 4. Enqueue background task
+    redis = await create_pool(RedisSettings.from_dsn(settings.REDIS_URL))
+    try:
+        job = await redis.enqueue_job("generate_paper_task", str(new_paper.id), str(new_job.id))
+        if job is None:
+            raise RuntimeError("Failed to enqueue generate_paper_task")
+        await db.commit()
+    except Exception as e:
+        logger.error(f"Redis enqueue error: {e}")
+        new_paper.status = "failed"
+        await db.commit()
+        raise HTTPException(
+            status_code=status.HTTP_503_SERVICE_UNAVAILABLE, 
+            detail="Paper generation queued failed, please retry."
+        )
+    finally:
+        await redis.close()


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Commit the Job row before passing its ID to generate_paper_task.

Line 154 hands ARQ an uncommitted new_job.id. If the worker starts before the transaction commits, generate_paper_task can observe no row and crash when it later dereferences job. The exception path also commits the flushed queued job, leaving a phantom job for work that never ran.

🧰 Tools

🪛 Ruff (0.15.13)

[warning] 158-158: Do not catch blind exception: Exception

(BLE001)

[warning] 162-165: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routers/papers.py` around lines 141 - 167, The Job row is committed only after enqueue, so the worker may see a missing Job id; move the DB commit for the newly created Job (the new_job created and currently flushed via await db.flush()) to occur before calling redis.enqueue_job for "generate_paper_task" so the worker always observes the row; additionally, update the exception path around redis.enqueue_job to remove or mark and commit the new_job consistently (e.g., delete the phantom new_job or set its status to "failed" and commit) to avoid leaving a queued Job that never ran; reference new_job, await db.flush(), await db.commit(), redis.enqueue_job("generate_paper_task") and generate_paper_task when making the change.

coderabbitai · 2026-05-18T23:59:39Z

+    if output:
+        if output.pdf_url:
+            object_name = output.pdf_url.replace(f"{settings.SPACES_PUBLIC_URL}/", "")
+            storage_service.delete_file(object_name)
+        if output.question_pdf_url:
+            object_name = output.question_pdf_url.replace(f"{settings.SPACES_PUBLIC_URL}/", "")
+            storage_service.delete_file(object_name)
+
+    # Deleting the paper will cascade and delete PaperOutput and paper_resources automatically
+    await db.delete(paper)
+    await db.commit()


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't drop the paper record if PDF cleanup fails.

Both storage deletions ignore their return value. If either object stays in Spaces, the paper disappears from the app but its generated PDFs remain accessible and billable. Mirror the resource delete flow: fail the request or at least surface/log cleanup failure before committing.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routers/papers.py` around lines 250 - 260, The current deletion path removes the Paper (await db.delete(paper); await db.commit()) even if storage deletions via storage_service.delete_file for output.pdf_url or output.question_pdf_url fail, risking orphaned billable files; modify the logic in the papers deletion flow to check the result/raise on failure of storage_service.delete_file (or catch and log+raise) for both output.pdf_url and output.question_pdf_url (use the same pattern as the resource delete flow), and only proceed to call db.delete(paper) and db.commit() when both deletions succeed (or surface the cleanup failure to the caller), referencing output.pdf_url, output.question_pdf_url, storage_service.delete_file, db.delete(paper), and db.commit() to locate the code to change.

coderabbitai · 2026-05-18T23:59:39Z

+
+# Set up Jinja2 environment
+template_dir = os.path.join(os.path.dirname(os.path.dirname(__file__)), "templates")
+env = Environment(loader=FileSystemLoader(template_dir))


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: Enable autoescape to prevent XSS in generated PDFs.

The Jinja2 Environment is created without autoescape=True, which means user-provided content in questions could contain malicious HTML or JavaScript that would be rendered directly into the PDF. If generated PDFs are viewed in browsers, this creates an XSS vulnerability.

🔒 Proposed fix to enable autoescape

-env = Environment(loader=FileSystemLoader(template_dir)) +env = Environment(loader=FileSystemLoader(template_dir), autoescape=True)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

env = Environment(loader=FileSystemLoader(template_dir))

env = Environment(loader=FileSystemLoader(template_dir), autoescape=True)

🧰 Tools

🪛 Ruff (0.15.13)

[error] 9-9: By default, jinja2 sets autoescape to False. Consider using autoescape=True or the select_autoescape function to mitigate XSS vulnerabilities.

(S701)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/services/pdf.py` at line 9, The Jinja2 Environment is created without autoescaping which can cause XSS when rendering user content; update the Environment instantiation (the env variable created with Environment(loader=FileSystemLoader(template_dir))) to enable autoescape for HTML templates (e.g., pass autoescape=True or use select_autoescape for HTML/XML templates) so user-provided question content is escaped by default when rendering PDFs.

coderabbitai · 2026-05-18T23:59:39Z

+def map_option_letter(index: str) -> str:
+    """Helper to map 0, 1, 2, 3 to A, B, C, D"""
+    return chr(65 + int(index))


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add bounds checking to prevent invalid option letters.

The map_option_letter function converts a string index to a letter without validating bounds. If the index is outside 0-3 range (for typical A-D options), it will produce unexpected characters or raise an error for non-numeric input.

🛡️ Proposed fix with bounds validation

def map_option_letter(index: str) -> str: - """Helper to map 0, 1, 2, 3 to A, B, C, D""" - return chr(65 + int(index)) + """Helper to map 0, 1, 2, 3 to A, B, C, D""" + try: + idx = int(index) + if idx < 0 or idx > 25: + raise ValueError(f"Index {idx} out of range for option letters") + return chr(65 + idx) + except (ValueError, TypeError) as e: + raise ValueError(f"Invalid option index: {index}") from e

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def map_option_letter(index: str) -> str:

"""Helper to map 0, 1, 2, 3 to A, B, C, D"""

return chr(65 + int(index))

def map_option_letter(index: str) -> str:

"""Helper to map 0, 1, 2, 3 to A, B, C, D"""

try:

idx = int(index)

if idx < 0 or idx > 25:

raise ValueError(f"Index {idx} out of range for option letters")

return chr(65 + idx)

except (ValueError, TypeError) as e:

raise ValueError(f"Invalid option index: {index}") from e

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/services/pdf.py` around lines 11 - 13, The map_option_letter function needs input validation: in map_option_letter convert the incoming index to an int inside a try/except and raise a clear ValueError for non-numeric input, then check that the int is within the allowed range (0 through 3) and raise a ValueError if out of bounds; only after those checks return chr(65 + idx). Use the function name map_option_letter in your changes so the reviewer can find the updated logic.

coderabbitai · 2026-05-18T23:59:39Z

+             <Card 
+                onClick={() => navigate('/resources')}
+                className="border-dashed bg-muted/20 hover:bg-muted/30 transition-all cursor-pointer group hover:border-primary/50"
+              >
+                <CardHeader className="flex flex-row items-center gap-4 py-6">
+                   <div className="size-10 bg-background rounded-lg flex items-center justify-center shadow-sm border border-border/50 group-hover:scale-110 transition-transform">
+                      <FileText className="size-5 text-primary" />
+                   </div>
+                   <div className="space-y-0.5">
+                      <CardTitle className="text-base">Upload Resources</CardTitle>
+                      <CardDescription className="text-xs">Add new study materials</CardDescription>
+                   </div>
+                   <ArrowRight className="size-4 ml-auto text-muted-foreground group-hover:text-primary transition-colors" />
+                </CardHeader>
+             </Card>
+
+             <Card 
+                onClick={() => navigate('/solver')}
+                className="border-dashed bg-muted/20 hover:bg-muted/30 transition-all cursor-pointer group hover:border-yellow-500/50"
+              >
+                <CardHeader className="flex flex-row items-center gap-4 py-6">
+                   <div className="size-10 bg-background rounded-lg flex items-center justify-center shadow-sm border border-border/50 group-hover:scale-110 transition-transform">
+                      <Zap className="size-5 text-yellow-500" />
+                   </div>
+                   <div className="space-y-0.5">
+                      <CardTitle className="text-base">Ask AI Tutor</CardTitle>
+                      <CardDescription className="text-xs">Get answers from your notes</CardDescription>
+                   </div>
+                   <ArrowRight className="size-4 ml-auto text-muted-foreground group-hover:text-yellow-500 transition-colors" />
+                </CardHeader>
+             </Card>
+
+             <Card 
+                onClick={() => navigate('/generator')}
+                className="border-dashed bg-muted/20 hover:bg-muted/30 transition-all cursor-pointer group hover:border-purple-500/50"
+              >
+                <CardHeader className="flex flex-row items-center gap-4 py-6">
+                   <div className="size-10 bg-background rounded-lg flex items-center justify-center shadow-sm border border-border/50 group-hover:scale-110 transition-transform">
+                      <FileEdit className="size-5 text-purple-500" />
+                   </div>
+                   <div className="space-y-0.5">
+                      <CardTitle className="text-base">Generate Paper</CardTitle>
+                      <CardDescription className="text-xs">Create a new mock exam</CardDescription>
+                   </div>
+                   <ArrowRight className="size-4 ml-auto text-muted-foreground group-hover:text-purple-500 transition-colors" />
+                </CardHeader>
+             </Card>


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use real links/buttons for these clickable cards.

These Card wrappers only handle pointer clicks. Keyboard users can't tab to them or activate them, so the dashboard's main navigation is inaccessible. Render them as <Link>/<button> (or use asChild with one of those) instead of bare clickable containers.

Also applies to: 289-333

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/App.tsx` around lines 229 - 275, The Card components used for navigation (the Card elements with onClick={() => navigate('/resources')}, onClick={()=>navigate('/solver')} and onClick={()=>navigate('/generator')} inside App.tsx) are not keyboard-accessible; replace these clickable-only containers with semantic interactive elements by rendering the Card as a real Link or button (or use the Card's asChild prop to wrap a react-router <Link to="..."> or a <button>) so they receive focus, support keyboard activation, and preserve the existing styling and CardHeader/CardTitle/CardDescription layout; repeat the same change for the other Card blocks noted around lines 289-333.

coderabbitai · 2026-05-18T23:59:39Z

+    // --- QUOTA CHECK ---
+    const now = new Date();
+    const startOfMonth = new Date(now.getFullYear(), now.getMonth(), 1);
+    const monthlyPapers = papers?.filter((p: any) => new Date(p.created_at) >= startOfMonth) || [];
+
+    if (user?.plan === 'free' && monthlyPapers.length >= 3) {
+      toast.error('limit exceed, upgrade to continue');
+      return;
+    }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't hardcode the monthly paper limit in the client.

The backend now enforces settings.PAPERS_LIMIT, but this preflight still blocks at 3. Any env override will make the UI reject valid generations or display a different quota than the API actually uses. Fetch the limit from the backend or rely on the server-side 403 instead.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/pages/Generator.tsx` around lines 238 - 246, The client-side quota check in Generator.tsx currently hardcodes the monthly limit as 3 (see monthlyPapers, papers, user) which can diverge from backend settings; replace this by either fetching the canonical limit from the backend (e.g., call the existing settings or config API to get PAPERS_LIMIT before comparing monthlyPapers.length) and use that value instead of 3, or remove the preflight check entirely and handle the server-side 403/error from the generation API (showing toast.error based on the API response). Ensure you use the same identifiers (monthlyPapers, papers, user, toast) so the change integrates with the existing logic.

open an issue for it for later fixing

@CodeRabbit open an issue for it for later fixing

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash gh issue create \ --repo shubhamxdd/pyq-appl \ --title "fix: Remove hardcoded monthly paper limit (3) from client-side quota check in Generator.tsx" \ --body "## Problem In \`frontend/src/pages/Generator.tsx\` (lines 238–246), the client-side quota preflight check hardcodes the monthly paper limit as \`3\`: \`\`\`ts if (user?.plan === 'free' && monthlyPapers.length >= 3) { toast.error('limit exceed, upgrade to continue'); return; } \`\`\` However, the backend enforces this limit via \`settings.PAPERS_LIMIT\` (currently defaulting to \`3\` in \`backend/app/config.py\`), which can be overridden through environment variables. If the backend value is changed, the UI will still block or allow generation incorrectly, leading to a mismatch between client and server behaviour. ## Expected Behaviour The UI quota check should reflect the same limit as the backend. Either: - Fetch the canonical limit from the backend (e.g., a config/settings endpoint) before comparing \`monthlyPapers.length\`, or - Remove the preflight check entirely and handle the server-side \`403\`/error response from the paper creation API (surfacing the error via \`toast.error\` based on \`error.response?.data?.detail\`). ## References - PR: https://github.com/shubhamxdd/pyq-appl/pull/8 - Review comment: https://github.com/shubhamxdd/pyq-appl/pull/8#discussion_r3262861973 /cc `@shubhamxdd`" \ --assignee shubhamxdd

Length of output: 48

🏁 Script executed:

#!/bin/bash gh issue list --repo shubhamxdd/pyq-appl --state open --limit 5

Length of output: 339

@shubhamxdd Done! I've opened issue #9 — fix: Remove hardcoded monthly paper limit (3) from client-side quota check in Generator.tsx — assigned to you for later fixing.

🧩 Analysis chain

@shubhamxdd, I've opened a GitHub issue to track this for later. You can find it above once the script runs — it's assigned to you and links back to this PR and review comment.

🏁 Script executed:

#!/bin/bash gh issue create \ --repo shubhamxdd/pyq-appl \ --title "fix: Remove hardcoded monthly paper limit from client-side quota check in Generator.tsx" \ --assignee shubhamxdd \ --body "## Problem In \`frontend/src/pages/Generator.tsx\` (lines 238–246), the client-side quota preflight check hardcodes the monthly paper limit as \`3\`: \`\`\`ts if (user?.plan === 'free' && monthlyPapers.length >= 3) { toast.error('limit exceed, upgrade to continue'); return; } \`\`\` The backend already enforces this limit via \`settings.PAPERS_LIMIT\` (defaulting to \`3\` in \`backend/app/config.py\`), but any environment override on the server side will cause the UI to silently reject valid generation requests or display a quota that differs from what the API actually enforces. ## Suggested Fix Either: 1. Expose \`PAPERS_LIMIT\` from a backend config/settings endpoint and fetch it before the comparison, or 2. Remove the preflight check entirely and rely on the server-side \`403\` / error response, surfacing it via \`toast.error\` from the \`onError\` handler of \`createPaperMutation\`. ## References - PR: https://github.com/shubhamxdd/pyq-appl/pull/8 - Review comment: https://github.com/shubhamxdd/pyq-appl/pull/8#discussion_r3262861973 Flagged by `@coderabbitai`, requested to track by `@shubhamxdd`."

Length of output: 43

🏁 Script executed:

#!/bin/bash gh api \ --method POST \ -H "Accept: application/vnd.github+json" \ /repos/shubhamxdd/pyq-appl/issues \ -f title="fix: Remove hardcoded monthly paper limit from client-side quota check in Generator.tsx" \ -f assignees[]="shubhamxdd" \ -f body="## Problem In \`frontend/src/pages/Generator.tsx\` (lines 238–246), the client-side quota preflight check hardcodes the monthly paper limit as \`3\`: \`\`\`ts if (user?.plan === 'free' && monthlyPapers.length >= 3) { toast.error('limit exceed, upgrade to continue'); return; } \`\`\` The backend already enforces this limit via \`settings.PAPERS_LIMIT\` (defaulting to \`3\` in \`backend/app/config.py\`), but any environment override on the server side will cause the UI to silently reject valid generation requests or display a quota that differs from what the API actually enforces. ## Suggested Fix Either: 1. Expose \`PAPERS_LIMIT\` from a backend config/settings endpoint and fetch it before the comparison, or 2. Remove the preflight check entirely and rely on the server-side \`403\` / error response, surfacing it via \`toast.error\` from the \`onError\` handler of \`createPaperMutation\`. ## References - PR: https://github.com/shubhamxdd/pyq-appl/pull/8 - Review comment: https://github.com/shubhamxdd/pyq-appl/pull/8#discussion_r3262861973 Flagged by `@coderabbitai`, requested to track by `@shubhamxdd`."

Length of output: 5515

@shubhamxdd, issue #10 has been created and assigned to you: fix: Remove hardcoded monthly paper limit from client-side quota check in Generator.tsx. You can address it in a future PR. 🐇

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

shubhamxdd added 3 commits May 19, 2026 01:58

feat: Enhance paper creation with resource validation and error handling

ea8adb4

feat: Add selected_resource_ids to chat_sessions and update related f…

334514d

…unctionality

feat: Improve session selection handling and update navigation logic

811d17c

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

shubhamxdd added 15 commits May 19, 2026 02:51

feat: Enhance paper format configuration UI with input fields and dyn…

8fd6005

…amic badges

feat: Implement PDF generation for papers with download functionality

8c0f79b

feat: Add question_pdf_url to paper_outputs and update PDF generation…

3dad1ef

… logic for different modes in this commit i added feature in which it only adds questions to pdf not answer !

feat: Update generator route to accept optional paperId and sync stat…

dd3fee4

…e with URL like in chat we have different url (id) for each chat i did similar thing here for each sample paper

feat: Enhance job tracking for paper and resource processing with job…

6cf4aeb

… status updates

feat: Implement quota enforcement for free users on paper and resourc…

271a024

…e generation

feat: Implement monthly quota reset for free users and enhance scroll…

dc4b94a

…bar styling in the frontend

Finalize Paper Generator: implement persistent context, URL-based rou…

7ed7b85

…ting, dual PDF export (Full/Questions), abort/delete functionality, paper renaming, and dashboard metrics.

error fix

fde1188

feat: Update paper creation button logic to disable based on question…

c1eecaf

… pattern validation

feat: Add user query and implement monthly paper limit check for free…

81da731

… users

refactor: Remove unused imports from Generator component

c8139f8

feat: Enhance paper listing and dashboard with resource and question …

61f54fe

…counts

feat: Implement PDF export functionality for study guides and questio…

bce477e

…n papers

docs: Update architecture and standards for improved PDF generation a…

cfc15d2

…nd background job tracking

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

coderabbitai Bot mentioned this pull request May 19, 2026

fix: Remove hardcoded monthly paper limit (3) from client-side quota check in Generator.tsx #9

Open

shubhamxdd merged commit e2b36c3 into main May 19, 2026
1 check passed

This was referenced May 19, 2026

fix: Remove hardcoded monthly paper limit from client-side quota check in Generator.tsx #10

Closed

Posthog analytics #14

Merged

		except Exception as e:
		raise HTTPException(status_code=500, detail=f"Failed to detect format: {str(e)}")

	env = Environment(loader=FileSystemLoader(template_dir))
	env = Environment(loader=FileSystemLoader(template_dir), autoescape=True)

Conversation

shubhamxdd commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

shubhamxdd commented May 18, 2026

Uh oh!

coderabbitai Bot commented May 18, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

shubhamxdd May 19, 2026

Choose a reason for hiding this comment

Uh oh!

shubhamxdd May 19, 2026

Choose a reason for hiding this comment

Uh oh!

shubhamxdd commented May 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading

coderabbitai Bot May 18, 2026 •

edited

Loading