Feature/pyq solver#4
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughImplements Phase 4 (PYQ Solver): adds chat sessions and session-linked questions, an OpenRouter streaming LLM client with prompts, a solver FastAPI router with SSE streaming and session management, resource upload/stop/update endpoints and worker progress, and a full React Solver UI with resource selection. ChangesAI-Powered Question Answering with Document Context and Streaming
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 16
🧹 Nitpick comments (3)
backend/app/routers/resources.py (1)
62-66: ⚡ Quick winAdd exception chaining for proper error propagation.
The catch-all exception handler should use
from eto preserve the original exception context, making debugging easier when PDF parsing fails.🔗 Proposed fix
except Exception as e: raise HTTPException( status_code=status.HTTP_400_BAD_REQUEST, - detail=f"Invalid or corrupted PDF file: {str(e)}" - ) + detail=f"Invalid or corrupted PDF file: {e}" + ) from e🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routers/resources.py` around lines 62 - 66, In the except Exception as e block where you raise HTTPException with status.HTTP_400_BAD_REQUEST and detail "Invalid or corrupted PDF file: {str(e)}", preserve the original exception context by re-raising the HTTPException using exception chaining (i.e., raise ... from e) so the original traceback is retained; update the raise statement that constructs the HTTPException to include "from e" while keeping the same status and detail values.backend/app/llm/client.py (1)
41-47: ⚡ Quick winNarrow the parser exception handling and keep failure visibility.
Line 46 catches all exceptions and silently drops them, which can mask protocol/schema drift and truncate answers without signals.
Proposed fix
+import logging @@ +logger = logging.getLogger(__name__) @@ - except Exception: - continue + except (json.JSONDecodeError, KeyError, IndexError, TypeError) as exc: + logger.warning("Failed to parse OpenRouter SSE chunk: %s", exc) + continue🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/llm/client.py` around lines 41 - 47, The current broad except hides parsing/schema issues; narrow it to catch json.JSONDecodeError, KeyError and TypeError around json.loads(data_str) and accessing data['choices'][0]['delta'], log the failure (include data_str or chunk context) so truncation/protocol drift is visible, and re-raise or let other unexpected exceptions propagate; update the try/except around data_str, data and chunk extraction (the variables data_str, data, chunk) to implement these specific catches and logging.backend/alembic/versions/d3dcf6b6fe3f_add_chat_sessions_and_processing_.py (1)
24-35: ⚡ Quick winAdd indexes for
chat_sessions.user_idandquestions.session_idto protect chat history/session queries.These keys are likely on the read path for listing sessions and session history, so leaving them unindexed will degrade quickly with data growth.
Proposed fix
op.create_table('chat_sessions', @@ ) + op.create_index("ix_chat_sessions_user_id", "chat_sessions", ["user_id"], unique=False) op.add_column('questions', sa.Column('session_id', sa.UUID(), nullable=True)) @@ op.create_foreign_key( @@ ) + op.create_index("ix_questions_session_id", "questions", ["session_id"], unique=False) @@ def downgrade() -> None: @@ + op.drop_index("ix_questions_session_id", table_name="questions") op.drop_constraint("fk_questions_session_id_chat_sessions", 'questions', type_='foreignkey') op.drop_column('questions', 'session_id') + op.drop_index("ix_chat_sessions_user_id", table_name="chat_sessions") op.drop_table('chat_sessions')Also applies to: 42-45
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/alembic/versions/d3dcf6b6fe3f_add_chat_sessions_and_processing_.py` around lines 24 - 35, The migration created chat_sessions.user_id and questions.session_id but did not add indexes, which will slow session listing and history queries; update the Alembic revision (in the same file where op.create_table('chat_sessions') and op.add_column('questions', ... 'session_id')) to call op.create_index for chat_sessions.user_id and for questions.session_id (use clear names like ix_chat_sessions_user_id and ix_questions_session_id) so those FK columns are indexed; also add the corresponding op.drop_index calls in the downgrade if present.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/alembic/versions/d3dcf6b6fe3f_add_chat_sessions_and_processing_.py`:
- Line 34: The migration creates an unnamed FK with op.create_foreign_key(None,
'questions', 'chat_sessions', ['session_id'], ['id'], ondelete='CASCADE'), which
makes downgrade unreliable; change that call to provide an explicit constraint
name (e.g. 'fk_questions_session_id_chat_sessions_id') and update the
corresponding op.drop_constraint(...) in the downgrade to use that exact name so
the constraint can be deterministically dropped. Ensure both places reference
the same unique name and keep the ondelete='CASCADE' behavior.
In `@backend/app/llm/client.py`:
- Around line 27-30: In stream_chat (backend/app/llm/client.py) don't yield
non-200 HTTP responses as content; instead read the response body (await
response.aread()), decode it and raise a RuntimeError containing the status code
and error text so upstream event_generator() in solver.py can catch it and mark
the answer as failed; replace the yield-and-return branch that yields f"Error:
{response.status_code} - {error_text.decode()}" with a raised RuntimeError
carrying the same message.
In `@backend/app/models/chat_session.py`:
- Line 17: The relationship on ChatSession named "questions" currently uses
cascade="all, delete-orphan" which will delete Question rows when they are
removed from the session collection even though Question.session_id is nullable;
remove the "delete-orphan" behavior by updating the ChatSession.questions
relationship (keep or set cascade to "all" or omit cascade entirely) so
Questions can be orphaned instead of being deleted; verify the related
back_populates ("session") remains unchanged and that Question.session_id stays
nullable.
In `@backend/app/routers/resources.py`:
- Around line 219-221: Before setting resource.status = "failed", check the
resource's current status and only change it if it's still in progress (e.g.,
resource.status == "processing"); if it's already "ready" (or any terminal
state) do not overwrite it. Locate the place where resource.status is assigned
(the line setting resource.status = "failed") and wrap it in a conditional that
validates the current status, optionally logging or returning a no-op when the
resource is not "processing", and ensure any DB/session commit only happens when
the status is actually changed.
In `@backend/app/routers/solver.py`:
- Around line 131-132: The quota check and increment must be made atomic to
avoid race conditions: replace the separate check of current_user.questions_used
and the later increment with a single conditional database update that increases
questions_used only if it's still below the limit and then check the number of
rows affected; if the update affects 0 rows, raise the HTTPException.
Concretely, in the handler that currently reads current_user.plan and
current_user.questions_used and later increments questions_used, perform an
UPDATE (or SQLAlchemy .filter(...).update(...)) against the user row (filtering
by the user id and questions_used < 10 for the "free" plan) to increment
questions_used by 1, commit, and if the update returned 0 rows affected raise
the same HTTPException that you currently raise when the quota is exceeded. Use
the existing symbols current_user.questions_used, current_user.plan and the
raise HTTPException logic to preserve behavior.
- Around line 161-168: When a caller supplies data.session_id you must verify
the session belongs to current_user before using it; instead of blindly
assigning session_id = data.session_id, query ChatSession for that id and
user_id == current_user.id and if not found return/raise an authorization error
(or create a new session only when session_id is empty). Keep the existing
creation flow for when session_id is falsy (use
ChatSession(user_id=current_user.id, title=...), db.add, await db.flush(),
session_id = new_sess.id), but do not persist or use a caller-supplied
session_id unless the ownership check on ChatSession passes. Ensure checks are
applied wherever session_id is consumed (the code around variables session_id,
data.session_id, ChatSession, db.add, await db.flush, new_sess.id).
- Line 174: The code currently hardcodes delivery_mode="stream" when saving the
question; change it to persist the delivery_mode provided by the incoming model
(use QuestionCreate.delivery_mode) so the stored record reflects the requested
mode. Locate the save/create call that contains delivery_mode="stream" (in the
handler that accepts a QuestionCreate payload) and replace the literal with the
payload's delivery_mode attribute (falling back to the existing default only if
delivery_mode is None). Ensure any type/conversion matches the storage field
used by the model/class handling persistence.
- Around line 115-116: The update currently skips any falsy title so sending
{"title": ""} silently no-ops; change the check to "if data.title is not None"
in the function that updates the session (the block assigning session.title =
data.title) so explicit null means "no change" and empty string is handled;
additionally validate and reject blank titles (e.g., if data.title == "" raise a
400/BadRequest with a clear message) before assigning to session.title to ensure
deterministic behavior.
- Around line 211-213: The SSE except block currently yields str(e) to clients
which can leak internals; change the yield in the except block to send a generic
error payload (e.g. {"error":"Internal server error"} or a short safe message)
instead of str(e), and record the full exception details server-side using your
logger (e.g. logger.exception or similar) before opening the async DB
SessionLocal block. Update the exception handler where you currently yield
f"data: {json.dumps({'error': str(e)})}\n\n" to yield a non-sensitive message
and add a server-side log of the real exception for debugging.
In `@backend/app/schemas/question.py`:
- Line 10: The delivery_mode field is currently an unconstrained str
(delivery_mode) and must be restricted to supported values; update both
occurrences of delivery_mode in this schema to use a constrained type (e.g.,
typing.Literal["stream","background"] or a small Enum class) instead of plain
str, and ensure the Pydantic model uses that type so invalid values are
rejected/validated on input and persisted correctly.
In `@backend/app/workers/tasks.py`:
- Line 41: Remove the unnecessary f-string prefixes on the plain string print
calls: locate the print statements that use f-strings with no placeholders (for
example print(f"📥 [STEP 1/3] Downloading PDF from Storage..."), and the similar
prints at the other occurrences: print(f"📤 [STEP 2/3] Extracting text from
PDF..."), print(f"✅ [STEP 3/3] Uploading results..."), and print(f"⚠️ No text
extracted from PDF, aborting.") and change them to regular string literals
(print("...")) so they are not formatted f-strings; keep the message text
unchanged.
- Around line 64-70: The current cancellation check uses check_res.scalar_one()
which raises if the Resource was deleted mid-processing; change this to
check_res.scalar_one_or_none() and handle the None case by aborting the loop
gracefully (e.g., log/print the cancellation and return). Update the block using
SessionLocal(), select(Resource).where(Resource.id == resource_id), check_res,
and db_res to use scalar_one_or_none(), check if db_res is None, and if so
perform the same abort/return behavior instead of allowing an exception to
propagate.
In `@frontend/src/api/solver.ts`:
- Around line 39-45: The fetch call in frontend/src/api/solver.ts currently
hardcodes 'http://127.0.0.1:8001/api/solver/ask' and blindly sets
'Authorization: Bearer ${token}' where token may be null; replace the hardcoded
URL with the configured API base URL + solver path (use your shared API
config/constant used elsewhere) when calling fetch, and only include the
Authorization header if token is non-null/defined (or fallback to the configured
auth path/mechanism). Locate the token variable and the fetch(...) invocation in
this file and update them to use the shared API base URL and a conditional
Authorization header.
In `@frontend/src/pages/Solver.tsx`:
- Around line 118-124: The SSE parsing must buffer partial chunks instead of
assuming each read contains full newline-delimited events: introduce a
persistent buffer (e.g., sseBuffer) outside the read loop, append
decoder.decode(value) to it, split sseBuffer by '\n', process all lines except
the final fragment (keep that final fragment in sseBuffer for the next read),
and only attempt JSON.parse on complete lines that start with 'data: '; on parse
errors, only suppress if the line is the trailing fragment (i.e., incomplete),
otherwise log/handle the error; also ensure you treat "data: [DONE]" as a
terminal event. Reference symbols to update: decoder, value, chunk, lines, line,
and the read loop handling in Solver.tsx.
- Around line 127-130: The update in setMessages mutates
newMessages[newMessages.length - 1] without ensuring there is a last element,
which can throw if messages were replaced mid-stream; update the callback in
setMessages (the state updater used when streaming responses in Solver.tsx) to
first check prev.length > 0 (or find the last message safely) before mutating,
and if no last message exists, append a new message object or return prev
unchanged so the stream update becomes a no-op; ensure you reference setMessages
and the streaming update logic when applying this guard.
- Around line 52-55: The onSuccess handler for the session-delete mutation
currently clears activeSessionId unconditionally; update the handler in
Solver.tsx (the onSuccess passed to the delete mutation) to only call
setActiveSessionId(null) when the deleted session id matches the current
activeSessionId (i.e., get the deleted session id from the mutation response or
context, compare it to activeSessionId, and only clear if equal), while still
invalidating the ['sessions'] query and resetting messages as before.
---
Nitpick comments:
In `@backend/alembic/versions/d3dcf6b6fe3f_add_chat_sessions_and_processing_.py`:
- Around line 24-35: The migration created chat_sessions.user_id and
questions.session_id but did not add indexes, which will slow session listing
and history queries; update the Alembic revision (in the same file where
op.create_table('chat_sessions') and op.add_column('questions', ...
'session_id')) to call op.create_index for chat_sessions.user_id and for
questions.session_id (use clear names like ix_chat_sessions_user_id and
ix_questions_session_id) so those FK columns are indexed; also add the
corresponding op.drop_index calls in the downgrade if present.
In `@backend/app/llm/client.py`:
- Around line 41-47: The current broad except hides parsing/schema issues;
narrow it to catch json.JSONDecodeError, KeyError and TypeError around
json.loads(data_str) and accessing data['choices'][0]['delta'], log the failure
(include data_str or chunk context) so truncation/protocol drift is visible, and
re-raise or let other unexpected exceptions propagate; update the try/except
around data_str, data and chunk extraction (the variables data_str, data, chunk)
to implement these specific catches and logging.
In `@backend/app/routers/resources.py`:
- Around line 62-66: In the except Exception as e block where you raise
HTTPException with status.HTTP_400_BAD_REQUEST and detail "Invalid or corrupted
PDF file: {str(e)}", preserve the original exception context by re-raising the
HTTPException using exception chaining (i.e., raise ... from e) so the original
traceback is retained; update the raise statement that constructs the
HTTPException to include "from e" while keeping the same status and detail
values.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a927a476-c749-4ef5-b6b7-ae73ea6a85b4
⛔ Files ignored due to path filters (1)
frontend/package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (24)
backend/alembic/versions/d3dcf6b6fe3f_add_chat_sessions_and_processing_.pybackend/app/llm/client.pybackend/app/llm/prompts.pybackend/app/main.pybackend/app/models/__init__.pybackend/app/models/chat_session.pybackend/app/models/question.pybackend/app/models/resource.pybackend/app/models/user.pybackend/app/routers/resources.pybackend/app/routers/solver.pybackend/app/schemas/chat_session.pybackend/app/schemas/question.pybackend/app/schemas/resource.pybackend/app/workers/tasks.pycontext/progress-tracker.mddocker-compose.ymlfrontend/package.jsonfrontend/src/App.tsxfrontend/src/api/resources.tsfrontend/src/api/solver.tsfrontend/src/components/Layout.tsxfrontend/src/pages/Resources.tsxfrontend/src/pages/Solver.tsx
| if data.title: | ||
| session.title = data.title |
There was a problem hiding this comment.
Handle explicit title updates deterministically.
Line 115 only updates for truthy values, so {"title": ""} silently no-ops with 200. Use an explicit is not None check and reject blank titles.
Proposed fix
- if data.title:
- session.title = data.title
+ if data.title is not None:
+ title = data.title.strip()
+ if not title:
+ raise HTTPException(status_code=400, detail="Title cannot be empty")
+ session.title = title🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/routers/solver.py` around lines 115 - 116, The update currently
skips any falsy title so sending {"title": ""} silently no-ops; change the check
to "if data.title is not None" in the function that updates the session (the
block assigning session.title = data.title) so explicit null means "no change"
and empty string is handled; additionally validate and reject blank titles
(e.g., if data.title == "" raise a 400/BadRequest with a clear message) before
assigning to session.title to ensure deterministic behavior.
| const chunk = decoder.decode(value); | ||
| const lines = chunk.split('\n'); | ||
|
|
||
| for (const line of lines) { | ||
| if (line.startsWith('data: ')) { | ||
| try { | ||
| const data = JSON.parse(line.slice(6)); |
There was a problem hiding this comment.
SSE chunk parsing is not frame-safe and can drop streamed content.
Line 119 assumes each read contains full newline-delimited events; partial JSON across chunks is discarded in the catch block on Line 138, causing missing tokens.
Suggested fix
- const decoder = new TextDecoder();
+ const decoder = new TextDecoder();
+ let sseBuffer = '';
...
- const chunk = decoder.decode(value);
- const lines = chunk.split('\n');
+ sseBuffer += decoder.decode(value, { stream: true });
+ const lines = sseBuffer.split('\n');
+ sseBuffer = lines.pop() ?? '';
...
- if (line.startsWith('data: ')) {
+ if (line.startsWith('data: ')) {
try {
const data = JSON.parse(line.slice(6));Also applies to: 138-139
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@frontend/src/pages/Solver.tsx` around lines 118 - 124, The SSE parsing must
buffer partial chunks instead of assuming each read contains full
newline-delimited events: introduce a persistent buffer (e.g., sseBuffer)
outside the read loop, append decoder.decode(value) to it, split sseBuffer by
'\n', process all lines except the final fragment (keep that final fragment in
sseBuffer for the next read), and only attempt JSON.parse on complete lines that
start with 'data: '; on parse errors, only suppress if the line is the trailing
fragment (i.e., incomplete), otherwise log/handle the error; also ensure you
treat "data: [DONE]" as a terminal event. Reference symbols to update: decoder,
value, chunk, lines, line, and the read loop handling in Solver.tsx.
| setMessages(prev => { | ||
| const newMessages = [...prev]; | ||
| newMessages[newMessages.length - 1].content = accumulatedResponse; | ||
| return newMessages; |
There was a problem hiding this comment.
Guard streamed message update before mutating the last item.
Line 129 assumes the last message always exists; if messages are replaced mid-stream (e.g., session switch/history sync), this can throw at runtime.
Suggested fix
setMessages(prev => {
const newMessages = [...prev];
- newMessages[newMessages.length - 1].content = accumulatedResponse;
+ const last = newMessages[newMessages.length - 1];
+ if (!last || last.role !== 'assistant') return prev;
+ last.content = accumulatedResponse;
return newMessages;
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| setMessages(prev => { | |
| const newMessages = [...prev]; | |
| newMessages[newMessages.length - 1].content = accumulatedResponse; | |
| return newMessages; | |
| setMessages(prev => { | |
| const newMessages = [...prev]; | |
| const last = newMessages[newMessages.length - 1]; | |
| if (!last || last.role !== 'assistant') return prev; | |
| last.content = accumulatedResponse; | |
| return newMessages; | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@frontend/src/pages/Solver.tsx` around lines 127 - 130, The update in
setMessages mutates newMessages[newMessages.length - 1] without ensuring there
is a last element, which can throw if messages were replaced mid-stream; update
the callback in setMessages (the state updater used when streaming responses in
Solver.tsx) to first check prev.length > 0 (or find the last message safely)
before mutating, and if no last message exists, append a new message object or
return prev unchanged so the stream update becomes a no-op; ensure you reference
setMessages and the streaming update logic when applying this guard.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 7
♻️ Duplicate comments (2)
frontend/src/pages/Solver.tsx (2)
129-132:⚠️ Potential issue | 🟠 Major | ⚡ Quick winGuard the streamed assistant update before mutating the tail message.
This still assumes the last entry exists and is the placeholder assistant message. If history sync or a session change replaces
messagesmid-stream,newMessages[newMessages.length - 1]can beundefinedand throw here.Suggested fix
setMessages(prev => { const newMessages = [...prev]; - newMessages[newMessages.length - 1].content = accumulatedResponse; + const last = newMessages[newMessages.length - 1]; + if (!last || last.role !== 'assistant') return prev; + last.content = accumulatedResponse; return newMessages; });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/pages/Solver.tsx` around lines 129 - 132, The update to the streamed assistant message assumes a tail message exists and is the assistant placeholder; change the setMessages callback (the block that creates newMessages and assigns newMessages[newMessages.length - 1].content = accumulatedResponse) to first guard that newMessages.length > 0 and that the last message is the assistant placeholder (e.g., check lastMessage.role === 'assistant' or a specific placeholder id); only mutate content when that guard passes, otherwise either append a new assistant message with accumulatedResponse or skip the mutation to avoid accessing undefined. Ensure you reference the same symbols (setMessages, newMessages, accumulatedResponse) when making the check and update.
120-140:⚠️ Potential issue | 🟠 Major | ⚡ Quick winBuffer SSE frames across reads instead of parsing each chunk in isolation.
This is still not frame-safe. A JSON event or UTF-8 sequence split across
reader.read()calls gets dropped by the catch block, which can silently lose streamed tokens. Keep a persistent buffer, decode with{ stream: true }, and handledata: [DONE]explicitly.Suggested fix
const reader = response.body?.getReader(); const decoder = new TextDecoder(); let accumulatedResponse = ''; + let sseBuffer = ''; if (reader) { while (true) { const { done, value } = await reader.read(); if (done) break; - const chunk = decoder.decode(value); - const lines = chunk.split('\n'); + sseBuffer += decoder.decode(value, { stream: true }); + const lines = sseBuffer.split('\n'); + sseBuffer = lines.pop() ?? ''; for (const line of lines) { + if (line === 'data: [DONE]') break; if (line.startsWith('data: ')) { try { const data = JSON.parse(line.slice(6)); ... - } catch (e) { /* ignore partial */ } + } catch (e) { + console.error('Failed to parse SSE frame', line, e); + } } } } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/pages/Solver.tsx` around lines 120 - 140, The stream parsing currently decodes each chunk in isolation (decoder.decode(value)) and swallows partial JSON/UTF-8 fragments in the catch, causing dropped tokens; instead introduce a persistent string buffer (e.g., streamBuffer) that appends decoder.decode(value, {stream: true}) on each reader.read() and only split by '\n' to parse complete "data: ..." lines, leaving any trailing partial line in streamBuffer for the next read; ensure you explicitly handle the "data: [DONE]" event, and continue to update accumulatedResponse, setMessages (updating the last message's content), setActiveSessionId and queryClient.invalidateQueries when complete events arrive, and surface data.error via toast.error rather than discarding parse errors.
🧹 Nitpick comments (2)
backend/app/routers/resources.py (1)
60-66: ⚡ Quick winChain the exception to preserve the original traceback.
When catching a generic exception and re-raising a different one, use
raise ... from e(to show cause) orraise ... from None(to suppress chain) so the original context isn't lost during debugging.♻️ Proposed fix
except HTTPException: raise except Exception as e: raise HTTPException( status_code=status.HTTP_400_BAD_REQUEST, - detail=f"Invalid or corrupted PDF file: {str(e)}" - ) + detail=f"Invalid or corrupted PDF file: {e!s}" + ) from None🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routers/resources.py` around lines 60 - 66, The except block that catches Exception and re-raises HTTPException is losing the original traceback; update the re-raise to chain the original exception by using "raise HTTPException(... ) from e" (keep the same status_code=status.HTTP_400_BAD_REQUEST and detail formatting) so the original exception context is preserved for debugging (refer to the except Exception as e handler and the HTTPException being raised).backend/app/workers/tasks.py (1)
135-135: 💤 Low valueRemove extraneous
fprefix.This string has no placeholders.
🧹 Proposed fix
- print(f"💾 [STEP 3/3] Saving extracted text to DB...") + print("💾 [STEP 3/3] Saving extracted text to DB...")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/workers/tasks.py` at line 135, The print statement currently uses an unnecessary f-string prefix: change the print call printing "💾 [STEP 3/3] Saving extracted text to DB..." by removing the leading "f" so it is a plain string; locate and update the print(...) invocation in backend/app/workers/tasks.py (the line that prints "💾 [STEP 3/3] Saving extracted text to DB...") to eliminate the extraneous f prefix.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/app/llm/client.py`:
- Around line 40-46: The current try/except around json.loads(data_str) and
extracting chunk silently swallows all exceptions; change it to only ignore JSON
parse errors (catch json.JSONDecodeError and continue) but let schema-related
problems bubble up (do not catch KeyError, TypeError, IndexError) so malformed
frames that violate expected structure (accessing
data['choices'][0]['delta']['content']) fail the answer flow; keep the existing
behavior of yielding non-empty chunk values and consider logging the schema
error before re-raising to aid debugging.
In `@backend/app/routers/solver.py`:
- Around line 138-147: The current DB query in the solver route
(select(Resource).where(...)) allows a subset match so missing or unauthorized
resource_ids in data.resource_ids silently pass; change the guard to verify that
all requested IDs were returned by comparing the set/length of resources to
data.resource_ids (e.g., build a set of returned Resource.id from resources and
ensure it equals the set of data.resource_ids) and if any are missing raise
HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="No valid
resources selected." or a clearer message listing missing ids); update the check
that now uses resources to compute the missing_ids and reject the request if
missing_ids is non-empty so the request never runs with fewer sources than
requested.
- Around line 160-195: When creating a new Question/Answer you must also "touch"
the parent ChatSession so list_sessions() (which sorts by
ChatSession.updated_at) reflects activity; update the ChatSession.updated_at for
both branches: if you auto-create a new ChatSession (new_sess) set its
updated_at to the current UTC timestamp, and if using an existing session
(session_id from data.session_id) load the ChatSession (ChatSession) and assign
its updated_at to now (or execute an UPDATE on ChatSession where id ==
session_id) before committing; make this change after adding Question/Answer and
before await db.commit() so the session timestamp is persisted.
- Around line 203-226: The event_generator coroutine must explicitly handle
asyncio.CancelledError so Answer rows don't remain in "generating" when the
client disconnects: add an except asyncio.CancelledError block before the
generic except Exception in event_generator, open an async DB session (using
SessionLocal), look up the Answer by question_id (Answer.question_id ==
new_question.id) and set answer_rec.status = "failed" and commit, then re-raise
the CancelledError to allow normal cancellation propagation; keep the existing
generic exception handler for other errors.
In `@backend/app/workers/tasks.py`:
- Around line 139-146: The text-file download path currently uses
httpx.AsyncClient().get without timeout or error handling; update the block
handling resource.filename.endswith('.txt') to call
client.get(resource.file_url, timeout=45.0), call response.raise_for_status()
(or check response.status_code) before assigning resource.extracted_text, and
wrap the request in a try/except to set resource.status to an error value (and
clear/annotate extracted_text) on failure; ensure db.commit() is still called
(or committed in finally) and only set resource.status = "ready" after a
successful download/validation so the code using resource.extracted_text sees a
valid result.
In `@frontend/src/pages/Solver.tsx`:
- Around line 225-235: The icon-only buttons (e.g., the Rename button calling
startRenamingSession(sess.id, sess.title) with Edit2 and the Delete button
calling deleteSessionMutation.mutate(sess.id) with Trash2) need accessible
names: add aria-label attributes (or include visually-hidden text) to each
icon-only control (rename, delete, send, open-link) so screen readers can
announce them; update the button elements where these icons are used (also the
other occurrences around the render blocks you flagged) to include descriptive
aria-label values like "Rename chat", "Delete chat", "Send message", "Open link"
while keeping existing handlers (e.stopPropagation() and confirm logic) intact.
- Line 176: The session-changing actions (createSessionMutation.mutate(), select
session handler, delete session handler) must be blocked or cancel the active
stream when isStreaming is true to avoid appending chunks to the wrong
transcript; update each UI trigger (the create button, session select, and
delete handlers referenced around the createSessionMutation.mutate() call and
lines ~213 and ~232) to first check isStreaming and either return early (disable
the action) or cancel the active reader/stream (call reader.cancel() or the
component's stopStreaming routine and await it) before proceeding, and ensure
any UI buttons use the isStreaming flag to render disabled while streaming.
---
Duplicate comments:
In `@frontend/src/pages/Solver.tsx`:
- Around line 129-132: The update to the streamed assistant message assumes a
tail message exists and is the assistant placeholder; change the setMessages
callback (the block that creates newMessages and assigns
newMessages[newMessages.length - 1].content = accumulatedResponse) to first
guard that newMessages.length > 0 and that the last message is the assistant
placeholder (e.g., check lastMessage.role === 'assistant' or a specific
placeholder id); only mutate content when that guard passes, otherwise either
append a new assistant message with accumulatedResponse or skip the mutation to
avoid accessing undefined. Ensure you reference the same symbols (setMessages,
newMessages, accumulatedResponse) when making the check and update.
- Around line 120-140: The stream parsing currently decodes each chunk in
isolation (decoder.decode(value)) and swallows partial JSON/UTF-8 fragments in
the catch, causing dropped tokens; instead introduce a persistent string buffer
(e.g., streamBuffer) that appends decoder.decode(value, {stream: true}) on each
reader.read() and only split by '\n' to parse complete "data: ..." lines,
leaving any trailing partial line in streamBuffer for the next read; ensure you
explicitly handle the "data: [DONE]" event, and continue to update
accumulatedResponse, setMessages (updating the last message's content),
setActiveSessionId and queryClient.invalidateQueries when complete events
arrive, and surface data.error via toast.error rather than discarding parse
errors.
---
Nitpick comments:
In `@backend/app/routers/resources.py`:
- Around line 60-66: The except block that catches Exception and re-raises
HTTPException is losing the original traceback; update the re-raise to chain the
original exception by using "raise HTTPException(... ) from e" (keep the same
status_code=status.HTTP_400_BAD_REQUEST and detail formatting) so the original
exception context is preserved for debugging (refer to the except Exception as e
handler and the HTTPException being raised).
In `@backend/app/workers/tasks.py`:
- Line 135: The print statement currently uses an unnecessary f-string prefix:
change the print call printing "💾 [STEP 3/3] Saving extracted text to DB..." by
removing the leading "f" so it is a plain string; locate and update the
print(...) invocation in backend/app/workers/tasks.py (the line that prints "💾
[STEP 3/3] Saving extracted text to DB...") to eliminate the extraneous f
prefix.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 03e7d944-e699-40b6-a965-93450efa2ba0
📒 Files selected for processing (8)
backend/alembic/versions/d3dcf6b6fe3f_add_chat_sessions_and_processing_.pybackend/app/llm/client.pybackend/app/models/chat_session.pybackend/app/routers/resources.pybackend/app/routers/solver.pybackend/app/workers/tasks.pyfrontend/src/api/solver.tsfrontend/src/pages/Solver.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/alembic/versions/d3dcf6b6fe3f_add_chat_sessions_and_processing_.py
- frontend/src/api/solver.ts
| try: | ||
| data = json.loads(data_str) | ||
| chunk = data['choices'][0]['delta'].get('content', "") | ||
| if chunk: | ||
| yield chunk | ||
| except Exception: | ||
| continue |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
git ls-files | grep -E "(client\.py|solver\.py)" | head -20Repository: shubhamxdd/pyq-appl
Length of output: 119
🏁 Script executed:
wc -l backend/app/llm/client.pyRepository: shubhamxdd/pyq-appl
Length of output: 92
🏁 Script executed:
cat -n backend/app/llm/client.py | sed -n '30,60p'Repository: shubhamxdd/pyq-appl
Length of output: 928
🏁 Script executed:
cat -n backend/app/llm/client.py | head -80Repository: shubhamxdd/pyq-appl
Length of output: 2237
🏁 Script executed:
rg "class.*Client" backend/app/llm/client.py -A 5Repository: shubhamxdd/pyq-appl
Length of output: 350
🏁 Script executed:
rg "stream_chat" backend/ -B 3 -A 15Repository: shubhamxdd/pyq-appl
Length of output: 2865
🏁 Script executed:
rg "stream_chat" backend/ -B 2 -A 10Repository: shubhamxdd/pyq-appl
Length of output: 1942
🏁 Script executed:
fd "test" backend/ -type f | head -20Repository: shubhamxdd/pyq-appl
Length of output: 234
🏁 Script executed:
find backend -name "*test*" -type f 2>/dev/nullRepository: shubhamxdd/pyq-appl
Length of output: 45
🏁 Script executed:
cat -n backend/app/routers/solver.py | sed -n '60,120p'Repository: shubhamxdd/pyq-appl
Length of output: 2465
🏁 Script executed:
rg "def event_generator" backend/app/routers/solver.py -A 40Repository: shubhamxdd/pyq-appl
Length of output: 1304
Narrow exception handling to distinguish protocol errors from schema mismatches.
The bare except Exception: continue silently drops malformed frames, allowing truncated responses to be marked status="done" upstream. If a frame fails to parse, it's lost silently rather than surfacing the error. Distinguish JSON parse failures (which should be skipped) from unexpected schema structure (which should fail the answer):
- try:
- data = json.loads(data_str)
- chunk = data['choices'][0]['delta'].get('content', "")
- if chunk:
- yield chunk
- except Exception:
- continue
+ try:
+ data = json.loads(data_str)
+ except json.JSONDecodeError:
+ continue
+
+ if "error" in data:
+ raise RuntimeError(f"OpenRouter stream error: {data['error']}")
+
+ try:
+ chunk = data["choices"][0]["delta"].get("content", "")
+ except (KeyError, IndexError, TypeError) as exc:
+ raise RuntimeError(f"Unexpected stream chunk: {data}") from exc
+
+ if chunk:
+ yield chunk📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try: | |
| data = json.loads(data_str) | |
| chunk = data['choices'][0]['delta'].get('content', "") | |
| if chunk: | |
| yield chunk | |
| except Exception: | |
| continue | |
| try: | |
| data = json.loads(data_str) | |
| except json.JSONDecodeError: | |
| continue | |
| if "error" in data: | |
| raise RuntimeError(f"OpenRouter stream error: {data['error']}") | |
| try: | |
| chunk = data["choices"][0]["delta"].get("content", "") | |
| except (KeyError, IndexError, TypeError) as exc: | |
| raise RuntimeError(f"Unexpected stream chunk: {data}") from exc | |
| if chunk: | |
| yield chunk |
🧰 Tools
🪛 Ruff (0.15.12)
[error] 45-46: try-except-continue detected, consider logging the exception
(S112)
[warning] 45-45: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/llm/client.py` around lines 40 - 46, The current try/except
around json.loads(data_str) and extracting chunk silently swallows all
exceptions; change it to only ignore JSON parse errors (catch
json.JSONDecodeError and continue) but let schema-related problems bubble up (do
not catch KeyError, TypeError, IndexError) so malformed frames that violate
expected structure (accessing data['choices'][0]['delta']['content']) fail the
answer flow; keep the existing behavior of yielding non-empty chunk values and
consider logging the schema error before re-raising to aid debugging.
| result = await db.execute( | ||
| select(Resource).where(Resource.id.in_(data.resource_ids), Resource.user_id == current_user.id) | ||
| ) | ||
| resources = result.scalars().all() | ||
|
|
||
| if not resources: | ||
| raise HTTPException( | ||
| status_code=status.HTTP_400_BAD_REQUEST, | ||
| detail="No valid resources selected." | ||
| ) |
There was a problem hiding this comment.
Fail when any selected resource is missing.
This query accepts any owned subset of data.resource_ids. If one selected document is missing or belongs to another user, the request still runs with incomplete context and the model answers from fewer sources than the client asked for.
Suggested guard
resources = result.scalars().all()
+ found_resource_ids = {resource.id for resource in resources}
+ if found_resource_ids != set(data.resource_ids):
+ raise HTTPException(
+ status_code=status.HTTP_400_BAD_REQUEST,
+ detail="One or more selected resources are unavailable."
+ )
if not resources:
raise HTTPException(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/routers/solver.py` around lines 138 - 147, The current DB query
in the solver route (select(Resource).where(...)) allows a subset match so
missing or unauthorized resource_ids in data.resource_ids silently pass; change
the guard to verify that all requested IDs were returned by comparing the
set/length of resources to data.resource_ids (e.g., build a set of returned
Resource.id from resources and ensure it equals the set of data.resource_ids)
and if any are missing raise
HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="No valid
resources selected." or a clearer message listing missing ids); update the check
that now uses resources to compute the missing_ids and reject the request if
missing_ids is non-empty so the request never runs with fewer sources than
requested.
| # 3. Handle Session | ||
| if data.session_id: | ||
| # Verify ownership of existing session | ||
| sess_result = await db.execute( | ||
| select(ChatSession).where(ChatSession.id == data.session_id, ChatSession.user_id == current_user.id) | ||
| ) | ||
| if not sess_result.scalar_one_or_none(): | ||
| raise HTTPException(status_code=404, detail="Chat session not found") | ||
| session_id = data.session_id | ||
| else: | ||
| # Auto-create session if none provided | ||
| new_sess = ChatSession(user_id=current_user.id, title=data.content[:30] + "...") | ||
| db.add(new_sess) | ||
| await db.flush() | ||
| session_id = new_sess.id | ||
|
|
||
| # 4. Create Question and Answer records | ||
| new_question = Question( | ||
| user_id=current_user.id, | ||
| session_id=session_id, | ||
| content=data.content, | ||
| delivery_mode=data.delivery_mode | ||
| ) | ||
| db.add(new_question) | ||
| await db.flush() | ||
|
|
||
| new_answer = Answer( | ||
| question_id=new_question.id, | ||
| content="", | ||
| status="generating" | ||
| ) | ||
| db.add(new_answer) | ||
|
|
||
| # Update quota | ||
| current_user.questions_used += 1 | ||
| await db.commit() |
There was a problem hiding this comment.
Touch the session when a new question is added.
list_sessions() sorts by ChatSession.updated_at, but this handler only inserts Question/Answer rows. Existing sessions never get their timestamp bumped, so active chats stop floating to the top after the first message.
Suggested fix
+from datetime import datetime
@@
if data.session_id:
# Verify ownership of existing session
sess_result = await db.execute(
select(ChatSession).where(ChatSession.id == data.session_id, ChatSession.user_id == current_user.id)
)
- if not sess_result.scalar_one_or_none():
+ session = sess_result.scalar_one_or_none()
+ if not session:
raise HTTPException(status_code=404, detail="Chat session not found")
- session_id = data.session_id
+ session.updated_at = datetime.utcnow()
+ session_id = session.id📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # 3. Handle Session | |
| if data.session_id: | |
| # Verify ownership of existing session | |
| sess_result = await db.execute( | |
| select(ChatSession).where(ChatSession.id == data.session_id, ChatSession.user_id == current_user.id) | |
| ) | |
| if not sess_result.scalar_one_or_none(): | |
| raise HTTPException(status_code=404, detail="Chat session not found") | |
| session_id = data.session_id | |
| else: | |
| # Auto-create session if none provided | |
| new_sess = ChatSession(user_id=current_user.id, title=data.content[:30] + "...") | |
| db.add(new_sess) | |
| await db.flush() | |
| session_id = new_sess.id | |
| # 4. Create Question and Answer records | |
| new_question = Question( | |
| user_id=current_user.id, | |
| session_id=session_id, | |
| content=data.content, | |
| delivery_mode=data.delivery_mode | |
| ) | |
| db.add(new_question) | |
| await db.flush() | |
| new_answer = Answer( | |
| question_id=new_question.id, | |
| content="", | |
| status="generating" | |
| ) | |
| db.add(new_answer) | |
| # Update quota | |
| current_user.questions_used += 1 | |
| await db.commit() | |
| # 3. Handle Session | |
| if data.session_id: | |
| # Verify ownership of existing session | |
| sess_result = await db.execute( | |
| select(ChatSession).where(ChatSession.id == data.session_id, ChatSession.user_id == current_user.id) | |
| ) | |
| session = sess_result.scalar_one_or_none() | |
| if not session: | |
| raise HTTPException(status_code=404, detail="Chat session not found") | |
| session.updated_at = datetime.utcnow() | |
| session_id = session.id | |
| else: | |
| # Auto-create session if none provided | |
| new_sess = ChatSession(user_id=current_user.id, title=data.content[:30] + "...") | |
| db.add(new_sess) | |
| await db.flush() | |
| session_id = new_sess.id | |
| # 4. Create Question and Answer records | |
| new_question = Question( | |
| user_id=current_user.id, | |
| session_id=session_id, | |
| content=data.content, | |
| delivery_mode=data.delivery_mode | |
| ) | |
| db.add(new_question) | |
| await db.flush() | |
| new_answer = Answer( | |
| question_id=new_question.id, | |
| content="", | |
| status="generating" | |
| ) | |
| db.add(new_answer) | |
| # Update quota | |
| current_user.questions_used += 1 | |
| await db.commit() |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/routers/solver.py` around lines 160 - 195, When creating a new
Question/Answer you must also "touch" the parent ChatSession so list_sessions()
(which sorts by ChatSession.updated_at) reflects activity; update the
ChatSession.updated_at for both branches: if you auto-create a new ChatSession
(new_sess) set its updated_at to the current UTC timestamp, and if using an
existing session (session_id from data.session_id) load the ChatSession
(ChatSession) and assign its updated_at to now (or execute an UPDATE on
ChatSession where id == session_id) before committing; make this change after
adding Question/Answer and before await db.commit() so the session timestamp is
persisted.
| async def event_generator(): | ||
| full_response = "" | ||
| try: | ||
| async for chunk in open_router_client.stream_chat(messages): | ||
| full_response += chunk | ||
| yield f"data: {json.dumps({'chunk': chunk, 'session_id': str(session_id)})}\n\n" | ||
|
|
||
| # Finalize Answer in DB | ||
| async with SessionLocal() as async_db: | ||
| result = await async_db.execute(select(Answer).where(Answer.question_id == new_question.id)) | ||
| answer_rec = result.scalar_one_or_none() | ||
| if answer_rec: | ||
| answer_rec.content = full_response | ||
| answer_rec.status = "done" | ||
| await async_db.commit() | ||
| except Exception as e: | ||
| yield f"data: {json.dumps({'error': str(e)})}\n\n" | ||
| async with SessionLocal() as async_db: | ||
| result = await async_db.execute(select(Answer).where(Answer.question_id == new_question.id)) | ||
| answer_rec = result.scalar_one_or_none() | ||
| if answer_rec: | ||
| answer_rec.status = "failed" | ||
| await async_db.commit() | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
sed -n '203,226p' backend/app/routers/solver.py
python - <<'PY'
import asyncio
print(asyncio.CancelledError.__mro__)
PYRepository: shubhamxdd/pyq-appl
Length of output: 1313
🏁 Script executed:
# Check if asyncio is imported in the file
rg -A 20 "^import|^from" backend/app/routers/solver.py | head -30Repository: shubhamxdd/pyq-appl
Length of output: 1158
🏁 Script executed:
# Find the Answer model definition
fd -t f "*.py" | xargs rg -l "class Answer" | head -5Repository: shubhamxdd/pyq-appl
Length of output: 394
🏁 Script executed:
# Read the Answer model
cat backend/app/models/answer.pyRepository: shubhamxdd/pyq-appl
Length of output: 871
🏁 Script executed:
# Check if asyncio has been added to imports
rg "import asyncio" backend/app/routers/solver.pyRepository: shubhamxdd/pyq-appl
Length of output: 45
🏁 Script executed:
# Get broader context around the event_generator function
sed -n '1,20p' backend/app/routers/solver.pyRepository: shubhamxdd/pyq-appl
Length of output: 872
🏁 Script executed:
# Find where Answer is created in the event_generator context
sed -n '180,230p' backend/app/routers/solver.pyRepository: shubhamxdd/pyq-appl
Length of output: 1869
Add explicit handler for asyncio.CancelledError to prevent answers remaining stuck in generating status.
When a client disconnects or the stream task is cancelled, asyncio.CancelledError is raised, which inherits from BaseException and bypasses the except Exception block. This leaves the persisted Answer row stuck in generating status indefinitely.
Add a dedicated handler for asyncio.CancelledError before the generic exception handler:
Suggested fix
+import asyncio
+
async def event_generator():
full_response = ""
try:
async for chunk in open_router_client.stream_chat(messages):
full_response += chunk
yield f"data: {json.dumps({'chunk': chunk, 'session_id': str(session_id)})}\n\n"
# Finalize Answer in DB
async with SessionLocal() as async_db:
result = await async_db.execute(select(Answer).where(Answer.question_id == new_question.id))
answer_rec = result.scalar_one_or_none()
if answer_rec:
answer_rec.content = full_response
answer_rec.status = "done"
await async_db.commit()
+ except asyncio.CancelledError:
+ async with SessionLocal() as async_db:
+ result = await async_db.execute(select(Answer).where(Answer.question_id == new_question.id))
+ answer_rec = result.scalar_one_or_none()
+ if answer_rec:
+ answer_rec.status = "failed"
+ await async_db.commit()
+ raise
except Exception as e:
yield f"data: {json.dumps({'error': str(e)})}\n\n"
async with SessionLocal() as async_db:
result = await async_db.execute(select(Answer).where(Answer.question_id == new_question.id))
answer_rec = result.scalar_one_or_none()
if answer_rec:
answer_rec.status = "failed"
await async_db.commit()🧰 Tools
🪛 Ruff (0.15.12)
[warning] 218-218: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/routers/solver.py` around lines 203 - 226, The event_generator
coroutine must explicitly handle asyncio.CancelledError so Answer rows don't
remain in "generating" when the client disconnects: add an except
asyncio.CancelledError block before the generic except Exception in
event_generator, open an async DB session (using SessionLocal), look up the
Answer by question_id (Answer.question_id == new_question.id) and set
answer_rec.status = "failed" and commit, then re-raise the CancelledError to
allow normal cancellation propagation; keep the existing generic exception
handler for other errors.
| elif resource.filename.lower().endswith('.txt'): | ||
| print("DEBUG: Processing text file...") | ||
| print("📝 [TEXT] Extracting plain text content...") | ||
| async with httpx.AsyncClient() as client: | ||
| response = await client.get(resource.file_url) | ||
| resource.extracted_text = response.text | ||
| resource.status = "ready" | ||
| print("DEBUG: Text extraction complete.") | ||
| await db.commit() | ||
| print("🏁 [TASK COMPLETE] Text file is READY.\n") |
There was a problem hiding this comment.
Text file download lacks timeout and error handling.
Unlike the PDF download (line 46, timeout=45.0) and status check (line 48), the text file path has no timeout and doesn't verify the response succeeded. A slow or failed download will hang indefinitely or silently store error HTML as extracted text.
🛡️ Proposed fix
elif resource.filename.lower().endswith('.txt'):
print("📝 [TEXT] Extracting plain text content...")
- async with httpx.AsyncClient() as client:
+ async with httpx.AsyncClient(timeout=45.0) as client:
response = await client.get(resource.file_url)
+ if response.status_code != 200:
+ raise Exception(f"Storage download failed: HTTP {response.status_code}")
resource.extracted_text = response.text
resource.status = "ready"
await db.commit()
print("🏁 [TASK COMPLETE] Text file is READY.\n")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/workers/tasks.py` around lines 139 - 146, The text-file download
path currently uses httpx.AsyncClient().get without timeout or error handling;
update the block handling resource.filename.endswith('.txt') to call
client.get(resource.file_url, timeout=45.0), call response.raise_for_status()
(or check response.status_code) before assigning resource.extracted_text, and
wrap the request in a try/except to set resource.status to an error value (and
clear/annotate extracted_text) on failure; ensure db.commit() is still called
(or committed in finally) and only set resource.status = "ready" after a
successful download/validation so the code using resource.extracted_text sees a
valid result.
| <div className="w-72 flex flex-col bg-white dark:bg-gray-800 border-r border-gray-200 dark:border-gray-700"> | ||
| <div className="p-4 border-b border-gray-100 dark:border-gray-700"> | ||
| <button | ||
| onClick={() => createSessionMutation.mutate()} |
There was a problem hiding this comment.
Lock session-changing actions while a response is streaming.
The read loop keeps appending into shared messages, so creating/selecting/deleting sessions mid-stream can send the remaining chunks into the wrong transcript or clear the active one. Disable these actions while isStreaming is true, or cancel the active reader before allowing a session change.
Suggested fix
<button
onClick={() => createSessionMutation.mutate()}
+ disabled={isStreaming}
className="w-full flex items-center justify-center gap-2 bg-blue-600 hover:bg-blue-700 text-white py-2.5 rounded-lg text-sm font-semibold transition-colors shadow-sm"
>
...
<button
onClick={() => setActiveSessionId(sess.id)}
+ disabled={isStreaming}
className={clsx(
"w-full text-left px-3 py-3 rounded-lg text-sm transition-all flex items-start gap-3",
...
<button
onClick={(e) => { e.stopPropagation(); if(window.confirm('Delete chat?')) deleteSessionMutation.mutate(sess.id); }}
+ disabled={isStreaming}
className="p-1.5 text-gray-400 hover:text-red-500"
>Also applies to: 213-213, 232-232
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@frontend/src/pages/Solver.tsx` at line 176, The session-changing actions
(createSessionMutation.mutate(), select session handler, delete session handler)
must be blocked or cancel the active stream when isStreaming is true to avoid
appending chunks to the wrong transcript; update each UI trigger (the create
button, session select, and delete handlers referenced around the
createSessionMutation.mutate() call and lines ~213 and ~232) to first check
isStreaming and either return early (disable the action) or cancel the active
reader/stream (call reader.cancel() or the component's stopStreaming routine and
await it) before proceeding, and ensure any UI buttons use the isStreaming flag
to render disabled while streaming.
| <button | ||
| onClick={() => startRenamingSession(sess.id, sess.title)} | ||
| className="p-1.5 text-gray-400 hover:text-blue-500" | ||
| > | ||
| <Edit2 className="w-3.5 h-3.5" /> | ||
| </button> | ||
| <button | ||
| onClick={(e) => { e.stopPropagation(); if(window.confirm('Delete chat?')) deleteSessionMutation.mutate(sess.id); }} | ||
| className="p-1.5 text-gray-400 hover:text-red-500" | ||
| > | ||
| <Trash2 className="w-3.5 h-3.5" /> |
There was a problem hiding this comment.
Add accessible names to the icon-only controls.
Rename, delete, send, and open-link actions currently render only icons, so assistive tech has no reliable label for them. Add aria-label attributes so these controls remain usable with screen readers.
Suggested fix
<button
onClick={() => startRenamingSession(sess.id, sess.title)}
+ aria-label={`Rename chat ${sess.title}`}
className="p-1.5 text-gray-400 hover:text-blue-500"
>
...
<button
onClick={(e) => { e.stopPropagation(); if(window.confirm('Delete chat?')) deleteSessionMutation.mutate(sess.id); }}
+ aria-label={`Delete chat ${sess.title}`}
className="p-1.5 text-gray-400 hover:text-red-500"
>
...
<button
type="submit"
disabled={isStreaming || !question.trim() || selectedResources.length === 0}
+ aria-label="Send question"
className="absolute right-3.5 bottom-3.5 p-2.5 bg-blue-600 text-white rounded-xl hover:bg-blue-700 disabled:opacity-30 disabled:grayscale transition-all shadow-md active:scale-95"
>
...
<a
href={res.file_url}
target="_blank"
rel="noopener noreferrer"
+ aria-label={`Open ${res.filename} in a new tab`}
className="absolute right-2.5 top-1/2 -translate-y-1/2 p-1 text-gray-300 hover:text-blue-500 opacity-0 group-hover:opacity-100 transition-opacity"
>Also applies to: 327-332, 371-378
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@frontend/src/pages/Solver.tsx` around lines 225 - 235, The icon-only buttons
(e.g., the Rename button calling startRenamingSession(sess.id, sess.title) with
Edit2 and the Delete button calling deleteSessionMutation.mutate(sess.id) with
Trash2) need accessible names: add aria-label attributes (or include
visually-hidden text) to each icon-only control (rename, delete, send,
open-link) so screen readers can announce them; update the button elements where
these icons are used (also the other occurrences around the render blocks you
flagged) to include descriptive aria-label values like "Rename chat", "Delete
chat", "Send message", "Open link" while keeping existing handlers
(e.stopPropagation() and confirm logic) intact.
|
@CodeRabbit its ready to be merged on main? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainLet me look at the key files before giving you a proper answer. 🏁 Script executed: Length of output: 23138
❌ Blocking Issue1. Hardcoded localhost URL in
|
| # | File | Issue |
|---|---|---|
| 5 | backend/app/models/chat_session.py |
datetime.utcnow is deprecated in Python 3.12+ — use datetime.now(timezone.utc) |
| 6 | backend/app/routers/solver.py |
data.content[:30] + "..." always appends "..." even for short content — use data.content[:30] + ("..." if len(data.content) > 30 else "") |
| 7 | backend/app/routers/solver.py |
SSE stream never emits a terminal event: done signal; the frontend has to rely on connection close |
| 8 | backend/app/routers/solver.py |
Free-tier limit 10 is hardcoded — worth moving to a settings.FREE_TIER_QUESTION_LIMIT config value |
| 9 | General | Note that Owl Alpha logs prompts and completions, which may be used to improve the model — worth disclosing to users if academic content is sensitive |
Summary: Fix the hardcoded URL (#1) as a minimum before merging — that's a hard blocker. Issues #2 and #3 are data-integrity bugs worth addressing too. The rest can be follow-up issues if you're in a hurry.
Want me to open GitHub issues for any of these so they're tracked?
|
@CodeRabbit fixed #1 now ready for merge i think |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 9569
return fetch(`${API_URL}/solver/ask`, { ... });That unblocks any deployed environment. ✅ However, issues
|
|
@CodeRabbit yeah open issues for #2 and #3 for later fixing |
|
Review rate limit: 0/1 reviews remaining, refill in 38 minutes and 38 seconds. |
Summary by CodeRabbit
New Features
Improvements