Track social engagement for partner platforms #3542
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough

Adds a social-platforms adapter layer (X/YouTube/Instagram/TikTok), replaces legacy scrape-creators modules, introduces cron queueing and per-partner social engagement syncs, computes/persists partner baselines, and integrates bounty submission fraud detection with schema and DB fields.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Cron as Cron Scheduler (Vercel)
    participant QStash as Queue Manager (qstash)
    participant Server as App Server (withCron)
    participant DB as Prisma
    participant Adapter as Platform Adapter
    participant API as Social Media API

    Cron->>QStash: trigger /api/cron/queue-sync-social-engagement
    QStash->>Server: GET queue-sync-social-engagement
    Server->>DB: query eligible PartnerPlatform ids
    Server->>QStash: enqueue job per partnerPlatformId (sync-social-engagement)
    QStash->>Server: POST /api/cron/sync-social-engagement { partnerPlatformId }
    Server->>DB: load PartnerPlatform baseline & metadata
    Server->>Adapter: fetchPosts(platformId, startTime, endTime)
    Adapter->>API: request posts/metrics
    API-->>Adapter: posts + metrics
    Adapter->>Server: posts
    Server->>DB: upsert PartnerPlatformPost(s) and PartnerPlatformEngagement(s)
    Server->>DB: prune old data and update median/avg baselines
```
```mermaid
sequenceDiagram
    participant User as User/System
    participant Submission as Bounty Submission Flow
    participant DB as Prisma
    participant Baseline as PartnerPlatform Baseline
    participant Fraud as Fraud Detector

    User->>Submission: create or sync submission (may include socialMetricCount)
    Submission->>DB: read submission (includes partnerId)
    Submission->>Baseline: fetch partner baseline metrics
    Baseline-->>Submission: medianViews/likes/comments, avgEngagementRate, subscribers
    Submission->>Fraud: detectBountySubmissionFraud(socialMetricCount, bountyMetric, baseline)
    Fraud-->>Submission: { fraudRiskLevel, fraudFlags }
    Submission->>DB: update BountySubmission with fraud fields and status/metrics
```
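For orientation, here is a rough TypeScript sketch of the adapter contract the diagrams imply. The real interface lives in apps/web/lib/social-platforms/base-adapter.ts; the names and shapes below are inferred from the flow above, not copied from the code.

```ts
// Hypothetical adapter contract inferred from the sequence diagrams.
// Field names follow the PartnerPlatformPost columns discussed in this review.
interface PlatformPost {
  postId: string; // Tweet ID, YouTube video ID, etc.
  publishedAt: Date;
  title?: string;
  views: number;
  likes: number;
  comments: number;
  engagementRate: number;
}

interface PlatformAdapter {
  // Fetch posts with metrics published between startTime and endTime
  fetchPosts(
    platformId: string,
    startTime: Date,
    endTime: Date,
  ): Promise<PlatformPost[]>;
}
```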
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
- Remove debug console.log from X adapter fetchPosts
- Remove unused impliedViewsForMedianLikes variable in fraud detection
- Fix post pruning to use ID-based NOT IN instead of publishedAt cutoff
- Batch daily and post upserts with prisma.$transaction()
- Add FraudFlag union type and strengthen BountyFraudResult types
- Use camelCase for fraud flag values
- Add stack to withCron error serialization
- Remove .default([]) from fraudFlags schema (null = not run, [] = no flags)
Call detectBountySubmissionFraud wherever socialMetricCount is set: batch cron sync, manual single-submission sync, and partner submission creation.
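For context, a plausible shape for the `FraudFlag` union and result type mentioned above — only `engagementRateAnomaly` and `lowFollowerHighEngagement` are named elsewhere in this review, and the risk-level values are assumptions rather than the real definitions:

```ts
// Sketch only: flag names beyond the two quoted in this review, and the
// risk-level values, are assumptions rather than the actual definitions.
type FraudFlag = "engagementRateAnomaly" | "lowFollowerHighEngagement";

type BountyFraudResult = {
  fraudRiskLevel: "low" | "medium" | "high" | null;
  // null = detection not run; [] = detection ran and found no flags
  fraudFlags: FraudFlag[] | null;
};
```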
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/web/lib/cron/with-cron.ts (1)
59-78: ⚠️ Potential issue | 🟠 Major

Don't send serialized stacks back through `logAndRespond`.

`errorMessage` now includes `stack`, and the same value is returned on Line 78. That leaks internal traces to the caller and anything that stores response bodies. Keep the rich payload in server logs, but return a generic failure string from the wrapper.

🛡️ Proposed fix
```diff
-      const errorMessage =
-        error instanceof Error
-          ? JSON.stringify({
-              ...error,
-              name: error.name,
-              message: error.message,
-              stack: error.stack,
-            })
-          : String(error);
+      const publicErrorMessage = "Cron job failed during execution.";
+      const detailedErrorMessage =
+        error instanceof Error
+          ? JSON.stringify({
+              ...error,
+              name: error.name,
+              message: error.message,
+              stack: error.stack,
+            })
+          : String(error);

       // Send error to Axiom
-      logger.error(errorMessage, error);
+      logger.error(detailedErrorMessage, error);
       await logger.flush();

       await log({
-        message: `Cron job "${url.pathname}" failed during execution. Error: ${errorMessage}`,
+        message: `Cron job "${url.pathname}" failed during execution. Error: ${detailedErrorMessage}`,
         type: "errors",
       });

-      return logAndRespond(errorMessage, { status: 500 });
+      return logAndRespond(publicErrorMessage, { status: 500 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/lib/cron/with-cron.ts` around lines 59 - 78, The wrapper currently returns the serialized error (errorMessage) via logAndRespond, leaking stacks to callers; keep the detailed payload for server logs (logger.error and log({...}) which can retain errorMessage) but change the response to a generic failure string: call logAndRespond with a non-sensitive message like "Internal server error" (or similar) instead of errorMessage; keep logger.error(errorMessage, error) and the log(...) call as-is so server logging keeps full details, but ensure the value passed into logAndRespond is the generic message and not errorMessage (references: errorMessage, logger.error, log, logAndRespond, url.pathname).
🧹 Nitpick comments (5)
apps/web/.env.example (1)
172-173: Standardize the X/Twitter env naming and setup note.

This entry is easy to miss because the rest of the file still uses `TWITTER_*` for the same platform family. Mixing `TWITTER_*` and `X_*` in `.env.example` makes secret mapping and onboarding more error-prone. I'd either align the prefix convention or add a short note here that this is the X/Twitter app-only bearer token and where it should be created.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/.env.example` around lines 172 - 173, Rename and standardize the X/Twitter env var to match existing TWITTER_* conventions: replace X_API_BEARER_TOKEN with TWITTER_BEARER_TOKEN in the .env.example and update any code references that read X_API_BEARER_TOKEN to use TWITTER_BEARER_TOKEN (search for X_API_BEARER_TOKEN). Also add a short inline note next to TWITTER_BEARER_TOKEN indicating this is the app-only bearer token (where to create it in the Twitter/X developer portal) so onboarding and secret mapping are clear.
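If the rename lands, the `.env.example` entry might look like this (placement and comment wording are suggestions, not the actual file contents):

```bash
# X (Twitter) API — app-only bearer token, created in the X developer portal
TWITTER_BEARER_TOKEN=
```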
packages/prisma/schema/platform.prisma (1)

39-55: Add a composite index for per-platform post history reads.
`@@unique([partnerPlatformId, postId])` protects upserts, and `@@index(publishedAt)` helps global scans, but neither helps the likely "recent posts for one platform" read pattern. Add `@@index([partnerPlatformId, publishedAt])` before this table grows.

📈 Suggested index
```diff
 model PartnerPlatformPost {
   id                String   @id @default(cuid())
   partnerPlatformId String
   postId            String // Tweet ID, YouTube video ID, etc.
   publishedAt       DateTime
   title             String?  @db.Text // Video title (YT), tweet text preview (X)
   views             BigInt   @default(0) // impressions (X), views (YT)
   likes             BigInt   @default(0)
   comments          BigInt   @default(0) // replies (X), comments (YT)
   engagementRate    Float    @default(0)
   createdAt         DateTime @default(now())
   updatedAt         DateTime @updatedAt

   partnerPlatform PartnerPlatform @relation(fields: [partnerPlatformId], references: [id], onDelete: Cascade)

   @@unique([partnerPlatformId, postId])
   @@index(publishedAt)
+  @@index([partnerPlatformId, publishedAt])
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/prisma/schema/platform.prisma` around lines 39 - 55, The PartnerPlatformPost model lacks a composite index to optimize queries that fetch recent posts for a single platform; update the Prisma model PartnerPlatformPost by adding a composite index definition @@index([partnerPlatformId, publishedAt]) (placed alongside the existing @@unique and @@index declarations) so reads like "recent posts for one partnerPlatformId ordered by publishedAt" are served by an index.
apps/web/lib/social-platforms/scrape-creators.ts (1)

39-48: Validate `SCRAPECREATORS_API_KEY` up front.

The non-null assertion only silences TypeScript. If the env var is missing, this client is still created and every lookup fails later with a much less actionable upstream error.
Possible guard
```diff
+const scrapeCreatorsApiKey = process.env.SCRAPECREATORS_API_KEY;
+
+if (!scrapeCreatorsApiKey) {
+  throw new Error("SCRAPECREATORS_API_KEY is not configured");
+}
+
 export const scrapeCreatorsFetch = createFetch({
   baseURL: "https://api.scrapecreators.com",
@@
   headers: {
-    "x-api-key": process.env.SCRAPECREATORS_API_KEY!,
+    "x-api-key": scrapeCreatorsApiKey,
   },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/lib/social-platforms/scrape-creators.ts` around lines 39 - 48, The code creates scrapeCreatorsFetch with a non-null assertion on process.env.SCRAPECREATORS_API_KEY which hides missing-key errors; update the module to validate SCRAPECREATORS_API_KEY before calling createFetch (e.g., check process.env.SCRAPECREATORS_API_KEY at module init or inside the exported scrapeCreatorsFetch factory), and throw a clear Error or log and exit if it's missing so createFetch is never constructed with an undefined API key; reference the scrapeCreatorsFetch export and the createFetch invocation and use the SCRAPECREATORS_API_KEY env var name in the error message.
apps/web/lib/social-platforms/get-social-content.ts (1)

93-101: Remove implicit return from `forEach` callback.

The `delete` method returns a boolean, which `forEach` doesn't expect. Use a block body to avoid the implicit return.

Suggested fix
```diff
   // Remove tracking params
-  [
+  const trackingParams = [
     "utm_source",
     "utm_medium",
     "utm_campaign",
     "si",
     "feature",
     "igshid",
     "t",
-  ].forEach((p) => url.searchParams.delete(p));
+  ];
+
+  for (const p of trackingParams) {
+    url.searchParams.delete(p);
+  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/lib/social-platforms/get-social-content.ts` around lines 93 - 101, The forEach callback currently uses a concise arrow body that implicitly returns the boolean from url.searchParams.delete; change the callback to a block body so there is no implicit return (e.g., replace (p) => url.searchParams.delete(p) with (p) => { url.searchParams.delete(p); }) when iterating the array of params passed to Array.forEach; update the invocation that uses the array and forEach to use the block-bodied callback around url.searchParams.delete to avoid returning a value from the callback.
apps/web/lib/social-platforms/x-adapter.ts (1)

104-135: Bearer token accessed at module load time.

`X_API_BEARER_TOKEN` is read when the module loads. If the environment variable is unset or changes at runtime, this won't reflect it. Consider lazy evaluation or validation.

Suggested approach
If the token must be present at startup, add validation:
```ts
const X_API_BEARER_TOKEN = process.env.X_API_BEARER_TOKEN;

if (!X_API_BEARER_TOKEN) {
  throw new Error("X_API_BEARER_TOKEN environment variable is required");
}
```

Alternatively, if lazy evaluation is preferred, move the token access inside a function or use a getter.
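A minimal sketch of that lazy alternative, shown with plain `fetch` since whether the `createFetch` wrapper in use accepts per-request header functions is an open question:

```ts
// Read the token at call time so a missing or rotated X_API_BEARER_TOKEN
// surfaces as a clear error on the request path, not at import time.
function getXAuthHeader(): string {
  const token = process.env.X_API_BEARER_TOKEN;
  if (!token) {
    throw new Error("X_API_BEARER_TOKEN environment variable is required");
  }
  return `Bearer ${token}`;
}

async function xApiGet(path: string): Promise<unknown> {
  const res = await fetch(`https://api.x.com/2${path}`, {
    headers: { Authorization: getXAuthHeader() },
  });
  if (!res.ok) throw new Error(`X API request failed: ${res.status}`);
  return res.json();
}
```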
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/lib/social-platforms/x-adapter.ts` around lines 104 - 135, The Authorization header currently captures process.env.X_API_BEARER_TOKEN at module load inside the xFetch createFetch call, so changes or missing tokens at runtime aren't handled; either validate the env at startup by reading process.env.X_API_BEARER_TOKEN into a constant and throwing if falsy before constructing xFetch (so createFetch always has a valid token), or switch to lazy evaluation by replacing the static headers value with a function/getter that reads process.env.X_API_BEARER_TOKEN at request time (update the createFetch headers logic used by xFetch accordingly) and ensure onError/logging still works with the chosen approach.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/web/app/`(ee)/api/cron/bounties/sync-social-metrics/route.ts:
- Around line 106-121: The query that fetches partner platform medians
(prisma.partnerPlatform.findMany) must exclude unverified/ghosted platform rows
so detectBountySubmissionFraud() doesn't use stale baselines; add a where clause
checking verifiedAt is not null (or otherwise ensure platform is verified) when
querying partnerPlatforms and keep the same select fields (medianViews,
medianLikes, medianComments, medianEngagementRate, subscribers) so only
currently verified accounts contribute to fraud baselines.
In `@apps/web/app/`(ee)/api/cron/sync-social-engagement/route.ts:
- Around line 58-75: Replace the current binary logic that sets startTime based
on existingCount with a watermark-based approach: query
prisma.partnerPlatformEngagement for the latest sync timestamp for this
partnerPlatformId (e.g., max of syncedAt/updatedAt or latest row), fall back to
startOfDay(subDays(now, 30)) if none, then set startTime =
startOfDay(subDays(latestTimestamp, OVERLAP_DAYS)) where OVERLAP_DAYS is a small
refresh window (e.g., 1–3 days) to allow healing of missed runs, keep endTime =
todayStart, and pass those startTime/endTime into platform.fetchPosts; update
any variable names (existingCount, startTime) accordingly and ensure the
aggregate/lookup uses partnerPlatformId to locate the correct watermark.
In `@apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts`:
- Around line 86-123: The engagementRateAnomaly branch in
detect-bounty-submission-fraud.ts currently infers anomalous engagement from
view growth alone (using bountyMetric, socialMetricCount, partnerPlatform,
ENGAGEMENT_RATE_MULTIPLIER and flags.push("engagementRateAnomaly")), which can
mislabel viral posts; update the branch to require and use the submission's
current engagement counts (e.g., currentLikes/currentComments/currentEngagement)
to compute a current engagement rate and compare that rate against
partnerPlatform.medianEngagementRate before pushing the "engagementRateAnomaly"
flag, or remove the rule for the case when only views are provided so it only
runs when explicit engagement counts are present.
- Around line 61-74: The scoring code should treat incomplete platform baselines
as unknown: before computing median or using subscribers, verify the required
baseline fields are present for the given bountyMetric. Update the early-return
guard around partnerPlatform so that for bountyMetric === "views" you require
partnerPlatform.medianViews and partnerPlatform.subscribers to be non-null, and
for bountyMetric === "likes" require partnerPlatform.medianLikes and
partnerPlatform.subscribers to be non-null; if any required field is null,
return { fraudRiskLevel: null, fraudFlags: [] }. Apply the same null-check
pattern to the later block around lines 128-133 (where similar scoring uses
median/ subscribers) to avoid treating missing data as zero and misflagging
lowFollowerHighEngagement.
In `@apps/web/lib/social-platforms/instagram-adapter.ts`:
- Around line 145-151: The unwrap logic for raw into unwrapped can throw when
raw.data is null because typeof null === "object"; update the condition in the
expression that assigns unwrapped (the check that currently uses typeof (raw as
any).data === "object" and "xdt_shortcode_media" in (raw as any).data) to also
verify (raw as any).data !== null before using the in operator, so the ternary
only tries to access xdt_shortcode_media when data is a non-null object.
In `@apps/web/scripts/migrations/backfill-twitter-platform-ids.ts`:
- Around line 67-85: The loop that processes Promise.allSettled results advances
the cursor (startingAfter = partnerPlatforms[partnerPlatforms.length - 1].id)
even when failures occur; change the logic in the results-handling block (the
loop over results, variables results/totalUpdated/totalSkipped and error
instanceof AccountNotFoundError) so that transient failures (any rejected result
that is NOT AccountNotFoundError, e.g., ScrapeCreators/Prisma errors) do not
cause advancing the cursor: either retry those failed partnerPlatforms IDs
(requeue them and attempt up to N retries) or abort the batch by throwing the
error so the script can be retried externally; only increment totalSkipped and
advance startingAfter for true AccountNotFoundError cases or after exhausting
retries. Ensure startingAfter update happens only when the entire batch is
successfully processed or non-retriable skips are confirmed.
- Line 96: Wrap the fire-and-forget main() invocation in a promise handler that
catches errors and always disconnects Prisma: call main().catch(err => {
console.error(err); process.exitCode = 1 }).finally(() => prisma.$disconnect()).
Also ensure a successful path sets process.exitCode = 0 (or leave unset) after
disconnect so the migration exits cleanly; reference the main() function and the
prisma client instance when making these changes.
---
Outside diff comments:
In `@apps/web/lib/cron/with-cron.ts`:
- Around line 59-78: The wrapper currently returns the serialized error
(errorMessage) via logAndRespond, leaking stacks to callers; keep the detailed
payload for server logs (logger.error and log({...}) which can retain
errorMessage) but change the response to a generic failure string: call
logAndRespond with a non-sensitive message like "Internal server error" (or
similar) instead of errorMessage; keep logger.error(errorMessage, error) and the
log(...) call as-is so server logging keeps full details, but ensure the value
passed into logAndRespond is the generic message and not errorMessage
(references: errorMessage, logger.error, log, logAndRespond, url.pathname).
---
Nitpick comments:
In `@apps/web/.env.example`:
- Around line 172-173: Rename and standardize the X/Twitter env var to match
existing TWITTER_* conventions: replace X_API_BEARER_TOKEN with
TWITTER_BEARER_TOKEN in the .env.example and update any code references that
read X_API_BEARER_TOKEN to use TWITTER_BEARER_TOKEN (search for
X_API_BEARER_TOKEN). Also add a short inline note next to TWITTER_BEARER_TOKEN
indicating this is the app-only bearer token (where to create it in the
Twitter/X developer portal) so onboarding and secret mapping are clear.
In `@apps/web/lib/social-platforms/get-social-content.ts`:
- Around line 93-101: The forEach callback currently uses a concise arrow body
that implicitly returns the boolean from url.searchParams.delete; change the
callback to a block body so there is no implicit return (e.g., replace (p) =>
url.searchParams.delete(p) with (p) => { url.searchParams.delete(p); }) when
iterating the array of params passed to Array.forEach; update the invocation
that uses the array and forEach to use the block-bodied callback around
url.searchParams.delete to avoid returning a value from the callback.
In `@apps/web/lib/social-platforms/scrape-creators.ts`:
- Around line 39-48: The code creates scrapeCreatorsFetch with a non-null
assertion on process.env.SCRAPECREATORS_API_KEY which hides missing-key errors;
update the module to validate SCRAPECREATORS_API_KEY before calling createFetch
(e.g., check process.env.SCRAPECREATORS_API_KEY at module init or inside the
exported scrapeCreatorsFetch factory), and throw a clear Error or log and exit
if it's missing so createFetch is never constructed with an undefined API key;
reference the scrapeCreatorsFetch export and the createFetch invocation and use
the SCRAPECREATORS_API_KEY env var name in the error message.
In `@apps/web/lib/social-platforms/x-adapter.ts`:
- Around line 104-135: The Authorization header currently captures
process.env.X_API_BEARER_TOKEN at module load inside the xFetch createFetch
call, so changes or missing tokens at runtime aren't handled; either validate
the env at startup by reading process.env.X_API_BEARER_TOKEN into a constant and
throwing if falsy before constructing xFetch (so createFetch always has a valid
token), or switch to lazy evaluation by replacing the static headers value with
a function/getter that reads process.env.X_API_BEARER_TOKEN at request time
(update the createFetch headers logic used by xFetch accordingly) and ensure
onError/logging still works with the chosen approach.
In `@packages/prisma/schema/platform.prisma`:
- Around line 39-55: The PartnerPlatformPost model lacks a composite index to
optimize queries that fetch recent posts for a single platform; update the
Prisma model PartnerPlatformPost by adding a composite index definition
@@index([partnerPlatformId, publishedAt]) (placed alongside the existing
@@unique and @@index declarations) so reads like “recent posts for one
partnerPlatformId ordered by publishedAt” are served by an index.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c4749e90-d70b-4fdf-ae37-4c2f0f7403d9
📒 Files selected for processing (35)
- apps/web/.env.example
- apps/web/app/(ee)/api/bounties/[bountyId]/sync-social-metrics/route.ts
- apps/web/app/(ee)/api/cron/bounties/sync-social-metrics/route.ts
- apps/web/app/(ee)/api/cron/partner-platforms/route.ts
- apps/web/app/(ee)/api/cron/queue-sync-social-engagement/route.ts
- apps/web/app/(ee)/api/cron/sync-social-engagement/route.ts
- apps/web/app/(ee)/api/partner-profile/programs/[programId]/bounties/[bountyId]/social-content-stats/route.ts
- apps/web/app/(ee)/api/partners/platforms/callback/route.ts
- apps/web/app/api/ai/sync-embeddings/route.ts
- apps/web/lib/actions/partners/create-bounty-submission.ts
- apps/web/lib/actions/partners/verify-social-account-by-code.ts
- apps/web/lib/api/scrape-creators/client.ts
- apps/web/lib/api/scrape-creators/get-social-content.ts
- apps/web/lib/api/scrape-creators/get-social-profile.ts
- apps/web/lib/api/scrape-creators/schema.ts
- apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts
- apps/web/lib/bounty/api/get-social-metrics-updates.ts
- apps/web/lib/cron/enqueue-batch-jobs.ts
- apps/web/lib/cron/with-cron.ts
- apps/web/lib/social-platforms/base-adapter.ts
- apps/web/lib/social-platforms/get-social-content.ts
- apps/web/lib/social-platforms/get-social-profile.ts
- apps/web/lib/social-platforms/index.ts
- apps/web/lib/social-platforms/instagram-adapter.ts
- apps/web/lib/social-platforms/rate-limiter.ts
- apps/web/lib/social-platforms/scrape-creators.ts
- apps/web/lib/social-platforms/tiktok-adapter.ts
- apps/web/lib/social-platforms/x-adapter.ts
- apps/web/lib/social-platforms/youtube-adapter.ts
- apps/web/lib/zod/schemas/bounties.ts
- apps/web/scripts/migrations/backfill-twitter-platform-ids.ts
- apps/web/vercel.json
- packages/email/src/templates/unresolved-fraud-events-summary.tsx
- packages/prisma/schema/bounty.prisma
- packages/prisma/schema/platform.prisma
💤 Files with no reviewable changes (4)
- apps/web/lib/api/scrape-creators/client.ts
- apps/web/lib/api/scrape-creators/get-social-content.ts
- apps/web/lib/api/scrape-creators/schema.ts
- apps/web/lib/api/scrape-creators/get-social-profile.ts
```ts
// Signal: engagementRateAnomaly
// When tracking views: check if the ratio of likes-to-views implied by the
// submission is significantly higher than the partner's baseline engagement rate
if (
  bountyMetric === "views" &&
  partnerPlatform.medianEngagementRate !== null &&
  partnerPlatform.medianEngagementRate > 0 &&
  partnerPlatform.medianLikes !== null
) {
  const medianLikes = Number(partnerPlatform.medianLikes);
  const medianViews = Number(partnerPlatform.medianViews);

  if (medianViews > 0 && medianLikes > 0) {
    // Expected likes at this view count based on baseline engagement rate
    const expectedLikes =
      socialMetricCount * partnerPlatform.medianEngagementRate;

    // If views are growing but the view count implies an engagement pattern
    // that would require an abnormally high rate, flag it
    const currentRatio = socialMetricCount / medianViews;

    if (
      currentRatio > ENGAGEMENT_RATE_MULTIPLIER &&
      expectedLikes > medianLikes * ENGAGEMENT_RATE_MULTIPLIER
    ) {
      console.log(
        "[detectBountySubmissionFraud] engagementRateAnomaly detected",
        {
          currentRatio,
          expectedLikes,
          medianLikes,
          medianEngagementRate: partnerPlatform.medianEngagementRate,
        },
      );

      flags.push("engagementRateAnomaly");
    }
  }
}
```
`engagementRateAnomaly` is currently just a second spike rule.
This branch never sees the submission's actual likes/comments, so it can't compute a current engagement rate. Both conditions are still derived from view growth against historical baselines, which means a legitimate viral post can pick up an extra fraud flag without any anomalous engagement signal. Pass current engagement counts into the detector or drop this rule for view-only inputs.
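A minimal sketch of what the rule could look like once current engagement counts are available — the input shape here is hypothetical, since the detector doesn't carry these fields today:

```ts
// Hypothetical input: these fields would need to be added to the detector.
type EngagementSnapshot = {
  currentViews: number;
  currentLikes: number;
  currentComments: number;
};

function isEngagementRateAnomalous(
  snapshot: EngagementSnapshot,
  medianEngagementRate: number, // partner baseline, e.g. 0.04 = 4%
  multiplier: number, // e.g. ENGAGEMENT_RATE_MULTIPLIER
): boolean {
  if (snapshot.currentViews <= 0 || medianEngagementRate <= 0) return false;

  // Observed engagement rate from the submission's own counts
  const currentRate =
    (snapshot.currentLikes + snapshot.currentComments) / snapshot.currentViews;

  // Flag only when observed engagement far exceeds the baseline rate,
  // instead of inferring an anomaly from view growth alone.
  return currentRate > medianEngagementRate * multiplier;
}
```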
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts` around lines 86 -
123, The engagementRateAnomaly branch in detect-bounty-submission-fraud.ts
currently infers anomalous engagement from view growth alone (using
bountyMetric, socialMetricCount, partnerPlatform, ENGAGEMENT_RATE_MULTIPLIER and
flags.push("engagementRateAnomaly")), which can mislabel viral posts; update the
branch to require and use the submission's current engagement counts (e.g.,
currentLikes/currentComments/currentEngagement) to compute a current engagement
rate and compare that rate against partnerPlatform.medianEngagementRate before
pushing the "engagementRateAnomaly" flag, or remove the rule for the case when
only views are provided so it only runs when explicit engagement counts are
present.
…w, fraud baseline checks
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/web/app/(ee)/api/cron/bounties/sync-social-metrics/route.ts (1)
159-172: ⚠️ Potential issue | 🟠 Major

Clear stale fraud fields when the latest sync has no metric.

Line 160 persists `socialMetricCount = null`, but Lines 164-171 only update `fraudRiskLevel`/`fraudFlags` when a count is present. If a later sync can't read the metric, the submission keeps the fraud verdict from an older run, so the stored fraud state no longer matches the current metric state. Please reset both fields in the `else` path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/app/`(ee)/api/cron/bounties/sync-social-metrics/route.ts around lines 159 - 172, The current update block builds updateData (Prisma.BountySubmissionUpdateInput) and only sets fraudRiskLevel/fraudFlags when socialMetricCount is non-null, leaving stale fraud verdicts when a later sync has no metric; modify the logic around updateData/detectBountySubmissionFraud so that if socialMetricCount is null you explicitly clear the fields (set updateData.fraudRiskLevel = null and updateData.fraudFlags = null) in the else path instead of leaving them untouched, using the same identifiers (socialMetricCount, detectBountySubmissionFraud, fraudRiskLevel, fraudFlags, updateData) so the DB row reflects the current metric state.
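A sketch of the suggested else-path reset, with the surrounding update flow reduced to the identifiers named in the comment (the real `detectBountySubmissionFraud` signature is assumed, not copied):

```ts
type FraudVerdict = {
  fraudRiskLevel: string | null;
  fraudFlags: string[] | null;
};

function applyFraudFields(
  updateData: { fraudRiskLevel?: string | null; fraudFlags?: string[] | null },
  socialMetricCount: number | null,
  detect: () => FraudVerdict, // stands in for detectBountySubmissionFraud(...)
) {
  if (socialMetricCount !== null) {
    const { fraudRiskLevel, fraudFlags } = detect();
    updateData.fraudRiskLevel = fraudRiskLevel;
    updateData.fraudFlags = fraudFlags;
  } else {
    // No metric this run: clear the old verdict so the stored fraud state
    // always matches the current metric state.
    updateData.fraudRiskLevel = null;
    updateData.fraudFlags = null;
  }
}
```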
♻️ Duplicate comments (2)
apps/web/app/(ee)/api/cron/sync-social-engagement/route.ts (1)
64-79: ⚠️ Potential issue | 🟠 Major

Use a real successful-sync cursor here, not `MAX(date)` from engagement rows.

This watermark only advances when the run writes at least one `partnerPlatformEngagement` row. For inactive accounts or zero-post windows, it never moves forward, so the cron will keep refetching the same historical range on every run. Persist a separate last-successful-sync timestamp and derive the overlap window from that instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/app/`(ee)/api/cron/sync-social-engagement/route.ts around lines 64 - 79, The code is using prisma.partnerPlatformEngagement.aggregate and latestResult._max.date as the watermark (watermarkDate), which only advances when engagement rows are written; instead persist and use a separate last-successful-sync cursor record (e.g., a new prisma model/table like PartnerPlatformSyncCursor keyed by partnerPlatformId) and read that timestamp to derive watermarkDate and startTime (subtract OVERLAP_DAYS and INITIAL_LOOKBACK_DAYS as currently done); update the cron to: 1) read the sync cursor before fetching engagements, 2) compute startTime/endTime from that cursor (fall back to INITIAL_LOOKBACK_DAYS on first run), and 3) on successful completion update the cursor with the run's endTime so future runs resume correctly (use methods analogous to prisma.partnerPlatformEngagement.* for read/update/create).
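A minimal sketch of the watermark computation under that approach; the cursor storage (the `PartnerPlatformSyncCursor` model, the `lastSyncedAt` field) is hypothetical, and the constants mirror the ones named in the prompt:

```ts
import { startOfDay, subDays } from "date-fns";

const OVERLAP_DAYS = 2; // small refresh window to heal missed runs
const INITIAL_LOOKBACK_DAYS = 30;

// lastSyncedAt would come from a dedicated cursor record (e.g. a hypothetical
// PartnerPlatformSyncCursor row), updated only after a fully successful run.
function computeStartTime(lastSyncedAt: Date | null, now: Date): Date {
  if (!lastSyncedAt) {
    return startOfDay(subDays(now, INITIAL_LOOKBACK_DAYS));
  }
  return startOfDay(subDays(lastSyncedAt, OVERLAP_DAYS));
}
```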
apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts (1)

31-35: ⚠️ Potential issue | 🟠 Major

`engagementRateAnomaly` still isn't using current engagement.

`DetectBountyFraudInput` only carries a single metric count, so this branch never sees the submission's actual likes/comments. Both `currentRatio` and `expectedLikes` are inferred from view growth against historical medians, which makes this a second, looser spike rule rather than an engagement-rate check. That can double-flag legitimate viral posts. Either extend the input with current engagement counts or drop this rule for view-only submissions.

Also applies to: 101-139
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts` around lines 31 - 35, The engagementRateAnomaly logic is using only socialMetricCount (historical-inferred values) so it never compares the submission’s actual engagement; update DetectBountyFraudInput to carry the submission’s current engagement (e.g., currentLikes and/or currentComments) and modify engagementRateAnomaly to compute currentRatio from those current counts (use currentLikes/currentComments when bountyMetric === "views") and compute expectedLikes from historical medians as before; alternatively, if you prefer not to expand the input, change engagementRateAnomaly to early-return/skip the engagement-rate check whenever the submission does not supply current engagement (i.e., when socialMetricCount is the only metric and bountyMetric === "views"), so the rule won’t run on view-only submissions.
🧹 Nitpick comments (1)
apps/web/app/(ee)/api/cron/sync-social-engagement/route.ts (1)
150-155: Refresh post metadata in the update path too.

`publishedAt` drives the top-50 pruning, but existing rows never get that value corrected once inserted. Consider updating `publishedAt` here as well, and `title` too if you want stored post metadata to stay current.

♻️ Suggested change
```diff
         update: {
+          publishedAt: post.publishedAt,
+          title: post.title,
           views: BigInt(post.views),
           likes: BigInt(post.likes),
           comments: BigInt(post.comments),
           engagementRate: post.engagementRate,
         },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/app/`(ee)/api/cron/sync-social-engagement/route.ts around lines 150 - 155, The update block currently only updates numeric fields (views, likes, comments, engagementRate) and omits refreshing stored post metadata; modify the update object (the one with views: BigInt(post.views), etc.) to also set publishedAt and title from post so existing rows get corrected on upsert—ensure you convert post.publishedAt to the DB-expected type (e.g., Date or ISO string) and keep title updated to the latest post.title when updating.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@apps/web/app/`(ee)/api/cron/bounties/sync-social-metrics/route.ts:
- Around line 159-172: The current update block builds updateData
(Prisma.BountySubmissionUpdateInput) and only sets fraudRiskLevel/fraudFlags
when socialMetricCount is non-null, leaving stale fraud verdicts when a later
sync has no metric; modify the logic around
updateData/detectBountySubmissionFraud so that if socialMetricCount is null you
explicitly clear the fields (set updateData.fraudRiskLevel = null and
updateData.fraudFlags = null) in the else path instead of leaving them
untouched, using the same identifiers (socialMetricCount,
detectBountySubmissionFraud, fraudRiskLevel, fraudFlags, updateData) so the DB
row reflects the current metric state.
---
Duplicate comments:
In `@apps/web/app/`(ee)/api/cron/sync-social-engagement/route.ts:
- Around line 64-79: The code is using
prisma.partnerPlatformEngagement.aggregate and latestResult._max.date as the
watermark (watermarkDate), which only advances when engagement rows are written;
instead persist and use a separate last-successful-sync cursor record (e.g., a
new prisma model/table like PartnerPlatformSyncCursor keyed by
partnerPlatformId) and read that timestamp to derive watermarkDate and startTime
(subtract OVERLAP_DAYS and INITIAL_LOOKBACK_DAYS as currently done); update the
cron to: 1) read the sync cursor before fetching engagements, 2) compute
startTime/endTime from that cursor (fall back to INITIAL_LOOKBACK_DAYS on first
run), and 3) on successful completion update the cursor with the run's endTime
so future runs resume correctly (use methods analogous to
prisma.partnerPlatformEngagement.* for read/update/create).
In `@apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts`:
- Around line 31-35: The engagementRateAnomaly logic is using only
socialMetricCount (historical-inferred values) so it never compares the
submission’s actual engagement; update DetectBountyFraudInput to carry the
submission’s current engagement (e.g., currentLikes and/or currentComments) and
modify engagementRateAnomaly to compute currentRatio from those current counts
(use currentLikes/currentComments when bountyMetric === "views") and compute
expectedLikes from historical medians as before; alternatively, if you prefer
not to expand the input, change engagementRateAnomaly to early-return/skip the
engagement-rate check whenever the submission does not supply current engagement
(i.e., when socialMetricCount is the only metric and bountyMetric === "views"),
so the rule won’t run on view-only submissions.
---
Nitpick comments:
In `@apps/web/app/`(ee)/api/cron/sync-social-engagement/route.ts:
- Around line 150-155: The update block currently only updates numeric fields
(views, likes, comments, engagementRate) and omits refreshing stored post
metadata; modify the update object (the one with views: BigInt(post.views),
etc.) to also set publishedAt and title from post so existing rows get corrected
on upsert—ensure you convert post.publishedAt to the DB-expected type (e.g.,
Date or ISO string) and keep title updated to the latest post.title when
updating.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7f2733b8-f004-4f67-aca2-5d0388c404e2
📒 Files selected for processing (3)
- apps/web/app/(ee)/api/cron/bounties/sync-social-metrics/route.ts
- apps/web/app/(ee)/api/cron/sync-social-engagement/route.ts
- apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts
…ync and API quota limiter
@coderabbitai full review please

✅ Actions performed: Full review triggered.
The sync-social-engagement cron was hitting a 20s Vitess transaction timeout when upserting many posts in a single $transaction. Switch to individual upserts via Promise.allSettled with failure logging, and flatten the prune logic to standalone queries.
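A sketch of that pattern, assuming the generated Prisma client exposes `partnerPlatformPost` with the `@@unique([partnerPlatformId, postId])` compound key shown elsewhere in this review:

```ts
import { PrismaClient } from "@prisma/client";

type PostMetrics = {
  postId: string;
  publishedAt: Date;
  views: bigint;
  likes: bigint;
  comments: bigint;
};

// Upsert each post independently so one slow row can't hold a long
// interactive transaction open past the ~20s Vitess limit.
async function upsertPostsIndividually(
  prisma: PrismaClient,
  partnerPlatformId: string,
  posts: PostMetrics[],
) {
  const results = await Promise.allSettled(
    posts.map((post) =>
      prisma.partnerPlatformPost.upsert({
        where: {
          partnerPlatformId_postId: { partnerPlatformId, postId: post.postId },
        },
        create: { partnerPlatformId, ...post },
        update: { views: post.views, likes: post.likes, comments: post.comments },
      }),
    ),
  );

  // Log failures instead of aborting the whole batch
  results.forEach((result, i) => {
    if (result.status === "rejected") {
      console.error(`Upsert failed for post ${posts[i].postId}`, result.reason);
    }
  });
}
```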
Actionable comments posted: 9
♻️ Duplicate comments (4)
apps/web/scripts/migrations/backfill-twitter-platform-ids.ts (2)
96-96: ⚠️ Potential issue | 🟠 Major

Disconnect Prisma and fail the process loudly.

`main()` is still fire-and-forget, so rejected promises can exit without a non-zero status and without cleanly releasing the Prisma client.

Suggested entrypoint wrapper
```diff
-main();
+main()
+  .catch((error) => {
+    console.error("Backfill failed:", error);
+    process.exitCode = 1;
+  })
+  .finally(async () => {
+    await prisma.$disconnect();
+  });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/scripts/migrations/backfill-twitter-platform-ids.ts` at line 96, The script currently calls main() fire-and-forget; update the entrypoint so it awaits main() and handles rejections: call main() inside an async wrapper or promise chain that catches errors, logs them, ensures the Prisma client (PrismaClient instance e.g., prisma) is disconnected via prisma.$disconnect(), and calls process.exit(1) on failure (or process.exit(0) on success); reference main() and the Prisma client variable (prisma or PrismaClient instance) so the catch block can cleanly disconnect before exiting.
67-85: ⚠️ Potential issue | 🟠 Major

Don't move the cursor past transient failures.

Unexpected `getSocialProfile`/Prisma errors are still counted as skipped, and Line 85 advances `startingAfter` anyway. That lets the backfill silently abandon failed rows for the rest of the run. Only `AccountNotFoundError` should be skippable here; other failures should retry or abort the batch before the cursor moves.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/scripts/migrations/backfill-twitter-platform-ids.ts` around lines 67 - 85, The loop currently treats all rejected Promises as skippable and unconditionally advances the cursor via startingAfter; change the logic so that only AccountNotFoundError increments totalSkipped and is ignored, while any other rejection causes the function to abort/throw (or return an error) and NOT update startingAfter for that batch; specifically, after iterating results from Promise.allSettled, check for any result where result.status === "rejected" and !(result.reason instanceof AccountNotFoundError) and rethrow or return early instead of setting startingAfter, otherwise proceed updating totalUpdated/totalSkipped and then set startingAfter only when no non-skippable errors were encountered (symbols: getSocialProfile, AccountNotFoundError, results, startingAfter).
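A self-contained sketch of that guard — the local `AccountNotFoundError` class stands in for the real one the script imports:

```ts
class AccountNotFoundError extends Error {}

// Returns the skippable-failure count; throws on any transient failure so
// the caller never advances startingAfter past rows that should be retried.
function countKnownSkips(results: PromiseSettledResult<unknown>[]): number {
  let skipped = 0;
  for (const result of results) {
    if (result.status !== "rejected") continue;
    if (result.reason instanceof AccountNotFoundError) {
      skipped++; // the account genuinely no longer exists: safe to skip
    } else {
      throw result.reason; // transient API/DB error: abort before cursor moves
    }
  }
  return skipped;
}
```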
apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts (1)

101-139: ⚠️ Potential issue | 🟠 Major

`engagementRateAnomaly` is still a second spike rule.

This branch never uses the submission's current likes/comments, so both predicates are still derived from current view growth plus historical medians. A viral post can pick up an extra fraud flag without any anomalous engagement signal. Either pass current engagement counts into this detector or skip this rule for view-only inputs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts` around lines 101 - 139, The engagementRateAnomaly branch flags posts based only on view growth and historical medians (socialMetricCount, medianLikes, medianViews, medianEngagementRate, ENGAGEMENT_RATE_MULTIPLIER) and ignores the submission's current engagement counts; update this check to either require and use the submission's current engagement value (e.g., currentLikes/currentComments) in the anomaly predicates (compare observed engagement to expectedLikes) or skip adding the "engagementRateAnomaly" flag when no current engagement metric is provided, ensuring the decision that currently pushes flags.push("engagementRateAnomaly") incorporates the actual submission engagement or early-returns for view-only inputs.
apps/web/lib/social-platforms/instagram-adapter.ts (1)

145-152: ⚠️ Potential issue | 🟠 Major

Guard `raw.data` before using `in`.

`typeof null === "object"`, so `{ data: null }` still passes the current checks and the `in` operator throws before validation runs.

Safe unwrap
```diff
 const unwrapped =
   typeof raw === "object" &&
   raw !== null &&
   "data" in raw &&
   typeof (raw as any).data === "object" &&
+  (raw as any).data !== null &&
   "xdt_shortcode_media" in (raw as any).data
     ? (raw as any).data.xdt_shortcode_media
     : raw;
```

```bash
#!/bin/bash
node - <<'NODE'
const raw = { data: null };
console.log("typeof raw.data:", typeof raw.data);
try {
  console.log("xdt_shortcode_media" in raw.data);
} catch (error) {
  console.error(`${error.name}: ${error.message}`);
}
NODE
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/lib/social-platforms/instagram-adapter.ts` around lines 145 - 152, The current unwrapping logic for variable "unwrapped" uses the "in" operator on raw.data without ensuring raw.data is a non-null object; update the condition to explicitly guard raw.data by checking typeof (raw as any).data === "object" && (raw as any).data !== null before using "in" so the ternary becomes: ensure raw is an object, raw.data exists and is a non-null object, and only then check "'xdt_shortcode_media' in (raw as any).data"; adjust the compound boolean in the unwrapped assignment accordingly to prevent a runtime TypeError when raw.data is null.
🧹 Nitpick comments (3)
apps/web/lib/social-platforms/scrape-creators.ts (1)
46-48: Non-null assertion on environment variable could cause unclear runtime errors.

If `SCRAPECREATORS_API_KEY` is missing, the non-null assertion (`!`) will pass `undefined` as the header value, leading to authentication failures with unhelpful error messages.

Consider failing fast with a clear message
```diff
+const SCRAPECREATORS_API_KEY = process.env.SCRAPECREATORS_API_KEY;
+
+if (!SCRAPECREATORS_API_KEY) {
+  throw new Error("SCRAPECREATORS_API_KEY environment variable is required");
+}
+
 export const scrapeCreatorsFetch = createFetch({
   baseURL: "https://api.scrapecreators.com",
   retry: {
     type: "linear",
     attempts: 1,
     delay: 3000,
   },
   headers: {
-    "x-api-key": process.env.SCRAPECREATORS_API_KEY!,
+    "x-api-key": SCRAPECREATORS_API_KEY,
   },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/lib/social-platforms/scrape-creators.ts` around lines 46 - 48, The code uses a non-null assertion for process.env.SCRAPECREATORS_API_KEY when building the headers object, which can pass undefined and cause unclear runtime auth failures; update the module/function that builds the headers (the headers object in apps/web/lib/social-platforms/scrape-creators.ts) to explicitly validate process.env.SCRAPECREATORS_API_KEY at startup or before the request and throw a clear Error (or return a typed failure) if it's missing, so you fail fast with a descriptive message rather than using the `!` operator.
apps/web/lib/social-platforms/youtube-adapter.ts (1)

133-134: Same date parsing concern as TikTok adapter.

`new Date(data.publishDateText)` may produce invalid dates for unexpected formats. Consider the same validation approach suggested for the TikTok adapter.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/lib/social-platforms/youtube-adapter.ts` around lines 133 - 134, The publishedAt assignment using new Date(data.publishDateText) can produce Invalid Date for unexpected formats; update the mapping in youtube-adapter.ts where the returned object sets publishedAt to validate and normalize the date: attempt parsing (e.g., Date.parse or a robust parser), construct a Date only if parsing succeeds, check isNaN(date.getTime()) and if invalid either try a fallback/alternate parse or set publishedAt to null/undefined, and ensure any downstream consumers handle the nullable date — mirror the validation/fallback pattern you used in the TikTok adapter for consistency.
apps/web/lib/social-platforms/tiktok-adapter.ts (1)

138-139: Date parsing from `create_time_utc` string may produce invalid dates.

If `create_time_utc` contains an unexpected format, `new Date()` will return an invalid date object rather than throwing. Consider validating the date or using a parsing library.

Consider validating the parsed date
```diff
+      const publishedAt = new Date(data.create_time_utc);
+      if (isNaN(publishedAt.getTime())) {
+        throw new ContentNotFoundError(500, "Invalid publish date format");
+      }
+
       return {
-        publishedAt: new Date(data.create_time_utc),
+        publishedAt,
         handle: data.author.unique_id,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/lib/social-platforms/tiktok-adapter.ts` around lines 138 - 139, The code sets publishedAt using new Date(data.create_time_utc) which can produce an invalid Date for unexpected formats; validate and handle parsing: check data.create_time_utc exists, parse it (e.g., Date.parse or a parsing library like dayjs), construct the Date only if the parse yields a valid timestamp, and verify via isNaN(date.getTime()) (or equivalent) before assigning to publishedAt; if invalid, return a safe fallback (null/undefined) or log/throw as appropriate so publishedAt (the value derived from create_time_utc) is never an unvalidated invalid Date.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/web/.env.example`:
- Around line 172-173: The X_API_BEARER_TOKEN environment variable is documented
under "Scrape Creators" but is consumed by the X adapter
(apps/web/lib/social-platforms/x-adapter.ts, around the call to
https://api.x.com/2), so move the X_API_BEARER_TOKEN entry out of the Scrape
Creators block in apps/web/.env.example and place it under a new clearly named
section like "# X API" or "# Social engagement tracking"; update the variable's
inline comment to reflect its use by the X adapter (mention X_API_BEARER_TOKEN
and the x-adapter) so credential rotation and setup are unambiguous.
In `@apps/web/app/`(ee)/api/bounties/[bountyId]/sync-social-metrics/route.ts:
- Around line 147-164: The partnerPlatform lookup in the social metrics sync
path (where socialMetricCount != null) currently uses
prisma.partnerPlatform.findUnique and does not filter by verifiedAt, allowing
unverified/stale baselines; update this logic so partnerPlatformBaseline is
fetched only for verified accounts by either switching findUnique to
prisma.partnerPlatform.findFirst with a where including partnerId, type
(bountyInfo.socialPlatform!.value) and verifiedAt: { not: null }, or keep
findUnique and add an immediate post-fetch check that rejects/ignores the
baseline if partnerPlatformBaseline.verifiedAt is null; adjust downstream logic
that uses partnerPlatformBaseline accordingly to avoid using unverified
baselines.
In `@apps/web/app/`(ee)/api/cron/queue-sync-social-engagement/route.ts:
- Around line 12-42: The current use of prisma.partnerPlatform.findMany pulls
all matching rows into memory before enqueueing, which will scale poorly;
replace it with a cursor-paginated loop that fetches a limited batch (e.g.,
100–1000) using prisma.partnerPlatform.findMany with cursor and take (or
findMany with skip-less cursor approach), map each batch to the enqueue payloads
and call enqueueBatchJobs per batch (instead of once for the whole set),
repeating until no more rows; update the logic around partnerPlatform.id
selection and the enqueueBatchJobs call in route.ts so each iteration only
enqueues that chunk and exits when the page returns zero records.
In `@apps/web/app/`(ee)/api/cron/sync-social-engagement/route.ts:
- Around line 64-79: The watermark uses partnerPlatformEngagement._max.date to
compute startTime which can advance before partnerPlatformPost rows are
committed; change the logic so the resume watermark only advances after both
engagement and post writes are durable—either (A) compute watermarkDate as the
minimum of the latest committed partnerPlatformPost.date and
partnerPlatformEngagement._max.date (or prefer partnerPlatformPost.date if posts
are the authoritative persisted state), or (B) persist an explicit durable
watermark/state (e.g., lastSyncedDate) only after both partnerPlatformEngagement
and partnerPlatformPost writes complete (or inside a transaction/atomic step)
and use that persisted lastSyncedDate for watermarkDate/startTime; update the
code paths referencing watermarkDate, startTime, and the aggregation on
partnerPlatformEngagement to use the new durable source so the job cannot
advance the watermark before posts are persisted.
In `@apps/web/lib/social-platforms/get-social-content.ts`:
- Around line 93-101: The inline arrow callback used with .forEach returns the
value from url.searchParams.delete(p) which trips the useIterableCallbackReturn
lint rule; change the iteration to not return a value by either using a
block-bodied callback (e.g., .forEach((p) => { url.searchParams.delete(p); }))
or replace .forEach with a for...of loop over the same array and call
url.searchParams.delete(p) inside the loop so the callback does not return the
delete result; update the array iteration surrounding url.searchParams.delete to
one of these patterns to satisfy the linter.
- Around line 43-46: The cache key currently uses only the URL hash (urlHash)
and CACHE_KEY_PREFIX when calling redis.get<SocialContent>(cacheKey), but the
cached result also depends on the selected platform; include the platform
identifier in the key so entries are namespaced per adapter (e.g., incorporate
platform or platform.name/platform.id into cacheKey). Update the code that
builds cacheKey (where urlHash and CACHE_KEY_PREFIX are combined) and any
subsequent redis.get/set calls to use the new key so cached entries are not
shared across different platform adapters.
In `@apps/web/lib/social-platforms/instagram-adapter.ts`:
- Around line 112-119: The adapter currently returns platformId: null which
prevents Instagram from being included in the cron historical sync and results
in no backfilled posts (see instagram-adapter.ts and fetchPosts()); update the
adapter so the returned object sets platformId to a stable Instagram identifier
(e.g., data.data.user.id or another persistent unique id from the Instagram
response) in both places shown (lines around 112-119 and 212-213), or
alternatively add an explicit opt-out flag so sync-social-engagement/route.ts
can skip Instagram until fetchPosts() historical backfill is implemented—ensure
the change targets the returned object that includes platformId and that
fetchPosts() behavior is considered.
In `@apps/web/lib/social-platforms/rate-limiter.ts`:
- Around line 14-15: checkXApiRateLimit() currently always uses the single
"global" bucket which allows background jobs to exhaust the interactive quota;
change the implementation to split workloads by bucket or reserve headroom:
update checkXApiRateLimit to accept a workload/category param (e.g.,
"interactive" | "background") or provide two functions
(checkXApiInteractiveRateLimit and checkXApiBackgroundRateLimit) and call
xApiRateLimiter.limit() with distinct keys (e.g., "x:interactive" and
"x:background"); ensure the interactive bucket has reserved capacity (smaller
rate or separate limit) so combined usage stays under the real X cap and update
all call sites to use the appropriate new function/param.
In `@apps/web/lib/social-platforms/x-adapter.ts`:
- Around line 271-279: The mapping in the return object in x-adapter.ts
incorrectly folds quote tweets into the PartnerPlatformPost.comments field by
using m.reply_count + m.quote_count; change the mapping so comments are only
replies (use m.reply_count) and do not add m.quote_count, since quote_count is
already accounted for in engagementRate; update the return structure where
postId, publishedAt, title, views, likes, comments, engagementRate are set
(references: tweet, m, and the anonymous object returned) to reflect comments =
m.reply_count only.
---
Duplicate comments:
In `@apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts`:
- Around line 101-139: The engagementRateAnomaly branch flags posts based only
on view growth and historical medians (socialMetricCount, medianLikes,
medianViews, medianEngagementRate, ENGAGEMENT_RATE_MULTIPLIER) and ignores the
submission's current engagement counts; update this check to either require and
use the submission's current engagement value (e.g.,
currentLikes/currentComments) in the anomaly predicates (compare observed
engagement to expectedLikes) or skip adding the "engagementRateAnomaly" flag
when no current engagement metric is provided, ensuring the decision that
currently pushes flags.push("engagementRateAnomaly") incorporates the actual
submission engagement or early-returns for view-only inputs.
In `@apps/web/lib/social-platforms/instagram-adapter.ts`:
- Around line 145-152: The current unwrapping logic for variable "unwrapped"
uses the "in" operator on raw.data without ensuring raw.data is a non-null
object; update the condition to explicitly guard raw.data by checking typeof
(raw as any).data === "object" && (raw as any).data !== null before using "in"
so the ternary becomes: ensure raw is an object, raw.data exists and is a
non-null object, and only then check "'xdt_shortcode_media' in (raw as
any).data"; adjust the compound boolean in the unwrapped assignment accordingly
to prevent a runtime TypeError when raw.data is null.
In `@apps/web/scripts/migrations/backfill-twitter-platform-ids.ts`:
- Line 96: The script currently calls main() fire-and-forget; update the
entrypoint so it awaits main() and handles rejections: call main() inside an
async wrapper or promise chain that catches errors, logs them, ensures the
Prisma client (PrismaClient instance e.g., prisma) is disconnected via
prisma.$disconnect(), and calls process.exit(1) on failure (or process.exit(0)
on success); reference main() and the Prisma client variable (prisma or
PrismaClient instance) so the catch block can cleanly disconnect before exiting.
- Around line 67-85: The loop currently treats all rejected Promises as
skippable and unconditionally advances the cursor via startingAfter; change the
logic so that only AccountNotFoundError increments totalSkipped and is ignored,
while any other rejection causes the function to abort/throw (or return an
error) and NOT update startingAfter for that batch; specifically, after
iterating results from Promise.allSettled, check for any result where
result.status === "rejected" and !(result.reason instanceof
AccountNotFoundError) and rethrow or return early instead of setting
startingAfter, otherwise proceed updating totalUpdated/totalSkipped and then set
startingAfter only when no non-skippable errors were encountered (symbols:
getSocialProfile, AccountNotFoundError, results, startingAfter).
---
Nitpick comments:
In `@apps/web/lib/social-platforms/scrape-creators.ts`:
- Around line 46-48: The code uses a non-null assertion for
process.env.SCRAPECREATORS_API_KEY when building the headers object, which can
pass undefined and cause unclear runtime auth failures; update the
module/function that builds the headers (the headers object in
apps/web/lib/social-platforms/scrape-creators.ts) to explicitly validate
process.env.SCRAPECREATORS_API_KEY at startup or before the request and throw a
clear Error (or return a typed failure) if it's missing, so you fail fast with a
descriptive message rather than using the `!` operator.
In `@apps/web/lib/social-platforms/tiktok-adapter.ts`:
- Around line 138-139: The code sets publishedAt using new
Date(data.create_time_utc) which can produce an invalid Date for unexpected
formats; validate and handle parsing: check data.create_time_utc exists, parse
it (e.g., Date.parse or a parsing library like dayjs), construct the Date only
if the parse yields a valid timestamp, and verify via isNaN(date.getTime()) (or
equivalent) before assigning to publishedAt; if invalid, return a safe fallback
(null/undefined) or log/throw as appropriate so publishedAt (the value derived
from create_time_utc) is never an unvalidated invalid Date.
In `@apps/web/lib/social-platforms/youtube-adapter.ts`:
- Around line 133-134: The publishedAt assignment using new
Date(data.publishDateText) can produce Invalid Date for unexpected formats;
update the mapping in youtube-adapter.ts where the returned object sets
publishedAt to validate and normalize the date: attempt parsing (e.g.,
Date.parse or a robust parser), construct a Date only if parsing succeeds, check
isNaN(date.getTime()) and if invalid either try a fallback/alternate parse or
set publishedAt to null/undefined, and ensure any downstream consumers handle
the nullable date — mirror the validation/fallback pattern you used in the
TikTok adapter for consistency.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: be05232e-ee3f-4eeb-b9aa-b7309c7006ae
📒 Files selected for processing (35)
- apps/web/.env.example
- apps/web/app/(ee)/api/bounties/[bountyId]/sync-social-metrics/route.ts
- apps/web/app/(ee)/api/cron/bounties/sync-social-metrics/route.ts
- apps/web/app/(ee)/api/cron/partner-platforms/route.ts
- apps/web/app/(ee)/api/cron/queue-sync-social-engagement/route.ts
- apps/web/app/(ee)/api/cron/sync-social-engagement/route.ts
- apps/web/app/(ee)/api/partner-profile/programs/[programId]/bounties/[bountyId]/social-content-stats/route.ts
- apps/web/app/(ee)/api/partners/platforms/callback/route.ts
- apps/web/app/api/ai/sync-embeddings/route.ts
- apps/web/lib/actions/partners/create-bounty-submission.ts
- apps/web/lib/actions/partners/verify-social-account-by-code.ts
- apps/web/lib/api/scrape-creators/client.ts
- apps/web/lib/api/scrape-creators/get-social-content.ts
- apps/web/lib/api/scrape-creators/get-social-profile.ts
- apps/web/lib/api/scrape-creators/schema.ts
- apps/web/lib/bounty/api/detect-bounty-submission-fraud.ts
- apps/web/lib/bounty/api/get-social-metrics-updates.ts
- apps/web/lib/cron/enqueue-batch-jobs.ts
- apps/web/lib/cron/with-cron.ts
- apps/web/lib/social-platforms/base-adapter.ts
- apps/web/lib/social-platforms/get-social-content.ts
- apps/web/lib/social-platforms/get-social-profile.ts
- apps/web/lib/social-platforms/index.ts
- apps/web/lib/social-platforms/instagram-adapter.ts
- apps/web/lib/social-platforms/rate-limiter.ts
- apps/web/lib/social-platforms/scrape-creators.ts
- apps/web/lib/social-platforms/tiktok-adapter.ts
- apps/web/lib/social-platforms/x-adapter.ts
- apps/web/lib/social-platforms/youtube-adapter.ts
- apps/web/lib/zod/schemas/bounties.ts
- apps/web/scripts/migrations/backfill-twitter-platform-ids.ts
- apps/web/vercel.json
- packages/email/src/templates/unresolved-fraud-events-summary.tsx
- packages/prisma/schema/bounty.prisma
- packages/prisma/schema/platform.prisma
💤 Files with no reviewable changes (4)
- apps/web/lib/api/scrape-creators/get-social-profile.ts
- apps/web/lib/api/scrape-creators/get-social-content.ts
- apps/web/lib/api/scrape-creators/client.ts
- apps/web/lib/api/scrape-creators/schema.ts
return {
  description: data.data.user.biography,
  platformId: null,
  subscribers: BigInt(data.data.user.edge_followed_by.count),
  posts: BigInt(data.data.user.edge_owner_to_timeline_media.count),
  views: BigInt(0),
  avatarUrl: data.data.user.profile_pic_url,
};
Instagram can't participate in the new historical sync path yet.
In this PR's cron flow, sync-social-engagement/route.ts skips partner platforms without a platformId, so returning platformId: null here makes Instagram unsyncable. Even if that precondition changes later, fetchPosts() still resolves to [], so the sync would succeed with no backfilled posts or daily engagement. Either populate a stable Instagram ID here or keep Instagram out of the cron sync path until historical post fetch is implemented.
Also applies to: 212-213
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/web/lib/social-platforms/instagram-adapter.ts` around lines 112 - 119,
The adapter currently returns platformId: null which prevents Instagram from
being included in the cron historical sync and results in no backfilled posts
(see instagram-adapter.ts and fetchPosts()); update the adapter so the returned
object sets platformId to a stable Instagram identifier (e.g., data.data.user.id
or another persistent unique id from the Instagram response) in both places
shown (lines around 112-119 and 212-213), or alternatively add an explicit
opt-out flag so sync-social-engagement/route.ts can skip Instagram until
fetchPosts() historical backfill is implemented—ensure the change targets the
returned object that includes platformId and that fetchPosts() behavior is
considered.
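If the stable-ID route is taken, the change could be as small as the sketch below; whether the response actually exposes data.data.user.id is an assumption that needs verifying against the real payload:

```ts
return {
  description: data.data.user.biography,
  // Hypothetical: use the account's persistent numeric id if the
  // response exposes one, instead of hardcoding null.
  platformId: data.data.user.id ?? null,
  subscribers: BigInt(data.data.user.edge_followed_by.count),
  posts: BigInt(data.data.user.edge_owner_to_timeline_media.count),
  views: BigInt(0),
  avatarUrl: data.data.user.profile_pic_url,
};
```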
Actionable comments posted: 2
♻️ Duplicate comments (1)
apps/web/app/(ee)/api/cron/sync-social-engagement/route.ts (1)
64-79: ⚠️ Potential issue | 🟠 Major
Do not resume from partnerPlatformEngagement alone.
This watermark advances before partnerPlatformPost is guaranteed to be durable. If a run fails after the engagement writes, the retry resumes from the newer engagement date, and posts older than the 2-day overlap never get backfilled. Use a watermark that only moves after both tables succeed, or persist an explicit lastSyncedDate at the end of a successful sync.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/app/`(ee)/api/cron/sync-social-engagement/route.ts around lines 64 - 79, the code currently sets the watermark from partnerPlatformEngagement alone (latestResult -> latestEngagementDate), which can advance before partnerPlatformPost is durable; change the watermark logic so it only advances after both tables are synced: query partnerPlatformPost similarly (e.g., aggregate max on partnerPlatformPost to get latestPostDate) and compute the watermark from the earlier of latestEngagementDate and latestPostDate (or implement/persist an explicit lastSyncedDate that is only updated after the entire sync completes successfully), then use that stable watermark when computing startTime/endTime instead of using partnerPlatformEngagement._max.date alone.
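A minimal sketch of that conservative watermark, assuming the Prisma model names used in this thread, a publishedAt column on the post table, and `prisma`/`partnerPlatformId` in scope:

```ts
const [latestEngagement, latestPost] = await Promise.all([
  prisma.partnerPlatformEngagement.aggregate({
    where: { partnerPlatformId },
    _max: { date: true },
  }),
  prisma.partnerPlatformPost.aggregate({
    where: { partnerPlatformId },
    _max: { publishedAt: true },
  }),
]);

const engagementMax = latestEngagement._max.date;
const postMax = latestPost._max.publishedAt;

// Resume from the older of the two maxima so a crash between the two
// writes can never skip the table that lagged behind.
const watermark =
  engagementMax && postMax
    ? new Date(Math.min(engagementMax.getTime(), postMax.getTime()))
    : (engagementMax ?? postMax ?? null);
```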
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/web/app/`(ee)/api/cron/sync-social-engagement/route.ts:
- Around line 120-128: The current code filters engagementResults into
engagementFailures and only logs a warning, which lets partial write failures
return 200 and bypass withCron retry; instead, detect when
engagementFailures.length > 0 and throw an Error (or reject) so the withCron
wrapper surfaces the failure to QStash; update the block that references
engagementFailures/engagementResults/dailyEngagements/partnerPlatformId to throw
a descriptive error including the counts and partnerPlatformId (and make the
analogous change in the similar block that covers lines 170-176) so cron
handlers fail fast on any partial writes.
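A sketch of the fail-fast check, assuming engagementResults is the output of Promise.allSettled:

```ts
const engagementFailures = engagementResults.filter(
  (result) => result.status === "rejected",
);

// Throw so withCron reports a non-2xx status and QStash retries the job,
// instead of returning 200 after a partial write.
if (engagementFailures.length > 0) {
  throw new Error(
    `sync-social-engagement: ${engagementFailures.length} of ` +
      `${engagementResults.length} daily engagement upserts failed ` +
      `for partnerPlatformId=${partnerPlatformId}`,
  );
}
```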
In `@apps/web/lib/social-platforms/youtube-adapter.ts`:
- Around line 156-161: The loop over the uploads playlist currently early-exits
based on an assumed newest-first order, which can miss videos if ordering isn't
guaranteed; update the logic in the uploads playlist pagination loop (the for
(let page = 0; page < 10; page++) block that checks startTime) to stop relying
on ordering—either remove the early-exit condition so you always scan up to the
10-page cap, or implement a safety buffer by comparing items against (startTime
- toleranceMs) before exiting; adjust references to startTime and the early-exit
condition in that block accordingly so the scan is robust even if ordering
changes.
---
Duplicate comments:
In `@apps/web/app/`(ee)/api/cron/sync-social-engagement/route.ts:
- Around line 64-79: The code currently sets the watermark from
partnerPlatformEngagement alone (latestResult -> latestEngagementDate) which can
advance before partnerPlatformPost is durable; change the watermark logic so it
only advances after both tables are synced: query partnerPlatformPost similarly
(e.g., aggregate max on partnerPlatformPost to get latestPostDate) and compute
the watermark from the earlier of latestEngagementDate and latestPostDate (or
implement/persist an explicit lastSyncedDate that is only updated after the
entire sync completes successfully), then use that stable watermark when
computing startTime/endTime instead of using partnerPlatformEngagement._max.date
alone.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 461846ce-c0e6-4591-9cbc-238950257339
📒 Files selected for processing (9)
- apps/web/app/(ee)/api/cron/queue-sync-social-engagement/route.ts
- apps/web/app/(ee)/api/cron/sync-social-engagement/route.ts
- apps/web/lib/social-platforms/rate-limiter.ts
- apps/web/lib/social-platforms/x-adapter.ts
- apps/web/lib/social-platforms/x-client.ts
- apps/web/lib/social-platforms/x-schemas.ts
- apps/web/lib/social-platforms/youtube-adapter.ts
- apps/web/lib/social-platforms/youtube-client.ts
- apps/web/lib/social-platforms/youtube-schemas.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/web/app/(ee)/api/cron/queue-sync-social-engagement/route.ts
// The uploads playlist returns newest-first in practice (not guaranteed
// by the API docs). We paginate until we pass startTime, then stop.
// Cap at 10 pages (500 videos) — the playlist API doesn't support
// server-side time filtering, so we scan client-side. Most syncs
// (2-day window) finish in 1-2 pages due to the early exit.
for (let page = 0; page < 10; page++) {
Reliance on unspecified ordering could miss videos.
The early exit at lines 186-189 assumes the uploads playlist returns videos newest-first, but as the comment notes, this isn't guaranteed by the API. If YouTube changes ordering, videos within the time window could be skipped.
Consider either:
- Removing the early exit and scanning all pages (with the 10-page cap as protection)
- Adding a tolerance buffer before exiting
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/web/lib/social-platforms/youtube-adapter.ts` around lines 156 - 161, The
loop over the uploads playlist currently early-exits based on an assumed
newest-first order, which can miss videos if ordering isn't guaranteed; update
the logic in the uploads playlist pagination loop (the for (let page = 0; page <
10; page++) block that checks startTime) to stop relying on ordering—either
remove the early-exit condition so you always scan up to the 10-page cap, or
implement a safety buffer by comparing items against (startTime - toleranceMs)
before exiting; adjust references to startTime and the early-exit condition in
that block accordingly so the scan is robust even if ordering changes.
- Remove quote_count from comments field in x-adapter fetchPosts() so comments only reflects replies (quotes already in engagementRate)
- Paginate queue-sync-social-engagement with cursor-based batching
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/web/lib/social-platforms/x-adapter.ts`:
- Around line 140-174: The loop currently stops after 5 pages (500 tweets) and
silently returns incomplete results; update the pagination logic in the function
that calls checkXApiRateLimit/xFetch so that instead of a fixed for (let page =
0; page < 5; page++) cap you either remove the hard limit and iterate until
paginationToken is falsy, or detect when page limit is reached and surface an
explicit error (e.g., throw a SyncOverflowError or XApiRateLimitError with a
clear message) that includes context (userId, startTime, endTime, pagesFetched)
so callers know the sync overflowed; reference paginationToken, allTweets,
xFetch, checkXApiRateLimit, and XApiError/XApiRateLimitError when making the
change.
- Around line 141-145: The X API 429 responses are being wrapped as generic
XApiError which hides upstream rate-limit conditions; update the error handling
in the X API call path where you currently catch and rethrow as XApiError (the
block that handles the response/error after calling the remote X API) to check
err.status (or error.status) and if it equals 429 throw XApiRateLimitError
instead of XApiError; ensure you still preserve original error details when
constructing the XApiRateLimitError and keep the existing local
checkXApiRateLimit/XApiRateLimitError behavior intact so callers can uniformly
detect rate-limit conditions.
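For the first item, a sketch that surfaces overflow instead of silently truncating, written as if inside fetchPosts() with the thread's symbols in scope; the xFetch call signature and meta.next_token shape follow X API v2 conventions but are assumptions here:

```ts
const MAX_PAGES = 5;
const allTweets: unknown[] = []; // element type elided for the sketch
let paginationToken: string | undefined;
let pagesFetched = 0;

do {
  const { success } = await checkXApiRateLimit();
  if (!success) throw new XApiRateLimitError("X API rate limit exceeded");

  // Assumed xFetch shape: { data, meta: { next_token } }, per X API v2.
  const { data, meta } = await xFetch(userId, {
    startTime,
    endTime,
    paginationToken,
  });
  allTweets.push(...(data ?? []));
  paginationToken = meta?.next_token;
  pagesFetched++;
} while (paginationToken && pagesFetched < MAX_PAGES);

// Surface the truncation instead of returning a silently incomplete sync.
if (paginationToken) {
  throw new Error(
    `X sync exceeded ${MAX_PAGES} pages for userId=${userId}; results from ` +
      `${startTime.toISOString()} to ${endTime.toISOString()} would be incomplete`,
  );
}
```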
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 802be467-7953-4044-9199-ac3c7a960666
📒 Files selected for processing (2)
- apps/web/app/(ee)/api/cron/queue-sync-social-engagement/route.ts
- apps/web/lib/social-platforms/x-adapter.ts
const { success } = await checkXApiRateLimit();

if (!success) {
  throw new XApiRateLimitError("X API rate limit exceeded");
}
🧩 Analysis chain
🏁 Scripts executed:
cd /root/repo && cat -n apps/web/lib/social-platforms/x-adapter.ts | sed -n '135,170p'
cd /root/repo && cat -n apps/web/lib/social-platforms/x-client.ts | head -80
cd /root/repo && rg "class XApiError|class XApiRateLimitError|interface.*Error" apps/web/lib/social-platforms/x-client.ts -A 5
pwd && ls -la | head -20
git ls-files "apps/web/lib/social-platforms/x-adapter.ts" "apps/web/lib/social-platforms/x-client.ts"
cat -n apps/web/lib/social-platforms/x-adapter.ts | sed -n '130,175p'
cat -n apps/web/lib/social-platforms/x-client.ts
Normalize upstream HTTP 429 responses into XApiRateLimitError.
Currently, the local Upstash rate limit check at line 141 throws XApiRateLimitError when exhausted. However, if the X API itself responds with HTTP 429, line 162 rethrows it as a generic XApiError, making upstream quota exhaustion indistinguishable from other API failures. This prevents callers from reliably treating it as a retryable condition.
The error object passed to XApiError already includes the status property, so the fix is straightforward:
Suggested fix
 if (error) {
+  if (error.status === 429) {
+    throw new XApiRateLimitError(error.detail ?? error.statusText);
+  }
+
   throw new XApiError(error);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/web/lib/social-platforms/x-adapter.ts` around lines 141 - 145, The X API
429 responses are being wrapped as generic XApiError which hides upstream
rate-limit conditions; update the error handling in the X API call path where
you currently catch and rethrow as XApiError (the block that handles the
response/error after calling the remote X API) to check err.status (or
error.status) and if it equals 429 throw XApiRateLimitError instead of
XApiError; ensure you still preserve original error details when constructing
the XApiRateLimitError and keep the existing local
checkXApiRateLimit/XApiRateLimitError behavior intact so callers can uniformly
detect rate-limit conditions.
Summary by CodeRabbit
New Features
Chores