Commit 6a661b5

Fix email ingestor: treat dates slightly in the future correctly
There was logic that tried to prevent future dates because of some artifacts in old mboxes, but it caused issues with timezones and missing timezone handling. That is now fixed for mboxes as well; for the IMAP ingestor we simply accept the date as is.
1 parent 0a00729 commit 6a661b5

2 files changed

Lines changed: 14 additions & 7 deletions

app/services/email_ingestor.rb

Lines changed: 13 additions & 6 deletions
@@ -1,13 +1,13 @@
 # frozen_string_literal: true

 class EmailIngestor
-  def ingest_raw(raw_message, fallback_threading: false)
+  def ingest_raw(raw_message, fallback_threading: false, trust_date: false)
     m = Mail.new(raw_message)

     message_id = clean_reference(m.message_id)
     message_id = nil if message_id.blank?
     return nil unless message_id
-    sent_at = sanitize_email_date(m.date, m[:date], message_id)
+    sent_at = trust_date ? m.date : sanitize_email_date(m.date, m[:date], message_id)

     body = normalize_body(extract_body(m))
     existing_message = Message.find_by_message_id(message_id)
@@ -302,10 +302,17 @@ def add_mentions(msg, users)
   end

   def sanitize_email_date(mail_date, mail_date_header, message_id)
-    current_time = Time.now
-    return mail_date if mail_date.nil? || (mail_date >= Time.parse('1996-01-01') && mail_date <= current_time)
+    return mail_date if mail_date.nil?
+
+    current_time_utc = Time.now.utc
+    mail_date_utc = mail_date.utc
+    min_valid_date = Time.parse('1996-01-01 00:00:00 UTC')
+    future_tolerance = 24 * 3600
+
+    if mail_date_utc >= min_valid_date && mail_date_utc <= current_time_utc + future_tolerance
+      return mail_date
+    end

-    original_date = mail_date
     sanitized_date = mail_date

     if mail_date_header && mail_date_header.to_s =~ /\b(\d{2})\s+\w+\s+(\d{2,4})\b/
@@ -320,7 +327,7 @@ def sanitize_email_date(mail_date, mail_date_header, message_id)
       end
     end

-    if sanitized_date > current_time || sanitized_date.year < 1996
+    if sanitized_date.utc > current_time_utc + future_tolerance || sanitized_date.year < 1996
       sanitized_date = Time.parse('2000-01-01 00:00:00 UTC')
     end
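The acceptance window introduced in `sanitize_email_date` can be sketched in isolation as follows. This is a minimal illustration, not the project's code: the helper name `date_in_accepted_window?`, the constants, and the injectable `now:` parameter are assumptions made here for testability, and plain `Time` values are used instead of whatever `m.date` returns in the application.

```ruby
require 'time'

# Hypothetical constants mirroring the values in the diff.
MIN_VALID_DATE = Time.utc(1996, 1, 1)
FUTURE_TOLERANCE = 24 * 3600 # seconds; one day of slack for "slightly in the future"

# Hypothetical helper: true when the date would be returned unchanged.
# `now:` is an assumption added so the check can be exercised deterministically.
def date_in_accepted_window?(mail_date, now: Time.now)
  return true if mail_date.nil? # the diff returns nil dates as-is

  mail_date_utc = mail_date.getutc
  mail_date_utc >= MIN_VALID_DATE && mail_date_utc <= now.getutc + FUTURE_TOLERANCE
end

now = Time.utc(2024, 5, 1, 12, 0, 0)

# 03:00 on 2 May at +12:00 is 15:00 UTC on 1 May, three hours ahead of `now`.
puts date_in_accepted_window?(Time.parse('2024-05-02 03:00:00 +1200'), now: now) # => true

# More than a day ahead falls outside the tolerance.
puts date_in_accepted_window?(Time.utc(2024, 5, 3, 12, 0, 1), now: now) # => false
```

Dates outside the window are not rejected outright in the commit; they go on to the header-based repair and, failing that, the 2000-01-01 fallback.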

app/services/imap_idle_runner.rb

Lines changed: 1 addition & 1 deletion
@@ -149,7 +149,7 @@ def process_uid(uid)
     msg = nil
     ActiveRecord::Base.transaction do
       msg = instrument('ingestor.ingest', uid: uid) do
-        @ingestor.ingest_raw(raw)
+        @ingestor.ingest_raw(raw, trust_date: true)
       end
     end
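The effect of the new `trust_date:` keyword can be shown with a small stand-in. This is a sketch under stated assumptions: `DateChoosingIngestor` and `choose_sent_at` are hypothetical names, only the date-selection step is modelled, and the header-based repair that the real `sanitize_email_date` attempts before falling back is omitted.

```ruby
# Hypothetical stand-in for EmailIngestor; not the project's class.
class DateChoosingIngestor
  FALLBACK_DATE = Time.utc(2000, 1, 1) # same fallback value as in the diff

  # With trust_date: true (the IMAP path) the parsed date is used as-is.
  # With the default, the date goes through a simplified sanitizer.
  # `now:` is an assumption added so the example is deterministic.
  def choose_sent_at(parsed_date, trust_date: false, now: Time.now)
    return parsed_date if trust_date

    sanitize(parsed_date, now)
  end

  private

  # Simplified: skips the Date-header repair step of the real method.
  def sanitize(date, now)
    return date if date.nil?

    utc = date.getutc
    in_window = utc >= Time.utc(1996, 1, 1) && utc <= now.getutc + 24 * 3600
    in_window ? date : FALLBACK_DATE
  end
end

ingestor = DateChoosingIngestor.new
now = Time.utc(2024, 5, 1)
far_future = Time.utc(2030, 1, 1)

puts ingestor.choose_sent_at(far_future, trust_date: true, now: now) == far_future # => true
puts ingestor.choose_sent_at(far_future, now: now) == Time.utc(2000, 1, 1)         # => true
```

Because the keyword defaults to `false`, existing callers such as the mbox import keep the sanitizing behaviour without changes.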
