Skip to content

CryptoFail due to Invalid Nonce, caused by Quiche AEAD algorithm selection #2411

@somethingcorrosive

Description

@somethingcorrosive

As part of the contributing guidelines I am creating this issue to start the conversation before pushing a PR with the fix, which I will include the fix in this issue as well.

I also looked for other conversations, I found this one from 2020, with no response: #524

In our QUIC stack, connections would complete handshake but then fail on early 1-RTT sends because quiche’s BoringSSL backend selected TLS13-specific AES-GCM AEAD entrypoints (EVP_aead_aes_gcm_tls13), which on the resolved boring backend rejected that runtime nonce/counter usage with INVALID_NONCE; this surfaced from quiche as CryptoFail, and in tokio-quiche appeared as send_on_path failure followed by local transport close code 1, causing reconnect churn. Switching quiche’s AES-GCM selection to the generic AEAD entrypoints (EVP_aead_aes_gcm) resolved the nonce failure and restored stable connections.

Findings:

  • In tokio-quiche, that failure is observed in the worker send path at qconn.send_on_path(...), then it closes with transport InternalError (1) if no local error is already set , see source here:

if send_buf.len() > segment_size.unwrap_or(usize::MAX) {
// Never let the buffer be longer than segment size, for GSO to
// function properly.
send_buf = &mut send_buf[..segment_size.unwrap_or(usize::MAX)];
}
// On the first call to `select_path()` a path will be chosen based on
// the local address the connection initially landed on. Once a path is
// selected following calls to `select_path()` will return it, until it
// is reset at the start of the next write cycle.
//
// The path is then passed to `send_on_path()` which will only generate
// packets meant for that path, this way a single GSO buffer will only
// contain packets that belong to the same network path, which is
// required because the from/to addresses for each `sendmsg()` call
// apply to the whole GSO buffer.
let (from, to) = self.select_path(qconn).unzip();
match qconn.send_on_path(send_buf, from, to) {
Ok((packet_size, info)) => {
let _ = send_info.get_or_insert(info);
self.write_state.bytes_written += packet_size;
self.write_state.num_pkts += 1;
let from = send_info.as_ref().map(|info| info.from);
let to = send_info.as_ref().map(|info| info.to);
self.write_state.selected_path = from.zip(to);
self.write_state.has_pending_data = true;
Ok(packet_size)
},
Err(QuicheError::Done) => {
// Flush the current buffer to network. If no other path needs
// to be flushed to the network also yield the work loop task.
//
// Otherwise the write loop will start again and the next path
// will be selected.
let has_pending_paths = self.write_state.pending_paths.len() > 0;
// Keep writing if there are paths left to try.
self.write_state.has_pending_data = has_pending_paths;
Ok(0)
},
Err(e) => {
let error_code = if let Some(local_error) = qconn.local_error() {
local_error.error_code
} else {

  • The send path calls into quiche packet encryption (encrypt_pkt -> seal_with_u64_counter):

quiche/quiche/src/packet.rs

Lines 711 to 725 in b30f9e7

extra_in,
)?;
encrypt_hdr(&mut header, pn_len, payload.as_ref(), aead)?;
Ok(payload_offset + ciphertext_len)
}
pub fn encode_pkt_num(
pn: u64, pn_len: usize, b: &mut octets::OctetsMut,
) -> Result<()> {
match pn_len {
1 => b.put_u8(pn as u8)?,
2 => b.put_u16(pn as u16)?,

  • In quiche’s BoringSSL backend, AEAD algorithm selection uses TLS13-specific AES-GCM entrypoints:

#[repr(C)]
pub(crate) struct AES_KEY {
rd_key: [u32; 4 * (14 + 1)],
rounds: c_int,
}
impl Algorithm {
fn get_evp_aead(self) -> *const EVP_AEAD {
match self {

  • The actual crypto failure is emitted from the seal call (EVP_AEAD_CTX_seal_scatter) which maps to CryptoFail on non-success: 
    let rc = unsafe {
    EVP_AEAD_CTX_open(
    &self.ctx, // ctx
    buf.as_mut_ptr(), // out
    &mut out_len, // out_len
    max_out_len, // max_out_len
    nonce[..].as_ptr(), // nonce
    nonce.len(), // nonce_len
    buf.as_ptr(), // inp
    buf.len(), // in_len
    ad.as_ptr(), // ad
    ad.len(), // ad_len
    )
    };
    if rc != 1 {
    return Err(Error::CryptoFail);
    }
    Ok(out_len)
    }
    pub fn seal_with_u64_counter(
    &mut self, counter: u64, ad: &[u8], buf: &mut [u8], in_len: usize,
    extra_in: Option<&[u8]>,
    ) -> Result<usize> {
    let tag_len = self.alg.tag_len();
    let mut out_tag_len = tag_len;
    let (extra_in_ptr, extra_in_len) = match extra_in {
    Some(v) => (v.as_ptr(), v.len()),
    None => (std::ptr::null(), 0),
    };
    // Make sure all the outputs combined fit in the buffer.
    if in_len + tag_len + extra_in_len > buf.len() {
    return Err(Error::CryptoFail);
    }
    let nonce = make_nonce(&self.nonce, counter);
    let rc = unsafe {

I have tested this patch locally and it works, however, it could have some potential impacts on decisions later.

The fix is as follows:

impl Algorithm {
fn get_evp_aead(self) -> *const EVP_AEAD {
match self {
Algorithm::AES128_GCM => unsafe { EVP_aead_aes_128_gcm_tls13() },
Algorithm::AES256_GCM => unsafe { EVP_aead_aes_256_gcm_tls13() },
Algorithm::ChaCha20_Poly1305 => unsafe {
EVP_aead_chacha20_poly1305()
},
}
}
}

Changes to:

`impl Algorithm {
    fn get_evp_aead(self) -> *const EVP_AEAD {
        match self {
            Algorithm::AES128_GCM => unsafe { EVP_aead_aes_128_gcm() },
            Algorithm::AES256_GCM => unsafe { EVP_aead_aes_256_gcm() },
            Algorithm::ChaCha20_Poly1305 => unsafe {
                EVP_aead_chacha20_poly1305()
            },
        }
    }
}`

And:

fn EVP_aead_aes_128_gcm_tls13() -> *const EVP_AEAD;
fn EVP_aead_aes_256_gcm_tls13() -> *const EVP_AEAD;
fn EVP_aead_chacha20_poly1305() -> *const EVP_AEAD;

Changes to:

`extern "C" {
    fn EVP_aead_aes_128_gcm() -> *const EVP_AEAD;
    fn EVP_aead_aes_256_gcm() -> *const EVP_AEAD;
    fn EVP_aead_chacha20_poly1305() -> *const EVP_AEAD;`

PR: #2412

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions