Merged
31 changes: 11 additions & 20 deletions PRNG.ml
@@ -20,6 +20,7 @@ module type STATE = sig
val make_self_init: unit -> t
val bool: t -> bool
val bit: t -> bool
+ val uniform: t -> float
val float: t -> float -> float
val byte: t -> int
val bits8: t -> int
@@ -49,6 +50,7 @@ module type PURE = sig
val make_self_init: unit -> t
val bool: t -> bool * t
val bit: t -> bool * t
+ val uniform: t -> float * t
val float: float -> t -> float * t
val byte: t -> int * t
val bits8: t -> int * t
@@ -143,14 +145,12 @@ let nativeint =
then fun g bound -> Nativeint.of_int32 (int32 g (Nativeint.to_int32 bound))
else fun g bound -> Int64.to_nativeint (int64 g (Int64.of_nativeint bound))

- let float_64 g bound =
+ let rec uniform g =
  let b = X.bits64 g in
- (Int64.(to_float (shift_right_logical b 11)) *. 0x1.p-53) *. bound
+ let n = Int64.shift_right_logical b 11 in
+ if n = 0L then uniform g else Int64.to_float n *. 0x1.p-53

This implementation certainly works, but:

  • Why drop the lowest 11 bits when we could in fact take them into account (and multiply by 0x1p-64 instead)?
  • Testing n = 0 and branching can be a bit costly. Instead, the same behaviour could be obtained by adding a small number after the multiplication (0x1p-54 or 0x1p-65, depending on what you choose for the first remark). Sure, then you need to change the documentation below...

Owner Author

"take [all the bits] into account (and multiply by 0x1p-64 instead)": then you get FP rounding during the int64 -> float conversion. The rounding can produce 0.0 and 1.0, and seems to introduce some (very subtle) bias in the output, as discussed in Generating Random Floating-Point Numbers by Dividing Integers: a Case Study. With the current code, we know exactly which FP values we can get, and we're sure they all have the same probability.

"Testing n = 0 and branching can be a bit costly". Really? This is one integer comparison plus a branch that is perfectly predicted (as n is almost never 0).

"adding a small number after the multiplication": the resulting rounding makes me nervous (again).

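The rounding concern can be checked directly. The sketch below is an illustration, not code from this PR: `of_top53` is a hypothetical helper that mirrors the shift-by-11 conversion, and `of_top63` is a hypothetical 63-bit variant standing in for the "keep more bits" idea (a full 64-bit version would also need an unsigned conversion, since `Int64.to_float` is signed).

```ocaml
(* Hypothetical helper: keep the top 53 bits of a 64-bit word, as the PR does. *)
let of_top53 (b : int64) : float =
  Int64.to_float (Int64.shift_right_logical b 11) *. 0x1.p-53

(* Illustrative variant keeping 63 bits: the int64 -> float conversion
   can no longer be exact, so it has to round. *)
let of_top63 (b : int64) : float =
  Int64.to_float (Int64.shift_right_logical b 1) *. 0x1.p-63

let () =
  let all_ones = -1L in
  (* 53 bits fit in a double's significand, so the result stays below 1.0. *)
  assert (of_top53 all_ones < 1.0);
  (* 2^63 - 1 is not representable; it rounds up to 2^63, giving exactly 1.0. *)
  assert (of_top63 all_ones = 1.0)
```

With 53 bits every conversion is exact, which is why the set of possible outputs is known precisely.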
Owner Author

Just for fun: I tweaked the n = 0 test so that the conditional branch is statically predicted as not taken.


Right. Adding 0x1p-54 here would make it possible to produce 1., because of the round-to-even rule. And adding a slightly smaller number would introduce a bias of about -0x1p-55, because all values > 0.5 would be rounded down. While this is a very small bias (probably impossible to notice), I agree that what you propose is probably the best.

My concern about branching was a remnant of what happens when writing the same code in C: there, the added branch would be a problem, since it would prevent some SIMD optimizations. But ocamlopt does not support these optimizations anyway, and the branch will very likely be predicted.
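Both rounding effects mentioned above can be reproduced with plain float arithmetic. This is only a sketch: the offset 0x1p-55 is an arbitrary choice standing in for "a slightly smaller number", and the specific test values are picked for illustration.

```ocaml
let () =
  (* Largest value of the form n * 2^-53 with n < 2^53. *)
  let largest = 1.0 -. 0x1.p-53 in
  (* Adding 2^-54 lands exactly halfway to 1.0; round-to-even picks 1.0. *)
  assert (largest +. 0x1.p-54 = 1.0);
  (* A smaller offset (2^-55, chosen arbitrarily) is absorbed in [0.5, 1),
     where adjacent doubles are 2^-53 apart... *)
  assert (0.75 +. 0x1.p-55 = 0.75);
  (* ...but is kept for small values, so large outputs end up shifted down
     relative to small ones. *)
  assert (0x1.p-53 +. 0x1.p-55 = 0x1.4p-53)
```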

Owner Author

Right, I forgot about your adventures with SIMD :-)

I take it that you find the name uniform acceptable for this function? The only alternative I could think of is float1, but it's ugly.


I agree that uniform is not such a good name, because it does not follow the convention of the rest of the file, where the names follow, in some sense, the type of the return value, while uniform describes the distribution.

One possibility would be not to expose this function publicly, but still document the new guarantee. The downside is the additional multiplication by the constant 1., but:

  • This is "just" a float multiplication, which is not very costly
  • We could hope (hint, hint!) that the OCaml compiler would remove the multiplication if one of the operands is 1., which would be clear after inlining. (Correct me if I'm wrong, but removing a multiplication by 1.0 is one of the few correct simplifications of float computations.)

Owner Author

I agree the additional multiplication by 1.0 is not costly, so it's OK if users continue to call [float g 1.0] instead of [uniform g]. On the other hand, I find it makes the documentation clearer. By "it" I mean "having the uniform function exported and documented separately". So let's do that.
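As a self-contained sketch of the outcome of this thread: the code below reproduces the logic of `uniform` and `float` outside the functor. `make_source` is a hypothetical stand-in for `X.bits64` that replays a fixed list of words, so the behaviour on specific inputs (including the rejected all-zero case) can be observed.

```ocaml
(* Hypothetical stand-in for X.bits64: replays a fixed list of 64-bit words. *)
let make_source (words : int64 list) : unit -> int64 =
  let remaining = ref words in
  fun () ->
    match !remaining with
    | [] -> failwith "source exhausted"
    | w :: rest -> remaining := rest; w

(* Same logic as the PR's uniform, parameterised over the bit source. *)
let rec uniform bits64 =
  let n = Int64.shift_right_logical (bits64 ()) 11 in
  if n = 0L then uniform bits64 else Int64.to_float n *. 0x1.p-53

let float bits64 bound = uniform bits64 *. bound

let () =
  (* The all-zero word is rejected, so the result comes from 0x800L (n = 1). *)
  let src = make_source [0L; 0x800L] in
  assert (uniform src = 0x1.p-53);
  (* Multiplying by 1.0 is exact: float g 1.0 and uniform g agree. *)
  let a = make_source [-1L] and b = make_source [-1L] in
  assert (float a 1.0 = uniform b)
```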


- let float_32 g bound =
- let a = X.bits30 g in
- let b = X.bits30 g in
- (float a *. 0x1.p-60 +. float b *. 0x1.p-30) *. bound
+ let float g bound = uniform g *. bound

end

@@ -218,14 +218,13 @@ let nativeint =
(Int64.to_nativeint r, g')
end

- let float_64 bound g =
+ let rec uniform g =
  let (b, g) = X.bits64 g in
- ((Int64.(to_float (shift_right_logical b 11)) *. 0x1.p-53) *. bound, g)
+ let n = Int64.shift_right_logical b 11 in
+ if n = 0L then uniform g else (Int64.to_float n *. 0x1.p-53, g)

- let float_32 bound g =
- let (a, g) = X.bits30 g in
- let (b, g) = X.bits30 g in
- ((float a *. 0x1.p-60 +. float b *. 0x1.p-30) *. bound, g)
+ let float bound g =
+ let (f, g) = uniform g in (f *. bound, g)

end

@@ -317,8 +316,6 @@ include StateDerived(struct
let errorprefix = "PRNG.Splitmix.State."
end)

- let float = float_64

let bytes g dst ofs len =
if ofs < 0 || len < 0 || ofs > Bytes.length dst - len then
invalid_arg "PRNG.State.bytes"
@@ -401,8 +398,6 @@ include PureDerived(struct
let errorprefix = "PRNG.Splitmix.Pure."
end)

- let float = float_64

let split g =
let g1 = next g in
let g2 = next g1 in
@@ -554,8 +549,6 @@ include StateDerived(struct
let errorprefix = "PRNG.Chacha.State."
end)

- let float = if Sys.word_size = 64 then float_64 else float_32

let bytes g dst ofs len =
if ofs < 0 || len < 0 || Bytes.length dst - len > ofs then
invalid_arg "PRNG.Chacha.State.bytes";
@@ -676,8 +669,6 @@ include PureDerived(struct
let errorprefix = "PRNG.Chacha.Pure."
end)

- let float = if Sys.word_size = 64 then float_64 else float_32

let bytes g dst ofs len =
if ofs < 0 || len < 0 || Bytes.length dst - len > ofs then
invalid_arg "PRNG.Chacha.Pure.bytes";
16 changes: 13 additions & 3 deletions PRNG.mli
@@ -59,10 +59,19 @@ module type STATE = sig
val bit: t -> bool
(** Return a Boolean value in [false,true] with 0.5 probability each. *)

+ val uniform: t -> float
+ (** Return a floating-point number evenly distributed between 0.0 and 1.0.
+ 0.0 and 1.0 are never returned.
+ The result is of the form [n * 2{^-53}], where [n] is a random integer
+ in [(0, 2{^53})]. *)
+
  val float: t -> float -> float
  (** [float g x] returns a floating-point number evenly distributed
- between 0.0 and [x]. If [x] is negative, negative numbers
- between [x] and 0.0 are returned. *)
+ between 0.0 and [x]. If [x] is negative, negative numbers
+ between [x] and 0.0 are returned. Implemented as [uniform g *. x].
+ Consequently, the values [0.0] and [x] can be returned
+ (as a result of floating-point rounding), but not if [x] is
+ [1.0], in which case [float g x] behaves like [uniform g]. *)

val byte: t -> int
val bits8: t -> int
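
The caveat in the `float` documentation above can be illustrated with an extreme bound. This is a sketch, not part of the PR: it takes the smallest and largest values of the documented form `n * 2^-53` and multiplies them by the smallest positive double, which is one case where rounding reaches both endpoints.

```ocaml
let () =
  (* Extreme values uniform can return, per the documented form n * 2^-53. *)
  let smallest = 0x1.p-53 and largest = 1.0 -. 0x1.p-53 in
  (* Smallest positive (subnormal) double, used here as an extreme bound. *)
  let x = ldexp 1.0 (-1074) in
  (* The product underflows to zero. *)
  assert (smallest *. x = 0.0);
  (* The product rounds up to the bound itself. *)
  assert (largest *. x = x);
  (* With a bound of 1.0 the product is exact, so neither endpoint is reached. *)
  assert (smallest *. 1.0 > 0.0 && largest *. 1.0 < 1.0)
```
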
@@ -90,7 +99,7 @@ module type STATE = sig
Note that [int32 Int32.max_int] produces numbers between 0 and
[Int32.max_int] excluded. To produce numbers between 0 and
[Int32.max_int] included, use
- [Int32.logand (bits32 g) Int64.max_int]. *)
+ [Int32.logand (bits32 g) Int32.max_int]. *)

val bits64: t -> int64
(** Return a 64-bit integer evenly distributed between
@@ -179,6 +188,7 @@ module type PURE = sig
val bool: t -> bool * t
val bit: t -> bool * t

+ val uniform: t -> float * t
val float: float -> t -> float * t

val byte: t -> int * t