Long-mantissa JSON numbers (>~19 sig digits) silently mis-parsed by orders of magnitude from std::string under default policy #474

@marcelomatus

Library: daw_json_link 3.31.0
Compiler: g++-15, reproduced at -std=c++20, c++23, c++26, all -O levels

Summary

A JSON number literal with more significant digits than fit in a uint64_t
(roughly > 19 significant digits) is parsed to a catastrophically wrong
value — off by many orders of magnitude — silently, with no error, when both:

  1. the document is passed as a std::string (a char[] literal or a
    std::string_view over the same bytes parse correctly), and
  2. the default parse policy is used (options::IEEE754Precise::no).

This is not a last-ULP rounding error. For 0.33333333333333333333 (20 threes)
the parser returns 0.14886589259623784; adding more digits drives the result
toward 1e-20 and smaller.

Minimal reproducer

#include <cstdio>
#include <string>
#include <daw/json/daw_json_link.h>

struct Holder { double v; };
template<> struct daw::json::json_data_contract<Holder> {
  using type = json_member_list<json_number<"v", double>>;
};

int main() {
  const std::string doc = R"({"v":0.33333333333333333333})"; // 20 sig digits
  std::printf("%.17g\n", daw::json::from_json<Holder>(doc).v);
  // prints 0.14886589259623784   (expected 0.33333333333333331)
}

Input-type dependence (same bytes, same literal)

Input form          Result
char[] literal      0.33333333333333331
std::string_view    0.33333333333333331
std::string         0.14886589259623784

It is general, not value-specific

Divergence from strtod/std::from_chars begins once the significant-digit
count crosses ~19–20, for every magnitude tested. The catastrophic value keeps
shrinking as digits are added:

0.33333333333333333333        (20) -> 0.14886589259623784
0.333333333333333333333       (21) -> 0.0012919400065614041
0.3333333333333333333333333   (25) -> 6.6792140173563225e-07
1.00000000000000000001        (21) -> 0.077662796314522414   (truth: 1.0)
123.44444444444444444444      (23) -> 0.035726591327544133   (truth: 123.444…)

Root cause

include/daw/json/impl/daw_json_parse_real.h, parse_real_known (and the same
shape in parse_real_unknown):

  • use_strtod is computed unconditionally (should_use_strtod(...), ~L153),
    correctly detecting digit_count > digits10<uint64_t>.
  • But the statement that acts on it,
    if (use_strtod) return parse_with_strtod(...) (~L267–271), lives inside
    if constexpr (std::is_floating_point_v<Result> and ParseState::precise_ieee754)
    (~L259).

Under the default policy precise_ieee754 is false, so the entire fallback is
compiled out. Execution always falls through to
power10(significant_digits, exponent) with a significand that has been
truncated/overflowed, producing the garbage value. The char[] / string_view
paths happen to take a parse-state configuration that avoids the faulty
fall-through; the std::string path does not.

Confirmation / workaround

Enabling precise parsing fixes every case:

daw::json::from_json<Holder>(
    doc,
    daw::json::options::parse_flags<daw::json::options::IEEE754Precise::yes>);
// -> 0.33333333333333331  (correct)

Suggested fix

Either:

  • act on use_strtod regardless of precise_ieee754 when
    digit_count > digits10<uint64_t> (over-long significands are unsafe in the
    fast path no matter the precision mode), or
  • ensure the std::string overload selects the same parse-state configuration
    as std::string_view, and when digits are dropped/overflowed without a
    strtod fallback, adjust the decimal exponent so the result is at worst a
    truncation rather than a magnitude error.

At minimum, the fast path should never return a result off by orders of
magnitude without signaling an error.
