Internet-Draft U-CBOR March 2025
Rundgren Expires 4 September 2025
Workgroup: Internet Engineering Task Force
Internet-Draft: draft-rundgren-universal-cbor-02
Published: March 2025
Intended Status: Informational
Expires: 4 September 2025
Author: A. Rundgren, Ed. (Independent)

Universal CBOR (U-CBOR)

Abstract

This document defines Universal CBOR (U-CBOR), a strict subset of CBOR (RFC 8949) intended to serve as a viable replacement for JSON. To foster interoperability, deterministic encoding is mandated. Furthermore, the document outlines how deterministic encoding, combined with enhanced CBOR tools, enables cryptographic constructs that operate on "raw" (non-wrapped) CBOR data. This document mainly targets CBOR tool developers.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 4 September 2025.

1. Introduction

The Universal CBOR (U-CBOR) specification is based on CBOR [RFC8949]. CBOR permits certain objects to be encoded in more than one way, but supporting such variation in general-purpose, platform-based tools is non-trivial and of limited utility. To cope with this, U-CBOR defines a specific (non-variant) encoding scheme, also known as "Deterministic Encoding". The selected encoding scheme is believed to be compatible with most existing systems using CBOR. See also Appendix C.

U-CBOR is intended to be agnostic with respect to programming languages.

By combining the compact binary representation and the rich set of data types offered by CBOR with a deterministic encoding scheme, U-CBOR could, for new designs, serve as a viable alternative to JSON [RFC8259]. Although the mandated encoding scheme has proved to be deployable in constrained environments, the primary targets are mainstream platforms like mobile phones and Web servers.

However, to unleash the full power of deterministic encoding, including the ability to perform cryptographic operations on "raw" (non-wrapped) CBOR data, compliant U-CBOR tools need additional functionality. See also Appendix B.

Section 2 contains the actual specification.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

1.2. Common Definitions

2. Detailed Description

This section describes the three pillars that U-CBOR relies on.

2.1. Supported CBOR Objects

The following table shows the CBOR subset supported by U-CBOR:

Table 1: Supported CBOR Objects

CDDL          Note
------------  --------------------------------------
int           Integer
bigint        Big integer
float         16-, 32-, and 64-bit [IEEE754] numbers
tstr          Text string encoded as UTF-8 [RFC3629]
bstr          Byte string
bool          Boolean true and false
null          Represents a null object
[]            Array
{}            Map
#6.nnn(type)  Tagged data

Conforming implementations only have to implement the U-CBOR types required by the targeted application(s).

Although extensions are imaginable (like supporting all "simple" types), extensions will most likely cause interoperability issues and are thus NOT RECOMMENDED. In addition, the mandated CBOR subset is compatible with most computer languages and platforms. Compared to the current state of the art, JSON [RFC8259], the availability of bigint, bstr, and "tagged data" represents a major improvement.

However, nothing prevents developers from supporting additional, "virtual" data types at the application (API) level through mapping concepts. This is analogous to how an application's data model is mapped to the set of data types available in a data interchange format, a database, or a programming language.

2.2. Deterministic Encoding Scheme

The U-CBOR encoding scheme adheres to section 4.2 of [RFC8949], but adds a few constraints (denoted by RFC+), where the RFC offers choices. The deterministic encoding rules are as follows:

  • RFC+: Floating point and integer objects MUST be treated as distinct types regardless of their numeric value. This is compliant with Rule 2 in section 4.2.2 of [RFC8949].
  • RFC: Integers, represented by the int and bigint types, MUST use the int type if the value is between -2^64 and 2^64-1, otherwise the bigint type MUST be used. Appendix A.1 features a list of compliant integer sample values.

  • RFC+: Floating point numbers MUST always use the shortest [IEEE754] variant that preserves the precision of the original value. Appendix A.2 features a list of compliant floating point sample values.

    Note that NaNs carrying a payload or "signaling" information (like f97e01) MUST be rejected.

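  • RFC: Map keys MUST be sorted in the bytewise lexicographic order of their deterministic encoding. Duplicate keys MUST be rejected. Somewhat surprisingly, the following represents a properly sorted map, because the encoding of a text string begins with its length, so shorter keys sort before longer ones: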

    {
      "a": ... ,
      "b": ... ,
      "aa": ...
    }
  • RFC+: Since CBOR encoding according to this specification maintains type and data uniqueness, there are no specific restrictions or tests needed in order to determine map key equivalence. As an example, the floating-point numbers 0.0 and -0.0, and the integer number 0 represent the distinct keys f90000, f98000, and 00 respectively.
  • RFC+: Indefinite length objects MUST be rejected.
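
The map key ordering rule above can be illustrated with a minimal sketch. It is not part of the specification: the helper names encodeTextKey, compareBytes, and sortKeys are hypothetical, and only text string keys shorter than 24 UTF-8 bytes are handled.

```javascript
// Hypothetical helper: deterministic encoding of a short text string key.
// Assumes fewer than 24 UTF-8 bytes, so the length fits in the initial byte.
function encodeTextKey(text) {
  const utf8 = new TextEncoder().encode(text);
  if (utf8.length > 23) throw new RangeError('sketch only handles short keys');
  return Uint8Array.of(0x60 | utf8.length, ...utf8);
}

// Bytewise lexicographic comparison of two encoded keys.
function compareBytes(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return a.length - b.length;
}

// Sort keys by their encoding and reject duplicates.
function sortKeys(keys) {
  const encoded = keys.map(key => ({ key, bytes: encodeTextKey(key) }));
  encoded.sort((x, y) => compareBytes(x.bytes, y.bytes));
  for (let i = 1; i < encoded.length; i++) {
    if (compareBytes(encoded[i - 1].bytes, encoded[i].bytes) === 0) {
      throw new Error('duplicate key: ' + encoded[i].key);
    }
  }
  return encoded.map(e => e.key);
}

console.log(sortKeys(['aa', 'b', 'a']));  // [ 'a', 'b', 'aa' ]
```

The keys "a", "b", and "aa" encode as 6161, 6162, and 626161 respectively, which is why "aa" ends up last.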

2.3. CBOR Tool Requirements

The primary feature that deterministic encoding brings to the table is that wrapping CBOR data to be signed in bstr objects, as specified by COSE [RFC9052] (Section 2), is no longer a prerequisite. That is, cryptographic operations can optionally be performed on "raw" CBOR data. Turn to Appendix B for an example of an application depending on such features.

However, to make this a reality, the following functionality MUST be provided by CBOR tools compliant with this specification:

  • It MUST be possible to add, delete, and update the contents of CBOR map and array objects in received and decoded CBOR data. Note: CBOR primitives MUST remain immutable.
  • It MUST be possible to reserialize received CBOR data, be it updated or not.
  • Irrespective of whether CBOR data was received, updated, or created programmatically, deterministic encoding MUST be maintained.
  • Invalid or unsupported CBOR constructs, as well as CBOR data not adhering to the deterministic encoding scheme, MUST be rejected. See also Appendix C and Appendix A.3.

It is RECOMMENDED to separate CBOR data from application / platform-level data, since the latter may not always reserialize as expected, as in this Chrome browser console example:

> let date = new Date('2025-03-02T13:08:55.0001Z');
> date.toISOString()
'2025-03-02T13:08:55.000Z'

How this separation is actually accomplished is out of scope for this specification. However, encapsulating CBOR data in high-level, self-rendering objects is one usable method. Similar approaches are used by most ASN.1 tools. The code in Appendix B.5 shows an example that updates and reserializes decoded CBOR data.
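
One possible shape of such encapsulation is sketched below. The class DateTimeText and its methods are hypothetical and not taken from any particular CBOR tool. The point is only that the received text is kept untouched for reserialization, while a platform-level Date is derived on request.

```javascript
// Hypothetical wrapper: keeps the received text string untouched and only
// derives a platform-level Date object when the application asks for one.
class DateTimeText {
  #text;

  constructor(text) {
    this.#text = text;
  }

  // Value to hand back to the CBOR encoder: exactly what was received.
  toCborText() {
    return this.#text;
  }

  // Platform-level view; may drop precision the platform cannot represent.
  toDate() {
    return new Date(this.#text);
  }
}

const received = new DateTimeText('2025-03-02T13:08:55.0001Z');
console.log(received.toDate().toISOString());  // platform view, millisecond precision
console.log(received.toCborText());            // 2025-03-02T13:08:55.0001Z
```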

3. IANA Considerations

This memo includes no request to IANA.

4. Security Considerations

All is good 😸

5. References

5.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8949]
Bormann, C. and P. Hoffman, "Concise Binary Object Representation (CBOR)", STD 94, RFC 8949, DOI 10.17487/RFC8949, <https://www.rfc-editor.org/info/rfc8949>.
[RFC8610]
Birkholz, H., Vigano, C., and C. Bormann, "Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610, <https://www.rfc-editor.org/info/rfc8610>.
[RFC3629]
Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, <https://www.rfc-editor.org/info/rfc3629>.
[IEEE754]
IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229, <https://ieeexplore.ieee.org/document/8766229>.

5.2. Informative References

[RFC9052]
Schaad, J., "CBOR Object Signing and Encryption (COSE): Structures and Process", STD 96, RFC 9052, DOI 10.17487/RFC9052, <https://www.rfc-editor.org/info/rfc9052>.
[RFC9053]
Schaad, J., "CBOR Object Signing and Encryption (COSE): Initial Algorithms", RFC 9053, DOI 10.17487/RFC9053, <https://www.rfc-editor.org/info/rfc9053>.
[RFC8785]
Rundgren, A., Jordan, B., and S. Erdtman, "JSON Canonicalization Scheme (JCS)", RFC 8785, DOI 10.17487/RFC8785, <https://www.rfc-editor.org/info/rfc8785>.
[RFC8259]
Bray, T., Ed., "The JavaScript Object Notation (JSON) Data Interchange Format", STD 90, RFC 8259, DOI 10.17487/RFC8259, <https://www.rfc-editor.org/info/rfc8259>.
[CSF]
Rundgren, A., "CBOR Signature Format (CSF)", <https://cyberphone.github.io/javaapi/org/webpki/cbor/doc-files/signatures.html>.
[CEF]
Rundgren, A., "CBOR Encryption Format (CEF)", <https://cyberphone.github.io/javaapi/org/webpki/cbor/doc-files/encryption.html>.
[CREDENTIALS]
Sporny, M., et al., "Verifiable Credential Data Integrity 1.0", <https://www.w3.org/TR/vc-data-integrity/>.
[ECMASCRIPT]
Ecma International, "ECMAScript 2020 Language Specification", Standard ECMA-262, 11th Edition, <https://www.ecma-international.org/publications/standards/Ecma-262.htm>.
[GordianEnvelope]
McNally, W. and C. Allen, "The Gordian Envelope Structured Data Format", <https://datatracker.ietf.org/doc/draft-mcnally-envelope/>.

Appendix A. Deterministic Encoding Samples

A.1. Integers

This normative section holds a selection of CBOR integer values, with an emphasis on edge cases.

Table 2: Integers

Value                  CBOR Encoding           Note
---------------------  ----------------------  --------------------------------
0                      00                      Smallest positive implicit int
-1                     20                      Smallest negative implicit int
23                     17                      Largest positive implicit int
-24                    37                      Largest negative implicit int
24                     1818                    Smallest positive one-byte int
-25                    3818                    Smallest negative one-byte int
255                    18ff                    Largest positive one-byte int
-256                   38ff                    Largest negative one-byte int
256                    190100                  Smallest positive two-byte int
-257                   390100                  Smallest negative two-byte int
65535                  19ffff                  Largest positive two-byte int
-65536                 39ffff                  Largest negative two-byte int
65536                  1a00010000              Smallest positive four-byte int
-65537                 3a00010000              Smallest negative four-byte int
4294967295             1affffffff              Largest positive four-byte int
-4294967296            3affffffff              Largest negative four-byte int
4294967296             1b0000000100000000      Smallest positive eight-byte int
-4294967297            3b0000000100000000      Smallest negative eight-byte int
18446744073709551615   1bffffffffffffffff      Largest positive eight-byte int
-18446744073709551616  3bffffffffffffffff      Largest negative eight-byte int
18446744073709551616   c249010000000000000000  Smallest positive bigint
-18446744073709551617  c349010000000000000000  Smallest negative bigint
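
The sample values above can be reproduced with a short sketch of the integer rule in Section 2.2. This is an illustration under stated assumptions rather than a reference implementation: the function names head, byteHex, and encodeInt are hypothetical, input is a JavaScript BigInt, and output is a lowercase hex string.

```javascript
// Format one byte as two lowercase hex digits.
function byteHex(b) {
  return b.toString(16).padStart(2, '0');
}

// Encode a CBOR head (major type plus unsigned argument) in shortest form.
// The caller must ensure 0 <= n < 2^64.
function head(major, n) {
  const first = major << 5;
  if (n < 24n) return byteHex(first | Number(n));
  let size = 1;   // number of argument bytes: 1, 2, 4, or 8
  let ai = 24;    // matching "additional information" value: 24..27
  while (n >= (1n << BigInt(8 * size))) {
    size *= 2;
    ai++;
  }
  return byteHex(first | ai) + n.toString(16).padStart(2 * size, '0');
}

// Deterministic encoding of an integer: int if it fits, bigint otherwise.
function encodeInt(value) {
  const negative = value < 0n;
  const n = negative ? -1n - value : value;   // CBOR stores negatives as -1 - n
  if (n < (1n << 64n)) return head(negative ? 1 : 0, n);
  // bigint: tag 2 or 3 wrapping a byte string without leading zero bytes.
  let digits = n.toString(16);
  if (digits.length % 2) digits = '0' + digits;
  return (negative ? 'c3' : 'c2') + head(2, BigInt(digits.length / 2)) + digits;
}

console.log(encodeInt(-25n));                    // 3818
console.log(encodeInt(18446744073709551616n));   // c249010000000000000000
```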

A.2. Floating Point Numbers

This normative section holds a selection of [IEEE754] 16-, 32-, and 64-bit values, with an emphasis on edge cases.

The textual representation of the values is based on the serialization method for the Number data type, defined by [ECMASCRIPT], with one change: to comply with diagnostic notation (section 8 of [RFC8949]), all values are expressed as floating point numbers. The rationale for using [ECMASCRIPT] serialization is that it is supposed to generate the shortest and most correct representation of [IEEE754] numbers.

Table 3: Floating Point Numbers

Value                      CBOR Encoding       Note
-------------------------  ------------------  -------------------------------------------------
0.0                        f90000              Zero
-0.0                       f98000              Negative zero
Infinity                   f97c00              Infinity
-Infinity                  f9fc00              -Infinity
NaN                        f97e00              NaN
5.960464477539063e-8       f90001              Smallest positive subnormal 16-bit float
0.00006097555160522461     f903ff              Largest positive subnormal 16-bit float
0.00006103515625           f90400              Smallest positive 16-bit float
65504.0                    f97bff              Largest positive 16-bit float
1.401298464324817e-45      fa00000001          Smallest positive subnormal 32-bit float
1.1754942106924411e-38     fa007fffff          Largest positive subnormal 32-bit float
1.1754943508222875e-38     fa00800000          Smallest positive 32-bit float
3.4028234663852886e+38     fa7f7fffff          Largest positive 32-bit float
5.0e-324                   fb0000000000000001  Smallest positive subnormal 64-bit float
2.225073858507201e-308     fb000fffffffffffff  Largest positive subnormal 64-bit float
2.2250738585072014e-308    fb0010000000000000  Smallest positive 64-bit float
1.7976931348623157e+308    fb7fefffffffffffff  Largest positive 64-bit float
-0.0000033333333333333333  fbbecbf647612f3696  Randomly selected number
295147905179352830000.0    fa61800000          ~2^68
2.0                        f94000              Number without a fractional part
-5.960464477539063e-8      f98001              Smallest negative subnormal 16-bit float
-5.960464477539062e-8      fbbe6fffffffffffff  Close to smallest negative subnormal 16-bit float
-5.960464477539064e-8      fbbe70000000000001  ""
-5.960465188081798e-8      fab3800001          ""
0.0000609755516052246      fb3f0ff7ffffffffff  Close to largest subnormal 16-bit float
0.000060975551605224616    fb3f0ff80000000001  ""
0.000060975555243203416    fa387fc001          ""
0.00006103515624999999     fb3f0fffffffffffff  Close to smallest 16-bit float
0.00006103515625000001     fb3f10000000000001  ""
0.00006103516352595761     fa38800001          ""
65503.99999999999          fb40effbffffffffff  Close to largest 16-bit float
65504.00000000001          fb40effc0000000001  ""
65504.00390625             fa477fe001          ""
1.4012984643248169e-45     fb369fffffffffffff  Close to smallest subnormal 32-bit float
1.4012984643248174e-45     fb36a0000000000001  ""
1.175494210692441e-38      fb380fffffbfffffff  Close to largest subnormal 32-bit float
1.1754942106924412e-38     fb380fffffc0000001  ""
1.1754943508222874e-38     fb380fffffffffffff  Close to smallest 32-bit float
1.1754943508222878e-38     fb3810000000000001  ""
3.4028234663852882e+38     fb47efffffdfffffff  Close to largest 32-bit float
3.402823466385289e+38      fb47efffffe0000001  ""
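
The "shortest variant that preserves precision" rule behind these samples can be sketched as follows. This is one possible approach, not the method of any specific tool: the names encodeFloat and toHalf are hypothetical, and the 16-bit check is done by hand on the 32-bit pattern so that the sketch does not depend on newer platform features.

```javascript
// Convert a float32 bit pattern to float16 bits, or return null when the
// value cannot be represented exactly in 16 bits. NaN is handled by the caller.
function toHalf(bits) {
  const sign = (bits >>> 16) & 0x8000;
  const exp = (bits >>> 23) & 0xff;
  const frac = bits & 0x7fffff;
  if (exp === 0xff) return sign | 0x7c00;            // +/- Infinity
  if (exp === 0) return frac === 0 ? sign : null;    // zero, or float32 subnormal
  const e = exp - 127;                               // unbiased exponent
  if (e >= -14 && e <= 15) {                         // float16 normal range
    return (frac & 0x1fff) === 0 ? sign | ((e + 15) << 10) | (frac >>> 13) : null;
  }
  if (e >= -24 && e <= -15) {                        // float16 subnormal range
    const mantissa = 0x800000 | frac;                // restore the implicit leading 1
    const shift = -e - 1;                            // 14..23 bits to drop
    return (mantissa & ((1 << shift) - 1)) === 0 ? sign | (mantissa >>> shift) : null;
  }
  return null;
}

// Shortest deterministic encoding of a JavaScript number as a CBOR float,
// returned as a lowercase hex string.
function encodeFloat(value) {
  if (Number.isNaN(value)) return 'f97e00';          // the only permitted NaN
  const view = new DataView(new ArrayBuffer(8));
  if (Math.fround(value) !== value) {                // 32 bits would lose precision
    view.setFloat64(0, value);
    return 'fb' + view.getBigUint64(0).toString(16).padStart(16, '0');
  }
  view.setFloat32(0, value);
  const bits = view.getUint32(0);
  const half = toHalf(bits);
  return half === null
    ? 'fa' + bits.toString(16).padStart(8, '0')
    : 'f9' + half.toString(16).padStart(4, '0');
}

console.log(encodeFloat(65504.0));        // f97bff
console.log(encodeFloat(65504.00390625)); // fa477fe001
```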

A.3. Invalid Encodings

The following table holds a selection of CBOR objects that are valid according to [RFC8949] but not permitted by U-CBOR.

Table 4: Invalid Encodings

CBOR Encoding             Diagnostic Notation    Note
------------------------  ---------------------  ------------------------------------
a2616200616101            {"b":0,"a":1}          Improper map key ordering [1]
1900ff                    255                    Number with leading zero bytes [1]
c34a00010000000000000000  -18446744073709551617  Number with leading zero bytes [1]
fa41280000                10.5                   Not in shortest encoding [1]
fa7fc00000                NaN                    Not in shortest encoding [1]
c243010000                65536                  Incorrect value for bigint [1]
f97e01                    NaN                    NaN with payload [1]
f7                        undefined              Unsupported simple type
f0                        simple(16)             Unsupported simple type
5f4101420203ff            (_ h'01', h'0203')     Unsupported indefinite length object

[1] See also Appendix C.
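
Two of the rejections above, "leading zero bytes" in an integer or length argument and indefinite length objects, can be detected by looking at the head of a data item alone. The following is a partial sketch only: the function name checkHead is hypothetical, input is assumed to be a complete, even-length hex string, and floats, simple values, bigint contents, and map key ordering are deliberately not checked.

```javascript
// Inspect the head of the first CBOR data item in a hex string and throw if
// its argument violates the deterministic encoding rules.
function checkHead(hexString) {
  const bytes = Uint8Array.from(hexString.match(/../g), h => parseInt(h, 16));
  const major = bytes[0] >> 5;
  const ai = bytes[0] & 0x1f;            // "additional information"
  if (major === 7) return;               // simple values and floats: not covered here
  if (ai === 31) throw new Error('indefinite length object');
  if (ai > 27) throw new Error('reserved additional information');
  if (ai < 24) return;                   // argument held in the initial byte
  const size = 1 << (ai - 24);           // 1, 2, 4, or 8 argument bytes
  let arg = 0n;
  for (let i = 1; i <= size; i++) arg = (arg << 8n) | BigInt(bytes[i]);
  // Smallest argument that actually needs this many bytes.
  const min = size === 1 ? 24n : 1n << BigInt(4 * size);
  if (arg < min) throw new Error('argument not in shortest form');
}

checkHead('18ff');                       // accepted: 255 in one argument byte
try {
  checkHead('1900ff');                   // 255 with a leading zero byte
} catch (e) {
  console.log(e.message);                // argument not in shortest form
}
```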

Appendix B. Enveloped Signatures

This is a non-normative appendix showing how U-CBOR can be used for supporting enveloped signatures.

The primary advantages with enveloped signatures compared to the approach used by COSE [RFC9052] include:

Enveloped signatures are, for example, featured in Verifiable Credentials [CREDENTIALS]. A drawback with designs based on JSON [RFC8259] is that they rely on canonicalization schemes like JCS [RFC8785], which require specialized encoders and decoders, whereas U-CBOR works "straight out of the box".

Although this specification is not "married" to any particular signature scheme, the example uses the CBOR Signature Format [CSF]. For the sake of simplicity, the example uses an HMAC (see Appendix B.4) as the signature algorithm.

B.1. Unsigned Data

Imagine you have a CBOR map object like the following that you want to sign:

{
  1: "data",
  2: "more data"
}

Then continue to the next section (Appendix B.2)...

B.2. Signature Process

This section describes the steps required for adding an enveloped signature to the CBOR map object in Appendix B.1.

  1. Add an empty CSF container (a CBOR map) to the unsigned CBOR map using an application-defined label (-1).
  2. Add the designated signature algorithm to the CSF container using the CSF algorithm label (1).
  3. Optional. Add other signature meta data to the CSF container. Not used in the example.
  4. Generate a signature by invoking a (hypothetical) signature method with the following arguments:

    • the designated signature key.
    • the designated signature algorithm.
    • the deterministic encoding of the current CBOR object in its entirety. In the example that would be a301646461746102696d6f7265206461746120a10105, if expressed in hex code.
  5. Add the returned signature value to the CSF container using the CSF signature label (6).

The result after the final step (using the parameters from Appendix B.4), should match the following CBOR object:

{
  1: "data",
  2: "more data",
  -1: {
    1: 5,
    6: h'4853d7730cc1340682b1748dc346cf627a5e91ce62c67fff15c40257ed2a37a1'
  }
}

Note that the signature covers the entire CBOR object except for the CSF signature value and label (6).

B.3. Validation Process

In order to validate the enveloped signature created in Appendix B.2, the following steps are performed:

  1. Fetch a reference to the CSF container using the application-defined label (-1). Next perform the following operations using the reference:

    1. Retrieve the signature algorithm using the CSF algorithm label (1).
    2. Retrieve the signature value using the CSF signature label (6).
    3. Remove the CSF signature label (6) and its associated value.

    Now we should have exactly the same CBOR object as we had before step #4 in Appendix B.2. That is:

    {
      1: "data",
      2: "more data",
      -1: {
        1: 5
      }
    }
  2. Validate the signature data by invoking a (hypothetical) signature validation method with the following arguments:

    • the designated signature key (in the example taken from Appendix B.4).
    • the signature algorithm retrieved in step #1.
    • the signature value retrieved in step #1.
    • the deterministic encoding of the current CBOR object in its entirety.

Note: this is a "bare-bones" validation process, lacking the ruggedness of a real-world implementation.

B.4. Example Parameters

The signature and validation processes depend on the COSE [RFC9053] algorithm "HMAC 256/256" and an associated 256-bit key, here provided in hex code:

7fdd851a3b9d2dafc5f0d00030e22b9343900cd42ede4948568a4a2ee655291a
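
Step 4 of Appendix B.2 can be sketched with these parameters as shown below. The sketch assumes a Node.js runtime and its built-in crypto module, neither of which is prescribed by this document, and the helper name hmacSha256 is hypothetical. The to-be-signed bytes are the hex string quoted in Appendix B.2. The printed value can be compared with the signature shown under label 6 in that appendix.

```javascript
const { createHmac } = require('node:crypto');

// Key from this appendix and the to-be-signed encoding from Appendix B.2, step 4.
const key = Buffer.from(
  '7fdd851a3b9d2dafc5f0d00030e22b9343900cd42ede4948568a4a2ee655291a', 'hex');
const toBeSigned = Buffer.from(
  'a301646461746102696d6f7265206461746120a10105', 'hex');

// COSE "HMAC 256/256" (algorithm 5) is HMAC with SHA-256, without truncation.
function hmacSha256(keyBytes, data) {
  return createHmac('sha256', keyBytes).update(data).digest('hex');
}

const signature = hmacSha256(key, toBeSigned);
console.log(signature);  // 64 hex digits; compare with label 6 in Appendix B.2
```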

B.5. Code Example

Using the JavaScript implementation mentioned in Appendix E, basic signature validation of the signed CBOR object created in Appendix B.2 could be performed by the following code:

// The variable cborBinary is supposed to contain CBOR
let object = CBOR.decode(cborBinary);             // Decode
let csf = object.get(CBOR.Int(-1));               // Get CSF container
let alg = csf.get(CBOR.Int(1)).getInt();          // Read algorithm
let sig = csf.remove(CBOR.Int(6)).getBytes();     // Read and remove signature value
let key = CBOR.fromHex('7fdd851a3b9d2dafc5f0d00030e22b9343900cd42ede4948568a4a2ee655291a');

// Hypothetical HMAC validation method:
hmacValidate(alg, sig, key, object.encode());     // Note that object.encode()
                                                  // reserializes all but sig.
// Validated object, access the "payload":
let param = object.get(CBOR.Int(1)).getString();  // param should now contain "data"

Note that this code depends heavily on the CBOR tool features outlined in Section 2.3.

Appendix C. Supporting Existing Systems

It is assumed that most systems using CBOR are able to process an (application-specific) selection of CBOR data items that are encoded in compliance with [RFC8949]. Since the deterministic encoding scheme mandated by U-CBOR is also compliant with [RFC8949], there should be no major interoperability issues. That is, if the previous assumption actually is correct 😏

However, in the other direction (U-CBOR tools processing data from systems using "legacy" CBOR encoding schemes), the situation is likely to be considerably more challenging since deterministic encoding is strict "by design". Due to this potential obstacle, implementers of U-CBOR tools are RECOMMENDED to offer decoder options that permit "relaxing" the rigidness of deterministic encoding with respect to:

Note that regardless of the format of received CBOR data, a compliant U-CBOR implementation MUST maintain deterministic encoding. See also Appendix A.3.

Appendix D. Compatible Online Tools

For testing and learning about U-CBOR, there are currently a number of compatible online tools (subject to availability...).

Browser-based CBOR "playground":
https://cyberphone.github.io/CBOR.js/doc/playground.html
Server-based CBOR and [CSF] test system:
https://test.webpki.org/csf-lab

Appendix E. Compatible Implementations

For using U-CBOR in applications, there are currently a number of compatible libraries.

JavaScript-based implementation:
https://github.com/cyberphone/CBOR.js
Java-based implementation that also supports [CSF] and [CEF]:
https://github.com/cyberphone/openkeystore
Android Java-based implementation that also supports [CSF] and [CEF]:
https://github.com/cyberphone/android-cbor

Document History

Acknowledgements

This work was inspired by a CBOR-based project known as [GordianEnvelope], pioneered by Wolf McNally and Christopher Allen. This project also exploits the ability to hash "raw" (non-wrapped) CBOR data, enabled by the use of a deterministic encoding scheme.

Author's Address

Anders Rundgren (editor)
Independent
Montpellier
France