// Copyright 2020 The Wuffs Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//    https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// ----------------

/*
jsonptr is a JSON formatter (pretty-printer) that supports the JSON Pointer
(RFC 6901) query syntax. It reads UTF-8 JSON from stdin and writes
canonicalized, formatted UTF-8 JSON to stdout.

See the "const char* g_usage" string below for details.

----

JSON Pointer (and this program's implementation) is one of many JSON query
languages and JSON tools, such as jq, jql and JMESPath. This one is relatively
simple and fewer-featured compared to those others.

One benefit of simplicity is that this program's JSON and JSON Pointer
implementations do not dynamically allocate or free memory (yet it does not
require that the entire input fits in memory at once). They are therefore
trivially protected against certain bug classes: memory leaks, double-frees and
use-after-frees.

The core JSON implementation is also written in the Wuffs programming language
(and then transpiled to C/C++), which is memory-safe (e.g. array indexing is
bounds-checked) but also guards against integer arithmetic overflows.

For defense in depth, on Linux, this program also self-imposes a
SECCOMP_MODE_STRICT sandbox before reading (or otherwise processing) its input
or writing its output. Under this sandbox, the only permitted system calls are
read, write, exit and sigreturn.
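
As a minimal sketch of that sandboxing step (an illustration only, under the
assumption of a POSIX system; try_strict_sandbox_in_child is a hypothetical
name, not a function in this program), the following forks a child process that
asks Linux for SECCOMP_MODE_STRICT via prctl. The child then exits with the raw
exit system call, since the C library's exit functions typically use
exit_group, which strict mode does not permit:

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#if defined(__linux__)
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#endif

// try_strict_sandbox_in_child (a hypothetical helper) returns 0 if a forked
// child managed to sandbox itself, 1 if it could not (e.g. not on Linux, or
// the kernel refused), or -1 if fork or waitpid failed.
int try_strict_sandbox_in_child() {
  pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }
  if (pid == 0) {
#if defined(__linux__)
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) == 0) {
      // From here on, only read, write, exit and sigreturn are permitted.
      syscall(SYS_exit, 0);
    }
#endif
    _exit(1);
  }
  int status = 0;
  if ((waitpid(pid, &status, 0) != pid) || !WIFEXITED(status)) {
    return -1;
  }
  return WEXITSTATUS(status);
}
```

The fork is only there so that the caller survives to inspect the result. A
program that wants the protection sandboxes itself rather than a child.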

All together, this program aims to safely handle untrusted JSON files without
fear of security bugs such as remote code execution.

----

As of 2020-02-24, this program passes all 318 "test_parsing" cases from the
JSON test suite (https://github.com/nst/JSONTestSuite), an appendix to the
"Parsing JSON is a Minefield" article (http://seriot.ch/parsing_json.php) that
was first published on 2016-10-26 and updated on 2018-03-30.

After modifying this program, run "build-example.sh example/jsonptr/" and then
"script/run-json-test-suite.sh" to catch correctness regressions.

----

This program uses Wuffs' JSON decoder at a relatively low level, processing the
decoder's token-stream output individually. The core loop, in pseudo-code, is
"for_each_token { handle_token(etc); }", where the handle_token function
changes global state (e.g. the `g_depth` and `g_ctx` variables) and prints
output text based on that state and the token's source text. Notably,
handle_token is not recursive, even though JSON values can nest.
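
The shape of that loop can be sketched with toy types. Everything below
(toy_token, printer_state, format_tokens) is hypothetical and merely stands in
for Wuffs' real token type and this program's global variables. It also uses
std::string for brevity, which allocates memory, unlike the real program:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// toy_token is a hypothetical stand-in for the kind of a decoded JSON token.
enum class toy_token { open, close, scalar };

// printer_state plays the role of global state such as g_depth and g_ctx.
struct printer_state {
  uint32_t depth = 0;
  bool after_value = false;
  std::string out;
};

// handle_token updates the state and appends output text. It never calls
// itself: nesting is tracked by the depth counter, not by the call stack.
void handle_token(printer_state& st, toy_token kind, const std::string& text) {
  if (kind == toy_token::close) {
    assert(st.depth > 0);
    st.depth--;
    st.out += text;
    st.after_value = true;
    return;
  }
  if (st.after_value) {
    st.out += ",";
  }
  st.out += text;
  if (kind == toy_token::open) {
    st.depth++;
    st.after_value = false;
  } else {
    st.after_value = true;
  }
}

// format_tokens is the "for_each_token { handle_token(etc); }" loop.
std::string format_tokens(
    const std::vector<std::pair<toy_token, std::string>>& tokens) {
  printer_state st;
  for (const auto& t : tokens) {
    handle_token(st, t.first, t.second);
  }
  return st.out;
}
```

Because the nesting depth lives in a counter, the amount of call stack used
does not grow with how deeply the input nests.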

This approach is centered around JSON tokens. Each JSON 'thing' (e.g. number,
string, object) comprises one or more JSON tokens.

An alternative, higher-level approach is in the sibling example/jsonfindptrs
program. Neither approach is better or worse per se, but when studying this
program, be aware that there are multiple ways to use Wuffs' JSON decoder.

The two programs, jsonfindptrs and jsonptr, also demonstrate different
trade-offs with regard to JSON object duplicate keys. The JSON spec permits
different implementations to allow or reject duplicate keys. It is not always
clear which approach is safer. Rejecting them is certainly unambiguous, and
security bugs can lurk in ambiguous corners of a file format, if two different
implementations both silently accept a file but differ on how to interpret it.
On the other hand, in the worst case, detecting duplicate keys requires O(N)
memory, where N is the size of the (potentially untrusted) input.

This program (jsonptr) allows duplicate keys and requires only O(1) memory. As
mentioned above, it doesn't dynamically allocate memory at all, and on Linux,
it runs in a SECCOMP_MODE_STRICT sandbox.
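
To make that memory trade-off concrete, here is a rough sketch. Both function
names are hypothetical and neither program necessarily works this way. For
illustration, an object's keys are assumed to already sit in a std::vector,
whereas a streaming program would see them one at a time:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// has_duplicate_key supports rejecting duplicate keys, but it has to remember
// every distinct key seen so far: O(N) memory in the worst case.
bool has_duplicate_key(const std::vector<std::string>& keys) {
  std::set<std::string> seen;
  for (const std::string& k : keys) {
    if (!seen.insert(k).second) {
      return true;
    }
  }
  return false;
}

// first_match returns the index of the first key equal to want, or -1 if
// there is none. It tolerates duplicate keys and needs only O(1) extra
// memory, as it never looks back at earlier keys.
long first_match(const std::vector<std::string>& keys,
                 const std::string& want) {
  for (size_t i = 0; i < keys.size(); i++) {
    if (keys[i] == want) {
      return (long)i;
    }
  }
  return -1;
}
```

With the keys "foo", "bar", "foo", the first function reports a duplicate,
while the second simply settles on index 0 for "foo".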

----

This example program differs from most other example Wuffs programs in that it
is written in C++, not C.

    $CXX jsonptr.cc && ./a.out < ../../test/data/github-tags.json; rm -f a.out

for a C++ compiler $CXX, such as clang++ or g++.
*/

#if defined(__cplusplus) && (__cplusplus < 201103L)
#error "This C++ program requires -std=c++11 or later"
#endif

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

// Wuffs ships as a "single file C library" or "header file library" as per
// https://github.com/nothings/stb/blob/master/docs/stb_howto.txt
//
// To use that single file as a "foo.c"-like implementation, instead of a
// "foo.h"-like header, #define WUFFS_IMPLEMENTATION before #include'ing or
// compiling it.
#define WUFFS_IMPLEMENTATION

// Defining the WUFFS_CONFIG__MODULE* macros is optional, but it lets users of
// release/c/etc.c whitelist which parts of Wuffs to build. That file contains
// the entire Wuffs standard library, implementing a variety of codecs and file
// formats. Without this macro definition, an optimizing compiler or linker may
// very well discard Wuffs code for unused codecs, but listing the Wuffs
// modules we use makes that process explicit. Preprocessing means that such
// code simply isn't compiled.
#define WUFFS_CONFIG__MODULES
#define WUFFS_CONFIG__MODULE__BASE
#define WUFFS_CONFIG__MODULE__JSON

// If building this program in an environment that doesn't easily accommodate
// relative includes, you can use the script/inline-c-relative-includes.go
// program to generate a stand-alone C++ file.
#include "../../release/c/wuffs-unsupported-snapshot.c"

#if defined(__linux__)
#include <linux/prctl.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#define WUFFS_EXAMPLE_USE_SECCOMP
#endif

#define TRY(error_msg)         \
  do {                         \
    const char* z = error_msg; \
    if (z) {                   \
      return z;                \
    }                          \
  } while (false)

static const char* g_eod = "main: end of data";

static const char* g_usage =
    "Usage: jsonptr -flags input.json\n"
    "\n"
    "Flags:\n"
    "    -c      -compact-output\n"
    "    -d=NUM  -max-output-depth=NUM\n"
    "    -i=NUM  -indent=NUM\n"
    "    -q=STR  -query=STR\n"
    "    -s      -strict-json-pointer-syntax\n"
    "    -t      -tabs\n"
    "            -fail-if-unsandboxed\n"
    "\n"
    "The input.json filename is optional. If absent, it reads from stdin.\n"
    "\n"
    "----\n"
    "\n"
    "jsonptr is a JSON formatter (pretty-printer) that supports the JSON\n"
    "Pointer (RFC 6901) query syntax. It reads UTF-8 JSON from stdin and\n"
    "writes canonicalized, formatted UTF-8 JSON to stdout.\n"
    "\n"
    "Canonicalized means that e.g. \"abc\\u000A\\tx\\u0177z\" is re-written\n"
    "as \"abc\\n\\txŷz\". It does not sort object keys, nor does it reject\n"
    "duplicate keys. Canonicalization does not imply Unicode normalization.\n"
    "\n"
    "Formatted means that arrays' and objects' elements are indented, each\n"
    "on its own line. Configure this with the -c / -compact-output, -i=NUM /\n"
    "-indent=NUM (for NUM ranging from 0 to 8) and -t / -tabs flags.\n"
    "\n"
    "----\n"
    "\n"
    "The -q=STR or -query=STR flag gives an optional JSON Pointer query, to\n"
    "print a subset of the input. For example, given RFC 6901 section 5's\n"
    "sample input (https://tools.ietf.org/rfc/rfc6901.txt), this command:\n"
    "    jsonptr -query=/foo/1 rfc-6901-json-pointer.json\n"
    "will print:\n"
    "    \"baz\"\n"
    "\n"
    "An absent query is equivalent to the empty query, which identifies the\n"
    "entire input (the root value). Unlike a file system, the \"/\" query\n"
    "does not identify the root. Instead, \"\" is the root and \"/\" is the\n"
    "child (the value in a key-value pair) of the root whose key is the empty\n"
    "string. Similarly, \"/xyz\" and \"/xyz/\" are two different nodes.\n"
    "\n"
    "If the query found a valid JSON value, this program will return a zero\n"
    "exit code even if the rest of the input isn't valid JSON. If the query\n"
    "did not find a value, or found an invalid one, this program returns a\n"
    "non-zero exit code, but may still print partial output to stdout.\n"
    "\n"
    "The JSON specification (https://json.org/) permits implementations that\n"
    "allow duplicate keys, as this one does. This JSON Pointer implementation\n"
    "is also greedy, following the first match for each fragment without\n"
    "back-tracking. For example, the \"/foo/bar\" query will fail if the root\n"
    "object has multiple \"foo\" children but the first one doesn't have a\n"
    "\"bar\" child, even if later ones do.\n"
    "\n"
    "The -s or -strict-json-pointer-syntax flag restricts the -query=STR\n"
    "string to exactly RFC 6901, with only two escape sequences: \"~0\" and\n"
    "\"~1\" for \"~\" and \"/\". Without this flag, this program also lets\n"
    "\"~n\" and \"~r\" escape the New Line and Carriage Return ASCII control\n"
    "characters, which can work better with line oriented Unix tools that\n"
    "assume exactly one value (i.e. one JSON Pointer string) per line.\n"
    "\n"
    "----\n"
    "\n"
    "The -d=NUM or -max-output-depth=NUM flag gives the maximum (inclusive)\n"
    "output depth. JSON containers ([] arrays and {} objects) can hold other\n"
    "containers. When this flag is set, containers at depth NUM are replaced\n"
    "with \"[…]\" or \"{…}\". A bare -d or -max-output-depth is equivalent to\n"
    "-d=1. The flag's absence is equivalent to an unlimited output depth.\n"
    "\n"
    "The -max-output-depth flag only affects the program's output. It doesn't\n"
    "affect whether or not the input is considered valid JSON. The JSON\n"
    "specification permits implementations to set their own maximum input\n"
    "depth. This JSON implementation sets it to 1024.\n"
    "\n"
    "Depth is measured in terms of nested containers. It is unaffected by the\n"
    "number of spaces or tabs used to indent.\n"
    "\n"
    "When both -max-output-depth and -query are set, the output depth is\n"
    "measured from when the query resolves, not from the input root. The\n"
    "input depth (measured from the root) is still limited to 1024.\n"
    "\n"
    "----\n"
    "\n"
    "The -fail-if-unsandboxed flag causes the program to exit if it does not\n"
    "self-impose a sandbox. On Linux, it self-imposes a SECCOMP_MODE_STRICT\n"
    "sandbox, regardless of whether this flag was set.";

// ----

// Wuffs allows either statically or dynamically allocated work buffers. This
// program exercises static allocation.
#define WORK_BUFFER_ARRAY_SIZE \
  WUFFS_JSON__DECODER_WORKBUF_LEN_MAX_INCL_WORST_CASE
#if WORK_BUFFER_ARRAY_SIZE > 0
uint8_t g_work_buffer_array[WORK_BUFFER_ARRAY_SIZE];
#else
// Not all C/C++ compilers support 0-length arrays.
uint8_t g_work_buffer_array[1];
#endif

bool g_sandboxed = false;

int g_input_file_descriptor = 0;  // A 0 default means stdin.

#define MAX_INDENT 8
#define INDENT_SPACES_STRING "        "
#define INDENT_TAB_STRING "\t"

#ifndef DST_BUFFER_ARRAY_SIZE
#define DST_BUFFER_ARRAY_SIZE (32 * 1024)
#endif
#ifndef SRC_BUFFER_ARRAY_SIZE
#define SRC_BUFFER_ARRAY_SIZE (32 * 1024)
#endif
#ifndef TOKEN_BUFFER_ARRAY_SIZE
#define TOKEN_BUFFER_ARRAY_SIZE (4 * 1024)
#endif

uint8_t g_dst_array[DST_BUFFER_ARRAY_SIZE];
uint8_t g_src_array[SRC_BUFFER_ARRAY_SIZE];
wuffs_base__token g_tok_array[TOKEN_BUFFER_ARRAY_SIZE];

wuffs_base__io_buffer g_dst;
wuffs_base__io_buffer g_src;
wuffs_base__token_buffer g_tok;

// g_curr_token_end_src_index is the g_src.data.ptr index of the end of the
// current token. An invariant is that (g_curr_token_end_src_index <=
// g_src.meta.ri).
size_t g_curr_token_end_src_index;

uint32_t g_depth;

enum class context {
  none,
  in_list_after_bracket,
  in_list_after_value,
  in_dict_after_brace,
  in_dict_after_key,
  in_dict_after_value,
} g_ctx;

bool  //
in_dict_before_key() {
  return (g_ctx == context::in_dict_after_brace) ||
         (g_ctx == context::in_dict_after_value);
}

uint32_t g_suppress_write_dst;
bool g_wrote_to_dst;

wuffs_json__decoder g_dec;

// ----

// Query is a JSON Pointer query. After initializing with a NUL-terminated C
// string, its multiple fragments are consumed as the program walks the JSON
// data from stdin. For example, letting "$" denote a NUL, suppose that we
// started with a query string of "/apple/banana/12/durian" and are currently
// trying to match the second fragment, "banana", so that Query::m_depth is 2:
//
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//     / a p p l e / b a n a n a / 1 2 / d u r i a n $
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//                   ^           ^
//                   m_frag_i    m_frag_k
//
// The two pointers m_frag_i and m_frag_k (abbreviated as mfi and mfk) are the
// start (inclusive) and end (exclusive) of the query fragment. They satisfy
// (mfi <= mfk) and may be equal if the fragment is empty (note that "" is a
// valid JSON object key).
//
// The m_frag_j (mfj) pointer moves between these two, or is nullptr. An
// invariant is that (((mfi <= mfj) && (mfj <= mfk)) || (mfj == nullptr)).
//
// Wuffs' JSON tokenizer can portray a single JSON string as multiple Wuffs
// tokens, as backslash-escaped values within that JSON string may each get
// their own token.
//
// At the start of each object key (a JSON string), mfj is set to mfi.
//
// While mfj remains non-nullptr, each token's unescaped contents are then
// compared to that part of the fragment from mfj to mfk. If it is a prefix
// (including the case of an exact match), then mfj is advanced by the
// unescaped length. Otherwise, mfj is set to nullptr.
//
// Comparison accounts for JSON Pointer's escaping notation: "~0" and "~1" in
// the query (not the JSON value) are unescaped to "~" and "/" respectively.
// "~n" and "~r" are also unescaped to "\n" and "\r". The program is
// responsible for calling Query::validate (with a strict_json_pointer_syntax
// argument) before otherwise using this class.
//
// The mfj pointer therefore advances from mfi to mfk, or drops out, as we
// incrementally match the object key with the query fragment. For example, if
// we have already matched the "ban" of "banana", then we would accept any of
// an "ana" token, an "a" token or a "\u0061" token, amongst others. They would
// advance mfj by 3, 1 or 1 bytes respectively.
//
//                         mfj
//                         v
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//     / a p p l e / b a n a n a / 1 2 / d u r i a n $
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//                   ^           ^
//                   mfi         mfk
//
// At the end of each object key (or equivalently, at the start of each object
// value), if mfj is non-nullptr and equal to (but not less than) mfk then we
// have a fragment match: the query fragment equals the object key. If there is
// a next fragment (in this example, "12") we move the m_frag_etc pointers to
// its start and end and increment Query::m_depth. Otherwise, we have matched
// the complete query, and the upcoming JSON value is the result of that query.
//
// The discussion above centers on object keys. If the query fragment is
// numeric then it can also match as an array index: the string fragment "12"
// will match an array's 13th element (starting counting from zero). See RFC
// 6901 for its precise definition of an "array index" number.
//
// Array index fragment match is represented by the Query::m_array_index field,
// whose type (wuffs_base__result_u64) is a result type. An error result means
// that the fragment is not an array index. A value result holds the number of
// list elements remaining. When matching a query fragment in an array (instead
// of in an object), each element ticks this number down towards zero. At zero,
// the upcoming JSON value is the one that matches the query fragment.
class Query {
 private:
  uint8_t* m_frag_i;
  uint8_t* m_frag_j;
  uint8_t* m_frag_k;

  uint32_t m_depth;

  wuffs_base__result_u64 m_array_index;

 public:
  void reset(char* query_c_string) {
    m_frag_i = (uint8_t*)query_c_string;
    m_frag_j = (uint8_t*)query_c_string;
    m_frag_k = (uint8_t*)query_c_string;
    m_depth = 0;
    m_array_index.status.repr = "#main: not an array index query fragment";
    m_array_index.value = 0;
  }

  void restart_fragment(bool enable) { m_frag_j = enable ? m_frag_i : nullptr; }

  bool is_at(uint32_t depth) { return m_depth == depth; }

  // tick returns whether the fragment is a valid array index whose value is
  // zero. If valid but non-zero, it decrements it and returns false.
  bool tick() {
    if (m_array_index.status.is_ok()) {
      if (m_array_index.value == 0) {
        return true;
      }
      m_array_index.value--;
    }
    return false;
  }

  // next_fragment moves to the next fragment, returning whether it existed.
  bool next_fragment() {
    uint8_t* k = m_frag_k;
    uint32_t d = m_depth;

    this->reset(nullptr);

    if (!k || (*k != '/')) {
      return false;
    }
    k++;

    bool all_digits = true;
    uint8_t* i = k;
    while ((*k != '\x00') && (*k != '/')) {
      all_digits = all_digits && ('0' <= *k) && (*k <= '9');
      k++;
    }
    m_frag_i = i;
    m_frag_j = i;
    m_frag_k = k;
    m_depth = d + 1;
    if (all_digits) {
      // wuffs_base__parse_number_u64 rejects leading zeroes, e.g. "00", "07".
      m_array_index = wuffs_base__parse_number_u64(
          wuffs_base__make_slice_u8(i, k - i),
          WUFFS_BASE__PARSE_NUMBER_XXX__DEFAULT_OPTIONS);
    }
    return true;
  }

  bool matched_all() { return m_frag_k == nullptr; }

  bool matched_fragment() { return m_frag_j && (m_frag_j == m_frag_k); }
Nigel Tao0cd2f982020-03-03 23:03:02 +1100444
445 void incremental_match_slice(uint8_t* ptr, size_t len) {
Nigel Taob48ee752020-03-13 09:27:33 +1100446 if (!m_frag_j) {
Nigel Tao0cd2f982020-03-03 23:03:02 +1100447 return;
448 }
Nigel Taob48ee752020-03-13 09:27:33 +1100449 uint8_t* j = m_frag_j;
Nigel Tao0cd2f982020-03-03 23:03:02 +1100450 while (true) {
451 if (len == 0) {
Nigel Taob48ee752020-03-13 09:27:33 +1100452 m_frag_j = j;
Nigel Tao0cd2f982020-03-03 23:03:02 +1100453 return;
454 }
455
456 if (*j == '\x00') {
457 break;
458
459 } else if (*j == '~') {
460 j++;
461 if (*j == '0') {
462 if (*ptr != '~') {
463 break;
464 }
465 } else if (*j == '1') {
466 if (*ptr != '/') {
467 break;
468 }
Nigel Taod6fdfb12020-03-11 12:24:14 +1100469 } else if (*j == 'n') {
470 if (*ptr != '\n') {
471 break;
472 }
473 } else if (*j == 'r') {
474 if (*ptr != '\r') {
475 break;
476 }
Nigel Tao0cd2f982020-03-03 23:03:02 +1100477 } else {
478 break;
479 }
480
481 } else if (*j != *ptr) {
482 break;
483 }
484
485 j++;
486 ptr++;
487 len--;
488 }
Nigel Taob48ee752020-03-13 09:27:33 +1100489 m_frag_j = nullptr;
Nigel Tao0cd2f982020-03-03 23:03:02 +1100490 }
491
492 void incremental_match_code_point(uint32_t code_point) {
Nigel Taob48ee752020-03-13 09:27:33 +1100493 if (!m_frag_j) {
Nigel Tao0cd2f982020-03-03 23:03:02 +1100494 return;
495 }
496 uint8_t u[WUFFS_BASE__UTF_8__BYTE_LENGTH__MAX_INCL];
497 size_t n = wuffs_base__utf_8__encode(
498 wuffs_base__make_slice_u8(&u[0],
499 WUFFS_BASE__UTF_8__BYTE_LENGTH__MAX_INCL),
500 code_point);
501 if (n > 0) {
502 this->incremental_match_slice(&u[0], n);
503 }
504 }

  // validate returns whether the (query_c_string, length) arguments form a
  // valid JSON Pointer. In particular, it must be valid UTF-8, and either be
  // empty or start with a '/'. Any '~' within must immediately be followed by
  // either '0' or '1'. If strict_json_pointer_syntax is false, a '~' may also
  // be followed by either 'n' or 'r'.
  static bool validate(char* query_c_string,
                       size_t length,
                       bool strict_json_pointer_syntax) {
    if (length <= 0) {
      return true;
    }
    if (query_c_string[0] != '/') {
      return false;
    }
    wuffs_base__slice_u8 s =
        wuffs_base__make_slice_u8((uint8_t*)query_c_string, length);
    bool previous_was_tilde = false;
    while (s.len > 0) {
      wuffs_base__utf_8__next__output o = wuffs_base__utf_8__next(s);
      if (!o.is_valid()) {
        return false;
      }

      if (previous_was_tilde) {
        switch (o.code_point) {
          case '0':
          case '1':
            break;
          case 'n':
          case 'r':
            if (strict_json_pointer_syntax) {
              return false;
            }
            break;
          default:
            return false;
        }
      }
      previous_was_tilde = o.code_point == '~';

      s.ptr += o.byte_length;
      s.len -= o.byte_length;
    }
    return !previous_was_tilde;
  }
} g_query;

// ----

struct {
  int remaining_argc;
  char** remaining_argv;

  bool compact_output;
  bool fail_if_unsandboxed;
  size_t indent;
  uint32_t max_output_depth;
  char* query_c_string;
  bool strict_json_pointer_syntax;
  bool tabs;
} g_flags = {0};

const char*  //
parse_flags(int argc, char** argv) {
  g_flags.indent = 4;
  g_flags.max_output_depth = 0xFFFFFFFF;

  int c = (argc > 0) ? 1 : 0;  // Skip argv[0], the program name.
  for (; c < argc; c++) {
    char* arg = argv[c];
    if (*arg++ != '-') {
      break;
    }

    // A double-dash "--foo" is equivalent to a single-dash "-foo". As special
    // cases, a bare "-" is not a flag (some programs may interpret it as
    // stdin) and a bare "--" means to stop parsing flags.
    if (*arg == '\x00') {
      break;
    } else if (*arg == '-') {
      arg++;
      if (*arg == '\x00') {
        c++;
        break;
      }
    }

    if (!strcmp(arg, "c") || !strcmp(arg, "compact-output")) {
      g_flags.compact_output = true;
      continue;
    }
    if (!strcmp(arg, "d") || !strcmp(arg, "max-output-depth")) {
      g_flags.max_output_depth = 1;
      continue;
    } else if (!strncmp(arg, "d=", 2) ||
               !strncmp(arg, "max-output-depth=", 17)) {
      // 17 is the length of "max-output-depth=", including the '=', so that
      // the loop below is guaranteed to find an '=' before the NUL terminator.
      while (*arg++ != '=') {
      }
      wuffs_base__result_u64 u = wuffs_base__parse_number_u64(
          wuffs_base__make_slice_u8((uint8_t*)arg, strlen(arg)),
          WUFFS_BASE__PARSE_NUMBER_XXX__DEFAULT_OPTIONS);
      if (wuffs_base__status__is_ok(&u.status) && (u.value <= 0xFFFFFFFF)) {
        g_flags.max_output_depth = (uint32_t)(u.value);
        continue;
      }
      return g_usage;
    }
    if (!strcmp(arg, "fail-if-unsandboxed")) {
      g_flags.fail_if_unsandboxed = true;
      continue;
    }
    if (!strncmp(arg, "i=", 2) || !strncmp(arg, "indent=", 7)) {
      while (*arg++ != '=') {
      }
      if (('0' <= arg[0]) && (arg[0] <= '8') && (arg[1] == '\x00')) {
        g_flags.indent = arg[0] - '0';
        continue;
      }
      return g_usage;
    }
    if (!strncmp(arg, "q=", 2) || !strncmp(arg, "query=", 6)) {
      while (*arg++ != '=') {
      }
      g_flags.query_c_string = arg;
      continue;
    }
    if (!strcmp(arg, "s") || !strcmp(arg, "strict-json-pointer-syntax")) {
      g_flags.strict_json_pointer_syntax = true;
      continue;
    }
    if (!strcmp(arg, "t") || !strcmp(arg, "tabs")) {
      g_flags.tabs = true;
      continue;
    }

    return g_usage;
  }

  if (g_flags.query_c_string &&
      !Query::validate(g_flags.query_c_string, strlen(g_flags.query_c_string),
                       g_flags.strict_json_pointer_syntax)) {
    return "main: bad JSON Pointer (RFC 6901) syntax for the -query=STR flag";
  }

  g_flags.remaining_argc = argc - c;
  g_flags.remaining_argv = argv + c;
  return nullptr;
}
654
Nigel Tao2cf76db2020-02-27 22:42:01 +1100655const char* //
initialize_globals(int argc, char** argv) {
  g_dst = wuffs_base__make_io_buffer(
      wuffs_base__make_slice_u8(g_dst_array, DST_BUFFER_ARRAY_SIZE),
      wuffs_base__empty_io_buffer_meta());

  g_src = wuffs_base__make_io_buffer(
      wuffs_base__make_slice_u8(g_src_array, SRC_BUFFER_ARRAY_SIZE),
      wuffs_base__empty_io_buffer_meta());

  g_tok = wuffs_base__make_token_buffer(
      wuffs_base__make_slice_token(g_tok_array, TOKEN_BUFFER_ARRAY_SIZE),
      wuffs_base__empty_token_buffer_meta());

  g_curr_token_end_src_index = 0;

  g_depth = 0;

  g_ctx = context::none;

  TRY(parse_flags(argc, argv));
  if (g_flags.fail_if_unsandboxed && !g_sandboxed) {
    return "main: unsandboxed";
  }
  const int stdin_fd = 0;
  if (g_flags.remaining_argc >
      ((g_input_file_descriptor != stdin_fd) ? 1 : 0)) {
    return g_usage;
  }

  g_query.reset(g_flags.query_c_string);

  // If the query is non-empty, suppress writing to stdout until we've
  // completed the query.
  g_suppress_write_dst = g_query.next_fragment() ? 1 : 0;
  g_wrote_to_dst = false;

  TRY(g_dec.initialize(sizeof__wuffs_json__decoder(), WUFFS_VERSION, 0)
          .message());

  // Consume an optional whitespace trailer. This isn't part of the JSON spec,
  // but it works better with line-oriented Unix tools (such as "echo 123 |
  // jsonptr" where it's "echo", not "echo -n") or hand-edited JSON files which
  // can accidentally contain trailing whitespace.
  g_dec.set_quirk_enabled(WUFFS_JSON__QUIRK_ALLOW_TRAILING_NEW_LINE, true);

  return nullptr;
}

// ----

// ignore_return_value suppresses errors from -Wall -Werror.
static void  //
ignore_return_value(int ignored) {}

const char*  //
read_src() {
  if (g_src.meta.closed) {
    return "main: internal error: read requested on a closed source";
  }
  g_src.compact();
  if (g_src.meta.wi >= g_src.data.len) {
    return "main: g_src buffer is full";
  }
  while (true) {
    ssize_t n = read(g_input_file_descriptor, g_src.data.ptr + g_src.meta.wi,
                     g_src.data.len - g_src.meta.wi);
    if (n >= 0) {
      g_src.meta.wi += n;
      g_src.meta.closed = n == 0;
      break;
    } else if (errno != EINTR) {
      return strerror(errno);
    }
  }
  return nullptr;
}

const char*  //
flush_dst() {
  while (true) {
    size_t n = g_dst.meta.wi - g_dst.meta.ri;
    if (n == 0) {
      break;
    }
    const int stdout_fd = 1;
    ssize_t i = write(stdout_fd, g_dst.data.ptr + g_dst.meta.ri, n);
    if (i >= 0) {
      g_dst.meta.ri += i;
    } else if (errno != EINTR) {
      return strerror(errno);
    }
  }
  g_dst.compact();
  return nullptr;
}

const char*  //
write_dst(const void* s, size_t n) {
  if (g_suppress_write_dst > 0) {
    return nullptr;
  }
  const uint8_t* p = static_cast<const uint8_t*>(s);
  while (n > 0) {
    size_t i = g_dst.writer_available();
    if (i == 0) {
      const char* z = flush_dst();
      if (z) {
        return z;
      }
      i = g_dst.writer_available();
      if (i == 0) {
        return "main: g_dst buffer is full";
      }
    }

    if (i > n) {
      i = n;
    }
    memcpy(g_dst.data.ptr + g_dst.meta.wi, p, i);
    g_dst.meta.wi += i;
    p += i;
    n -= i;
    g_wrote_to_dst = true;
  }
  return nullptr;
}

// ----

uint8_t  //
hex_digit(uint8_t nibble) {
  nibble &= 0x0F;
  if (nibble <= 9) {
    return '0' + nibble;
  }
  return ('A' - 10) + nibble;
}

const char*  //
handle_unicode_code_point(uint32_t ucp) {
  if (ucp < 0x0020) {
    switch (ucp) {
      case '\b':
        return write_dst("\\b", 2);
      case '\f':
        return write_dst("\\f", 2);
      case '\n':
        return write_dst("\\n", 2);
      case '\r':
        return write_dst("\\r", 2);
      case '\t':
        return write_dst("\\t", 2);
      default: {
        // Other bytes less than 0x0020 are valid UTF-8 but not valid in a
        // JSON string. They need to remain escaped.
        uint8_t esc6[6];
        esc6[0] = '\\';
        esc6[1] = 'u';
        esc6[2] = '0';
        esc6[3] = '0';
        esc6[4] = hex_digit(ucp >> 4);
        esc6[5] = hex_digit(ucp >> 0);
        return write_dst(&esc6[0], 6);
      }
    }

  } else if (ucp == '\"') {
    return write_dst("\\\"", 2);

  } else if (ucp == '\\') {
    return write_dst("\\\\", 2);

  } else {
    uint8_t u[WUFFS_BASE__UTF_8__BYTE_LENGTH__MAX_INCL];
    size_t n = wuffs_base__utf_8__encode(
        wuffs_base__make_slice_u8(&u[0],
                                  WUFFS_BASE__UTF_8__BYTE_LENGTH__MAX_INCL),
        ucp);
    if (n > 0) {
      return write_dst(&u[0], n);
    }
  }

  return "main: internal error: unexpected Unicode code point";
}
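
/*
The escaping rules in handle_unicode_code_point can be illustrated in
isolation. The following is a minimal sketch, not part of jsonptr: the names
sketch_hex_digit and sketch_escape_json are hypothetical, and it assumes the
input is already valid UTF-8, so bytes at or above 0x20 (other than the double
quote and the backslash) are copied through unchanged instead of being decoded
and re-encoded.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: maps the low 4 bits of nibble to an upper-case
// hexadecimal digit.
char sketch_hex_digit(uint8_t nibble) {
  nibble &= 0x0F;
  return static_cast<char>((nibble <= 9) ? ('0' + nibble)
                                         : (('A' - 10) + nibble));
}

// Hypothetical helper: returns s with the bytes that may not appear raw
// inside a JSON string replaced by their escaped forms.
std::string sketch_escape_json(const std::string& s) {
  std::string out;
  for (unsigned char c : s) {
    switch (c) {
      case '\b': out += "\\b"; break;
      case '\f': out += "\\f"; break;
      case '\n': out += "\\n"; break;
      case '\r': out += "\\r"; break;
      case '\t': out += "\\t"; break;
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      default:
        if (c < 0x20) {
          // Other control bytes use the six-character \u00XX form.
          out += "\\u00";
          out += sketch_hex_digit(c >> 4);
          out += sketch_hex_digit(c);
        } else {
          out += static_cast<char>(c);
        }
        break;
    }
  }
  return out;
}
```

With this sketch, a line feed becomes the two characters \n and the byte 0x1F
becomes the six characters \u001F, matching the cases handled above.
*/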

const char*  //
handle_token(wuffs_base__token t, bool start_of_token_chain) {
  do {
    int64_t vbc = t.value_base_category();
    uint64_t vbd = t.value_base_detail();
    uint64_t len = t.length();

    // Handle ']' or '}'.
    if ((vbc == WUFFS_BASE__TOKEN__VBC__STRUCTURE) &&
        (vbd & WUFFS_BASE__TOKEN__VBD__STRUCTURE__POP)) {
      if (g_query.is_at(g_depth)) {
        return "main: no match for query";
      }
      if (g_depth <= 0) {
        return "main: internal error: inconsistent g_depth";
      }
      g_depth--;

      if (g_query.matched_all() && (g_depth >= g_flags.max_output_depth)) {
        g_suppress_write_dst--;
        // '…' is U+2026 HORIZONTAL ELLIPSIS, which is 3 UTF-8 bytes.
        TRY(write_dst((vbd & WUFFS_BASE__TOKEN__VBD__STRUCTURE__FROM_LIST)
                          ? "\"[…]\""
                          : "\"{…}\"",
                      7));
      } else {
        // Write preceding whitespace.
        if ((g_ctx != context::in_list_after_bracket) &&
            (g_ctx != context::in_dict_after_brace) &&
            !g_flags.compact_output) {
          TRY(write_dst("\n", 1));
          for (uint32_t i = 0; i < g_depth; i++) {
            TRY(write_dst(
                g_flags.tabs ? INDENT_TAB_STRING : INDENT_SPACES_STRING,
                g_flags.tabs ? 1 : g_flags.indent));
          }
        }

        TRY(write_dst(
            (vbd & WUFFS_BASE__TOKEN__VBD__STRUCTURE__FROM_LIST) ? "]" : "}",
            1));
      }

      g_ctx = (vbd & WUFFS_BASE__TOKEN__VBD__STRUCTURE__TO_LIST)
                  ? context::in_list_after_value
                  : context::in_dict_after_key;
      goto after_value;
    }

    // Write preceding whitespace and punctuation, if it wasn't ']', '}' or a
    // continuation of a multi-token chain.
    if (start_of_token_chain) {
      if (g_ctx == context::in_dict_after_key) {
        TRY(write_dst(": ", g_flags.compact_output ? 1 : 2));
      } else if (g_ctx != context::none) {
        if ((g_ctx != context::in_list_after_bracket) &&
            (g_ctx != context::in_dict_after_brace)) {
          TRY(write_dst(",", 1));
        }
        if (!g_flags.compact_output) {
          TRY(write_dst("\n", 1));
          for (size_t i = 0; i < g_depth; i++) {
            TRY(write_dst(
                g_flags.tabs ? INDENT_TAB_STRING : INDENT_SPACES_STRING,
                g_flags.tabs ? 1 : g_flags.indent));
          }
        }
      }

      bool query_matched_fragment = false;
      if (g_query.is_at(g_depth)) {
        switch (g_ctx) {
          case context::in_list_after_bracket:
          case context::in_list_after_value:
            query_matched_fragment = g_query.tick();
            break;
          case context::in_dict_after_key:
            query_matched_fragment = g_query.matched_fragment();
            break;
          default:
            break;
        }
      }
      if (!query_matched_fragment) {
        // No-op.
      } else if (!g_query.next_fragment()) {
        // There is no next fragment. We have matched the complete query, and
        // the upcoming JSON value is the result of that query.
        //
        // Un-suppress writing to stdout and reset the g_ctx and g_depth as if
        // we were about to decode a top-level value. This makes any subsequent
        // indentation be relative to this point, and we will return g_eod
        // after the upcoming JSON value is complete.
        if (g_suppress_write_dst != 1) {
          return "main: internal error: inconsistent g_suppress_write_dst";
        }
        g_suppress_write_dst = 0;
        g_ctx = context::none;
        g_depth = 0;
      } else if ((vbc != WUFFS_BASE__TOKEN__VBC__STRUCTURE) ||
                 !(vbd & WUFFS_BASE__TOKEN__VBD__STRUCTURE__PUSH)) {
        // The query has moved on to the next fragment but the upcoming JSON
        // value is not a container.
        return "main: no match for query";
      }
    }

    // Handle the token itself: either a container ('[' or '{') or a simple
    // value: string (a chain of raw or escaped parts), literal or number.
    switch (vbc) {
      case WUFFS_BASE__TOKEN__VBC__STRUCTURE:
        if (g_query.matched_all() && (g_depth >= g_flags.max_output_depth)) {
          g_suppress_write_dst++;
        } else {
          TRY(write_dst(
              (vbd & WUFFS_BASE__TOKEN__VBD__STRUCTURE__TO_LIST) ? "[" : "{",
              1));
        }
        g_depth++;
        g_ctx = (vbd & WUFFS_BASE__TOKEN__VBD__STRUCTURE__TO_LIST)
                    ? context::in_list_after_bracket
                    : context::in_dict_after_brace;
        return nullptr;

      case WUFFS_BASE__TOKEN__VBC__STRING:
        if (start_of_token_chain) {
          TRY(write_dst("\"", 1));
          g_query.restart_fragment(in_dict_before_key() &&
                                   g_query.is_at(g_depth));
        }

        if (vbd & WUFFS_BASE__TOKEN__VBD__STRING__CONVERT_0_DST_1_SRC_DROP) {
          // No-op.
        } else if (vbd &
                   WUFFS_BASE__TOKEN__VBD__STRING__CONVERT_1_DST_1_SRC_COPY) {
          uint8_t* ptr = g_src.data.ptr + g_curr_token_end_src_index - len;
          TRY(write_dst(ptr, len));
          g_query.incremental_match_slice(ptr, len);
        } else {
          return "main: internal error: unexpected string-token conversion";
        }

        if (t.continued()) {
          return nullptr;
        }
        TRY(write_dst("\"", 1));
        goto after_value;

      case WUFFS_BASE__TOKEN__VBC__UNICODE_CODE_POINT:
        if (!t.continued()) {
          return "main: internal error: unexpected non-continued UCP token";
        }
        TRY(handle_unicode_code_point(vbd));
        g_query.incremental_match_code_point(vbd);
        return nullptr;

      case WUFFS_BASE__TOKEN__VBC__LITERAL:
      case WUFFS_BASE__TOKEN__VBC__NUMBER:
        TRY(write_dst(g_src.data.ptr + g_curr_token_end_src_index - len, len));
        goto after_value;
    }

    // Return an error if we didn't match the (vbc, vbd) pair.
    return "main: internal error: unexpected token";
  } while (0);

  // Book-keeping after completing a value (whether a container value or a
  // simple value). Empty parent containers are no longer empty. If the parent
  // container is a "{...}" object, toggle between keys and values.
after_value:
  if (g_depth == 0) {
    return g_eod;
  }
  switch (g_ctx) {
    case context::in_list_after_bracket:
      g_ctx = context::in_list_after_value;
      break;
    case context::in_dict_after_brace:
      g_ctx = context::in_dict_after_key;
      break;
    case context::in_dict_after_key:
      g_ctx = context::in_dict_after_value;
      break;
    case context::in_dict_after_value:
      g_ctx = context::in_dict_after_key;
      break;
    default:
      break;
  }
  return nullptr;
}

const char*  //
main1(int argc, char** argv) {
  TRY(initialize_globals(argc, argv));

  bool start_of_token_chain = false;
  while (true) {
    wuffs_base__status status = g_dec.decode_tokens(
        &g_tok, &g_src,
        wuffs_base__make_slice_u8(g_work_buffer_array, WORK_BUFFER_ARRAY_SIZE));

    while (g_tok.meta.ri < g_tok.meta.wi) {
      wuffs_base__token t = g_tok.data.ptr[g_tok.meta.ri++];
      uint64_t n = t.length();
      if ((g_src.meta.ri - g_curr_token_end_src_index) < n) {
        return "main: internal error: inconsistent g_src indexes";
      }
      g_curr_token_end_src_index += n;

      // Skip filler tokens (e.g. whitespace).
      if (t.value() == 0) {
        start_of_token_chain = !t.continued();
        continue;
      }

      const char* z = handle_token(t, start_of_token_chain);
      start_of_token_chain = !t.continued();
      if (z == nullptr) {
        continue;
      } else if (z == g_eod) {
        goto end_of_data;
      }
      return z;
    }

    if (status.repr == nullptr) {
      return "main: internal error: unexpected end of token stream";
    } else if (status.repr == wuffs_base__suspension__short_read) {
      if (g_curr_token_end_src_index != g_src.meta.ri) {
        return "main: internal error: inconsistent g_src indexes";
      }
      TRY(read_src());
      g_curr_token_end_src_index = g_src.meta.ri;
    } else if (status.repr == wuffs_base__suspension__short_write) {
      g_tok.compact();
    } else {
      return status.message();
    }
  }
end_of_data:

  // With a non-empty g_query, don't try to consume trailing whitespace or
  // confirm that we've processed all the tokens.
  if (g_flags.query_c_string && *g_flags.query_c_string) {
    return nullptr;
  }

  // Check that we've exhausted the input.
  if ((g_src.meta.ri == g_src.meta.wi) && !g_src.meta.closed) {
    TRY(read_src());
  }
  if ((g_src.meta.ri < g_src.meta.wi) || !g_src.meta.closed) {
    return "main: valid JSON followed by further (unexpected) data";
  }

  // Check that we've used all of the decoded tokens, other than trailing
  // filler tokens. For example, "true\n" is valid JSON (and fully consumed
  // with WUFFS_JSON__QUIRK_ALLOW_TRAILING_NEW_LINE enabled) with a trailing
  // filler token for the "\n".
  for (; g_tok.meta.ri < g_tok.meta.wi; g_tok.meta.ri++) {
    if (g_tok.data.ptr[g_tok.meta.ri].value_base_category() !=
        WUFFS_BASE__TOKEN__VBC__FILLER) {
      return "main: internal error: decoded OK but unprocessed tokens remain";
    }
  }

  return nullptr;
}

int  //
compute_exit_code(const char* status_msg) {
  if (!status_msg) {
    return 0;
  }
  size_t n;
  if (status_msg == g_usage) {
    n = strlen(status_msg);
  } else {
    n = strnlen(status_msg, 2047);
    if (n >= 2047) {
      status_msg = "main: internal error: error message is too long";
      n = strnlen(status_msg, 2047);
    }
  }
  const int stderr_fd = 2;
  ignore_return_value(write(stderr_fd, status_msg, n));
  ignore_return_value(write(stderr_fd, "\n", 1));
  // Return an exit code of 1 for regular (foreseen) errors, e.g. badly
  // formatted or unsupported input.
  //
  // Return an exit code of 2 for internal (exceptional) errors, e.g. defensive
  // run-time checks found that an internal invariant did not hold.
  //
  // Automated testing, including badly formatted inputs, can therefore
  // discriminate between expected failure (exit code 1) and unexpected failure
  // (other non-zero exit codes). Specifically, exit code 2 for internal
  // invariant violation, exit code 139 (which is 128 + SIGSEGV on x86_64
  // linux) for a segmentation fault (e.g. null pointer dereference).
  return strstr(status_msg, "internal error:") ? 2 : 1;
}
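
/*
The exit code convention above can be exercised on its own, without the
writes to stderr. This is a minimal sketch rather than the program's actual
code path: sketch_exit_code is a hypothetical name, and the sample messages in
the note below are ones that appear elsewhere in this file.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: 0 for success (no message), 2 for messages that
// contain "internal error:", 1 for every other message.
int sketch_exit_code(const char* status_msg) {
  if (!status_msg) {
    return 0;
  }
  return std::strstr(status_msg, "internal error:") ? 2 : 1;
}
```

Under this sketch, "main: no match for query" maps to 1 and
"main: internal error: unexpected token" maps to 2, so a test harness can
treat any exit code other than 0 or 1 as an unexpected failure.
*/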

int  //
main(int argc, char** argv) {
  // Look for an input filename (the first non-flag argument) in argv. If there
  // is one, open it (but do not read from it) before we self-impose a sandbox.
  //
  // Flags start with "-", unless they come after a bare "--" arg.
  {
    bool dash_dash = false;
    int a;
    for (a = 1; a < argc; a++) {
      char* arg = argv[a];
      if ((arg[0] == '-') && !dash_dash) {
        dash_dash = (arg[1] == '-') && (arg[2] == '\x00');
        continue;
      }
      g_input_file_descriptor = open(arg, O_RDONLY);
      if (g_input_file_descriptor < 0) {
        fprintf(stderr, "%s: %s\n", arg, strerror(errno));
        return 1;
      }
      break;
    }
  }

#if defined(WUFFS_EXAMPLE_USE_SECCOMP)
  prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
  g_sandboxed = true;
#endif

  const char* z = main1(argc, argv);
  if (g_wrote_to_dst) {
    const char* z1 = write_dst("\n", 1);
    const char* z2 = flush_dst();
    z = z ? z : (z1 ? z1 : z2);
  }
  int exit_code = compute_exit_code(z);

#if defined(WUFFS_EXAMPLE_USE_SECCOMP)
  // Call SYS_exit explicitly, instead of calling SYS_exit_group implicitly by
  // either calling _exit or returning from main. SECCOMP_MODE_STRICT allows
  // only SYS_exit.
  syscall(SYS_exit, exit_code);
#endif
  return exit_code;
}
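
/*
The filename scan at the top of main (skip flags, but treat everything after a
bare "--" as a non-flag) can be sketched separately from the open call. This
is an illustration only: sketch_first_non_flag_arg is a hypothetical name, and
it takes a std::vector<std::string> instead of argc and argv for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the index of the first non-flag argument, or
// -1 if there is none. Index 0 (the program name) is skipped.
int sketch_first_non_flag_arg(const std::vector<std::string>& args) {
  bool dash_dash = false;
  for (std::size_t a = 1; a < args.size(); a++) {
    const std::string& arg = args[a];
    if (!dash_dash && !arg.empty() && (arg[0] == '-')) {
      // A bare "--" ends flag processing for all later arguments.
      dash_dash = (arg == "--");
      continue;
    }
    return static_cast<int>(a);
  }
  return -1;
}
```

Given {"jsonptr", "--", "-weird.json"}, the sketch returns 2, because the
leading "-" no longer marks a flag once "--" has been seen.
*/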