Auto Generated Documentation
+29
-30
@@ -1,7 +1,7 @@
|
||||
<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><meta name="generator" content="rustdoc"><meta name="description" content="encoding_rs is a Gecko-oriented Free Software / Open Source implementation of the Encoding Standard in Rust. Gecko-oriented means that converting to and from UTF-16 is supported in addition to converting to and from UTF-8, that the performance and streamability goals are browser-oriented, and that FFI-friendliness is a goal."><meta name="keywords" content="rust, rustlang, rust-lang, encoding_rs"><title>encoding_rs - Rust</title><link rel="stylesheet" type="text/css" href="../normalize.css"><link rel="stylesheet" type="text/css" href="../rustdoc.css" id="mainThemeStyle"><link rel="stylesheet" type="text/css" href="../light.css" id="themeStyle"><link rel="stylesheet" type="text/css" href="../dark.css" disabled ><link rel="stylesheet" type="text/css" href="../ayu.css" disabled ><script id="default-settings"></script><script src="../storage.js"></script><noscript><link rel="stylesheet" href="../noscript.css"></noscript><link rel="icon" type="image/svg+xml" href="../favicon.svg">
|
||||
<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><meta name="generator" content="rustdoc"><meta name="description" content="API documentation for the Rust `encoding_rs` crate."><meta name="keywords" content="rust, rustlang, rust-lang, encoding_rs"><title>encoding_rs - Rust</title><link rel="stylesheet" type="text/css" href="../normalize.css"><link rel="stylesheet" type="text/css" href="../rustdoc.css" id="mainThemeStyle"><link rel="stylesheet" type="text/css" href="../light.css" id="themeStyle"><link rel="stylesheet" type="text/css" href="../dark.css" disabled ><link rel="stylesheet" type="text/css" href="../ayu.css" disabled ><script id="default-settings"></script><script src="../storage.js"></script><noscript><link rel="stylesheet" href="../noscript.css"></noscript><link rel="icon" type="image/svg+xml" href="../favicon.svg">
|
||||
<link rel="alternate icon" type="image/png" href="../favicon-16x16.png">
|
||||
<link rel="alternate icon" type="image/png" href="../favicon-32x32.png"><style type="text/css">#crate-search{background-image:url("../down-arrow.svg");}</style></head><body class="rustdoc mod"><!--[if lte IE 8]><div class="warning">This old browser is unsupported and will most likely display funky things.</div><![endif]--><nav class="sidebar"><div class="sidebar-menu">☰</div><a href='../encoding_rs/index.html'><div class='logo-container rust-logo'><img src='../rust-logo.png' alt='logo'></div></a><p class="location">Crate encoding_rs</p><div class="block version"><p>Version 0.8.28</p></div><div class="sidebar-elems"><a id="all-types" href="all.html"><p>See all encoding_rs's items</p></a><div class="block items"><ul><li><a href="#modules">Modules</a></li><li><a href="#structs">Structs</a></li><li><a href="#enums">Enums</a></li><li><a href="#statics">Statics</a></li></ul></div><p class="location"></p><div id="sidebar-vars" data-name="encoding_rs" data-ty="mod" data-relpath="../"></div></div></nav><div class="theme-picker"><button id="theme-picker" aria-label="Pick another theme!" aria-haspopup="menu"><img src="../brush.svg" width="18" alt="Pick another theme!"></button><div id="theme-choices" role="menu"></div></div><script src="../theme.js"></script><nav class="sub"><form class="search-form"><div class="search-container"><div><select id="crate-search"><option value="All crates">All crates</option></select><input class="search-input" name="search" disabled autocomplete="off" spellcheck="false" placeholder="Click or press ‘S’ to search, ‘?’ for more options…" type="search"></div><button type="button" class="help-button">?</button>
|
||||
<a id="settings-menu" href="../settings.html"><img src="../wheel.svg" width="18" alt="Change settings"></a></div></form></nav><section id="main" class="content"><h1 class="fqn"><span class="in-band">Crate <a class="mod" href="">encoding_rs</a></span><span class="out-of-band"><span id="render-detail"><a id="toggle-all-docs" href="javascript:void(0)" title="collapse all docs">[<span class="inner">−</span>]</a></span><a class="srclink" href="../src/encoding_rs/lib.rs.html#10-6050" title="goto source code">[src]</a></span></h1><div class="docblock"><p>encoding_rs is a Gecko-oriented Free Software / Open Source implementation
|
||||
<link rel="alternate icon" type="image/png" href="../favicon-32x32.png"><style type="text/css">#crate-search{background-image:url("../down-arrow.svg");}</style></head><body class="rustdoc mod"><!--[if lte IE 8]><div class="warning">This old browser is unsupported and will most likely display funky things.</div><![endif]--><nav class="sidebar"><div class="sidebar-menu">☰</div><a href='../encoding_rs/index.html'><div class='logo-container rust-logo'><img src='../rust-logo.png' alt='logo'></div></a><p class="location">Crate encoding_rs</p><div class="block version"><p>Version 0.8.28</p></div><div class="sidebar-elems"><a id="all-types" href="all.html"><p>See all encoding_rs's items</p></a><div class="block items"><ul><li><a href="#modules">Modules</a></li><li><a href="#structs">Structs</a></li><li><a href="#enums">Enums</a></li><li><a href="#statics">Statics</a></li></ul></div><p class="location"></p><script>window.sidebarCurrent = {name: "encoding_rs", ty: "mod", relpath: "../"};</script></div></nav><div class="theme-picker"><button id="theme-picker" aria-label="Pick another theme!" aria-haspopup="menu"><img src="../brush.svg" width="18" alt="Pick another theme!"></button><div id="theme-choices" role="menu"></div></div><script src="../theme.js"></script><nav class="sub"><form class="search-form"><div class="search-container"><div><select id="crate-search"><option value="All crates">All crates</option></select><input class="search-input" name="search" disabled autocomplete="off" spellcheck="false" placeholder="Click or press ‘S’ to search, ‘?’ for more options…" type="search"></div><button type="button" class="help-button">?</button>
|
||||
<a id="settings-menu" href="../settings.html"><img src="../wheel.svg" width="18" alt="Change settings"></a></div></form></nav><section id="main" class="content"><h1 class="fqn"><span class="out-of-band"><span id="render-detail"><a id="toggle-all-docs" href="javascript:void(0)" title="collapse all docs">[<span class="inner">−</span>]</a></span><a class="srclink" href="../src/encoding_rs/lib.rs.html#10-6050" title="goto source code">[src]</a></span><span class="in-band">Crate <a class="mod" href="">encoding_rs</a></span></h1><div class="docblock"><p>encoding_rs is a Gecko-oriented Free Software / Open Source implementation
|
||||
of the <a href="https://encoding.spec.whatwg.org/">Encoding Standard</a> in Rust.
|
||||
Gecko-oriented means that converting to and from UTF-16 is supported in
|
||||
addition to converting to and from UTF-8, that the performance and
|
||||
@@ -25,7 +25,7 @@ file for details.
|
||||
The <a href="https://github.com/hsivonen/encoding_rs">repository is on GitHub</a>. The
|
||||
<a href="https://crates.io/crates/encoding_rs">crate is available on crates.io</a>.</p>
|
||||
<h1 id="integration-with-stdio" class="section-header"><a href="#integration-with-stdio">Integration with <code>std::io</code></a></h1>
|
||||
<p>This crate doesn’t implement traits from <code>std::io</code>. However, the case of
|
||||
<p>This crate doesn't implement traits from <code>std::io</code>. However, the case of
|
||||
wrapping a <code>std::io::Read</code> in a decoder that implements <code>std::io::Read</code> and
|
||||
presents the data from the wrapped <code>std::io::Read</code> as UTF-8 is addressed by
|
||||
the <a href="https://docs.rs/encoding_rs_io/"><code>encoding_rs_io</code></a> crate.</p>
|
||||
@@ -140,12 +140,12 @@ the <a href="https://docs.rs/encoding_rs_io/"><code>encoding_rs_io</code></a> cr
|
||||
<span class="macro">assert_eq</span><span class="macro">!</span>(<span class="kw-2">&</span><span class="ident">output</span>[..], <span class="ident">expectation</span>);
|
||||
<span class="macro">assert</span><span class="macro">!</span>(<span class="op">!</span><span class="ident">total_had_errors</span>);</pre></div>
|
||||
<h2 id="utf-16le-utf-16be-and-unicode-encoding-schemes" class="section-header"><a href="#utf-16le-utf-16be-and-unicode-encoding-schemes">UTF-16LE, UTF-16BE and Unicode Encoding Schemes</a></h2>
|
||||
<p>The Encoding Standard doesn’t specify encoders for UTF-16LE and UTF-16BE,
|
||||
<p>The Encoding Standard doesn't specify encoders for UTF-16LE and UTF-16BE,
|
||||
<strong>so this crate does not provide encoders for those encodings</strong>!
|
||||
Along with the replacement encoding, their <em>output encoding</em> is UTF-8,
|
||||
so you get a UTF-8 encoder if you request an encoder for them.</p>
|
||||
<p>Additionally, the Encoding Standard factors BOM handling into wrapper
|
||||
algorithms so that BOM handling isn’t part of the definition of the
|
||||
algorithms so that BOM handling isn't part of the definition of the
|
||||
encodings themselves. The Unicode <em>encoding schemes</em> in the Unicode
|
||||
Standard define BOM handling or lack thereof as part of the encoding
|
||||
scheme.</p>
|
||||
@@ -162,8 +162,8 @@ but in that case, the UTF-8 BOM triggers UTF-8 decoding, which is
|
||||
not part of the behavior of the UTF-16 <em>encoding scheme</em> per the
|
||||
Unicode Standard.</p>
|
||||
<p>The UTF-32 family of Unicode encoding schemes is not supported
|
||||
by this crate. The Encoding Standard doesn’t define any UTF-32
|
||||
family encodings, since they aren’t necessary for consuming Web
|
||||
by this crate. The Encoding Standard doesn't define any UTF-32
|
||||
family encodings, since they aren't necessary for consuming Web
|
||||
content.</p>
|
||||
<h2 id="iso-8859-1" class="section-header"><a href="#iso-8859-1">ISO-8859-1</a></h2>
|
||||
<p>ISO-8859-1 does not exist as a distinct encoding from windows-1252 in
|
||||
@@ -178,12 +178,12 @@ in the <a href="https://infra.spec.whatwg.org/">Infra Standard</a>.</p>
|
||||
<h2 id="web--browser-focus" class="section-header"><a href="#web--browser-focus">Web / Browser Focus</a></h2>
|
||||
<p>Both in terms of scope and performance, the focus is on the Web. For scope,
|
||||
this means that encoding_rs implements the Encoding Standard fully and
|
||||
doesn’t implement encodings that are not specified in the Encoding
|
||||
doesn't implement encodings that are not specified in the Encoding
|
||||
Standard. For performance, this means that decoding performance is
|
||||
important as well as performance for encoding into UTF-8 or encoding the
|
||||
Basic Latin range (ASCII) into legacy encodings. Non-Basic Latin needs to
|
||||
be encoded into legacy encodings in only two places in the Web platform: in
|
||||
the query part of URLs, in which case it’s a matter of relatively rare
|
||||
the query part of URLs, in which case it's a matter of relatively rare
|
||||
error handling, and in form submission, in which case the user action and
|
||||
networking tend to hide the performance of the encoder.</p>
|
||||
<p>Deemphasizing performance of encoding non-Basic Latin text into legacy
|
||||
@@ -227,7 +227,7 @@ to C callers. The non-streaming part of the API is for Rust callers only and
|
||||
is smart about borrowing instead of copying when possible. When
|
||||
streamability is not needed, the non-streaming API should be preferred in
|
||||
order to avoid copying data when a borrow suffices.</p>
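<p>The borrow-instead-of-copy idea can be sketched with only the standard library. The function below is a hypothetical toy decoder, not part of the encoding_rs API: it maps each byte to the code point of the same value and, as an assumption made for illustration, borrows the input whenever it is pure ASCII, since ASCII bytes are already valid UTF-8.</p>

```rust
use std::borrow::Cow;

/// Hypothetical toy decoder (not an encoding_rs function): maps every byte
/// to the Unicode code point with the same value.
/// Pure-ASCII input is already valid UTF-8, so it is borrowed as-is;
/// any other input is copied into a freshly allocated `String`.
fn decode_bytes_as_code_points(bytes: &[u8]) -> Cow<'_, str> {
    match std::str::from_utf8(bytes) {
        // Borrow: no allocation and no copy.
        Ok(s) if bytes.is_ascii() => Cow::Borrowed(s),
        // Copy: each byte becomes the code point U+0000..=U+00FF.
        _ => Cow::Owned(bytes.iter().map(|&b| b as char).collect()),
    }
}
```

<p>A caller that only reads the result can treat both variants as <code>&amp;str</code> and pays for an allocation only when the input actually needs conversion.</p>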
|
||||
<p>There is no analogous C API exposed via FFI, mainly because C doesn’t have
|
||||
<p>There is no analogous C API exposed via FFI, mainly because C doesn't have
|
||||
standard types for growable byte buffers and Unicode strings that know
|
||||
their length.</p>
|
||||
<p>The C API (header file generated at <code>target/include/encoding_rs.h</code> when
|
||||
@@ -284,7 +284,7 @@ far is always valid taken as whole. (In the case of encoding to ISO-2022-JP,
|
||||
the output needs to be considered as a whole, because the latest output
|
||||
buffer might not be valid taken alone if the transition away
|
||||
from the ASCII state occurred in an earlier output buffer. However, since
|
||||
the ISO-2022-JP decoder doesn’t treat streams that don’t end in the ASCII
|
||||
the ISO-2022-JP decoder doesn't treat streams that don't end in the ASCII
|
||||
state as being in error despite the encoder generating a transition to the
|
||||
ASCII state at the end, the claim about the partial output taken as a whole
|
||||
being valid is true even for ISO-2022-JP.)</p>
|
||||
@@ -298,7 +298,7 @@ code point associated with the error without requiring the caller to
|
||||
extract it from the input on its own.</p>
|
||||
<p>On the encoder side, an error is always triggered by the most recently
|
||||
pushed Unicode scalar, which makes it simple to pass the <code>char</code> to the
|
||||
caller. Also, it’s very typical for the caller to wish to do something with
|
||||
caller. Also, it's very typical for the caller to wish to do something with
|
||||
this data: generate a numeric escape for the character. Additionally, the
|
||||
ISO-2022-JP encoder reports U+FFFD instead of the actual input character in
|
||||
certain cases, so requiring the caller to extract the character from the
|
||||
@@ -307,13 +307,13 @@ Furthermore, requiring the caller to extract the character from the input
|
||||
buffer would require the caller to implement UTF-8 or UTF-16 math, which is
|
||||
the job of an encoding conversion library.</p>
|
||||
<p>On the decoder side, errors are triggered in more complex ways. For
|
||||
example, when decoding the sequence ESC, ‘$’, <em>buffer boundary</em>, ‘A’ as
|
||||
example, when decoding the sequence ESC, '$', <em>buffer boundary</em>, 'A' as
|
||||
ISO-2022-JP, the ESC byte is in error, but this is discovered only after
|
||||
the buffer boundary when processing ‘A’. Thus, the bytes in error might not
|
||||
the buffer boundary when processing 'A'. Thus, the bytes in error might not
|
||||
be the ones most recently pushed to the decoder and the error might not even
|
||||
be in the current buffer.</p>
|
||||
<p>Some encoding conversion APIs address the problem by not acknowledging
|
||||
trailing bytes of an input buffer as consumed if it’s still possible for
|
||||
trailing bytes of an input buffer as consumed if it's still possible for
|
||||
future bytes to cause the trailing bytes to be in error. This way, error
|
||||
reporting can always refer to the most recently pushed buffer. This has the
|
||||
problem that the caller of the API has to copy the unconsumed trailing
|
||||
@@ -330,7 +330,7 @@ stream.</p>
|
||||
it possible to develop applications, such as HTML validators, that care
|
||||
about which bytes were in error, encoding_rs reports the length of the
|
||||
erroneous sequence and the number of bytes consumed after the erroneous
|
||||
sequence. As long as the caller doesn’t discard the 6 most recent bytes,
|
||||
sequence. As long as the caller doesn't discard the 6 most recent bytes,
|
||||
this makes it possible for callers that care about the erroneous bytes to
|
||||
locate them.</p>
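<p>As a minimal sketch of how a caller might use those two numbers, assume the caller keeps a running count of all bytes the decoder has consumed so far. The helper name <code>malformed_range</code> and its parameters are hypothetical and chosen for illustration; only the arithmetic follows from the description above.</p>

```rust
use std::ops::Range;

/// Hypothetical helper (not part of encoding_rs): computes the stream
/// offsets of an erroneous byte sequence.
///
/// * `total_consumed`: bytes the decoder has consumed so far, tracked by the caller.
/// * `bad_len`: reported length of the erroneous sequence.
/// * `consumed_after`: reported number of bytes consumed after that sequence.
///
/// Returns `None` if the numbers are inconsistent with each other.
fn malformed_range(total_consumed: usize, bad_len: u8, consumed_after: u8) -> Option<Range<usize>> {
    // The erroneous sequence ends where the trailing consumed bytes begin.
    let end = total_consumed.checked_sub(consumed_after as usize)?;
    // It starts `bad_len` bytes before that point.
    let start = end.checked_sub(bad_len as usize)?;
    Some(start..end)
}
```

<p>For example, with 10 bytes consumed in total, an erroneous sequence of length 1 and 2 bytes consumed after it, the erroneous byte is at stream offset 7. The range may begin before the start of the current buffer, which is why the caller needs to keep the most recent bytes around.</p>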
|
||||
<h1 id="no-convenience-api-for-custom-replacements" class="section-header"><a href="#no-convenience-api-for-custom-replacements">No Convenience API for Custom Replacements</a></h1>
|
||||
@@ -342,7 +342,7 @@ encoders is emitting an HTML decimal numeric character reference for
|
||||
unmappable characters.</p>
|
||||
<p>Since encoding_rs is Web-focused, these are the only error recovery modes
|
||||
for which convenient support is provided. Moreover, on the decoder side,
|
||||
there aren’t really good alternatives for emitting the REPLACEMENT CHARACTER
|
||||
there aren't really good alternatives for emitting the REPLACEMENT CHARACTER
|
||||
on error (other than treating errors as fatal). In particular, simply
|
||||
ignoring errors is a
|
||||
<a href="http://www.unicode.org/reports/tr36/#Substituting_for_Ill_Formed_Subsequences">security problem</a>,
|
||||
@@ -352,12 +352,12 @@ callers to ignore errors.</p>
|
||||
numeric character references. For example, when outputting CSS, CSS-style
|
||||
escapes would seem to make sense. However, instead of facilitating the
|
||||
output of CSS, JS, etc. in non-UTF-8 encodings, encoding_rs takes the design
|
||||
position that you shouldn’t generate output in encodings other than UTF-8,
|
||||
position that you shouldn't generate output in encodings other than UTF-8,
|
||||
except where backward compatibility with interacting with the legacy Web
|
||||
requires it. The legacy Web requires it only when parsing the query strings
|
||||
of URLs and when submitting forms, and those two both use HTML decimal
|
||||
numeric character references.</p>
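<p>To make the difference between the two escape styles concrete, here is a standard-library-only sketch. It assumes a toy ASCII-only target encoding, and every name in it is hypothetical; it does not use or reproduce the encoding_rs API, it only shows the shape of a replacement hook for unmappable characters.</p>

```rust
/// Toy ASCII-only "legacy" encoder (hypothetical, not part of encoding_rs).
/// ASCII characters are copied through; every other character is handed to
/// `on_unmappable`, which decides which replacement bytes to emit.
fn encode_ascii_with<F>(input: &str, mut on_unmappable: F) -> Vec<u8>
where
    F: FnMut(char, &mut Vec<u8>),
{
    let mut out = Vec::with_capacity(input.len());
    for c in input.chars() {
        if c.is_ascii() {
            out.push(c as u8);
        } else {
            on_unmappable(c, &mut out);
        }
    }
    out
}

/// HTML decimal numeric character reference: "&#", decimal code point, ";".
fn html_decimal_ncr(c: char, out: &mut Vec<u8>) {
    out.extend_from_slice(format!("&#{};", c as u32).as_bytes());
}

/// CSS-style escape: backslash, hexadecimal code point, terminating space.
fn css_escape(c: char, out: &mut Vec<u8>) {
    out.extend_from_slice(format!("\\{:X} ", c as u32).as_bytes());
}
```

<p>Passing <code>html_decimal_ncr</code> turns U+20AC into <code>&amp;#8364;</code>, while <code>css_escape</code> turns it into a backslash followed by <code>20AC</code> and a space. The first style is the one encoding_rs supports conveniently; the second stands in for a custom replacement a caller would have to implement.</p>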
|
||||
<p>While encoding_rs doesn’t make encoder replacements other than HTML decimal
|
||||
<p>While encoding_rs doesn't make encoder replacements other than HTML decimal
|
||||
numeric character references easy, it does make them <em>possible</em>.
|
||||
<code>encode_from_utf8()</code>, which emits HTML decimal numeric character references
|
||||
for unmappable characters, is implemented on top of
|
||||
@@ -371,28 +371,28 @@ rather than <code>trait</code>s. encoding_rs takes the design position that all
|
||||
text interchange should be done using UTF-8, which can represent all of
|
||||
Unicode. (It is, in fact, the only encoding supported by the Encoding
|
||||
Standard and encoding_rs that can represent all of Unicode and that has
|
||||
encoder support. UTF-16LE and UTF-16BE don’t have encoder support, and
|
||||
encoder support. UTF-16LE and UTF-16BE don't have encoder support, and
|
||||
gb18030 cannot encode U+E5E5.) The other encodings are supported merely for
|
||||
legacy compatibility and not due to non-UTF-8 encodings having benefits
|
||||
other than being able to consume legacy content.</p>
|
||||
<p>Considering that UTF-8 can represent all of Unicode and is already supported
|
||||
by all Web browsers, introducing a new encoding wouldn’t add to the
|
||||
by all Web browsers, introducing a new encoding wouldn't add to the
|
||||
expressiveness but would add to compatibility problems. In that sense,
|
||||
adding new encodings to the Web Platform doesn’t make sense, and, in fact,
|
||||
adding new encodings to the Web Platform doesn't make sense, and, in fact,
|
||||
post-UTF-8 attempts at encodings, such as BOCU-1, have been rejected from
|
||||
the Web Platform. On the other hand, the set of legacy encodings that must
|
||||
be supported for a Web browser to be able to be successful is not going to
|
||||
expand. Empirically, the set of encodings specified in the Encoding Standard
|
||||
is already sufficient and the set of legacy encodings won’t grow
|
||||
is already sufficient and the set of legacy encodings won't grow
|
||||
retroactively.</p>
|
||||
<p>Since extensibility doesn’t make sense considering the Web focus of
|
||||
<p>Since extensibility doesn't make sense considering the Web focus of
|
||||
encoding_rs and adding encodings to Web clients would be actively harmful,
|
||||
it makes sense to make the set of encodings that encoding_rs supports
|
||||
non-extensible and to take the (admittedly small) benefits arising from
|
||||
that, such as the size of <code>Decoder</code> and <code>Encoder</code> objects being known ahead
|
||||
of time, which enables stack allocation thereof.</p>
|
||||
<p>This does have downsides for applications that might want to put encoding_rs
|
||||
to non-Web uses if those non-Web uses involve legacy encodings that aren’t
|
||||
to non-Web uses if those non-Web uses involve legacy encodings that aren't
|
||||
needed for Web uses. The needs of such applications should not complicate
|
||||
encoding_rs itself, though. It is up to those applications to provide a
|
||||
framework that delegates the operations with encodings that encoding_rs
|
||||
@@ -401,11 +401,11 @@ else (as opposed to encoding_rs itself providing an extensibility
|
||||
framework).</p>
|
||||
<h1 id="panics" class="section-header"><a href="#panics">Panics</a></h1>
|
||||
<p>Methods in encoding_rs can panic if the API is used against the requirements
|
||||
stated in the documentation, if a state that’s supposed to be impossible
|
||||
stated in the documentation, if a state that's supposed to be impossible
|
||||
is reached due to an internal bug or on integer overflow. When used
|
||||
according to documentation with buffer sizes that stay below integer
|
||||
overflow, in the absence of internal bugs, encoding_rs does not panic.</p>
|
||||
<p>Panics arising from API misuse aren’t documented beyond this on individual
|
||||
<p>Panics arising from API misuse aren't documented beyond this on individual
|
||||
methods.</p>
|
||||
<h1 id="at-risk-parts-of-the-api" class="section-header"><a href="#at-risk-parts-of-the-api">At-Risk Parts of the API</a></h1>
|
||||
<p>The foreseeable source of partially backward-incompatible API change is the
|
||||
@@ -671,5 +671,4 @@ replacement.</p>
|
||||
</td></tr><tr class="module-item"><td><a class="static" href="static.X_MAC_CYRILLIC_INIT.html" title="encoding_rs::X_MAC_CYRILLIC_INIT static">X_MAC_CYRILLIC_INIT</a></td><td class="docblock-short"><p>The initializer for the <a href="static.X_MAC_CYRILLIC.html">x-mac-cyrillic</a> encoding.</p>
|
||||
</td></tr><tr class="module-item"><td><a class="static" href="static.X_USER_DEFINED.html" title="encoding_rs::X_USER_DEFINED static">X_USER_DEFINED</a></td><td class="docblock-short"><p>The x-user-defined encoding.</p>
|
||||
</td></tr><tr class="module-item"><td><a class="static" href="static.X_USER_DEFINED_INIT.html" title="encoding_rs::X_USER_DEFINED_INIT static">X_USER_DEFINED_INIT</a></td><td class="docblock-short"><p>The initializer for the <a href="static.X_USER_DEFINED.html">x-user-defined</a> encoding.</p>
|
||||
</td></tr></table></section><section id="search" class="content hidden"></section><section class="footer"></section><div id="rustdoc-vars" data-root-path="../" data-current-crate="encoding_rs"></div>
|
||||
<script src="../main.js"></script><script defer src="../search-index.js"></script></body></html>
|
||||
</td></tr></table></section><section id="search" class="content hidden"></section><section class="footer"></section><script>window.rootPath = "../";window.currentCrate = "encoding_rs";</script><script src="../main.js"></script><script defer src="../search-index.js"></script></body></html>
|
||||