Well-formed UTF-8 code unit sequence

A well-formed Unicode code unit sequence of UTF-8 code units.

  • The UTF-8 code unit sequence <41 C3 B1 42> is well-formed, because it can be partitioned into subsequences, all of which match the specification for UTF-8 in Table 3-7. It consists of the following minimal well-formed code unit subsequences: <41>, <C3 B1>, and <42>.
  • The UTF-8 code unit sequence <41 C2 C3 B1 42> is ill-formed, because it contains one ill-formed subsequence. No subsequence beginning with the C2 byte matches the specification for UTF-8 in Table 3-7. The code unit sequence is partitioned into one minimal well-formed code unit subsequence, <41>, followed by one ill-formed code unit subsequence, <C2>, followed by two minimal well-formed code unit subsequences, <C3 B1> and <42>.
  • In isolation, the UTF-8 code unit sequence <C2 C3> would be ill-formed, but in the context of the UTF-8 code unit sequence <41 C2 C3 B1 42>, <C2 C3> does not constitute an ill-formed code unit subsequence, because the C3 byte is actually the first byte of the minimal well-formed UTF-8 code unit subsequence <C3 B1>. Ill-formed code unit subsequences do not overlap with minimal well-formed code unit subsequences.
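
The partitioning described in the examples above can be sketched in Python. This is an illustrative sketch, not a reference implementation: the byte ranges are transcribed from Table 3-7 of the Unicode Standard, the names TABLE_3_7, match_length, partition, and is_well_formed are hypothetical, and reporting each unmatched byte as its own ill-formed subsequence is a simplifying assumption.

```python
# Allowed byte ranges per position, transcribed from Table 3-7.
# Each row describes one shape of minimal well-formed subsequence.
_CONT = (0x80, 0xBF)
TABLE_3_7 = [
    [(0x00, 0x7F)],
    [(0xC2, 0xDF), _CONT],
    [(0xE0, 0xE0), (0xA0, 0xBF), _CONT],
    [(0xE1, 0xEC), _CONT, _CONT],
    [(0xED, 0xED), (0x80, 0x9F), _CONT],
    [(0xEE, 0xEF), _CONT, _CONT],
    [(0xF0, 0xF0), (0x90, 0xBF), _CONT, _CONT],
    [(0xF1, 0xF3), _CONT, _CONT, _CONT],
    [(0xF4, 0xF4), (0x80, 0x8F), _CONT, _CONT],
]


def match_length(data: bytes, start: int) -> int:
    """Length of the minimal well-formed subsequence at start, or 0 if none."""
    for row in TABLE_3_7:
        chunk = data[start:start + len(row)]
        if len(chunk) == len(row) and all(
            lo <= byte <= hi for byte, (lo, hi) in zip(chunk, row)
        ):
            return len(row)
    return 0


def partition(data: bytes) -> list[tuple[bool, bytes]]:
    """Split data into (is_well_formed, subsequence) pairs, left to right."""
    parts = []
    i = 0
    while i < len(data):
        n = match_length(data, i)
        if n:
            parts.append((True, data[i:i + n]))
            i += n
        else:
            # Simplification: each unmatched byte is reported on its own.
            parts.append((False, data[i:i + 1]))
            i += 1
    return parts


def is_well_formed(data: bytes) -> bool:
    """True when every subsequence in the partition is well-formed."""
    return all(ok for ok, _ in partition(data))
```

With this sketch, partition(bytes.fromhex("41C2C3B142")) yields <41> as well-formed, <C2> as ill-formed, then <C3 B1> and <42> as well-formed, matching the second example. Scanning left to right is sufficient here because the continuation bytes 80..BF never appear as a first byte in Table 3-7, so a well-formed subsequence cannot begin inside another one.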