5.1.1 White-space Characters
If the paragraph element or any of its child elements
contains white-space characters, they are collapsed.
Leading white-space characters at the paragraph start as well as trailing
white-space characters at the paragraph end are ignored. In detail, the
following conversions take place:
The following [UNICODE] characters are normalized to a SPACE
character:
• HORIZONTAL TABULATION (0x0009)
• CARRIAGE RETURN (0x000D)
• LINE FEED (0x000A)
• SPACE (0x0020)
In addition, these characters are ignored if the preceding
character is a white-space character. The preceding character can be contained
in the same element, in the parent element, or in the preceding sibling
element, as long as it is contained within the same paragraph element and the
element in which it is contained processes white-space characters as described
above. White-space characters at the start or end of the paragraph are ignored,
regardless of whether they are contained in the paragraph element itself or in a
child element in which white-space characters are collapsed as described above.
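As an illustration of these rules, the following sketch shows two paragraphs that would be expected to yield the same text, "The quick brown fox." The content is made up for this example, and the text prefix is assumed to be bound to the OpenDocument text namespace (declaration omitted for brevity).

```xml
<!-- Source formatted for readability: line breaks and indentation are collapsed -->
<text:p>
    The quick
    <text:span>   brown   </text:span>
    fox.
</text:p>

<!-- Equivalent paragraph after white-space processing -->
<text:p>The quick <text:span>brown </text:span>fox.</text:p>
```

The leading spaces inside the <text:span> are dropped because the preceding character, located in the parent element, is already a white-space character. The white space after the closing tag of the <text:span> is dropped for the same reason, this time because of the preceding sibling element.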
These white-space processing rules are intended to let authors
use white-space characters to improve the readability of the XML source of an
OpenDocument document, in the same way as they can in HTML.
White-space processing takes place within the following
elements:
• <text:p>
• <text:h>
• <text:span>
• <text:a>
• <text:ref-point>
• <text:ref-point-start>
• <text:ref-point-end>
• <text:bookmark>
• <text:bookmark-start>
• <text:bookmark-end>
Note: In [XSL], white-space
processing of a paragraph of text can be enabled by attaching an fo:white-space="collapse"
attribute to the <fo:block> element that corresponds to the
paragraph element.
In other words, white-space characters within these elements are processed in the
same way that [HTML4] processes them.
Space Character
In general, consecutive
white-space characters in a paragraph are collapsed. For this reason, a special
XML element, <text:s>, is used to represent the [UNICODE] character SPACE (0x0020).
This element uses an optional attribute called text:c to specify the number of SPACE characters that
the element represents. A missing text:c attribute
is interpreted as meaning a single SPACE character.
This element is required to represent the second and all
following SPACE characters in a sequence of SPACE characters. It is not an
error if the character preceding the element is not a white-space character,
but it is good practice to use this element for the second and all following
SPACE characters in a sequence. This way, an application that does not
recognize this element still recognizes a single space character.
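A minimal sketch of the practice described above, with illustrative content: the first space of a run is written as a literal character and the remaining ones are carried by the element. The text prefix is assumed to be bound to the OpenDocument text namespace.

```xml
<!-- "one", three SPACE characters, "two": one literal space plus two from text:c -->
<text:p>one <text:s text:c="2"/>two</text:p>

<!-- "a", two SPACE characters, "b": a missing text:c means a single SPACE -->
<text:p>a <text:s/>b</text:p>
```

Writing three literal spaces between "one" and "two" instead would collapse to a single space under the rules of this section.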
<define name="paragraph-content" combine="choice">
    <element name="text:s">
        <optional>
            <attribute name="text:c">
                <ref name="nonNegativeInteger"/>
            </attribute>
        </optional>
    </element>
</define>
Tab Character
The <text:tab> element
represents the [UNICODE] tab character HORIZONTAL TABULATION (0x0009) in a
heading or paragraph. A <text:tab> element
reserves space from the current position up to the next tab-stop, as defined in
the paragraph's style information.
<define name="paragraph-content" combine="choice">
    <element name="text:tab">
        <ref name="text-tab-attr"/>
    </element>
</define>
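For illustration, a paragraph using the element might look like the following sketch. The content is made up, and the text prefix is assumed to be bound to the OpenDocument text namespace.

```xml
<!-- Each tab advances to the next tab-stop defined by the paragraph's style -->
<text:p>Item<text:tab/>Quantity<text:tab/>Price</text:p>
```

A literal HORIZONTAL TABULATION character in the same place would be normalized to a SPACE, which is why the element is needed.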
Determining which tab-stop a tab character will advance to
requires layout information. To make it easier for non-layout oriented
processors to determine this information, applications may generate a
text:tab-ref attribute as a hint that associates a tab character with a
tab-stop in the current paragraph style. It contains the number of the tab-stop
that the tab character refers to. The position 0 has a special meaning and
signifies the start margin of the paragraph.
<define name="text-tab-attr">
    <optional>
        <attribute name="text:tab-ref">
            <ref name="nonNegativeInteger"/>
        </attribute>
    </optional>
</define>
Note: The text:tab-ref attribute is only a hint to
help non-layout oriented processors to determine the tab/tab-stop association.
Layout oriented processors should determine the tab positions solely based on
the style information.
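A sketch of the hint in use, with made-up content. The numbering shown assumes that tab-stops are counted from 1 in the order they appear in the paragraph style; the text above only fixes the meaning of position 0, so the other values should be read as an assumption.

```xml
<!-- text:tab-ref is optional; 1 and 2 are assumed to name the first and second tab-stop -->
<text:p>Item<text:tab text:tab-ref="1"/>Quantity<text:tab text:tab-ref="2"/>Price</text:p>
```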
Line Breaks
The <text:line-break>
element represents a line break in a heading or paragraph.
<define name="paragraph-content" combine="choice">
    <element name="text:line-break">
        <empty/>
    </element>
</define>
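As a brief illustration with made-up content (text prefix assumed to be bound to the OpenDocument text namespace), a single paragraph displayed on two lines could be written as:

```xml
<!-- One paragraph rendered on two lines -->
<text:p>First line<text:line-break/>second line of the same paragraph</text:p>
```

A literal LINE FEED character at the same position would be normalized to a SPACE rather than producing a break.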
Soft Page Break
The <text:soft-page-break>
element represents a soft page break within a heading or paragraph.
See section 2.3.1, Use Soft Page Breaks, for details regarding soft
page breaks.
<define name="paragraph-content" combine="choice">
    <ref name="text-soft-page-break"/>
</define>
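A possible use inside a paragraph is sketched below. The content is illustrative, and the example assumes that the pattern text-soft-page-break, which is defined elsewhere, resolves to an empty <text:soft-page-break> element.

```xml
<!-- The soft page break marks where a page ended inside the paragraph -->
<text:p>Text at the end of one page <text:soft-page-break/>continues on the next page.</text:p>
```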