RXP(1)                                                                  RXP(1)

NAME
       rxp - XML parser program

SYNOPSIS
       rxp [ -abemnNRsStvVx ] [ -o b|p|0|1|2|3|i|d ] [ -U 0|1|2 ]
           [ -c encoding ] [ url ]

DESCRIPTION
       rxp reads and parses XML from the url (or standard input if none is
       provided) and writes it to standard output, optionally expanding
       entities, defaulting attributes, and translating to a different
       output encoding.

       rxp accepts XML 1.0 and 1.1, and the corresponding versions of XML
       namespaces. It implements the OASIS XML catalog specification.

       Common option combinations are -Nxs to check a document for
       well-formedness and namespace well-formedness, and -VNxs to also
       check for DTD-validity.

OPTIONS
       -a     Insert declared default values for omitted attributes.

       -v     Be verbose.

       -V     Validate the document. Repeating this option will make the
              program treat validity errors as well-formedness errors, and
              exit after the first validity error (otherwise a warning will
              be printed for each one).

       -d     Read the whole DTD (internal and external parts) regardless of
              any standalone declaration. Otherwise a declaration
              "standalone='yes'" will prevent the external part from being
              read (unless validation is selected).

       -N     Enable XML namespace support. The document will be checked for
              correct namespace syntax, and if -b is specified qualified
              element and attribute names will be displayed with their URIs.

       -R     The value of this flag is a time limit in seconds, after which
              the program will abort. This is to protect against
              denial-of-service attacks using malicious documents.

       -S     Keep track of xml:space attributes. This will only affect
              output when -b is specified.

       -e     Obsolete, do not use.

       -E     Do not expand entity references (opposite of old -e flag).

       -s     Be silent (that is, suppress output). Useful for benchmarking
              or if you just want to see the error messages.

       -b     Print output as "bits".
       -n     Treat the input as normalised SGML rather than XML. Not
              intended for general use.

       -o     If this flag is p, output is in the default (plain) format. If
              it is b, output is printed as "bits" (equivalent to -b). If it
              is 0, output is suppressed (equivalent to -s). If it is 1, 2
              or 3, output is in first, second or third canonical form. If
              it is i, output is a dump of the document's infoset. If it is
              d, output is in a form suitable for use with "diff"; in
              particular attributes are sorted into alphabetical order.

       -m     Merge PCData across entity references. This will only affect
              the output when -b is specified.

       -t     Read in the input as a tree, rather than bits. Should make no
              difference to the output.

       -u base_uri
              Use the specified base URI when resolving system identifiers.

       -U     This flag controls Unicode normalization checking and is only
              relevant when parsing XML 1.1 documents. If it is 0, no
              checking is done. If it is 1, rxp checks that the document is
              fully normalized as defined by the W3C character model. If it
              is 2, the document is checked and any unknown characters
              (which may be ones corresponding to a newer version of Unicode
              than rxp knows about) will also cause an error.

       -x     Strict XML mode. This suppresses some warnings (e.g. entity
              redefinitions) but treats all XML well-formedness errors as
              fatal. This flag implies the -a flag, and sets the output
              encoding to UTF-8 unless the -c flag is given. It sets the
              output format to first canonical form unless the -o, -b or -s
              flag is given.

       -c encoding
              Produce output in the specified character encoding. Known
              encodings include ISO-8859-1, UTF-8, ISO-10646-UCS and UTF-16.
              16-bit encoding names may be suffixed with -B or -L to specify
              big- or little-endian byte order (the default is the host byte
              order). If no -c or -x option is given, output is in the same
              encoding as the input document.
       -D name sysid
              Force use of the document type specified by sysid. The root
              element name for validation is name. Any DTD in the document
              is ignored. This flag does not imply validation; use -V if
              required.

       -i     Do xml:id processing. Attributes named xml:id are recognised
              as IDs even if not declared.

       -I     The same as -i, but in addition xml:id attributes are checked
              for uniqueness.

       -z     Use a shorter format for error messages. Particularly useful
              when using the parser in Emacs compilation mode, so that Emacs
              can find the error location.

EXIT STATUS
       If the -V flag is given, and the document is well-formed but not
       valid, 2 is returned. If the document is not well-formed, or a system
       error occurs, 1 is returned. Otherwise 0 is returned.

       Since the parser can expand external entities even when not
       validating, it treats certain errors which are technically validity
       errors as well-formedness errors.

       If -x is not specified, some well-formedness errors produce only
       warnings and do not affect the exit status.

ENVIRONMENT
       If the environment variable XML_CATALOG_FILES is set, XML catalog
       processing is enabled. The variable should be set to a
       space-separated list of catalog files. The variable
       XML_CATALOG_PREFER may be set to public or system to set the initial
       mode for catalog processing; the default is system.

       If the variable RXPURL is set, it is used as the URL of the document
       to parse. This may be useful in CGI scripts and the like to avoid
       shell parsing of a user-supplied argument.

       The variable http_proxy can be used to specify a proxy for HTTP
       connections. The syntax is hostname[:port].

RXP release 1.4.7                                                       RXP(1)
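
The exit codes and the RXPURL variable described above can be combined in a small wrapper script. The following is a minimal sketch, not part of rxp itself: the function names describe_status, check_xml and check_url are hypothetical, and the sketch assumes rxp is installed and on the PATH. It uses only the -V, -N, -x and -s flags documented in OPTIONS.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not provided by rxp.

# Map an rxp exit status to a message, following EXIT STATUS.
describe_status() {
    case "$1" in
        0) echo "ok" ;;
        1) echo "not well-formed, or system error" ;;
        2) echo "well-formed but not valid" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Validate a file: -V validate, -N namespaces, -x strict, -s no output.
# Error messages from rxp still appear; only the parsed output is suppressed.
check_xml() {
    status=0
    rxp -VNxs "$1" || status=$?
    describe_status "$status"
}

# Check well-formedness of a document named by URL. The URL is passed
# through RXPURL rather than as a command-line argument.
check_url() {
    status=0
    RXPURL="$1" rxp -Nxs || status=$?
    describe_status "$status"
}
```

After sourcing the script, `check_xml doc.xml` prints one of the messages above for a local file (doc.xml is a placeholder name). Status 2 can only occur when -V is given, so check_url, which does not validate, is expected to report either 0 or 1.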