> -----Original Message-----
> From: Jeff Deskins
> Sent: Thursday, 2007 January 04 12:58
> To: arbortext-adepters
> Subject: Try to Determine Valid URN Syntax
> Importance: High
>
> I was wondering if I could reach out to some technical folks
> who are familiar with URN, URL/URI inter-workings.
>
> As I read and try to understand the specifications, a URI
> can contain a query after the path as in my top example.
>
> What I was wondering is: can a URN also contain a query, as
> in my bottom example?
>
>    foo://example.com:8042/over/there?name=ferret#nose
>    \_/   \______________/\_________/ \_________/ \__/
>     |           |            |            |        |
> scheme     authority       path        query   fragment
>     |   _____________________|__
>    / \ /                        \
>    urn:example:animal:ferret:nose?type=blue;etc.
>
> Even with lots of net research I've not been able to validate this.
URNs are a kind of URI.
The latest URI spec is RFC 3986 (which supersedes RFC 2396),
which is at ftp://ftp.ietf.org/rfc/rfc3986.txt .
The URN syntax itself is described in RFC 2141 at
ftp://ftp.ietf.org/rfc/rfc2141.txt .
(I suspect Jeff knows all this, since his example comes
right out of 3986.)
Since a URN is a URI, I see nothing in RFC 3986 preventing
a URN-schemed URI from having the optional query part.
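To illustrate, here is a minimal Python sketch that applies the generic
URI regular expression given in Appendix B of RFC 3986 to Jeff's bottom
example. The helper name split_uri is hypothetical, and this shows only
how the generic syntax splits the string, not whether the result is a
sensible URN.

```python
import re

# Generic URI regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_uri(uri):
    """Split a URI reference into the five generic components.

    Hypothetical helper; components that are absent come back as None.
    """
    m = URI_RE.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

parts = split_uri("urn:example:animal:ferret:nose?type=blue;etc.")
print(parts["scheme"], parts["path"], parts["query"])
```

Under the generic syntax, everything after the "?" lands in the query
component and the path is "example:animal:ferret:nose".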
RFC 2141 defines a URN as:
<urn> ::= "urn:" <nid> ":" <nss>
and
<nss> ::= 1*<urn chars>
<urn chars> ::= <trans> | "%" <hex> <hex>
<trans> ::= <upper> | <lower> | <number> | <other> | <reserved>
<hex> ::= <number> | "A" | "B" | "C" | "D" | "E" | "F" |
          "a" | "b" | "c" | "d" | "e" | "f"
<other> ::= "(" | ")" | "+" | "," | "-" | "." |
            ":" | "=" | "@" | ";" | "$" |
            "_" | "!" | "*" | "'"
<reserved> ::= "%" | "/" | "?" | "#"
Given the productions, it looks like an <nss> could contain
"?type=blue;etc." (without the quotes), since "?" is a <reserved>
character and therefore a valid <trans>.
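To check this mechanically, here is a minimal Python sketch that
transcribes the productions above into a regular expression. The names
OTHER, RESERVED, NSS_RE, and is_valid_nss are hypothetical, and the
sketch covers only the <nss> part, not the <nid>. It follows the grammar
as literally written, so it says nothing about the SHOULD NOT guidance.

```python
import re

# Character sets copied from the <other> and <reserved> productions.
OTHER = "()+,-.:=@;$_!*'"
RESERVED = "%/?#"

# <trans> ::= <upper> | <lower> | <number> | <other> | <reserved>
TRANS_CLASS = "[A-Za-z0-9" + re.escape(OTHER) + re.escape(RESERVED) + "]"

# <nss> ::= 1*<urn chars>
# <urn chars> ::= <trans> | "%" <hex> <hex>
NSS_RE = re.compile("(?:" + TRANS_CLASS + "|%[0-9A-Fa-f]{2})+")

def is_valid_nss(nss):
    """Return True if nss matches the <nss> production as written."""
    return NSS_RE.fullmatch(nss) is not None

print(is_valid_nss("animal:ferret:nose?type=blue;etc."))
print(is_valid_nss("animal ferret"))  # a space is not a <trans> character
```

Because "%" is itself listed under <reserved>, the grammar as written
also accepts a bare "%", so the "%" <hex> <hex> alternative is kept here
only to mirror the productions.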
However, section 2.3.2 of 2141 says:
RFC 1630 reserves the characters "/", "?", and "#" for
particular purposes. The URN-WG has not yet debated the
applicability and precise semantics of those purposes as
applied to URNs. Therefore, these characters are RESERVED
for future developments. Namespace developers SHOULD NOT
use these characters in unencoded form, but rather use the
appropriate %-encoding for each character.
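As an illustration of what that %-encoding would look like, here is a
small Python sketch. The helper name encode_reserved is hypothetical,
and it encodes only the three characters this paragraph names.

```python
# The characters RFC 2141 section 2.3.2 reserves for future use.
RESERVED_FOR_FUTURE = "/?#"

def encode_reserved(nss):
    """Percent-encode "/", "?", and "#" in an NSS, leaving the rest as-is."""
    return "".join(
        "%{:02X}".format(ord(c)) if c in RESERVED_FOR_FUTURE else c
        for c in nss
    )

print(encode_reserved("animal:ferret:nose?type=blue;etc."))
# prints animal:ferret:nose%3Ftype=blue;etc.
```

Note that once the "?" is encoded, it is just part of the name and no
longer separates off a query in the RFC 3986 sense.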
The phrase "SHOULD NOT" actually has an official definition
in RFC 2119:
SHOULD NOT This phrase, or the phrase "NOT RECOMMENDED"
mean that there may exist valid reasons in particular
circumstances when the particular behavior is acceptable
or even useful, but the full implications should be
understood and the case carefully weighed before
implementing any behavior described with this label.
So I'd say you need a pretty strong justification before
creating URNs with query parts. The point of URNs is to give
unique, persistent names to things (that may or may not actually
be retrievable web resources), so queries aren't something the
URN designers considered to be within the scope of URNs.
Understanding the intricacies of these specs can be difficult,
and I'd be the first to want to check my understanding with other
experts in the field. As chair of the W3C XML Core Working Group,
I've sent out email to check with other working group members.
My response here reflects the replies I got from that group
to date. (If I get different input from others, I'll let
you know.)
Finally, whether a given tool would actually do what you want
with a URN with a query part is another question entirely.
For that answer, you'll just have to experiment, but given
section 2.3.2 of RFC 2141, I'd be surprised if you found any
tools that would support that.
paul