Anchoring in patterns of string parameters


#1

Disclaimer: While I have been working with RAML for almost a year now, and am familiar with the 0.8 spec, I’m new to this forum category. Also, I have not been involved in the ongoing efforts of defining RAML 1.0 yet. Please treat my incompetence with kindness.

In the section on patterns, the RAML spec says:

[…] The pattern attribute is a regular expression that a parameter of type string MUST match. […]

I find that slightly ambiguous. Does it mean that the pattern must occur somewhere in the parameter value? Or does it mean that the whole parameter value (from beginning to end) must match the pattern?

From the point of view of a RAML author: is manual anchoring of the regexp required if the pattern is intended to match the whole parameter value?

E.g. are the following two regexps equivalent in the context of a RAML pattern?

a+
^a+$

It would be nice if future versions of the spec could be clearer about this. What do you think?


#2

In the meantime, I have been able to clarify this issue through an exchange I had somewhere else.

Apparently, there is consensus that patterns must be anchored manually if the RAML author intends them to match the whole parameter value from beginning to end. So my examples above are not equivalent in the context of RAML.

Come to think of it, it does make sense. The RAML 0.8 spec says that it is based on the RegExp specification from JavaScript/ECMA262. And in that spec, the usual ways to apply a RegExp are the test function of RegExp itself and the match function of String. Both succeed if the pattern is contained somewhere in the input (test returns true, match returns a non-null result). That is, the pattern does not need to match the whole input (unless explicitly anchored).


#3

I guess my confusion stems from the fact that some regular-expression libraries handle this differently from JavaScript. For example, in the standard implementation for Java, you can apply a pattern in different ways. The following is similar to the JavaScript way, and thus returns true:

Pattern.compile("[a-z][a-z0-9]+").matcher("INVALIDvalidpart").find();

In contrast, the following equivalent notations check whether the whole input matches, and thus all return false:

Pattern.compile("[a-z][a-z0-9]+").matcher("INVALIDvalidpart").matches();
Pattern.matches("[a-z][a-z0-9]+", "INVALIDvalidpart");
"INVALIDvalidpart".matches("[a-z][a-z0-9]+");

Being more familiar with Java than with JavaScript, I was confused by the term match in the RAML spec. I still believe that the wording in the RAML 0.8 spec is confusing, to say the least. Maybe you guys could improve that in future spec versions?
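
For reference, the difference between find() and matches() described above, and the effect of manual anchoring, can be put into a small runnable sketch. The class name AnchoringDemo and the helpers containsMatch and matchesWhole are made up for this illustration. Java's find() is only used here as a stand-in for the unanchored JavaScript behaviour; this is not a claim about how any particular RAML tool validates parameters.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class AnchoringDemo {

    // Unanchored search: true if the pattern occurs anywhere in the input
    // (used here as a stand-in for the JavaScript-style behaviour).
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Whole-input match: true only if the entire input matches the pattern.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "INVALIDvalidpart";

        // Unanchored pattern, unanchored search: "validpart" is found.
        System.out.println(containsMatch("[a-z][a-z0-9]+", input));   // true

        // Manually anchored pattern, unanchored search: the whole value must fit.
        System.out.println(containsMatch("^[a-z][a-z0-9]+$", input)); // false

        // Unanchored pattern, whole-input match: same outcome as manual anchoring.
        System.out.println(matchesWhole("[a-z][a-z0-9]+", input));    // false

        // The two patterns from the first post, applied to the value "baaab".
        System.out.println(containsMatch("a+", "baaab"));   // true
        System.out.println(containsMatch("^a+$", "baaab")); // false
        System.out.println(containsMatch("^a+$", "aaa"));   // true
    }
}
```

So with an unanchored search, a+ accepts any value that contains at least one a, while ^a+$ accepts only values that consist entirely of a characters.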


#4

Hi All,

Is there any update regarding this issue?