6

I'm having some trouble translating my working C# regular expression into JavaScript's regular expression implementation.

Here's the regular expression:

([a-z]+)((\d+)([a-z]+))?,?

When used on "water2cups,flour4cups,salt2teaspoon" you should get:

[
    ["water", "2cups", "2", "cups"],
    ["flour", "4cups", "4", "cups"],
    ["salt", "2teaspoon", "2", "teaspoon"]
]

... And it does. In C#. But not in JavaScript.

I know there are some minor differences across implementations. What am I missing to get this expression working in JavaScript?

Update

I am using the regex like so:

"water2cups,flour4cups,salt2teaspoon".match(/([a-z]+)((\d+)([a-z]+))?,?/g);
1
  • Re your update: If you use a RegExp#exec loop rather than String#match, you get the results you're expecting (see my answer). I'm not enough of a RegExp guru to tell you why. :-) Commented May 5, 2010 at 12:11

2 Answers

13

Creating the RegExp

You haven't shown how you're creating your JavaScript regular expression. For example, are you using a literal:

var rex = /([a-z]+)((\d+)([a-z]+))?,?/;

or a string

var rex = new RegExp("([a-z]+)((\\d+)([a-z]+))?,?");

If the latter, note that I've escaped the backslash.
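To make the escaping point concrete, here is a small sketch comparing the two forms. The variable names (literalRex, stringRex, brokenRex) are made up for illustration. It assumes only standard string-escape behaviour: in a string literal, "\d" is read as a plain "d", so the digit class is silently lost unless the backslash is doubled.

```javascript
// Literal form: the backslash in \d is written once.
var literalRex = /([a-z]+)((\d+)([a-z]+))?,?/;

// String form: the string literal consumes one backslash, so write \\d.
var stringRex = new RegExp("([a-z]+)((\\d+)([a-z]+))?,?");

// Forgetting to double it: "\d" inside a string is just the letter "d".
var brokenRex = new RegExp("([a-z]+)((\d+)([a-z]+))?,?");

// Both correct forms compile to the same pattern text.
var sameSource = literalRex.source === stringRex.source;

// The broken form has lost its backslash entirely.
var brokenSource = brokenRex.source;
```

Comparing the source property like this is a quick way to check what pattern the engine actually received.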

Global Flag

By default, JavaScript regular expressions are not global, which may be an issue for you. Add the g flag if you don't already have it:

var rex = /([a-z]+)((\d+)([a-z]+))?,?/g;

or

var rex = new RegExp("([a-z]+)((\\d+)([a-z]+))?,?", "g");
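As a rough illustration of what the flag changes, the sketch below calls exec twice on the same string, once without g and once with it. The variable names are hypothetical. Without g the search restarts from the beginning every time, so an exec loop over a non-global regex would never advance.

```javascript
var str = "water2cups,flour4cups,salt2teaspoon";

// Without the g flag, lastIndex is ignored and every call starts at index 0.
var nonGlobal = /([a-z]+)((\d+)([a-z]+))?,?/;
var nonGlobalFirst = nonGlobal.exec(str)[0];
var nonGlobalSecond = nonGlobal.exec(str)[0];

// With the g flag, each call resumes from where the previous match ended.
var globalRex = /([a-z]+)((\d+)([a-z]+))?,?/g;
var globalFirst = globalRex.exec(str)[0];
var globalSecond = globalRex.exec(str)[0];
```

Here nonGlobalFirst and nonGlobalSecond are both the first ingredient, while globalSecond has moved on to the second one.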

Using RegExp#exec rather than String#match

Your edit says you're using String#match to get an array of matches. I have to admit I hardly ever use String#match (I use RegExp#exec, as below.) When I use String#match with your regex, I get...very odd results that vary from browser to browser. Using a RegExp#exec loop doesn't do that, so that's what I'd do.
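For what it's worth, the ECMAScript specification defines String#match to behave differently depending on the g flag, which may explain at least part of the mismatch with C#. With g it returns only the full matched substrings and drops the capture groups. Without g it returns the first match along with its groups. A minimal sketch (variable names are my own, and I can't say whether this accounts for the browser differences mentioned above):

```javascript
var str = "water2cups,flour4cups,salt2teaspoon";

// With g: a flat array of whole matches, no capture groups.
var globalMatches = str.match(/([a-z]+)((\d+)([a-z]+))?,?/g);

// Without g: only the first match, followed by its capture groups.
var firstMatch = str.match(/([a-z]+)((\d+)([a-z]+))?,?/);

// Copy to plain arrays so the results are easy to compare or print.
var globalList = globalMatches.slice();
var firstList = firstMatch.slice();
```

So neither call on its own produces the nested array from the question, which is why a RegExp#exec loop is the usual route.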

Working Example

This code does what you're looking for:

var rex, str, match, index;

rex = /([a-z]+)((\d+)([a-z]+))?,?/g;
str = "water2cups,flour4cups,salt2teaspoon";

rex.lastIndex = 0; // Workaround for bug/issue in some implementations (they cache literal regexes and don't reset the index for you)
while (match = rex.exec(str)) {
    log("Matched:");
    for (index = 0; index < match.length; ++index) {
        log("&nbsp;&nbsp;match[" + index + "]: |" + match[index] + "|");
    }
}

(The log function just appends text to a div.)

My output for that is:

Matched:
  match[0]: |water2cups,|
  match[1]: |water|
  match[2]: |2cups|
  match[3]: |2|
  match[4]: |cups|
Matched:
  match[0]: |flour4cups,|
  match[1]: |flour|
  match[2]: |4cups|
  match[3]: |4|
  match[4]: |cups|
Matched:
  match[0]: |salt2teaspoon|
  match[1]: |salt|
  match[2]: |2teaspoon|
  match[3]: |2|
  match[4]: |teaspoon|

(Recall that in Javascript, match[0] will be the entire match; then match[1] and so on are your capture groups.)
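If you want the nested array from the question instead of logged text, the same loop can collect the capture groups. This is only a sketch: collectGroups is a hypothetical helper name, and it assumes you want to drop match[0] and keep just the groups.

```javascript
// Hypothetical helper: returns one array of capture groups per match.
function collectGroups(str) {
    // A fresh regex per call, so lastIndex always starts at 0.
    var rex = /([a-z]+)((\d+)([a-z]+))?,?/g;
    var rows = [];
    var match;

    while ((match = rex.exec(str)) !== null) {
        // slice(1) skips match[0] (the whole match) and keeps the groups.
        rows.push(match.slice(1));
    }
    return rows;
}

var rows = collectGroups("water2cups,flour4cups,salt2teaspoon");
```

Note that if the optional quantity part is missing (for example a bare "salt"), the last three entries of that row will be undefined.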

1

C# has the "@" verbatim-string prefix, which lets you write backslashes (\) without escaping them. I do not think JavaScript supports it, so you need to escape the backslash by adding another one. This should do the trick:

([a-z]+)((\\d+)([a-z]+))?,?

1 Comment

You only have to escape backslashes if you're using a string to create the regex, not if you're using literal notation.
