3

Using PHP 5.3.5. Not sure how this works on other versions.

I'm confused about using strings that hold numbers, e.g., '0x4B0' or '1.2e3'. The way PHP works with such strings seems inconsistent to me. Is it only me? Or is it a bug? Or an undocumented feature? Or am I just missing some magic sentence in the docs?

<?php

echo $str = '0x4B0', PHP_EOL;
echo "is_numeric() -> ", var_dump(is_numeric($str)); // bool(true)
echo "*1           -> ", var_dump($str * 1);         // int(1200)
echo "(int)        -> ", var_dump((int)$str);        // int(0)
echo "(float)      -> ", var_dump((float)$str);      // float(0)
echo PHP_EOL;

echo $str = '1.2e3', PHP_EOL;
echo "is_numeric() -> ", var_dump(is_numeric($str)); // bool(true)
echo "*1           -> ", var_dump($str * 1);         // float(1200)
echo "(int)        -> ", var_dump((int)$str);        // int(1)
echo "(float)      -> ", var_dump((float)$str);      // float(1200)
echo PHP_EOL;

In both cases, is_numeric() returns true. Also, in both cases, $str * 1 parses the string and returns a valid number (an integer in one case, a float in the other).

Casting with (int)$str and (float)$str gives unexpected results.

  • (int)$str in any case is able to parse only digits, with optional "+" or "-" in front of them.
  • (float)$str is more advanced and can parse something like ^[+-]?\d*(\.\d*)?(e[+-]?\d*)?, i.e., optional "+" or "-", followed by optional digits, followed by optional decimal point with optional digits, followed by optional exponent which consists of "e" with optional "+" or "-" followed by optional digits. Fails on hex data though.

Related docs:

  • is_numeric() - states that "Hexadecimal notation (0xFF) is allowed too but only without sign, decimal and exponential part". If a function meant to test whether a string holds numeric data returns true, I expect PHP to be able to convert such a string to a number. This seems to work with $str * 1, but not with casting. Why?
  • Converting to integer - states that "in most cases the cast is not needed, since a value will be automatically converted if an operator, function or control structure requires an integer argument". After such a statement, I expect both $s * 10 and (int)$s * 10 to work the same way and return the same result. However, as shown in the example, those expressions are evaluated differently.
  • String conversion to numbers - states that "Valid numeric data is an optional sign, followed by one or more digits (optionally containing a decimal point), followed by an optional exponent". "Exponent" is "e" or "E", followed by digits, e.g., 1.2e3 is valid numeric data. Sign ("+" or "-") is not mentioned. It does not mention hexadecimal values. This conflicts with the definition of "numeric data" used in is_numeric(). Then there is the suggestion "For more information on this conversion, see the Unix manual page for strtod(3)", and man strtod describes additional numeric values (including hex notation). So, after reading this, is hexadecimal data supposed to be valid or invalid numeric data?

So...

  • Is there (or, rather, should there be) any relation between is_numeric() and the way PHP treats strings when they are used as numbers?
  • Why do (int)$s, (float)$s and $s * 1 work differently, i.e., give completely different results, when $s is 0x4B0 or 1.2e3?
  • Is there any way to convert a string to a number and keep its value if it is written as 0x4B0 or as 1.2e3? floatval() does not work with hex at all, intval() needs $base to be set to 16 to work with hex, and typecasting with (int)$str and (float)$str sometimes works and sometimes does not, so these are not valid options. I'm also not considering $n *= 1;, as it looks more like data manipulation than conversion. Self-written functions are not considered either, as I'm looking for a native solution.

3 Answers

3

The direct casts (int)$str and (float)$str don't really work differently at all: They both read as many characters from the string as they can interpret as a number of the respective type.

For "0x4B0", the int-conversion reads "0" (OK), then "x" and stops, because it cannot convert "x" into an integer. Likewise for the float-conversion.

For "1.2e3", the int-conversion reads "1" (OK), then "." and stops. The float-conversion recognises the entire string as valid float notation.
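
The prefix-reading behaviour described above can be seen with a few simple inputs. This is only a minimal sketch: the sample strings are made up for illustration, and they were picked because they should give the same results on PHP 5.3 and on later versions.

```php
<?php

// Each cast consumes the longest leading prefix it understands
// and silently ignores the rest of the string.
var_dump((int)'12abc');     // int(12): the int parser stops at "a"
var_dump((float)'1.5abc');  // float(1.5): the float parser stops at "a"
var_dump((int)'0x4B0');     // int(0): stops at "x", leaving only the "0"
var_dump((float)'1.2e3');   // float(1200): the whole string is valid float notation
var_dump((int)'abc');       // int(0): no numeric prefix at all
```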

The automatic type recognition for an expression like $str * 1 is simply more flexible than the explicit casts. The explicit casts require the integers and floats to be in the format produced by %i and %f in printf, essentially.

Perhaps you can use intval and floatval rather than explicit casts-to-int for more flexibility, though.

Finally, your question "is hexadecimal data supposed to be valid or invalid numeric data?" is awkward. There is no such thing as "hexadecimal data"; hexadecimal is just a number base. What you can do is take a string like "4B0" and parse it as an integer in any number base between 2 and 36. [Correction: I originally suggested strtoul here, but there is no strtoul in PHP. intval has the equivalent functionality, see above.]
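
As a rough sketch of the intval approach: the optional second argument of intval() selects the base used to parse the string. The input strings below are illustrative only.

```php
<?php

// intval() with an explicit base parses the string in that base.
var_dump(intval('4B0', 16));   // int(1200)
var_dump(intval('0x4B0', 16)); // int(1200): a "0x" prefix is accepted for base 16
var_dump(intval('101', 2));    // int(5)
var_dump(intval('z', 36));     // int(35)

// Without a base argument, intval() behaves like the (int) cast.
var_dump(intval('0x4B0'));     // int(0)
```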

15 Comments

Did you mean base_convert()? Never heard of strtoul() :X
@KingCrunch: Yep, sorry, just realized that. Use intval() instead.
Oh, just realized that intval() has a second parameter. I usually use base_convert() here, but that seems equivalent. Nice to know :)
@KingCrunch: Well, they're different things: base_convert goes from string to string, intval goes from string to number, if you believe in types. Both have their uses! :-)
@binaryLV: yes yes yes, I know, I already fixed that... :-( I was thinking about the nearest C function that came to mind and was caught off guard by your mention of strtod. Sorry!
2

intval uses strtol, which recognizes octal and hex prefixes when the base parameter is zero, so:

var_dump(intval('0xef'));     // int(0)
var_dump(intval('0xff', 0));  // int(255)
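
A slightly fuller sketch of the same idea, assuming intval() follows strtol's prefix rules when the base is 0 (the input strings are illustrative):

```php
<?php

// With base 0, the prefix of the string decides the base.
var_dump(intval('0x1A', 0)); // int(26): "0x" means hexadecimal
var_dump(intval('012', 0));  // int(10): a leading "0" means octal
var_dump(intval('42', 0));   // int(42): anything else is read as decimal
```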

1 Comment

Nice find. I didn't know that $base can be set to 0. Seems to be an undocumented feature.
1

Is there (or, rather, should there be) any relation between is_numeric() and the way PHP treats strings when they are used as numbers?

There is no datatype called numeric in PHP; the is_numeric() function is more of a test for something that can be interpreted as a number by PHP.

As far as such number interpretation is concerned, adding a + in front of the value will actually make PHP convert it into a number:

$int = +'0x4B0';
$float = +'1.2e3';

You will find this explained in the manual page for strings; look for the section String conversion to numbers.

As it's triggered by an operator, I don't see why there needs to be a function in PHP that does the same. That would be superfluous.


Internally, PHP uses a function called zendi_convert_scalar_to_number for the add operator (presumably +), which makes use of is_numeric_string to obtain the number.

The exact same function is called internally by is_numeric() when used with strings.

So to trigger the native conversion function, I would just use the + operator. This will ensure that you'll get back the numeric pseudo-type (int or float).
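
Since is_numeric() and the + operator are said to share the same parsing routine, the two can be combined as a guard. The following is only a sketch: to_number() is a hypothetical helper name, not a PHP built-in, and the hex case is deliberately left out because how hex strings are treated may depend on the PHP version.

```php
<?php

// Hypothetical helper, not a built-in: convert only when PHP itself
// considers the value numeric, otherwise return null.
function to_number($value)
{
    return is_numeric($value) ? +$value : null;
}

var_dump(to_number('1.2e3')); // float(1200)
var_dump(to_number('42'));    // int(42)
var_dump(to_number('abc'));   // NULL
```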

Ref: /Zend/zend_operators.c; /ext/standard/type.c

5 Comments

Nice try, though, eval() is even worse than $n += 0; or $n *= 1;.
Yes it is, I would prefer your expression instead. Strictly, eval would require an is_numeric() check upfront that returns true on that string.
Found another native one. Looks like prefixing with a simple + is working as well.
@binaryLV, I was skimming through the source, and it looks like is_numeric() shares its conversion logic with + on strings.
Nice find (about the source).
