
Possible Unicode problem with string function search


Whilst writing a string replace function for Mathcad Express, I found what seems to be a problem with the string function search.   I wanted to check a string for the existence of the Unicode characters for 11 (↊) and 12 (↋) in the duodecimal system ... and it found them everywhere even when they weren't there at all.

 

Has this been noticed before?  Have I made some kind of error or unwitting assumption?

 

2020 05 17 B.png

 

Cheers,

 

Stuart

 

(The simple recursive algorithm shown above has O(n) recursion depth and is only good for strings shorter than about 4000 characters, i.e., under Mathcad's recursion depth limit. A slightly more complicated algorithm would use binary division to reduce the recursion stack to O(log n), which would handle much longer strings.)
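As a rough illustration of the binary-division idea mentioned above, here is a minimal sketch in Python rather than Mathcad. It is not the worksheet shown in the image. The function name replace_char is hypothetical, and the sketch assumes the target is a single character (as with ↊ and ↋), so that splitting the string in half can never cut a match in two.

```python
def replace_char(s: str, target: str, replacement: str) -> str:
    """Replace every occurrence of the single character `target` in `s`.

    Hypothetical helper: the string is split in half at each step, so the
    recursion depth grows as O(log n) instead of O(n).
    """
    if len(target) != 1:
        # Assumption of this sketch: a multi-character target could
        # straddle the split point and be missed.
        raise ValueError("target must be a single character")
    if len(s) == 0:
        return s
    if len(s) == 1:
        # Base case: one character, either swap it or keep it.
        return replacement if s == target else s
    mid = len(s) // 2
    left = replace_char(s[:mid], target, replacement)
    right = replace_char(s[mid:], target, replacement)
    return left + right


if __name__ == "__main__":
    print(replace_char("1↊2↋3", "↊", "A"))
```

With this scheme a 100,000-character string needs a recursion depth of only about 17, although the total number of calls is still proportional to the string length.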


Re: Possible Unicode problem with string function search

Interesting bug, indeed

Werner_E_0-1589757915252.png

Here is Mathcad 15 (even a bit worse) 😉

Werner_E_1-1589758164263.png

 

 


Re: Possible Unicode problem with string function search


@Werner_E wrote:

Interesting bug, indeed

Werner_E_0-1589757915252.png

Here is Mathcad 15 (even a bit worse) 😉


The search is worse than you may think.   Out of the first 8000 Unicode characters, search incorrectly found 1238 characters.   Quite some going. 

 

(the first 4 characters in the table below are genuine finds; I put them in to give confidence that my method was using search as intended)
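The kind of check described above could be sketched roughly as follows, in Python rather than Mathcad. This is not the worksheet from the images. The names false_positives and reference_search are hypothetical, the haystack "abcd" is made up, and the harness assumes the search function under test returns -1 for "not found". Python's own str.find stands in as a correct reference search.

```python
from typing import Callable, List


def false_positives(search: Callable[[str, str], int],
                    haystack: str,
                    limit: int = 8000) -> List[int]:
    """Return the code points below `limit` that `search` claims to find
    in `haystack` even though they do not occur in it.

    Assumption: search(haystack, needle) returns an index, or -1 when
    the needle is absent.
    """
    present = set(haystack)  # characters that are genuinely there
    wrong = []
    for code_point in range(1, limit):
        ch = chr(code_point)
        reported_found = search(haystack, ch) != -1
        if reported_found and ch not in present:
            wrong.append(code_point)
    return wrong


def reference_search(haystack: str, needle: str) -> int:
    """Stand-in for a correct search: Python's built-in str.find."""
    return haystack.find(needle)


if __name__ == "__main__":
    # A correct search should report no false positives.
    print(len(false_positives(reference_search, "abcd")))
```

Passing a wrapper around the search under test in place of reference_search would give a count comparable to the 1238 reported here, assuming the same haystack and range of characters.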

 

2020 05 22 H.png

2020 05 22 G.png

 


Here's the result of using the built-in search on the first 4000 characters vs a replacement search:

 

Built-in search

2020 05 22 I.png

 

Replacement search

2020 05 22 J.png

 

 
