Topic Title: Regex Errors
Created On: 17-Aug-2004 13:43
 17-Aug-2004 13:43


Alan Wong

Posts: 11
Joined: 2-Aug-2004

From playing around with the regex, I have encountered some odd problems that limit the usefulness of DXL's regex implementation. First, it lacks ? and {x,y}, but that isn't as problematic.

From my testing, it seems to reset "start" and "end" at each newline. This changes the meaning of a ^anything$ check: it tests whether anything exists on any one line of the string, rather than whether the string as a whole is anything. Note this example:


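// Pattern: one or more repetitions of "anything", anchored at both ends
Regexp emptyString = regexp("^(anything)+$")
// Note the trailing newline in the test string
string testtest = "anything\n"

if (emptyString testtest) {
    print "passed"
}
else {
    print "failed"
}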


The test string "testtest" contains a newline, which the regex does not mention, yet the check passes. Logically, the only strings that should match this regex are "anything", "anythinganything", and so on, without the newline. It passes because the first line matches.

This makes some checks either very difficult or impossible. (Note that in other regex implementations, this behavior can be turned on and off.)
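For comparison, here is a minimal sketch of how that on/off switch looks in another implementation. It uses Python's `re` module, not DXL, and the helper name `passes` is made up for illustration. Python's plain `$` also tolerates a single trailing newline, so the sample string puts the newline in the middle.

```python
import re

PATTERN = r"^(anything)+$"

def passes(text, per_line):
    """Return True if PATTERN is found somewhere in text.

    per_line=True turns on re.MULTILINE, so ^ and $ match at every
    line boundary (the behaviour described in this thread).
    per_line=False leaves ^ and $ anchored to the string as a whole.
    """
    flags = re.MULTILINE if per_line else 0
    return re.search(PATTERN, text, flags) is not None

print(passes("junk\nanything", per_line=True))   # True: the second line matches
print(passes("junk\nanything", per_line=False))  # False: the whole string does not
```

With the flag on, one matching line is enough for the whole string to pass. With it off, the same string fails.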

Edited: 17-Aug-2004 at 13:58 by Alan Wong
 17-Aug-2004 13:59


Paul Tiplady

Posts: 176
Joined: 28-Oct-2003

Ouch.

I think it's actually slightly worse than you suggest. Initial playing shows that any number of spaces on one side only of the new line also pass. If you put spaces on both sides, the test fails. Try setting testtest to:

"anything \nanything" -- pass
"anything \n anything" -- fail
"anything\n anything" -- pass

... time passes while I try some more combinations ...

Yukkier and yukkierer (!). So long as the character immediately after the end of the first string or immediately before the beginning of the second string is a new line (\n), you can put any amount of whitespace (space, tab or new line) in the gap, and the match still passes. For example:

"anything\n \n \t \t \n anything" -- pass!

Are you (Alan) reporting this as a bug to Telelogic? Which version are you using (I'm on V7.0 sp1, build 70210)? Can anyone else confirm that it's the same with other versions?

Paul.

-------------------------


Paul dot Tiplady at TRW dot com
TRW Automotive
 17-Aug-2004 15:55


Alan Wong

Posts: 11
Joined: 2-Aug-2004

Well, it technically isn't a bug. I did some more research on regex, and some editors allow a per-line definition of ^ and $. In Perl, for example, similar behavior is available through the /m modifier, which makes ^ and $ work per line. In some editors, \A and \Z represent the absolute start and end of the string, while ^ and $ are line-only. Perhaps if such additions were included, it would help.
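As a rough illustration of the absolute anchors mentioned above, the sketch below uses Python's `re` module (not DXL, which as far as this thread shows has no such anchors). The function names are hypothetical. In Python, \A and \Z match only at the very start and very end of the string, even when per-line mode is on.

```python
import re

# Both patterns are compiled with per-line mode switched on.
STRICT = re.compile(r"\A(anything)+\Z", re.MULTILINE)
PER_LINE = re.compile(r"^(anything)+$", re.MULTILINE)

def strict_match(text):
    """True only if the entire string is one or more 'anything'."""
    return STRICT.search(text) is not None

def per_line_match(text):
    """True if any single line is one or more 'anything'."""
    return PER_LINE.search(text) is not None

print(strict_match("anything\n"))    # False: the newline is not allowed
print(per_line_match("anything\n"))  # True: the first line matches
```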

And no I have not reported this.

Note that it currently passes IF any line in the string matches. So as long as there exists a line consisting of one or more repetitions of "anything", the entire string passes, which is why your examples pass and fail as they do.
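The "passes if any line matches" rule can be modelled in a few lines. This is a Python sketch of the behaviour as described in this thread, not DXL's actual implementation, and `any_line_matches` is a made-up name. Run against the test strings from the earlier reply, it gives the same pass/fail results.

```python
import re

LINE_PATTERN = re.compile(r"(anything)+")

def any_line_matches(text):
    """Model of the reported behaviour: split on newlines and pass
    if at least one line, taken alone, fully matches the pattern."""
    return any(LINE_PATTERN.fullmatch(line) for line in text.split("\n"))

samples = [
    "anything \nanything",               # second line matches
    "anything \n anything",              # neither line matches
    "anything\n anything",               # first line matches
    "anything\n \n \t \t \n anything",   # first line matches
]
for sample in samples:
    print(repr(sample), "pass" if any_line_matches(sample) else "fail")
```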

This causes problems when you need to do multiline matches, since some matches may not even be possible.
 17-Aug-2004 16:28


Paul Tiplady

Posts: 176
Joined: 28-Oct-2003

I understand now. I should have thought some more before firing off that reply.

Nevertheless, reading the manual (again) I would expect the ^ and $ to match the start and end of the string, not the start and end of any new-line delimited section within the string.

I think I can see how that might give rise to problems when doing some searches, but I haven't yet had the specific problems you're having with multiline searches. This is probably due to having done all the text manipulation I need by working with individual paragraphs (new-line delimited).

Good luck with your multi-line searches. I won't attempt to suggest solutions, because I don't understand your problem. If you think I can help (or just want someone to rant at) by all means contact me outside the forum (that's what my email address is there for...)

Paul.

-------------------------


Paul dot Tiplady at TRW dot com
TRW Automotive
 17-Aug-2004 17:20


Alan Wong

Posts: 11
Joined: 2-Aug-2004

In my case, I was trying to detect a string that contained only whitespace: \t, <space>, \n, and so on. Since the check passes if any single line matches, it doesn't work well ("asdf \n " will pass because its second line contains only whitespace).

I solved it by using a function that went through the string character by character... not much of a problem.

I just presented the problem so that others will notice it and know that it exists.
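The original function was not posted, so the following is only a guess at the general shape of a character-by-character check, written in Python rather than DXL. The name `is_whitespace_only` and the exact set of characters counted as whitespace are assumptions.

```python
# Assumption: these are the characters treated as whitespace.
WHITESPACE = {" ", "\t", "\n", "\r"}

def is_whitespace_only(text):
    """Walk the string one character at a time and fail on the
    first character that is not whitespace."""
    for ch in text:
        if ch not in WHITESPACE:
            return False
    return True

print(is_whitespace_only(" \t\n "))   # True
print(is_whitespace_only("asdf \n ")) # False
```

Note that this sketch treats an empty string as whitespace-only. Whether that is the desired result depends on the use case.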