Welcome to Telelogic Product Support
Telelogic DOORS
Topic Title: Replacing Hyphens with Soft Hyphens
Topic Summary: Soft hyphens appear to be confusing to rich text utilities.
Created On: 4-May-2006 02:40
 4-May-2006 02:40


Kent Power

Posts: 18
Joined: 28-Apr-2005

We maintain requirements documents in DOORS and export them into Word for review and formal release.  We've taken the Telelogic Word export routine and modified it to some degree for unique aspects of the documents.

One of the modifications replaces hyphens with soft hyphens so that strings such as "Figure 3-1" are not split between lines.  For those not familiar with the difference: a regular hyphen is always visible, and Word may break a line at it.  A soft hyphen is normally invisible; it is displayed only when the word in which it's embedded is broken across lines at that point.  As an aside, Word does not implement soft hyphens correctly as far as I can tell; a soft hyphen in Word is always visible, and acts like a non-breaking hyphen.  That is in fact why we use soft hyphens: we don't have to go to Unicode for non-breaking hyphens.

Our object text is maintained in rich text so that symbols such as degrees can appear.  We also, upon extraction, append an identifier to each requirement that contains hyphens (and which we don't want to have break across lines).

In both DOORS 7.1 and 8.0, findRichText and replaceRichText don't behave properly when used on strings that contain soft hyphens.  Our original approach was a while loop that used findRichText to locate hyphens and replaceRichText to replace them.  However, once a soft hyphen was in the rich text, findRichText matched it on the next pass through the loop, even though the search string was a regular hyphen, so the loop never exited.  We had to run the hyphen search on a copy of the original string, replacing each hyphen found there with an innocuous character like "z", and use the resulting starting position and length to insert the soft hyphens into the original string.
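The scratch-copy workaround described above can be sketched roughly as follows. This is Python rather than DXL and only illustrates the logic. The function find_hyphen_like is a hypothetical stand-in that models the reported behaviour (a soft hyphen matching a search for "-"); it is not the real findRichText, and plain strings stand in for rich text.

```python
SOFT_HYPHEN = "\u00ad"


def find_hyphen_like(s):
    """Hypothetical stand-in for the search described above.

    Assumption: it models the reported behaviour by treating a soft
    hyphen as a match for "-". Returns the first match index or -1.
    """
    for i, ch in enumerate(s):
        if ch in ("-", SOFT_HYPHEN):
            return i
    return -1


def soften_hyphens(text):
    """Replace every hyphen with a soft hyphen without looping forever.

    The search runs on a scratch copy in which each match is overwritten
    with "z", so a match is never found twice. Both replacements are one
    character long, so offsets in the scratch copy stay valid for the
    result string.
    """
    scratch = text
    result = text
    while True:
        pos = find_hyphen_like(scratch)
        if pos < 0:
            return result
        scratch = scratch[:pos] + "z" + scratch[pos + 1:]
        result = result[:pos] + SOFT_HYPHEN + result[pos + 1:]


if __name__ == "__main__":
    print(repr(soften_hyphens("Figure 3-1")))
```

Searching and replacing in the same string would never terminate with this stand-in, because the newly inserted soft hyphen would be matched again on every pass.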

In DOORS 7.1, the above technique sufficed for the object text with the appended identifier.  But in DOORS 8.0, the appended identifier, though rich text, is exported in such a way that Word never receives the soft hyphen, even though it is in the string built from the object text.  We have had to wrap the identifier, before and after, in the full rich text tagging information so that the same character set and font table are used, and search it for hyphens separately from the rich text of the requirement's object text.

There is, of course, no documentation on any of this in the DXL reference manual.  We thought that others might like to be aware of this problem.  If anyone has come across this and has a better solution (or if we're doing something wrong with respect to rich text), we'd like to hear about it.

Thanks.

P.S.  It might also be of interest to know that the length of a substring containing a soft hyphen, as returned from findRichText, is one less than the length of a corresponding substring containing a regular hyphen.  So, apparently not only does findRichText confuse hyphens and soft hyphens in its search algorithm, it also doesn't count the soft hyphen in the substring length information.  That's surprising, since one would expect this utility to work on the basis of the actual characters in the string, not on how a rich text formatter displays the text.
 4-May-2006 22:52


Louie Landale

Posts: 2070
Joined: 12-Sep-2002

Forgive ignorance here. A soft hyphen is used to tell MS-Word that if you must break inside this word, to please break at the location of the soft hyphen?

I must be in a minority here. I cannot stand any sort of hyphenated breaks: if MS-Word cannot fit the word on this line then please put the entire word on the next line (except in the rare case where the word is longer than a line). Just about every tool I know does this, including this Telelogic Forum post window. When combined with paragraph left alignment, the right side of my documents is rather choppy, but so what.

When we export, we use MS-Word Replace to change all hyphens to non-breaking hyphens ("^~" in the replace box) in order to prevent hyphenated words from breaking, such as object IDs ABC_123.

I never could make findRichText and replaceRichText work well; both use relative positions within the rich text string. I found a screwy method where I get the rich text and a parallel raw text. I search the raw text for things, then use cutRichText (whose indices are relative to the raw, not the rich, text) to take everything else away, and I'm left with a rich text chunk that parallels what I was looking for. Then I build the full string from these chunks. But then, I've got routines that convert rich to raw text and also routines that find substrings and offsets within raw texts.
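To illustrate the parallel raw/rich idea, here is a rough Python sketch (not DXL, and not my actual routines). It strips a small, simplified subset of RTF markup and records where each raw character sits in the rich string. The function names and the sample string are made up for illustration; real DOORS rich text has more cases than this handles, and cutRichText, as noted above, takes raw-relative indices directly, so no such map is needed there.

```python
def _is_letter(c):
    # RTF control words use ASCII letters only.
    return "a" <= c <= "z" or "A" <= c <= "Z"


def raw_with_map(rich):
    """Return (raw_text, offsets) for a simplified RTF-like string.

    offsets[k] is the index in `rich` of the k-th raw character.
    Handles group braces, control words such as \\b or \\fs20 (with one
    optional trailing space), and the escapes \\\\ \\{ \\}. Any other
    control symbol is skipped. This is an assumption-laden subset.
    """
    raw_chars, offsets = [], []
    i, n = 0, len(rich)
    while i < n:
        ch = rich[i]
        if ch in "{}":
            i += 1                        # group delimiters carry no text
        elif ch == "\\":
            nxt = rich[i + 1:i + 2]
            if _is_letter(nxt):
                j = i + 1
                while j < n and _is_letter(rich[j]):
                    j += 1                # control word name
                if j + 1 < n and rich[j] == "-" and rich[j + 1].isdigit():
                    j += 1                # sign of a numeric parameter
                while j < n and rich[j].isdigit():
                    j += 1                # numeric parameter
                if j < n and rich[j] == " ":
                    j += 1                # single delimiter space
                i = j
            elif nxt in ("\\", "{", "}"):
                raw_chars.append(nxt)     # escaped literal character
                offsets.append(i)
                i += 2
            else:
                i += 2                    # other control symbol: skipped
        else:
            raw_chars.append(ch)
            offsets.append(i)
            i += 1
    return "".join(raw_chars), offsets


def rich_span(rich, needle):
    """Find `needle` in the raw text; return its (start, end) in `rich`."""
    raw, offsets = raw_with_map(rich)
    k = raw.find(needle) if needle else -1
    if k < 0:
        return None
    return offsets[k], offsets[k + len(needle) - 1] + 1


if __name__ == "__main__":
    sample = r"{\b Figure} 3-1 shows {\i flow}"
    print(raw_with_map(sample)[0])
    print(rich_span(sample, "3-1"))
```

The search itself only ever sees the raw text, so markup cannot confuse it; the offset list is what ties a raw hit back to the rich string.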

It would be interesting for you to take a peek at your rich text strings to see how the soft-hyphens are actually embedded. Text=richTextNoOle(obj."Object Text"); print Text "\n" should do it.

- Louie
 5-May-2006 17:52


Shawn Stepper

Posts: 96
Joined: 6-Aug-2004

One option to consider is using the Word find and replace after exporting. You can do this from DOORS, at the end of the export script. I'm doing this now to insert zero-width characters to allow line breaks after certain characters. Things like URLs will not break by default, but inserting a zero-width character after each / in the URL will allow breaking. You could use the same method to replace regular hyphens with soft hyphens. Let me know if you are interested and I can upload a code snippet.
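As a plain-string illustration of the zero-width idea (Python, not the Word automation itself), the effect on a URL looks roughly like this. The choice of U+200B (zero-width space) and the function name are assumptions for the example; the thread does not say which zero-width character is used in Word.

```python
ZERO_WIDTH_SPACE = "\u200b"  # assumed choice of zero-width character


def allow_breaks_after_slash(url):
    """Insert a zero-width character after every "/" in the text.

    The visible text is unchanged, but a layout engine that honours the
    character gains a permitted break point after each slash.
    """
    return url.replace("/", "/" + ZERO_WIDTH_SPACE)


if __name__ == "__main__":
    print(repr(allow_breaks_after_slash("http://example.com/a/b")))
```

Removing the inserted characters gives back the original URL, so the change is easy to undo if the text is later copied elsewhere.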

-------------------------
Shawn Stepper
shawn.e.stepper@wellsfargo.com
 5-May-2006 18:27


Kent Power

Posts: 18
Joined: 28-Apr-2005

Shawn:

I would be interested.  So far, I've found some undocumented ways to manipulate the Word file (e.g., setting the no row break property in tables), but I've not been able to find much in the way of working with Word (mostly, I'm sure, because I don't know where to look).

Thanks.

Kent

 5-May-2006 19:10


Shawn Stepper

Posts: 96
Joined: 6-Aug-2004

Here you go! There are a lot of global variables used here, which you will need to define. Here are a few:

// Find and Replace
const int wdReplaceOne = 1
const int wdReplaceAll = 2
const int wdFindContinue = 1

const string cPropertyReplacement = "Replacement"
const string cPropertyFind = "Find"
const string cPropertyMatchWildcards = "MatchWildcards"
const string cPropertyWrap = "Wrap"

-------------------------
Shawn Stepper
shawn.e.stepper@wellsfargo.com
 31-May-2006 22:32


Kent Power

Posts: 18
Joined: 28-Apr-2005

Shawn:

Thanks for the example.  I was able to adapt it to doing post-extraction Word modification.  Sorry for the delay in response; working on DXL scripts isn't my day job (just my fun job ...).

Kent Power