
Converting to proper case


2007-11-15 08:19:52 PM
cppbuilder94
Does anyone know of a function available anywhere that can do a good job of
converting a string to proper case? It's definitely nontrivial to do this
properly.
Db
 
 

Re:Converting to proper case

"Db" <drbsname_at_aol.com> wrote:
Quote
Does anyone know of a function available anywhere that can do a good job of
converting a string to proper case? It's definitely nontrivial to do this
properly.
What's 'proper' case?
(FWIW, converting a string to upper case or lower case isn't necessarily
as simple as some might think.)
Alan Bellingham
--
Team Browns
<url:www.borland.com/newsgroups/>Borland newsgroup descriptions
<url:www.borland.com/newsgroups/netiquette.html>netiquette
 

Re:Converting to proper case

"Alan Bellingham" <XXXX@XXXXX.COM> wrote in message
Quote
"Db" <drbsname_at_aol.com>wrote:

What's 'proper' case?

(FWIW, converting a string to upper case or lower case isn't necessarily
as simple as some might think.)

This Is Proper Case.
A few examples of things that make this difficult...
words like McDonalds
roman numerals
acronyms
state abbreviations
I have a function that can do an atrocious job, I'd like to find a better
one. It really could be a job for a LALR parser.
What's difficult about upper and lower?
Db
 


Re:Converting to proper case

"Db" <drbsname_at_aol.com> wrote:
Quote
This Is Proper Case.
Ah. As in Proper Nouns. (I thought that might be the case (*ta-dish*),
but didn't want to make the assumption.)
Quote
A few examples of things that make this difficult...

words like McDonalds
roman numerals
acronyms
state abbreviations

I have a function that can do an atrocious job, I'd like to find a better
one. It really could be a job for a LALR parser.
Oh, my.
I think that one relies on the advent of true artificial intelligence.
Actually, I suspect a dictionary is required as a minimum, though
confusion between RAND and Rand (as an example) may still cause
problems.
Quote
What's difficult about upper and lower?
There's at least one language (Turkish, perhaps) in which more than one
lower case letter converts to the same upper case character. Now take
that upper case and convert it back. Which lower case letter does it
become?
And I'm also of the impression that some letters map differently
according to which language they're being used in. Accented letters may,
or may not, lose their accents when going to upper case.
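To make the round-trip problem concrete, here is a minimal sketch. It assumes the Turkish dotted/dotless i pair (i/İ and ı/I) and uses small hand-written mapping functions that cover only ASCII plus those two letters. All function names are made up for illustration, and a real program would use a proper Unicode library rather than tables like these.

```cpp
#include <string>

// Hypothetical helpers: language-neutral mapping, ASCII plus the two
// Turkish letters only. U+0131 is dotless i, U+0130 is dotted capital I.
char32_t upperDefault(char32_t c) {
    if (c >= U'a' && c <= U'z') return c - (U'a' - U'A');
    if (c == U'\u0131') return U'I';      // dotless i also becomes I
    return c;
}

char32_t lowerDefault(char32_t c) {
    if (c >= U'A' && c <= U'Z') return c + (U'a' - U'A');
    if (c == U'\u0130') return U'i';      // simple mapping of dotted capital I
    return c;
}

// Turkish-aware variants: i <-> dotted capital I, dotless i <-> I.
char32_t upperTurkish(char32_t c) {
    if (c == U'i') return U'\u0130';
    return upperDefault(c);
}

char32_t lowerTurkish(char32_t c) {
    if (c == U'I') return U'\u0131';
    return lowerDefault(c);
}

// Apply a per-character mapping to a whole string.
std::u32string mapAll(std::u32string s, char32_t (*f)(char32_t)) {
    for (char32_t& c : s) c = f(c);
    return s;
}
```

With the language-neutral functions, both i and dotless i upper-case to I, so lower-casing again always gives i and the dotless letter is lost. With the Turkish-aware functions each letter survives the round trip, but only because the code was told which language it is dealing with.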
Alan Bellingham
 

Re:Converting to proper case

"Db" <drbsname_at_aol.com> wrote in message
Quote
This Is Proper Case.

A few examples of things that make this difficult...

words like McDonalds
roman numerals
acronyms
state abbreviations

I have a function that can do an atrocious job, I'd like to find a better
one. It really could be a job for a LALR parser.
Don't forget the common French and Irish abbreviations which find their way
into English, like :-
D'Arcy
O'Leary
L'Armagne
I have stuck with the crude but working one I wrote for myself, which sounds
no better than yours, except it takes those above into account. It does not
handle McDonalds, though. It's a case of just where do you draw the line?
You probably shouldn't capitalise the words a, an, if, the, of, ...
etcetera.
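The two rules above (capitalise after a D', O' or L' prefix, and leave small words alone) can be sketched roughly as follows. This is only an illustration, not the function described in this post: the names capitaliseWord and properCase are hypothetical, it is ASCII-only, it collapses runs of whitespace to single spaces, and it assumes the first word is always capitalised even if it is a small word.

```cpp
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Capitalise one word that is already lower case, treating a leading
// D', O' or L' as a prefix whose following letter is also capitalised.
std::string capitaliseWord(std::string w) {
    if (w.empty()) return w;
    w[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(w[0])));
    if (w.size() > 2 && w[1] == '\'' &&
        (w[0] == 'D' || w[0] == 'O' || w[0] == 'L')) {
        w[2] = static_cast<char>(std::toupper(static_cast<unsigned char>(w[2])));
    }
    return w;
}

// Proper-case a whitespace-separated string, leaving a few small words in
// lower case unless they come first.
std::string properCase(const std::string& text) {
    static const std::set<std::string> small = {"a", "an", "if", "the", "of"};
    std::istringstream in(text);
    std::string word, out;
    bool first = true;
    while (in >> word) {
        for (char& c : word)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        const bool keepLower = !first && small.count(word) > 0;
        if (!first) out += ' ';
        out += keepLower ? word : capitaliseWord(word);
        first = false;
    }
    return out;
}
```

For example, properCase("d'arcy o'leary") gives "D'Arcy O'Leary" and properCase("the return of the king") gives "The Return of the King". It still knows nothing about McDonalds, acronyms or roman numerals.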
--
Mark Jacobs
jacobsm.com
 

Re:Converting to proper case

Db <drbsname_at_aol.com> wrote:
Quote
[...]
What's difficult about upper and lower?
Thomas Maeder has posted on this one here several
times over the last few years. Search for those
postings.
Quote
Db
Schobi
--
XXXX@XXXXX.COM is never read
I'm HSchober at gmx dot de
"A patched buffer overflow doesn't mean that there's one less way attackers
can get into your system; it means that your design process was so lousy
that it permitted buffer overflows, and there are probably thousands more
lurking in your code."
Bruce Schneier
 

Re:Converting to proper case

"Mark Jacobs" <www.jacobsm.com/mjmsg.htm?Borland Newsgroup> wrote:
Quote
Don't forget the common French and Irish abbreviations which find their way
into English, like :-

D'Arcy
O'Leary
L'Armagne
And not to mention those annoying people like e e cummings.
Alan Bellingham
--
Team Browns
ACCU Conference 2008: 2-5 April 2008 - Oxford, UK
 

Re:Converting to proper case

"Db" <drbsname_at_aol.com> wrote in message
Quote
Does anyone know of a function available anywhere that can do a good job
of converting a string to proper case? It's definitely nontrivial to do
this properly.

Db
Well, you're certainly correct about 'nontrivial', especially in the case of
sentences.
We have an app where our customers can input their customers' names and
addresses. There are separate fields for first name, last name, etc. There
is also a check box where proper casing can be turned on or off for fairly
obvious reasons - annoying.
I spent quite a bit of time putting these functions together years ago and
they're reasonably accurate for the English language. But, that's why the
'turn it off' check box is there. Here's an example.
input:
john macfarland 32 w north street wattlebury
output:
John MacFarland 32 W North Street Wattlebury
There are rules but as we know, all rules are meant to be broken. Suppose
John actually spelled his last name as 'Macfarland'?
And again, while fairly complex, it only deals with names and addresses.
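A rough sketch of the Mc/Mac idea, with an override table for people who spell it differently, might look like the code below. It is not the production code mentioned above: the function names and the length thresholds are assumptions chosen for illustration, it is ASCII-only, and words such as "machine" would need an override entry to avoid coming out as "MacHine".

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Lower-case an ASCII string.
std::string lowerAscii(std::string s) {
    for (char& c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Case one word of a name or address. 'overrides' maps a lower-case word
// to the exact spelling wanted, and wins over every rule below.
std::string nameCaseWord(const std::string& raw,
                         const std::map<std::string, std::string>& overrides) {
    std::string w = lowerAscii(raw);
    const auto it = overrides.find(w);
    if (it != overrides.end()) return it->second;
    if (w.empty()) return w;
    w[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(w[0])));
    // Assumed thresholds: short words such as "Mack" or "Macon" are left alone.
    std::size_t prefix = 0;
    if (w.compare(0, 3, "Mac") == 0 && w.size() > 5) prefix = 3;
    else if (w.compare(0, 2, "Mc") == 0 && w.size() > 3) prefix = 2;
    if (prefix != 0)
        w[prefix] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(w[prefix])));
    return w;
}

// Case every whitespace-separated word, joining with single spaces.
std::string nameCase(const std::string& text,
                     const std::map<std::string, std::string>& overrides) {
    std::istringstream in(text);
    std::string word, out;
    while (in >> word) {
        if (!out.empty()) out += ' ';
        out += nameCaseWord(word, overrides);
    }
    return out;
}
```

With an empty override table, nameCase("john macfarland 32 w north street", {}) gives "John MacFarland 32 W North Street". Passing {{"macfarland", "Macfarland"}} as the overrides gives "John Macfarland 32 W North Street" instead, which is about as far as rules can go without asking John.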
What irks me the most is that when doing a technical doc in Word I start a
sentence with myFunc() and Word happily converts it to MyFunc().
Good luck Db, you'll need it.
- Arnie