Wednesday, November 12th 2014, 9:25pm

QString encoding confusion

In my application constructor I set the codec for C-strings to be UTF-8:

Source code

QTextCodec::setCodecForCStrings( QTextCodec::codecForName( "UTF-8" ) );

I was under the impression that doing so would cause the following:
1. QString::fromLocal8Bit() to interpret C-strings as UTF-8 encoded.
2. QString::QString( const char* ) to interpret C-strings as UTF-8 encoded.
3. QString::toLocal8Bit() to generate UTF-8 encoded C-strings.
4. qPrintable() to generate UTF-8 encoded C-strings.

Unfortunately, I've found that assumption (2) was incorrect as QString::QString( const char* ) always employs QString::fromAscii() to interpret C-strings (see http://qt-project.org/doc/qt-4.8/qstring.html#QString-8).

Assumptions (3) and (4) are coupled as qPrintable() employs QString::toLocal8Bit() (see http://qt-project.org/doc/qt-4.8/qtglobal.html#qPrintable).

That leaves assumptions (1) and (3) to be proven, which should have been straightforward.
However, I've encountered some strangeness and was hoping someone here could provide some insight.

I wrote a sample program, with the source file encoded as UTF-8:

Source code

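#include <QDebug>
#include <QString>
#include <QTextCodec>

int
main( int argc, char** argv )
{
	QTextCodec::setCodecForCStrings( QTextCodec::codecForName( "UTF-8" ) );

	// The source file is saved as UTF-8, so this literal holds UTF-8 bytes.
	const char* utf8String = "30 °C";
	
	// First run: decode the bytes with the locale codec.
	QString utf8QString = QString::fromLocal8Bit( utf8String );
	
	qDebug() << "str.toUtf8()" << utf8QString.toUtf8();
	qDebug() << "str.toLocal8Bit()" << utf8QString.toLocal8Bit();
	qDebug() << "qPrintable()" << qPrintable( utf8QString );

	return 0;
}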

This results in the output:

Source code

str.toUtf8() "30 °C"
str.toLocal8Bit() "30 °C"
qPrintable() 30 °C

I'm confused as to why QString::toUtf8() and QString::toLocal8Bit() do not have the same result, since I've set the encoding for C-strings to be UTF-8.
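
One possible reading of this output, sketched in plain C++ without Qt. In Qt 4, fromLocal8Bit() and toLocal8Bit() go through QTextCodec::codecForLocale(), which setCodecForCStrings() does not change. Assuming the string was decoded with fromLocal8Bit() (as the experiment described next implies) and assuming the locale codec is a single-byte encoding such as Latin-1 (the post does not state the locale), the two UTF-8 bytes of "°" are decoded as two separate characters, and toUtf8() then encodes each of them again. The helpers decodeLatin1, encodeUtf8, and encodeLatin1 are hypothetical stand-ins for what the codecs do, not Qt functions.

```cpp
#include <string>

// Decode bytes as Latin-1: each byte becomes the code point of the same value.
std::u32string decodeLatin1(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char32_t>(b));
    }
    return out;
}

// Encode code points as UTF-8. Only one- and two-byte sequences are handled,
// which covers everything decodeLatin1 can produce.
std::string encodeUtf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back('?');  // longer sequences are out of scope for this sketch
        }
    }
    return out;
}

// Encode code points as Latin-1; anything above U+00FF becomes '?'.
std::string encodeLatin1(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        out.push_back(cp <= 0xFF ? static_cast<char>(cp) : '?');
    }
    return out;
}
```

Feeding the bytes 33 30 20 C2 B0 43 through decodeLatin1 and then encodeUtf8 yields 33 30 20 C3 82 C2 B0 43, which a UTF-8 terminal shows as "30 °C". The Latin-1 round trip returns the original bytes, which the same terminal shows as "30 °C".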

As an experiment, I changed QString::fromLocal8Bit() to QString::fromUtf8(), which results in the output:

Source code

str.toUtf8() "30 °C"
str.toLocal8Bit() "30 ?C"
qPrintable() 30 ?C

I'm confused as to why QString::fromUtf8() and QString::fromLocal8Bit() do not have the same result, since I've set the encoding for C-strings to be UTF-8.
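
The second output can be sketched in plain C++ as well, with hypothetical helper names (decodeUtf8, encodeSingleByte) rather than Qt calls. fromUtf8() decodes "°" to the single code point U+00B0 regardless of any codec setting, so toUtf8() round-trips cleanly. What toLocal8Bit() produces then depends on the locale codec: an ASCII-only codec has no byte for U+00B0 and would typically substitute '?', while Latin-1 would emit the lone byte 0xB0, which is not valid UTF-8 on its own. Which of these applies here is an assumption, since the post does not say what the locale is.

```cpp
#include <cstddef>
#include <string>

// Decode UTF-8, handling only one- and two-byte sequences (enough for "°").
// Anything else is mapped to U+FFFD; this is a sketch, not a full decoder.
std::u32string decodeUtf8(const std::string& bytes) {
    std::u32string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (b < 0x80) {
            out.push_back(static_cast<char32_t>(b));
        } else if ((b & 0xE0) == 0xC0 && i + 1 < bytes.size()) {
            const unsigned char next = static_cast<unsigned char>(bytes[++i]);
            out.push_back(static_cast<char32_t>(((b & 0x1F) << 6) | (next & 0x3F)));
        } else {
            out.push_back(static_cast<char32_t>(0xFFFD));
        }
    }
    return out;
}

// Encode into a single-byte codec that can represent code points up to
// maxCodePoint (0x7F models ASCII, 0xFF models Latin-1). Anything the codec
// cannot represent is replaced with '?'.
std::string encodeSingleByte(const std::u32string& text, char32_t maxCodePoint) {
    std::string out;
    for (char32_t cp : text) {
        out.push_back(cp <= maxCodePoint ? static_cast<char>(cp) : '?');
    }
    return out;
}
```

With the ASCII limit the result is the text "30 ?C"; with the Latin-1 limit it is the bytes 33 30 20 B0 43. In Qt 4 the setting that governs toLocal8Bit() and qPrintable() is QTextCodec::setCodecForLocale(), not setCodecForCStrings().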

Any help is appreciated.