Reading font names from font files on Windows, Linux and Mac? - c++

I'm using C++ and Qt and I want to find a cross-platform solution for reading font names from font files. Looking through the Qt font classes, I can't see any solution. The closest I could get is the below code which reads the font family, but not the full font name, so "Arno Pro Regular" just returns "Arno Pro", "Blogger Sans Bold" just returns "Blogger Sans", and so forth.
int id = QFontDatabase::addApplicationFont(fileName);
if (id != -1)
{
QStringList fontFamily = QFontDatabase::applicationFontFamilies(id);
QFontDatabase::removeApplicationFont(id);
}
I've searched around, but I could only find solutions in Java, .NET and other languages, and nothing I can use in C++.
Is there any cross-platform way of reading font names from font files in C++?
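One dependency-free route, offered here as a rough sketch rather than a tested answer: TrueType and OpenType files keep their names in the name table, where name ID 4 is the full font name. The helper names below (readFile, fullFontName) are made up for illustration. Only Windows-platform records (stored as UTF-16BE) are considered, surrogate pairs are not combined, and .ttc collections are not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Big-endian reads; out-of-range offsets yield 0 instead of reading past the end.
static std::uint16_t u16(const Bytes& d, std::size_t o) {
    if (o + 2 > d.size()) return 0;
    return static_cast<std::uint16_t>((d[o] << 8) | d[o + 1]);
}
static std::uint32_t u32(const Bytes& d, std::size_t o) {
    return (static_cast<std::uint32_t>(u16(d, o)) << 16) | u16(d, o + 2);
}

// Append one BMP code unit as UTF-8 (simplification: surrogate pairs are not combined).
static void appendUtf8(std::string& s, unsigned cp) {
    if (cp < 0x80) {
        s += static_cast<char>(cp);
    } else if (cp < 0x800) {
        s += static_cast<char>(0xC0 | (cp >> 6));
        s += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        s += static_cast<char>(0xE0 | (cp >> 12));
        s += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        s += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: load a whole file into memory.
Bytes readFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return Bytes(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}

// Hypothetical helper: return name ID 4 (full font name) as UTF-8, or "" if not found.
std::string fullFontName(const Bytes& d) {
    const unsigned numTables = u16(d, 4);              // table count in the 12-byte file header
    for (unsigned t = 0; t < numTables; ++t) {
        const std::size_t rec = 12 + 16u * t;          // record: tag, checksum, offset, length
        if (rec + 16 > d.size()) break;
        if (std::memcmp(d.data() + rec, "name", 4) != 0) continue;
        const std::size_t table = u32(d, rec + 8);
        const unsigned count = u16(d, table + 2);
        const std::size_t strings = table + u16(d, table + 4);
        for (unsigned i = 0; i < count; ++i) {
            const std::size_t r = table + 6 + 12u * i; // six 16-bit fields per name record
            const unsigned platform = u16(d, r);
            const unsigned nameId = u16(d, r + 6);
            const std::size_t len = u16(d, r + 8);
            const std::size_t pos = strings + u16(d, r + 10);
            if (platform != 3 || nameId != 4 || pos + len > d.size()) continue;
            std::string out;
            for (std::size_t k = 0; k + 1 < len; k += 2) appendUtf8(out, u16(d, pos + k));
            return out;
        }
    }
    return std::string();
}
```

With this, fullFontName(readFile(fileName)) should return the full name for fonts that carry a Windows-platform name record, and an empty string otherwise.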

Related

How can I access GDEF / GPOS / GSUB of a ttf font (not a otf)?

One main question with several sub-questions (sorry).
I'm trying to read GSUB info (and other tables) from a TTF font. How can I do that? Which library could I use?
GSUB is a substitution table that describes how glyphs appearing next to each other must be replaced by other glyphs. It is very common in many languages; in English it is rarer, and the best-known example is the ligature.
It is well documented for OpenType fonts (otf), and I know it also exists in TrueType fonts (ttf).
But how can I access it? Is there a library for this, such as FreeType + HarfBuzz?
It seems FreeType only gives access to these tables for OTF, not TTF. Am I right?
FT_OpenType_Validate: This function only works with OpenType fonts
And is HarfBuzz optional or mandatory for this?
The documentation is poor (from my point of view), so I'm looking for experience reports and working examples.
It also seems hard to make FreeType and HarfBuzz work together on Windows. Is that really needed, and if so, how?
Sources:
mactype
official poor example
My test code, which does not work because FreeType reports GSUB as an "Unimplemented Feature":
#include <ft2build.h>
#include FT_FREETYPE_H
#include FT_OPENTYPE_VALIDATE_H
#include <stdexcept>
int main(int argc, char* argv[])
{
FT_Library ftLibrary;
FT_Error errorLib = FT_Init_FreeType(&ftLibrary);
if (errorLib)
throw std::runtime_error("Couldn't initialize the library: FT_Init_FreeType() failed");
FT_Face ftFace;
FT_Error errorFace = FT_New_Face(ftLibrary, argv[1], 0, &ftFace); //getting first face
if (errorFace)
throw std::runtime_error("Couldn't load the font file: FT_New_Face() failed");
FT_Bytes BASE = NULL;
FT_Bytes GDEF = NULL;
FT_Bytes GPOS = NULL;
FT_Bytes GSUB = NULL;
FT_Bytes JSTF = NULL;
FT_Error errorValidate = FT_OpenType_Validate(ftFace, FT_VALIDATE_GSUB, &BASE, &GDEF, &GPOS, &GSUB, &JSTF);
if (errorValidate)
throw std::runtime_error("Couldn't validate opentype datas");
//7=Unimplemented_Feature
FT_OpenType_Free(ftFace, BASE);
FT_OpenType_Free(ftFace, GDEF);
FT_OpenType_Free(ftFace, GPOS);
FT_OpenType_Free(ftFace, GSUB);
FT_OpenType_Free(ftFace, JSTF);
FT_Done_Face(ftFace);
FT_Done_FreeType(ftLibrary);
return 0;
}
On Windows you have to enable the OpenType validation module. If you're using Visual Studio to build FreeType, follow the steps below.
In freetype/config/ftmodule.h add this:
FT_USE_MODULE( FT_Module_Class, otv_module_class )
Then, in Solution Explorer, add src/otvalid/otvalid.c to the project.
You are now ready to build the library. Don't forget to update your project with the new library or object files.
With this I was able to access the GPOS table. But don't be too optimistic: FreeType's support for OpenType tables is very limited, and all you really get is a raw pointer to bytes. To get useful data out of it you have to parse those bytes according to the OpenType spec, which is not a trivial task given how complex the spec is. I would even call it overcomplicated, but it is still possible.
If you decide to do it, remember that OpenType stores all multi-byte values in big-endian order, so on a little-endian machine (x86, most ARM) you have to reverse the byte order of every value you read from a table.
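To make the byte-order point concrete, here is a small sketch of reading big-endian values from the raw pointer that FT_OpenType_Validate hands back (FT_Bytes is a pointer to const unsigned bytes). The names readU16BE, readU32BE, GsubHeader and parseGsubHeader are hypothetical. The field layout follows the version 1.0 GSUB header in the OpenType spec, and a real parser would also need bounds checks.

```cpp
#include <cstdint>

// Assemble big-endian integers byte by byte, so the result is the same
// on little-endian and big-endian hosts.
inline std::uint16_t readU16BE(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

inline std::uint32_t readU32BE(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// First five 16-bit fields of a version 1.0 GSUB table.
struct GsubHeader {
    std::uint16_t majorVersion;
    std::uint16_t minorVersion;
    std::uint16_t scriptListOffset;   // offsets are relative to the start of the table
    std::uint16_t featureListOffset;
    std::uint16_t lookupListOffset;
};

// Assumption: 'table' points at 10 or more valid bytes (no bounds checking here).
inline GsubHeader parseGsubHeader(const unsigned char* table) {
    GsubHeader h;
    h.majorVersion      = readU16BE(table + 0);
    h.minorVersion      = readU16BE(table + 2);
    h.scriptListOffset  = readU16BE(table + 4);
    h.featureListOffset = readU16BE(table + 6);
    h.lookupListOffset  = readU16BE(table + 8);
    return h;
}
```

The pointer returned in the GSUB argument could be passed straight to parseGsubHeader; everything deeper in the table (script, feature and lookup lists) has to be decoded the same way.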

not accurate tesseract OCR data from a png image in QT c++

I am using the Tesseract OCR C++ library in Qt to get text from a PNG image,
using this code:
const char* lang = "eng";
QString filename = "D:/image.png";
tesseract::TessBaseAPI tess;
tess.Init(NULL, lang, tesseract::OEM_DEFAULT);
tess.SetPageSegMode(tesseract::PSM_AUTO);
FILE* fin = fopen(filename.toStdString().c_str(), "rb");
if (fin == NULL)
{
std::cout << "Cannot open " << filename.toStdString().c_str() << std::endl;
return;
}
fclose(fin);
STRING text;
if (tess.ProcessPages(filename.toStdString().c_str(), NULL, 0, &text))
{
ui->plainTextEdit->setPlainText(QString::fromUtf8(text.string()));
//show result in plainttext qt gui
}
But the result is not accurate enough for the data in the table, and it gives me strange characters. When I run the same image through an online OCR website, the conversion is 100% accurate. So what produces the wrong text here: is it a problem with the library, or with my code? Or is there a better free library I could use for more accurate results?
I got the image from a PDF, using Ghostscript to render it at good quality, so the OCR library should be able to read the data correctly.
link to download the image
website I use to get the accurate ocr
I am not experienced with C++, but I think your problem most likely relates to the line below:
tess.Init(NULL, lang, tesseract::OEM_DEFAULT);
The first argument must point to the tessdata folder. Instead of NULL you can pass the folder name, for example "C:/tessdata/". Again, I am not experienced with C++, so you will have to decide between slash "/" and backslash "\". This folder should contain the language file(s).
As Eddge mentioned in his comment, you should apply some image preprocessing; there are a bunch of scripts for ImageMagick.
And of course OpenCV helps a lot with this as well.
The next point could be the PSM mode, although the default should be enough to extract whole-page information.
Also, the result of the online OCR is not 100% accurate as you claimed:
There is "1 S Days" instead of "15 Days"
There is "Mail: finance(a)" instead of "E Mail: finance@"
There is "TiA THE GREEN HOL1 5" instead of "T/A THE GREEN HOU 5"
etc.
Which Tesseract version are you using? I highly recommend 3.05. (4.0 gives much better results, but it has not been officially released yet.)
Also the following link could help you with your results: https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality
P.S. I hope you are allowed to share such financial documents publicly ;)

Qt- Add custom font from resource

I added this font to resource: BYekan.ttf
I want to use this font in my application. I've tried this :
QFont font(":/images/font/BYekan.ttf");
nLabel->setFont(font);
nLabel->setText(tr("This is for test"));
layout->addWidget(nLabel);
But it doesn't seem to work. How do I use it?
Edit:
After reading this question, I tried again:
int fontID(-1);
bool fontWarningShown(false);
QFile res(":/images/font/Yekan.ttf");
if (res.open(QIODevice::ReadOnly) == false) {
if (fontWarningShown == false) {
QMessageBox::warning(0, "Application", (QString)"Impossible d'ouvrir la police " + QChar(0x00AB) + " DejaVu Serif " + QChar(0x00BB) + ".");
fontWarningShown = true;
}
}else {
fontID = QFontDatabase::addApplicationFontFromData(res.readAll());
if (fontID == -1 && fontWarningShown == false) {
QMessageBox::warning(0, "Application", (QString)"Impossible d'ouvrir la police " + QChar(0x00AB) + " DejaVu Serif " + QChar(0x00BB) + ".");
fontWarningShown = true;
}
else
nLabel->setFont(QFont(":/images/font/Yekan.ttf", 10));
}
I compared this font with another font, but there isn't any difference in Qt. Why?
int id = QFontDatabase::addApplicationFont(":/fonts/monospace.ttf");
QString family = QFontDatabase::applicationFontFamilies(id).at(0);
QFont monospace(family);
In QML you can
FontLoader { id: font; source: "/fonts/font.otf" }
I had the same problem as reported in the original question. However, the solution presented above (the answer beginning with the line "int id = QFontDatabase::addApplicationFont...") did not work for me, as can also be seen in the comments above: addApplicationFont returned -1.
The reason is the leading ':' in the string passed to addApplicationFont. I removed it, so the string is treated as an ordinary file-system path instead of a resource path. Now it works for me (tested with Qt 5.5.1 and Qt 4.8.6 on Linux) and returns 0. On Windows it might be necessary to add a drive letter in front.
Note: I had to provide the full path to the font file (e.g. /usr/share/fonts/ttf/droid/DroidSansFallbackFull.ttf)
No, see, I didn't do any of this. What I did instead: there's a ~/.fonts/ directory; if it doesn't exist, you can create it.
Now copy the TTF into this directory, and Linux will see it. However, in my case I'm writing a Qt application, and fonts are referenced by name, so how do you find out the name Linux uses?
If you run the command:
fc-list
It dumps all the font information system-wide, and you can search the output for the font that you've added.
The output looks something like this:
...
/usr/share/texmf/fonts/opentype/public/lm/lmsans17-oblique.otf: Latin Modern Sans,LM Sans 17:style=17 Oblique,Italic
/home/XXX/.fonts/PAPYRUS.TTF: Papyrus:style=Regular,Normal,obyčejné,Standard,Κανονικά,Normaali,Normál,Normale,Standaard,Normalny,Обычный,Normálne,Navadno,Arrunta
/usr/share/fonts/X11/Type1/n019064l.pfb: Nimbus Sans L:style=Bold Condensed Italic
...
The part of each line between the first colon and ":style=" is the name of the font as seen from inside Linux for that user. So here the names are "Latin Modern Sans,LM Sans 17" / "Papyrus" / "Nimbus Sans L". Linux sees the font, and so will all applications running as your user (GIMP, your window manager, Qt applications, etc.).
Inside your Qt application you refer to the one you are interested in; in my case that is the Papyrus font:
tabWidget->setFont( QFont( "Papyrus",10 ) );
And then, sure enough, the Qt application just picks up the font.
If you want to make the font available system-wide, you have to locate the font directories. From what I can see it's /usr/share/fonts/truetype/, where you create a subdirectory for your fonts, but other distros may use a different location, so double-check that. Then put the TTF files in there. If you do that, consider running fc-cache -fv, which walks the font directories looking for newly added fonts.
With anything font-related under Linux, run fc-list. It clears up all sorts of confusion and misunderstandings and sheds light on the otherwise dark and mysterious world of Linux fonts.
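If you capture fc-list output in a program, the line format shown above (path, then ": ", then comma-separated family names, then ":style=" and comma-separated style names) can be split with a few string operations. The sketch below is only an illustration: FcListEntry and parseFcListLine are hypothetical names, and it assumes the default output format without escaped characters or extra fields.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One parsed line of default fc-list output.
struct FcListEntry {
    std::string path;
    std::vector<std::string> families;
    std::vector<std::string> styles;
};

// Split "a,b,c" into {"a", "b", "c"}.
static std::vector<std::string> splitCommas(const std::string& s) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    for (;;) {
        const std::size_t comma = s.find(',', start);
        if (comma == std::string::npos) {
            parts.push_back(s.substr(start));
            break;
        }
        parts.push_back(s.substr(start, comma - start));
        start = comma + 1;
    }
    return parts;
}

// Returns false if the line does not contain the "path: names" separator.
bool parseFcListLine(const std::string& line, FcListEntry& out) {
    const std::size_t sep = line.find(": ");
    if (sep == std::string::npos) return false;
    out.path = line.substr(0, sep);

    const std::string rest = line.substr(sep + 2);
    const std::string key = ":style=";
    const std::size_t style = rest.find(key);

    out.families = splitCommas(rest.substr(0, style));  // everything before ":style="
    out.styles.clear();
    if (style != std::string::npos)
        out.styles = splitCommas(rest.substr(style + key.size()));
    return true;
}
```

The first entry in families is the name you would then hand to QFont, as in the Papyrus example above.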

Why Non-Unicode apps system locale makes Unicode fonts with symbol charset displayed incorrectly?

I'm trying to display Unicode chars from Wingdings font (it's Unicode TrueType font supporting symbol charset only).
It's displayed correctly on my Win7/64 system using corresponding regional OS settings:
Formats: Russian
Location: Russia
System locale (AKA Language for Non-Unicode applications): English
But if I switch System locale to Russian, Unicode characters with codes > 127 are displayed incorrectly (replaced with boxes).
My application is built with the Unicode character set in Visual Studio and calls only Unicode Windows API functions.
I also noticed that several Windows apps display such chars incorrectly with symbol fonts (Symbol, Wingdings, Webdings etc.), e.g. Notepad and Beyond Compare 3. But WordPad and MS Office apps aren't affected.
Here is minimal code snippet (resources cleanup skipped for brevity):
LOGFONTW lf = { 0 };
lf.lfCharSet = SYMBOL_CHARSET;
lf.lfHeight = 50;
wcscpy_s(lf.lfFaceName, L"Wingdings");
HFONT f = CreateFontIndirectW(&lf);
SelectObject(hdc, f);
// First two chars displayed OK, 3rd and 4th aren't (replaced with boxes) if
// Non-Unicode apps language is NOT English.
TextOutW(hdc, 10, 10, L"\x7d\x7e\x81\xfc");
So the question is: why the hell does the "Language for non-Unicode applications" setting affect Unicode apps?
And what is the correct (and simplest) way to display SYMBOL_CHARSET fonts independently of the OS system locale?
The root cause of the problem is that the Wingdings font is actually a non-Unicode font. It supports Unicode only partially, so some symbols are still displayed correctly. See @Adrian McCarthy's answer for details on how it probably works under the hood.
Also see more info here: http://www.fileformat.info/info/unicode/font/wingdings
and here: http://www.alanwood.net/demos/wingdings.html
So what can we do to avoid such problems? I found several ways:
1. Quick & dirty
Fall back to the ANSI version of the API, as @user1793036 suggested:
TextOutA(hdc, 10, 10, "\x7d\x7e\x81\xfc"); // Displayed correctly!
2. Quick & clean
Use code points in the Private Use Area range U+F000 to U+F0FF instead of the raw character codes. Wingdings supports them:
TextOutW(hdc, 10, 10, L"\xf07d\xf07e\xf081\xf0fc"); // Displayed correctly!
To explore which Unicode symbols a font actually supports, you can use a font viewer, e.g. dp4 Font Viewer.
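The second option can be wrapped in a small helper so that existing byte-coded strings do not have to be rewritten by hand. This is only a sketch: toSymbolPua is a hypothetical name, and it assumes the font maps its symbols at U+F020 through U+F0FF, as Wingdings does in the example above.

```cpp
#include <string>

// Shift symbol-font character codes (0x20..0xFF) into the Private Use Area,
// so that 0x7D becomes U+F07D, 0xFC becomes U+F0FC, and so on.
// Control codes below 0x20 are passed through unchanged.
std::wstring toSymbolPua(const std::string& codes) {
    std::wstring out;
    out.reserve(codes.size());
    for (unsigned char c : codes) {
        out += static_cast<wchar_t>(c >= 0x20 ? 0xF000 + c : c);
    }
    return out;
}
```

The result could then be drawn with something like TextOutW(hdc, 10, 10, s.c_str(), static_cast<int>(s.size())), where s is the returned string.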
3. Slow & clean, but generic
But what if you don't know in advance which characters you have to display and which font will actually be used? Here is the most universal solution: draw the text by glyph indices to avoid any undesired translation:
void TextOutByGlyphs(HDC hdc, int x, int y, const CStringW& text)
{
CStringW glyphs;
GCP_RESULTSW gcpRes = {0};
gcpRes.lStructSize = sizeof(GCP_RESULTS);
gcpRes.lpGlyphs = glyphs.GetBuffer(text.GetLength());
gcpRes.nGlyphs = text.GetLength();
const DWORD flags = GetFontLanguageInfo(hdc) & FLI_MASK;
GetCharacterPlacementW(hdc, text.GetString(), text.GetLength(), 0,
&gcpRes, flags);
glyphs.ReleaseBuffer(gcpRes.nGlyphs);
ExtTextOutW(hdc, x, y, ETO_GLYPH_INDEX, NULL, glyphs.GetString(),
glyphs.GetLength(), NULL);
}
TextOutByGlyphs(hdc, 10, 10, L"\x7d\x7e\x81\xfc"); // Displayed correctly!
Note the use of GetCharacterPlacementW(). For some unknown reason the similar function GetGlyphIndicesW() does not work here and returns 'unsupported' dummy values for chars > 127.
Here's what I think is happening:
The Wingdings font doesn't have Unicode mappings (a cmap table?). (You can see this by using charmap.exe: the Character set drop down control is grayed out.)
For fonts without Unicode mappings, I think Windows falls back to the "Language for Non-Unicode applications" setting to interpret the character values.
When that's English, Windows (probably) uses code page 1252, and all the values map to themselves.
When that's Russian, Windows (probably) uses code page 1251, and then tries to remap them.
The '\x81' value in code page 1251 maps to U+0403, which obviously doesn't exist in the font, so you get a box. Similarly, '\xFC' maps to U+044C.
I assumed that if you used ExtTextOutW with the ETO_GLYPH_INDEX flag, Windows wouldn't try to interpret the values at all and just treat them as glyph indexes into the font. But that assumption is wrong.
However, there is another flag called ETO_IGNORELANGUAGE, which is reserved, but, empirically, it seems to solve the problem.

How do you get the icon, MIME type, and application associated with a file in the Linux Desktop?

Using C++ on the Linux desktop, what is the best way to get the icon, the document description and the application "associated" with an arbitrary file/file path?
I'd like to use the most "canonical" way to find icons, mime-type/file type descriptions and associated applications on both KDE and gnome and I'd like to avoid any "shelling out" to the command line and "low-level" routines as well as avoiding re-inventing the wheel myself (no parsing the mime-types file and such).
Edits and Notes:
Hey, I originally asked this question about the Qt file info object, and the answer that "there is no clear answer" seems to be correct as far as it goes. BUT this is such a screwed-up situation that I am opening up the question to look for more information.
I don't care about Qt in particular any more; I'm just looking for the most canonical way to find the MIME type via C++/C function calls on both KDE and GNOME (especially GNOME, since that's where things confuse me most). I want to be able to show icons and descriptions matching Nautilus in GNOME and Konqueror/whatever on KDE, as well as open files appropriately, etc.
I suppose it's OK if I get this separately for KDE and GNOME. The big question is: what's the most common/best/canonical way to get all this information for the Linux desktop? GNOME documentation is especially opaque. gnome-vfs has MIME routines but it's deprecated, and I can't find a MIME routine for GIO/GVfs, gnome-vfs's replacement. There's a vague implication that one should use the Open Desktop applications, but which one to use is obscure. And where do libmagic and xdg fit in?
Pointers to an essay summarizing the issues are gladly accepted. Again, I know the three-line answer is "no such animal", but I'm looking for the long answer.
Here is an example of using GLib/GIO to get the information you want.
#include <gio/gio.h>
#include <stdio.h>
int
main (int argc, char **argv)
{
g_thread_init (NULL);
g_type_init ();
if (argc < 2)
return -1;
GError *error = NULL; /* GIO requires the GError pointer to start out NULL */
GFile *file = g_file_new_for_path (argv[1]);
GFileInfo *file_info = g_file_query_info (file,
"standard::*",
0,
NULL,
&error);
const char *content_type = g_file_info_get_content_type (file_info);
char *desc = g_content_type_get_description (content_type);
GAppInfo *app_info = g_app_info_get_default_for_type (
content_type,
FALSE);
/* you'd have to use g_loadable_icon_load to get the actual icon */
GIcon *icon = g_file_info_get_icon (file_info);
printf ("File: %s\nDescription: %s\nDefault Application: %s\n",
argv[1],
desc,
g_app_info_get_executable (app_info));
return 0;
}
You can use the tools available from xdg for that, in particular xdg-mime query.
To find out the file type of, e.g., a file index.html, you would run
$ xdg-mime query filetype index.html
This will return the MIME type. To query which application is associated with that MIME type, use e.g.
$ xdg-mime query default text/html
This returns epiphany.desktop here, i.e. $APPNAME.desktop, so it is easy to get the application name from it. If you would just want to open the file in the default app you could of course just run
$ xdg-open index.html
which would fire up epiphany.
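If you do capture the xdg-mime output from a program, turning the returned desktop-file ID into the bare application name is a one-liner. The function name appNameFromDesktopId below is hypothetical, and note the assumption: this yields the ID without its suffix, which is not necessarily the human-readable name stored inside the .desktop file.

```cpp
#include <string>

// Strip a trailing ".desktop" from a desktop-file ID such as "epiphany.desktop".
// IDs without that suffix are returned unchanged.
std::string appNameFromDesktopId(const std::string& id) {
    const std::string suffix = ".desktop";
    if (id.size() > suffix.size() &&
        id.compare(id.size() - suffix.size(), suffix.size(), suffix) == 0) {
        return id.substr(0, id.size() - suffix.size());
    }
    return id;
}
```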
Query functions for icon resources do not seem to be available in xdg-utils, but you could write a small python script using pyxdg that offers tons of additional functionality, too.
For C bindings you will probably need to have a look into the portland code linked on the xdg page.
EDIT:
Concerning libmagic and friends, you will need to decide on your preferences: While libmagic seems to be more complete (and accurate) in terms of coverage for filetypes, it does not care at all about default applications or icons. It also does not provide you with tools to install extra mimetypes.
In Qt >= 4.6, there is a new function for X11 systems
QIcon QIcon::fromTheme ( const QString & name, const QIcon & fallback = QIcon() ) [static]
You can use this function. Documentation here / (Qt 5)
Neither QFileIconProvider nor QFileInfo will do anything with the OS mime database. To access icons associated with different mime types, you will have to use functions of the underlying desktop environment. In Qt there is (yet) no canonical way.
Consider you can have a different icon in Gnome, in KDE and in Windows. So for instance, in KDE you would use KMimeType.
I just found KFileItem. This class gives you everything you need for icons, MIME types and related things in KDE. I'm sure there's an equivalent in GNOME, but this gives access at the same level as a Qt application works.
You may want to use the system's "/etc/mime.types" file. It is also a good idea to maintain your program's copy of a MIME type file. That way, you are not dependent on the system, but at the same time you need to keep it fairly exhaustive. Not sure about Icons.
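As a rough illustration of the "keep your own copy of the MIME type file" idea, the sketch below parses text in the mime.types format (a type followed by whitespace-separated extensions, with # starting a comment) into an extension lookup. parseMimeTypes and mimeTypeForPath are hypothetical names. The lookup is by file extension only and is case-sensitive, and nothing here inspects file contents.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Build an extension -> MIME type table from mime.types-style text.
// If an extension is listed more than once, the first entry wins.
std::map<std::string, std::string> parseMimeTypes(std::istream& in) {
    std::map<std::string, std::string> byExtension;
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t hash = line.find('#');
        if (hash != std::string::npos) line.erase(hash);  // drop comments
        std::istringstream fields(line);
        std::string type, ext;
        if (!(fields >> type)) continue;                   // blank or comment-only line
        while (fields >> ext) byExtension.emplace(ext, type);
    }
    return byExtension;
}

// Look up the type for a path by its last extension; returns "" if unknown.
std::string mimeTypeForPath(const std::map<std::string, std::string>& table,
                            const std::string& path) {
    const std::size_t slash = path.find_last_of('/');
    const std::size_t dot = path.find_last_of('.');
    if (dot == std::string::npos) return std::string();
    if (slash != std::string::npos && dot < slash) return std::string();  // dot is in a directory name
    const auto it = table.find(path.substr(dot + 1));
    return it == table.end() ? std::string() : it->second;
}
```

To use the system file, open /etc/mime.types with a std::ifstream and pass the stream to parseMimeTypes; a bundled copy works the same way.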
Maybe take a look at this code:
http://ftp.devil-linux.org/pub/devel/sources/1.2/file-4.23.tar.gz
This is the standard file util found on most Linux/Unix distributions. You will get the MIME-type and some more information.
I think both Gnome and KDE have their own ways to determine this and also to set the icon and the standard application for it.
Anyway, that file-tool is probably the best way to get the mime type and the document description. And in some cases even some details about the content.
This will get you the MIME type, which is what you need anyway to know how to open the file. These are separate steps: file tells you nothing about the icon or the application to open the file with.
About 8 years late, but still useful.
To get the associated applications in KDE you can do what Joe suggested (using KFileItem). However, that requires inclusion of a lot of libraries.
The code below requires less.
#include <QCoreApplication>
#include <QMimeDatabase>
#include <QDebug>
#include <KMimeTypeTrader>
int main(int argc, char *argv[])
{
QCoreApplication a(argc, argv);
if (argc < 2)
{
qDebug() << "missing argument <filename>";
return 1;
}
QMimeDatabase mimeDb;
QMimeType mimeType = mimeDb.mimeTypeForFile(QString::fromLocal8Bit(argv[1]));
KService::List services = KMimeTypeTrader::self()->query(
mimeType.name(),QStringLiteral("Application"));
foreach(const QExplicitlySharedDataPointer<KService>& svc, services)
{
qDebug() << "service: " << svc->name();
qDebug() << "exec: " << svc->exec();
}
}
To compile the code add QT += KService KCoreAddons to your qmake .pro file.
Links to KMimeTypeTrader & KService documentation:
https://api.kde.org/frameworks/kservice/html/classKService.html
https://api.kde.org/frameworks/kservice/html/classKMimeTypeTrader.html
Copy/paste of the nice example above (using GLib/GIO), with proper release of allocated memory added as per the documentation. I tried to edit the existing answer, but it kept saying the edit queue was full :(
#include <gio/gio.h>
#include <stdio.h>
int
main (int argc, char **argv)
{
g_thread_init (NULL);
g_type_init ();
if (argc < 2)
return -1;
g_autoptr(GError) error = NULL; /* must be NULL-initialized, or cleanup frees garbage */
GFile* file = g_file_new_for_path (argv[1]);
GFileInfo* file_info = g_file_query_info (file,
"standard::*",
G_FILE_QUERY_INFO_NONE,
NULL,
&error);
const char* content_type = g_file_info_get_content_type (file_info);
g_autofree gchar* desc = g_content_type_get_description (content_type);
GAppInfo* app_info = g_app_info_get_default_for_type (
content_type,
FALSE);
/* you'd have to use g_loadable_icon_load to get the actual icon */
GIcon* icon = g_file_info_get_icon (file_info);
printf ("File: %s\nDescription: %s\nDefault Application: %s\n",
argv[1],
desc,
g_app_info_get_executable (app_info));
g_object_unref(file_info);
g_object_unref(file);
return 0;
}