How do I send text from TextView to 'Export to PDF'? - c++

I'm new to GTK4 and its C++ binding, gtkmm4. I want to send the text in the current TextView buffer (loaded from a saved file) to a PDF file using the GTK libraries (gtkmm4), but I couldn't get anything printed out.
This is the code I wrote after reading the documentation:
void MainWindow::export_note() {
    auto op = Gtk::PrintOperation::create();
    // setup op
    cout << save_file_path << endl;
    string content = editor.get_buffer()->get_text();
    ofstream out(work_dir + save_file_path);
    out << content;
    out.close();
    curr_state = edit_file;
    op->set_export_filename("test.pdf");
    auto res = op->run(Gtk::PrintOperation::Action::EXPORT);
    return;
}
This only exports a blank PDF, but I expect the text to show up in the PDF.

It looks like nothing in your code actually converts the text into PDF content (MIME type text/pdf or, more commonly, binary application/pdf). You cannot simply stuff text data into a container called Test.pdf.
There are simple ways to convert text to PDF: traditionally through PostScript printer files, and more commonly these days through a PDF printer driver directly.
So start at the most basic level. A Hello World needs a file something like this, where the page body is built up from stacked vectors or font-labelled strings placed at X and Y coordinates.
Test.pdf
%PDF-1.1
%âãÏÓ
1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj
2 0 obj<</Type/Pages/Kids [3 0 R]/Count 1/MediaBox [0 0 594 792]>>endobj
3 0 obj<</Type/Page/Parent 2 0 R/Resources<</Font<</F1<</Type/Font/Subtype/Type1/BaseFont/Helvetica>>>>>>/Contents 4 0 R>>endobj
4 0 obj<</Length 81
>>
stream
BT /F1 18 Tf 036 740 Td (Hello) Tj ET
BT /F1 18 Tf 036 720 Td (World!) Tj ET
endstream
endobj xref
0 5
0000000000 65535 f
0000000019 00000 n
0000000063 00000 n
0000000137 00000 n
0000000267 00000 n
trailer<</Root 1 0 R /Size 5>>startxref
399 %%EOF
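To make the structure of that file concrete, here is a minimal Python sketch that assembles the same kind of one-page PDF and computes the xref offsets instead of hard-coding them. The function name `build_pdf` and its parameters are hypothetical, and the sketch assumes plain ASCII text and the built-in Helvetica font; it is an illustration of the file layout, not a replacement for a PDF library or for GTK's print machinery.

```python
def build_pdf(lines, font_size=18):
    """Return the bytes of a one-page PDF showing `lines` in Helvetica.

    Hypothetical helper for illustration; assumes ASCII-only text.
    """
    # Content stream: one text object per line, stepping down the page.
    ops = []
    y = 740
    for text in lines:
        # Backslashes and parentheses must be escaped inside PDF strings.
        escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        ops.append(f"BT /F1 {font_size} Tf 36 {y} Td ({escaped}) Tj ET")
        y -= font_size + 2
    stream = "\n".join(ops).encode("ascii")

    objects = [
        b"<</Type/Catalog/Pages 2 0 R>>",
        b"<</Type/Pages/Kids [3 0 R]/Count 1/MediaBox [0 0 594 792]>>",
        b"<</Type/Page/Parent 2 0 R/Resources<</Font<</F1<</Type/Font"
        b"/Subtype/Type1/BaseFont/Helvetica>>>>>>/Contents 4 0 R>>",
        b"<</Length %d>>\nstream\n" % len(stream) + stream + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.1\n%\xe2\xe3\xcf\xd3\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, needed by xref
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: each entry is exactly 20 bytes long.
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<</Root 1 0 R/Size %d>>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1,
        xref_pos,
    )
    return bytes(out)


pdf_bytes = build_pdf(["Hello", "World!"])
# To inspect the result, write it in binary mode:
#     with open("Test.pdf", "wb") as f:
#         f.write(pdf_bytes)
```

The point of the sketch is that the text only appears on the page because it is wrapped in text-drawing operators inside a content stream; writing the raw string to a file named Test.pdf skips all of that.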

Related

How do you encode the Certificate Revocation List (CRL) stream bytes in PDF?

I sign a PDF and then append an update (a new revision) in which I write the DSS with its CRLs, Certs, and VRI.
19 0 obj
[15 0 R 16 0 R]
endobj
20 0 obj
[13 0 R 14 0 R]
endobj
11 0 obj
[15 0 R 16 0 R]
endobj
12 0 obj
[13 0 R 14 0 R]
endobj
17 0 obj
<<
/CRL 11 0 R
/Cert 12 0 R
>>
endobj
18 0 obj
<<
/5F44CF6F351DFD45FB62F3D0ED046408BC892797 17 0 R
>>
endobj
21 0 obj
<<
/VRI 18 0 R
/CRLs 19 0 R
/Certs 20 0 R
>>
I am confused about how I should write the Certificate and CRL streams.
15 0 obj
<<
/Length 1454
/Filter /FlateDecode
>>
stream
xÚ3hb0hb{ÅÄÈhÀÉƪÍÇÌ$ÅÊ`àcÈä2‡²° 3…Šˆ€8\¼®y%E¥Å%:žyÉz†ªÊ
ZbXd{0%KW÷ýY¯’ó‚-ØÂÛ„OÏó½z•î ‰`®•® K-›2}tÖ§^_8;xÉì¥Ó®~›.g9A'Õüê½—
ZbXd{0%KW÷ýY¯’ó‚-ØÂÛ„OÏó½z•î ‰`®•® K-›2}tÖ§^_8;xÉì¥Ó®~›.g9A'Õüê½—
endstream
endobj
16 0 obj
<<
/Length 1477
/Filter /FlateDecode
>>
stream
„kâR7Å41*!‡#8Íñ3 Ź˜#‰o=«‡çƒ#yë:X]r\~}¼)/Ñmç×£¦³äsËê]ÓÕ_+µ¥$Ô¿}¾ÜÏiÁÝT!¹ôi–Í9üÀ}Š¸|
ìŒH¿GÓø^ú¿ÔVÜK–qõ†µ®“¸»Ý*Žh¾JzåU7c~÷•ÔêýK*îú®¹¸DcÁ­³·NtV~Vóåíé5\‚&½|¶NäïŽ[K­
î›NRZbXd{0%KW÷ýY¯’ó‚-ØÂÛ„OÏó½z•î ‰`®•® K-›2}tÖ§^_8;xÉì¥Ó®~›.g9A'Õüê½—›oÇ:ç-¶?
endstream
endobj
13 0 obj
<<
/Length 1240
/Filter /FlateDecode
>>
stream
%ŸwC[í2×¾Iej©úkŽ-:ݳÔ<¼a£ƒô/5›‡~zÒ•7ü9uãcfk?ËÅ`ßÃ:Èb—’‚Ÿõ{ÏÅ—¢{]HçQ”9w(ÂB#í×g¥ìþè
^–F«š/r§š¿ì=#,^pëO€{äú=}RÎêð¦ÉŠ7or¼±Ëtë–x·˜§LÌŒŒ‹› Cd0€eùÿ³°03±>0P ñUY$
endstream
endobj
14 0 obj
<<
/Length 1159
/Filter /FlateDecode
>>
stream
4!>T‚êPpÎI,.V0Ò™#ûœºƒ=LÍš•ãˆ‘•¹‰‘Ÿ(ÎÅÔÄÈÈplŽ÷A¯¹7k/[‡O\}
öe™¨îö£œ¶ä'¶ÌpžªweÞª[¡$¼ØÍþþtó[½xÉO4ÞZ¥ØŸ^g ø,mu„_Rz™_PÏê.||º¶*þîÝxv½"»êôó»ø%Ü%ý
endstream
endobj
Please ignore the lengths and content of the streams above. I truncated them, so the lengths no longer correspond; the real streams are bigger than that.
The issue is that my PDF is not LTV enabled. I tested some scenarios and concluded that my streams are not being written the right way.
I use the following structure from WinCrypt.h:
typedef struct _CERT_CONTEXT {
    DWORD dwCertEncodingType;
    BYTE *pbCertEncoded;
    DWORD cbCertEncoded;
    PCERT_INFO pCertInfo;
    HCERTSTORE hCertStore;
} CERT_CONTEXT, *PCERT_CONTEXT;
typedef const CERT_CONTEXT *PCCERT_CONTEXT;
I go through them and get the bytes this way:
PCCERT_CONTEXT cngContext = (PCCERT_CONTEXT)(*itChain);
ByteArray certBytes(cngContext->pbCertEncoded, (size_t)cngContext->cbCertEncoded);
Then I just compress the obtained bytes with Flate (so the stream is marked /Filter /FlateDecode) and write them into the PDF as a stream, as you can see in the second block of code.
Am I missing a step, such as a conversion? I saw that the stream should be BER-encoded. So should I transform the bytes into BER encoding and then apply the Flate compression?
Edit:
You can find My File here
SOLUTION
The problem was the stream of CRLs that I was writing into the PDF file.
Having the CRL_CONTEXT structure for each certificate, I just took the pbCrlEncoded member and wrote it directly into the CRL stream.
It seemed correct, but I noticed that the CRL_INFO of this structure had no CRL_ENTRY, so the encoded bytes didn't contain any list of revoked certificates.
I then found that the certificates carry a URL from which you can download the current CRL. To see it, open Manage Computer Certificates in Windows -> find your certificate and open it -> Details -> CRL Distribution Points -> URL = "..". When you access this URL, the browser automatically downloads the CRL. You can open it and see information such as Next Update, which is the last day on which this list is valid. After that date, I assume you need to download it again to get an updated version. You can also see the list of revoked certificates itself.
This is the list I needed to put into the CRL streams in the PDF.
So I found a way to do that download in code. This is a snippet of the code used:
PCERT_CHAIN_ELEMENT chainElement; // the certificate in the chain being processed
// Locate the CRL Distribution Points extension on the certificate.
pExtension = CertFindExtension(szOID_CRL_DIST_POINTS,
    chainElement->pCertContext->pCertInfo->cExtension,
    chainElement->pCertContext->pCertInfo->rgExtension);
if (!pExtension)
    return ByteArray();
// First call: query the size of the decoded structure.
if (!CryptDecodeObject(X509_ASN_ENCODING, szOID_CRL_DIST_POINTS, pExtension->Value.pbData, pExtension->Value.cbData, 0, 0, &cbStructInfo))
    return ByteArray();
if (!(pvStructInfo = LocalAlloc(LMEM_FIXED, cbStructInfo)))
    return ByteArray();
// Second call: decode into the allocated buffer.
if (!CryptDecodeObject(X509_ASN_ENCODING, szOID_CRL_DIST_POINTS, pExtension->Value.pbData, pExtension->Value.cbData, 0, pvStructInfo, &cbStructInfo)) {
    LocalFree(pvStructInfo);
    return ByteArray();
}
pInfo = (CRL_DIST_POINTS_INFO*)pvStructInfo;
// Download the CRL from the first URL of the first distribution point.
Net::HttpRequest req;
Net::HttpRequestOptions ops;
ops.verb = Net::GET;
crllist = req.send(pInfo->rgDistPoint->DistPointName.FullName.rgAltEntry->pwszURL);
This way I obtained the bytes that I could write into the PDF after Flate-compressing them.
Now the PDF is LTV Enabled.
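For readers who want to see the stream-writing step in isolation, here is a minimal Python sketch of wrapping already-encoded bytes (for example a downloaded CRL) in a Flate-compressed PDF stream object. The function name `pdf_stream_object` is hypothetical and the sketch is not the code used above; the one thing it is meant to show is that /Length must be the length of the compressed data, and that the bytes are compressed as-is with no extra re-encoding.

```python
import zlib


def pdf_stream_object(obj_num, raw):
    """Wrap raw bytes in a Flate-compressed PDF stream object.

    Hypothetical helper: `raw` is assumed to be the encoded CRL or
    certificate exactly as obtained, with no further conversion.
    """
    compressed = zlib.compress(raw)  # zlib format, as /FlateDecode expects
    header = b"%d 0 obj\n<<\n/Length %d\n/Filter /FlateDecode\n>>\nstream\n" % (
        obj_num,
        len(compressed),  # length of the compressed data, not of `raw`
    )
    return header + compressed + b"\nendstream\nendobj\n"


# Example with placeholder bytes standing in for a real CRL.
crl_bytes = b"\x30\x82" + bytes(200)
obj = pdf_stream_object(15, crl_bytes)
```

A reader of the PDF reverses this by reading /Length bytes after the `stream` keyword and inflating them, which gives back the original bytes unchanged.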

Pandas Dataframe Wildcard Values in List

How can I filter a dataframe to rows with values that are contained within a list? Specifically, the values in the dataframe will only be partial matches with the list and never exact match.
I've tried using pandas.DataFrame.isin but this only works if the values in the dataframe are the same as in the list.
list = ["123 MAIN STREET", "456 BLUE ROAD", "789 SKY DRIVE"]
df =
address
0 123 MAIN
1 456 BLUE
2 987 PANDA
target_df = df[df["address"].isin(list)]
Ideally the result should be
target_df =
address
0 123 MAIN
1 456 BLUE
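Use str.contains with a simple regex that joins the terms with |. Here the mask is every word from the list joined by |, so a row matches if its address contains any of those words.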
f = '|'.join
mask = f(map(f, map(str.split, list)))
df[df.address.str.contains(mask)]
address
0 123 MAIN
1 456 BLUE
I ended up using a for loop (here l is the list of full addresses):
df[[any(x in y for y in l) for x in df.address]]
Out[257]:
address
0 123 MAIN
1 456 BLUE

Trying to save a PDF string results in UnicodeDecodeError with WeasyPrint

So far this is my code:
from django.template import (Context, Template) # v1.11
from weasyprint import HTML # v0.42
import codecs
template = Template(codecs.open("/path/to/my/template.html", mode="r", encoding="utf-8").read())
context = Context({})
html = HTML(string=template.render(context))
pdf_file = html.write_pdf()
#with open("/path/to/my/file.pdf", "wb") as f:
# f.write(self.pdf_file)
Errorstack:
[17/Jan/2019 08:14:13] INFO [handle_correspondence:54] 'utf8' codec can't
decode byte 0xe2 in position 10: invalid continuation byte. You passed in
'%PDF-1.3\n%\xe2\xe3\xcf\xd3\n1 0 obj\n<</Author <> /Creator (cairo 1.14.6
(http://cairographics.org))\n /Keywords <> /Producer (WeasyPrint 0.42.3
\\(http://weasyprint.org/\\))>>\nendobj\n2 0 obj\n<</Pages 3 0 R /Type
/Catalog>>\nendobj\n3 0 obj\n<</Count 1 /Kids [4 0 R] /Type
/Pages>>\nendobj\n4 0 obj\n<</BleedBox [0 0 595 841] /Contents 5 0 R
/Group\n <</CS /DeviceRGB /I true /S /Transparency /Type /Group>>
MediaBox\n [0 0 595 841] /Parent 3 0 R /Resources 6 0 R /TrimBox [0 0 595
841]\n /Type /Page>>\nendobj\n5 0 obj\n<</Filter /FlateDecode /Length 15
0 R>>\nstream\nx\x9c+\xe4*T\xd0\x0fH,)I-\xcaSH.V\xd0/0U(N\xceS\xd0O4PH/\xe62P0P0\xb54U\xb001T(JUH\xe3\n\x04B\x00\x8bi\r\x89\nendstream\nendobj\n6 0
obj\n<</ExtGState <</a0 <</CA 1 /ca 1>>>> /Pattern <</p5 7 0
R>>>>\nendobj\n7 0 obj\n<</BBox [0 1123 794 2246] /Length 8 0 R /Matrix
[0.75 0 0 0.75 0 -843.5]\n /PaintType 1 /PatternType 1 /Resources
<</XObject <</x7 9 0 R>>>>\n /TilingType 1 /XStep 1588 /YStep
2246>>\nstream\n /x7 Do\n \n\nendstream\nendobj\n8 0 obj\n10\nendobj\n9 0
obj\n<</BBox [0 1123 794 2246] /Filter /FlateDecode /Length 10 0 R
/Resources\n 11 0 R /Subtype /Form /Type /XObject>>\nstream\nx\x9c+\xe4\nT(\xe42P0221S0\xb74\xd63\xb3\xb4T\xd05442\xd235R(JU\x08W\xc8\xe3*\xe42T0\x00B\x10\t\x942VH\xce\xe5\xd2O4PH/V\xd0\xaf04Tp\xc9\xe7\n\x04B\x00`\xf0\x10\x11\nendstream\nendobj\n10 0 obj\n77\nendobj\n11 0 obj\n<</ExtGState
<</a0 <</CA 1 /ca 1>>>> /XObject <</x11 12 0 R>>>>\nendobj\n12 0
obj\n<</BBox [0 1123 0 1123] /Filter /FlateDecode /Length 13 0 R
/Resources\n 14 0 R /Subtype /Form /Type /XObject>>\nstream\nx\x9c+\xe4\n
xe4\x02\x00\x02\x92\x00\xd7\nendstream\nendobj\n13 0 obj\n12\nendobj\n14 0
obj\n<<>>\nendobj\n15 0 obj\n58\nendobj\nxref\n0 16\n0000000000 65535
f\r\n0000000015 00000 n\r\n0000000168 00000 n\r\n0000000215 00000
n\r\n0000000270 00000 n\r\n0000000489 00000 n\r\n0000000620 00000
n\r\n0000000697 00000 n\r\n0000000923 00000 n\r\n0000000941 00000
n\r\n0000001165 00000 n\r\n0000001184 00000 n\r\n0000001264 00000
n\r\n0000001422 00000 n\r\n0000001441 00000 n\r\n0000001462 00000
n\r\ntrailer\n\n<</Info 1 0 R /Root 2 0 R /Size 16>>\nstartxref\n1481
n%%EOF\n' (<type 'str'>)
Actually it works via a web request (returning the PDF as the response) and via the shell (typing the code manually). The code is tested and has never given me problems. The files are saved with the correct encoding, and setting the encoding kwarg in HTML doesn't help; the mode value used to open the template is also correct, which I checked because other questions traced their problem to that.
However, I was adding a management command to run it periodically (for bigger PDFs I cannot do it via a web request because the server's timeout could kick in before it finishes), and when I try to call it, I only get a UnicodeDecodeError saying 'utf8' codec can't decode byte 0xe2 in position 10: invalid continuation byte.
The PDF (at least from what I see) starts with these characters:
%PDF-1.3\n%\xe2\xe3\xcf\xd3\n1 0
which translates into this:
%PDF-1.3
%âãÏÓ
1 0 obj
So the problem is all about the character â. But it's a trap!
Instead, the problem is this line of code:
pdf_file = html.write_pdf()
Changing it to:
html.write_pdf()
Just works as expected!
So my question is: what reason could there be for Python to throw a UnicodeDecodeError when assigning a string to a variable? I've dug into weasyprint's code in my virtualenv, but I didn't see any conversions there.
So I don't know why, but now it suddenly works. I literally didn't modify anything: I just ran the command again and it worked.
I'm not marking the question as answered, since in the future someone who has the same problem may be able to post a correct answer.
So disturbing.
EDIT
So it looks like I'm a very intelligent person who tried to set the value of self.pdf_file, which is a models.FileField, to the content of the created PDF instead of to the file itself.
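The underlying point, that PDF output is binary data and not text, can be shown with a short Python sketch. It assumes Python 3 (the traceback above is from Python 2) and uses only the first bytes quoted in the question; it does not involve Django or WeasyPrint.

```python
import os
import tempfile

# The first bytes of the PDF quoted above.
pdf_bytes = b"%PDF-1.3\n%\xe2\xe3\xcf\xd3\n1 0 obj\n"

# Anything that tries to treat these bytes as UTF-8 text fails on the
# binary marker in the second line of the file.
try:
    pdf_bytes.decode("utf-8")
    failed_at = None
except UnicodeDecodeError as exc:
    failed_at = exc.start  # index of the offending byte

# Handled as bytes and written in binary mode, the data is untouched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.pdf")
    with open(path, "wb") as f:
        f.write(pdf_bytes)
    with open(path, "rb") as f:
        round_trip = f.read()

print(failed_at)                # 10, the position named in the error message
print(round_trip == pdf_bytes)  # True
```

In Django, a FileField is normally filled with `field.save(name, ContentFile(pdf_bytes))`, using `ContentFile` from `django.core.files.base`, rather than by assigning the raw bytes to the field.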

load multiple csv files into Dataframe: columns names issue

I have multiple CSV files with the same format (14 rows, 4 columns).
I tried to load all of them into a single DataFrame and use each file's name to rename the values of the first column (1-14):
1 500 0 0
2 350 0 1
3 500 1 0
.............
13 600 0 0
14 800 0 0
I tried the following code but I am not getting what I am expecting:
filenames = os.listdir('Threshold/')
Y = pd.DataFrame()  # empty df
# file names are in the following format "subx_ICA_thre.csv"
# need to get x (subject number to be used later for renaming column values)
Sub_list = []
for filename in filenames:
    s = int(''.join(filter(str.isdigit, filename)))
    Sub_list.append(int(s))
S_Sub_list = sorted(Sub_list)
for x in S_Sub_list:  # get the file according to the subject number
    temp = pd.read_csv('sub' + str(x) + '_ICA_thre.csv')
    df = pd.concat([Y, temp])  # concat the obtained frame with the empty frame
    df.columns = ['id', 'data', 'isEB', 'isEM']
    # replace the column values using subject id
    for sub in range(1, 15):
        df['id'].replace(sub, 'sub' + str(x) + '_ICA_' + str(sub), inplace=True)
    print(df)
output:
id data isEB isEM
0 sub1_ICA_2 200 0 0
1 sub1_ICA_3 275 0 0
2 sub1_ICA_4 500 1 0
................................
11 sub1_ICA_13 275 0 0
12 sub1_ICA_14 300 0 0
id data isEB isEM
0 sub2_ICA_2 275 0 0
1 sub2_ICA_3 500 0 0
2 sub2_ICA_4 400 0 0
.................................
11 sub2_ICA_13 300 0 0
12 sub2_ICA_14 450 0 0
First, it seems that the code makes separate DataFrames rather than a single one. Second, the first row is removed (sub1_ICA_1 is missing; it may have been replaced by the column names).
I couldn't find the problem in the loop that I am using.
I think you need to create a list of DataFrames first, then concat them with the keys parameter so that a range of new values forms the first level of a MultiIndex, then modify the id column, and finally remove the MultiIndex with reset_index.
The names parameter was also added to read_csv to set custom column names; this also stops the first data row of each file from being consumed as the header.
Y = []
for x in S_Sub_list:
    n = ['id', 'data', 'isEB', 'isEM']
    temp = pd.read_csv('sub' + str(x) + '_ICA_thre.csv', names=n)
    Y.append(temp)
#list comprehension alternative
#n = ['id', 'data', 'isEB', 'isEM']
#Y = [pd.read_csv('sub' + str(x) +'_ICA_thre.csv', names = n) for x in S_Sub_list]
df = pd.concat(Y, keys=range(1,len(S_Sub_list) + 1))
df['id'] = 'sub' + df.index.get_level_values(0).astype(str) +'_ICA_'+ df['id'].astype(str)
df = df.reset_index(drop=True)

Saving binary data in ColdFusion

I have a problem with saving a binary representation of a file to a file...
Let me show you my pain:
Everything starts with a file, file.pdf
Then the file is sent via POST to a website with some additional data:
curl --data "sector=4&name=John&surname=Smith&email=john@smith.com&isocode=PL&theFile=$(cat file.pdf | base64)" http://localhost/awesomeUpload
then the data is received and decoded:
var decoded = BinaryDecode(data.theFile, "Base64");
then I attempt to save it by:
var theFilePath = ExpandPath("/localserver/temp/theFile.pdf");
fileWrite(theFilePath , data.theFile);
or:
var file_output_steam = CreateObject("java","java.io.FileOutputStream").init(theFilePath);
file_output_steam.write(data.theFile);
file_output_steam.close();
My files do not match ;(
The original one looks like:
%PDF-1.5
%µµµµ
1 0 obj
<</Type/Catalog/Pages 2 0 R/Lang(pl-PL) /StructTreeRoot 13 0 R/MarkInfo<</Marked true>>>>
endobj
2 0 obj
<</Type/Pages/Count 1/Kids[ 3 0 R] >>
endobj
3 0 obj
<</Type/Page/Parent 2 0 R/Resources<</Font<</F1 5 0 R/F2 10 0 R>>/ProcSet[/PDF/Text/ImageB/ImageC/ImageI] >>/MediaBox[ 0 0 595.32 841.92] /Contents 4 0 R/Group<</Type/Group/S/Transparency/CS/DeviceRGB>>/Tabs/S/StructParents 0>>
whereas the copy that went through ColdFusion looks like:
%PDF-1.5
%µµµµ
1 0 obj
<</Type/Catalog/Pages 2 0 R/Lang(pl-PL) /StructTreeRoot 13 0 R/MarkInfo<</Marked true>> B™[™ŘšBŚŘšBŹŐ\KÔYŮ\ËĐŰÝ[ťKŇÚYÖČČ—H€Đ¦VćFö& ĐŁ2ö& ĐŁĂÂőG—RővRő&VçB""ő&W6÷W&6W3ĂÂôföçCĂÂôcR"ôc"#ŕ˝AÉ˝ŤM•Ńl˝A˝Q•áĐ˝%µ…ť•˝%µ…ť•˝%µ…ť•%t€>/MediaBox[ 0 0 595.32 841.92] /Contents 4 0 R/Group<</Type/Group/S/Transparency/CS/DeviceRGB>>/Tabs/S/StructParents 0>B™[™ŘšBŤŘšBŹŃš[\‹Ń›]QXŰŮKÓ[™ÝMŚOŹBśÝ™X[CBž'cłB°Ś!8Ě1Ď]CsôŘQ&‰2  PäV˝ëËöĽ¨QŰge•ź
ďÂŃ,đť#"aKR•˘<1™[ä¸
(ÄňĄyoâ9S\Śĺ <ę8I±D¬‰#…Ć”ťLé‘ا÷ÍnU|WŸ‰t`ýuşąĽ\hlu&âĂ7ß
ů"Ĺ\Ŕ>pÇč÷÷.°ß’Ř——•‹ĚB™[™Ý™X[CB™[™ŘšBŤHŘšBŹŐ\Kћ۝ÔÝXť\KŐ\LĐ\ŮQ›ŰťĐPŃQJĐŘ[XśšKŃ[ŰŮ[™ËŇY[ť]KRŃ\ŘŮ[™[ť›ŰťČ
‹ŐŐ[šXŰŮHŚŹŹB™[™ŘšBŤŘšB–Č
Č—HB™[™ŘšBŤČŘšBŹĐ\ŮQ›ŰťĐPŃQJĐŘ[XśšKÔÝXť\KĐŇQ›Űť\L‹Ő\Kћ۝ĐŇQŃŇQX\ŇY[ť]KŃČLĐŇQŢ\Ý[R[™›Č‹Ń›Űť\ŘÜš\ÜH‹ŐČŚŕЦVćFö& ĐŁ‚ö& ĐŁĂÂô÷&FW&–ćr„–FVçF—G’’ő&Vv—7G'’„Fö&R’ő7WĆVÖVçBăŕЦVćFö& ĐŁ’ö& ĐŁĂÂőG—RôföçDFW67&—F÷"ôföçDć
please help
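One thing worth ruling out, offered as an assumption because the post does not confirm the cause: base64 output can contain `+`, and in a form-encoded body a bare `+` is decoded as a space, so the server may receive a different string from the one that was sent. The Python sketch below only illustrates that effect with a made-up payload and the standard library's form parser; it says nothing certain about how ColdFusion handles this particular request.

```python
import base64
from urllib.parse import parse_qs, quote

# Made-up payload chosen so that its base64 form contains '+'.
payload = b"\xfb\xef\xbePDF"
encoded = base64.b64encode(payload).decode("ascii")

# Sent as-is in a form body, every '+' arrives as a space.
raw_body = "name=John&theFile=" + encoded
received_raw = parse_qs(raw_body)["theFile"][0]

# Percent-encoding the value first preserves it exactly.
safe_body = "name=John&theFile=" + quote(encoded, safe="")
received_safe = parse_qs(safe_body)["theFile"][0]

print(repr(encoded))
print(repr(received_raw))
print(base64.b64decode(received_safe) == payload)
```

If this turns out to be the cause, percent-encoding the field on the client side (curl has a `--data-urlencode` option for this) would be one way to keep the base64 text intact.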