QImage/QPixmap + .xpm file size limitations?

In my application, I am trying to load multiple .xpm images.
#include "Images/Destroyed_xpm.h"
//...
QPixmap(Destroyed_xpm).toImage()
Loading a 35 * 35 pixel image (above) works fine: the app compiles and runs, and the QImage object is correctly initialised. But when I try to load an image
#include "Images/MainWindowBackGround_xpm.h"
//...
QPixmap(MainWindowBackGround_xpm).toImage()
of 892 * 419 pixels, compilation produces a huge number of errors:
3 !3 Ne L9 '3 q9 m~ t= t= o= t= m= o= t= o= o= w9 ;. Oe r= Pe Qe 1~ Re Se Te Ue Ue Ve We Ve Xe Ye Ze ... 'f 'f )f !f F:-1: error: ~f {f ]f ^f ). /f :3 Qe (f _f 03 I9 :f <f [f }f ... E! #: *) D! D! *) }l *) |l 1l K2 %: ",
102850 | " . + # # $ % & & % * = - ; > , ' ) ! ~ { ] ^ / ( _ :-1: error: < [ < } | [ 1 2 2 2 3 4 5 6 7 8 9 0 a b c ... L* j* k* M* M* N* ",
102858 | ".3 ze 29 29 we 49 Dr Er &} Fr Gr Hr Ir Jr Kr f9 :-1: error: w #. 93 Lr Mr Je <. &3 Nr Or Pr Qr Rr Sr ... E! D! E! P2 D! [y -) cy ",
The first three lines of each file are:
/* XPM */
static char * Destroyed_xpm[] = {
"35 34 2 1",
and
/* XPM */
static char * MainWindowBackGround_xpm[] = {
"892 419 102846 3",
The file system reports sizes of 1.4 kB and 2.9 MB respectively.
Also, it was noted elsewhere that, for QImage and QPixmap:
Both are limited to 32767x32767 pixels.
What am I doing wrong?
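For reference, the first string in an XPM array is the values line: width, height, number of colours and characters per pixel. The following Python sketch (xpm_layout is a hypothetical helper, not part of Qt) works out what the two headers above imply. It only quantifies the two files and does not by itself establish why the compiler rejects the larger one.

```python
def xpm_layout(values_line):
    """Interpret an XPM values line: "<width> <height> <ncolors> <chars_per_pixel>"."""
    width, height, ncolors, cpp = (int(v) for v in values_line.split()[:4])
    return {
        "width": width,
        "height": height,
        "colors": ncolors,
        "chars_per_pixel": cpp,
        # One values string, one string per colour, one string per pixel row.
        "strings_in_array": 1 + ncolors + height,
        # Each pixel row is a single string literal of this many characters.
        "chars_per_pixel_row": width * cpp,
    }


if __name__ == "__main__":
    for header in ("35 34 2 1", "892 419 102846 3"):
        print(header, "->", xpm_layout(header))
```

For the large file this gives 102846 colour entries before the first pixel row, which is consistent with the line numbers (102850, 102858) quoted in the error output.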

Related

filtering the aminoacid sequence in a UniProt dat.-file

Here is an example of my input:
ID CAR16_HUMAN Reviewed; 197 AA.
AC Q5EG05; Q96RJ9;
DT 02-SEP-2008, integrated into UniProtKB/Swiss-Prot.
DT 15-MAR-2005, sequence version 1.
DT 26-FEB-2020, entry version 116.
DE RecName: Full=Caspase recruitment domain-containing protein 16;
DE AltName: Full=Caspase recruitment domain-only protein 1;
DE Short=CARD-only protein 1;
DE AltName: Full=Caspase-1 inhibitor COP;
DE AltName: Full=Pseudo interleukin-1 beta converting enzyme;
DE Short=Pseudo-ICE;
DE Short=Pseudo-IL1B-converting enzyme;
GN Name=CARD16; Synonyms=COP, COP1;
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC Homo.
OX NCBI_TaxID=9606;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 2), FUNCTION, TISSUE SPECIFICITY, AND
RP INTERACTION WITH CASP1 AND RIPK2.
RX PubMed=11536016; DOI=10.1038/sj.cdd.4400881;
RA Druilhe A., Srinivasula S.M., Razmara M., Ahmad M., Alnemri E.S.;
RT "Regulation of IL-1beta generation by Pseudo-ICE and ICEBERG, two dominant
RT negative caspase recruitment domain proteins.";
RL Cell Death Differ. 8:649-657(2001).
RN [2]
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1).
RA Wang P.Z., Wang F., Wang X., Wu J.;
RT "Novel splicing variants of some human genes.";
RL Submitted (JAN-2005) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).
RC TISSUE=Spleen;
RX PubMed=14702039; DOI=10.1038/ng1285;
RA Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R.,
RA Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H.,
RA Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S.,
RA Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K.,
RA Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H.,
RA Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M.,
RA Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K.,
RA Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T.,
RA Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M.,
RA Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S.,
RA Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H.,
RA Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K.,
RA Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N.,
RA Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S.,
RA Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O.,
RA Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H.,
RA Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B.,
RA Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y.,
RA Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K.,
RA Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T.,
RA Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T.,
RA Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y.,
RA Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H.,
RA Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y.,
RA Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H.,
RA Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O.,
RA Isogai T., Sugano S.;
RT "Complete sequencing and characterization of 21,243 full-length human
RT cDNAs.";
RL Nat. Genet. 36:40-45(2004).
RN [4]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RA Mural R.J., Istrail S., Sutton G.G., Florea L., Halpern A.L., Mobarry C.M.,
RA Lippert R., Walenz B., Shatkay H., Dew I., Miller J.R., Flanigan M.J.,
RA Edwards N.J., Bolanos R., Fasulo D., Halldorsson B.V., Hannenhalli S.,
RA Turner R., Yooseph S., Lu F., Nusskern D.R., Shue B.C., Zheng X.H.,
RA Zhong F., Delcher A.L., Huson D.H., Kravitz S.A., Mouchard L., Reinert K.,
RA Remington K.A., Clark A.G., Waterman M.S., Eichler E.E., Adams M.D.,
RA Hunkapiller M.W., Myers E.W., Venter J.C.;
RL Submitted (JUL-2005) to the EMBL/GenBank/DDBJ databases.
RN [5]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).
RX PubMed=15489334; DOI=10.1101/gr.2596504;
RG The MGC Project Team;
RT "The status, quality, and expansion of the NIH full-length cDNA project:
RT the Mammalian Gene Collection (MGC).";
RL Genome Res. 14:2121-2127(2004).
RN [6]
RP FUNCTION, TISSUE SPECIFICITY, SUBUNIT, AND INTERACTION WITH CASP1 AND
RP RIPK2.
RX PubMed=11432859; DOI=10.1074/jbc.m101415200;
RA Lee S.H., Stehlik C., Reed J.C.;
RT "Cop, a caspase recruitment domain-containing protein and inhibitor of
RT caspase-1 activation processing.";
RL J. Biol. Chem. 276:34495-34500(2001).
RN [7]
RP INTERACTION WITH CARD8.
RX PubMed=11821383; DOI=10.1074/jbc.m107811200;
RA Razmara M., Srinivasula S.M., Wang L., Poyet J.-L., Geddes B.J.,
RA DiStefano P.S., Bertin J., Alnemri E.S.;
RT "CARD-8 protein, a new CARD family member that regulates caspase-1
RT activation and apoptosis.";
RL J. Biol. Chem. 277:13952-13958(2002).
RN [8]
RP INDUCTION.
RX PubMed=16354923; DOI=10.1523/jneurosci.4181-05.2005;
RA Wang X., Wang H., Figueroa B.E., Zhang W.-H., Huo C., Guan Y., Zhang Y.,
RA Bruey J.-M., Reed J.C., Friedlander R.M.;
RT "Dysregulation of receptor interacting protein-2 and caspase recruitment
RT domain only protein mediates aberrant caspase-1 activation in Huntington's
RT disease.";
RL J. Neurosci. 25:11645-11654(2005).
RN [9]
RP FUNCTION, AND INTERACTION WITH CASP4.
RX PubMed=16920334; DOI=10.1016/j.bbadis.2006.06.015;
RA Wang X., Narayanan M., Bruey J.-M., Rigamonti D., Cattaneo E., Reed J.C.,
RA Friedlander R.M.;
RT "Protective role of Cop in Rip2/caspase-1/caspase-4-mediated HeLa cell
RT death.";
RL Biochim. Biophys. Acta 1762:742-754(2006).
RN [10]
RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RX PubMed=21269460; DOI=10.1186/1752-0509-5-17;
RA Burkard T.R., Planyavsky M., Kaupe I., Breitwieser F.P., Buerckstuemmer T.,
RA Bennett K.L., Superti-Furga G., Colinge J.;
RT "Initial characterization of the human central proteome.";
RL BMC Syst. Biol. 5:17-17(2011).
CC -!- FUNCTION: Caspase inhibitor. Acts as a regulator of procaspase-1/CASP1
CC activation implicated in the regulation of the proteolytic maturation
CC of pro-interleukin-1 beta (IL1B) and its release during inflammation.
CC Inhibits the release of IL1B in response to LPS in monocytes. Also
CC induces NF-kappa-B activation during the pro-inflammatory cytokine
CC response. Also able to inhibit CASP1-mediated neuronal cell death, TNF-
CC alpha, hypoxia-, UV-, and staurosporine-mediated cell death but not ER
CC stress-mediated cell death. Acts by preventing activation of caspases
CC CASP1 and CASP4, possibly by preventing the interaction between CASP1
CC and RIPK2. {ECO:0000269|PubMed:11432859, ECO:0000269|PubMed:11536016,
CC ECO:0000269|PubMed:16920334}.
CC -!- SUBUNIT: Homooligomer. Interacts with CASP1, CASP4, CARD8 and RIPK2.
CC {ECO:0000269|PubMed:11432859, ECO:0000269|PubMed:11536016,
CC ECO:0000269|PubMed:11821383, ECO:0000269|PubMed:16920334}.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=2;
CC Name=1;
CC IsoId=Q5EG05-1; Sequence=Displayed;
CC Name=2;
CC IsoId=Q5EG05-2; Sequence=VSP_035216;
CC -!- TISSUE SPECIFICITY: Widely expressed. Expressed at higher level in
CC placenta, spleen, lymph node and bone marrow. Weakly or not expressed
CC in thymus. {ECO:0000269|PubMed:11432859, ECO:0000269|PubMed:11536016}.
CC -!- INDUCTION: Down-regulated in patients suffering of Huntington disease.
CC {ECO:0000269|PubMed:16354923}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AF367017; AAK71682.1; -; mRNA.
DR EMBL; AY885669; AAW78563.1; -; mRNA.
DR EMBL; AK311902; BAG34843.1; -; mRNA.
DR EMBL; CH471065; EAW67062.1; -; Genomic_DNA.
DR EMBL; BC117478; AAI17479.1; -; mRNA.
DR EMBL; BC117480; AAI17481.1; -; mRNA.
DR CCDS; CCDS31661.1; -. [Q5EG05-1]
DR CCDS; CCDS41705.1; -. [Q5EG05-2]
DR RefSeq; NP_001017534.1; NM_001017534.1. [Q5EG05-1]
DR RefSeq; NP_443121.1; NM_052889.2. [Q5EG05-2]
DR SMR; Q5EG05; -.
DR BioGrid; 125339; 3.
DR IntAct; Q5EG05; 1.
DR MINT; Q5EG05; -.
DR STRING; 9606.ENSP00000364858; -.
DR iPTMnet; Q5EG05; -.
DR PhosphoSitePlus; Q5EG05; -.
DR BioMuta; CARD16; -.
DR DMDM; 74722547; -.
DR jPOST; Q5EG05; -.
DR MassIVE; Q5EG05; -.
DR MaxQB; Q5EG05; -.
DR PaxDb; Q5EG05; -.
DR PeptideAtlas; Q5EG05; -.
DR PRIDE; Q5EG05; -.
DR ProteomicsDB; 62773; -. [Q5EG05-1]
DR ProteomicsDB; 62774; -. [Q5EG05-2]
DR Ensembl; ENST00000375706; ENSP00000364858; ENSG00000204397. [Q5EG05-1]
DR Ensembl; ENST00000375704; ENSP00000364856; ENSG00000204397. [Q5EG05-2]
DR GeneID; 114769; -.
DR KEGG; hsa:114769; -.
DR UCSC; uc001pio.2; human. [Q5EG05-1]
DR CTD; 114769; -.
DR DisGeNET; 114769; -.
DR GeneCards; CARD16; -.
DR HGNC; HGNC:33701; CARD16.
DR HPA; HPA053981; -.
DR HPA; HPA062805; -.
DR MIM; 615680; gene.
DR neXtProt; NX_Q5EG05; -.
DR OpenTargets; ENSG00000204397; -.
DR PharmGKB; PA164717628; -.
DR eggNOG; KOG3573; Eukaryota.
DR eggNOG; ENOG410ZQIE; LUCA.
DR GeneTree; ENSGT00940000159114; -.
DR HOGENOM; CLU_119795_0_0_1; -.
DR InParanoid; Q5EG05; -.
DR KO; K12806; -.
DR OMA; CITDICE; -.
DR OrthoDB; 1327703at2759; -.
DR PhylomeDB; Q5EG05; -.
DR TreeFam; TF330675; -.
DR GeneWiki; COP1; -.
DR GenomeRNAi; 114769; -.
DR Pharos; Q5EG05; Tbio.
DR PRO; PR:Q5EG05; -.
DR Proteomes; UP000005640; Chromosome 11.
DR RNAct; Q5EG05; protein.
DR Bgee; ENSG00000204397; Expressed in leukocyte and 170 other tissues.
DR Genevisible; Q5EG05; HS.
DR GO; GO:0097179; C:protease inhibitor complex; IDA:UniProtKB.
DR GO; GO:0032991; C:protein-containing complex; IDA:UniProtKB.
DR GO; GO:0050700; F:CARD domain binding; IPI:UniProtKB.
DR GO; GO:0089720; F:caspase binding; IPI:UniProtKB.
DR GO; GO:0004869; F:cysteine-type endopeptidase inhibitor activity; IDA:UniProtKB.
DR GO; GO:0042802; F:identical protein binding; IDA:UniProtKB.
DR GO; GO:0019900; F:kinase binding; IPI:UniProtKB.
DR GO; GO:0071456; P:cellular response to hypoxia; IDA:UniProtKB.
DR GO; GO:0071222; P:cellular response to lipopolysaccharide; IDA:UniProtKB.
DR GO; GO:0071494; P:cellular response to UV-C; IDA:UniProtKB.
DR GO; GO:0097340; P:inhibition of cysteine-type endopeptidase activity; IDA:UniProtKB.
DR GO; GO:0043154; P:negative regulation of cysteine-type endopeptidase activity involved in apoptotic process; IDA:UniProtKB.
DR GO; GO:0050713; P:negative regulation of interleukin-1 beta secretion; IDA:UniProtKB.
DR GO; GO:0031665; P:negative regulation of lipopolysaccharide-mediated signaling pathway; IDA:UniProtKB.
DR GO; GO:0032091; P:negative regulation of protein binding; IDA:UniProtKB.
DR GO; GO:0010804; P:negative regulation of tumor necrosis factor-mediated signaling pathway; IMP:UniProtKB.
DR GO; GO:0043123; P:positive regulation of I-kappaB kinase/NF-kappaB signaling; IDA:UniProtKB.
DR GO; GO:0051092; P:positive regulation of NF-kappaB transcription factor activity; IDA:UniProtKB.
DR InterPro; IPR001315; CARD.
DR InterPro; IPR011029; DEATH-like_dom_sf.
DR Pfam; PF00619; CARD; 1.
DR SMART; SM00114; CARD; 1.
DR SUPFAM; SSF47986; SSF47986; 1.
DR PROSITE; PS50209; CARD; 1.
PE 1: Evidence at protein level;
KW Alternative splicing; Polymorphism; Protease inhibitor; Reference proteome;
KW Thiol protease inhibitor.
FT CHAIN 1..197
FT /note="Caspase recruitment domain-containing protein 16"
FT /id="PRO_0000349180"
FT DOMAIN 1..91
FT /note="CARD"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00046"
FT VAR_SEQ 92..197
FT /note="ALQAVQDNPAMPTCSSPEGRIKLCFLEDAQRIWKQKLQRCHVQNTIIKWSER
FT YTSGSFEMQWLFLRTNFIERFWRNILLLPLHKGSLYPRIPGLGKELQTGTHKLS -> G
FT PIPGN (in isoform 2)"
FT /evidence="ECO:0000303|PubMed:11536016,
FT ECO:0000303|PubMed:14702039, ECO:0000303|PubMed:15489334"
FT /id="VSP_035216"
FT VARIANT 33
FT /note="R -> S (in dbSNP:rs35966314)"
FT /id="VAR_046279"
FT VARIANT 37
FT /note="Q -> K (in dbSNP:rs1042744)"
FT /id="VAR_046280"
FT VARIANT 56
FT /note="A -> D (in dbSNP:rs34534919)"
FT /id="VAR_046281"
FT VARIANT 167
FT /note="N -> I (in dbSNP:rs542571)"
FT /id="VAR_046282"
SQ SEQUENCE 197 AA; 22625 MW; 5DCAC6A9B2FAE82F CRC64;
MADKVLKEKR KLFIHSMGEG TINGLLDELL QTRVLNQEEM EKVKRENATV MDKTRALIDS
VIPKGAQACQ ICITYICEED SYLAETLGLS AALQAVQDNP AMPTCSSPEG RIKLCFLEDA
QRIWKQKLQR CHVQNTIIKW SERYTSGSFE MQWLFLRTNF IERFWRNILL LPLHKGSLYP
RIPGLGKELQ TGTHKLS
//
As you can see, I need the amino acid sequence at the end of the entry.
My idea was to use a regex after the SQ line, with the expression (?<=SQ)(.*).
But this only gives me the
SQ SEQUENCE 197 AA; 22625 MW; 5DCAC6A9B2FAE82F CRC64;
line.
What I am looking for is the sequence itself:
MADKVLKEKR KLFIHSMGEG TINGLLDELL QTRVLNQEEM EKVKRENATV MDKTRALIDS
VIPKGAQACQ ICITYICEED SYLAETLGLS AALQAVQDNP AMPTCSSPEG RIKLCFLEDA
QRIWKQKLQR CHVQNTIIKW SERYTSGSFE MQWLFLRTNF IERFWRNILL LPLHKGSLYP
RIPGLGKELQ TGTHKLS
Could someone give me an idea of how I can get this?

We can try using re.findall twice:
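import re

inp = "..."  # the full text of the UniProt entry
# First isolate the block that starts at "SQ SEQUENCE" and runs through the residues.
sq = re.findall(r'SQ\s+SEQUENCE(.*?;.*?;.*?;.*?[A-Z]{5,}(?:\s+[A-Z]{5,})*)', inp, flags=re.DOTALL)[0]
# Then collect every run of five or more capital letters inside that block.
matches = re.findall(r'\b[A-Z]{5,}\b', sq, flags=re.DOTALL)
print(matches)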
This prints:
['MADKVLKEKR', 'KLFIHSMGEG', 'TINGLLDELL', 'QTRVLNQEEM', 'EKVKRENATV',
'MDKTRALIDS', 'VIPKGAQACQ', 'ICITYICEED', 'SYLAETLGLS', 'AALQAVQDNP',
'AMPTCSSPEG', 'RIKLCFLEDA', 'QRIWKQKLQR', 'CHVQNTIIKW', 'SERYTSGSFE',
'MQWLFLRTNF', 'IERFWRNILL', 'LPLHKGSLYP', 'RIPGLGKELQ', 'TGTHKLS']
The strategy here is to first isolate the ending portion of the text beginning with SQ SEQUENCE. Then, we use re.findall to repeatedly find the sequences you actually want.
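As an alternative sketch, the same result can be reached without a regex by relying on the layout shown in the example: the residues sit on the lines between the SQ line and the terminating // line. The function name extract_sequence and the shortened sample entry are assumptions made for illustration. Unlike the five-letter pattern above, this approach would also keep a final block shorter than five residues.

```python
def extract_sequence(entry):
    """Return the residues between the 'SQ' header line and the '//' terminator."""
    residues = []
    in_sequence = False
    for line in entry.splitlines():
        if line.startswith("SQ"):
            in_sequence = True      # sequence lines start after this header
            continue
        if line.startswith("//"):
            break                   # end of the entry
        if in_sequence:
            # Drop the spaces that separate the ten-residue blocks.
            residues.append("".join(line.split()))
    return "".join(residues)


# Shortened sample: only the last two sequence lines of the entry above.
SAMPLE = """ID CAR16_HUMAN Reviewed; 197 AA.
SQ SEQUENCE 197 AA; 22625 MW; 5DCAC6A9B2FAE82F CRC64;
QRIWKQKLQR CHVQNTIIKW SERYTSGSFE MQWLFLRTNF IERFWRNILL LPLHKGSLYP
RIPGLGKELQ TGTHKLS
//
"""

if __name__ == "__main__":
    print(extract_sequence(SAMPLE))
```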

Allocated table variation

PROGRAM satellite
IMPLICIT NONE
INTEGER :: i, j, ok, nc
REAL :: alph, bet, chi, ninf1, C1
REAL, DIMENSION(:), ALLOCATABLE :: uexact, x, Econs
REAL :: E, k, Lc, hc, eps, h
Read*,nc
E=25. ; k=125. ; hc=0.01 ; eps=0.01 ; Lc=1 ;
h = Lc/nc ; chi=sqrt((E*hc)/k) ; alph= -(1/h**2) ; bet=(2/h**2)+(k/(E*hc))
ALLOCATE(x(0:nc), uexact(0:nc), Econs(0:nc))
OPEN(UNIT=888,FILE="uetuexact.out",ACTION="write",STATUS='old')
OPEN(UNIT=889,FILE="consistance.out",ACTION="write",STATUS='old')
DO i = 0, nc
x(i) = h*i-Lc/2
uexact(i) = eps*chi*((sinh(x(i)/chi))/(cosh(Lc/(2*chi))))
END DO
!-------------------------------------------------------------------------------
DO i=1,nc-1
Econs(i)=(alph*(uexact(i-1)))+(bet*(uexact(i)))+(alph*(uexact(i+1)))
END DO
ninf1=maxval(Econs)
C1=ninf1*(nc**2)
DO i = 0, nc
WRITE(888,fmt='(3E15.6)') x(i), uexact(i)
WRITE(889,fmt='(3E15.6)') x(i), uexact(i), -Econs(i)
END DO
Print* , 'nc :', nc
Print* , 'h :', h
Print* , 'ninf1 :', ninf1
Print* , 'C1 :', C1
END PROGRAM satellite
I need nc to take the values 10, 50, 100, 500, 1000, 5000 and 10000 in turn, and to write out ninf1 and C1 for each of them. So far I have been setting nc manually, but I need an .out file with the columns nc | ninf1 | C1. How can I make nc run through exactly these values?

You could do the following:
define an array nsizes that would hold all the values you want nc to take.
declare a variable iter that would run along this array
iterate with iter over the length of nsizes
At the beginning of each iteration assign nc = nsizes(iter)
At the end of each iteration deallocate the arrays
Here is a patch that does just that.
--- satellite.f90 2020-02-16 18:13:35.662123215 +0700
+++ satellite_loop.f90 2020-02-16 18:50:09.662029872 +0700
@@ -1,11 +1,15 @@
PROGRAM satellite
IMPLICIT NONE
- INTEGER :: i, j, ok, nc
+ INTEGER :: i, j, ok, nc, iter
REAL :: alph, bet, chi, ninf1, C1
REAL, DIMENSION(:), ALLOCATABLE :: uexact, x, Econs
REAL :: E, k, Lc, hc, eps, h
+ INTEGER, DIMENSION(7) :: nsizes = (/ 10, 50, 100, 500, 1000, 5000, 10000/)
+
+ ! Read*,nc
+ do iter = 1, 7
+ nc = nsizes(iter)
- Read*,nc
E=25. ; k=125. ; hc=0.01 ; eps=0.01 ; Lc=1 ;
h = Lc/nc ; chi=sqrt((E*hc)/k) ; alph= -(1/h**2) ; bet=(2/h**2)+(k/(E*hc))
@@ -38,4 +42,8 @@
Print* , 'ninf1 :', ninf1
Print* , 'C1 :', C1
+deallocate(x, uexact, econs)
+
+end do
+
END PROGRAM satellite
The output would look like this:
nc : 10
h : 0.100000001
ninf1 : 1.17741162E-02
C1 : 1.17741168
nc : 50
h : 1.99999996E-02
ninf1 : 2.39971280E-03
C1 : 5.99928188
nc : 100
h : 9.99999978E-03
ninf1 : 7.46726990E-04
C1 : 7.46726990
nc : 500
h : 2.00000009E-03
ninf1 : 1.44958496E-04
C1 : 36.2396240
nc : 1000
h : 1.00000005E-03
ninf1 : 8.23974609E-04
C1 : 823.974609
nc : 5000
h : 1.99999995E-04
ninf1 : 2.73437500E-02
C1 : 683593.750
nc : 10000
h : 9.99999975E-05
ninf1 : 0.125000000
C1 : 12500000.0
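The loop over nsizes can also be sketched in Python, producing the nc | ninf1 | C1 table the question asks for. This is an illustration under stated assumptions rather than a translation of the patch: the names consistency_error and convergence_table are hypothetical, the maximum is taken over the interior points 1..nc-1 only (the Fortran maxval also covers the two end entries of Econs, which are never assigned), and Python floats are double precision, so the figures for large nc will not match the single-precision output above.

```python
import math


def consistency_error(nc, E=25.0, k=125.0, hc=0.01, eps=0.01, Lc=1.0):
    """Return (ninf1, C1) for a grid of nc intervals, using the question's formulas."""
    h = Lc / nc
    chi = math.sqrt(E * hc / k)
    alph = -1.0 / h**2
    bet = 2.0 / h**2 + k / (E * hc)
    x = [h * i - Lc / 2 for i in range(nc + 1)]
    uexact = [eps * chi * math.sinh(xi / chi) / math.cosh(Lc / (2 * chi)) for xi in x]
    # Interior points only (assumption; see the note above).
    econs = [alph * uexact[i - 1] + bet * uexact[i] + alph * uexact[i + 1]
             for i in range(1, nc)]
    ninf1 = max(econs)
    return ninf1, ninf1 * nc**2


def convergence_table(sizes):
    """Build the 'nc | ninf1 | C1' table as text, one row per grid size."""
    rows = ["nc | ninf1 | C1"]
    for nc in sizes:
        ninf1, c1 = consistency_error(nc)
        rows.append(f"{nc} | {ninf1:.6e} | {c1:.6e}")
    return "\n".join(rows)


if __name__ == "__main__":
    print(convergence_table([10, 50, 100, 500, 1000, 5000, 10000]))
```

Writing the returned text to a file of your choice gives the single .out table in one go.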

How to split a string between two character into sub groups in R

I have a list of codes in the second column of a table and I want to extract some elements of each code then store them in new columns associated with each of the codes.
Each code consists of letters, each followed by some digits. The letters are P, F, I, R, C, always in the same order, but the number of digits after each letter varies from code to code.
For example: consider the codes as below:
P1F2I235R15C145 P1 F2 I235 R15 C145
P24F1I12R124C96 P24 F1 I12 R124 C96
In this way I can split each code into its constituent sub-codes and store them in new columns of the same table.
thanks

Here's a possible stringi solution
library(stringi)
x <- c("P1F2I235R15C145","P24F1I12R124C96")
res <- stri_split_regex(x,"(?=([A-Za-z]=?))",perl = TRUE,simplify = TRUE,omit_empty = TRUE)
cbind.data.frame(x, res)
# x 1 2 3 4 5
# 1 P1F2I235R15C145 P1 F2 I235 R15 C145
# 2 P24F1I12R124C96 P24 F1 I12 R124 C96

Try this:
#simulate your data frame
df<-data.frame(code=c("P1F2I235R15C145","P24F1I12R124C96"),stringsAsFactors=FALSE)
#split the columns
cbind(df,do.call(rbind,regmatches(df$code,gregexpr("[PFIRC][0-9]+",df$code))))
# code 1 2 3 4 5
#1 P1F2I235R15C145 P1 F2 I235 R15 C145
#2 P24F1I12R124C96 P24 F1 I12 R124 C96
What @AnandaMatho suggested in the comments was to drop the letter at the front of each sub-code and name the columns accordingly. Something like this:
res<-cbind(df,do.call(rbind,regmatches(df$code,gregexpr("(?<=[PFIRC])[0-9]+",df$code,perl=TRUE))))
names(res)<-c("Code","P","F","I","R","C")
# Code P F I R C
#1 P1F2I235R15C145 1 2 235 15 145
#2 P24F1I12R124C96 24 1 12 124 96

A data.table solution:
library(data.table)
dt<-data.table(code=c("P1F2I235R15C145","P24F1I12R124C96"))
dt[,c("P","F","I","R","C"):=
lapply(c("P","F","I","R","C"),
function(x)regmatches(code,regexpr(paste0(x,"[0-9]+"),code)))]
> dt
code P F I R C
1: P1F2I235R15C145 P1 F2 I235 R15 C145
2: P24F1I12R124C96 P24 F1 I12 R124 C96
And if you do end up deciding to drop the letters from the front, a minor adjustment:
dt[,c("P","F","I","R","C"):=
lapply(c("P","F","I","R","C"),
function(x)regmatches(code,regexpr(paste0("(?<=",x,")[0-9]+"),
code,perl=T)))]
> dt
code P F I R C
1: P1F2I235R15C145 1 2 235 15 145
2: P24F1I12R124C96 24 1 12 124 96
Or using devel version of data.table (v1.9.5+):
dt[, c("P", "F", "I", "R", "C") :=
tstrsplit(code, "(?<=.)(?=[[:alpha:]][0-9]+)", perl=TRUE)]
# code P F I R C
# 1: P1F2I235R15C145 P1 F2 I235 R15 C145
# 2: P24F1I12R124C96 P24 F1 I12 R124 C96
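For comparison, the pattern used in the answers above, a letter from P, F, I, R, C followed by digits, can be sketched in Python. The function names split_code and split_code_numbers are hypothetical. The second variant corresponds to dropping the letters and using them as column names.

```python
import re


def split_code(code):
    """Split a code such as 'P1F2I235R15C145' into its letter+digits sub-codes."""
    return re.findall(r"[PFIRC][0-9]+", code)


def split_code_numbers(code):
    """Map each letter to the number that follows it."""
    return {letter: int(digits)
            for letter, digits in re.findall(r"([PFIRC])([0-9]+)", code)}


if __name__ == "__main__":
    for code in ("P1F2I235R15C145", "P24F1I12R124C96"):
        print(code, split_code(code), split_code_numbers(code))
```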

Perl: RegEX: Capture group multiple times

I'm developing a piece of code to filter a text as follows:
<DATA>
.SUBCKT SVI A B C D E F
+ G H I
+ J K L
.....
+ X Y Z
*.PININFO AA BB CC
*.PININFO DD EE FF
<DATA>
I need the output to be
A B C D E F
G H I
J K L
.....
X Y Z
I already made a regular expression to do so:
m/\.SUBCKT\s+SVI\s(.*)|\+(.*)/gm
The problem is that I have many similar sections in the input, but I only need to detect the + lines that follow the .SUBCKT SVI header, not those after any other header.
How can I match a group such as (\+\s+(.*)) many times? I want to capture every repetition of this group.
Any advice on how to write this expression?

Perhaps this is closer to what you need.
m/\.SUBCKT\s+SVI\s(.*)\n(\+\s+(.*)\n)*/gm

Does this do what you want? Note that it stops at the ..... because it doesn't begin with a + or .SUBCKT
It won't handle the case where a range of + lines is immediately followed by another .SUBCKT line; is that a problem?
use strict;
use warnings;
while ( <DATA> ) {
next unless my $in_range = s/^\.SUBCKT\s+// ... /^[^+]/;
next if $in_range =~ /E/;
s/^\S+\s+//;
print;
}
__DATA__
<DATA>
.SUBCKT SVI A B C D E F
+ G H I
+ J K L
.....
+ X Y Z
*.PININFO AA BB CC
*.PININFO DD EE FF
<DATA>
output
A B C D E F
G H I
J K L
Update
Here's a state machine version that deals with the special case described above
use strict;
use warnings;
my $state;
while ( <DATA> ) {
if ( /^\.SUBCKT\s+\S+\s+(.+)/ ) {
$state = 1;
print $1, "\n";
}
elsif ( /^\+\s+(.+)/ ) {
print $1, "\n" if $state;
}
else {
$state = 0;
}
}
__DATA__
<DATA>
.SUBCKT SVI A B C D E F
+ G H I
+ J K L
.SUBCKT SVI A B C D E F
+ M N O
+ P Q R
*.PININFO AA BB CC
*.PININFO DD EE FF
<DATA>
output
A B C D E F
G H I
J K L
A B C D E F
M N O
P Q R
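The same state machine can be sketched in Python. The function name subckt_pins is hypothetical, and one detail differs from the Perl version above: this sketch only switches the state on when the header names the requested subcircuit (SVI by default), since the question asks to ignore + lines under other headers.

```python
import re


def subckt_pins(lines, name="SVI"):
    """Collect the pin lists of every '.SUBCKT <name>' block, one string per line."""
    rows = []
    in_block = False
    for line in lines:
        header = re.match(r"\.SUBCKT\s+(\S+)\s+(.+)", line)
        if header:
            # A new header always resets the state; only the wanted name turns it on.
            in_block = header.group(1) == name
            if in_block:
                rows.append(header.group(2).strip())
        elif line.startswith("+"):
            if in_block:
                rows.append(line[1:].strip())   # continuation line
        else:
            in_block = False                    # anything else ends the block
    return rows


SAMPLE = """.SUBCKT SVI A B C D E F
+ G H I
+ J K L
.SUBCKT OTHER X Y
+ Z
.SUBCKT SVI A B C D E F
+ M N O
+ P Q R
*.PININFO AA BB CC
""".splitlines()

if __name__ == "__main__":
    print("\n".join(subckt_pins(SAMPLE)))
```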

I made use of @shawnt00's answer, modified the regular expression, and it did the job.
\.SUBCKT\s+SVI_TRX201TH\s(.*\n(\+\s+.*\n)*)
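The reason for wrapping everything in one outer group is that a repeated capture group keeps only its last repetition, so the whole block is captured once and split afterwards. A minimal Python sketch of that idea follows. The function name subckt_rows is hypothetical, and the subcircuit name is passed in as a parameter rather than hard-coded.

```python
import re


def subckt_rows(text, name="SVI"):
    """Capture the header's pin list plus all directly following '+' lines."""
    pattern = r"\.SUBCKT\s+" + re.escape(name) + r"\s(.*\n(?:\+\s+.*\n)*)"
    match = re.search(pattern, text)
    if match is None:
        return []
    # The outer group holds the whole block; strip the leading '+' per line.
    return [re.sub(r"^\+\s+", "", line) for line in match.group(1).splitlines()]


NETLIST = (
    ".SUBCKT SVI A B C D E F\n"
    "+ G H I\n"
    "+ J K L\n"
    ".....\n"
    "+ X Y Z\n"
    "*.PININFO AA BB CC\n"
)

if __name__ == "__main__":
    print(subckt_rows(NETLIST))
```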

Extracting text strings using data.table in R

I have a data.table similar to the following.
Data
library(data.table)
DT <- structure(list(N = 1:6, VN = c("v1", "v3", "v6", "v7a", "v18",
"v23"), T1 = c("bigby (wolf)", "white", "red (rose)", "piggy (straw)",
"(curse) beast", "prince"), T2 = c("jack (bean)", "snow (dwarves)",
"beard (blue)", "bhageera (jungle) mowgli (book)", "beauty",
"glass (slipper)"), T3 = c("hk (34)", "VL (r45)", "tg (h5)",
"tt (HG) (45)", "gh", "vlp"), Val = c(36, 25, 0.84, 12, 78, 258
)), .Names = c("N", "VN", "T1", "T2", "T3", "Val"), class = "data.frame", row.names = c(NA,
-6L))
setDT(DT)
DT
N VN T1 T2 T3 Val
1: 1 v1 bigby (wolf) jack (bean) hk (34) 36.00
2: 2 v3 white snow (dwarves) VL (r45) 25.00
3: 3 v6 red (rose) beard (blue) tg (h5) 0.84
4: 4 v7a piggy (straw) bhageera (jungle) mowgli (book) tt (HG) (45) 12.00
5: 5 v18 (curse) beast beauty gh 78.00
6: 6 v23 prince glass (slipper) vlp 258.00
I want to extract all the strings within parentheses from columns T1 and T2 to a new column C.
I can do it for single rows as follows.
Rowwise calculations
setDF(DT)
dtf <- c("T1", "T2")
paste(unique(unlist(regmatches(DT[4,dtf], gregexpr("(?=\\().*?(?<=\\))", DT[4,dtf], perl=T)))), collapse=" ")
[1] "(straw) (jungle) (book)"
paste(unique(unlist(regmatches(DT[3,dtf], gregexpr("(?=\\().*?(?<=\\))", DT[3,dtf], perl=T)))), collapse=" ")
[1] "(rose) (blue)"
I am not able to get similar results using data.table.
Try with data.table
setDT(DT)
DT[, C := paste(unique(unlist(regmatches(get(dtf), gregexpr("(?=\\().*?(?<=\\))", get(dtf), perl=T)))), collapse=" ")]
How can I use data.table to get the desired result?
Desired result
out <- structure(list(N = 1:6, VN = c("v1", "v3", "v6", "v7a", "v18",
"v23"), T1 = c("bigby (wolf)", "white", "red (rose)", "piggy (straw)",
"(curse) beast", "prince"), T2 = c("jack (bean)", "snow (dwarves)",
"beard (blue)", "bhageera (jungle) mowgli (book)", "beauty",
"glass (slipper)"), T3 = c("hk (34)", "VL (r45)", "tg (h5)",
"tt (HG) (45)", "gh", "vlp"), Val = c(36, 25, 0.84, 12, 78, 258
), C = c("(wolf) (bean)", "(dwarves)", "(rose) (blue)", "(straw) (jungle) (book)",
"(curse)", "(slipper)")), .Names = c("N", "VN", "T1", "T2", "T3",
"Val", "C"), class = "data.frame", row.names = c(NA, -6L))
out
N VN T1 T2 T3 Val C
1 1 v1 bigby (wolf) jack (bean) hk (34) 36.00 (wolf) (bean)
2 2 v3 white snow (dwarves) VL (r45) 25.00 (dwarves)
3 3 v6 red (rose) beard (blue) tg (h5) 0.84 (rose) (blue)
4 4 v7a piggy (straw) bhageera (jungle) mowgli (book) tt (HG) (45) 12.00 (straw) (jungle) (book)
5 5 v18 (curse) beast beauty gh 78.00 (curse)
6 6 v23 prince glass (slipper) vlp 258.00 (slipper)

You can use by and .SDcols to do this.
setDT(DT)
dtf <- c("T1", "T2")
DT[, C := paste(unique(unlist(regmatches(.SD, gregexpr("(?=\\().*?(?<=\\))", .SD, perl=T)))),
collapse=" "),
by = N,
.SDcols = dtf]
DT
## N VN T1 T2 T3 Val C
## 1: 1 v1 bigby (wolf) jack (bean) hk (34) 36.00 (wolf) (bean)
## 2: 2 v3 white snow (dwarves) VL (r45) 25.00 (dwarves)
## 3: 3 v6 red (rose) beard (blue) tg (h5) 0.84 (rose) (blue)
## 4: 4 v7a piggy (straw) bhageera (jungle) mowgli (book) tt (HG) (45) 12.00 (straw) (jungle) (book)
## 5: 5 v18 (curse) beast beauty gh 78.00 (curse)
## 6: 6 v23 prince glass (slipper) vlp 258.00 (slipper)
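The per-row logic that by = N applies, collecting every parenthesised string from T1 and T2, dropping duplicates and joining with spaces, can be sketched in Python for readers who want to check the expected values independently. The function name parenthesised is hypothetical, and the rows are copied from the data above.

```python
import re


def parenthesised(*cells):
    """Return the unique '(...)' substrings of the given cells, in order of appearance."""
    seen = []
    for cell in cells:
        for match in re.findall(r"\([^()]*\)", cell):
            if match not in seen:
                seen.append(match)
    return " ".join(seen)


ROWS = [
    ("bigby (wolf)", "jack (bean)"),
    ("white", "snow (dwarves)"),
    ("red (rose)", "beard (blue)"),
    ("piggy (straw)", "bhageera (jungle) mowgli (book)"),
    ("(curse) beast", "beauty"),
    ("prince", "glass (slipper)"),
]

if __name__ == "__main__":
    for t1, t2 in ROWS:
        print(parenthesised(t1, t2))
```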
