Weka Apriori RHS - weka
I am mining association rules using the Apriori algorithm on the Titanic dataset with Weka v3.6. The 10 best rules are as follows:
1. Class=Crew 885 ==> Sex=Male Age=Adult 862 conf:(0.97) < lift:(1.29)
2. Sex=Male Age=Adult 1667 ==> Class=Crew 862 conf:(0.52) < lift:(1.29)
3. Class=Crew 885 ==> Sex=Male 862 conf:(0.97) < lift:(1.24)
4. Sex=Male 1731 ==> Class=Crew 862 conf:(0.5) < lift:(1.24)
5. Sex=Male 1731 ==> Class=Crew Age=Adult 862 conf:(0.5) < lift:(1.24)
6. Class=Crew Age=Adult 885 ==> Sex=Male 862 conf:(0.97) < lift:(1.24)
7. Sex=Male Age=Adult 1667 ==> Survived=No 1329 conf:(0.8) < lift:(1.18)
8. Survived=No 1490 ==> Sex=Male Age=Adult 1329 conf:(0.89) < lift:(1.18)
9. Sex=Male 1731 ==> Age=Adult Survived=No 1329 conf:(0.77) < lift:(1.18)
10. Age=Adult Survived=No 1438 ==> Sex=Male 1329 conf:(0.92) < lift:(1.18)
However, I wish to restrict the output to rules whose RHS contains only Survived ("No" or "Yes"). I know that in R this kind of RHS restriction can be expressed with
APappearance-class. Is it possible to achieve similar functionality in Weka?
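One workaround that does not depend on any Weka setting is to post-filter the printed rule list. The sketch below is only an illustration in Python: it assumes the rules are available as text lines in the format shown above, that attribute values contain no spaces, and that "RHS containing Survived only" means every item on the right-hand side is a Survived item. The names `RULE`, `rhs_attributes` and `keep_rule` are made up for this example. Weka's Apriori also appears to offer a class association rule mode (the `car` option together with a class index), which may be worth checking first; the sketch does not rely on it.

```python
import re

# One line of Weka's printed rule list, for example:
# "7. Sex=Male Age=Adult 1667 ==> Survived=No 1329 conf:(0.8) < lift:(1.18)"
RULE = re.compile(
    r"^\s*\d+\.\s+(?P<lhs>.+?)\s+\d+\s+==>\s+(?P<rhs>.+?)\s+\d+\s+conf:"
)


def rhs_attributes(line):
    """Return the attribute names on the right-hand side, or None if the line is not a rule."""
    match = RULE.match(line)
    if match is None:
        return None
    return [item.split("=", 1)[0] for item in match.group("rhs").split()]


def keep_rule(line, attribute="Survived"):
    """True when every RHS item of the rule refers to `attribute`."""
    attrs = rhs_attributes(line)
    return bool(attrs) and all(name == attribute for name in attrs)


# A few of the rules from the question, copied as text.
rules = [
    "1. Class=Crew 885 ==> Sex=Male Age=Adult 862 conf:(0.97) < lift:(1.29)",
    "7. Sex=Male Age=Adult 1667 ==> Survived=No 1329 conf:(0.8) < lift:(1.18)",
    "8. Survived=No 1490 ==> Sex=Male Age=Adult 1329 conf:(0.89) < lift:(1.18)",
    "9. Sex=Male 1731 ==> Age=Adult Survived=No 1329 conf:(0.77) < lift:(1.18)",
]

selected = [line for line in rules if keep_rule(line)]
for line in selected:
    print(line)
```

Because Weka only prints the requested number of best rules, a filter like this usually needs a larger rule count in the Apriori run so that enough Survived rules remain after filtering.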
All unique combinations of given length from list of values in Libre Office
I have several, let's say six, different values. Can be numbers from 1 to 6. I want to quickly list all unique combinations of four, so 1-2-3-4, 1-2-3-5 ... 3-4-5-6, all of them, but without any numbers showing more than once. I'd like to do it in Libre Office Calc or Libre Office Base, but thus far I haven't had much luck searching for a way to do it. I'd be really grateful for any ideas.
There you go:

1234 1235 1236 1243 1245 1246 1253 1254 1256 1263 1264 1265 1324 1325 1326 1342 1345 1346 1352 1354 1356 1362 1364 1365 1423 1425 1426 1432 1435 1436 1452 1453 1456 1462 1463 1465 1523 1524 1526 1532 1534 1536 1542 1543 1546 1562 1563 1564 1623 1624 1625 1632 1634 1635 1642 1643 1645 1652 1653 1654
2134 2135 2136 2143 2145 2146 2153 2154 2156 2163 2164 2165 2314 2315 2316 2341 2345 2346 2351 2354 2356 2361 2364 2365 2413 2415 2416 2431 2435 2436 2451 2453 2456 2461 2463 2465 2513 2514 2516 2531 2534 2536 2541 2543 2546 2561 2563 2564 2613 2614 2615 2631 2634 2635 2641 2643 2645 2651 2653 2654
3124 3125 3126 3142 3145 3146 3152 3154 3156 3162 3164 3165 3214 3215 3216 3241 3245 3246 3251 3254 3256 3261 3264 3265 3412 3415 3416 3421 3425 3426 3451 3452 3456 3461 3462 3465 3512 3514 3516 3521 3524 3526 3541 3542 3546 3561 3562 3564 3612 3614 3615 3621 3624 3625 3641 3642 3645 3651 3652 3654
4123 4125 4126 4132 4135 4136 4152 4153 4156 4162 4163 4165 4213 4215 4216 4231 4235 4236 4251 4253 4256 4261 4263 4265 4312 4315 4316 4321 4325 4326 4351 4352 4356 4361 4362 4365 4512 4513 4516 4521 4523 4526 4531 4532 4536 4561 4562 4563 4612 4613 4615 4621 4623 4625 4631 4632 4635 4651 4652 4653
5123 5124 5126 5132 5134 5136 5142 5143 5146 5162 5163 5164 5213 5214 5216 5231 5234 5236 5241 5243 5246 5261 5263 5264 5312 5314 5316 5321 5324 5326 5341 5342 5346 5361 5362 5364 5412 5413 5416 5421 5423 5426 5431 5432 5436 5461 5462 5463 5612 5613 5614 5621 5623 5624 5631 5632 5634 5641 5642 5643
6123 6124 6125 6132 6134 6135 6142 6143 6145 6152 6153 6154 6213 6214 6215 6231 6234 6235 6241 6243 6245 6251 6253 6254 6312 6314 6315 6321 6324 6325 6341 6342 6345 6351 6352 6354 6412 6413 6415 6421 6423 6425 6431 6432 6435 6451 6452 6453 6512 6513 6514 6521 6523 6524 6531 6532 6534 6541 6542 6543

PS: I don't think there is a way to generate them in LibreOffice itself, since I'm not aware of a programming language in that program; however, you can compute them online or with your own script.

If you need the script, save this code in a .html file and open it in a browser:

    <html>
    <body>
    <script>
    // Advance the index array like an odometer in base n.
    // Returns false once every position has wrapped around to 0.
    function updateIndexes(arr, n) {
      for (let i = 0; i < arr.length; i++) {
        if (arr[i] < n - 1) {
          arr[i]++;
          return true;
        }
        arr[i] = 0;
      }
      return false;
    }

    let from = [1, 2, 3, 4, 5, 6].map((el) => el.toString());
    let length = 4;
    let separator = '-';
    let indexes = Array(length).fill(0);
    let results = [];

    // Generate every sequence of the given length, repeats included.
    do {
      results.push(indexes.map(index => from[index]).join(separator));
    } while (updateIndexes(indexes, from.length));

    let body = document.getElementsByTagName('body')[0];
    // Keep only the sequences in which no character repeats, then print them.
    // The check compares single characters, so it assumes one-character values.
    results.filter((el) => {
      for (let i = 0; i < el.length; i++)
        for (let j = i + 1; j < el.length; j++)
          if (el.charAt(i) == el.charAt(j) && el.charAt(i) != separator) return false;
      return true;
    }).forEach(el => body.innerHTML += el.toString() + '<br>');
    </script>
    </body>
    </html>

What you can customize:

    let from = [1,2,3,4,5,6];   the numbers/letters you want
    let length = 4;             the length of each sequence you want
    let separator = '-'         the separator placed between the values of each generated sequence (so in this case a result looks like 1-2-3-4)
Python has a library called itertools that does this.

    import itertools

    # permutations of the numbers 1 to 6, taken 4 at a time
    l = itertools.permutations(range(1, 7), 4)
    for t in list(l):
        print("{}, ".format("-".join(str(i) for i in t)), end='')

Result:

1-2-3-4, 1-2-3-5, 1-2-3-6, 1-2-4-3, 1-2-4-5, 1-2-4-6, 1-2-5-3, 1-2-5-4, 1-2-5-6, 1-2-6-3, 1-2-6-4, 1-2-6-5, 1-3-2-4, 1-3-2-5, 1-3-2-6, 1-3-4-2, 1-3-4-5, 1-3-4-6, 1-3-5-2, 1-3-5-4, 1-3-5-6, 1-3-6-2, 1-3-6-4, 1-3-6-5, 1-4-2-3, 1-4-2-5, 1-4-2-6, 1-4-3-2, 1-4-3-5, 1-4-3-6, 1-4-5-2, 1-4-5-3, 1-4-5-6, 1-4-6-2, 1-4-6-3, 1-4-6-5, 1-5-2-3, 1-5-2-4, 1-5-2-6, 1-5-3-2, 1-5-3-4, 1-5-3-6, 1-5-4-2, 1-5-4-3, 1-5-4-6, 1-5-6-2, 1-5-6-3, 1-5-6-4, 1-6-2-3, 1-6-2-4, 1-6-2-5, 1-6-3-2, 1-6-3-4, 1-6-3-5, 1-6-4-2, 1-6-4-3, 1-6-4-5, 1-6-5-2, 1-6-5-3, 1-6-5-4,
2-1-3-4, 2-1-3-5, 2-1-3-6, 2-1-4-3, 2-1-4-5, 2-1-4-6, 2-1-5-3, 2-1-5-4, 2-1-5-6, 2-1-6-3, 2-1-6-4, 2-1-6-5, 2-3-1-4, 2-3-1-5, 2-3-1-6, 2-3-4-1, 2-3-4-5, 2-3-4-6, 2-3-5-1, 2-3-5-4, 2-3-5-6, 2-3-6-1, 2-3-6-4, 2-3-6-5, 2-4-1-3, 2-4-1-5, 2-4-1-6, 2-4-3-1, 2-4-3-5, 2-4-3-6, 2-4-5-1, 2-4-5-3, 2-4-5-6, 2-4-6-1, 2-4-6-3, 2-4-6-5, 2-5-1-3, 2-5-1-4, 2-5-1-6, 2-5-3-1, 2-5-3-4, 2-5-3-6, 2-5-4-1, 2-5-4-3, 2-5-4-6, 2-5-6-1, 2-5-6-3, 2-5-6-4, 2-6-1-3, 2-6-1-4, 2-6-1-5, 2-6-3-1, 2-6-3-4, 2-6-3-5, 2-6-4-1, 2-6-4-3, 2-6-4-5, 2-6-5-1, 2-6-5-3, 2-6-5-4,
3-1-2-4, 3-1-2-5, 3-1-2-6, 3-1-4-2, 3-1-4-5, 3-1-4-6, 3-1-5-2, 3-1-5-4, 3-1-5-6, 3-1-6-2, 3-1-6-4, 3-1-6-5, 3-2-1-4, 3-2-1-5, 3-2-1-6, 3-2-4-1, 3-2-4-5, 3-2-4-6, 3-2-5-1, 3-2-5-4, 3-2-5-6, 3-2-6-1, 3-2-6-4, 3-2-6-5, 3-4-1-2, 3-4-1-5, 3-4-1-6, 3-4-2-1, 3-4-2-5, 3-4-2-6, 3-4-5-1, 3-4-5-2, 3-4-5-6, 3-4-6-1, 3-4-6-2, 3-4-6-5, 3-5-1-2, 3-5-1-4, 3-5-1-6, 3-5-2-1, 3-5-2-4, 3-5-2-6, 3-5-4-1, 3-5-4-2, 3-5-4-6, 3-5-6-1, 3-5-6-2, 3-5-6-4, 3-6-1-2, 3-6-1-4, 3-6-1-5, 3-6-2-1, 3-6-2-4, 3-6-2-5, 3-6-4-1, 3-6-4-2, 3-6-4-5, 3-6-5-1, 3-6-5-2, 3-6-5-4,
4-1-2-3, 4-1-2-5, 4-1-2-6, 4-1-3-2, 4-1-3-5, 4-1-3-6, 4-1-5-2, 4-1-5-3, 4-1-5-6, 4-1-6-2, 4-1-6-3, 4-1-6-5, 4-2-1-3, 4-2-1-5, 4-2-1-6, 4-2-3-1, 4-2-3-5, 4-2-3-6, 4-2-5-1, 4-2-5-3, 4-2-5-6, 4-2-6-1, 4-2-6-3, 4-2-6-5, 4-3-1-2, 4-3-1-5, 4-3-1-6, 4-3-2-1, 4-3-2-5, 4-3-2-6, 4-3-5-1, 4-3-5-2, 4-3-5-6, 4-3-6-1, 4-3-6-2, 4-3-6-5, 4-5-1-2, 4-5-1-3, 4-5-1-6, 4-5-2-1, 4-5-2-3, 4-5-2-6, 4-5-3-1, 4-5-3-2, 4-5-3-6, 4-5-6-1, 4-5-6-2, 4-5-6-3, 4-6-1-2, 4-6-1-3, 4-6-1-5, 4-6-2-1, 4-6-2-3, 4-6-2-5, 4-6-3-1, 4-6-3-2, 4-6-3-5, 4-6-5-1, 4-6-5-2, 4-6-5-3,
5-1-2-3, 5-1-2-4, 5-1-2-6, 5-1-3-2, 5-1-3-4, 5-1-3-6, 5-1-4-2, 5-1-4-3, 5-1-4-6, 5-1-6-2, 5-1-6-3, 5-1-6-4, 5-2-1-3, 5-2-1-4, 5-2-1-6, 5-2-3-1, 5-2-3-4, 5-2-3-6, 5-2-4-1, 5-2-4-3, 5-2-4-6, 5-2-6-1, 5-2-6-3, 5-2-6-4, 5-3-1-2, 5-3-1-4, 5-3-1-6, 5-3-2-1, 5-3-2-4, 5-3-2-6, 5-3-4-1, 5-3-4-2, 5-3-4-6, 5-3-6-1, 5-3-6-2, 5-3-6-4, 5-4-1-2, 5-4-1-3, 5-4-1-6, 5-4-2-1, 5-4-2-3, 5-4-2-6, 5-4-3-1, 5-4-3-2, 5-4-3-6, 5-4-6-1, 5-4-6-2, 5-4-6-3, 5-6-1-2, 5-6-1-3, 5-6-1-4, 5-6-2-1, 5-6-2-3, 5-6-2-4, 5-6-3-1, 5-6-3-2, 5-6-3-4, 5-6-4-1, 5-6-4-2, 5-6-4-3,
6-1-2-3, 6-1-2-4, 6-1-2-5, 6-1-3-2, 6-1-3-4, 6-1-3-5, 6-1-4-2, 6-1-4-3, 6-1-4-5, 6-1-5-2, 6-1-5-3, 6-1-5-4, 6-2-1-3, 6-2-1-4, 6-2-1-5, 6-2-3-1, 6-2-3-4, 6-2-3-5, 6-2-4-1, 6-2-4-3, 6-2-4-5, 6-2-5-1, 6-2-5-3, 6-2-5-4, 6-3-1-2, 6-3-1-4, 6-3-1-5, 6-3-2-1, 6-3-2-4, 6-3-2-5, 6-3-4-1, 6-3-4-2, 6-3-4-5, 6-3-5-1, 6-3-5-2, 6-3-5-4, 6-4-1-2, 6-4-1-3, 6-4-1-5, 6-4-2-1, 6-4-2-3, 6-4-2-5, 6-4-3-1, 6-4-3-2, 6-4-3-5, 6-4-5-1, 6-4-5-2, 6-4-5-3, 6-5-1-2, 6-5-1-3, 6-5-1-4, 6-5-2-1, 6-5-2-3, 6-5-2-4, 6-5-3-1, 6-5-3-2, 6-5-3-4, 6-5-4-1, 6-5-4-2, 6-5-4-3,

LibreOffice allows Python scripting, so the code can be added to Calc or Base by including it in a Python-UNO macro.
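The lists above are permutations: 1-2-3-4 and 1-2-4-3 both appear. The question's own examples (1-2-3-4, 1-2-3-5 ... 3-4-5-6) suggest that order may not matter, in which case only 15 combinations are needed. A minimal sketch under that assumption, using `itertools.combinations` from the standard library (the variable names are illustrative):

```python
import itertools

values = range(1, 7)   # the six values 1..6
size = 4               # how many values per combination

# combinations() yields each selection once, in sorted order, with no repeated values.
combos = ["-".join(str(v) for v in combo)
          for combo in itertools.combinations(values, size)]

for combo in combos:
    print(combo)
```

If order does matter, `itertools.permutations` as in the answer above is the right call; the two differ only in whether rearrangements of the same values count as distinct.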
pandas - group by: create aggregation function using multiple columns
I have the following data frame:

    id   my_year  my_month  waiting_time  target
    001  2018     1         95            1
    002  2018     1         3             3
    003  2018     1         4             0
    004  2018     1         40            1
    005  2018     2         97            1
    006  2018     2         3             3
    007  2018     3         4             0
    008  2018     3         40            1

I want to group by my_year and my_month, and then in each group compute my_rate as (# of records with waiting_time <= 90 and target = 1) / (total records in the group), i.e. I am expecting output like:

    my_year  my_month  my_rate
    2018     1         0.25
    2018     2         0.0
    2018     3         0.5

I wrote the following code to compute the desired value my_rate:

    def my_rate(data):
        waiting_time_list = data['waiting_time']
        target_list = data['target']
        total = len(data)
        my_count = 0
        for i in range(len(data)):
            if waiting_time_list[i] <= 90 and target_list[i] == 1:
                my_count += 1
        rate = float(my_count)/float(total)
        return rate

    df.groupby(['my_year','my_month']).apply(my_rate)

However, I got the following error:

    KeyError 0
    KeyError                                  Traceback (most recent call last)
    <ipython-input-29-5c4399cefd05> in <module>()
         17
    ---> 18 df.groupby(['my_year','my_month']).apply(my_rate)

    /opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/groupby.pyc in apply(self, func, *args, **kwargs)
        714         # ignore SettingWithCopy here in case the user mutates
        715         with option_context('mode.chained_assignment', None):
    --> 716             return self._python_apply_general(f)
        717
        718     def _python_apply_general(self, f):

    /opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/groupby.pyc in _python_apply_general(self, f)
        718     def _python_apply_general(self, f):
        719         keys, values, mutated = self.grouper.apply(f, self._selected_obj,
    --> 720                                                    self.axis)
        721
        722         return self._wrap_applied_output(

    /opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/groupby.pyc in apply(self, f, data, axis)
       1727             # group might be modified
       1728             group_axes = _get_axes(group)
    -> 1729             res = f(group)
       1730             if not _is_indexed_like(res, group_axes):
       1731                 mutated = True

    <ipython-input-29-5c4399cefd05> in conversion_rate(data)
          8         #print total_waiting_time_list[i], target_list[i]
          9         #print i, total_waiting_time_list[i], target_list[i]
    ---> 10         if total_waiting_time_list[i] <= 90:# and target_list[i] == 1:
         11             convert_90_count += 1
         12             #print 'convert ', convert_90_count

    /opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/series.pyc in __getitem__(self, key)
        599         key = com._apply_if_callable(key, self)
        600         try:
    --> 601             result = self.index.get_value(self, key)
        602
        603             if not is_scalar(result):

    /opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/indexes/base.pyc in get_value(self, series, key)
       2426             try:
       2427                 return self._engine.get_value(s, k,
    -> 2428                                               tz=getattr(series.dtype, 'tz', None))
       2429             except KeyError as e1:
       2430                 if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4363)()
    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4046)()
    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5085)()
    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item (pandas/_libs/hashtable.c:13913)()
    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item (pandas/_libs/hashtable.c:13857)()

    KeyError: 0

Any idea what I did wrong here? And how do I fix it? Thanks!
I believe it is better to use the mean of a boolean mask per group:

    def my_rate(x):
        return ((x['waiting_time'] <= 90) & (x['target'] == 1)).mean()

    df = df.groupby(['my_year','my_month']).apply(my_rate).reset_index(name='my_rate')
    print (df)
       my_year  my_month  my_rate
    0     2018         1     0.25
    1     2018         2     0.00
    2     2018         3     0.50

"Any idea what I did wrong here?"

The problem is that waiting_time_list and target_list are not lists, but Series:

    waiting_time_list = data['waiting_time']
    target_list = data['target']
    print (type(waiting_time_list))
    <class 'pandas.core.series.Series'>
    print (type(target_list))
    <class 'pandas.core.series.Series'>

Indexing a Series with [i] looks up the index label i, not the position. Each group keeps the row labels of the original DataFrame, so in the second group the labels are 4,5, not 0,1, and this line fails with KeyError: 0:

    if waiting_time_list[i] <= 90 and target_list[i] == 1:

To avoid this, it is possible to convert the Series to lists:

    waiting_time_list = data['waiting_time'].tolist()
    target_list = data['target'].tolist()
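As a small illustration of both points in this answer, the sketch below rebuilds the question's data, computes the rate without `apply` by averaging a boolean mask per group, and then shows the label-versus-position difference on the second group. It is one possible formulation, not the only one; the variable names `hit`, `rates` and `second` are chosen for this example.

```python
import pandas as pd

# The data frame from the question (the id column is omitted, it is not needed here).
df = pd.DataFrame({
    "my_year": [2018] * 8,
    "my_month": [1, 1, 1, 1, 2, 2, 3, 3],
    "waiting_time": [95, 3, 4, 40, 97, 3, 4, 40],
    "target": [1, 3, 0, 1, 1, 3, 0, 1],
})

# Row-wise condition as a boolean Series aligned with df's index.
hit = (df["waiting_time"] <= 90) & (df["target"] == 1)

# The mean of 0/1 values per group is the share of matching rows in that group.
rates = (
    hit.astype(float)
       .groupby([df["my_year"], df["my_month"]])
       .mean()
       .reset_index(name="my_rate")
)
print(rates)

# Label vs. position: the rows of the second group keep their original labels 4 and 5.
second = df[df["my_month"] == 2]["waiting_time"]
print(list(second.index))   # the labels of the second group
print(second.iloc[0])       # positional access works regardless of the labels
```

Inside a loop such as the one in the question, `.iloc[i]` would therefore be another way to get positional access, although the vectorised mask avoids the loop altogether.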
SAS - plot actual and ARIMA model
How to plot in SAS, the estimated ARIMA model with the actual data on the same graph? The plot I've got, using the code below, does not show the actual and the model clearly. The model estimated is MA(15). data project; input dj 1-6 aus 7-14; datalines; 3651 1962.2 3645 1977.1 3626 1968.4 3634 1952.0 3620.5 1962.5 3607 1967.8 3589 1939.5 3590 1931.4 3622 1941.5 3634 1938.3 3616 1912.9 3634 1903.6 3631 1902.6 3613 1925.5 3576 1924.1 3537 1925.2 3547 1919.3 3540 1928.6 3543 1946.5 3568 1943.0 3566 1942.3 3566 1951.4 3555 1964.4 3581 1972.7 3578 1977.0 3587 1998.5 3599 2018.8 3584 2022.5 3585 2026.2 3593 2039.8 3593 2028.0 3603 2038.6 3622 2062.0 3630 2074.1 3642 2085.5 3635 2075.5 3645 2051.7 3636 2060.4 3649 2061.4 3674 2046.9 3672 2055.7 3665 2068.3 3688 2076.3 3681 2112.2 3693 2132.4 3698 2125.3 3662 2108.4 3625 2101.6 3643 2079.9 3648 2054.2 3640 2050.8 3664 2042.9 3662 2052.4 3684 2074.0 3678 2082.9 3711 2083.8 3704 2104.3 3685 2108.0 3694 2083.2 3670 2049.3 3674 2009.6 3688 2032.4 3686 2042.0 3684 2043.1 3678 2010.3 3684 2009.4 3697 2005.4 3702 2047.3 3704 2047.4 3710 2053.7 3719 2073.9 3734 2096.0 3730 2095.7 3741 2084.9 3764 2094.5 3743 2086.6 3717 2069.9 3726 2074.8 3752 2080.2 3755 2076.0 3745 2067.0 3762 2053.2 3758 2068.8 3776 2089.2 3794 2126.9 3776 2154.5 3757 2173.6 3784 2174.3 3799 2193.4 3804 2200.3 3821 2186.0 3866 2198.6 3850 2206.7 3849 2195.6 3842 2177.5 3867 2206.4 3870 2238.2 3870 2232.1 3884 2248.2 3892 2266.2 3914 2250.3 3913 2224.5 3895 2221.9 3926 2250.7 3945 2259.9 3978 2310.8 3964 2310.1 3976 2312.1 3968 2340.6 3871 2332.8 3906 2281.1 3906 2305.4 3932 2270.9 3895 2234.3 3895 2241.4 3904 2238.6 3928 2234.0 3937 2249.0 3923 2240.9 3888 2223.2 3900 2178.5 3912 2202.5 3892 2218.9 3840 2197.0 3839 2148.8 3832 2180.1 3809 2181.7 3832 2154.0 3824 2151.4 3832 2116.8 3856 2144.7 3852 2171.7 3853 2146.8 3831 2155.1 3863 2153.1 3863 2179.3 3850 2172.5 3848 2173.5 3865 2164.4 3896 2163.5 3865 2140.5 3863 2140.8 3869 2180.9 3821 2169.8 3775 2151.6 
3762 2108.9 3699 2100.8 3627 2092.4 3636 2053.1 3675 2050.0 3680 2084.1 3693 2087.4 3674 2082.0 3689 2076.0 3682 2095.1 3662 2114.7 3663 2095.0 3662 2080.6 3620 2095.9 3620 2061.4 3599 2046.6 3653 2029.6 3649 2042.5 3700 2069.4 3684 2059.7 3668 2069.1 3682 2066.1 3701 2047.9 3714 2044.2 3698 2018.4 3696 1988.1 3670 2004.3 3629 2009.3 3656 2008.2 3629 2034.6 3653 2041.4 3660 2070.0 3672 2110.9 3721 2096.0 3733 2107.8 3759 2093.7 3766 2103.9 3742 2121.0 3745 2132.4 3755 2105.9 3754 2096.9 3757 2102.2 3757.5 2091.8 3758 2081.8 3761 2097.2 3759 2077.0 3772 2078.6 3768 2072.5 3756 2070.2 3749 2079.7 3753 2076.7 3773 2069.4 3815 2076.6 3790 2074.4 3811 2056.0 3777 2051.2 3742 2024.4 3708 1993.6 3725 2010.9 3699 2022.5 3637 2017.9 3686 1957.4 3670 1974.4 3667 1975.1 3625 1989.1 3647 1965.8 3649.5 1987.1 3652 2003.4 3674 1991.2 3688 1962.2 3709 1964.9 3703 1961.2 3703 1972.9 3704 1978.6 3739 2007.7 3754 2058.0 3755 2072.3 3748 2077.4 3727 2078.6 3732 2049.2 3735 2052.5 3742 2048.3 3736 2041.3 3720 2041.7 3731 2042.1 3764 2061.5 3798 2082.1 3796 2086.9 3793 2072.3 3766 2083.5 3747 2091.9 3754 2081.1 3756 2086.8 3767 2076.5 3751 2062.8 3769 2052.0 3760 2055.9 3785 2040.0 3776 2059.5 3755 2066.8 3755 2061.3 3751 2063.6 3776 2051.6 3847 2061.1 3830 2077.8 3881 2077.2 3899 2111.8 3917 2116.5 3913 2122.1 3901 2105.5 3886 2107 3892.5 2095.5 3899 2103.6 3886 2104.4 3908 2089.1 3875 2070.6 3860 2032.9 3880 2043.6 3895 2050.5 3954 2050.8 3933 2059 3937 2049.1 3869 2045.1 3852 2026.6 3837 2028.2 3832 2027.7 3849 2030 3863 2013.8 3878 2014.2 3855 2030.6 3843 2028.7 3847 2030.9 3801 1998 3787 1979.8 3776 1976.3 3797 1967.5 3821 1988 3877 2003.6 3875 2002.6 3890 1998.9 3910 2006 3924 2014.2 3918 2003.4 3936 2013.4 3911 2016.3 3891 2034.6 3855 2034.2 ; run; proc print data= project; run; proc arima data=project; identify var=aus run; proc arima data = project plots(only)=(forecast(FORECAST)); identify var=aus(1) nlag=20; estimate q=(1,15); forecast lead = 90 out= results; run;
I'd suggest using the new SAS GTL (graphics template language) to do it. Plenty of examples can be found in the SAS documentation: http://support.sas.com/documentation/cdl/en/grstatgraph/65377/HTML/default/viewer.htm#p07ssfftzsass9n1x8lb94xnref5.htm Note that GTL is only available in newer versions of SAS (9.1 onwards I believe?).
Reducing thousands of compiler warnings
I have just started working with C++ code compiled in Visual Studio 2008. The default warning level on the project was set to 3 and there were no warnings. I turned this up to level 4, and it turns out that there are about 35000 warnings in our code. The majority of these warnings are unreferenced formal parameters, which I'd like to remove eventually. In the meantime, I would like to make sure that any level 3 or lower warnings stand out from the crowd, so I was wondering if there was a way of making these particular warnings be treated as errors. I'm aware that specific warnings can be tagged as errors, but I can't find any listings for error numbers. I was wondering if anybody might have any suggestions about how to deal with this?
You could make two separate build configurations, one showing warnings level 3, and one shows level 4 as well. Then when you're not working on fixing warnings, use the level 3 configuration. If you do go down this route, you might want to look into using property sheets, so you can reuse as much as possible of the configuration, instead of having to duplicate it. I don't think there's any way to treat warnings L1-3 as errors, while still allowing/showing L4 warnings.
I have now compiled a list of all the warnings at different levels. I used http://msdn.microsoft.com/en-us/library/8x5x43k7(v=VS.90).aspx as my reference. Here they are, use at your own peril:

    // level 2 & 3:
    #pragma warning ( error : 4008 )
    // level 1 & 3
    #pragma warning (error : 4793 )
    // level 1 & 4
    #pragma warning (error : 4112 4115 4223 4355 4949 4700)
    // level 2 & 4
    #pragma warning (error : 4200)
    // level 3 & 4
    #pragma warning (error : 4244)
    // nolevel:
    #pragma warning (error : 4335 4368 4394 4430 4439 4484 4485 4687 4693 4694 4801 4867)
    // level 1
    #pragma warning (error : 4002 4003 4005 4006 4010 4015 4020 4022 4024 4025 4026 4027 4028 4029 4030 4031 4033 4034 4036 4038 4041 4042 4045 4047 4048 4049 4052 4054 4055 4067 4068 4074 4075 4076 4077 4079 4080 4081 4083 4085 4086 4087 4088 4089 4090 4091 4096 4097 4098 4103 4109 4113 4114 4116 4117 4119 4120 4122 4124 4129 4137 4138 4141 4142 4143 4144 4145 4153 4154 4155 4157 4158 4160 4162 4163 4164 4165 4166 4167 4168 4172 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4185 4186 4187 4190 4215 4216 4218 4224 4226 4227 4228 4229 4230 4237 4251 4258 4264 4269 4272 4273 4276 4286 4288 4291 4293 4297 4303 4305 4311 4312 4313 4318 4319 4325 4326 4329 4333 4340 4342 4344 4346 4348 4350 4351 4353 4358 4364 4369 4374 4375 4376 4377 4378 4379 4381 4382 4383 4384 4391 4392 4393 4395 4397 4399 4401 4402 4403 4405 4406 4407 4409 4410 4411 4420 4440 4441 4445 4461 4470 4482 4486 4488 4489 4490 4502 4503 4506 4508 4518 4519 4526 4530 4532 4533 4537 4539 4540 4541 4544 4545 4546 4547 4548 4549 4550 4551 4552 4553 4555 4556 4558 4561 4566 4572 4581 4584 4600 4602 4606 4612 4613 4615 4616 4618 4620 4621 4624 4628 4630 4631 4632 4650 4651 4652 4655 4656 4657 4659 4661 4662 4667 4669 4674 4677 4678 4679 4683 4684 4685 4688 4691 4692 4711 4715 4716 4717 4722 4727 4729 4730 4731 4733 4734 4739 4742 4743 4744 4747 4772 4788 4789 4794 4799 4803 4804 4805 4806 4807 4808 4809 4810 4811 4812 4813 4817 4819 4821 4822 4829 4832 4835 4836 4900 4905 4906 4912 4917 4920 4925 4926 4927 4928 4929 4930 4935 4939 4944 4945 4946 4947 4951 4952 4953 4955 4964 4965 4997 4999)
    // level 2
    #pragma warning (error : 4007 4051 4056 4094 4099 4146 4150 4156 4244 4250 4275 4285 4302 4307 4308 4309 4345 4356 4412 4653 4756 4826 4948)
    // level 3
    #pragma warning (error : 4013 4018 4023 4060 4062 4065 4066 4069 4073 4101 4102 4133 4159 4161 4191 4192 4197 4231 4240 4243 4265 4267 4278 4280 4281 4282 4283 4287 4290 4306 4310 4334 4341 4357 4359 4390 4398 4404 4414 4509 4511 4520 4521 4522 4523 4534 4535 4538 4543 4554 4557 4570 4580 4608 4619 4622 4633 4635 4636 4637 4638 4640 4641 4645 4646 4686 4723 4724 4738 4748 4792 4800 4823 4980 4995 4996)
    // level 4
    #pragma warning (error : 4001 4019 4032 4053 4057 4061 4063 4064 4092 4100 4121 4125 4127 4130 4131 4132 4152 4189 4201 4202 4203 4204 4205 4206 4207 4208 4210 4211 4212 4213 4214 4220 4221 4232 4233 4234 4235 4238 4239 4242 4245 4254 4255 4256 4263 4266 4268 4289 4295 4296 4324 4336 4337 4339 4343 4347 4365 4366 4389 4400 4408 4428 4429 4431 4432 4433 4434 4460 4480 4481 4487 4505 4510 4512 4513 4514 4515 4516 4517 4536 4559 4564 4565 4571 4610 4611 4623 4625 4626 4629 4634 4639 4668 4670 4672 4673 4680 4681 4682 4690 4701 4702 4706 4709 4710 4714 4718 4725 4740 4764 4815 4816 4820 4913 4918 4931 4932 4937 4938 4960)
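With tens of thousands of warnings it can help to see which warning numbers dominate before deciding what to promote to errors or fix first. The following is a rough sketch of a log tally in Python. It assumes the build output has been saved as text and that each warning line contains the usual `warning Cnnnn` marker; the sample log lines and the names `WARNING` and `tally_warnings` are made up for illustration.

```python
import re
from collections import Counter

# Matches the warning number in a compiler output line, e.g. "warning C4100".
WARNING = re.compile(r"warning (C\d{4})")


def tally_warnings(log_lines):
    """Count how often each warning number occurs in the given build log lines."""
    counts = Counter()
    for line in log_lines:
        match = WARNING.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical excerpt of a build log; a real log would be read from a file.
sample_log = [
    r"c:\src\a.cpp(12) : warning C4100: 'argc' : unreferenced formal parameter",
    r"c:\src\a.cpp(40) : warning C4100: 'argv' : unreferenced formal parameter",
    r"c:\src\b.cpp(7) : warning C4244: '=' : conversion from 'double' to 'int', possible loss of data",
    r"c:\src\b.cpp(9) : error C2065: 'x' : undeclared identifier",
]

counts = tally_warnings(sample_log)
for code, n in counts.most_common():
    print(code, n)
```

The most frequent numbers in such a tally (here the unreferenced formal parameter warning) are candidates for temporary suppression, while the rare ones can be checked against the level lists above.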
improve my code for collapsing a list of data.frames
Dear StackOverFlowers (flowers in short), I have a list of data.frames (walk.sample) that I would like to collapse into a single (giant) data.frame. While collapsing, I would like to mark (by adding another column) which rows came from which element of the list. This is what I've got so far.

This is the list of data.frames that needs to be collapsed/stacked.

    > walk.sample
    [[1]]
         walker        x         y
    1073      3 228.8756 -726.9198
    1086      3 226.7393 -722.5561
    1081      3 219.8005 -728.3990
    1089      3 225.2239 -727.7422
    1032      3 233.1753 -731.5526

    [[2]]
         walker        x         y
    1008      3 205.9104 -775.7488
    1022      3 208.3638 -723.8616
    1072      3 233.8807 -718.0974
    1064      3 217.0028 -689.7917
    1026      3 234.1824 -723.7423

    [[3]]
    [1] 3

    [[4]]
         walker        x         y
    546       2 629.9041  831.0852
    524       2 627.8698  873.3774
    578       2 572.3312  838.7587
    513       2 633.0598  871.7559
    538       2 636.3088  836.6325
    1079      3 206.3683 -729.6257
    1095      3 239.9884 -748.2637
    1005      3 197.2960 -780.4704
    1045      3 245.1900 -694.3566
    1026      3 234.1824 -723.7423

I have written a function that adds a column denoting which element the rows came from and then appends the result to an existing data.frame.

    collapseToDataFrame <- function(x) {
      # collapse list to a dataframe with a twist
      walk.df <- data.frame()
      for (i in 1:length(x)) {
        n.rows <- nrow(x[[i]])
        if (length(x[[i]])>1) {
          temp.df <- cbind(x[[i]], rep(i, n.rows))
          names(temp.df) <- c("walker", "x", "y", "session")
          walk.df <- rbind(walk.df, temp.df)
        } else {
          cat("Empty list", "\n")
        }
      }
      return(walk.df)
    }

    > collapseToDataFrame(walk.sample)
    Empty list
    Empty list
         walker         x          y session
    3         1 -604.5055 -123.18759       1
    60        1 -562.0078  -61.24912       1
    84        1 -594.4661  -57.20730       1
    9         1 -604.2893 -110.09168       1
    43        1 -632.2491  -54.52548       1
    1028      3  240.3905 -724.67284       1
    1040      3  232.5545 -681.61225       1
    1073      3  228.8756 -726.91980       1
    1091      3  209.0373 -740.96173       1
    1036      3  248.7123 -694.47380       1

I'm curious whether this can be done more elegantly, perhaps with do.call() or some other more generic function?
I think this will work...

    lengths <- sapply(walk.sample, function(x) if (is.null(nrow(x))) 0 else nrow(x))
    cbind(do.call(rbind, walk.sample[lengths > 1]),
          session = rep(1:length(lengths), ifelse(lengths > 1, lengths, 0)))
I'm not claiming this to be the most elegant approach, but I think it works:

    library(plyr)
    ldply(sapply(1:length(walk.sample), function(i)
      if (length(walk.sample[[i]]) > 1)
        cbind(walk.sample[[i]], session=rep(i, nrow(walk.sample[[i]])))
    ), rbind)

EDIT: After applying Marek's apt remarks:

    do.call(rbind, lapply(1:length(walk.sample), function(i)
      if (length(walk.sample[[i]]) > 1)
        cbind(walk.sample[[i]], session=i)
    ))