How to create a multilevel (numbers/alphabetical/Roman) list in Wiki? - wiki

I am writing some documentation about our algorithm and its flow, and I am doing it in Wiki.
Please advise how I can create a multilevel list with
level 1 = 1,2,3...
level 2 = a,b,c...
level 3 = i,ii,iii...
level 4 = I,II,III...
I know I can use #, ## but I need a mixed one.
Thanks in advance!

The best way is to use Template:ordered list, which gives you the ability to create specific list styles. For example, if you need a lower-alpha style, use this code:
{{ordered list|type=lower-alpha
| Hello
| Word
}}
It will produce the following:
a. Hello
b. Word
See the other styles here: https://www.w3.org/TR/REC-CSS2/generate.html#lists
For more, see this page: https://en.wikipedia.org/wiki/Help:List#Changing_the_list_type

Related

One-line regex independent of the number of items

Can I have a one-line regex that matches the values between pipes ("|"), independent of the number of items between the pipes? E.g. I have the following regex:
^(.*?)\|(.*?)\|(.*?)\|(.*)\|(.*)\|(.*)\|(.*)\|(.*)\|(.*)\|(.*)\|(.*)\|(.*)$
which works only if I have 12 items. How can I make the same work for e.g. 6 items as well?
([^|]+)+
This is the pattern I've used in the past for that purpose. It matches one or more groups that do not contain the pipe delimiter.
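As a rough illustration of how this behaves in a general-purpose language (Python here, which is an assumption, since the question does not name one), `re.findall` with the inner character class `[^|]+` returns every field as a list, however many there are. The helper name `split_fields` is hypothetical.

```python
import re

def split_fields(value):
    """Return every non-empty pipe-delimited field in value (hypothetical helper)."""
    # [^|]+ matches one maximal run of characters that are not a pipe,
    # so findall yields one list entry per field, whatever the count.
    return re.findall(r"[^|]+", value)

print(split_fields("a|b|c|d|e|f"))     # six fields
print(split_fields("foo1|foo2|foo3"))  # three fields
```

Note that this skips empty fields (for example the gap in `a||b`); `value.split("|")` would keep them as empty strings.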
For Adobe Classification Rule Builder (CRB), there is no way to write a regex that will match an arbitrary number of occurrences of your pattern and push them to $n capture groups. Most regex engines do not allow for this, though some languages offer ways to do it more or less effectively, e.g. by returning the matches as an array. But CRB doesn't offer that sort of thing.
But it's mostly pointless to want this anyway, since nothing upstream or downstream dynamically accommodates it.
For example, there's no way in the CRB interface to dynamically populate the output value with an arbitrary $1$2$3[$n..] value, nor is there a way to dynamically generate an arbitrary number of rules in the rule set.
In addition, Adobe Analytics (AA) does not offer arbitrary on-the-fly classification column generation (unless you write a script using the Classification API, and there is no equivalent for CRBs).
For example if you have
s.eVar1='foo1|foo2';
And you want to classify this into 2 classification columns/reports, you have to go and create them in the classification interface. And then let's say your next value sent in is:
s.eVar1='foo1|foo2|foo3';
Well AA does not automatically create a new classification level for you; you have to go in and add a 3rd, and so on.
So overall, even though it is not possible to return an arbitrary number of captured groups $n in a CRB, there isn't really a reason you need to.
Perhaps it would help if you explain what you are actually trying to do overall? For example, what report(s) do you expect to see?
One common reason I see this sort of "wish" come up is when someone wants to track things like header or breadcrumb navigation links that have an arbitrary depth. So they push e.g. a breadcrumb
Home > Electronics > Computers > Monitors > LED Monitors
...or whatever to an eVar (but pipe delimited, based on your question), and then they want to break this up into classified columns.
And the problem is, it could be an arbitrary length. But as mentioned, setting up classifications and rules for them doesn't really accommodate this sort of thing.
Usually the best practice for a scenario like this is to look at the raw data and see how many levels represent the bulk of your data. For example, if you look at your raw eVar report and see that values with 5 or 6 levels exist, but most values have between 1 and 3 levels, then you should create 4 classification columns. The first 3 classifications represent the first 3 levels, and the 4th one will have everything else.
So going back to the example value:
Home|Electronics|Computers|Monitors|LED Monitors
You can have:
Level1 => Home
Level2 => Electronics
Level3 => Computers
Level4+ => Monitors|LED Monitors
Then you set up a CRB with 4 rules, one for each of the levels. And you'd use the same regex in all 4 rule rows:
^([^|]+)(?:\|([^|]+))?(?:\|([^|]+))?(?:\|(.+))?
Which will return the following captured groups to use in the CRB outputs:
$1 => Home
$2 => Electronics
$3 => Computers
$4 => Monitors|LED Monitors
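To check the captures outside CRB, here is a minimal sketch that runs the same regex through Python's `re` module. This assumes the CRB regex flavor treats this pattern the same way Python does; the function name `classify` is hypothetical.

```python
import re

# Same pattern as above: three single-level groups, then "everything else".
LEVELS = re.compile(r"^([^|]+)(?:\|([^|]+))?(?:\|([^|]+))?(?:\|(.+))?")

def classify(value):
    """Return the four capture groups ($1..$4); unmatched levels are None."""
    match = LEVELS.match(value)
    return match.groups() if match else None

print(classify("Home|Electronics|Computers|Monitors|LED Monitors"))
print(classify("foo1|foo2"))
```

Shorter values simply leave the later groups empty (`None` in Python), which is why the same regex can be reused in all four rule rows.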
Yeah, this isn't the same as having a classification column for every possible length, but it is more practical, because when it comes to analytics, you shouldn't really try to be too granular about things in the first place.
But if you absolutely need to have something for every possible amount of delimited values, you will need to find out what the max possible is and make that many, hard coded.
Or, as an alternative to classifications, consider one of the following:
Use a list prop
Use a list variable (e.g. list1)
Use a Merchandising eVar (product variable syntax)
This isn't exactly the same thing, and they each have their caveats, but you didn't provide details for what you are ultimately trying to get out of the reports, so this may or may not be something you can work with.
Well anyways, hopefully some of this is food for thought for you.

How to set the alignment, or how to find PyGObject documentation

I am trying to right-align some text content in a CellRenderer. Through several searches I found two approaches, but these do not work. With bold-setting, you need to enable this feature first, so I am guessing I also need to enable alignment setting, but have not found how to do this. This runs without exception:
r = Gtk.CellRendererText()
r.props.width_chars = 10
r.set_property('alignment', Pango.Alignment.RIGHT) # no effect
r.props.alignment = Pango.Alignment.RIGHT # no effect
r.props.weight_set = True # not really needed
r.props.weight = Pango.Weight.BOLD # works, output is bold
This is what I guessed from the bold example but does NOT work:
r.props.alignment_set = True
The error is: 'gi._gobject.GProps' object has no attribute 'alignment_set'
Looking at these references I do not find something on GProps:
GObject Ref Manual
Gtk3 Ref Manual
This resource does say something about alignment, but it is unclear to me how to convert this C code to Python/GObject:
Gnome Gtk3 Manual for C
My question is how to fix this problem, where is the ref manual for this PyGObject error message, or how should I code the right-alignment?
Update:
I am currently looking at this similar SO question for clues.
As for the comment of TingPing, I looked at the set_alignment() method, and tried these:
r.set_alignment(Pango.Alignment.RIGHT) # error: set_alignment() takes exactly 3 arguments (2 given)
r.set_alignment(200, 0) # no error, no effect
Besides, this method seems intended to add some pixels of padding on the left, which is not what I need: I want to align the text to the right of the cell space.
Update:
Perhaps the above code is good, but perhaps the CellRenderer() has no intrinsic width, so there is no excess space to put to the left of the content. I thought of this while considering simply left-padding my numeric cell content with spaces. Then I need to decide on the maximum field length. Perhaps the CellRenderer does not 'know' about a default field length. I added a setting for .props.width_chars, but unfortunately this did not cause any right-alignment.
Through this C example I tried this to right-align:
r.props.xalign = 1.0
and it works! The xalign is the fraction of free space 0..1.0 to put to the left of the text. The value 0.5 will center the text.
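The arithmetic behind that description can be sketched in plain Python. This is a simplified model of the statement above, not GTK's actual code: it ignores cell padding and right-to-left layouts, and the function name `text_x_offset` is hypothetical.

```python
def text_x_offset(cell_width, text_width, xalign):
    """Offset of the text from the left edge of the cell.

    xalign is the fraction (0.0 to 1.0) of the free space placed
    to the left of the text. Widths are in the same unit, e.g. pixels.
    """
    free_space = max(cell_width - text_width, 0)  # no free space if the text fills the cell
    return xalign * free_space

print(text_x_offset(100, 40, 0.0))  # left-aligned
print(text_x_offset(100, 40, 0.5))  # centered
print(text_x_offset(100, 40, 1.0))  # right-aligned
```

This also suggests why alignment only becomes visible when the cell is wider than the text: with no free space, every xalign value gives the same offset.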

The most performant way to merge N lists, track duplicates, and sort them by date

I am new to Haskell and want to know the most efficient way to merge an arbitrary number of lists of an arbitrary number of items. Here's example data:
LIST 1: steve
2014-01-20 | cookies | steve
LIST 2: chris
2014-02-05 | cookies | chris
LIST 3: mark
2014-09-30 | brownies | mark
2014-03-30 | candy | mark
2014-05-12 | pie | mark
LIST 4: anthony
2014-05-18 | cookies | anthony
2013-12-25 | fudge | anthony
LIST 5: andy
2014-10-04 | cookies | andy
LIST 7: john
2014-06-19 | pie | john
RESULTING LIST
2014-10-04 | cookies | andy chris steve anthony
2014-09-30 | brownies | mark
2014-06-19 | pie | john mark
2014-03-30 | candy | mark
2013-12-25 | fudge | anthony
Notice the lists are all oriented around people and may or may not be sorted by date, and the result needs to merge the prior lists, group and create a list where the dessert is unique but has a list of the constituent people who ate it, sorted by date reverse chronologically.
I think the question of the most performant way to solve a problem is in most cases not answerable, in Haskell or in any other programming language.
A better approach would be to think about how to solve the problem at all, and to keep a few principles in the back of your mind:
testability
abstraction and expressiveness
maintainability
readability
performance
Maybe I've forgotten something, but for your problem I want to give a list of hints.
If I know all the items and names in advance, I would use algebraic datatypes to model this situation:
data Name = Mark | Chris ...
  deriving (Ord,Eq,Show)
data Item = Pie | Cookies ...
  deriving (Ord,Eq,Show)
If I do not already know how Haskell represents dates, I can use a plain old String to model this, or I would use hoogle to see if a date type already exists.
> hoogle date
...
Data.Time.Calendar...
...
So the Data.Time.Calendar module seems a good choice for that, and I would look at its documentation, which can be found online; or, if you install the package locally, you can use haddock to generate it yourself from the source files.
The next step would be to model the "database". Of course there are libraries for working with SQL-like stores, or acid-state, a database that uses algebraic datatypes instead of a database backend. But to get a better grasp of Haskell I would reinvent the wheel for once and use either a list of tuples or a dictionary-like collection, which in Haskell is called Map. When working with Map one has to be careful and do a qualified import, as most of its functions would otherwise collide with names in the standard library (Prelude).
import qualified Data.Map as M
To model my database I would use the Item as key and a tuple of date and list of names as value; and since I want to be aware that this is my database, I would provide a type alias for it.
type DB = M.Map Item (Date, [Name])
To work with that, I would again have a glance at the Map documentation and be happy to find the functions insertWith, empty and toList. For insertWith I would think of a combining function that mixes max (for the date) and list cons (:) (for the names) to make new entries.
To get a better feel for the whole thing I would fire up ghci and import qualified Data.Map as M and fool around with some examples using M.Map String (String,[Int]) or whatnot to model my data in a first approximation.
For the result I have to sort the toList of my Map by date, which is only a small problem. The type of toList myDb is [(Item,(Date,[Name]))], so sorting on fst . snd with sortBy, e.g. sortBy (comparing (fst . snd)), and reversing for newest-first, should lead to the desired result.
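To make the data flow concrete, here is a minimal sketch of the same idea written in Python rather than Haskell, purely as an illustration: a dict plays the role of the Map, the update step mirrors the max-plus-cons combining function, and the final sort is newest-first. The function name `merge_lists` and the tuple layout are assumptions, not part of the question.

```python
from datetime import date

def merge_lists(lists):
    """Merge per-person lists of (day, dessert, name) into one list per dessert."""
    db = {}  # dessert -> (latest day, names), like Map Item (Date, [Name])
    for entries in lists:
        for day, dessert, name in entries:
            if dessert in db:
                latest, names = db[dessert]
                # keep the newest date and add the name (the "max and cons" step)
                db[dessert] = (max(latest, day), names + [name])
            else:
                db[dessert] = (day, [name])
    rows = [(day, dessert, names) for dessert, (day, names) in db.items()]
    return sorted(rows, key=lambda row: row[0], reverse=True)  # newest first

steve = [(date(2014, 1, 20), "cookies", "steve")]
andy = [(date(2014, 10, 4), "cookies", "andy")]
mark = [(date(2014, 9, 30), "brownies", "mark"), (date(2014, 5, 12), "pie", "mark")]
for row in merge_lists([steve, andy, mark]):
    print(row)
```

In Haskell the loop body would collapse into a single insertWith call folded over all entries.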
After I'd done this much, I'd take a break and read something about parsers - to get all my files in context with my program. A search with the search engine of your least distrust will turn up a few articles worth reading (Parser Parsec Haskell).
If all of this is too complicated, I would go back and change all my types to Strings and hope I wouldn't have any typos until I had time to read about parsers again ;-).
For any problems in the intermediate steps, people here will be glad to help you, provided you give a concrete question/problem description.
If all of this were not performant enough, the profiling tools provided by Haskell are good enough to help me, but performance is the last concern I would address.

Best practices for managing workarounds (for broken data)

I have to work with government-provided data that is sometimes broken in strange ways. My code already contains snippets like:
for row in governmental_data:
    # XXX Workaround for that one row among thousands
    # that was mislabeled by a clerk and will not be fixed
    # before form A-320-Tango-5 is completed and submitted
    # on the first Sunday after a solstice.
    if row is the_spawn_of_satan:
        row = fix_row_A320(row)
    # XXX end of workaround
    process_row(row)
which before the error was just
for row in governmental_data:
    process_row(row)
I cannot make a mirror of the data with applied fixes, because the data is dynamic.
What can I do to manage these workarounds as they grow in number? Are there any best practices (besides "do not provide broken data to begin with")?
I suggest using the Decorator design pattern to handle this data conversion issue. The Wikipedia page has a coffee-making example. In the same way, every data conversion should be a decorator that takes a row, performs some operation on it, and gives back a row. This design pattern is well established. The Intercepting Filter design pattern is similar to this idea and is implemented in both Java (servlet filters) and .NET (ASP.NET MVC filters).
Your code should look like the following:
listOfDataConversionFilters = [XXXWorkaround, formA_320Tango5, ...]
for row in governmental_data:
    filteredRow = row
    for conversion in listOfDataConversionFilters:
        # each filter takes a row and returns a (possibly fixed) row
        filteredRow = conversion(filteredRow)
    process_row(filteredRow)
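A self-contained sketch of that filter chain might look as follows. The row layout (a dict with "id" and "label" keys) and the two filter functions are hypothetical, made up only to show the shape: each workaround lives in its own small function and passes unaffected rows through unchanged.

```python
def fix_row_a320(row):
    """Hypothetical workaround: correct one known mislabeled row."""
    if row.get("id") == 666:
        return {**row, "label": "corrected"}
    return row  # every other row passes through untouched

def strip_whitespace(row):
    """Hypothetical workaround: trim stray spaces from string fields."""
    return {key: value.strip() if isinstance(value, str) else value
            for key, value in row.items()}

# Adding a workaround means adding one function to this list.
DATA_CONVERSION_FILTERS = [fix_row_a320, strip_whitespace]

def apply_filters(row, filters=DATA_CONVERSION_FILTERS):
    """Feed the row through every filter in order and return the result."""
    for conversion in filters:
        row = conversion(row)
    return row

print(apply_filters({"id": 666, "label": " wrong "}))
print(apply_filters({"id": 1, "label": " ok "}))
```

Keeping each fix in a separate function also means each one can be unit-tested and deleted on its own once the upstream data is repaired.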

Is it possible to detect and handle string collisions among grouped values when grouping in Hadoop Pig?

Assuming I have lines of data like the following that show user names and their favorite fruits:
Alice\tApple
Bob\tApple
Charlie\tGuava
Alice\tOrange
I'd like to create a pig query that shows the favorite fruit of each user. If a user appears multiple times, then I'd like to show "Multiple". For example, the result with the data above should be:
Alice\tMultiple
Bob\tApple
Charlie\tGuava
In SQL, this could be done something like this (although it wouldn't necessarily perform very well):
select user, case when count(fruit) > 1 then 'Multiple' else max(fruit) end
from FruitPreferences
group by user
But I can't figure out the equivalent PigLatin. Any ideas?
Write an "Aggregate Function" Pig UDF (scroll down to "Aggregate Functions"). This is a user-defined function that takes a bag and outputs a scalar. So basically, your UDF would take in the bag, determine if there is more than one item in it, and transform it accordingly with an if statement.
I can think of a way of doing this without a UDF, but it is definitely awkward. After your GROUP, use SPLIT to split your data set into two: one in which the count is 1 and one in which the count is more than one:
SPLIT grouped INTO one IF COUNT(fruit) == 1, more IF COUNT(fruit) > 1;
Then, separately use FOREACH ... GENERATE on each to transform it:
one = FOREACH one GENERATE name, MAX(fruit); -- hack using MAX to get the item
more = FOREACH more GENERATE name, 'Multiple';
Finally, union them back:
out = UNION one, more;
I haven't really found a better way of handling the same data set in two different ways based on some conditional, like you want. I typically do some sort of split/recombine like I did here. I believe Pig will be smart and make a plan that doesn't use more than 1 M/R job.
Disclaimer: I can't actually test this code at the moment, so it may have some mistakes.
Update:
In looking harder, I was reminded of the bicond operator and I think that will work here.
b = FOREACH a GENERATE name, (COUNT(fruit)==1 ? MAX(fruit) : 'Multiple');
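Since the Pig snippets above are untested, a small reference version of the intended logic in plain Python (not Pig) can help check the output on a sample. It groups by user and then applies the same conditional as the bicond: one fruit gives that fruit, more than one gives 'Multiple'. The function name `favorite_fruits` is hypothetical.

```python
from collections import defaultdict

def favorite_fruits(rows):
    """Map each user to their single fruit, or 'Multiple' if they appear more than once."""
    grouped = defaultdict(list)  # the GROUP step: user -> bag of fruits
    for name, fruit in rows:
        grouped[name].append(fruit)
    # the bicond step: COUNT == 1 ? the fruit : 'Multiple'
    return {name: fruits[0] if len(fruits) == 1 else "Multiple"
            for name, fruits in grouped.items()}

rows = [("Alice", "Apple"), ("Bob", "Apple"), ("Charlie", "Guava"), ("Alice", "Orange")]
for name, fruit in favorite_fruits(rows).items():
    print(f"{name}\t{fruit}")
```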