How to put Amazon Mechanical Turkers in a waitlist? - django

I am a social psychology researcher. I have developed an online game that identifies players' specific behavioral factors. Each game requires a specific number of players to play simultaneously. In addition, all players should pass a screening phase through which we identify their skills and my program matches players with similar skills to play with each other.
My problem is how to make the players go through the screening phase and wait for others to pass the screening phase before starting the games? Is there anything on MTurk like a wait list? What is the average number of users who participate in a typical study at the same time? Is it possible to make them wait till we reach a specific number of players in the wait list?

oTree supports this functionality. As user Thomas pointed out, wait pages accomplish it.
In particular, when defining a page class (much as you would define a view in Django), you set the attribute wait_for_all_groups to True. The app will not progress until all participants have arrived. The definition of the wait page can be as simple as the one below, but wait pages can also trigger methods, such as the reassignment of groups.
class MyWait(WaitPage):
    wait_for_all_groups = True
In the past, I have put such a wait page at the beginning of the instructions for the game.
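Independently of oTree, the waiting-list logic the question asks about (hold screened players until enough players of similar skill have arrived) can be sketched in plain Python. This is only an illustration and not oTree's API: the WaitingRoom class, the skill bucket labels, and the group size are hypothetical names chosen for the example.

```python
from collections import defaultdict


class WaitingRoom:
    """Hold screened players until enough similar-skill players are waiting."""

    def __init__(self, group_size):
        self.group_size = group_size
        # skill bucket -> player ids waiting, in arrival order
        self.waiting = defaultdict(list)

    def add_player(self, player_id, skill_bucket):
        """Register a screened player; return a full group, or None to keep waiting."""
        queue = self.waiting[skill_bucket]
        queue.append(player_id)
        if len(queue) < self.group_size:
            return None
        group = queue[:self.group_size]
        del queue[:self.group_size]
        return group


room = WaitingRoom(group_size=3)
first = room.add_player("p1", "high")   # keeps waiting
second = room.add_player("p2", "low")   # different bucket, keeps waiting
third = room.add_player("p3", "high")   # still only two "high" players
fourth = room.add_player("p4", "high")  # third "high" player completes a group
```

In a real study you would also need a timeout, since MTurk workers may leave if the group takes too long to fill.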

Related

Redis for handling high-concurrency and limited-capacity model?

I have a legacy system for managing courses at the university. Every half year, this happens:
limited capacity course (30 people) opens
1000 people trying to enroll in that course at the same time (literally waiting at computers to hit the "enroll" button at 8:00am sharp)
dozens/hundreds of courses like that, thousands of people in the system fighting for free slots at the same time
system goes down...
I wonder if Redis could help here. I cannot replace the legacy system (PHP based). I cannot spread the load either - all people have to have equal opportunity here.
My questions, please:
is Redis a good solution here?
Which data types and commands would you use for this use case? A rough outline of a potential solution would be highly appreciated. I think it would be something with INCR, but I am not sure how to put it together with the rest.
Can this realistically be handled in (semi) real time? I.e., if 1000 people hit the enroll button, 30 of them get the "yes" answer immediately, and the rest get the "no" answer also immediately (a matter of seconds, at most).
Thank you very much!
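One possible shape of the INCR idea mentioned in the question, as a rough sketch only: Redis INCR is atomic and returns the new value, so each request can take a ticket number and is accepted only if that number is within capacity. To keep the example runnable without a server, InMemoryCounter below is a hypothetical stand-in for Redis; the key name, course id, and try_enroll function are likewise assumptions made for illustration.

```python
import threading


class InMemoryCounter:
    """Hypothetical stand-in for Redis: incr() is atomic and returns the new value."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def incr(self, key):
        with self._lock:
            self._values[key] = self._values.get(key, 0) + 1
            return self._values[key]


def try_enroll(store, course_id, capacity):
    """Return True if this request claimed one of the first `capacity` slots."""
    # Every request receives a unique ticket number from the atomic increment.
    ticket = store.incr(f"course:{course_id}:tickets")
    return ticket <= capacity


store = InMemoryCounter()
results = []  # list.append is thread-safe in CPython


def worker():
    results.append(try_enroll(store, "stats101", capacity=30))


# Simulate 200 students pressing "enroll" at the same moment.
threads = [threading.Thread(target=worker) for _ in range(200)]
for t in threads:
    t.start()
for t in threads:
    t.join()

accepted = sum(results)
```

Because the decision is a single atomic increment, every request gets its yes/no answer immediately; the legacy system would then only have to persist the accepted enrollments.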

Is Process Mining used just to infer business process models?

I have been reading about mining event logs (process mining). I wonder if there are other uses besides inferring the process model (e.g. improving the process). So far I haven't found any other practical application. Can someone recommend authors or publications on this (if there are other applications), or suggest keywords that I can search for? Thank you!
Please take a look at this twitter thread: https://twitter.com/JorgeMunozGama/status/1236967153825275904
It lists many interesting applications, from soccer analysis to wind turbine monitoring.
I would suggest having a look at this wonderful book: https://www.springer.com/gp/book/9783662498507
It gives a detailed understanding of process mining and its applications.
Three alternative uses of process mining other than creating business models are:
Discovering patient pathways (a patient moving between different healthcare providers or different departments in a hospital). This information may also be relevant for parties that are not healthcare providers themselves, for example the insurer. This can also help with fraud detection. For example, if the process map shows that procedure Y (for example a knee operation) is usually preceded by procedure X (for example an x-ray), and the insurer finds that in certain cases procedure Y is done but X is missing, that may be indicative of a type of fraud. In this example, process mining can easily identify all the cases that had a knee operation but no preceding x-ray.
Discovering networks (who refers work to whom and at what intensity). In this case you do not use the product for the 'activity' column in the event log, but instead use the name of the provider as the 'activity' column. This is also known as a 'work handover' map. It is slightly different from a regular process map because it no longer visualizes activities, but instead visualizes the flow between key players.
Process mining allows for exact quantification of throughput times and bottlenecks, which regular BPMN models cannot provide.
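To make two of these uses concrete, here is a small sketch in plain Python rather than a process mining tool. The event log, its column layout (case id, activity, timestamp), and the function names are all made up for illustration. It flags cases where a knee operation occurs without an x-ray and computes the throughput time of each case.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (case_id, activity, timestamp)
event_log = [
    ("case1", "x-ray", "2020-01-01 09:00"),
    ("case1", "knee operation", "2020-01-03 09:00"),
    ("case2", "knee operation", "2020-01-02 10:00"),
    ("case3", "x-ray", "2020-01-02 08:00"),
    ("case3", "knee operation", "2020-01-02 20:00"),
]


def group_by_case(log):
    """Return case_id -> list of (timestamp, activity), sorted by time."""
    cases = defaultdict(list)
    for case_id, activity, ts in log:
        cases[case_id].append((datetime.strptime(ts, "%Y-%m-%d %H:%M"), activity))
    return {case_id: sorted(events) for case_id, events in cases.items()}


def cases_missing(cases, required, performed):
    """Cases where `performed` occurs but `required` never does."""
    flagged = []
    for case_id, events in cases.items():
        activities = {activity for _, activity in events}
        if performed in activities and required not in activities:
            flagged.append(case_id)
    return sorted(flagged)


def throughput_hours(cases):
    """Hours between the first and the last event of each case."""
    return {
        case_id: (events[-1][0] - events[0][0]).total_seconds() / 3600
        for case_id, events in cases.items()
    }


cases = group_by_case(event_log)
suspicious = cases_missing(cases, required="x-ray", performed="knee operation")
durations = throughput_hours(cases)
```

With this toy log, only case2 is flagged, and the throughput times are 48, 0, and 12 hours for case1, case2, and case3 respectively.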
Process mining can be used to obtain process models, even if there are no event logs to be mined.
See the BPMN Sketch Miner for how to do so.

Can Amazon Alexa Skills Kit (ASK) detect where it was interrupted (if it was)?

I want to write an Alexa skill that would read a list of items out to me and let me interrupt when I wanted and have the backend know where I was in the list that was interrupted.
For example:
Me: Find me a news story about pigs.
Alexa: I found 4 news stories about pigs. The first is titled 'James the pig goes to Mexico', the second is titled 'Pig Escapes Local Farm' [I interrupt]
Me: Tell me about that.
Alexa: The article is by James Watson, is dated today, and reads, "Johnny the Potbelly Pig found a hole in the fence and..."
I can't find anything to indicate that my code can know where an interruption occurs. Am I missing it?
I believe you are correct: the ASK does not provide any way to know when you were interrupted. However, this is all happening in real time, so you could figure it out by measuring the time that passes between sending the first ASK 'tell' response (i.e. where you call context.succeed(response)) and receiving the "Tell me about that" intent.
Note that the time it takes to read the text in en-US could be different than in en-GB, so you'll have to calibrate each separately. Also, you might have to add some pauses to your speech text to improve accuracy, since there will of course be some variability in the results due to processing times.
If you are using a service like AWS Lambda or Google App Engine that add extra latency when there are no warm instances available, then you will probably need to take that into account.
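The timing approach described above could look roughly like this. It is a sketch under stated assumptions: the per-segment speech durations are hypothetical calibration values you would have to measure yourself, and latency_seconds is a guess at the round-trip and processing delay, not something the ASK reports.

```python
def interrupted_item(segment_seconds, elapsed_seconds, latency_seconds=0.0):
    """Return the index of the segment being spoken at the interruption.

    Returns None if the elapsed time is longer than the whole speech.
    """
    position = max(0.0, elapsed_seconds - latency_seconds)
    end_of_segment = 0.0
    for index, duration in enumerate(segment_seconds):
        end_of_segment += duration
        if position < end_of_segment:
            return index
    return None


# Hypothetical calibrated durations: the intro sentence, then one per headline.
segments = [2.5, 3.0, 2.8, 3.1, 2.9]
index = interrupted_item(segments, elapsed_seconds=8.0, latency_seconds=0.5)
```

Here index is 2, i.e. the second headline (index 0 is the intro), which matches the "Pig Escapes Local Farm" interruption in the question. The skill would store the time of the first response in its session state and compute the elapsed time when the follow-up intent arrives.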

How do you model a business workflow in ColdFusion?

Since there's no complete BPM framework/solution in ColdFusion as of yet, how would you model a workflow into a ColdFusion app that can be easily extensible and maintainable?
A business workflow is more than a flowchart that maps nicely into a programming language. For example:
How do you model a task X that is followed by multiple tasks Y0, Y1, Y2 that happen in parallel, where Y0 is a human process (which needs to wait for input), Y1 is a web service call that might fail and might need automatic retries, and Y2 is an automated process, all followed by a task Z that should only be carried out when all the Y's are completed?
My thoughts...
Seems like I need to do a whole lot of storing / managing / keeping track of states, and frequent checking with cfschedule.
cfthread ain't going to help much since some tasks can take days (e.g. waiting for a user's confirmation).
I can already imagine the flow is going to be spread around in multiple UDFs, DB tables, and CFCs.
Is there any open-source workflow engine in another language that we could port over to CF?
Thank you for your brain power. :)
Study the Java Process Definition Language specification where JBoss has an execution engine for it. Using this Java based engine may be your easiest solution, and it solves many of the problems you've outlined.
If you intend to write your own, you will probably end up modelling states and transitions, that is, vertices and edges in a directed graph. These, as Ciaran Archer wrote, are the components of a state machine. The best persistence approach, IMO, is to capture versions of whatever data is being sent through the workflow via serialization, along with the current state, a history of transitions between states, and the changes to that data. The mechanism probably also needs a way to keep track of who or what is responsible for taking the next action in that workflow.
Based on your question, one thing to consider is whether you really need to represent parallel tasks in your solution. It might instead be possible to enqueue a set of messages and then specify a wait state that lasts until all of them complete. Representing actual parallelism implies you are moving data simultaneously through several different processes, in which case you need an algorithm to resolve deltas when they join again, which is very much a non-trivial task.
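The "enqueue and wait for all to complete" idea can be sketched as a tiny state machine. The sketch is in Python rather than CFML, purely as language-neutral illustration; the Workflow class, the state names, and the retry limit are hypothetical, and in a real system each change would be persisted to the database instead of being held in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """X fans out to Y0, Y1, Y2; Z may start only when every Y is done."""

    pending: set = field(default_factory=lambda: {"Y0", "Y1", "Y2"})
    attempts: dict = field(default_factory=dict)
    state: str = "WAITING_FOR_Y"
    history: list = field(default_factory=list)

    def report(self, task, succeeded, max_retries=3):
        """Record one task outcome and return the resulting workflow state."""
        if self.state != "WAITING_FOR_Y" or task not in self.pending:
            return self.state  # ignore late, duplicate, or unknown reports
        self.attempts[task] = self.attempts.get(task, 0) + 1
        self.history.append((task, succeeded))
        if succeeded:
            self.pending.discard(task)
            if not self.pending:
                self.state = "READY_FOR_Z"
        elif self.attempts[task] >= max_retries:
            self.state = "FAILED"
        return self.state


wf = Workflow()
wf.report("Y2", True)    # automated step finishes at once
wf.report("Y1", False)   # web service call fails and stays pending for a retry
wf.report("Y1", True)    # retry succeeds
before_human = wf.state  # still waiting for the human step
wf.report("Y0", True)    # human input arrives, possibly days later
```

A scheduled task would only need to load workflows in the WAITING_FOR_Y state, retry pending service calls, and start Z for those that have reached READY_FOR_Z.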
In the context of ColdFusion and what you're trying to accomplish, a scheduled task may be necessary if the system you're writing needs to poll other systems. Consider WDDX as a serialization format. JSON, while seductively simple, I recall has some edge cases around numbers and dates that can cause you grief.
Finally see my answer to this question for some additional thoughts.
Off the top of my head I'm thinking about the State design pattern with state persisted to a database. Check out the Head First Design Patterns's Gumball Machine example.
Generally this will work if you have something (like a client / order / etc.) going through a number of changes of state.
Different things will happen to your object depending on what state you are in, and that might mean sitting in a database table waiting for a flag to be updated by a user manually.
In terms of other languages I know Grails has a workflow module available. I don't know if you would be better off porting to CF or jumping ship to Grails (right tool for the job and all that).
It's just a thought, hope it helps.

How exactly does sharkscope or PTR data mine all those hands?

I'm very curious to know how this process works. These sites (http://www.sharkscope.com and http://www.pokertableratings.com) data mine thousands of hands per day from secure poker networks, such as PokerStars and Full Tilt.
Do they have a farm of servers running applications that open hundreds of tables (windows) and then somehow spider/datamine the hands that are being played?
How does this work, programming wise?
There are a few options. I've been researching it since I wanted to implement some of this functionality in a web app I'm working on. I'll use PokerStars for example, since they have, by far, the best security of any online poker site.
First, realize that there is no way for a developer to rip real time information from the PokerStars application itself. You can't access the API. You can, though, do the following:
Screen Scraping/OCR
PokerStars does its best to sabotage screen/text scraping of their application (by doing simple things like pixel level color fluctuations) but with enough motivation you can easily get around this. Google AutoHotkey combined with ImageSearch.
API Access and XML Feeds
PokerStars doesn't offer public access to its API. But it does offer an XML feed to developers who are pre-approved. This XML feed offers:
PokerStars Site Summary - shows player, table, and tournament counts
PokerStars Current Tournament data - files with information about upcoming and active tournaments. The data is provided in two files:
PokerStars Static Tournament Data - provides tournament information that does not change frequently, and
PokerStars Dynamic Tournament Data - provides frequently changing tournament information
PokerStars Tournament Results - provides information about completed tournaments. The data is provided in two files:
PokerStars Tournament Results – provides basic information about completed tournaments, and
PokerStars Tournament Expanded Results – provides expanded information about completed tournaments.
PokerStars Tournament Leaders Board - provides information about top PokerStars players ranked using PokerStars Tournament Ranking System
PokerStars Tournament Leaders Board BOP - provides information about top PokerStars players ranked using PokerStars Battle Of Planets Ranking System
Team PokerStars – provides information about Team PokerStars players and their online activity
It's highly unlikely that these sites have access to the XML feed (or an improved one which would provide all the functionality they need) since PokerStars isn't exactly on good terms with most of these sites.
This leaves two options: scraping the network connection for said data, which I think is borderline impossible (I have no experience with this, but I've heard the traffic is heavily encrypted and not easy to tinker with), and, as mentioned above, screen scraping/OCR.
Option #2 is easy enough to implement and, with some work, can avoid detection. From what I've been able to gather, this is the only way they could be doing such massive data mining of PokerStars (I haven't looked into other sites but I've heard security on anything besides PokerStars/Full Tilt is quite horrendous).
[edit]
Reread your question and realized I didn't unambiguously answer it.
Yes, they likely have a large number of servers watching all currently running tables, tournaments, etc. Realize that there is a decent amount of money in what they're doing.
This, for instance, could be how they do it (speculation):
Said bot applications watch the tables and data mine all information that gets "posted" to the chat log. They do this by already having a table of images that correspond to, for example, all letters of the alphabet (since PokerStars doesn't post their text as... text. All text in their software is actually an image). So, the bot then rips an image of the chat log, matches it against the store, converts the data to a format they can work with, and throws it in a database. Done.
[edit]
No, the data isn't sold to them by the poker sites themselves. This would be a PR nightmare if it ever got out, which it would. And it wouldn't account for the functionality of these sites (OPR, Sharkscope, etc.), which appears to be instantaneous. There are, without a doubt, applications running that are ripping the data in real time from the poker software, likely using the methods I listed.
Maybe I can help.
I play poker, run a HUD, look at the stats and am a software developer.
I've seen a few posts on this suggesting it's done by OCR software grabbing the screen. Well, that's really difficult and processor hungry, so a programmer wouldn't choose to do that unless there were no other options.
Also, because you can open multiple windows, the poker window can be hidden or partially obscured by other things on the screen, so you couldn't guarantee to be able to capture the screen.
In short, they read the log files that are output by the poker software.
When you install a HUD like Sharkscope or Jivaro etc., it runs client software on your PC. That software reads the log files and updates its own servers with every hand you play.
Most poker software is similar, but let's start with PokerStars, as that's where I play. The poker software writes to local log files for every action you or it makes. The log shows your cards, any opponents' cards that you see, plus what you do, e.g. which button you pressed, how much you or they bet, etc. It posts these updates in near real time and timestamps the log file.
You can look at your own files to see this in action.
On a PC, do this (I'm not sure what you do on a Mac, but it will be similar):
1. Load File Explorer
2. Select VIEW from the menu
3. Select HIDDEN ITEMS so that you can see the hidden data files
4. Go to C:\Users\Dave\AppData\Local\PokerStars.UK (you may not be called DAVE...)
5. Open the PokerStars.log.0 file in NOTEPAD
6. In Notepad, SEARCH for updateMyCard
7. It will show your card numerically
3c for 3 of Clubs
14d for Ace of Diamonds
You can see your opponents' cards only where you saw them at the table.
Here are a few example lines from the log file.
OnTableData() round -2
:::TableViewImpl::updateMyCard() 8s (0) [2A0498]
:::TableViewImpl::updateMyCard() 13h (1) [2A0498]
:::TableViewImpl::updatePlayerCard() 7s (0) [2A0498]
:::TableViewImpl::updatePlayerCard() 14s (1) [2A0498]
[2015/12/13 12:19:34]
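As an illustration of how a client could read such lines, here is a minimal parser for the card-update entries shown above. The regular expression is inferred only from these sample lines, and the mapping of 11, 12 and 13 to Jack, Queen and King is an assumption extrapolated from 14 being the Ace; the real log format may differ.

```python
import re

# Pattern inferred from the sample lines only; not an official format.
CARD_LINE = re.compile(r"update(My|Player)Card\(\)\s+(\d{1,2})([cdhs])\s+\((\d+)\)")
# 14 = Ace comes from the text above; 11-13 are assumed by analogy.
RANK_NAMES = {11: "Jack", 12: "Queen", 13: "King", 14: "Ace"}
SUIT_NAMES = {"c": "Clubs", "d": "Diamonds", "h": "Hearts", "s": "Spades"}


def parse_card_line(line):
    """Return (owner, card name) for a card-update line, else None."""
    match = CARD_LINE.search(line)
    if match is None:
        return None
    owner, rank, suit, _slot = match.groups()
    rank_name = RANK_NAMES.get(int(rank), rank)
    return owner, f"{rank_name} of {SUIT_NAMES[suit]}"


sample = [
    "OnTableData() round -2",
    ":::TableViewImpl::updateMyCard() 8s (0) [2A0498]",
    ":::TableViewImpl::updateMyCard() 13h (1) [2A0498]",
    ":::TableViewImpl::updatePlayerCard() 7s (0) [2A0498]",
    ":::TableViewImpl::updatePlayerCard() 14s (1) [2A0498]",
]
cards = [parsed for parsed in map(parse_card_line, sample) if parsed]
```

For the sample lines, this yields your own 8 of Spades and King of Hearts, and the opponent's 7 of Spades and Ace of Spades.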
cheers, hope this helps
Dave
I've thought about this, and have two theories:
The "sniffer" sites have every table open, AND:
Are able to pull the hand data from the network stream. (or:)
Are obtaining the hand data from the GUI (screen scraping, pulling stuff out via the GUI API).
Alternately, they may have developed/modified clients to log everything for them, but I think one of the above solutions is likely simpler.
Well, they have two choices:
they spider/grab the data without consent. Then they risk being shut down at any time. The poker site can easily detect monitoring at this scale and block it. They would even risk a lawsuit for breach of the terms of service, which probably disallow the use of robots.
they pay for getting the data directly. This saves a lot of bandwidth (e.g. not having to load the full pages, extraction, updates with html changes etc.) and makes their business much less risky (legally and technically).
Guess which one they more likely chose; at least if the site has been around for some time without being shut down every now and then.
I'm not sure how it works, but I have an application ID and a key, which you get as a gold or silver subscriber. Sign up for a month and send them an email, and you will get access and the API documentation.