Trying to create a system to merge two lists of "buzz words" together, forming every possibility for domain availability checking?

My problem originates from me trying to create names for all my crazy (brilliant?) ideas for business and products, which then need to have their purchasing availability checked for .com domain names.
So I have a pen and paper system where I create two lists of words... List A and List B for example.
I want to find or create a little app where I can create and store custom lists which takes each word from List A, appends each word from List B (to create a total of List A * List B results?)
After the list is compiled of "ListAListB" results, I want to check if the .com domain is available for purchase online via some other method...
And ultimately, create a new list of each combination, along with some sort of visual status like maybe a color or word representing if the combined word is available as a .com...
So I'm basically using a nested for loop structure to index each word in List A, Loop through each word in List B, and create List C?
Then when the list is fully completed, send a CSV? to somewhere online and then somehow get a new list back.
I guess that is my rough thought process.
Any advice in the algorithm to create the list from the two original lists is appreciated.
Any help with the process of checking available domain names online via GoDaddy, ICANN, etc. is appreciated.
Any help as to where I might find this tool already is even more appreciated.
I could probably download a free sdk or tool and write this in a language I suppose, based on my c++ experience from a few years ago, but I am rusty for sure, and haven't actually created anything since college like 3 years ago.
Thank you.

Here's a quick shell script that leverages Chris's answer.
#!/bin/bash
# bash rather than sh: $(< file) is a bash feature
ids_url="http://instantdomainsearch.com/services/quick/?name="
for a in $(< listA); do
  for b in $(< listB); do
    # sed blanks the response when .com is marked 'u' (unavailable)
    avail=$(wget -qO- "$ids_url$a$b" | sed -e "s/.*'com':'u'.*//g")
    if [ -z "$avail" ]; then
      echo "$a$b.com unavailable"
    else
      echo "$a$b.com available"
    fi
  done
done
It iterates through both lists, queries the lookup service with wget, and treats any response that contains "'com':'u'" as unavailable. Supposing List A contains 'goo', 'foo', and 'arglbar' and List B contains 'gle', the output should look like this:
google.com unavailable
foogle.com unavailable
arglbargle.com available
Pipe it through grep -v unavail to see only the available names.
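The pairing step on its own can also be sketched in Python. This is only an illustration, assuming both lists are already in memory; the function name combine is made up for the example.

```python
from itertools import product

def combine(list_a, list_b):
    """Return every word from list_a followed by every word from list_b."""
    # product() yields pairs in the same order as a nested for loop
    return [a + b for a, b in product(list_a, list_b)]

names = combine(["goo", "foo", "arglbar"], ["gle"])
print(names)  # ['google', 'foogle', 'arglbargle']
```

The result has len(list_a) * len(list_b) entries, which matches the "List A * List B" count guessed at in the question.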

To check whether a domain is available, you can try this:
Request this page:
http://instantdomainsearch.com/services/quick/?name=example
Which will return this json (u = unavailable, a = available)
{'name':'example','com':'a','net':'u','org':'a'}
Then you just need to parse it. You may get blocked if you check lots of domains, but I doubt it, since this site receives a ton of requests from a single session, so you should be fine (I'd pause at least 800 milliseconds between requests).
C# code for list creation:
// Requires: using System.IO; using System.Collections.Generic;
// Load up all the lines of each list into string arrays
string[] listA = File.ReadAllLines("listA.txt");
string[] listB = File.ReadAllLines("listB.txt");
// Create a list to hold the combinations
List<string> listC = new List<string>();
// Loop through each line in listA
foreach (string buzzwordA in listA)
{
    // Now loop through each word in listB
    foreach (string buzzwordB in listB)
    {
        listC.Add(buzzwordA + buzzwordB); // Combine them and add the result to listC
    }
}
File.WriteAllLines("listC.txt", listC.ToArray()); // Save all the combos
I didn't check the code, but that's the general idea. This is a bad idea for huge lists, though, because it reads both lists completely into memory at the same time. A better solution is probably to read the files line by line using FileStream and StreamReader.
Edit (Same idea but this uses filestreams):
// Open up file streams for the lists
using(FileStream _listA = new FileStream("listA.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
using(FileStream _listB = new FileStream("listB.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
using(StreamReader listA = new StreamReader(_listA))
using(StreamReader listB = new StreamReader(_listB))
using(StreamWriter listC = new StreamWriter("listC.txt"))
{
string buzzwordA = listA.ReadLine();
while(buzzwordA != null)
{
string buzzwordB = listB.ReadLine();
while(buzzwordB != null)
{
listC.WriteLine(buzzwordA + buzzwordB);
buzzwordB = listB.ReadLine();
}
buzzwordA = listA.ReadLine();
// Reset the listB stream to the beginning and drop the reader's stale buffer
listB.BaseStream.Seek(0, SeekOrigin.Begin);
listB.DiscardBufferedData();
}
} // All streams and readers are disposed by using statement
For parsing the json try this out: C# Json Parser Library
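As a rough illustration of the parsing step, here is a Python sketch that works on the sample response shown above. It assumes the service still returns that exact shape, which may no longer be true; parse_availability is a made-up helper name, and no network request is made.

```python
import ast

def parse_availability(payload):
    """Map each TLD in the response to True (available) or False."""
    # The sample response uses single quotes, so it is not strict JSON;
    # ast.literal_eval safely reads it as a Python dict literal instead.
    data = ast.literal_eval(payload)
    return {tld: flag == "a" for tld, flag in data.items() if tld != "name"}

sample = "{'name':'example','com':'a','net':'u','org':'a'}"
print(parse_availability(sample))  # {'com': True, 'net': False, 'org': True}
```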

check out www.bustaname.com
sounds exactly like what you're doing

Related

How to convert list into DataFrame in Python (Binance Futures API)

Using Binance Futures API I am trying to get a proper form of my position regarding cryptocurrencies.
Using the code
from binance_f import RequestClient
request_client = RequestClient(api_key= my_key, secret_key=my_secet_key)
result = request_client.get_position()
I get the following result
[{"symbol":"BTCUSDT","positionAmt":"0.000","entryPrice":"0.00000","markPrice":"5455.13008723","unRealizedProfit":"0.00000000","liquidationPrice":"0","leverage":"20","maxNotionalValue":"5000000","marginType":"cross","isolatedMargin":"0.00000000","isAutoAddMargin":"false"}]
The type command indicates it is a list, however adding at the end of the code print(result) yields:
[<binance_f.model.position.Position object at 0x1135cb670>]
Which is baffling because it seems not to be the list (in fact, debugging it indicates object of type Position). Using PrintMix.print_data(result) yields:
data number 0 :
entryPrice:0.0
isAutoAddMargin:True
isolatedMargin:0.0
json_parse:<function Position.json_parse at 0x1165af820>
leverage:20.0
liquidationPrice:0.0
marginType:cross
markPrice:5442.28502271
maxNotionalValue:5000000.0
positionAmt:0.0
symbol:BTCUSDT
unrealizedProfit:0.0
Now it seems like a JSON format... But it is a list. I am confused - any ideas how I can convert result to a proper DataFrame? So that columns are Symbol, PositionAmt, entryPrice, etc.
Thanks!
The answer is already in your title, so there is no need to be confused: what you have is a list of Position objects. You can see the structure of Position in the GitHub repository of this library.
Anyway, to answer the question, use the following:
df = pd.DataFrame([t.__dict__ for t in result])
For more options and information please read the great answers on this question
Good Luck!
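To see why t.__dict__ works, here is a small sketch with a stand-in class. The Position class below is hypothetical (it is not the real binance_f class and carries only three fields), and the second row uses made-up values.

```python
class Position:
    """Hypothetical stand-in for binance_f's Position, with a few fields."""
    def __init__(self, symbol, positionAmt, entryPrice):
        self.symbol = symbol
        self.positionAmt = positionAmt
        self.entryPrice = entryPrice

# Made-up sample data standing in for request_client.get_position()
result = [Position("BTCUSDT", 0.0, 0.0), Position("ETHUSDT", 1.5, 130.2)]

# vars(p) is p.__dict__: attribute names mapped to values, one dict per object
rows = [vars(p) for p in result]
print(rows[0])  # {'symbol': 'BTCUSDT', 'positionAmt': 0.0, 'entryPrice': 0.0}

# pd.DataFrame(rows) then uses the dict keys as column names
```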
You can use this:
df = pd.DataFrame([t.__dict__ for t in result])
klines=df.values.tolist()
open = [float(entry[1]) for entry in klines]
high = [float(entry[2]) for entry in klines]
low = [float(entry[3]) for entry in klines]
close = [float(entry[4]) for entry in klines]

Keeping an index with flatMap in Swift

This is a follow-up to this question:
flatMap and `Ambiguous reference to member` error
There I am using the following code to convert an array of Records to an array of Persons:
let records = // load file from bundle
let persons = records.flatMap(Person.init)
Since this conversion can take some time for big files, I would like to monitor an index to feed into a progress indicator.
Is this possible with this flatMap construction? One possibility I thought of would be to send a notification in the init function, but I think counting the records from within flatMap should also be possible?
Yup! Use enumerated().
let records = // load file from bundle
let persons = records.enumerated().flatMap { index, record in
print(index)
return Person(record)
}

Using Regex in Pig in hadoop

I have a CSV file containing user (tweetid, tweets, userid).
396124436476092416,"Think about the life you livin but don't think so hard it hurts Life is truly a gift, but at the same it is a curse",Obey_Jony09
396124436740317184,"“#BleacherReport: Halloween has given us this amazing Derrick Rose photo (via #amandakaschube, #ScottStrazzante) http://t.co/tM0wEugZR1” yes",Colten_stamkos
396124436845178880,"When's 12.4k gonna roll around",Matty_T_03
Now I need to write a Pig Query that returns all the tweets that include the word 'favorite', ordered by tweet id.
For this I have the following code:
A = load '/user/pig/tweets' as (line);
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL(line,'(.*)[,”:-](.*)[“,:-](.*)')) AS (tweetid:long,msg:chararray,userid:chararray);
C = filter B by msg matches '.*favorite.*';
D = order C by tweetid;
How does the regular expression work here in splitting the output in desired way?
I tried using REGEX_EXTRACT instead of REGEX_EXTRACT_ALL, as I find it much simpler, but couldn't get the code working except for extracting just the tweets:
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT(line,'[,”:-](.*)[“,:-]',1)) AS (msg:chararray);
The above alias gets me the tweets, but if I use REGEX_EXTRACT to get the tweet_id, I do not get the desired output:
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT(line,'(.*)[,”:-]',1)) AS (tweetid:long);
(396124554353197056,"Just saw #samantha0wen and #DakotaFears at the drake concert #waddup")
(396124554172432384,"#Yutika_Diwadkar I'm just so bright 😁")
(396124554609033216,"#TB23GMODE i don't know, i'm just saying, why you in GA though? that's where you from?")
(396124554805776385,"#MichaelThe_Lion me too 😒")
(396124552540852226,"Happy Halloween from us 2 #maddow & #Rev_AlSharpton :) http://t.co/uC35lDFQYn")
grunt>
Please help.
Can't comment, but from looking at this and testing it out, it looks like your quotes in the regex are different from those in the csv.
" in the csv
” in the regex code.
To get the tweetid try this:
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT(line,'^([0-9]+),".*',1)) AS (tweetid:long);
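On the question of how the three-group pattern splits a line, a quick Python sketch may help. Python's re module stands in for Pig's Java regex engine here, on the assumption that greedy matching behaves the same for this pattern; the "tricky" line is made up for illustration.

```python
import re

# Pattern copied from the question, curly quotes included
pattern = re.compile(r'(.*)[,”:-](.*)[“,:-](.*)')

line = '396124436845178880,"When\'s 12.4k gonna roll around",Matty_T_03'
tweetid, msg, userid = pattern.fullmatch(line).groups()
print(tweetid)  # 396124436845178880
print(userid)   # Matty_T_03

# A tweet containing a comma shifts the split, because the first (.*) is greedy
tricky = '1,"a gift, but a curse",someone'
print(pattern.fullmatch(tricky).groups()[0])  # 1,"a gift
```

Each (.*) grabs as much as it can, so the split only lands on the two field-separating commas when the tweet text itself contains no comma, colon, or hyphen.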

Best way to compare phone numbers using Regex

I have two databases that store phone numbers. The first one stores them with a country code in the format 15555555555 (a US number), and the other can store them in many different formats (ex. (555) 555-5555, 5555555555, 555-555-5555, 555-5555, etc.). When a phone number unsubscribes in one database, I need to unsubscribe all references to it in the other database.
What is the best way to find all instances of phone numbers in the second database that match the number in the first database? I'm using the entity framework. My code right now looks like this:
using (FusionEntities db = new FusionEntities())
{
var communications = db.Communications.Where(x => x.ValueType == 105);
foreach (var com in communications)
{
string sRegexCompare = Regex.Replace(com.Value, "[^0-9]", "");
if (sMobileNumber.Contains(sRegexCompare) && sRegexCompare.Length > 6)
{
var contact = db.Contacts.Where(x => x.ContactID == com.ContactID).FirstOrDefault();
contact.SMSOptOutDate = DateTime.Now;
}
}
}
Right now, my comparison checks to see if the first database contains at least 7 digits from the second database after all non-numeric characters are removed.
Ideally, I want to be able to apply the regex formatting to the point in the code where I get the data from the database. Initially I tried this, but I can't use replace in a LINQ query:
var communications = db.Communications.Where(x => x.ValueType == 105 && sMobileNumber.Contains(Regex.Replace(x.Value, "[^0-9]", "")));
Comparing phone numbers is a bit beyond what regex is designed for. As you've discovered, there are many ways to represent a phone number, with and without things like area codes and formatting. Regex is for pattern matching, so using it to strip out all formatting and then comparing strings is doable, but it means putting logic into the regex, which is not what it's for.
I would suggest the first and biggest thing to do is sort out the representation of phone numbers. Since you have database access you might want to look at creating a new field or table to represent a phone number object. Then put your comparison logic in the model.
Yes, it's more work, but it keeps the code more understandable going forward and helps clean up crap data.
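One way to picture a single stored representation is a small normalizing function, sketched here in Python rather than C#. It assumes US numbers only and the 11-digit format from the first database; normalize_us_number is a hypothetical name, and numbers it cannot place are returned as None instead of being guessed at.

```python
import re

def normalize_us_number(raw):
    """Return an 11-digit string like '15555555555', or None if ambiguous."""
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) == 10:
        return "1" + digits  # assume a US number missing its country code
    if len(digits) == 11 and digits.startswith("1"):
        return digits
    return None  # e.g. 7-digit numbers have no area code to compare

print(normalize_us_number("(555) 555-5555"))  # 15555555555
print(normalize_us_number("555-5555"))        # None
```

Storing the normalized value in its own column would let the comparison become a plain equality check in the query.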

How to change a node's property based on one of its other properties in Neo4j

I just started using Neo4j server 2.0.1. I am having trouble writing a Cypher script to change one of a node's properties to something based on one of its already defined properties.
So if I created these nodes:
CREATE (:Post {uname:'user1', content:'Bought a new pair of pants today', kw:''}),
(:Post {uname:'user2', content:'Catching up on Futurama', kw:''}),
(:Post {uname:'user3', content:'The last episode of Game of Thrones was awesome', kw:''})
I want the script to look at the content property, pick out the word "Bought", and set the kw property to it, using a regular expression to pick out words longer than five characters. So, user2's post kw would be "Catching, Futurama" and user3's post kw would be "episode, Thrones, awesome".
Any help would be greatly appreciated.
You could do something like this:
MATCH (p:Post { uname:'user1' })
WHERE p.content =~ "Bought .+"
SET p.kw=filter(w in split(p.content," ") WHERE length(w) > 5)
If you want to do that for all posts, which might not be the fastest operation:
MATCH (p:Post)
WHERE p.content =~ "Bought .+"
SET p.kw=filter(w in split(p.content," ") WHERE length(w) > 5)
split splits a string into a collection of parts, in this case words separated by space
filter filters a collection by a condition behind WHERE, only the elements that fulfill the condition are kept
Probably you'd rather want to create nodes for those keywords and link the post to the keyword nodes.
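To make the split-then-filter step concrete, here is the same logic sketched in Python. It only mirrors what the Cypher expression computes and is not something Neo4j runs; the function name keywords is made up.

```python
def keywords(content, min_length=5):
    """Return the words in content that are longer than min_length characters."""
    # Mirrors split(content, " ") followed by a length filter
    return [w for w in content.split(" ") if len(w) > min_length]

print(keywords("Catching up on Futurama"))  # ['Catching', 'Futurama']
print(keywords("The last episode of Game of Thrones was awesome"))
# ['episode', 'Thrones', 'awesome']
```

Note that the result is a collection of words, not a single comma-separated string, which is also what the SET clause above stores in kw.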