How do I reformat the output of re.findall?

Here is my code:
import re
import pandas as pd

def cost_channelID(received_frame):
    received_frame.columns = ['Ad', 'Impressions', 'eCPM', 'Ad Spend']
    Ads = received_frame['Ad']
    ID = []
    for ad in Ads:
        # re.findall returns a list of every six-digit match in the ad string
        num = re.findall(r'\d{6}',ad)
        ID.append(num)
    ID = pd.Series(ID)
    return(ID)
The output is like this:
[111234]
[111235]
......
[111444]
I would like the output to be without the brackets:
111234
111235
......
111444

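Getting a single element from a sequence in Python is fairly simple: index it.
num = re.findall(r'\d{6}',ad)[0]
That said, I'd think about why re.findall (a function that returns a list of all matches) was used in the first place. Note that the [0] index raises IndexError if an ad contains no six-digit run.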
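A minimal sketch of the alternative hinted at above, assuming only the first six-digit run in each ad matters: re.search returns a single match object (or None), so there is no list to unwrap. The helper name first_six_digit_id and the sample ad strings are made up for illustration.

```python
import re

def first_six_digit_id(text):
    """Return the first run of six digits in text, or None if there is none."""
    match = re.search(r'\d{6}', text)
    return match.group(0) if match else None

# Hypothetical ad names standing in for received_frame['Ad']
ads = ['Banner 111234 top', 'Video 111235', 'no id here']
ids = [first_six_digit_id(ad) for ad in ads]
print(ids)
```

If the values already live in a pandas Series, received_frame['Ad'].str.extract(r'(\d{6})', expand=False) should give a similar result without an explicit loop.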

Related

Groovy groupby List generated from Map values

This is the List that I have
List a = ["Sep->Day02->FY21;Inter01","Sep->Day02->FY21;Inter02","Sep->Day02->FY21;Inter03","Sep->Day01->FY21;Inter18","Sep->Day01->FY21;Inter19"]
I am trying to group this list and generate a new string for each group.
Expected result:
"Sep->Day02->FY21",Inter01:Inter03
"Sep->Day01->FY21",Inter18:Inter19
This is what I tried:
List a = ["Sep->Day02->FY21;Inter01","Sep->Day02->FY21;Inter02","Sep->Day02->FY21;Inter03","Sep->Day01->FY21;Inter18","Sep->Day01->FY21;Inter19"]
List b = []
a.each {
    b.add(it.split(";"))
}
def c = b.groupBy { it[0] }
println c
c.each { k, v ->
    println "${v}"
}
I can't find a way to get the range Inter01:Inter03 into the string. Please advise.
EDIT
The solution provided by Marmite works in the Groovy console as expected. However, the list I am generating is built from values in a map:
a.add(map[z]) // where z is the key
When I try to use the solution on that list, I get errors saying the max and min methods are not found.
I tried map[z].toString(), with the same result. Could the fact that the values come from a map be causing this?
Code Snippet
Below is how I generate the map
def map = [:]
itr.each{
    def Per = it.getMemberName("Period") //getMemberName is a product specific function . Sample output May
    def Day = it.getMemberName("Day") //Day01 sample output
    def Hour = it.getMemberName("Hour") //Interval01 sample output
    def HourInt = it.getMemberName("Hour").reverse().take(2).reverse()
    def Year = it.getMemberName("Years")
    map.put(it.DataAsDate.format("dd/MM/yyyy")+"-"+Hour,Year+"->"+Per+"->"+Day+";"+HourInt)
}
Below is where I generate the List
def OSTinterval = OST.reverse().take(2).reverse() as Integer //This creates 01 out of Interval01
def OETinterval = OET.reverse().take(2).reverse() as Integer //This creates 03 out of Interval03
D1 = new Date(OSDDay)
D2 = new Date(OEDDay)
if (D1 == D2)
{
    (OSTinterval..OETinterval).each { inter ->
        z = OSDDay+"-"+"Interval"+inter.toString().padLeft(2,'0')
        coll.add(map[z].toString())
    }
}
else
{
    (D1..D2).each {
        if (it == D1){
            (OSTinterval..48).each { inter ->
                z = OSDDay+"-"+"Interval"+inter.toString().padLeft(2,'0')
                coll.add(map[z].toString())
            }

Basically, you are looking for the min and max of each group after grouping.
I'm using slightly simplified data to focus on the processing:
def a = ['D02;I01','D02;I02','D02;I03','D01;I18','D01;I18']
println a.collect{it.split(";")}
    .groupBy{it[0]}
    .collect{k,v -> [k,v.collect {it[1]}]}
    .collect{[it[0],"${it[1].min()}:${it[1].max()}"]}
    .collect{it.join(',')}
This returns a list of the keys, each with the min and max of its values:
[D02,I01:I03, D01,I18:I18]
Step by step, the result after groupBy is:
[D02:[[D02, I01], [D02, I02], [D02, I03]], D01:[[D01, I18], [D01, I18]]]
The next collect removes the duplicated keys:
[[D02, [I01, I02, I03]], [D01, [I18, I18]]]
Finally, you take the min and max of each list:
[[D02, I01:I03], [D01, I18:I18]]
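For readers more at home in Python, the same group-then-min/max idea can be sketched as follows. This is only an illustration of the technique, not a translation the answer provides. It assumes the interval labels are zero-padded so that plain string comparison orders them correctly, and the variable names are made up.

```python
from collections import OrderedDict

items = [
    "Sep->Day02->FY21;Inter01",
    "Sep->Day02->FY21;Inter02",
    "Sep->Day02->FY21;Inter03",
    "Sep->Day01->FY21;Inter18",
    "Sep->Day01->FY21;Inter19",
]

# Group the interval labels by the part before the semicolon,
# keeping the keys in first-seen order.
groups = OrderedDict()
for item in items:
    key, interval = item.split(";", 1)
    groups.setdefault(key, []).append(interval)

# Collapse each group to "key",min:max (string comparison, so the
# labels are assumed to be zero-padded).
result = ['"{}",{}:{}'.format(key, min(vals), max(vals))
          for key, vals in groups.items()]
for line in result:
    print(line)
```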

Django - Type Error "is not subscriptable"

I'm trying to create a list that holds the results of several calculations. The idea is then to render it in a template.
This is what I have so far:
views.py :
def calculation(request, itemslug):
    #All the Values ordered chronologically:
    values = Value.objects.filter(item__slug=itemslug).order_by('date')
    dates = []
    results =[]
    #Create a list consisting of the dates
    for value in values:
        a = value.date
        dates.append(a)
    #Perform a calculation per date
    for date in dates:
        latestvalue = Value.objects.filter(item__slug=itemslug).get(date=date)['amount']
        paidup = CashFlow.objects.filter(item__slug=itemslug).filter(date__lt=date).filter(type='cashin').aggregate(sum=Sum('amount'))['sum']
        try:
            result = round(latestvalue/paidup * 100,2)
        except ZeroDivisionError :
            result = 0
        results.append(result)
    return render(request, 'overview/detail.html',
                  {
                      'result':results,
                  })
Unfortunately, I get the TypeError: 'Value' object is not subscriptable.
There might also be other errors in my code. Many thanks for having a look!

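How about simplifying it a bit? get() returns a single Value instance, not a dictionary, so read the field as an attribute (.amount) instead of subscripting it:
latestvalue = Value.objects.get(item__slug=itemslug, date=date).amount
paidup = (CashFlow.objects
          .filter(item__slug=itemslug, date__lt=date, type='cashin')
          .aggregate(sum=Sum('amount'))['sum'])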
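The difference between subscripting and attribute access can be shown without Django. In the sketch below, the Value class is a hypothetical stand-in for a model instance, and percentage is an illustrative helper that is not part of the original code. The helper also guards against paidup being None, which is what aggregate() typically yields for the sum when no rows match. In that case the division would raise TypeError, which the ZeroDivisionError handler does not catch.

```python
class Value(object):
    """Hypothetical stand-in for a model instance returned by get()."""
    def __init__(self, amount):
        self.amount = amount

def percentage(latestvalue, paidup):
    """Return latestvalue as a percentage of paidup, or 0 if paidup is None or zero."""
    if not paidup:
        return 0
    return round(latestvalue / float(paidup) * 100, 2)

value = Value(amount=250)

# Subscripting a plain object raises TypeError, as in the question.
try:
    value['amount']
    subscript_failed = False
except TypeError:
    subscript_failed = True

# Attribute access is what works on an instance.
latestvalue = value.amount

print(subscript_failed)
print(percentage(latestvalue, 1000))
print(percentage(latestvalue, None))
```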

proper formatting printing the length of items in a dictionary

My code seems to be working, but I am having trouble with the print statement, whose output I will eventually write to a CSV. I am able to print the first two items, but when I add the len part as the third item, I get the error "'str' object is not callable". When I print the len part by itself, it works fine. Any insight into what I am doing wrong when printing everything together?
import csv
from collections import defaultdict, OrderedDict

inFile = open('file.txt','r')
reader = csv.reader(inFile)
allrows = list(reader)
dd = defaultdict(OrderedDict)
ids = OrderedDict()
output = {}
iterallrows = iter(allrows)
next(iterallrows)
for row in iterallrows:
    id_ = row[2]
    name = row[3]
    dd[id_][name] = None
    ids[id_] = None
    print('{} {} {}'.format(id_,','.join(dd[id_],','(len(dd[id_])))))

You have this:
[...],','(...)[...]
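This attempts to call the string ',' as a function, which it is not. Close the join(...) call right after dd[id_], then pass len(dd[id_]) to format as its own argument, with a comma between all arguments to a function.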
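A minimal sketch of the corrected call, using a few made-up rows in place of the CSV file (the row contents are assumptions; only columns 2 and 3 matter here):

```python
from collections import OrderedDict, defaultdict

# Hypothetical rows standing in for the parsed CSV (header already skipped)
rows = [
    ('a', 'b', 'id1', 'alice'),
    ('a', 'b', 'id1', 'bob'),
    ('a', 'b', 'id2', 'carol'),
]

dd = defaultdict(OrderedDict)
lines = []
for row in rows:
    id_ = row[2]
    name = row[3]
    dd[id_][name] = None
    # Three separate arguments to format: the id, the joined names, the count
    lines.append('{} {} {}'.format(id_, ','.join(dd[id_]), len(dd[id_])))

for line in lines:
    print(line)
```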

How to iterate over multiple lists with different values but same variables while placing all values in one Pandas DataFrame

Sorry for the long title; I didn't know how else to ask it.
I am working with the ExactTarget Salesforce Marketing API and trying to iterate over multiple dictionary-like objects returned by an API call. Some of them are nested and contain fields with the same name as fields in the outer object, and I am confused about how to get these same-named fields into a dataframe.
This is the output of the API Call:
(ClickEvent){
   Client =
      (ClientID){
         ID = 11111111
      }
   PartnerKey = None
   CreatedDate = 2016-07-12 00:40:17
   ModifiedDate = 2016-07-12 00:40:17
   ID = 11111111
   ObjectID = "11111111"
   SendID = 11111111
   SubscriberKey = "azfull#usa.net"
   EventDate = 2016-07-12 00:40:17
   EventType = "Click"
   TriggeredSendDefinitionObjectID = None
   BatchID = 1
   URLID = 11111111
   URL = aaa.com
I want to create a separate dataframe column for the "ID" under "ClientID", but I am running into trouble because another field is already named "ID". How can I get the ID value out of "ClientID", along with the other values, and place them all in the dataframe?
My code places the data in the dataframe, but I am not getting just the client ID. This is what the output looks like now:
BatchID ClientID CreatedDate \
0 1 (ClientID){\n ID = 10914162\n } 2016-02-23 13:08:59
1 1 (ClientID){\n ID = 10914162\n } 2016-02-23 13:11:49
As you can see, I only want the ID number, not the rest of the text under "ClientID".
Code:
import ET_Client
import pandas as pd
try:
    debug = False
    stubObj = ET_Client.ET_Client(False, debug)
    ## Modify the date below to reduce the number of results returned from the request
    ## Setting this too far in the past could result in a very large response size
    retrieveDate = '2016-07-11T13:00:00.000'
    #ET call for clicks
    print '>>>ClickEvents'
    getClickEvent = ET_Client.ET_ClickEvent()
    getClickEvent.auth_stub = stubObj
    getResponse = getClickEvent.get()
    ResponseResults = getResponse.results
    #print ResponseResults
    Client = []
    partner_keys = []
    created_dates = []
    modified_date = []
    ID = []
    ObjectID = []
    SendID = []
    SubscriberKey = []
    EventDate = []
    EventType = []
    TriggeredSendDefinitionObjectID = []
    BatchID = []
    URLID = []
    URL = []
    for ClickEvent in ResponseResults:
        Client.append(str(ClickEvent['Client']))
        partner_keys.append(ClickEvent['PartnerKey'])
        created_dates.append(ClickEvent['CreatedDate'])
        modified_date.append(ClickEvent['ModifiedDate'])
        ID.append(ClickEvent['ID'])
        ObjectID.append(ClickEvent['ObjectID'])
        SendID.append(ClickEvent['SendID'])
        SubscriberKey.append(ClickEvent['SubscriberKey'])
        EventDate.append(ClickEvent['EventDate'])
        EventType.append(ClickEvent['EventType'])
        # Append the field's value rather than the literal field name
        TriggeredSendDefinitionObjectID.append(ClickEvent['TriggeredSendDefinitionObjectID'])
        BatchID.append(ClickEvent['BatchID'])
        URLID.append(ClickEvent['URLID'])
        URL.append(ClickEvent['URL'])
    df = pd.DataFrame({'ClientID': Client, 'PartnerKey': partner_keys,
                       'CreatedDate' : created_dates, 'ModifiedDate': modified_date,
                       'ID':ID, 'ObjectID': ObjectID,'SendID':SendID,'SubscriberKey':SubscriberKey,
                       'EventDate':EventDate,'EventType':EventType,'TriggeredSendDefinitionObjectID':TriggeredSendDefinitionObjectID,
                       'BatchID':BatchID,'URLID':URLID,'URL':URL})
    print df
except Exception as e:
    # Assumed handler: the original snippet omits the except clause for its try
    print 'Caught exception: ' + str(e)
I have been trying the following, but it is not working:
for ClickEvent in ResponseResults():
    if 'ClientID' in ClickEvent:
        ID.append(ClickEvent['Client']:
        print Client
Thank you in advance.
-EDIT-
The output of the API call above is exactly how the system outputs it. How should I turn it into an actual JSON response?
I want the data frame to look like this:
BatchID ClientID CreatedDate \
0 1 111111111 2016-02-23 13:08:59
1 1 111111111 2016-02-23 13:11:49
I just don't want the other text in the "ClientID" portion of the data shown above. Hope this helps.

Instead of appending the string form of the entire Client object to your list:
Client.append(str(ClickEvent['Client']))
Have you tried storing just the ID field of the object? Maybe something like:
Client.append(str(ClickEvent['Client']['ID']))
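A small sketch of that idea, using plain nested dictionaries as stand-ins for the API's ClickEvent objects (the sample values and the variable names are made up, and the real objects may behave differently from dicts). Building one flat record per event keeps the nested client ID and the top-level ID in separately named columns.

```python
# Hypothetical events mimicking the nested structure shown in the question
events = [
    {'Client': {'ID': 10914162}, 'ID': 11111111, 'BatchID': 1,
     'CreatedDate': '2016-02-23 13:08:59'},
    {'Client': {'ID': 10914162}, 'ID': 11111112, 'BatchID': 1,
     'CreatedDate': '2016-02-23 13:11:49'},
]

rows = []
for event in events:
    rows.append({
        'ClientID': event['Client']['ID'],  # nested ID gets its own column name
        'ID': event['ID'],                  # top-level ID stays as 'ID'
        'BatchID': event['BatchID'],
        'CreatedDate': event['CreatedDate'],
    })

for row in rows:
    print(row['BatchID'], row['ClientID'], row['CreatedDate'])
```

A list of flat dictionaries like rows can be passed directly to pd.DataFrame(rows) to get one column per key.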

Printing Results from Loops

I currently have a piece of code that works in two segments. The first segment opens an existing text file from a specific path on my local drive and arranges its contents, based on certain indices, into a list of sub-lists. In the second segment I take the sub-lists I have created and group them on a shared index to simplify them (this starts at def merge_subs). I am getting no error, but nothing is printed when I try to print the variable answer. Am I not looping over the original list of sub-lists correctly? Ultimately I would like a variable that contains the final product of these loops so that I can write its contents to a new text file. Here is the code I am working with:
from itertools import groupby, chain
from operator import itemgetter
with open ("somepathname") as g:
    # reads text from lines and turns them into a list sub-lists
    lines = g.readlines()
    for line in lines:
        matrix = line.split()
        JD = matrix [2]
        minTime= matrix [5]
        maxTime= matrix [7]
        newLists = [JD,minTime,maxTime]
        L = newLists
def merge_subs(L):
    dates = {}
    for sub in L:
        date = sub[0]
        if date not in dates:
            dates[date] = []
        dates[date].extend(sub[1:])
    answer = []
    for date in sorted(dates):
        answer.append([date] + dates[date])
New code:
def openfile(self):
    filename = askopenfilename(parent=root)
    self.lines = open(filename)
def simplify(self):
    g = self.lines.readlines()
    for line in g:
        matrix = line.split()
        JD = matrix[2]
        minTime = matrix[5]
        maxTime = matrix[7]
        self.newLists = [JD, minTime, maxTime]
        print(self.newLists)
    dates = {}
    for sub in self.newLists:
        date = sub[0]
        if date not in dates:
            dates[date] = []
        dates[date].extend(sub[1:])
    answer = []
    for date in sorted(dates):
        print(answer.append([date] + dates[date]))
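A minimal sketch of the two segments described above, under some assumptions: every sub-list is accumulated into one list instead of overwriting a single variable on each pass, merge_subs returns its result, and the function is actually called. Note also that list.append returns None, so printing the result of answer.append(...) prints None rather than the list. The sample lines are made up. Only the field positions 2, 5 and 7 come from the question.

```python
def parse_lines(lines):
    """Turn whitespace-separated lines into [JD, minTime, maxTime] sub-lists."""
    sub_lists = []
    for line in lines:
        fields = line.split()
        sub_lists.append([fields[2], fields[5], fields[7]])
    return sub_lists

def merge_subs(sub_lists):
    """Merge sub-lists that share the same first element (the date)."""
    dates = {}
    for sub in sub_lists:
        dates.setdefault(sub[0], []).extend(sub[1:])
    return [[date] + dates[date] for date in sorted(dates)]

# Hypothetical input lines; in the question these come from g.readlines()
sample_lines = [
    "a b 2457000 c d 01:00 e 02:00",
    "a b 2457001 c d 03:00 e 04:00",
    "a b 2457000 c d 05:00 e 06:00",
]

answer = merge_subs(parse_lines(sample_lines))
print(answer)
```

Because answer is an ordinary list of lists, it can then be written to a new text file line by line.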
